

Celery chunks large data set


I'm trying to use Celery's chunks functionality to divide my iterable dataset into pieces, which are then sent to a Celery task for further processing.
I have a query_set that I got from the following SQLAlchemy call:
query_set = MyModel.query.join(OtherModel).all()
Currently, query_set is a list of tuples. Its length is 40,000 and growing.
I have another function (a Celery task) that crunches the data in query_set. It is defined as
@celery_app.task
def crunch_qs(query_set):
. . .
. . .
Since query_set is a list of tuples, I figured I could pass it directly to crunch_qs like this
crunched_qs = crunch_qs.chunks(query_set, 5000)()
results = crunched_qs.get()
That did not work, and the result was not what I expected. Celery unpacked the items of each tuple in query_set and sent them to crunch_qs as separate positional arguments.
So on the first iteration crunch_qs was effectively called as crunch_qs(*query_set[0]), which raised the following error:
TypeError: crunch_qs() takes exactly 1 argument (10 given)
len(query_set[0]) = 10
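The unpacking described above can be reproduced without a broker. The snippet below is only a plain-Python sketch of the behaviour as I understand it: `call_like_starmap` is a hypothetical stand-in for what Celery does with each item of the iterable, not Celery's actual code, and the rows are made-up data with 10 columns.

```python
def crunch_qs(query_set):
    """Stand-in for the task body: takes exactly one argument."""
    return len(query_set)


def call_like_starmap(func, arg_tuples):
    """Hypothetical helper: treat every item as an argument tuple and
    call func(*item), which is how chunks appears to use its items."""
    return [func(*args) for args in arg_tuples]


# Three fake rows with 10 columns each, like len(query_set[0]) == 10.
rows = [tuple(range(10)) for _ in range(3)]

error_message = None
try:
    # Each row is unpacked into 10 positional arguments.
    call_like_starmap(crunch_qs, rows)
except TypeError as exc:
    error_message = str(exc)

# Wrapping each row in a 1-tuple passes the row as a single argument,
# but still one row per call, not a list of rows.
per_row_results = call_like_starmap(crunch_qs, [(row,) for row in rows])
```

Under this reading, the second argument to chunks only controls how many of these calls are bundled into one message; it does not change what each call receives.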
I also tried wrapping each row in a one-element tuple:
crunched_qs = crunch_qs.chunks(((row,) for row in query_set), 5000)()
results = crunched_qs.get()
That worked a little better. The TypeError went away. However, crunch_qs now receives a single row (tuple) per call instead of a list of 5000 tuples.
Any help/ideas on how to pass a list of tuples to celery chunks would be highly appreciated.
Thanks in advance
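One possible approach, offered as a sketch rather than a confirmed fix: build the lists of rows yourself, then wrap each list in a one-element tuple so that it arrives as the single query_set argument. `make_batches` is a hypothetical helper, the rows are made-up data, and the Celery calls are left in comments because they need a running worker and were not run here.

```python
def make_batches(rows, size):
    """Hypothetical helper: split a list into consecutive lists of at
    most `size` items."""
    return [rows[i:i + size] for i in range(0, len(rows), size)]


# Made-up stand-in for query_set; the real one would use size=5000.
rows = [(i, "x") for i in range(12)]
batches = make_batches(rows, 5)

# One argument tuple per batch, so each call gets a whole list of rows.
arg_tuples = [(batch,) for batch in batches]

# Assumed usage with Celery (not executed here):
#     results = crunch_qs.chunks(arg_tuples, 1)().get()
# or, with one task per batch:
#     from celery import group
#     results = group(crunch_qs.s(batch) for batch in batches)().get()
```

With a chunk size of 1, each message carries exactly one batch. A larger chunk size would bundle several batches into one message, with crunch_qs still called once per batch.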
