Celery chunks large data set


I'm trying to use Celery's chunks functionality to split an iterable dataset into pieces, each of which is then sent to a Celery task for further processing.
I have a query_set that I got from the following SQLAlchemy call:
query_set = MyModel.query.join(OtherModel).all()
Currently, query_set is a list of tuples. Its length is 40,000 and growing.
I have another function (a Celery task) that crunches the data in query_set. Its definition is:
@celery_app.task
def crunch_qs(query_set):
. . .
. . .
Since query_set is a list of tuples, I figured I could pass it directly to crunch_qs like this
crunched_qs = crunch_qs.chunks(query_set, 5000)()
results = crunched_qs.get()
That did not work. Instead of passing chunks of rows, it unpacked the items of each tuple in query_set and sent them to crunch_qs as separate arguments.
So on the first call crunch_qs received *query_set[0], which raised the following error:
TypeError: crunch_qs() takes exactly 1 argument (10 given)
len(query_set[0]) = 10
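The unpacking described above can be reproduced without a broker. The following is a minimal pure-Python sketch, not Celery's actual implementation: `emulate_chunks` is a hypothetical stand-in that groups the iterable into batches of n and then calls the function with each item unpacked as positional arguments, which matches the behaviour observed here. The rows are made-up data.

```python
from itertools import islice


def emulate_chunks(func, iterable, n):
    """Hypothetical stand-in for task.chunks(iterable, n).

    Splits the iterable into groups of n items, then calls func(*item)
    for every item in each group (each item is an argument tuple).
    """
    it = iter(iterable)
    results = []
    while True:
        group = list(islice(it, n))
        if not group:
            break
        # Each item is unpacked into positional arguments.
        results.append([func(*item) for item in group])
    return results


def crunch_qs(query_set):
    # Placeholder body: just report how many elements were received.
    return len(query_set)


rows = [tuple(range(10)) for _ in range(4)]  # four made-up 10-column rows

# Passing the rows directly: each 10-item row becomes 10 arguments.
try:
    emulate_chunks(crunch_qs, rows, 2)
except TypeError as exc:
    error = str(exc)

# Wrapping each row in a 1-tuple: each call receives exactly one row.
per_row = emulate_chunks(crunch_qs, ((row,) for row in rows), 2)
```

In this sketch the chunk size only controls how many calls are grouped together; it does not change what a single call receives.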
I also tried:
crunched_qs = crunch_qs.chunks(((row,) for row in query_set), 5000)()
results = crunched_qs.get()
That worked a little better. The TypeError went away. However, my crunch_qs function now receives each row (a single tuple) as its parameter instead of a list of 5000 tuples.
Any help/ideas on how to pass a list of tuples to celery chunks would be highly appreciated.
Thanks in advance
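For reference, one possible direction is to build the batches up front so that each item handed to chunks is itself a 1-tuple holding a list of rows. This is only a sketch under that assumption: `batched_args` is a hypothetical helper, the rows are made-up data, and the Celery call in the closing comment is untested.

```python
def batched_args(rows, size):
    """Yield 1-tuples, each holding a list of up to `size` rows."""
    for start in range(0, len(rows), size):
        yield (rows[start:start + size],)


rows = [tuple(range(10)) for _ in range(12)]  # twelve made-up rows
args = list(batched_args(rows, 5))

# Each element is (list_of_rows,), so unpacking it as positional
# arguments yields exactly one argument: a list of rows.
sizes = [len(batch) for (batch,) in args]

# Assumption, not verified against a running worker: with Celery this
# could be passed as crunch_qs.chunks(batched_args(query_set, 5000), 1)()
```

With the batching done beforehand, the second argument to chunks would only decide how many of these batch calls share one task message.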

