Celery chunks large data set
I'm trying to use Celery's `chunks` functionality to divide my iterable dataset into pieces, which are then sent to a Celery task for further processing.

I have a `query_set` that I got from making the following SQLAlchemy call:

```python
query_set = MyModel.query.join(OtherModel).all()
```

Currently, `query_set` is a list of tuples. Its length is at 40,000 and growing. I have another function (a Celery task) that crunches the data in `query_set`, whose definition is:

```python
@celery_app.task
def crunch_qs(query_set):
    ...
```

Since `query_set` is a list of tuples, I figured I could pass it directly to `crunch_qs` like this:

```python
crunched_qs = crunch_qs.chunks(query_set, 5000)()
results = crunched_qs.get()
```

That did not work; it gave me an unexpected result. It was unpacking the items in each of `query_set`'s tuples and sending them to `crunch_qs` as separate arguments, so `crunch_qs` effectively received `*row` on each iteration, which raised the following error:

```
TypeError: crunch_qs() takes exactly 1 argument (10 given)
```

(each tuple here has 10 items). I also tried:

```python
crunched_qs = crunch_qs.chunks(((row,) for row in query_set), 5000)()
results = crunched_qs.get()
```

That worked a little better: the `TypeError` went away. However, my `crunch_qs` function now receives each row (tuple) as its parameter instead of a list of tuples of length 5,000.

Any help/ideas on how to pass a list of tuples to Celery chunks would be highly appreciated. Thanks in advance.
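A sketch of one possible workaround, not from the question itself: `task.chunks(it, n)` always applies the task once per element of the iterable, so if each task invocation must receive a list of 5,000 rows, one option is to batch the list yourself and dispatch a `group` of signatures, one batch per signature. The `batched` helper below is hypothetical (named here for illustration); the Celery `group` usage is shown only in comments because it requires a running broker and worker.

```python
from itertools import islice

def batched(iterable, size):
    """Yield successive lists of at most `size` items from `iterable`."""
    it = iter(iterable)
    while True:
        batch = list(islice(it, size))
        if not batch:
            break
        yield batch

# Celery usage sketch (assumes the question's `crunch_qs` task and
# `query_set`; needs a broker/worker to actually run):
#
#   from celery import group
#   job = group(crunch_qs.s(batch) for batch in batched(query_set, 5000))
#   results = job.apply_async().get()
#
# Each worker call then receives one list of up to 5,000 row tuples.

# Plain-Python demonstration of the batching itself:
rows = [(i, i * 2) for i in range(7)]
print(list(batched(rows, 3)))
```

This trades Celery's built-in chunking for an explicit `group`, but it keeps `crunch_qs(query_set)` taking a single list argument, exactly as defined in the question.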
Is it normal for pygtk library download with macports to take a long time and print 33000 lines to terminal?
List out of range
Getting AttributeError Message when trying to close a file
sklearn LinearRegression reports error
Yapsy throws TypeError on init, missing arguments on init
Using or avoiding a class for credentials in a Python API; what is most Pythonic?
Synchronize pool of workers - Python and multiprocessing
Python: Matching/Substitution on multiple patterns
getting UnboundLocalError: local variable referenced before assignment error
405 error with AJAX for submission
Parsing data for xml file in python
Feed google charts custom properties like color through gviz_api
combining python with fortran, trouble with tutorial
unable to deploy portia spider with scrapyd-deploy
How do I get a function inside of a while loop in a function to work properly?
Celery / RabbitMQ / Django not running tasks