Is it OK to create very large tuples in Python?
I have a quite large list (>1K elements) of objects of the same type in my Python program. The list is never modified - no elements are added, removed or changed. Are there any downside to putting the objects into a tuple instead of a list? On the one hand, tuples are immutable so that matches my requirements. On the other hand, using such a large tuple just feels wrong. In my mind, tuples has always been for small collections. It's a double, a tripple, a quadruple... Not a two-thousand-and-fiftyseven-duple. Is my fear of large tuples somehow justified? Is it bad for performance, unpythonic, or otherwise bad practice?
In CPython, go ahead. Under the covers, the only real difference between the storage of lists and tuples is that the C-level array holding the tuple elements is allocated in the tuple object, while a list object contains a pointer to a C-level array holding the list elements, which is allocated separately from the list object. The list implementation needs to do that because the list may grow, and so the memory containing the C-level vector may need to change its base address. A tuple can't change size, so the memory for it is allocated directly in the tuple object. I've created tuples with millions of elements, and yet I lived to type about it ;-) Obscure In CPython, there can even be "a reason" to prefer giant tuples: the cyclic garbage collection scheme exempts a tuple from periodic scanning if it only contains immutable objects. Then the tuple can never be part of a cycle, so cyclic gc can ignore it. The same optimization cannot be used for lists; just because a list contains only immutable objects during one run of cyclic gc says nothing about whether that will still be the case during the next run. This is almost never highly significant, but it can save a percent or so in a long-running program, and the benefit of exempting giant tuples grows the bigger they are.
Yes, it is OK. However, depending on the operations you're doing, you might want to consider using the set function in Python. This will convert your input iterable (tuple, list, or other) to a set. Sets are nice for a few reasons, but especially because you get a unique list of items that has constant time lookup for items. There's nothing "un-pythonic" about holding large data sets in memory, though.
How to calculate the mean of a column by decade in Python
How to read webpages that are without .htm* extension using Python?
can I use python's 'socket' module to connect to a wireless ethernet host?
Requests VS Urllib 2 [closed]
Call python class from another Python script
How can I use BeautifulSoup to get a few contents that comes after a specific text on a page?
Reinstall python 2.7.12 and python 3.5.2
How to keep chrome browser window open after selenium script finishes on python
Outlook email attachment downloader (Date range)
Create array based on conditional logic of values in other arrays in Python
Python qt - TableWidget update MySQL
Getting black plots with plt.imshow after multiplying image array by a scalar
Updating a value in a Pandas dataframe seems to update all dataframes
4 entry box nummeric keypad
os.walk for loop not executing [duplicate]
python matplotlib polar plot