Is it OK to create very large tuples in Python?
I have a fairly large list (>1K elements) of objects of the same type in my Python program. The list is never modified: no elements are added, removed, or changed. Are there any downsides to putting the objects into a tuple instead of a list? On the one hand, tuples are immutable, which matches my requirements. On the other hand, using such a large tuple just feels wrong. In my mind, tuples have always been for small collections. It's a double, a triple, a quadruple... not a two-thousand-and-fifty-seven-tuple. Is my fear of large tuples somehow justified? Is it bad for performance, unpythonic, or otherwise bad practice?
In CPython, go ahead. Under the covers, the only real difference between the storage of lists and tuples is that the C-level array holding the tuple elements is allocated inside the tuple object, while a list object contains a pointer to a C-level array holding the list elements, which is allocated separately from the list object. The list implementation needs to do that because a list may grow, so the memory containing the C-level vector may need to change its base address. A tuple can't change size, so its memory is allocated directly in the tuple object.

I've created tuples with millions of elements, and yet I lived to type about it ;-)

Obscure: in CPython, there can even be "a reason" to prefer giant tuples: the cyclic garbage collection scheme exempts a tuple from periodic scanning if it contains only immutable objects. Such a tuple can never be part of a cycle, so cyclic gc can ignore it. The same optimization cannot be used for lists; just because a list contains only immutable objects during one run of cyclic gc says nothing about whether that will still be the case during the next run. This is almost never highly significant, but it can save a percent or so in a long-running program, and the benefit of exempting giant tuples grows the bigger they are.
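The gc exemption described above can be observed directly with gc.is_tracked(): a tuple built at runtime starts out tracked, but once a collection pass sees that it holds only immutable objects it gets untracked, while an equivalent list stays tracked forever. A minimal sketch of this CPython-specific behavior (the element count is arbitrary):

```python
import gc
import sys

# A large tuple and list with identical immutable (int) contents.
t = tuple(range(2057))
l = list(range(2057))

# Their memory footprints are comparable; the list is slightly larger
# because it keeps a separate, possibly over-allocated pointer array.
print(sys.getsizeof(t), sys.getsizeof(l))

gc.collect()  # one full cyclic-gc pass scans (and untracks) the tuple

print(gc.is_tracked(t))  # False: exempt from future gc scans (CPython)
print(gc.is_tracked(l))  # True: lists are always tracked
```

This is an implementation detail of CPython's collector, not a language guarantee, so other interpreters (or future versions) may behave differently.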
Yes, it is OK. However, depending on the operations you're doing, you might want to consider using the built-in set type. Calling set() converts your input iterable (tuple, list, or other) to a set. Sets are nice for a few reasons, but especially because you get a collection of unique items with constant-time membership tests. There's nothing "un-pythonic" about holding large data sets in memory, though.
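If the dominant operation really is membership testing, a one-time conversion pays off; frozenset in particular is immutable like a tuple, so it still matches the never-modified requirement. A small sketch with made-up data:

```python
# Hypothetical sample data; any hashable objects work.
items = ["red", "green", "blue", "green", "red"]

unique = frozenset(items)   # immutable, like a tuple

print(len(unique))          # 3 -- duplicates are dropped
print("green" in unique)    # True, average O(1) membership test
print("mauve" in unique)    # False
```

Note that a set loses ordering and duplicates, so it only fits if neither matters for your data.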