Is it OK to create very large tuples in Python?


I have a fairly large list (>1K elements) of objects of the same type in my Python program. The list is never modified: no elements are added, removed or changed. Is there any downside to putting the objects into a tuple instead of a list?
On the one hand, tuples are immutable, so that matches my requirements. On the other hand, using such a large tuple just feels wrong. In my mind, tuples have always been for small collections. It's a double, a triple, a quadruple... Not a two-thousand-and-fifty-seven-tuple.
Is my fear of large tuples somehow justified? Is it bad for performance, unpythonic, or otherwise bad practice?
In CPython, go ahead. Under the covers, the only real difference between the storage of lists and tuples is that the C-level array holding the tuple elements is allocated in the tuple object, while a list object contains a pointer to a C-level array holding the list elements, which is allocated separately from the list object. The list implementation needs to do that because the list may grow, and so the memory containing the C-level vector may need to change its base address. A tuple can't change size, so the memory for it is allocated directly in the tuple object.
I've created tuples with millions of elements, and yet I lived to type about it ;-)
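As a rough illustration of the point above, here is a minimal sketch that builds a tuple of the size the question jokes about and compares it with the equivalent list. The names `items`, `frozen` and `try_mutation` are made up for this example, and the sizes printed by `sys.getsizeof` cover only the container, not the elements; the exact numbers depend on the CPython version and build.

```python
import sys

# Hypothetical stand-in for the question's ">1K elements" of one type.
items = [float(i) for i in range(2057)]
frozen = tuple(items)

# Same elements in the same order; indexing and iteration behave identically.
assert list(frozen) == items
assert frozen[1234] == items[1234]

# Container overhead only; values are CPython- and build-specific.
print("list :", sys.getsizeof(items), "bytes")
print("tuple:", sys.getsizeof(frozen), "bytes")


def try_mutation(seq):
    """Return True if item assignment works, False if it raises TypeError."""
    try:
        seq[0] = -1.0
    except TypeError:
        return False
    return True


print("list mutable: ", try_mutation(list(items)))  # works on a copy of the list
print("tuple mutable:", try_mutation(frozen))
```

Nothing about the tuple's length changes how it is used; the only behavioural difference visible here is that item assignment is rejected.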
Obscure
In CPython, there can even be "a reason" to prefer giant tuples: the cyclic garbage collection scheme exempts a tuple from periodic scanning if it only contains immutable objects. Then the tuple can never be part of a cycle, so cyclic gc can ignore it. The same optimization cannot be used for lists; just because a list contains only immutable objects during one run of cyclic gc says nothing about whether that will still be the case during the next run.
This is almost never highly significant, but it can save a percent or so in a long-running program, and the benefit of exempting giant tuples grows the bigger they are.
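The exemption can be observed with `gc.is_tracked`, which is part of the standard `gc` module. The following is a sketch, assuming CPython; other implementations may not provide this function or may behave differently. The names `big_tuple` and `big_list` are made up for the example. CPython may only decide during a collection that a tuple can be untracked, so the collector is run before checking.

```python
import gc

# Built at runtime so the result does not depend on constant folding.
big_tuple = tuple(range(100_000))
big_list = list(big_tuple)

# Give the collector a chance to examine the tuple and stop tracking it.
# Two passes are used because nested tuples can need more than one.
gc.collect()
gc.collect()

tuple_tracked = gc.is_tracked(big_tuple)
list_tracked = gc.is_tracked(big_list)

print("tuple tracked by cyclic gc:", tuple_tracked)
print("list tracked by cyclic gc: ", list_tracked)
```

On CPython the tuple of ints should end up untracked, while the list stays tracked even though it holds exactly the same objects.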
Yes, it is OK.
However, depending on the operations you're doing, you might want to consider converting the collection with Python's built-in set(). It accepts any iterable (tuple, list, or other) and returns a set. Sets are nice for a few reasons, but especially because the items are unique and membership tests take constant time on average. Keep in mind that the elements must be hashable and that a set does not preserve the original order.
There's nothing "un-pythonic" about holding large data sets in memory, though.
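To make the set suggestion concrete, here is a small sketch. It uses `frozenset` rather than `set`, which is my own choice to match the "never modified" requirement from the question, not something the answer prescribes; the data and variable names are hypothetical.

```python
# Hypothetical data; the elements must be hashable to go into a set.
items = tuple(range(2057))
lookup = frozenset(items)

# Both membership tests give the same answer, but the tuple is scanned
# element by element while the frozenset uses a hash lookup.
found_in_tuple = 2056 in items
found_in_set = 2056 in lookup
missing_in_set = 5000 in lookup

# Trade-offs: duplicates collapse and the original order is not kept.
with_dupes = (3, 1, 3, 2, 1)
unique = frozenset(with_dupes)

print(found_in_tuple, found_in_set, missing_in_set)
print(len(with_dupes), "items ->", len(unique), "unique")
```

If the order of the elements or repeated values matter, keeping the tuple and building the set alongside it for lookups is one option.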
