python


How do I initialize a Counter from a list of key-value pairs?


If I have a sequence of (key, value) pairs, I can quickly initialize a dictionary like this:
>>> data = [ ('a', 1), ('b', 2) ]
>>> dict(data)
{'a': 1, 'b': 2}
I would like to do the same with a Counter dictionary; but how? Both the constructor and the update() method treat the ordered pairs as keys, not key-value pairs:
>>> from collections import Counter
>>> Counter(data)
Counter({('a', 1): 1, ('b', 2): 1})
The best I could manage was to use a temporary dictionary, which is ugly and needlessly circuitous:
>>> Counter(dict(data))
Counter({'b': 2, 'a': 1})
Is there a proper way to directly initialize a Counter from a list of (key, count) pairs? My use case involves reading lots of saved counts from files (with unique keys).
I would just do a loop:
from collections import Counter
counter = Counter()
for obj, cnt in [ ('a', 1), ('b', 2) ]:
    counter[obj] = cnt
You could also just call the parent dict.update method:
>>> from collections import Counter
>>> data = [ ('a', 1), ('b', 2) ]
>>> c = Counter()
>>> dict.update(c, data)
>>> c
Counter({'b': 2, 'a': 1})
Lastly, there isn't anything wrong with your original solution:
Counter(dict(list_of_pairs))
The expensive part of creating dictionaries or counters is hashing all of the keys and doing periodic resizes. Once the dictionary is made, converting it to a Counter is very cheap, about as fast as a dict.copy(). The hash values are reused and the final Counter hash table is pre-sized (no need for resizing).
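As a minimal sketch of that last option, the conversion can be wrapped in a small helper and checked against the explicit loop. The name counter_from_pairs is made up for this illustration and is not part of collections:

```python
from collections import Counter

def counter_from_pairs(pairs):
    """Build a Counter from (key, count) pairs; the last count wins for repeated keys."""
    # dict() hashes the keys once; Counter() then copies the finished mapping.
    return Counter(dict(pairs))

pairs = [('a', 1), ('b', 2)]
via_dict = counter_from_pairs(pairs)

# Same result as assigning each count in an explicit loop.
via_loop = Counter()
for key, count in pairs:
    via_loop[key] = count

print(via_dict == via_loop)  # True
```

Wrapping it this way keeps the temporary dict out of sight at the call site, which addresses the "ugly" complaint without changing what actually happens.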
From the docs:
Elements are counted from an iterable or initialized from another mapping (or counter)
So the answer is no: you need to convert the pairs to a mapping first and then initialize the Counter. And yes, initializing it from a dict was the right move.
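The two behaviours described in that sentence of the docs can be seen side by side in a short sketch (the variable names are only for illustration):

```python
from collections import Counter

pairs = [('a', 1), ('b', 2)]

# An iterable is counted element by element, so each tuple becomes a key.
counted = Counter(pairs)
print(counted[('a', 1)])  # 1

# A mapping is taken as key -> count.
initialized = Counter(dict(pairs))
print(initialized['b'])  # 2

# Keyword arguments are also treated as key -> count.
from_kwargs = Counter(a=1, b=2)
print(from_kwargs == initialized)  # True
```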
UPDATE
I agree that @RaymondHettinger's code looks good, and it is actually faster:
from collections import Counter
from random import choice
from string import ascii_letters
a=[(choice(ascii_letters), i) for i in range(100)]
Tested with Python 3.6.1 and IPython 6
Initialization with dict:
%%timeit
c1=Counter(dict(a))
Output:
12.1 µs ± 342 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
Update with dict.update():
%%timeit
c2=Counter()
dict.update(c2, a)
Output:
7.21 µs ± 236 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
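For anyone without IPython, roughly the same comparison can be run with the standard timeit module. This is only a sketch: the function names via_dict and via_update are made up here, and the absolute numbers will vary with the machine and Python version.

```python
import timeit
from collections import Counter
from random import choice
from string import ascii_letters

# Same kind of test data as above: 100 pairs with random letter keys.
pairs = [(choice(ascii_letters), i) for i in range(100)]

def via_dict():
    # Build a temporary dict, then copy it into a Counter.
    return Counter(dict(pairs))

def via_update():
    # Fill an empty Counter using the plain dict.update behaviour.
    c = Counter()
    dict.update(c, pairs)
    return c

# Both approaches keep the last value for any repeated key.
assert via_dict() == via_update()

number = 10_000
for func in (via_dict, via_update):
    seconds = timeit.timeit(func, number=number)
    print(f"{func.__name__}: {seconds / number * 1e6:.2f} us per call")
```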
If the keys in your (key, value) pairs are already unique (no duplicates), you can use Raymond Hettinger's great solution.
Beware, though, that you only get the last value for any given key if there are duplicate keys:
>>> data=[ ('a', 1), ('b', 2), ('a', 3), ('b', 4) ]
>>> c=Counter()
>>> dict.update(c, data)
>>> c
Counter({'b': 4, 'a': 3}) # note: 'a' and 'b' keep only their last value
Same with dict:
>>> Counter(dict(data))
Counter({'b': 4, 'a': 3})
But Counters are most often used to accumulate totals, including across duplicate keys. If you want the sum of the 'a' and 'b' entries, you need to loop over all the pairs:
>>> c=Counter()
>>> for k, v in data:
...     c[k] += v
...
>>> c
Counter({'b': 6, 'a': 4}) # each count is the sum of the values seen for that key
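If this comes up often, the summing loop can be pulled into a helper. The sketch below uses a made-up name, sum_pairs, and contrasts it with the last-value-wins behaviour of Counter(dict(data)):

```python
from collections import Counter

def sum_pairs(pairs):
    """Accumulate (key, count) pairs, adding the counts of repeated keys."""
    totals = Counter()
    for key, count in pairs:
        totals[key] += count  # missing keys start at 0 in a Counter
    return totals

data = [('a', 1), ('b', 2), ('a', 3), ('b', 4)]
print(sum_pairs(data))      # Counter({'b': 6, 'a': 4})
print(Counter(dict(data)))  # Counter({'b': 4, 'a': 3})
```

When the keys are known to be unique, as in the question, both calls give the same result, so the choice only matters for data that may contain repeats.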