Python Dictionary

A dictionary optimizes element lookups. It associates keys with values. Each key must have a value. In Python, the dictionary is a core type used in many programs. We create, mutate and performance-tune dictionaries.

Get
There are many ways to get values from a dictionary. The shortest way is to use the "[" and "]" characters. We access a value directly this way. But this syntax raises a KeyError if the key is not found.

Instead: We can use the get() method with one or two arguments. This causes no error if a key is not found. With one argument, it returns None in that case.

Argument 1: The first argument to get() is the key you are testing. This argument is required.

Argument 2: The second, optional argument to get() is the default value. This is returned if the key is not found.
Based on: Python 3.2.4

Program that gets values: Python

    plants = {}

    # Add three key-value tuples to the dictionary
    plants["radish"] = 2
    plants["squash"] = 4
    plants["carrot"] = 7

    # Get syntax 1
    print(plants["radish"])

    # Get syntax 2
    print(plants.get("tuna"))
    print(plants.get("tuna", "no tuna found"))

Output

    2
    None
    no tuna found

The second argument to get() is sometimes important. It is valid to assign None as a key's value. In this case, the return value of get() with one argument is ambiguous. Either the key does not exist, or it exists with a value of None.

But: With get() and a second argument, we can use a special value. We can test for this value to see if the key does not exist.
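
The ambiguity described above can be sketched as follows. The names MISSING and settings are hypothetical, chosen only for illustration. A fresh object() is one common choice of special value, because it cannot be equal to anything stored in the dictionary.

```python
# Hypothetical sentinel: a unique object that is never stored as a value.
MISSING = object()

# Hypothetical dictionary where one key is deliberately mapped to None.
settings = {"theme": None}

# With one argument, both calls return None, so the two cases look identical.
print(settings.get("theme"))
print(settings.get("font"))

# With the sentinel as the default, a missing key can be told apart.
theme_missing = settings.get("theme", MISSING) is MISSING
font_missing = settings.get("font", MISSING) is MISSING
print(theme_missing, font_missing)
```

Here the key "theme" exists with a value of None, while "font" does not exist at all. Only the two-argument form distinguishes them.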

In
A Python dictionary may or may not contain a specific key. We need to test this. One way to do so is with the in-keyword. This keyword returns True if the key exists in the dictionary.

And: If the key does not exist, the in-keyword returns False. This is helpful in an if-statement.

Next: In this example, we test one key that happens to already exist ("tuna"). And then we test another key that does not ("elephant").
Program that uses in: Python

    animals = {}
    animals["monkey"] = 1
    animals["tuna"] = 2
    animals["giraffe"] = 4

    # Use in
    if "tuna" in animals:
        print("Has tuna")
    else:
        print("No tuna")

    # Use in on nonexistent key
    if "elephant" in animals:
        print("Has elephant")
    else:
        print("No elephant")

Output

    Has tuna
    No elephant

In computer science, predicates return true or false. The in-keyword on dictionaries in Python acts as a predicate: it evaluates to True or False. It is used for testing. It does not mutate the object's state.

Also: The get() method can instead be used to test for a key's existence. Please see the get() section on this page.

Len
The len() function can be used on a dictionary. This returns the number of key-value pairs contained within. It doesn't matter what data types the keys and values are. Len also works on lists and strings.

Caution: The length returned for the dictionary does not separately consider keys and values. Each pair adds one to the length.
Program that uses len on dictionary: Python

    animals = {"parrot": 2, "fish": 6}

    # The len method returns the count of key-value tuples
    print("Length:", len(animals))

Output

    Length: 2

As you may know, len() can be used on other data types in Python as well. For example, it acts upon a list, returning the number of elements within. And it handles tuples as well.

Note: Calling len() on a dictionary directly is more efficient than calling keys(), values() or items() and calling len() on that.

Because: Using len() directly requires fewer steps. And it involves less indirection.

Keys, values
A dictionary contains keys and values. These are not stored as lists. But with the keys() and values() methods, we can get at these elements as collections. In Python 3 each method returns a view object; passing a view to list() makes a real list.

Next: A dictionary of three key-value pairs is created. This dictionary could be used to store hit counts on a website's pages.

Then: We introduce two variables, named keys and values. These are separate from the keys() and values() method calls. The keys view contains three elements, as the len() result indicates. The three strings it contains are "home", "sitemap" and "about". The values view also contains three elements. These are the numbers 125, 27 and 43.

Program that uses keys: Python

    hits = {"home": 125, "sitemap": 27, "about": 43}
    keys = hits.keys()
    values = hits.values()

    print("Keys:")
    print(keys)
    print(len(keys))

    print("Values:")
    print(values)
    print(len(values))

Output

    Keys:
    dict_keys(['home', 'about', 'sitemap'])
    3
    Values:
    dict_values([125, 43, 27])
    3

In performance analysis, there is no benefit to separating the keys and values into lists. Instead, using the dictionary directly requires fewer objects. This speeds up Python programs.

But: Copying the keys or values into a list does provide a way to sort the elements in a dictionary. This can be done by calling list() on the view and then sort() on the resulting list.

Note: Alternatively, the list's reverse() method, with or without sort(), can be employed here.

In Python 3.2, used here, the elements returned by keys() and values() are in no useful order. In the above script output, the keys are not alphabetically sorted. They are not even in the order specified in the program. Since Python 3.7, dictionaries preserve insertion order instead.

Also: The values are similarly without order. Looking at the output, each key lines up with its value at the same position.
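
Sorting the keys with sort() needs a real list first, since the view returned by keys() has no sort() method. The following is a minimal sketch of that step, reusing the hits dictionary from the program above.

```python
hits = {"home": 125, "sitemap": 27, "about": 43}

# Copy the keys view into a real list; views have no sort() method.
keys = list(hits.keys())
keys.sort()
print(keys)

# reverse() flips the list in place, giving descending order here.
keys.reverse()
print(keys)

# Values can be copied and sorted in the same way.
values = list(hits.values())
values.sort()
print(values)
```

The dictionary itself is left unchanged; only the copied lists are reordered.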

Sorted keys
Keys in a dictionary are not sorted. In Python 3.2 their order is arbitrary because of how the hash table stores them (newer versions keep insertion order, which is still not sorted order). Hashing optimizes lookup performance. But sometimes we need sorted keys.

Tip: This can be done by using a built-in function, sorted(), on the keys. This creates a new sorted list.

In this example, we acquire the keys with keys(). The program calls sorted() on that result on the same line and stores the list in the keys variable. When we display the contents of the keys list, we find it is alphabetically ordered.
Program that sorts keys in dictionary: Python

    # Same as previous program
    hits = {"home": 124, "sitemap": 26, "about": 32}

    # Sort the keys from the dictionary
    keys = sorted(hits.keys())

    print(keys)

Output

    ['about', 'home', 'sitemap']

Items
A dictionary also provides the items() method. With items(), we receive a view of two-element tuples. Each tuple contains, as its first element, the key. And its second element is the value.

Tip: With tuples, we can address the first element with an index of zero. And the second element has the index of one.

The program also uses a for-loop on the items() result. It calls print() with two arguments. These are displayed on the same line. In this way we achieve imperative looping over the elements.
Program that uses items method: Python

    rents = {"apartment": 1000, "house": 1300}

    # Get a view of key-value tuples
    rentItems = rents.items()

    # Loop and display tuple items
    for rentItem in rentItems:
        print("Place:", rentItem[0])
        print("Cost:", rentItem[1])
        print("")

Output

    Place: house
    Cost: 1300

    Place: apartment
    Cost: 1000

Please note that you cannot assign to the elements in the tuples. If you try to assign rentItem[0] or rentItem[1], you will get an error. This is the result from the Python interpreter.

Python error:

    TypeError: 'tuple' object does not support item assignment

Items, unpack
The items() result can be used in another for-loop syntax. We can unpack the two parts of each tuple in items() directly in the for-loop. In this example, we use the identifier "k" for the key, and "v" for the value.
Program that unpacks items: Python

    # Create a dictionary.
    data = {"a": 1, "b": 2, "c": 3}

    # Loop over items and unpack each item.
    for k, v in data.items():
        # Display key and value.
        print(k, v)

Output

    a 1
    c 3
    b 2

For-loop
Next, a dictionary in Python can be directly looped over with a for-loop. This accesses only the keys in the dictionary. To get a value, you will need to look it up through the dictionary again.

Items: You can call the items() method to get key-value tuples. No extra hash lookups will be needed to access values.
Program that loops over dictionary: Python

    plants = {"radish": 2, "squash": 4, "carrot": 7}

    # Loop over dictionary directly.
    # ... This only accesses keys.
    for plant in plants:
        print(plant)

Output

    radish
    carrot
    squash

We see that the plant variable, in the for-loop, contains only the key. The value is not available. We would need to use plants[plant] or plants.get(plant) to get the value from the key in the for-loop body.
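
The extra lookup in the loop body can be sketched like this. The found list is a hypothetical addition, used only to collect the pairs for inspection.

```python
plants = {"radish": 2, "squash": 4, "carrot": 7}

# Hypothetical list that collects each key with its looked-up value.
found = []

for plant in plants:
    # The loop variable is only the key, so look the value up again.
    value = plants.get(plant)
    found.append((plant, value))
    print(plant, value)
```

Each iteration performs one additional hash lookup, which is the cost that items() avoids.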

Del
How can we remove a key-value pair from a dictionary? We apply the del statement to a dictionary entry. This is done with the "[" and "]" syntax form. In this program, we initialize a dictionary with three key-value pairs.

Then: We remove the pair with key "windows". When we display the dictionary, it now contains only two key-value pairs.
Program that uses del dictionary: Python

    systems = {"mac": 1, "windows": 5, "linux": 1}

    # Remove key-value at windows
    del systems["windows"]

    # Display dictionary
    print(systems)

Output

    {'mac': 1, 'linux': 1}

An alternative to using del on a dictionary is to change the key's value to a special value. This is similar to the null object refactoring strategy. We can keep the key but mutate its value to a known null value.

Note: This will have different performance characteristics. It may be better or worse depending on the program.
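
A rough sketch of this alternative follows. Using None as the null value is an assumption made for illustration; any agreed-upon marker would serve. The active list is likewise hypothetical.

```python
systems = {"mac": 1, "windows": 5, "linux": 1}

# Keep the key, but overwrite its value with a known "null" value.
# None is an assumed marker here, not something the dictionary requires.
systems["windows"] = None

# The key still exists, so code that reads it must check for the marker.
for name, count in systems.items():
    if count is None:
        print(name, "(removed)")
    else:
        print(name, count)

# Entries that still hold a real value.
active = [name for name, count in systems.items() if count is not None]
```

Unlike del, this keeps the dictionary at three entries, so len() and the in-keyword still report the "windows" key as present.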

Update
Sometimes we have a second dictionary that contains different, or additional, values. With update(), we change the first dictionary to have new values from the second dictionary. This also modifies existing values.

Let's start with this simple example Python script. It creates two dictionaries named pets1 and pets2. The pets2 dictionary has a different value for the "dog" key: it has the value "animal", not "canine".

Also: The pets2 dictionary contains a new key-value pair. In this pair the key is "parakeet" and the value is "bird".

The program calls the update() method on the pets1 dictionary instance. The argument is the pets2 dictionary. And the two print() calls display the contents of each dictionary.
Program that uses update, dictionary: Python

    # First dictionary
    pets1 = {"cat": "feline", "dog": "canine"}

    # Second dictionary
    pets2 = {"dog": "animal", "parakeet": "bird"}

    # Update first dictionary with second
    pets1.update(pets2)

    # Display both dictionaries
    print(pets1)
    print(pets2)

Output

    {'parakeet': 'bird', 'dog': 'animal', 'cat': 'feline'}
    {'dog': 'animal', 'parakeet': 'bird'}

Let's examine the first output line, the contents of pets1. The value of "dog" is now set to "animal". This means the original value in pets1, "canine", is lost. The new value in pets2 replaced it.

Also: The key-value pair found only in pets2, with the key "parakeet", was added to the pets1 dictionary.

So: Existing values are replaced with new values when the keys match. New values are added if no matches exist.

Copy
Another dictionary method is copy(). This method performs a shallow copy of the entire dictionary. Every key-value pair in the dictionary is copied. This is different from simply creating a new variable reference to the same dictionary.

In this program, we create a copy of the original dictionary. And then we modify some of the values within the copy. When we display the two dictionaries, we see that the original dictionary was not modified.

Instead: Only the new, copied dictionary reflects the changes we made.
Program that uses copy: Python

    original = {"box": 1, "cat": 2, "apple": 5}

    # Create copy of dictionary
    modified = original.copy()

    # Change copy only
    modified["cat"] = 200
    modified["apple"] = 9

    # Original is still the same
    print(original)
    print(modified)

Output

    {'box': 1, 'apple': 5, 'cat': 2}
    {'box': 1, 'apple': 9, 'cat': 200}

Fromkeys
Many keys can be added to a dictionary with the same value. The fromkeys() method receives a sequence of keys, such as a list. It creates a dictionary with each of those keys. We can specify a value as the second argument.

Values: If you specify the second argument to fromkeys(), each key has that value in the newly-created dictionary. If you omit it, each value is None.
Program that uses dict.fromkeys: Python

    # A list of keys.
    keys = ["bird", "plant", "fish"]

    # Create dictionary from keys.
    d = dict.fromkeys(keys, 5)

    # Display.
    print(d)

Output

    {'plant': 5, 'bird': 5, 'fish': 5}

Memoize
One classic computer programming optimization is called memoization. And this optimization can be implemented easily with a dictionary in Python. In memoization, a function (def) computes its result normally.

And: Once the computation is done, it stores its result in a cache. In the cache, the argument is the key. And the result is the value.

When a memoized function is called, it first checks this cache to see if it has been run with this argument before. And if it has, it simply returns its cached (memoized) return value.

Therefore: No further computations need be done. The savings are greater when the function takes longer to run initially.

Also, the number and range of arguments is important. If a function is only called once with a certain argument, memoization has no benefit. If a function is called with many different arguments, memoization too becomes less effective.

Get performance
Performance testing is challenging. Lots of things can go wrong. In this test I compared a loop that uses the get() method with a loop that uses both the in-keyword and then another lookup.

Version 1: This version uses a second argument to get(). It tests the result against that default and then proceeds if the value was found.

Version 2: This version uses "in" and then a lookup. Twice as many lookups into the dictionary occur. But fewer statements are executed.
Program that benchmarks get: Python

    import time

    # Input dictionary
    systems = {"mac": 1, "windows": 5, "linux": 1}

    # Time 1
    print(time.time())

    # get version:
    i = 0
    v = 0
    x = 0
    while i < 10000000:
        x = systems.get("windows", 1)
        if x != 1:
            v = x
        i += 1

    # Time 2
    print(time.time())

    # in version:
    i = 0
    v = 0
    while i < 10000000:
        if "windows" in systems:
            v = systems["windows"]
        i += 1

    # Time 3
    print(time.time())

Output

    1345819697.257
    1345819701.155 (get = 3.90 s)
    1345819703.453 (in  = 2.30 s)

The results show that version 1, which uses get(), was slower than version 2, which uses two lookups. The test results were the same when the two loops were run in the opposite order (version 2 before version 1).
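
The same comparison can also be sketched with the standard timeit module, which handles the repetition. This is an illustrative variant, not the test used above: the helper functions use_get and use_in are hypothetical, the function-call overhead is included in both timings, and results will differ between machines and Python versions.

```python
import timeit

systems = {"mac": 1, "windows": 5, "linux": 1}

def use_get():
    # One lookup through get() with a default value.
    return systems.get("windows", 1)

def use_in():
    # Membership test followed by a second lookup.
    if "windows" in systems:
        return systems["windows"]
    return 1

# timeit accepts a callable; number is kept small so the sketch runs quickly.
get_seconds = timeit.timeit(use_get, number=100000)
in_seconds = timeit.timeit(use_in, number=100000)

print("get:", get_seconds)
print("in: ", in_seconds)
```

Running each measurement several times and comparing the smallest values tends to give steadier numbers than a single run.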

Loop performance
A dictionary can be looped over in different ways. In this benchmark we test two different approaches. The requirement is that we access the key and the value in each iteration of the loop.

Version 1: This version loops over the keys of the dictionary with a for-loop, repeated inside a while-loop. It then does an extra lookup to get the value.

Version 2: This version instead iterates over the result of items(), which yields tuples containing the keys and values. It does no lookups by key.

But: Version 2 has the same effect: we access the keys and values. The cost of calling items() initially is not counted here.
Program that benchmarks loops: Python

    import time

    data = {"michael": 1, "james": 1, "mary": 2, "dale": 5}
    items = data.items()

    print(time.time())

    # Version 1: key lookup
    i = 0
    while i < 10000000:
        v = 0
        for key in data:
            v = data[key]
        i += 1

    print(time.time())

    # Version 2: items
    # ... The loop variable is named pair to avoid shadowing the built-in tuple.
    i = 0
    while i < 10000000:
        v = 0
        for pair in items:
            v = pair[1]
        i += 1

    print(time.time())

Output

    1345602749.41
    1345602764.29 (version 1 = 14.88 s)
    1345602777.68 (version 2 = 13.39 s)

We see that looping over the items() tuples is faster than looping over the keys and looking up each value. This makes sense. With items(), no lookups by key are done. But with the key loop, we continually perform lookups.

Caution: This benchmark has a flaw. It does not count the cost of the initial call to items().

And: Using items() may not speed up programs overall. It may introduce other performance deficits, as by using more memory.

Further: More memory usage reduces spatial locality. Data is stored further apart, so it is slower to access.

Summary
In computer science a dictionary is usually implemented as a hash table. In a hash table, a special hashing algorithm translates a key (often a string) into an integer. And this integer is used to locate the data. This yields a speedup.
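
The idea of turning a key into an integer that locates the data can be sketched with a toy hash table. Everything here (BUCKET_COUNT, buckets, bucket_index, put, lookup) is hypothetical and greatly simplified; Python's real dictionary is considerably more sophisticated.

```python
# Toy hash table with a fixed number of buckets (illustration only).
BUCKET_COUNT = 8
buckets = [[] for _ in range(BUCKET_COUNT)]

def bucket_index(key):
    # hash() turns the key into an integer; modulo maps it to a bucket.
    return hash(key) % BUCKET_COUNT

def put(key, value):
    bucket = buckets[bucket_index(key)]
    # Replace the value if the key is already stored in this bucket.
    for i, (existing_key, _) in enumerate(bucket):
        if existing_key == key:
            bucket[i] = (key, value)
            return
    bucket.append((key, value))

def lookup(key):
    # Only one bucket is searched, not the whole table.
    for existing_key, value in buckets[bucket_index(key)]:
        if existing_key == key:
            return value
    return None

put("radish", 2)
put("squash", 4)
put("radish", 3)
print(lookup("radish"), lookup("squash"), lookup("tuna"))
```

Because the integer picks out a single bucket, a lookup inspects only a few entries regardless of how many keys are stored. That is where the speedup comes from.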
