Lesson-6.1
Dictionaries
Introduction
Let us assume that we want to store the following information in Python:
Country | Capital |
---|---|
Brazil | Brasilia |
Russia | Moscow |
India | New Delhi |
China | Beijing |
South Africa | Cape Town |
A minor geographical observation: South Africa has three capitals; we have only mentioned the legislative capital for convenience. A geopolitical point: these five countries form a part of a block called BRICS [refer].
Coming back to Python, a dictionary is possibly the most interesting data structure offered by Python. It is basically a look-up table. This is how we would store the details of the BRICS nations and their capitals:
brics = {
'Brazil': 'Brasilia',
'Russia': 'Moscow',
'India': 'New Delhi',
'China': 'Beijing',
'South Africa': 'Cape Town'
}
A dictionary is a collection of key-value pairs. In the code given above, brics
is a dictionary. It has countries mapped to their respective capitals. For instance, 'India'
is mapped to 'New Delhi'
. Here, 'India'
is the key and 'New Delhi'
is the value. That is, the country is the key and its capital is the value. A dictionary object is of type dict
:
To access the value corresponding to a given key, we do the following:
The value corresponding to a given key can be updated:
New key-value pairs can be added to a dictionary. Let us expand the horizons of our dictionary to include countries outside the BRICS nations. It no longer makes sense to call this brics
, so let us create a new dictionary called globe
which starts off as a copy of brics
. Recall the copy()
method that we used to copy lists. A similar method is defined for dictionaries:
Adding a new key-value pair is as simple as the statement given in line 9 of the code given above. Keys of a dictionary are unique. This means that a dictionary cannot have two or more identical keys mapped to different values. On the other hand, two different keys could have the same value. For example:
Trying to access a key that is not present in the dictionary will result in a KeyError
:
##### Alarm! Wrong code snippet! ######
some_dict = {'0': 'zero', '1': 'one'}
print(some_dict[0])
##### Alarm! Wrong code snippet! ######
More Examples
The key of a dictionary can be any immutable object. There is a small catch here. We will return to this constraint in the next section. Let us look at different combinations key-value pairs that are possible beginning with the basic types: int
, str
, float
, bool
:
Next, we have dictionaries that have list
and tuple
as the type of their values:
Tuples can be keys, provided they don't contain any mutable objects within them:
Towards the end, we will look at an example where a tuple cannot be a key. Finally, the richness of dictionaries comes out in the following example:
# mixed
report_card = {
'name': 'Ramanujan',
'age': 18,
'school': 'KV',
'marks': (75, 80, 60, 95, 100)
}
More on Keys
Earlier, we have mentioned that the keys of dictionaries have to be immutable, but this statement is not entirely accurate. In this section, we will explore why. What happens if we use a list as a key?
It throws a TypeError
with the following message: unhashable type: 'list'
. A list cannot be a key in a dictionary; but the error message doesn't talk about immutability, instead it says that the list
type is unhashable. A more accurate statement about keys in a dictionary is given below:
The keys of a dictionary must be hashable.
To understand what we mean by the term hashable, we shall briefly look at the way Python implements dictionaries. The following section on hash tables is a bit involved and can be skipped.
Hash Tables
Python dictionaries are implemented using a data structure called a hash table. It is best to think about a hash table as a book-rack that has a number of rows. Picture the key-value pairs as books that are going to be stored in these racks. To access a book, we need to know the row number in which it is present. This is where the idea of a hash function comes in. The hash function is denoted by \(h\) and converts the key to the row number.
The hash function accepts a key \(k\) as input and returns a value, \(h(k)\), as output. This is called the hash value. In our analogy, the hash value is synonymous with the rack number. Once we know the rack number, the book (key-value) stored in it can be easily retrieved. The description is somewhat naive, but you get the point.
Now, an object in Python is hashable if it has a hash value which never changes during its lifetime and can be compared to other objects. Most of the immutable objects that we have seen so far are hashable: int
, float
, str
, bool
. Mutable containers such as lists are not hashable. So, can we just go back to the original definition and claim that all immutable objects can be used as keys in dictionaries? No! Consider the following example:
Though some_tuple
is immutable, it contains a sequence of lists which are mutable. According to the Python documentation, immutable containers are hashable only if their elements are hashable. So, some_tuple
is not hashable, and hence it cannot be used as a key! For a better explanation, check out the docs.
Iterating over Dictionaries
We can iterate over the keys of a dictionary:
squares.keys()
returns a sequence of keys over which we can iterate. Python makes things even more simple and lets us drop the keys
method.
We can also iterate over the key-value pairs in a dictionary:
Growing a Dictionary
An empty dictionary can be defined in one of the following ways:
Let us now solve the following problem:
Ceate a dictionary from a list of words that maps words to their lengths.
Mutability
Like lists dictionaries are mutable objects. To see the mutability of dict
objects in action, consider the following code:
dict_1 = {'one': 1, 'two': 2, 'three': 3}
dict_2 = dict_1
dict_2['four'] = 4
print(dict_1, dict_2)
print(dict_1 is dict_2)
We see that dict_2
is alias of dict_1
and both point to the same object. If we want a new dict
object with the same contents as dict_1
, we could either use the copy()
method or the dict
built-in function:
dict_1 = {'one': 1, 'two': 2, 'three': 3}
dict_2 = dict_1.copy() # dict(dict_1) also works
dict_2['four'] = 4
print(dict_1, dict_2)
print(dict_1 is not dict_2)
The last line prints True
which confirms that we have two different objects. So modifying one doesn't affect the other. But note that copy()
only produces a shallow copy. As long as the values are immutable, this doesn't matter. But if we have mutable values, then we have a problem:
dict_1 = {'one': [1], 'two': [1, 1], 'three': [1, 1, 1]}
dict_2 = dict_1.copy()
dict_2['one'].append(100)
print(dict_1, dict_2)
print(dict_1 is not dict_2)
print(dict_1['one'] is dict_2['one'])
Here, we see that the value corresponding to the key 'one'
in both dictionaries gets affected. This is because dict_1['one']
and dict_2['one']
are still the same object. This can be seen from the last statement of the code given above. To set this right, we need to do a deepcopy: