
Section A.3 Container data types

In Python, data types which can contain other data are called container data types. The three easiest general containers to work with are lists, tuples, and dictionaries. A typical use of a container is to store data collected in a repeated operation of a program.

Subsection A.3.1 The list data type

Lists serve as a general purpose container.

Technology A.3.1. Defining a list.

Any comma-separated sequence of syntactically valid expressions enclosed in a matching pair of square brackets [ and ] is a list. The empty list [] is a valid list.
>>> foo = "This is a string, not a list"
>>> bar = [1, 2, foo, [5, 5, 5]]
>>> type(bar)
<class 'list'>
>>> print( bar )
[1, 2, 'This is a string, not a list', [5, 5, 5]]
Listing A.3.2. Some list data
Since lists contain other data it is very important to be able to access that data. Python provides two related ways to access the data contained in a list: indexing and slicing.

Subsubsection A.3.1.1 Indexing and slicing

The index of an element of a list is the distance in the list from that element to the start of the list. The element at the start of the list is thus in index 0, the next element in index 1, the next in index 2, and so on. Thus a list consisting of 10 elements has indices 0 through 9. For a list the_list, we access the element in index \(i\) by using the_list[i]; this is similar to indexing a mathematical sequence as \((a_0,a_1,a_2,\dotsc,a_k)\text{.}\)
>>> the_list = ['bob', 'larry', 3.14159, 1-1j, 'fifth']
>>> the_list[0]
'bob'
Unlike a mathematical sequence, Python allows indexing from the back of the list as well. When indexing from the back of the list, it is important to remember that the same principle applies as above: the negative index is the distance from the “front” of an element to the back of the list. Continuing the above, we have:
>>> the_list[-2]
(1-1j)
>>> the_list[-1]
'fifth'
Since \(0 = -0\text{,}\) the negative zero index is the same as the zero index:
>>> the_list[-0]
'bob'
>>> the_list[0]
'bob'
To access a sublist of a list, we use a slice. The first number of the slice is the index of the element that starts the sublist, and the second number is the index of the first element not to include. To slice starting at index 1 and stopping before index 5 of the_list, we would use the_list[1:5]. Since the_list has only five elements, we will define a longer list to experiment with.
>>> new_list = [1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377,
... 610, 987, 1597, 2584, 4181, 6765, 10946]
>>> new_list[1:5]
[2, 3, 5, 8]
You can also skip indices while slicing by supplying a third term, the step.
>>> new_list[1:15:2]
[2, 5, 13, 34, 89, 233, 610]
Remark A.3.3.
Python allows an interesting behavior with regard to indexing and slicing. While you cannot index beyond the bounds of a list, you are always permitted to slice beyond the bounds of the list. Thus with the above code, the_list[8] would generate an IndexError, but new_list[50:100:3] would output the empty list, [].
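The difference described in this remark can be checked directly. The following is a minimal sketch that redefines the two lists from the listings above so it runs on its own; the names index_failed, beyond, and partial are introduced here purely for illustration.

```python
# Redefine the two lists from the listings above so this sketch is self-contained.
the_list = ['bob', 'larry', 3.14159, 1-1j, 'fifth']
new_list = [1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377,
            610, 987, 1597, 2584, 4181, 6765, 10946]

# Indexing past the end of a list raises IndexError, which we catch here.
try:
    the_list[8]
    index_failed = False
except IndexError:
    index_failed = True

# Slicing past the end quietly returns whatever part of the range exists.
beyond = new_list[50:100:3]   # no index in range, so the result is empty
partial = new_list[18:100]    # only indices 18 and 19 exist

print(index_failed)  # True
print(beyond)        # []
print(partial)       # [6765, 10946]
```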
Finally, if you want to slice from the beginning of the list up to, but not including, index \(j\), use new_list[:j]. If you want to slice from index \(i\) to the end of the list, use new_list[i:]. If you want a whole copy of a list (there are valid reasons to do this), use new_list[:]. If you want the reverse of a list, slice the whole thing using a skip of negative one: new_list[::-1].
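As a rough illustration of these shorthand slices, the sketch below uses a hypothetical shorter list named fib (not one of the lists defined earlier). It also hints at one reason a whole-list copy can be useful: changing the copy leaves the original untouched.

```python
# fib is a hypothetical short list used only for this illustration.
fib = [1, 2, 3, 5, 8, 13, 21, 34]

first_three = fib[:3]    # from the start, stopping before index 3
from_five = fib[5:]      # from index 5 through the end
whole_copy = fib[:]      # a new list with the same elements
backwards = fib[::-1]    # a reversed copy

# Changing the copy does not change the original list.
whole_copy[0] = 'changed'

print(first_three)  # [1, 2, 3]
print(from_five)    # [13, 21, 34]
print(backwards)    # [34, 21, 13, 8, 5, 3, 2, 1]
print(fib[0])       # 1
```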

Subsubsection A.3.1.2 Operators and methods

Indexing and slicing are just the beginnings of what can be done with lists. We can also combine two lists into a longer one (additively) or elongate a list by making it contain several copies of itself (multiplicatively). Continuing the above examples, we have:
>>> the_list + new_list
['bob', 'larry', 3.14159, (1-1j), 'fifth', 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610, 987, 1597, 2584, 4181, 6765, 10946]
>>> 3 * the_list
['bob', 'larry', 3.14159, (1-1j), 'fifth', 'bob', 'larry', 3.14159, (1-1j), 'fifth', 'bob', 'larry', 3.14159, (1-1j), 'fifth']
Remark A.3.4. List addition is not commutative.
Generally speaking, if a and b are lists, then a+b is different from b+a.
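A small sketch makes the remark concrete. The lists a and b below are arbitrary examples; the last line shows one case (adding an empty list) where the order happens not to matter, which is why the remark says "generally speaking".

```python
a = [1, 2]
b = [3, 4]

left = a + b    # elements of a first, then elements of b
right = b + a   # elements of b first, then elements of a

print(left)            # [1, 2, 3, 4]
print(right)           # [3, 4, 1, 2]
print(left == right)   # False

# With an empty list the two orders agree.
print(a + [] == [] + a)  # True
```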
These are operators which we are used to working with algebraically, now given new meaning to interact with lists. Methods are different: they are functions specific to the list data type accessed using dot notation. When working with a new data type, a good way to see the full list of all of its methods is to ask for help, but beware! The help files are very detailed. Try help(list) at the Python prompt.
Many of the methods (all of those named __likethis__) are special. We will discuss those later in the text when we discuss operator overloading in [provisional cross-reference: sec-overloading]. The list class has several other methods as well. Assuming that we are using the_list from above, here is a basic overview of several of the more useful methods:
  • the_list.append(x): Increases the length of the_list by sticking the value x at the end
  • the_list.count(x): Returns the number of times x occurs in the_list without changing the_list
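The two methods above might be exercised as follows. This is only a sketch; it redefines the_list so that it runs on its own, and the name result is introduced for illustration. Note that append changes the list in place and returns None rather than a new list.

```python
# Redefine the_list from the earlier listing so this sketch is self-contained.
the_list = ['bob', 'larry', 3.14159, 1-1j, 'fifth']

# append modifies the list in place; its return value is None.
result = the_list.append('bob')

print(result)                     # None
print(len(the_list))              # 6
print(the_list.count('bob'))      # 2
print(the_list.count('nobody'))   # 0
```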

Subsubsection A.3.1.3 Mutability

The property of mutability is special: if a numerically indexed container allows you to assign a new value into a particular index, then that container is mutable. If not, the container is immutable. Lists are a mutable data type, since we can make an assignment new_list[5] = 'reginald', so long as 5 is a valid index of new_list.
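Here is a minimal sketch of assignment by index, using a shortened stand-in for new_list. The flag out_of_range is a name made up for this example; it records that assigning to an index that does not exist raises an IndexError instead of lengthening the list.

```python
# A shortened stand-in for new_list, used only for this illustration.
new_list = [1, 2, 3, 5, 8, 13, 21, 34]

# Index 5 exists, so the assignment replaces the value stored there.
new_list[5] = 'reginald'
print(new_list)  # [1, 2, 3, 5, 8, 'reginald', 21, 34]

# Index 50 does not exist, so the assignment raises IndexError.
try:
    new_list[50] = 'too far'
    out_of_range = False
except IndexError:
    out_of_range = True

print(out_of_range)  # True
```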

Subsection A.3.2 The tuple data type

A tuple behaves very much like a list, except for two important points: tuples are enclosed in parentheses ( and ) instead of square brackets, and they are immutable. You cannot assign a new value to a particular index of a tuple without instead overwriting the whole tuple.
Since a tuple must be overwritten to be changed, several of the methods differ from those for the list class. The differences can be seen in a careful comparison of help(tuple) versus help(list).
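The sketch below illustrates both points under simple assumptions: the_tuple is a made-up example, and the comparison of method names uses dir as a quicker (if less detailed) alternative to reading the two help pages side by side.

```python
# the_tuple is a hypothetical example tuple.
the_tuple = ('bob', 'larry', 3.14159)

# Indexing and slicing work just as they do for lists.
print(the_tuple[0])    # bob
print(the_tuple[1:])   # ('larry', 3.14159)

# Assigning into an index fails because tuples are immutable.
try:
    the_tuple[0] = 'reginald'
    assignment_failed = False
except TypeError:
    assignment_failed = True
print(assignment_failed)  # True

# To "change" a tuple we build a new one and overwrite the old name.
the_tuple = ('reginald',) + the_tuple[1:]
print(the_tuple)       # ('reginald', 'larry', 3.14159)

# Ordinary (non-underscore) methods that list has but tuple lacks.
list_only = sorted(name for name in set(dir(list)) - set(dir(tuple))
                   if not name.startswith('_'))
print(list_only)
```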

Subsection A.3.3 The dict data type

In order to explain how a Python dict behaves, it is instructive to consider the mathematical definition of a function.

Definition A.3.5. Mathematical function.

A function \(f\) from a set \(X\) to a set \(Y\) is a rule which assigns to each value in the set \(X\) exactly one value from the set \(Y\text{.}\)
The set \(X\) is called the domain of \(f\) and the set \(Y\) is called the codomain of \(f\). The set \(\set{f(x):x\in X}\) is the image of \(f\), or sometimes the image of \(X\) under \(f\).
Hence if \(f:X\to Y\) is a function, \(x_1,x_2 \in X\), and \(y_1,y_2\in Y\) such that \(x_1 = x_2\), \(f(x_1)=y_1\), and \(f(x_2)=y_2\), then \(y_1=y_2\).
In Python, a dict acts very much like a simple mathematical function, where the domain is a set of hashable objects. The elements of the domain of a dict are called its keys and the elements of its image are called its values. Since the idea of a hashable object is complicated, a simpler rule of thumb is that the keys of a dict must be immutable: numbers, strings, and tuples whose elements are themselves immutable all work as keys, while lists, dicts, and sets do not.
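To make the guideline about keys concrete, here is a small sketch with a hypothetical dictionary named lookup. It shows a few keys that are accepted, a list being rejected as a key, and the function-like behavior that each key is paired with exactly one value.

```python
# lookup is a hypothetical dictionary used only for this illustration.
lookup = {'bob': 42, ('larry', 2): 7, 3.5: 'a float key'}

# Strings, numbers, and tuples of immutable items are all valid keys.
print(lookup[('larry', 2)])  # 7

# A list is mutable, hence not hashable, so it cannot be used as a key.
try:
    lookup[[1, 2]] = 'a list key'
    list_key_failed = False
except TypeError:
    list_key_failed = True
print(list_key_failed)  # True

# A tuple that contains a list is not hashable either.
try:
    hash((1, [2, 3]))
    nested_failed = False
except TypeError:
    nested_failed = True
print(nested_failed)  # True

# Like a function, a dict pairs each key with exactly one value:
# assigning to an existing key replaces the old value.
lookup['bob'] = 43
print(lookup['bob'])  # 43
print(len(lookup))    # 3
```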

Technology A.3.6. Defining a dict.

Dictionaries can be defined in two important ways.
  1. my_dict = {k1:v1, k2:v2, ..., k100:v100} would produce a dict with 100 paired keys and values.
  2. my_dict = dict( [(k1, v1), (k2, v2), ..., (k100, v100)] ) would produce the same dict.
Python dictionaries are mutable, and in fact once a dict has been defined, new key-value pairs can be added to the dictionary simply by assigning the new value to the correct key: my_dict[new_key] = new_value.
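Both ways of defining a dict can be tried on a small scale. The dictionaries below are made-up examples with three pairs instead of one hundred. The sketch also shows that assigning to a new key adds a pair, while assigning to an existing key replaces its value.

```python
# Two hypothetical dictionaries built from the same key-value pairs.
squares_literal = {1: 1, 2: 4, 3: 9}
squares_pairs = dict([(1, 1), (2, 4), (3, 9)])

# Both definitions produce equal dictionaries.
print(squares_literal == squares_pairs)  # True

# Assigning to a key that is not yet present adds a new pair.
squares_literal[4] = 16

# Assigning to a key that is already present replaces its value.
squares_literal[2] = 'four'

print(squares_literal)     # {1: 1, 2: 'four', 3: 9, 4: 16}
print(squares_literal[4])  # 16
```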

Subsection A.3.4 Sets are a special container

It is also possible to construct mathematical sets in Python, using the set(some_object) constructor. The argument some_object must be itself a container of hashable objects, much like the keys of a dict.

Technology A.3.8. Defining a set.

Usually the argument to set is a list or tuple, but a str is also permitted: iterating over a string yields its individual characters (strings of length 1), which are hashable. Otherwise, all of the normal mathematical properties expected of sets hold for the set class.
>>> a = set([1,2,3,4])
>>> b = set([1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4,4])
>>> print(a==b)
True
Listing A.3.9. Defining sets removes repeated elements
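As a further sketch, a few of the usual set operations can be written with Python's operators. The set c and the string 'banana' are examples chosen here; sorted is used only so that the printed order is predictable, since sets themselves are unordered.

```python
a = set([1, 2, 3, 4])
c = set([3, 4, 5])   # a second, hypothetical set

union = a | c          # elements in a or c
intersection = a & c   # elements in both a and c
difference = a - c     # elements in a but not in c

print(sorted(union))         # [1, 2, 3, 4, 5]
print(sorted(intersection))  # [3, 4]
print(sorted(difference))    # [1, 2]
print(3 in a)                # True

# Passing a string builds a set of its distinct characters.
letters = set('banana')
print(sorted(letters))       # ['a', 'b', 'n']
```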