
Section 6.2 Minimum spanning trees and forests

We revisit the original motivation for the text: the minimum spanning tree. To begin we will need a precise definition.

Definition 6.2.1. Weighted graphs.

A weighted graph is an ordered triple \(\Gamma=(V,E,w)\) where \(V\) is a set of vertices, \(E\) is a set of edges, and \(w:E\to \Reals\) is an edge-weight function.
A weighted digraph is an ordered triple \(\arr{\Gamma}=(V,A,w)\) where the edge set \(E\) is replaced by a set \(A\) of ordered pairs of vertices, which are called arrows or directed edges. The weight function is then \(w:A\to\Reals\text{.}\)
The problem of finding a spanning tree in a connected graph is a special case of the problem of finding a minimum spanning tree in a connected weighted graph, since taking the weight function to be constant makes every spanning tree minimal. There are two algorithms commonly used to teach this topic: the Jarník-Prim algorithm and Kruskal’s algorithm, the latter of which will produce a minimum spanning forest consisting of a minimum spanning tree of each component of the original graph.

Subsection 6.2.1 Jarník-Prim Algorithm

The first algorithm we will explore to build a minimum spanning tree was developed by Czech mathematician Vojtĕch Jarník in 1930 and then independently rediscovered by Robert Prim in 1957 and Edsger Dijkstra in 1959. The weight of each edge is interpreted as the cost of adding that edge to the tree. Unfortunately, without a more sophisticated data structure than we have previously discussed, we must implement the algorithm in a sub-optimal way. Using arrays or lists the computational complexity of the algorithm is \(O(\abs{V}^2)\text{;}\) optimal strategies involve the use of heaps.
The idea of the algorithm is very simple, and the implementation can be made simple at the cost of a longer run time. Suppose there is a finite weighted graph \(\Gamma=(V,E,w)\) with weight function \(w:E\to\mathbb{R}^+\) and a cost function \(C:V\to\mathbb{R}^+\cup\set{\infty}\) such that initially \(C(v)=\infty\) for all \(v\in V\text{.}\) Choose a vertex \(v_0\in V\) and set \(C(v_0) = 0\text{,}\) and let \(V(T) = \set{v_0}\) be the ordered set of vertices of a tree.
The algorithm proceeds from here inductively: suppose \(u\) is the vertex most recently added to \(V(T)\) and let \(N(u)\) be the neighbors of \(u\) in \(\Gamma\) which are not already vertices of \(T\text{.}\) For each \(v\in N(u)\text{,}\) let \(C(v) = \min(C(v),w(uv))\text{;}\) that is, \(C(v)\) records the least weight of an edge joining \(v\) to the tree built so far. Whenever the cost of a vertex \(v\) is lowered in this way, record \(u\) as the predecessor of \(v\text{.}\) After so considering each vertex, any vertex of \(V\setminus V(T)\) of least cost is the next vertex to add to \(V(T)\text{,}\) joined by the edge to its predecessor. As long as the original graph is connected, every vertex will eventually be added to \(V(T)\text{,}\) and since each new vertex is joined to \(V(T)\) by exactly one edge, the result is necessarily a tree.
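The inductive step above can be sketched directly with lists and dictionaries. The following implementation is a minimal sketch, not code from the text (the function and variable names are our own); it stores the graph as adjacency dictionaries, takes edges as \((u,v,w)\) triples, and finds the cheapest unadded vertex by a linear scan, giving the \(O(\abs{V}^2)\) behavior described above.

```python
import math

def jarnik_prim(verts, wedges, v0):
    # Adjacency map: adj[u][v] = w(uv).
    adj = {v: {} for v in verts}
    for u, v, w in wedges:
        adj[u][v] = w
        adj[v][u] = w

    cost = {v: math.inf for v in verts}   # C(v): cheapest known attachment cost
    pred = {v: None for v in verts}       # predecessor of v in the tree
    cost[v0] = 0
    in_tree = set()
    tree_edges = []
    while len(in_tree) < len(verts):
        # Choose a cheapest vertex not yet in the tree (O(|V|) scan).
        u = min((v for v in verts if v not in in_tree), key=cost.get)
        if cost[u] == math.inf:
            break  # remaining vertices are unreachable
        in_tree.add(u)
        if pred[u] is not None:
            tree_edges.append((pred[u], u, cost[u]))
        # Relax: each neighbor v of u may now be attached for w(uv).
        for v, w in adj[u].items():
            if v not in in_tree and w < cost[v]:
                cost[v] = w
                pred[v] = u
    return tree_edges
```

Ties in the scan are broken by the order of `verts`, matching the convention of adding the vertex of least index.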

Example 6.2.2. An example applying the Jarník-Prim algorithm.

Consider a graph with the following weighted edges:
\(v_iv_j\) \(w(v_iv_j)\) \(v_iv_j\) \(w(v_iv_j)\) \(v_iv_j\) \(w(v_iv_j)\) \(v_iv_j\) \(w(v_iv_j)\)
\(v_{0}v_{3} \) 1 \(v_{0}v_{4} \) 2 \(v_{0}v_{5} \) 4 \(v_{0}v_{8} \) 1
\(v_{0}v_{9} \) 9 \(v_{1}v_{2} \) 5 \(v_{1}v_{6} \) 10 \(v_{1}v_{8} \) 4
\(v_{2}v_{6} \) 8 \(v_{2}v_{7} \) 5 \(v_{2}v_{8} \) 7 \(v_{3}v_{4} \) 5
\(v_{3}v_{7} \) 7 \(v_{3}v_{9} \) 9 \(v_{4}v_{8} \) 9 \(v_{4}v_{9} \) 7
\(v_{5}v_{7} \) 9 \(v_{5}v_{8} \) 5 \(v_{5}v_{9} \) 7 \(v_{6}v_{8} \) 10
We will assume that the first vertex to be added is \(v_0\text{,}\) so that \(C(v_0)=0\) and \(C(v_i)=\infty\) for each \(i=1, 2, \dotsc, 9\text{.}\) Adding \(v_0\) to \(V(T)\) updates the cost and predecessor functions to the following:
\(i\) 0 1 2 3 4 5 6 7 8 9
\(C(v_i)\) 0* \(\infty\) \(\infty\) 1 2 4 \(\infty\) \(\infty\) 1 9
\(P(v_i)\) * ? ? 0 0 0 ? ? 0 0
The notation \(C(v_0)=0^*\) indicates that the vertex has already been added and does not need to be considered. Consulting the table, the least cost among the remaining vertices is 1, achieved by both \(v_3\) and \(v_8\text{;}\) of these, the vertex of least index, \(v_3\text{,}\) is added next.
The neighbors of \(v_3\) not already in the tree are \(v_4\text{,}\) \(v_7\text{,}\) and \(v_9\text{,}\) but the cost to add \(v_4\) with predecessor \(v_3\) would be \(w(v_3v_4)=5>2=C(v_4)\text{,}\) and likewise \(w(v_3v_9)=9\geq 9=C(v_9)\text{;}\) hence the only change is to vertex \(v_7\text{,}\) which receives cost \(C(v_7)=w(v_3v_7)=7\) and predecessor \(v_3\text{.}\)
\(i\) 0 1 2 3 4 5 6 7 8 9
\(C(v_i)\) 0* \(\infty\) \(\infty\) 1* 2 4 \(\infty\) 7 1 9
\(P(v_i)\) * ? ? 0 0 0 ? 3 0 0
Proceeding in this manner, we generate the cost table.
Table 6.2.3. Table of iterations of the cost function.
Step \(v_0\) \(v_1\) \(v_2\) \(v_3\) \(v_4\) \(v_5\) \(v_6\) \(v_7\) \(v_8\) \(v_9\)
0 0* \(\infty\) \(\infty\) \(\infty\) \(\infty\) \(\infty\) \(\infty\) \(\infty\) \(\infty\) \(\infty\)
1 0* \(\infty\) \(\infty\) 1* 2 4 \(\infty\) \(\infty\) 1 9
2 0* \(\infty\) \(\infty\) 1* 2 4 \(\infty\) 7 1* 9
3 0* 4 7 1* 2* 4 10 7 1* 9
4 0* 4* 7 1* 2* 4 10 7 1* 7
5 0* 4* 5 1* 2* 4* 10 7 1* 7
6 0* 4* 5* 1* 2* 4* 10 7 1* 7
7 0* 4* 5* 1* 2* 4* 8 5* 1* 7
8 0* 4* 5* 1* 2* 4* 8 5* 1* 7*
9 0* 4* 5* 1* 2* 4* 8* 5* 1* 7*
The last change to the cost function occurs at step 7; after this point the remaining vertices are simply added in order of least cost. Next we produce the table of predecessors.
Table 6.2.4. Table of iterations of the predecessor function.
Step \(v_0\) \(v_1\) \(v_2\) \(v_3\) \(v_4\) \(v_5\) \(v_6\) \(v_7\) \(v_8\) \(v_9\)
0 * ? ? ? ? ? ? ? ? ?
1 * ? ? 0 0 0 ? ? 0 0
2 * ? ? 0 0 0 ? 3 0 0
3 * 8 8 0 0 0 8 3 0 0
4 * 8 8 0 0 0 8 3 0 4
5 * 8 1 0 0 0 8 3 0 4
6 * 8 1 0 0 0 8 3 0 4
7 * 8 1 0 0 0 2 2 0 4
8 * 8 1 0 0 0 2 2 0 4
9 * 8 1 0 0 0 2 2 0 4
Reading off the final row, the minimum spanning tree consists of the edges \(v_0v_3\text{,}\) \(v_0v_4\text{,}\) \(v_0v_5\text{,}\) \(v_0v_8\text{,}\) \(v_1v_8\text{,}\) \(v_1v_2\text{,}\) \(v_2v_6\text{,}\) \(v_2v_7\text{,}\) and \(v_4v_9\text{,}\) of total weight 37.

Subsection 6.2.2 Min Heaps

Definition 6.2.5.

A min heap is a special data structure representing a rooted tree, with the condition that the key of a parent vertex is always less than or equal to the keys of its children.
Heaps provide extremely efficient algorithms whenever repeated access to a minimal (or maximal, for a max heap) element of a data set is necessary. Hence they are the optimal data structure for Dijkstra’s algorithm, and they provide a very efficient data structure for storing the vertices of a graph when using the Jarník-Prim Algorithm to produce a minimum spanning tree of a connected graph.
There is a standard way to develop a heap class representing a binary min heap. The data is stored in a list such that the children of the element stored in index \(i\) are stored in indices \(2i+1\) and \(2i+2\text{,}\) and with every insertion into or removal from the heap the list is rearranged to restore this heap condition. While this is of course a fascinating data structure, the implementation of a heap to produce an optimized version of this algorithm, or any other algorithm, is beyond the current scope of this text.
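Although implementing a heap is beyond our scope, Python's standard library already provides one: the heapq module maintains the heap condition on exactly such a list. As an illustration (the function below is a sketch of our own, not code from the text), the Jarník-Prim algorithm can be driven by a heap of candidate edges, discarding stale entries lazily as they are popped.

```python
import heapq
from itertools import count

def prim_heap(adj, v0):
    # adj maps each vertex to a dict of neighbor -> weight.
    in_tree = set()
    tree_edges = []
    tick = count()  # tiebreaker so vertices are never compared directly
    heap = [(0, next(tick), v0, None)]  # entries: (cost, tick, vertex, predecessor)
    while heap and len(in_tree) < len(adj):
        w, _, v, p = heapq.heappop(heap)
        if v in in_tree:
            continue  # stale entry: v was already attached more cheaply
        in_tree.add(v)
        if p is not None:
            tree_edges.append((p, v, w))
        # Every edge leaving the tree through v becomes a candidate.
        for nb, wt in adj[v].items():
            if nb not in in_tree:
                heapq.heappush(heap, (wt, next(tick), nb, v))
    return tree_edges
```

Each edge is pushed and popped at most twice, so this sketch runs in \(O(\abs{E}\log\abs{V})\) time rather than \(O(\abs{V}^2)\text{.}\)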

Subsection 6.2.3 Kruskal’s algorithm

Joseph Kruskal first published his solution to the minimum spanning tree problem in 1956; again, this is a wonderful, almost intuitive algorithm, which hinges on a clever use of mathematical induction. For simplicity we take the weight function to be non-negative, \(w:E\to \set{x\geq 0:x\in \mathbb{R}}\text{,}\) although the algorithm only ever compares weights and so works for arbitrary real weights as well.

The algorithm rests on the following fact: if \(\Gamma=(V,E)\) is a forest and \(u_0,u_1\in V\) lie in different components of \(\Gamma\text{,}\) then \(\Gamma'=(V,E\cup\set{u_0u_1})\) is also a forest.

Proof.

Suppose \(\Gamma=(V,\set{~})\) is a totally disconnected graph, and let \(u_0,u_1\in V\text{.}\) Then \(u_0\) and \(u_1\) are (trivially) in separate components, and the graph \(\Gamma'=(V,\set{u_0u_1})\) contains no cycles, since a graph with only one edge cannot contain a cycle.
Suppose \(\Gamma=(V,E)\) is a forest and \(u_0,u_1\in V\) are vertices in different components of \(\Gamma\text{.}\) Assume, towards a contradiction, that \(\Gamma'=(V,E\cup\set{u_0u_1})\) contains a cycle. Then there is a cycle \(C=(v_0,e_1,v_1,e_2,v_2,\ldots,v_{n-1},e_n,v_n=v_0)\) where \(v_i\in V\) and \(v_i\neq v_j\) when \(i\neq j\) and \(\set{i,j}\neq\set{0,n}\text{.}\) If \(u_0u_1\neq e_j\) for every \(j\in\set{1,2,\ldots,n}\text{,}\) then \(C\) is a cycle in \(\Gamma\) itself, contradicting the assumption that \(\Gamma\) is a forest. On the other hand, if \(u_0u_1\in \set{e_1,e_2,\ldots,e_n}\text{,}\) then we may relabel the cycle \(C=(v_0=u_0, e_1, v_1, e_2, v_2, \ldots, v_{n-1}=u_1, e_n=u_0u_1, v_n=u_0=v_0)\text{.}\) But then \((v_0=u_0, e_1, v_1, e_2, v_2, \ldots, v_{n-1}=u_1)\) is a \(u_0,u_1\)-path in \(\Gamma\text{,}\) contradicting the assumption that \(u_0\) and \(u_1\) are in different components of \(\Gamma\text{.}\) In either case, we have a contradiction.
Hence we have an immediate method of producing a spanning forest: include any edge whose endpoints lie in different components of the forest built so far. More importantly, we have all that is necessary for Kruskal’s algorithm: if the graph is weighted, the edge of minimal weight which does not introduce a cycle is added in every iteration.

Example 6.2.8. Applying Kruskal’s Algorithm.

Consider the graph with the following weighted edges:
Weight Edge Weight Edge Weight Edge Weight Edge
1 \(v_{0}v_{3}\) 4 \(v_{1}v_{8}\) 7 \(v_{2}v_{8}\) 9 \(v_{0}v_{9}\)
1 \(v_{4}v_{8}\) 5 \(v_{1}v_{2}\) 7 \(v_{3}v_{7}\) 9 \(v_{3}v_{9}\)
2 \(v_{0}v_{4}\) 5 \(v_{2}v_{7}\) 7 \(v_{4}v_{9}\) 9 \(v_{5}v_{7}\)
2 \(v_{0}v_{8}\) 5 \(v_{3}v_{4}\) 7 \(v_{5}v_{9}\) 10 \(v_{1}v_{6}\)
4 \(v_{0}v_{5}\) 5 \(v_{5}v_{8}\) 8 \(v_{2}v_{6}\) 10 \(v_{6}v_{8}\)
The graph \(\Gamma\) with this edge set on \(V=\set{v_0,v_1,\ldots,v_9}\) and weight function \(w\) is depicted in Figure 6.2.9. To begin applying Kruskal’s algorithm to find a spanning tree of \(\Gamma\) with minimal weight, we assign each vertex to its own component; thus, the partition of the vertices is
\begin{equation*} \mathcal{C} = \set{C_0=\set{v_0}, C_1=\set{v_1}, C_2=\set{v_2},\ldots,C_{9}=\set{v_9}}. \end{equation*}
Figure 6.2.9. A weighted graph \(\Gamma\text{.}\)
For the sake of abbreviated notation, we define the function \(c:V\to\set{0,1,\ldots,9}\) such that \(c(u)=i\) if and only if \(u\in C_i\text{.}\) The algorithm proceeds as follows: the edge of least weight is \(v_0v_3\text{,}\) with \(w(v_0v_3) = 1\text{.}\) Since \(c(v_0)=0\neq 3=c(v_3)\text{,}\) the edge \(v_0v_3\) can be included in the tree. We merge the component \(C_0\) into \(C_3\text{,}\) since the two components are of equal size and \(0 < 3\text{;}\) hence we make \(C_0=\set{}\) and \(C_3=\set{v_0,v_3}\text{.}\) The next edge of least weight is \(v_4v_8\text{,}\) also with \(w(v_4v_8)=1\text{;}\) since \(c(v_4)=4\neq 8=c(v_8)\text{,}\) we merge \(C_4\) into \(C_8\) just as we did \(C_0\) into \(C_3\text{.}\) The process is tabulated below.
Table 6.2.10. Tabulation of steps of Kruskal’s Algorithm
Step Weight \(u\) \(c(u)\) \(v\) \(c(v)\) Include \(uv\text{?}\)
0 1 \(v_{0}\) 0 \(v_{3}\) 3 True
1 1 \(v_{4}\) 4 \(v_{8}\) 8 True
2 2 \(v_{0}\) 3 \(v_{4}\) 8 True
3 2 \(v_{0}\) 8 \(v_{8}\) 8 False
4 4 \(v_{0}\) 8 \(v_{5}\) 5 True
5 4 \(v_{1}\) 1 \(v_{8}\) 8 True
6 5 \(v_{1}\) 8 \(v_{2}\) 2 True
7 5 \(v_{2}\) 8 \(v_{7}\) 7 True
8 5 \(v_{3}\) 8 \(v_{4}\) 8 False
9 5 \(v_{5}\) 8 \(v_{8}\) 8 False
10 7 \(v_{2}\) 8 \(v_{8}\) 8 False
11 7 \(v_{3}\) 8 \(v_{7}\) 8 False
12 7 \(v_{4}\) 8 \(v_{9}\) 9 True
13 7 \(v_{5}\) 8 \(v_{9}\) 8 False
14 8 \(v_{2}\) 8 \(v_{6}\) 6 True
15 9 \(v_{0}\) 8 \(v_{9}\) 8 False
16 9 \(v_{3}\) 8 \(v_{9}\) 8 False
17 9 \(v_{5}\) 8 \(v_{7}\) 8 False
18 10 \(v_{1}\) 8 \(v_{6}\) 8 False
19 10 \(v_{6}\) 8 \(v_{8}\) 8 False
Figure 6.2.11. The same graph \(\Gamma\) with the minimum spanning tree discovered via Kruskal’s algorithm highlighted with red edges.

Subsubsection 6.2.3.1 Implementing Kruskal’s algorithm

In a very naive approach to Kruskal’s algorithm, the function \(c(v)\) reporting the component of vertex \(v\) is not implemented, which requires two searches through the components for each edge. We avoid this at the cost of storing both the component of each vertex and the list of vertices in each component.
def kruskal(verts, wedges, verbose=False):
    # Sort edges by weight, breaking ties by vertex labels.
    WE = sorted((w, u, v) for u, v, w in wedges)

    # comp_of[v] is the label of the component containing v;
    # vert_of[c] lists the vertices currently in component c.
    comp_of = {v: v for v in verts}
    vert_of = {v: [v] for v in verts}

    keep = []      # edges included in the spanning forest
    dump = []      # edges rejected because they would close a cycle
    out_grid = []
    for w, u, v in WE:
        cu, cv = comp_of[u], comp_of[v]
        out_grid.append([w, u, cu, v, cv, cu != cv])
        if cu == cv:
            dump.append((u, v, w))
            continue
        keep.append((u, v, w))
        # Merge the smaller component into the larger.
        if len(vert_of[cu]) > len(vert_of[cv]):
            cu, cv = cv, cu
        vert_of[cv] += vert_of[cu]
        while len(vert_of[cu]) > 0:
            comp_of[vert_of[cu].pop()] = cv
    if verbose:
        grid = [["Weight", "u", "c(u)", "v", "c(v)", "Include?"]]
        grid += [[str(entry) for entry in row] for row in out_grid]
        widths = [max(len(row[c]) for row in grid) for c in range(len(grid[0]))]
        for row in grid:
            print(" | ".join(entry.rjust(widths[c]) for c, entry in enumerate(row)))
    return keep, dump
Listing 6.2.12. A Python implementation of Kruskal’s algorithm.
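The same component bookkeeping can also be handled by a disjoint-set forest with union by size and path halving, the structure most commonly paired with Kruskal’s algorithm in practice. The following is a minimal sketch of our own (the function name and structure are illustrative, not from the text); it returns only the kept edges.

```python
def kruskal_uf(verts, wedges):
    # parent[v] is v's parent in the disjoint-set forest; roots are their own parents.
    parent = {v: v for v in verts}
    size = {v: 1 for v in verts}

    def find(v):
        # Walk to the root, halving the path along the way.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    keep = []
    for w, u, v in sorted((w, u, v) for u, v, w in wedges):
        ru, rv = find(u), find(v)
        if ru == rv:
            continue  # u and v already share a component; uv would close a cycle
        if size[ru] > size[rv]:
            ru, rv = rv, ru
        parent[ru] = rv  # union by size: hang the smaller root under the larger
        size[rv] += size[ru]
        keep.append((u, v, w))
    return keep
```

With these two optimizations each find takes nearly constant amortized time, so the sort dominates the running time.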