
Section 5.2 Project: Gauss-Jordan Elimination

Given an arbitrary matrix \(A\in\Mats{m,n}\text{,}\) Gauss-Jordan elimination (GJE) is the best general-purpose algorithm for producing a matrix \(A'\in\Mats{m,n}\) which is row-equivalent to \(A\) and is in reduced row-echelon form. The algorithm proceeds iteratively through the matrix by identifying a pivot position, scaling the row containing that pivot position so that the pivot position contains a 1, and then using that 1 to eliminate all other nonzero entries in the pivot column. The variations among different types of GJE are referred to as pivoting strategies: algorithms which choose the optimal row to swap into the preferred row for pivoting. In fact, pivot selection and the swap of an entry into the pivot position is the only place where the many variants differ, since it is the only step in which a choice may be made without changing the final result of the algorithm.
We begin with the algorithm which seems to most easily satisfy the four points of Definition 5.1.10: rows are swapped only when the pivot position contains a zero.

Subsection 5.2.1 Gauss-Jordan elimination with naive pivoting

In the algorithms in this section, it is clearest to speak of performing an elementary row operation on a matrix as an in-place change of that matrix. This avoids the awkwardness of tracking the index of the last element of a sequence as the sequence is being built. In implementation it is simple enough to perform these changes on a copy of the original matrix, or even on a copy of the list-of-lists making up the data of the matrix. We use the notation \(A\xmapsto{\phi} A\) to indicate that \(A\) is transformed by the operation \(\phi\text{;}\) this collapses the two steps of \(A\xto{\phi}A'\) followed by reassigning \(A\) the value of \(A'\) into a single step of notation.
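To make the process concrete, here is a minimal sketch of the naive algorithm acting in place on a plain list-of-lists rather than on an AlgoMatrix; the function name naive_gje and the use of exact Fraction arithmetic are illustrative choices, not part of the project files.
from fractions import Fraction

def naive_gje(rows):
    """Return the RREF of rows (a list of lists) using naive pivoting."""
    rows = [[Fraction(x) for x in row] for row in rows]
    m, n = len(rows), len(rows[0])
    r, c = 0, 0
    while r < m and c < n:
        if rows[r][c] == 0:
            # Naive pivoting: swap only when the pivot position holds a zero.
            for r_prime in range(r + 1, m):
                if rows[r_prime][c] != 0:
                    rows[r], rows[r_prime] = rows[r_prime], rows[r]
                    break
            else:
                c += 1      # the whole subcolumn is zero; move to the next column
                continue
        pivot = rows[r][c]
        rows[r] = [x / pivot for x in rows[r]]      # scale the pivot row
        for r_prime in range(m):
            if r_prime != r:                        # eliminate the rest of the column
                alpha = rows[r_prime][c]
                rows[r_prime] = [x - alpha * y
                                 for x, y in zip(rows[r_prime], rows[r])]
        r += 1
        c += 1
    return rows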

Remark 5.2.2.

It is frequently useful to perform a GJE algorithm on an augmented matrix \([A|B]\) where \(A\in\Mats{m,n}\) and \(B\in\Mats{m,q}\text{;}\) in this case the “augmentation line” is treated as the right edge of the matrix for the purposes of halting: pivot selection never continues once \(c\geq n\text{,}\) as the algorithm terminates at that point.
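For instance, using the gje method of Activity 5.2 below on an AlgoMatrix named augmented which represents \([A\mid B]\) with \(A\in\Mats{m,n}\text{,}\) the call would restrict pivoting to the coefficient columns (the variable names here are hypothetical):
rref_aug = augmented.gje(ncols=n)    # pivot positions are chosen only among the first n columns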

Subsection 5.2.2 Swamping: the computational flaw in naive pivoting

When carrying out computational algorithms we must frequently recall that an algorithm which works perfectly on paper can fail catastrophically when its calculations over \(\Comps\) are replaced by calculations over the floating-point approximations to \(\Comps\text{.}\) Consider the following small example.

Example 5.2.3. Calculations in exact arithmetic.

Let
\begin{align*} A \amp= \begin{bmatrix}10^{-20}\amp 1 \\ 1 \amp 2\end{bmatrix} \amp \vec{b} \amp= \cvec{1\\4} \end{align*}
and consider the SLE with augmented matrix \([A\mid\vec{b}]\text{.}\) Then following the naive algorithm and using exact arithmetic, we have
\begin{align*} [A\mid \vec{b}] \xto{\rscale{10^{20}}{0}} \amp\begin{bmatrix} 1 \amp 10^{20} \amp 10^{20} \\ 1 \amp 2 \amp 4 \end{bmatrix}\\ \xto{\relim{-1}{0}{1}} \amp\begin{bmatrix} 1 \amp 10^{20} \amp 10^{20} \\ 0 \amp 2-10^{20} \amp 4-10^{20} \end{bmatrix}\\ \xto{\rscale{1/(2-10^{20})}{1}} \amp\begin{bmatrix} 1 \amp 10^{20} \amp 10^{20} \\ 0 \amp 1 \amp \frac{4-10^{20}}{2-10^{20}} \end{bmatrix}\\ \xto{\relim{-10^{20}}{1}{0}} \amp\begin{bmatrix} 1 \amp 0 \amp 10^{20}\left(1- \frac{4-10^{20}}{2-10^{20}} \right) \\ 0 \amp 1 \amp \frac{4-10^{20}}{2-10^{20}} \end{bmatrix} \end{align*}
If our solution vector is \(\vec{x}\text{,}\) then we see
\begin{align*} \entry{\vec{x}}{0} \amp= 10^{20}\left(1- \frac{4-10^{20}}{2-10^{20}} \right)\\ \amp= 10^{20}\left(\frac{2-10^{20}-4+10^{20}}{2-10^{20}} \right)\\ \amp = 2\left(\frac{10^{20}}{10^{20}-2}\right)\\ \amp= 2\left(\frac{100000000000000000000}{99999999999999999998}\right) \approx 2\\ \end{align*}

and

\begin{align*} \entry{\vec{x}}{1} \amp= \frac{10^{20}-4}{10^{20}-2}\\ \amp = \frac{99999999999999999996}{99999999999999999998} \approx 1\text{.} \end{align*}
Unfortunately, the 64-bit floating-point arithmetic in modern use can only detect relative differences larger than about \(10^{-15}\text{;}\) differences smaller than about \(10^{-15}\) times the scale of a number are lost to rounding. Hence in floating point arithmetic, we obtain staggeringly different results with the naive algorithm.
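A short check with ordinary Python floats (independent of the AlgoMatrix class) shows this loss of information directly:
print(1e20 + 2.0 == 1e20)                             # True: the 2.0 is absorbed entirely
print(2.0 - 1e20)                                     # -1e+20: rounded, the 2.0 is lost
print(1e20 * (1.0 - (4.0 - 1e20) / (2.0 - 1e20)))     # 0.0: the value computed for x_0 above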

Example 5.2.4. Calculations in floating point arithmetic.

Let
\begin{align*} A \amp= \begin{bmatrix} 1.0\times 10^{-20}\amp 1.0 \\ 1.0 \amp 2.0\end{bmatrix} \amp \vec{b} \amp= \cvec{1.0\\4.0} \end{align*}
and consider the SLE with augmented matrix \([A\mid\vec{b}]\text{.}\) Then following the naive algorithm and using floating point arithmetic, we have
\begin{align*} [A\mid \vec{b}] \xto{\rscale{10^{20}}{0}} \amp\begin{bmatrix} 1.0 \amp 1.0\times 10^{20} \amp 1.0\times 10^{20} \\ 1.0 \amp 2.0 \amp 4.0 \end{bmatrix}\\ \xto{\relim{-1.0}{0}{1}} \amp\begin{bmatrix} 1.0 \amp 1.0\times 10^{20} \amp 1.0\times 10^{20} \\ 0.0 \amp -1.0\times 10^{20} \amp -1.0\times 10^{20} \end{bmatrix}\\ \xto{\rscale{-10^{-20}}{1}} \amp\begin{bmatrix} 1.0 \amp 1.0\times 10^{20} \amp 1.0\times 10^{20} \\ 0 \amp 1.0 \amp 1.0 \end{bmatrix}\\ \xto{\relim{1.0\times 10^{20}}{1}{0}} \amp\begin{bmatrix} 1.0 \amp 0.0 \amp 0.0 \\ 0.0 \amp 1.0 \amp 1.0 \end{bmatrix}\\ \end{align*}

which gives a solution vector

\begin{align*} \vec{x} \amp= \cvec{0.0\\1.0}\text{.} \end{align*}
As the previous example shows, this is badly incorrect, even as an approximation.

Subsection 5.2.3 Gauss-Jordan elimination with partial pivoting

Fortunately the algorithmic modification which overcomes the problem of swamping is minor: the pivot position must be selected so that it is always the element of greatest absolute value in its subcolumn.
This slight change fixes the swamping even in our small example!
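In code, one way to phrase this selection for the current column c of a plain list-of-lists of rows (a sketch in our own variable names, not the AlgoMatrix interface) is the following.
# Among the rows at or below the current pivot row r, choose the row whose
# entry in column c has the greatest absolute value.
piv_row = max(range(r, m), key=lambda i: abs(rows[i][c]))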

Example 5.2.6. Drain the swamp!

Let
\begin{align*} A \amp= \begin{bmatrix} 1.0\times 10^{-20}\amp 1.0 \\ 1.0 \amp 2.0\end{bmatrix} \amp \vec{b} \amp= \cvec{1.0\\4.0} \end{align*}
and consider the SLE with augmented matrix \([A\mid\vec{b}]\text{.}\) Then following the partial pivoting algorithm and using floating point arithmetic, we must swap rows immediately, obtaining
\begin{align*} [A\mid \vec{b}] \xto{\rswap{0}{1}} \amp\begin{bmatrix} 1.0 \amp 2.0 \amp 4.0 \\ 1.0\times 10^{-20}\amp 1.0 \amp 1.0 \end{bmatrix}\\ \xto{\relim{-10^{-20}}{0}{1}} \amp\begin{bmatrix} 1.0 \amp 2.0 \amp 4.0 \\ 0.0 \amp 1.0 \amp 1.0 \end{bmatrix}\\ \xto{\relim{-2}{1}{0}} \amp\begin{bmatrix} 1.0 \amp 0.0 \amp 2.0 \\ 0.0 \amp 1.0 \amp 1.0 \end{bmatrix}\\ \end{align*}

giving a solution vector of

\begin{align*} \vec{x} \amp= \cvec{2.0\\1.0}\text{.} \end{align*}
This matches the approximate solution found in the exact case; we have lost some precision, but very, very little.
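The same computation can be replayed with ordinary Python floats (a sketch outside the AlgoMatrix class), confirming the result above:
top    = [1e-20, 1.0, 1.0]                              # row 0 of [A|b]
bottom = [1.0,   2.0, 4.0]                              # row 1 of [A|b]
top, bottom = bottom, top                               # rswap: |1.0| > |1e-20|
bottom = [x - 1e-20 * y for x, y in zip(bottom, top)]   # relim with alpha = -1e-20
top    = [x - 2.0 * y for x, y in zip(top, bottom)]     # relim with alpha = -2
print(top, bottom)                                      # [1.0, 0.0, 2.0] [0.0, 1.0, 1.0]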

Subsection 5.2.4 Implementing GJE with partial pivoting

In order to build a working GJE implementation, you need to have a fully functional AlgoMatrix class; everything is built upon that.

Activity 5.1. Implementing the elementary row operations.

We want to build these carefully so that AlgoMatrix has rswap, rscale, and relim methods, each returning the row-equivalent matrix produced by performing the appropriate operation.
(a) Swapping rows.
We first want to build a function which swaps elements in a list; if we can build such a function, it is an easy application to swap rows in an AlgoMatrix.
(i)
Suppose that grid is a list of \(m\) entries and \(0\leq r_1\lt r_2 \lt m\) are valid indices of grid. What is the slice of grid prior to \(r_1\text{?}\) What is the slice of grid containing only the \(r_1\)th entry of grid? What is the slice of grid between indices \(r_1\) and \(r_2\) but not containing either? What is the slice of grid containing only the \(r_2\)th element of grid? And finally, what is the slice of grid beyond the index \(r_2\) not containing the element in index \(r_2\text{?}\)
(ii)
How could you add together the slices from the preceding part to swap the elements in indices \(r_1\) and \(r_2\) of grid?
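If list slicing and concatenation are rusty, a throwaway example may help (the list and indices here are purely illustrative):
letters = ["a", "b", "c", "d", "e"]
print(letters[:1])                   # ['a']        everything before index 1
print(letters[1:2])                  # ['b']        only the entry at index 1
print(letters[2:3])                  # ['c']        strictly between indices 1 and 3
print(letters[3:4])                  # ['d']        only the entry at index 3
print(letters[4:])                   # ['e']        everything from index 4 on
print(letters[:1] + letters[3:4])    # ['a', 'd']   slices concatenate with +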
(iii)
Complete the following function in your aam_proj6.py file by setting out to the appropriate value in a single line.
def gswap(grid, r1, r2):
    """Return a list consisting of the contents of grid with elements
        in indices r1 and r2 swapped."""
    out = None
    return out
(iv)
Adapt the function gswap from the previous subtask to the following method in your AlgoMatrix.py class file.
def rswap(self, r1, r2):
    """Return the AlgoMatrix obtained from self by swapping rows
        with indices r1 and r2."""
    out = None
    return AlgoMatrix(out)
(b) Scaling a row.
Similarly to (a), we first work on a list.
(i)
Suppose that grid is a list of \(m\) entries and \(0\leq r_1\lt m\text{,}\) and suppose \(\alpha\in\Comps\text{.}\) Suppose further that the element of grid in index \(r_1\) is a container, like a list or tuple. How could you create a new version of that container whose elements have all been multiplied by \(\alpha\text{?}\)
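As a reminder, a list comprehension produces a new container with every element scaled (the tuple below is a throwaway example):
row = (3, 1, 4)
print([2 * x for x in row])    # [6, 2, 8]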
(ii)
Complete the following function in your aam_proj6.py file to produce the effect described in the function’s docstring.
def gscale(grid, alpha, r1):
    """Return a list whose elements are identical to those of grid,
        except in the index r1, which has been scaled by a factor
        of alpha."""
    temp_entry = [ ]
    for entry in grid[r1]:
        temp_entry.append( None )
    # Build out on the next line using slicing and addition.
    out = None
    return out
(iii)
Adapt the function gscale from the previous subtask to the following method in your AlgoMatrix.py class file.
def rscale(self, alpha, r1):
    """Return the AlgoMatrix obtained from self by scaling the row
        with index r1 by a factor of alpha."""
    out = None
    return AlgoMatrix(out)
(c) Eliminating one row using a scalar multiple of another.
This task is a combination of tasks (a) and (b).
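The built-in zip function, which appears in the skeleton below, pairs up corresponding entries of two containers:
print(list(zip((1, 2, 3), (10, 20, 30))))    # [(1, 10), (2, 20), (3, 30)]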
(i)
Complete the following function in your aam_proj6.py file to produce the effect described in the function’s docstring.
def gelim(grid, alpha, r1, r2):
    """Return a list whose elements are identical to those of grid,
        except in the index r2, which has been modified by adding to
        it term-by-term the elements of the entry in index r1 scaled
        by a factor of alpha."""
    temp_entry = [ ]
    for entry1, entry2 in zip(grid[r1], grid[r2]):
        temp_entry.append( None )
    # Build out on the next line using slicing and addition.
    out = None
    return out
(ii)
In your AlgoMatrix.py class file, complete the method below by adapting the gelim function.
def relim(self, alpha, r1, r2):
    """Return a row-equivalent matrix to self which is obtained
        by performing the elimination operation which adds to the
        row with index r2 term-by-term the elements of the row
        with index r1 scaled by alpha."""
    out = None
    return AlgoMatrix(out)
Now you have adequate tools to combine the row operations and implement the GJE algorithm as a method in the AlgoMatrix class.

Activity 5.2. Implement Gauss-Jordan Elimination with Partial Pivoting.

Complete the following in AlgoMatrix.py by replacing the well-marked comments with the appropriate code.
def gje(self, ncols=None):
    '''Perform Gauss-Jordan elimination with partial pivoting on the first
        ncols columns of self; if ncols is not specified, perform the
        algorithm on the entire matrix.'''
    m, n = self.dims()
    if ncols in range(n):
        max_col = ncols
    elif ncols is None:
        max_col = n
    else:
        raise ValueError("Invalid number of columns to left of augmentation.")
    r, c = 0, 0
    grid = AlgoMatrix(self._data)
    while r< m and c< max_col:
        # Pivot Selection
        temp_max = abs(grid[r, c])
        piv_row = r
        for r_prime in range(r+1, m):
            # Find the greatest modulus entry in the subcolumn
            if abs(grid[r_prime, c]) > temp_max:
                temp_max = abs(grid[r_prime, c])
                piv_row = r_prime
        if temp_max == 0:
            #
            # Something needs to happen here to move the process to the
            # next column; the continue line below makes the process
            # go back to the beginning of the while loop, which is
            # necessary. What must change to make the process move to
            # the next column? Replace this comment.
            #
            continue
        elif piv_row != r:
            # Swap rows, we found a bigger pivot!
            #
            # Remember that grid is an AlgoMatrix and that the
            # row operations produce a new row-equivalent matrix
            # rather than changing self.
            #
            grid = None
        if grid[r, c] != 1:
            # Scale pivot row
            alpha = grid[r, c]
            #
            # Remember that grid is an AlgoMatrix and that the
            # row operations produce a new row-equivalent matrix
            # rather than changing self.
            #
            grid = None
        for r_prime in range(m):
            # Eliminate column entries outside pivot row
            if r_prime == r: continue
            alpha = grid[r_prime, c]
            #
            # Remember that grid is an AlgoMatrix and that the
            # row operations produce a new row-equivalent matrix
            # rather than changing self.
            #
            grid = None
        # Step down and right
        r += 1
        c += 1
    return grid
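Once the placeholders are filled in, a quick check against Example 5.2.6 might look like the following; this assumes, as elsewhere in the project, that the AlgoMatrix constructor accepts a list of rows.
aug = AlgoMatrix([[1e-20, 1.0, 1.0],
                  [1.0,   2.0, 4.0]])
rref = aug.gje(ncols=2)    # reduce only the two coefficient columns
# rref should represent the matrix [[1.0, 0.0, 2.0], [0.0, 1.0, 1.0]]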

Activity 5.3. Identity matrices.

Write a function in aam_proj6.py which returns the identity AlgoMatrix of size cols.
def idmat(cols):
    """Return the identity matrix of size cols"""
    if type(cols) != int:
        raise TypeError("Bad size")
    elif cols <= 0:
        raise ValueError("Bad size")
    out = [ ]
    for r in range(cols):
        out.append( [ ] )
        for c in range(cols):
            if r==c:
                out[r].append(None)
            else:
                out[r].append(None)
    return AlgoMatrix(out)
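Once the placeholders are replaced, the function should behave as follows (assuming, as above, that the AlgoMatrix constructor accepts a list of rows):
I3 = idmat(3)
# The data wrapped by I3 should be [[1, 0, 0], [0, 1, 0], [0, 0, 1]].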

Activity 5.4. Inverse matrices.

Combine augmentation, Gauss-Jordan elimination, and the creation of an AlgoMatrix from a slice of the columns of another AlgoMatrix to add the following method to the AlgoMatrix class. Recall that when \(A\) is nonsingular, the RREF of the augmented matrix \([A\mid I]\text{,}\) where \(I\) is the identity matrix of the same size, is \([I\mid A^{-1}]\text{;}\) if the left block of the RREF is not the identity, then \(A\) is singular.
def inverse(self):
    """Return the inverse of self if self is a nonsingular square matrix."""
    m, n = self.dims()
    if m == n:
        # Build the correct augmented matrix
        augmentation = None
        bigger_mat = None
        # Find the RREF matrix row-equivalent to it.
        rref_bigger_mat = None
        # Extract the correct columns from it as slices
        left_submat = None
        right_submat = None
        # Return the answer
        if SOME_NICE_CONDITION:
            return AlgoMatrix(THE_CORRECT_INPUT)
        else:
            raise ValueError("Singular matrix")
    else:
        raise ValueError("Non-square matrices are not invertible.")