In [1]:
#
import numpy as np
import scipy as sp
import pandas as pd
import matplotlib as mp
import matplotlib.pyplot as plt
import seaborn as sns
import sklearn
import laUtilities as ut
import slideUtilities as sl
import demoUtilities as dm
from matplotlib import animation
from importlib import reload
from datetime import datetime
from IPython.display import Image, display_html, display, Math, HTML;
qr_setting = None

mp.rcParams['animation.html'] = 'jshtml';

import warnings # ignore warning for Fig 2.8
warnings.filterwarnings('ignore')

Announcements¶

  • HW1 is out on Piazza, due Feb 3 at 8pm

  • Office hours tomorrow:

    • Peer tutor Rohan Anand, 1:30-3pm at CCDS 16th floor
    • Abhishek Tiwari, 3:30-4:30pm at CCDS 13th floor
  • Read Boyd-Vandenberghe Chapter 3.1-3.2 and Chapter 4

5. Gaussian Elimination, continued¶

Gaussian Elimination has two phases:

1. Elimination

Take a matrix $A$ and convert it to echelon form (or row echelon form) with the following three properties:

  1. All nonzero rows are above any rows of all zeros.
  2. Each leading entry of a row is in a column to the right of the leading entry of the row above it.
  3. All entries in a column below a leading entry are zeros.
$$\begin{bmatrix}
0 & \blacksquare & * & * & * & * & * & * & * & * \\
0 & 0 & 0 & \blacksquare & * & * & * & * & * & * \\
0 & 0 & 0 & 0 & \blacksquare & * & * & * & * & * \\
0 & 0 & 0 & 0 & 0 & \blacksquare & * & * & * & * \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & \blacksquare & * \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}$$

In this diagram, the leading entries $\blacksquare$ are nonzero, and the $*$ symbols can be any value.
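
These three properties are easy to check mechanically. Below is a minimal sketch (my own illustration, not part of the lecture code; the names leading_col and is_echelon are just for this example) of such a checker for a numpy array:

In [ ]:
import numpy as np

def leading_col(row):
    """Column index of the first nonzero entry in a row, or None if the row is all zeros."""
    nz = np.flatnonzero(~np.isclose(row, 0))
    return int(nz[0]) if nz.size else None

def is_echelon(A):
    """Check the three echelon-form properties for a 2-D array A."""
    A = np.asarray(A, dtype=float)
    leads = [leading_col(row) for row in A]
    # Property 1: no nonzero row may appear below an all-zero row.
    seen_zero_row = False
    for lead in leads:
        if lead is None:
            seen_zero_row = True
        elif seen_zero_row:
            return False
    # Properties 2 and 3: leading entries move strictly to the right,
    # which also forces zeros below every leading entry.
    pivots = [lead for lead in leads if lead is not None]
    return all(b > a for a, b in zip(pivots, pivots[1:]))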

2. Backsubstitution

Take a matrix in echelon form and convert it to reduced echelon form (or reduced row echelon form), which additionally has the following properties:

  1. The leading entry in each nonzero row is 1.
  2. Each leading 1 is the only nonzero entry in its column.
$$\begin{bmatrix}
0 & 1 & * & 0 & 0 & 0 & * & * & 0 & * \\
0 & 0 & 0 & 1 & 0 & 0 & * & * & 0 & * \\
0 & 0 & 0 & 0 & 1 & 0 & * & * & 0 & * \\
0 & 0 & 0 & 0 & 0 & 1 & * & * & 0 & * \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & * \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}$$

5.1 Examples of Gaussian Elimination¶

Let's do a few more examples of Gaussian elimination, this time where the equations have infinitely many solutions. Once the matrix is in reduced row echelon form, in order to categorize the set of solutions we must identify the basic variables and free variables.

Example 1¶

In our first example, let the input matrix $A$ be

$$\begin{bmatrix}
0 & 3 & 4 & -5 \\
3 & -7 & 8 & 9 \\
3 & -9 & 6 & 15
\end{bmatrix}$$

Stage 1 (Elimination)

Start with the first row ($i=1$). The leftmost nonzero entry in row 1 or any row below it is in column 1. But since row 1 has a zero in that column, we need a swap. We'll swap rows 1 and 3 (we could have swapped 1 and 2).

$$\begin{bmatrix}
\boxed{3} & -9 & 6 & 15 \\
3 & -7 & 8 & 9 \\
0 & 3 & 4 & -5
\end{bmatrix}$$

The pivot is shown in a box. Use row reduction operations to create zeros below the pivot. In this case, that means subtracting row 1 from row 2.

$$\begin{bmatrix}
3 & -9 & 6 & 15 \\
0 & \boxed{2} & 2 & -6 \\
0 & 3 & 4 & -5
\end{bmatrix}$$

Now $i=2$. The pivot is boxed (no need to do any swaps). Use row reduction to create zeros below the pivot. To do so we subtract $3/2$ times row 2 from row 3.

$$\begin{bmatrix}
3 & -9 & 6 & 15 \\
0 & 2 & 2 & -6 \\
0 & 0 & 1 & 4
\end{bmatrix}$$

Now $i=3$. Since it is the last row, we are done with Stage 1. The pivots are marked:

$$\begin{bmatrix}
\boxed{3} & -9 & 6 & 15 \\
0 & \boxed{2} & 2 & -6 \\
0 & 0 & \boxed{1} & 4
\end{bmatrix}$$

Stage 2 (Backsubstitution)

Starting again with the first row ($i=1$), divide row 1 by its pivot.

$$\begin{bmatrix}
1 & -3 & 2 & 5 \\
0 & 2 & 2 & -6 \\
0 & 0 & 1 & 4
\end{bmatrix}$$

Moving to the next row ($i=2$), divide row 2 by its pivot.

$$\begin{bmatrix}
1 & -3 & 2 & 5 \\
0 & 1 & 1 & -3 \\
0 & 0 & 1 & 4
\end{bmatrix}$$

Then use row reduction operations to create zeros in all positions above the pivot. In this case, that means adding 3 times row 2 to row 1.

$$\begin{bmatrix}
1 & 0 & 5 & -4 \\
0 & 1 & 1 & -3 \\
0 & 0 & 1 & 4
\end{bmatrix}$$

Moving to the next row ($i=3$), the pivot is already 1. So we subtract row 3 from row 2, and subtract 5 times row 3 from row 1.

$$\begin{bmatrix}
1 & 0 & 0 & -24 \\
0 & 1 & 0 & -7 \\
0 & 0 & 1 & 4
\end{bmatrix}$$

This matrix is in reduced row echelon form, so we are done with GE. The solution is:

$$x_1 = -24, \qquad x_2 = -7, \qquad x_3 = 4$$
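
As a sanity check (a small sketch using numpy, not part of the original lecture code), we can solve the same system directly; the coefficient matrix and right-hand side are read off from $A$:

In [ ]:
import numpy as np

# coefficient matrix and right-hand side from the augmented matrix above
coeffs = np.array([[0., 3., 4.],
                   [3., -7., 8.],
                   [3., -9., 6.]])
rhs = np.array([-5., 9., 15.])

# numpy's solver should agree with the hand computation: [-24, -7, 4]
print(np.linalg.solve(coeffs, rhs))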

Example 2¶

Let's assume that the augmented matrix of a system has been transformed into the equivalent reduced echelon form:

$$\begin{bmatrix}
1 & 0 & -5 & 1 \\
0 & 1 & 1 & 4 \\
0 & 0 & 0 & 0
\end{bmatrix}$$

This system is consistent. Is the solution unique?

The associated system of equations is

$$\begin{aligned}
x_1 - 5x_3 &= 1 \\
x_2 + x_3 &= 4 \\
0 &= 0
\end{aligned}$$

Variables $x_1$ and $x_2$ correspond to pivot columns. They are called basic variables. The other variable $x_3$ is a free variable.

Whenever a system is consistent, the solution set can be described explicitly by solving the reduced system of equations for the basic variables in terms of the free variables.

This operation is possible because the reduced echelon form places each basic variable in one and only one equation.

In the example, solve the first and second equations for $x_1$ and $x_2$. Ignore the third equation; it offers no restriction on the variables.

So the solution set is:

$$\begin{aligned}
x_1 &= 1 + 5x_3 \\
x_2 &= 4 - x_3 \\
x_3 & \text{ is free}
\end{aligned}$$

"$x_3$ is free" means you can choose any value for $x_3$.

In other words, there are infinitely many solutions to this linear system. Each solution corresponds to one particular value of $x_3$.

For instance,

  • when $x_3=0$, the solution is $(1,4,0)$;
  • when $x_3=1$, the solution is $(6,3,1)$.

These are parametric descriptions of solution sets. The free variables act as parameters.
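
To make the parametric description concrete, here is a small sketch (plain numpy, my own illustration rather than the lecture's plotting code) that samples a few values of the free variable $x_3$ and confirms that each resulting point satisfies both original equations:

In [ ]:
import numpy as np

# x1 - 5*x3 = 1  and  x2 + x3 = 4, with x3 free
for t in [0.0, 1.0, 2.0, -3.5]:
    x1, x2, x3 = 1 + 5 * t, 4 - t, t           # the parametric solution
    print((x1, x2, x3),
          np.isclose(x1 - 5 * x3, 1),           # first equation holds?
          np.isclose(x2 + x3, 4))               # second equation holds?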

So: solving a system amounts to either:

  • finding a parametric description of the solution set, or
  • determining that the solution set is empty.

Geometrically, the solution set to this system is a line in $\mathbb{R}^3$.

In [2]:
#
fig = ut.three_d_figure((3, 3), fig_desc = 'Solution set when one variable is free.',
                        xmin = -4, xmax = 4, ymin = 0, ymax = 8, zmin = -4, zmax = 4, 
                        qr = qr_setting)
plt.close()
eq1 = [1, 0, -5, 1]
eq2 = [0, 1, 1, 4]
fig.plotLinEqn(eq1, 'Brown')
fig.plotLinEqn(eq2, 'Green')
fig.plotIntersection(eq1, eq2, color='Blue')
fig.set_title('Solution set when one variable is free.')
fig.ax.view_init(azim = 0, elev = 22)
fig.save()
#
def anim(frame):
    fig.ax.view_init(azim = frame, elev = 22)
    # fig.canvas.draw()
#
# create and display the animation 
HTML(animation.FuncAnimation(fig.fig, anim,
                       frames = 2 * np.arange(180),
                       fargs = None,
                       interval = 30,
                       repeat = False).to_jshtml(default_mode = 'loop'))
Out[2]:

How many solutions will a system have?¶

$$\begin{bmatrix}
1 & 0 & -5 & 1 \\
0 & 1 & 1 & 4 \\
0 & 0 & 0 & 1
\end{bmatrix}$$

$$\begin{bmatrix}
1 & 0 & -2 & 3 & 0 & -24 \\
0 & 1 & -2 & 2 & 0 & -7 \\
0 & 0 & 0 & 0 & 1 & 4
\end{bmatrix}$$

$$\begin{bmatrix}
1 & 0 & 0 & 1 \\
0 & 1 & 0 & -2 \\
0 & 0 & 1 & 0
\end{bmatrix}$$

$$\begin{bmatrix}
2 & -3 & 1 \\
3 & -2 & 4 \\
1 & -1 & 1
\end{bmatrix}$$

Given a system of $m$ equations in $n$ unknowns, let $A$ be the $m \times (n+1)$ augmented matrix. Let $r$ be the number of pivot positions in the REF of $A$.

  • If $r=n$, there is a unique solution (no parameters in the solution).
  • If $r>n$ (so $r=n+1$) the system is inconsistent (no solution).
  • If $r<n$, either the system is inconsistent (no solution) or has a solution with $n-r$ parameters.
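
The same classification can be carried out numerically. The sketch below is my own illustration: instead of counting pivots by hand it uses the equivalent rank test (compare the rank of the coefficient part with the rank of the full augmented matrix, via np.linalg.matrix_rank) and applies it to the four example matrices above.

In [ ]:
import numpy as np

def classify(aug):
    """Classify an augmented matrix: inconsistent, unique, or infinitely many solutions."""
    aug = np.asarray(aug, dtype=float)
    n = aug.shape[1] - 1                        # number of unknowns
    r_coef = np.linalg.matrix_rank(aug[:, :n])  # rank of the coefficient part
    r_aug = np.linalg.matrix_rank(aug)          # rank of the full augmented matrix
    if r_aug > r_coef:
        return 'inconsistent'
    if r_coef == n:
        return 'unique solution'
    return f'infinitely many solutions ({n - r_coef} free variables)'

examples = [
    [[1, 0, -5, 1], [0, 1, 1, 4], [0, 0, 0, 1]],
    [[1, 0, -2, 3, 0, -24], [0, 1, -2, 2, 0, -7], [0, 0, 0, 0, 1, 4]],
    [[1, 0, 0, 1], [0, 1, 0, -2], [0, 0, 1, 0]],
    [[2, -3, 1], [3, -2, 4], [1, -1, 1]],
]
for ex in examples:
    print(classify(ex))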

Features of homogeneous systems¶

A linear system $Ax=b$ is homogeneous if $b=0$. Otherwise, it is inhomogeneous.

Homogeneous systems are always consistent. (why?)

Homogeneous systems have infinitely many solutions if they have at least one free variable.

If x and y are particular solutions to a homogeneous system, any linear combination of x and y is also a solution to the system.
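
A quick numerical illustration of the last point (my own sketch; the matrix and the two solutions are arbitrary choices for the example): take two solutions of $Ax = 0$ and check that an arbitrary linear combination is still a solution.

In [ ]:
import numpy as np

# a homogeneous system A x = 0 with a free variable, so it has nonzero solutions
A = np.array([[1., 0., -5.],
              [0., 1., 1.]])

x = np.array([5., -1., 1.])     # one particular solution (x3 = 1)
y = np.array([-10., 2., -2.])   # another particular solution (x3 = -2)

combo = 3 * x - 0.5 * y         # an arbitrary linear combination
print(np.allclose(A @ x, 0), np.allclose(A @ y, 0), np.allclose(A @ combo, 0))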

5.2 Gaussian Elimination: The Algorithm¶

The generic algorithm we've created to find solutions to a linear system is called Gaussian Elimination or Gauss-Jordan Elimination. It has two stages. Given an augmented matrix $A$ representing a linear system:

  1. Elimination: Convert $A$ to one of its echelon forms, say $U$.
  2. Backsubstitution: Convert $U$ to $A$'s unique reduced row echelon form.

Each stage iterates over the rows of $A$, starting with the first row.

Before stating the algorithm, let's recall the set of row reduction operations that we can perform on rows of a matrix without changing the solution set:

  1. Swap two rows.
  2. Multiply a row by a nonzero value.
  3. Add a multiple of a row to another row.

Gaussian Elimination, Stage 1: Elimination¶

Input: matrix $A$.

We will use $i$ to denote the index of the current row. To start, let $i=1$. Repeat the following steps:

  1. Let $j$ be the position of the leftmost nonzero value in row $i$ or any row below it. If there is no such position, stop.
  2. If the $j$th position in row $i$ is zero, swap this row with a row below it to make the $j$th position nonzero. This creates a pivot in position $i,j$.
  3. Use row reduction operations to create zeros in all positions below the pivot. If any operation creates a row that is all zeros except the last element, the system is inconsistent; stop.
  4. Let $i=i+1$. If $i$ equals the number of rows in $A$, stop.

The output of this stage is an echelon form of $A$.

$$\begin{bmatrix}
0 & \blacksquare & * & * & * & * & * & * & * & * \\
0 & * & * & * & * & * & * & * & * & * \\
0 & * & * & * & * & * & * & * & * & * \\
0 & * & * & * & * & * & * & * & * & * \\
0 & * & * & * & * & * & * & * & * & * \\
0 & * & * & * & * & * & * & * & * & *
\end{bmatrix}
\;\Rightarrow\;
\begin{bmatrix}
0 & \blacksquare & * & * & * & * & * & * & * & * \\
0 & 0 & 0 & \blacksquare & * & * & * & * & * & * \\
0 & 0 & 0 & 0 & \blacksquare & * & * & * & * & * \\
0 & 0 & 0 & 0 & 0 & \blacksquare & * & * & * & * \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & \blacksquare & * \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}$$
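
Here is a minimal Python sketch of the Elimination stage (my own illustration, not the course's reference implementation); for brevity it skips the inconsistency check in step 3 and uses 0-based row indices:

In [ ]:
import numpy as np

def eliminate(A):
    """Stage 1 sketch: reduce a copy of the augmented matrix A to an echelon form."""
    A = np.array(A, dtype=float)
    m, ncols = A.shape
    i = 0                                       # current row (0-based)
    for j in range(ncols):                      # scan columns left to right
        # rows at or below i with a nonzero entry in column j
        candidates = [r for r in range(i, m) if not np.isclose(A[r, j], 0)]
        if not candidates:
            continue                            # no pivot in this column
        A[[i, candidates[0]]] = A[[candidates[0], i]]   # swap a pivot row into place
        for r in range(i + 1, m):               # create zeros below the pivot
            A[r] -= (A[r, j] / A[i, j]) * A[i]
        i += 1
        if i == m:                              # no rows left to process
            break
    return A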

Gaussian Elimination, Stage 2: Backsubstitution¶

Input: an echelon form of $A$.

We start at the top again, so let $i=1$. Repeat the following steps:

  1. If row $i$ is all zeros, or if $i$ exceeds the number of rows in $A$, stop.
  2. If row $i$ has a nonzero pivot value, divide row $i$ by its pivot value. This creates a 1 in the pivot position.
  3. Use row reduction operations to create zeros in all positions above the pivot.
  4. Let $i=i+1$.

The output of this stage is the reduced echelon form of $A$.

$$\begin{bmatrix}
0 & \blacksquare & * & * & * & * & * & * & * & * \\
0 & 0 & 0 & \blacksquare & * & * & * & * & * & * \\
0 & 0 & 0 & 0 & \blacksquare & * & * & * & * & * \\
0 & 0 & 0 & 0 & 0 & \blacksquare & * & * & * & * \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & \blacksquare & * \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}
\;\Rightarrow\;
\begin{bmatrix}
0 & 1 & * & 0 & 0 & 0 & * & * & 0 & * \\
0 & 0 & 0 & 1 & 0 & 0 & * & * & 0 & * \\
0 & 0 & 0 & 0 & 1 & 0 & * & * & 0 & * \\
0 & 0 & 0 & 0 & 0 & 1 & * & * & 0 & * \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & * \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}$$
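
And a matching sketch of the Backsubstitution stage (again my own illustration, assuming the eliminate function from the previous sketch):

In [ ]:
import numpy as np

def backsubstitute(U):
    """Stage 2 sketch: convert an echelon form U to reduced row echelon form."""
    U = np.array(U, dtype=float)
    m = U.shape[0]
    for i in range(m):
        nonzero = np.flatnonzero(~np.isclose(U[i], 0))
        if nonzero.size == 0:
            continue                            # all-zero row: nothing to do
        j = nonzero[0]                          # this row's pivot column
        U[i] = U[i] / U[i, j]                   # scale so the pivot becomes 1
        for r in range(i):                      # create zeros above the pivot
            U[r] -= U[r, j] * U[i]
    return U

Running backsubstitute(eliminate(A)) on the Example 1 matrix from Section 5.1 should reproduce the reduced row echelon form computed by hand there; the intermediate echelon form may differ (a different row swap is chosen), but the reduced row echelon form is unique.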

The columns that contain the pivots in the row-echelon form correspond to the basic variables, and the remaining variables are free variables.

For example, in the reduced row echelon form matrix above:

  • $x_2$, $x_4$, $x_5$, $x_6$, and $x_9$ are the basic variables, and
  • $x_1$, $x_3$, $x_7$, and $x_8$ are the free variables.

5.3 How Many Operations does Gaussian Elimination Require?¶

Gaussian Elimination is the first algorithm we have discussed in the course. As with any algorithm, it is important to assess its cost.

This will help us to understand how difficult it is for a computer to run the algorithm, especially when the datasets become large.

Measuring the Cost of an Algorithm¶

First, we need to define our units. In this course, we will measure the cost of an algorithm by counting the number of additions, multiplications, divisions, subtractions, or square roots.

In modern processors, each of these operations requires only a single instruction. When performed over real numbers in floating point representation, these operations are called flops (floating point operations).

Second, when counting operations we will primarily be concerned with the highest-powered term in the expression that counts flops.

This tells us how the flop count scales for very large inputs.

For example, let's say for a problem with input size $n$, an algorithm has flop count $\frac{1}{2}n^2 + 3n + 2$.

Then the cost of the algorithm is $\frac{1}{2}n^2$.

This is a good approximation because $\frac{1}{2}n^2$ is asymptotically equivalent to the exact flop count:

$$\lim_{n\to\infty} \frac{\frac{1}{2}n^2 + 3n + 2}{\frac{1}{2}n^2} = 1.$$

We will use the symbol $\sim$ to denote this relationship.

So we would say that this algorithm has flop count $\sim \frac{1}{2}n^2$.
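
To see this numerically, here is a tiny sketch (my own, using the example flop count above) showing the ratio approaching 1 as $n$ grows:

In [ ]:
# ratio of the exact flop count to its leading term, for increasing n
for n in [10, 100, 1000, 10000]:
    exact = 0.5 * n**2 + 3 * n + 2
    approx = 0.5 * n**2
    print(n, exact / approx)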

The Cost of Gaussian Elimination¶

Now, let's assess the computational cost required to solve a system of nn equations in nn unknowns using Gaussian Elimination.

For $n$ equations in $n$ unknowns, $A$ is an $n \times (n+1)$ matrix.

We can summarize stage 1 of Gaussian Elimination as, in the worst case:

  • For each row $i$ of $A$:
    • add a multiple of row $i$ to all rows below it
In [6]:
# Image credit: Prof. Mark Crovella
display(Image("images/03-ge1.jpg", width=400))

For row 1, this becomes $(n-1)\cdot 2(n+1)$ flops.

That is, there are $n-1$ rows below row 1, each of those has $n+1$ elements, and each element requires one multiplication and one addition. This is $2n^2-2$ flops for row 1.

When operating on row $i$, there are $k = n-i+1$ unknowns and so there are $2k^2-2$ flops required to process the rows below row $i$.

In [7]:
# Image credit: Prof. Mark Crovella
display(Image("images/03-ge2.jpg", width=400))

So we can see that $k$ ranges from $n$ down to $1$.

So, the number of operations required for the Elimination stage is:

$$\begin{aligned}
\sum_{k=1}^{n} (2k^2 - 2) &= 2\left(\sum_{k=1}^{n} k^2 - \sum_{k=1}^{n} 1\right) \\
&= 2\left(\frac{n(n+1)(2n+1)}{6} - n\right) \\
&= \frac{2}{3}n^3 + n^2 - \frac{5}{3}n,
\end{aligned}$$

based on the formula for the sum of the first $n$ squares.
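
A quick check of this closed form (my own sketch, comparing a brute-force sum against the formula for a few values of $n$):

In [ ]:
# brute-force flop count vs. the closed-form expression
for n in [5, 10, 50]:
    brute = sum(2 * k**2 - 2 for k in range(1, n + 1))
    formula = (2 / 3) * n**3 + n**2 - (5 / 3) * n
    print(n, brute, formula)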

When $n$ is large, this expression is dominated by $\frac{2}{3}n^3$.

That is,

$$\lim_{n\to\infty} \frac{\frac{2}{3}n^3 + n^2 - \frac{5}{3}n}{\frac{2}{3}n^3} = 1$$

The second stage of GE only requires on the order of $n^2$ flops, so the whole algorithm is dominated by the $\frac{2}{3}n^3$ flops in the first stage.

So, we find that:

  • The Elimination stage is $\sim \frac{2}{3}n^3$.
  • The Backsubstitution stage is $\sim n^2$.

Thus we say that the flop count of Gaussian Elimination is $\sim \frac{2}{3}n^3$.
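
As a rough empirical check (a sketch only; absolute times depend on the machine and on the BLAS/LAPACK build that numpy uses), one can time np.linalg.solve, whose underlying LU factorization has the same $\sim \frac{2}{3}n^3$ leading cost, and watch how the time grows as $n$ doubles (roughly a factor of 8 for large $n$):

In [ ]:
import time
import numpy as np

rng = np.random.default_rng(0)
for n in [500, 1000, 2000]:
    A = rng.standard_normal((n, n))
    b = rng.standard_normal(n)
    t0 = time.perf_counter()
    np.linalg.solve(A, b)
    print(n, round(time.perf_counter() - t0, 4), 'seconds')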