In [1]:
#
import numpy as np
import scipy as sp
import pandas as pd
import matplotlib as mp
import matplotlib.pyplot as plt
import seaborn as sns
import sklearn
import laUtilities as ut
import slideUtilities as sl
import demoUtilities as dm
from matplotlib import animation
from importlib import reload
from datetime import datetime
from IPython.display import Image, display_html, display, Math, HTML;
qr_setting = None

mp.rcParams['animation.html'] = 'jshtml';

import warnings # ignore warning for Fig 2.8
warnings.filterwarnings('ignore')

Announcements¶

  • HW1 is out on Piazza, due Feb 3 at 8pm

  • Office hours tomorrow:

    • Peer tutor Rohan Anand, 1:30-3pm at CCDS 16th floor
    • Abhishek Tiwari, 3:30-4:30pm at CCDS 13th floor
  • Read Boyd-Vandenberghe Chapter 3.1-3.2 and Chapter 4

5. Gaussian Elimination, continued¶

Gaussian Elimination has two phases:

1. Elimination

Take a matrix $A$ and convert it to echelon form (or row echelon form) with the following three properties:

  1. All nonzero rows are above any rows of all zeros.
  2. Each leading entry of a row is in a column to the right of the leading entry of the row above it.
  3. All entries in a column below a leading entry are zeros.
$$\begin{bmatrix}
0 & \blacksquare & * & * & * & * & * & * & * & * \\
0 & 0 & 0 & \blacksquare & * & * & * & * & * & * \\
0 & 0 & 0 & 0 & \blacksquare & * & * & * & * & * \\
0 & 0 & 0 & 0 & 0 & \blacksquare & * & * & * & * \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & \blacksquare & * \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}$$

In this diagram, the leading entries $\blacksquare$ are nonzero, and the $*$ symbols can be any value.
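
These three properties are easy to check mechanically. Below is a minimal sketch (my own illustration, not part of the lecture code; the names leading_col and is_echelon are just for this example) of such a checker for a numpy array:

In [ ]:
import numpy as np

def leading_col(row):
    """Column index of the first nonzero entry in a row, or None if the row is all zeros."""
    nz = np.flatnonzero(~np.isclose(row, 0))
    return int(nz[0]) if nz.size else None

def is_echelon(A):
    """Check the three echelon-form properties for a 2-D array A."""
    A = np.asarray(A, dtype=float)
    leads = [leading_col(row) for row in A]
    # Property 1: no nonzero row may appear below an all-zero row.
    seen_zero_row = False
    for lead in leads:
        if lead is None:
            seen_zero_row = True
        elif seen_zero_row:
            return False
    # Properties 2 and 3: leading entries move strictly to the right,
    # which also forces zeros below every leading entry.
    pivots = [lead for lead in leads if lead is not None]
    return all(b > a for a, b in zip(pivots, pivots[1:]))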

2. Backsubstitution

Take a matrix in echelon form and convert it to reduced echelon form (or reduced row echelon form), which additionally has the following properties:

  1. The leading entry in each nonzero row is 1.
  2. Each leading 1 is the only nonzero entry in its column.
$$\begin{bmatrix}
0 & 1 & * & 0 & 0 & 0 & * & * & 0 & * \\
0 & 0 & 0 & 1 & 0 & 0 & * & * & 0 & * \\
0 & 0 & 0 & 0 & 1 & 0 & * & * & 0 & * \\
0 & 0 & 0 & 0 & 0 & 1 & * & * & 0 & * \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & * \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}$$

5.1 Examples of Gaussian Elimination¶

Let's do a few more examples of Gaussian elimination, this time where the equations have infinitely many solutions. Once the matrix is in reduced row echelon form, in order to categorize the set of solutions we must identify the basic variables and free variables.

Example 1¶

In our first example, let the input matrix $A$ be

$$\begin{bmatrix}
0 & 3 & 4 & -5 \\
3 & -7 & 8 & 9 \\
3 & -9 & 6 & 15
\end{bmatrix}$$

Stage 1 (Elimination)

Start with the first row ($i=1$). The leftmost nonzero entry in row 1 or any row below it is in column 1. But since row 1 has a zero in that column, we need a swap. We'll swap rows 1 and 3 (we could have swapped 1 and 2).

$$\begin{bmatrix}
\boxed{3} & -9 & 6 & 15 \\
3 & -7 & 8 & 9 \\
0 & 3 & 4 & -5
\end{bmatrix}$$

The pivot is shown in a box. Use row reduction operations to create zeros below the pivot. In this case, that means subtracting row 1 from row 2.

$$\begin{bmatrix}
3 & -9 & 6 & 15 \\
0 & \boxed{2} & 2 & -6 \\
0 & 3 & 4 & -5
\end{bmatrix}$$

Now $i=2$. The pivot is boxed (no need to do any swaps). Use row reduction to create zeros below the pivot. To do so we subtract $3/2$ times row 2 from row 3.

$$\begin{bmatrix}
3 & -9 & 6 & 15 \\
0 & 2 & 2 & -6 \\
0 & 0 & 1 & 4
\end{bmatrix}$$

Now $i=3$. Since it is the last row, we are done with Stage 1. The pivots are marked:

$$\begin{bmatrix}
\boxed{3} & -9 & 6 & 15 \\
0 & \boxed{2} & 2 & -6 \\
0 & 0 & \boxed{1} & 4
\end{bmatrix}$$

Stage 2 (Backsubstitution)

Starting again with the first row ($i=1$), divide row 1 by its pivot.

$$\begin{bmatrix}
1 & -3 & 2 & 5 \\
0 & 2 & 2 & -6 \\
0 & 0 & 1 & 4
\end{bmatrix}$$

Moving to the next row ($i=2$), divide row 2 by its pivot.

$$\begin{bmatrix}
1 & -3 & 2 & 5 \\
0 & 1 & 1 & -3 \\
0 & 0 & 1 & 4
\end{bmatrix}$$

Then use row reduction operations to create zeros in all positions above the pivot. In this case, that means adding 3 times row 2 to row 1.

$$\begin{bmatrix}
1 & 0 & 5 & -4 \\
0 & 1 & 1 & -3 \\
0 & 0 & 1 & 4
\end{bmatrix}$$

Moving to the next row ($i=3$), the pivot is already 1. So we subtract row 3 from row 2, and subtract 5 times row 3 from row 1.

$$\begin{bmatrix}
1 & 0 & 0 & -24 \\
0 & 1 & 0 & -7 \\
0 & 0 & 1 & 4
\end{bmatrix}$$

This matrix is in reduced row echelon form, so we are done with GE. The solution is:

$$x_1 = -24, \qquad x_2 = -7, \qquad x_3 = 4$$
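
As a sanity check (a small sketch using numpy, not part of the original lecture code), we can solve the same system directly; the coefficient matrix and right-hand side are read off from $A$:

In [ ]:
import numpy as np

# coefficient matrix and right-hand side from the augmented matrix above
coeffs = np.array([[0., 3., 4.],
                   [3., -7., 8.],
                   [3., -9., 6.]])
rhs = np.array([-5., 9., 15.])

# numpy's solver should agree with the hand computation: [-24, -7, 4]
print(np.linalg.solve(coeffs, rhs))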

Example 2¶

Let's assume that the augmented matrix of a system has been transformed into the equivalent reduced echelon form:

$$\begin{bmatrix}
1 & 0 & -5 & 1 \\
0 & 1 & 1 & 4 \\
0 & 0 & 0 & 0
\end{bmatrix}$$

This system is consistent. Is the solution unique?

The associated system of equations is

$$\begin{aligned}
x_1 - 5x_3 &= 1 \\
x_2 + x_3 &= 4 \\
0 &= 0
\end{aligned}$$

Variables $x_1$ and $x_2$ correspond to pivot columns. They are called basic variables. The other variable $x_3$ is a free variable.

Whenever a system is consistent, the solution set can be described explicitly by solving the reduced system of equations for the basic variables in terms of the free variables.

This operation is possible because the reduced echelon form places each basic variable in one and only one equation.

In the example, solve the first and second equations for $x_1$ and $x_2$. Ignore the third equation; it offers no restriction on the variables.

So the solution set is:

$$\begin{aligned}
x_1 &= 1 + 5x_3 \\
x_2 &= 4 - x_3 \\
x_3 & \text{ is free}
\end{aligned}$$

"$x_3$ is free" means you can choose any value for $x_3$.

In other words, there are infinitely many solutions to this linear system. Each solution corresponds to one particular value of $x_3$.

For instance,

  • when $x_3=0$, the solution is $(1,4,0)$;
  • when $x_3=1$, the solution is $(6,3,1)$.

These are parametric descriptions of solution sets. The free variables act as parameters.
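
To make the parametric description concrete, here is a small sketch (plain numpy, my own illustration rather than the lecture's plotting code) that samples a few values of the free variable $x_3$ and confirms that each resulting point satisfies both original equations:

In [ ]:
import numpy as np

# x1 - 5*x3 = 1  and  x2 + x3 = 4, with x3 free
for t in [0.0, 1.0, 2.0, -3.5]:
    x1, x2, x3 = 1 + 5 * t, 4 - t, t           # the parametric solution
    print((x1, x2, x3),
          np.isclose(x1 - 5 * x3, 1),           # first equation holds?
          np.isclose(x2 + x3, 4))               # second equation holds?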

So: solving a system amounts to either:

  • finding a parametric description of the solution set, or
  • determining that the solution set is empty.

Geometrically, the solution set to this system is a line in $\mathbb{R}^3$.

In [2]:
#
fig = ut.three_d_figure((3, 3), fig_desc = 'Solution set when one variable is free.',
                        xmin = -4, xmax = 4, ymin = 0, ymax = 8, zmin = -4, zmax = 4, 
                        qr = qr_setting)
plt.close()
eq1 = [1, 0, -5, 1]
eq2 = [0, 1, 1, 4]
fig.plotLinEqn(eq1, 'Brown')
fig.plotLinEqn(eq2, 'Green')
fig.plotIntersection(eq1, eq2, color='Blue')
fig.set_title('Solution set when one variable is free.')
fig.ax.view_init(azim = 0, elev = 22)
fig.save()
#
def anim(frame):
    fig.ax.view_init(azim = frame, elev = 22)
    # fig.canvas.draw()
#
# create and display the animation 
HTML(animation.FuncAnimation(fig.fig, anim,
                       frames = 2 * np.arange(180),
                       fargs = None,
                       interval = 30,
                       repeat = False).to_jshtml(default_mode = 'loop'))
Out[2]:

How many solutions will a system have?¶

$$\begin{bmatrix}
1 & 0 & -5 & 1 \\
0 & 1 & 1 & 4 \\
0 & 0 & 0 & 1
\end{bmatrix}$$

$$\begin{bmatrix}
1 & 0 & -2 & 3 & 0 & -24 \\
0 & 1 & -2 & 2 & 0 & -7 \\
0 & 0 & 0 & 0 & 1 & 4
\end{bmatrix}$$

$$\begin{bmatrix}
1 & 0 & 0 & 1 \\
0 & 1 & 0 & -2 \\
0 & 0 & 1 & 0
\end{bmatrix}$$

$$\begin{bmatrix}
2 & -3 & 1 \\
3 & -2 & 4 \\
1 & -1 & 1
\end{bmatrix}$$

Given a system of $m$ equations in $n$ unknowns, let $A$ be the $m \times (n+1)$ augmented matrix. Let $r$ be the number of pivot positions in the REF of $A$.

  • If $r=n$, there is a unique solution (no parameters in the solution).
  • If $r>n$ (so $r=n+1$) the system is inconsistent (no solution).
  • If $r<n$, either the system is inconsistent (no solution) or has a solution with $n-r$ parameters.
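
The same classification can be carried out numerically. The sketch below is my own illustration: instead of counting pivots by hand it uses the equivalent rank test (compare the rank of the coefficient part with the rank of the full augmented matrix, via np.linalg.matrix_rank) and applies it to the four example matrices above.

In [ ]:
import numpy as np

def classify(aug):
    """Classify an augmented matrix: inconsistent, unique, or infinitely many solutions."""
    aug = np.asarray(aug, dtype=float)
    n = aug.shape[1] - 1                        # number of unknowns
    r_coef = np.linalg.matrix_rank(aug[:, :n])  # rank of the coefficient part
    r_aug = np.linalg.matrix_rank(aug)          # rank of the full augmented matrix
    if r_aug > r_coef:
        return 'inconsistent'
    if r_coef == n:
        return 'unique solution'
    return f'infinitely many solutions ({n - r_coef} free variables)'

examples = [
    [[1, 0, -5, 1], [0, 1, 1, 4], [0, 0, 0, 1]],
    [[1, 0, -2, 3, 0, -24], [0, 1, -2, 2, 0, -7], [0, 0, 0, 0, 1, 4]],
    [[1, 0, 0, 1], [0, 1, 0, -2], [0, 0, 1, 0]],
    [[2, -3, 1], [3, -2, 4], [1, -1, 1]],
]
for ex in examples:
    print(classify(ex))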

Features of homogeneous systems¶

A linear system $Ax=b$ is homogeneous if $b=0$. Otherwise, it is inhomogeneous.

Homogeneous systems are always consistent. (why?)

Homogeneous systems have infinitely many solutions if they have at least one free variable.

If x and y are particular solutions to a homogeneous system, any linear combination of x and y is also a solution to the system.
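
A quick numerical illustration of the last point (my own sketch; the matrix and the two solutions are arbitrary choices for the example): take two solutions of $Ax = 0$ and check that an arbitrary linear combination is still a solution.

In [ ]:
import numpy as np

# a homogeneous system A x = 0 with a free variable, so it has nonzero solutions
A = np.array([[1., 0., -5.],
              [0., 1., 1.]])

x = np.array([5., -1., 1.])     # one particular solution (x3 = 1)
y = np.array([-10., 2., -2.])   # another particular solution (x3 = -2)

combo = 3 * x - 0.5 * y         # an arbitrary linear combination
print(np.allclose(A @ x, 0), np.allclose(A @ y, 0), np.allclose(A @ combo, 0))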

5.2 Gaussian Elimination: The Algorithm¶

The generic algorithm we've created to find solutions to a linear system is called Gaussian Elimination or Gauss-Jordan Elimination. It has two stages. Given an augmented matrix $A$ representing a linear system:

  1. Elimination: Convert $A$ to one of its echelon forms, say $U$.
  2. Backsubstitution: Convert $U$ to $A$'s unique reduced row echelon form.

Each stage iterates over the rows of $A$, starting with the first row.

Before stating the algorithm, let's recall the set of row reduction operations that we can perform on rows of a matrix without changing the solution set:

  1. Swap two rows.
  2. Multiply a row by a nonzero value.
  3. Add a multiple of a row to another row.

Gaussian Elimination, Stage 1: Elimination¶

Input: matrix $A$.

We will use $i$ to denote the index of the current row. To start, let $i=1$. Repeat the following steps:

  1. Let $j$ be the position of the leftmost nonzero value in row $i$ or any row below it. If there is no such position, stop.
  2. If the $j$th position in row $i$ is zero, swap this row with a row below it to make the $j$th position nonzero. This creates a pivot in position $i,j$.
  3. Use row reduction operations to create zeros in all positions below the pivot. If any operation creates a row that is all zeros except the last element, the system is inconsistent; stop.
  4. Let $i=i+1$. If $i$ equals the number of rows in $A$, stop.

The output of this stage is an echelon form of $A$.

$$\begin{bmatrix}
0 & \blacksquare & * & * & * & * & * & * & * & * \\
0 & * & * & * & * & * & * & * & * & * \\
0 & * & * & * & * & * & * & * & * & * \\
0 & * & * & * & * & * & * & * & * & * \\
0 & * & * & * & * & * & * & * & * & * \\
0 & * & * & * & * & * & * & * & * & *
\end{bmatrix}
\;\Rightarrow\;
\begin{bmatrix}
0 & \blacksquare & * & * & * & * & * & * & * & * \\
0 & 0 & 0 & \blacksquare & * & * & * & * & * & * \\
0 & 0 & 0 & 0 & \blacksquare & * & * & * & * & * \\
0 & 0 & 0 & 0 & 0 & \blacksquare & * & * & * & * \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & \blacksquare & * \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}$$
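
Here is a minimal Python sketch of the Elimination stage (my own illustration, not the course's reference implementation); for brevity it skips the inconsistency check in step 3 and uses 0-based row indices:

In [ ]:
import numpy as np

def eliminate(A):
    """Stage 1 sketch: reduce a copy of the augmented matrix A to an echelon form."""
    A = np.array(A, dtype=float)
    m, ncols = A.shape
    i = 0                                       # current row (0-based)
    for j in range(ncols):                      # scan columns left to right
        # rows at or below i with a nonzero entry in column j
        candidates = [r for r in range(i, m) if not np.isclose(A[r, j], 0)]
        if not candidates:
            continue                            # no pivot in this column
        A[[i, candidates[0]]] = A[[candidates[0], i]]   # swap a pivot row into place
        for r in range(i + 1, m):               # create zeros below the pivot
            A[r] -= (A[r, j] / A[i, j]) * A[i]
        i += 1
        if i == m:                              # no rows left to process
            break
    return A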

Gaussian Elimination, Stage 2: Backsubstitution¶

Input: an echelon form of $A$.

We start at the top again, so let $i=1$. Repeat the following steps:

  1. If row $i$ is all zeros, or if $i$ exceeds the number of rows in $A$, stop.
  2. If row $i$ has a nonzero pivot value, divide row $i$ by its pivot value. This creates a 1 in the pivot position.
  3. Use row reduction operations to create zeros in all positions above the pivot.
  4. Let $i=i+1$.

The output of this stage is the reduced echelon form of $A$.

$$\begin{bmatrix}
0 & \blacksquare & * & * & * & * & * & * & * & * \\
0 & 0 & 0 & \blacksquare & * & * & * & * & * & * \\
0 & 0 & 0 & 0 & \blacksquare & * & * & * & * & * \\
0 & 0 & 0 & 0 & 0 & \blacksquare & * & * & * & * \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & \blacksquare & * \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}
\;\Rightarrow\;
\begin{bmatrix}
0 & 1 & * & 0 & 0 & 0 & * & * & 0 & * \\
0 & 0 & 0 & 1 & 0 & 0 & * & * & 0 & * \\
0 & 0 & 0 & 0 & 1 & 0 & * & * & 0 & * \\
0 & 0 & 0 & 0 & 0 & 1 & * & * & 0 & * \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & * \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}$$
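
And a matching sketch of the Backsubstitution stage (again my own illustration, assuming the eliminate function from the previous sketch):

In [ ]:
import numpy as np

def backsubstitute(U):
    """Stage 2 sketch: convert an echelon form U to reduced row echelon form."""
    U = np.array(U, dtype=float)
    m = U.shape[0]
    for i in range(m):
        nonzero = np.flatnonzero(~np.isclose(U[i], 0))
        if nonzero.size == 0:
            continue                            # all-zero row: nothing to do
        j = nonzero[0]                          # this row's pivot column
        U[i] = U[i] / U[i, j]                   # scale so the pivot becomes 1
        for r in range(i):                      # create zeros above the pivot
            U[r] -= U[r, j] * U[i]
    return U

Running backsubstitute(eliminate(A)) on the Example 1 matrix from Section 5.1 should reproduce the reduced row echelon form computed by hand there; the intermediate echelon form may differ (a different row swap is chosen), but the reduced row echelon form is unique.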

The columns that contain the pivots in the row-echelon form correspond to the basic variables, and the remaining variables are free variables.

For example, in the reduced row echelon form matrix above:

  • $x_2$, $x_4$, $x_5$, $x_6$, and $x_9$ are the basic variables, and
  • $x_1$, $x_3$, $x_7$, and $x_8$ are the free variables.

5.3 How Many Operations does Gaussian Elimination Require?¶

Gaussian Elimination is the first algorithm we have discussed in the course. As with any algorithm, it is important to assess its cost.

This will help us to understand how difficult it is for a computer to run the algorithm, especially when the datasets become large.

Measuring the Cost of an Algorithm¶

First, we need to define our units. In this course, we will measure the cost of an algorithm by counting the number of additions, multiplications, divisions, subtractions, or square roots.

In modern processors, each of these operations requires only a single instruction. When performed over real numbers in floating point representation, these operations are called flops (floating point operations).

Second, when counting operations we will primarily be concerned with the highest-powered term in the expression that counts flops.

This tells us how the flop count scales for very large inputs.

For example, let's say for a problem with input size $n$, an algorithm has flop count $\frac{1}{2}n^2 + 3n + 2$.

Then the cost of the algorithm is $\frac{1}{2}n^2$.

This is a good approximation because $\frac{1}{2}n^2$ is asymptotically equivalent to the exact flop count:

$$\lim_{n\to\infty} \frac{\frac{1}{2}n^2 + 3n + 2}{\frac{1}{2}n^2} = 1.$$

We will use the symbol $\sim$ to denote this relationship.

So we would say that this algorithm has flop count $\sim \frac{1}{2}n^2$.
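
To see this numerically, here is a tiny sketch (my own, using the example flop count above) showing the ratio approaching 1 as $n$ grows:

In [ ]:
# ratio of the exact flop count to its leading term, for increasing n
for n in [10, 100, 1000, 10000]:
    exact = 0.5 * n**2 + 3 * n + 2
    approx = 0.5 * n**2
    print(n, exact / approx)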

The Cost of Gaussian Elimination¶

Now, let's assess the computational cost required to solve a system of nn equations in nn unknowns using Gaussian Elimination.

For $n$ equations in $n$ unknowns, $A$ is an $n \times (n+1)$ matrix.

We can summarize stage 1 of Gaussian Elimination as, in the worst case:

  • For each row $i$ of $A$:
    • add a multiple of row $i$ to all rows below it
In [6]:
# Image credit: Prof. Mark Crovella
display(Image("images/03-ge1.jpg", width=400))

For row 1, this becomes $(n-1)\cdot 2(n+1)$ flops.

That is, there are $n-1$ rows below row 1, each of those has $n+1$ elements, and each element requires one multiplication and one addition. This is $2n^2-2$ flops for row 1.

When operating on row $i$, there are $k = n-i+1$ unknowns and so there are $2k^2-2$ flops required to process the rows below row $i$.

In [7]:
# Image credit: Prof. Mark Crovella
display(Image("images/03-ge2.jpg", width=400))

So we can see that $k$ ranges from $n$ down to $1$.

So, the number of operations required for the Elimination stage is:

$$\begin{aligned}
\sum_{k=1}^{n} (2k^2 - 2) &= 2\left(\sum_{k=1}^{n} k^2 - \sum_{k=1}^{n} 1\right) \\
&= 2\left(\frac{n(n+1)(2n+1)}{6} - n\right) \\
&= \frac{2}{3}n^3 + n^2 - \frac{5}{3}n,
\end{aligned}$$

based on the formula for the sum of the first $n$ squares.
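
A quick check of this closed form (my own sketch, comparing a brute-force sum against the formula for a few values of $n$):

In [ ]:
# brute-force flop count vs. the closed-form expression
for n in [5, 10, 50]:
    brute = sum(2 * k**2 - 2 for k in range(1, n + 1))
    formula = (2 / 3) * n**3 + n**2 - (5 / 3) * n
    print(n, brute, formula)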

When $n$ is large, this expression is dominated by $\frac{2}{3}n^3$.

That is,

$$\lim_{n\to\infty} \frac{\frac{2}{3}n^3 + n^2 - \frac{5}{3}n}{\frac{2}{3}n^3} = 1$$

The second stage of GE only requires on the order of $n^2$ flops, so the whole algorithm is dominated by the $\frac{2}{3}n^3$ flops in the first stage.

So, we find that:

  • The Elimination stage is $\sim \frac{2}{3}n^3$.
  • The Backsubstitution stage is $\sim n^2$.

Thus we say that the flop count of Gaussian Elimination is $\sim \frac{2}{3}n^3$.
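
As a rough empirical check (a sketch only; absolute times depend on the machine and on the BLAS/LAPACK build that numpy uses), one can time np.linalg.solve, whose underlying LU factorization has the same $\sim \frac{2}{3}n^3$ leading cost, and watch how the time grows as $n$ doubles (roughly a factor of 8 for large $n$):

In [ ]:
import time
import numpy as np

rng = np.random.default_rng(0)
for n in [500, 1000, 2000]:
    A = rng.standard_normal((n, n))
    b = rng.standard_normal(n)
    t0 = time.perf_counter()
    np.linalg.solve(A, b)
    print(n, round(time.perf_counter() - t0, 4), 'seconds')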