In [1]:
#
import numpy as np
import scipy as sp
import pandas as pd
import matplotlib as mp
import matplotlib.pyplot as plt
import seaborn as sns
import sklearn
import laUtilities as ut
import slideUtilities as sl
import demoUtilities as dm
from matplotlib import animation
from importlib import reload
from datetime import datetime
from IPython.display import Image, display_html, display, Math, HTML;
qr_setting = None

mp.rcParams['animation.html'] = 'jshtml';

Announcements¶

  • Homework 6 is due Friday, 3/24 at 8pm

  • Remember to submit a readable file and to cite all people you collaborated with + sources you used

  • Upcoming office hours

    • Tomorrow: Peer tutor Rohan Anand from 1:30-3pm on the CCDS 16th floor
    • Tomorrow: Abhishek Tiwari from 3:30-4:30pm on the CCDS 13th floor
  • Weekly reading and viewing assignments

    • Boyd-Vandenberghe Chapter 12

Recap¶

Definition. A subspace is any set $H$ in $\mathbb{R}^n$ that has three properties:

  1. The zero vector is in $H$.
  2. For each $\mathbf{u}$ and $\mathbf{v}$ in $H$, the sum $\mathbf{u} + \mathbf{v}$ is in $H$.
  3. For each $\mathbf{u}$ in $H$ and each scalar $c$, the vector $c\mathbf{u}$ is in $H$.

Another way of stating properties 2 and 3 is that $H$ is closed under addition and scalar multiplication.

In [12]:
#
fig = ut.three_d_figure((20, 1), fig_desc = 'Span{v1, v2}, a subspace.',
                        xmin = -10, xmax = 10, ymin = -10, ymax = 10, zmin = -10, zmax = 10, qr = qr_setting)
a1 = [-8.0, 8.0, 5.0]
a2 = [3.0,  2.0, -2.0]
a3 = 2.5 * np.array(a2)
fig.text(a1[0]+.5, a1[1]+.5, a1[2]+.5, r'$\bf v_1$', 'v_1', size=20)
fig.text(a3[0]+.5, a3[1]+.5, a3[2]+.5, r'$\bf v_2$', 'v_2', size=20)
fig.plotSpan(a1, a2,'Green')
fig.plotPoint(a1[0], a1[1], a1[2],'r')
fig.plotPoint(a3[0], a3[1], a3[2],'r')
# fig.plotPoint(a3[0], a3[1], a3[2],'r')
#fig.text(a1[0], a1[1], a1[2], r'$\bf v_1$', 'a_1', size=20)
fig.text(0.1, 0.1, -3, r'$\bf 0$', '0', size=12)
fig.plotPoint(0, 0, 0, 'b')
fig.set_title(r'Span{$v_1, v_2$}, a subspace.')
fig.text(9, -9, -7, r'$\bf H$', 'H', size = 20)
img = fig.save()

Definition. A basis for a subspace $H$ of $\mathbb{R}^n$ is a minimal spanning set for $H$: a linearly independent set in $H$ that spans $H$.

In [16]:
#
fig = ut.three_d_figure((20, 5), fig_desc = 'Bases of H are {a1, a2} or {a1, a3}.',
                        xmin = -10, xmax = 10, ymin = -10, ymax = 10, zmin = -10, zmax = 10, qr = qr_setting)
a1 = [-8.0, 8.0, 5.0]
a2 = [3.0,  2.0, -2.0]
a3 = 2.5 * np.array(a2)
fig.text(a1[0]+.5, a1[1]+.5, a1[2]+.5, r'$\bf a_1$', 'a_1', size=18)
fig.text(a3[0]+.5, a3[1]+.5, a3[2]+.5, r'$\bf a_3$', 'a_3', size=18)
fig.text(a2[0]+.5, a2[1]+.5, a2[2]+.5, r'$\bf a_2$', 'a_2', size=18)
fig.plotSpan(a1, a2,'Green')
fig.plotPoint(a1[0], a1[1], a1[2],'r')
fig.plotPoint(a3[0], a3[1], a3[2],'r')
fig.plotPoint(a2[0], a2[1], a2[2],'r')
fig.plotLine([[0, 0, 0], a3], 'r', '--')
fig.plotLine([[0, 0, 0], a1], 'r', '--')
# fig.plotPoint(a3[0], a3[1], a3[2],'r')
fig.text(0.1, 0.1, -3, r'$\bf 0$', '0', size=12)
fig.plotPoint(0, 0, 0, 'b')
fig.set_title(r'Bases of $H$ are $\{{\bf a}_1, {\bf a}_2\}$ or $\{{\bf a}_1, {\bf a}_3\}$')
fig.text(9, -9, -7, r'H', 'H', size = 16)
img = fig.save()

So in the example above, a basis for $H$ could be:

$\{\mathbf{a}_1, \mathbf{a}_2\}$ or $\{\mathbf{a}_1, \mathbf{a}_3\}$.

However, $\{\mathbf{a}_1, \mathbf{a}_2, \mathbf{a}_3\}$ is not a basis for $H$, because it is not linearly independent ($\mathbf{a}_3 = 2.5\,\mathbf{a}_2$).

And furthermore, $\{\mathbf{a}_1\}$ is not a basis for $H$, because a single vector does not span the plane $H$.

Definition. The dimension of a nonzero subspace $H$, denoted by $\dim H$, is the number of vectors in any basis for $H$.

The dimension of the zero subspace $\{\mathbf{0}\}$ is defined to be zero.

Column Space¶

Definition. The column space of a matrix $A$ is the set $\operatorname{Col} A$ of all linear combinations of the columns of $A$.

If $A = [\mathbf{a}_1 \cdots \mathbf{a}_n]$, with columns in $\mathbb{R}^m$, then $\operatorname{Col} A$ is the same as $\operatorname{Span}\{\mathbf{a}_1, \dots, \mathbf{a}_n\}$, and hence it is a subspace.

The column space of an $m \times n$ matrix is a subspace of $\mathbb{R}^m$.

In particular, note that $\operatorname{Col} A$ equals $\mathbb{R}^m$ only when the columns of $A$ span $\mathbb{R}^m$. Otherwise, $\operatorname{Col} A$ is only part of $\mathbb{R}^m$.

Null Space¶

Definition. The null space of a matrix $A$ is the set $\operatorname{Nul} A$ of all solutions of the homogeneous equation $A\mathbf{x} = \mathbf{0}$.

In other words: the null space of $A$ is the set of all vectors that are mapped to the zero vector by $A$. When $A$ has $n$ columns, a solution of $A\mathbf{x} = \mathbf{0}$ is a vector in $\mathbb{R}^n$. So $\operatorname{Nul} A$ is a subspace of $\mathbb{R}^n$.

Basis and Dimension of the Null Space¶

We can construct a basis for $\operatorname{Nul} A$ by finding a parametric description of the solution set of $A\mathbf{x} = \mathbf{0}$.

We looked at the following example:

$$[A \;\; \mathbf{0}] \sim \begin{bmatrix} 1 & -2 & 0 & -1 & 3 & 0 \\ 0 & 0 & 1 & 2 & -2 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix}, \qquad \mathbf{x} = x_2 \begin{bmatrix} 2 \\ 1 \\ 0 \\ 0 \\ 0 \end{bmatrix} + x_4 \begin{bmatrix} 1 \\ 0 \\ -2 \\ 1 \\ 0 \end{bmatrix} + x_5 \begin{bmatrix} -3 \\ 0 \\ 2 \\ 0 \\ 1 \end{bmatrix}$$

We constructed an explicit description of the null space of this matrix, as $\mathbf{x} = x_2\mathbf{u} + x_4\mathbf{v} + x_5\mathbf{w}$, where $\mathbf{u}$, $\mathbf{v}$, and $\mathbf{w}$ are linearly independent.

That is, $\operatorname{Nul} A$ is the subspace spanned by $\{\mathbf{u}, \mathbf{v}, \mathbf{w}\}$.

Each basis vector corresponds to a free variable in the equation $A\mathbf{x} = \mathbf{0}$. Since this basis contains 3 vectors, the dimension of $A$'s null space (i.e., $\dim \operatorname{Nul} A$) is 3.

In general, to find the dimension of $\operatorname{Nul} A$, simply identify and count the free variables in $A\mathbf{x} = \mathbf{0}$.
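As a quick numerical check of this count (a sketch, not part of the original derivation): row operations preserve the null space, so we can apply `scipy.linalg.null_space` to the reduced matrix above. Note that `null_space` is not imported in the setup cell at the top, so the import below is an addition.

In [ ]:
# Sketch: count the dimension of Nul A numerically.
# Row operations preserve the null space, so we can work with the reduced matrix above.
from scipy.linalg import null_space

B = np.array([[1, -2, 0, -1,  3],
              [0,  0, 1,  2, -2],
              [0,  0, 0,  0,  0]])
N = null_space(B)       # columns form an orthonormal basis for Nul B (= Nul A)
print(N.shape[1])       # 3 -- one basis vector per free variable (x2, x4, x5)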

Basis and Dimension for Column Space¶

To find a basis for the column space, we have an easier starting point.

We know that the column space is the span of the matrix columns.

So, we can choose matrix columns to make up the basis.

The question is: which columns should we choose?

Warmup.

We start with a warmup example.

Suppose we have a matrix $B$ that happens to be in reduced echelon form:

$$B = \begin{bmatrix} 1 & 0 & -3 & 5 & 0 \\ 0 & 1 & 2 & -1 & 0 \\ 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}.$$

Denote the columns of $B$ by $\mathbf{b}_1, \dots, \mathbf{b}_5$.

Note that $\mathbf{b}_3 = -3\mathbf{b}_1 + 2\mathbf{b}_2$ and $\mathbf{b}_4 = 5\mathbf{b}_1 - \mathbf{b}_2$.

So any combination of $\mathbf{b}_1, \dots, \mathbf{b}_5$ is actually just a combination of $\mathbf{b}_1, \mathbf{b}_2,$ and $\mathbf{b}_5$.

So $\{\mathbf{b}_1, \mathbf{b}_2, \mathbf{b}_5\}$ spans $\operatorname{Col} B$.

Also, $\mathbf{b}_1, \mathbf{b}_2,$ and $\mathbf{b}_5$ are linearly independent, because they are columns from an identity matrix.

So: the pivot columns of $B$ form a basis for $\operatorname{Col} B$.

So, for matrices in reduced row echelon form, we have a simple rule for the basis of the column space:

Choose the columns that hold the pivots.
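As a quick numerical sanity check of the two dependence relations above (a sketch; numpy is already imported as np):

In [ ]:
# Sketch: verify that the non-pivot columns of B are combinations of the pivot columns.
B = np.array([[1, 0, -3,  5, 0],
              [0, 1,  2, -1, 0],
              [0, 0,  0,  0, 1],
              [0, 0,  0,  0, 0]])
b1, b2, b3, b4, b5 = B.T
print(np.allclose(b3, -3*b1 + 2*b2))   # True
print(np.allclose(b4,  5*b1 - b2))     # True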

The general case.

Now I'll show that the pivot columns of $A$ form a basis for $\operatorname{Col} A$ for any $A$.

Consider the case where $A\mathbf{x} = \mathbf{0}$ for some nonzero $\mathbf{x}$. This says that there is a linear dependence relation between some of the columns of $A$.

When we row-reduce $A$ to its reduced echelon form $B$, the columns are changed, but the equations $A\mathbf{x} = \mathbf{0}$ and $B\mathbf{x} = \mathbf{0}$ have the same solution set.

This means that the columns of $A$ have exactly the same dependence relationships as the columns of $B$. Specifically:

  1. If some column of $B$ can be written as a combination of other columns of $B$, then the same is true of the corresponding columns of $A$.

  2. If no combination of certain columns of $B$ yields the zero vector, then no combination of the corresponding columns of $A$ yields the zero vector.

In other words:

  1. If some set of columns of $B$ spans the column space of $B$, then the same columns of $A$ span the column space of $A$.

  2. If some set of columns of $B$ is linearly independent, then the same columns of $A$ are linearly independent.

So, if some columns of $B$ are a basis for $\operatorname{Col} B$, then the corresponding columns of $A$ are a basis for $\operatorname{Col} A$.

Example. Consider the matrix $A$:

$$A = \begin{bmatrix} 1 & 3 & 3 & 2 & -9 \\ -2 & -2 & 2 & -8 & 2 \\ 2 & 3 & 0 & 7 & 1 \\ 3 & 4 & -1 & 11 & -8 \end{bmatrix}$$

To find a basis for its column space, we simply need to look at the pivot columns of its reduced row echelon form.

It turns out that $A$ is row equivalent to the matrix $B$ that we considered above. For matrix $B$, recall that:

  • $\mathbf{b}_3 = -3\mathbf{b}_1 + 2\mathbf{b}_2$ and $\mathbf{b}_4 = 5\mathbf{b}_1 - \mathbf{b}_2$.
  • A basis for $\operatorname{Col} B$ was columns 1, 2, and 5.

Therefore we can immediately conclude that a basis for $\operatorname{Col} A$ is $A$'s columns 1, 2, and 5.

So a basis for $\operatorname{Col} A$ is:

$$\left\{ \begin{bmatrix} 1 \\ -2 \\ 2 \\ 3 \end{bmatrix}, \begin{bmatrix} 3 \\ -2 \\ 3 \\ 4 \end{bmatrix}, \begin{bmatrix} -9 \\ 2 \\ 1 \\ -8 \end{bmatrix} \right\}$$

Theorem. The pivot columns of a matrix $A$ form a basis for the column space of $A$.

Be careful here -- note that

  • we compute the reduced row echelon form of $A$ to find which columns are pivot columns, but...
  • we use the columns of $A$ itself as the basis for $\operatorname{Col} A$!
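Here is a sketch of how this can be done computationally. It assumes SymPy is available (it is not imported in the setup cell above): `Matrix.rref()` returns the reduced row echelon form together with the indices of the pivot columns, and we then take those columns from $A$ itself.

In [ ]:
# Sketch (assumes sympy is installed): pivot columns of A give a basis for Col A.
import sympy

A = sympy.Matrix([[ 1,  3,  3,  2, -9],
                  [-2, -2,  2, -8,  2],
                  [ 2,  3,  0,  7,  1],
                  [ 3,  4, -1, 11, -8]])
R, pivots = A.rref()
print(pivots)                        # (0, 1, 4), i.e., columns 1, 2, and 5
basis = [A[:, j] for j in pivots]    # columns of A itself, not of R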

Matrix Rank¶

Definition. The rank of a matrix, denoted by $\operatorname{Rank} A$, is the dimension of the column space of $A$.

Since the pivot columns of $A$ form a basis for $\operatorname{Col} A$, the rank of $A$ is just the number of pivot columns in $A$. This is also equal to the number of basic variables in $A\mathbf{x} = \mathbf{0}$ (hopefully now the term makes sense!).

Example. Determine the rank of the matrix

$$A = \begin{bmatrix} 2 & 5 & -3 & -4 & 8 \\ 4 & 7 & -4 & -3 & 9 \\ 6 & 9 & -5 & 2 & 4 \\ 0 & -9 & 6 & 5 & -6 \end{bmatrix}.$$

Solution. Reduce $A$ to an echelon form:

$$A \sim \begin{bmatrix} 2 & 5 & -3 & -4 & 8 \\ 0 & -3 & 2 & 5 & -7 \\ 0 & -6 & 4 & 14 & -20 \\ 0 & -9 & 6 & 5 & -6 \end{bmatrix} \sim \cdots \sim \begin{bmatrix} 2 & 5 & -3 & -4 & 8 \\ 0 & -3 & 2 & 5 & -7 \\ 0 & 0 & 0 & 4 & -6 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}.$$

The matrix $A$ has 3 pivot columns, so $\operatorname{Rank} A = 3$.

The Rank Theorem¶

Theorem. If a matrix $A$ has $n$ columns, then $\operatorname{Rank} A + \dim \operatorname{Nul} A = n$.
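We can sanity-check the Rank Theorem numerically for the rank example above (a sketch; `np.linalg.matrix_rank` and `scipy.linalg.null_space` both work in floating point, which is fine for this small integer matrix):

In [ ]:
# Sketch: verify Rank A + dim Nul A = n for the example above.
from scipy.linalg import null_space

A = np.array([[2,  5, -3, -4,  8],
              [4,  7, -4, -3,  9],
              [6,  9, -5,  2,  4],
              [0, -9,  6,  5, -6]])
rank = np.linalg.matrix_rank(A)
nullity = null_space(A).shape[1]
print(rank, nullity, rank + nullity == A.shape[1])   # 3 2 True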

Lecture 22: Orthogonality¶

[This lecture is based on Prof. Crovella's CS 132 lecture notes and Fast.ai Lecture 8.]

22.1 Coordinate Systems¶

Today's lecture explores the geometry of vectors and matrices. We will examine the concept of orthogonality of two vectors in more detail. This will lead to our second type of matrix factorization, called QR factorization.

To begin, let's look at the geometry of space in a new way, based on the concept of a basis that we defined last time.

A geometric coordinate system gives a unique name to each point in the plane (or in any vector space).

Traditionally, we label points using the standard basis. Remember that the "standard basis" refers to the columns of the identity matrix. For instance, in $\mathbb{R}^2$ the standard basis vectors are:

$$\mathbf{e}_1 = \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \quad \mathbf{e}_2 = \begin{bmatrix} 0 \\ 1 \end{bmatrix}.$$
In [13]:
#
ax = dm.plotSetup(-3,3,-3,3,size=(6,6))
ax.plot([0],[1],'ro',markersize=8)
ax.arrow(0, 0, 1, 0, head_width=0.2, head_length=0.2, length_includes_head = True)
ax.arrow(0, 0, 0, 1, head_width=0.2, head_length=0.2, length_includes_head = True)
ax.text(0.25,1,'(0,1)',size=20)
ax.plot([1],[0],'ro',markersize=8)
ax.text(1.25,0.25,'(1,0)',size=20)
ax.plot([-1],[2],'ro',markersize=8)
ax.text(-1.5,2.25,'(-1,2)',size=20);

In the last lecture we developed the idea of a basis -- a minimal spanning set for a subspace $H$. What happens if we label points using a different basis? Perhaps even a basis whose vectors aren't perpendicular?

Today we'll show that:

a basis provides a coordinate system for $H$.

Theorem. If we are given a basis for $H$, then each vector in $H$ can be written in only one way as a linear combination of the basis vectors.

Proof. Suppose $\mathcal{B} = \{\mathbf{b}_1, \dots, \mathbf{b}_p\}$ is a basis for $H$, and suppose a vector $\mathbf{x}$ in $H$ can be generated in two ways, say

$$\mathbf{x} = c_1\mathbf{b}_1 + \cdots + c_p\mathbf{b}_p \quad \text{and} \quad \mathbf{x} = d_1\mathbf{b}_1 + \cdots + d_p\mathbf{b}_p.$$

Then, subtracting gives

$$\mathbf{0} = \mathbf{x} - \mathbf{x} = (c_1 - d_1)\mathbf{b}_1 + \cdots + (c_p - d_p)\mathbf{b}_p.$$

Now, since $\mathcal{B}$ is a basis, we know that the vectors $\{\mathbf{b}_1, \dots, \mathbf{b}_p\}$ are linearly independent.

So by the definition of linear independence, the weights in the above expression must all be zero. That is, $c_j = d_j$ for all $j$.

As a result, the two representations must be the same.

Definition. Suppose the set $\mathcal{B} = \{\mathbf{b}_1, \dots, \mathbf{b}_p\}$ is a basis for the subspace $H$.

For each $\mathbf{x}$ in $H$, the coordinates of $\mathbf{x}$ relative to the basis $\mathcal{B}$ are the weights $c_1, \dots, c_p$ such that $\mathbf{x} = c_1\mathbf{b}_1 + \cdots + c_p\mathbf{b}_p$.

The vector in $\mathbb{R}^p$

$$[\mathbf{x}]_{\mathcal{B}} = \begin{bmatrix} c_1 \\ \vdots \\ c_p \end{bmatrix}$$

is called the coordinate vector of $\mathbf{x}$ (relative to $\mathcal{B}$) or the $\mathcal{B}$-coordinate vector of $\mathbf{x}$.

Example. In $\mathbb{R}^2$, let's look at the point $\mathbf{x} = \begin{bmatrix} 1 \\ 6 \end{bmatrix}$. We will plot this point in the standard basis and also relative to a new basis:

$$\mathcal{B} = \left\{ \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 \\ 2 \end{bmatrix} \right\}.$$
In [2]:
# standard basis
xmin = -6.0 
xmax = 6.0 
ymin = -2.0
ymax = 8.0

b0 = [1, 0]
b1 = [1, 2]

fig = ut.two_d_figure('Dummylabel', xmin = xmin, xmax = xmax, ymin = ymin, ymax = ymax, size=(6,5))
for x in np.linspace(ymin, ymax, int(ymax-ymin+1)):
    fig.plotLinEqn(0., 1., x, alpha=0.3)
for y in np.linspace(xmin, xmax, int(xmax-xmin+1)):
    fig.plotLinEqn(1., 0., y, alpha=0.3)
fig.plotLinEqn(1., 0, 0, color = 'k')
fig.plotLinEqn(0, 1, 0, color = 'k')
fig.plotPoint(0, 0, 'k')
fig.ax.text(0+.1, 0-.1, r'$\bf 0$', size = 12, horizontalalignment='left', verticalalignment='top')
fig.plotPoint(1, 0, 'r')
fig.ax.text(1+.1, 0-.1, r'${\bf e}_1$', size = 12, horizontalalignment='left', verticalalignment='top')
fig.plotPoint(0, 1, 'r')
fig.ax.text(0+.1, 1-.1, r'${\bf e}_2$', size = 12, horizontalalignment='left', verticalalignment='top')
fig.plotPoint(b1[0], b1[1], 'g')
fig.ax.text(b1[0]+.1, b1[1]-.1, r'${\bf b}_2$', size = 12, horizontalalignment='left', verticalalignment='top')
fig.plotPoint(1, 6, 'b')
fig.ax.text(1+.1, 6-.1, r'${\bf x}$', size = 12, horizontalalignment='left', verticalalignment='top')
fig.ax.axis('off')
fig.ax.set_title(r'Standard Basis.  $\mathbf{x}$ = (1, 6)', size = 16);
In [3]:
# B-basis
fig = ut.two_d_figure('Dummylabel', xmin = xmin, xmax = xmax, ymin = ymin, ymax = ymax, size=(6,5))
m = b1[1]/b1[0]
upper_intercept = ymax - m * xmin
upper_intercept = b1[1] * np.ceil(upper_intercept / b1[1])
lower_intercept = ymin - m * xmax
lower_intercept = b1[1] * np.floor(lower_intercept / b1[1])
for yint in np.linspace(lower_intercept, upper_intercept, int((upper_intercept-lower_intercept)/b1[1])+1):
    fig.plotLinEqn(-b1[1], b1[0], yint, color = 'g', alpha=0.3)
for y in np.linspace(ymin, ymax, int(((ymax-ymin)/b1[1])+1)):
    fig.plotLinEqn(0., 1., y, color = 'g', alpha=0.3)
fig.plotLinEqn(b1[1], -b1[0], 0, color = 'k')
fig.plotLinEqn(b0[1], -b0[0], 0, color = 'k')
fig.plotPoint(0, 0, 'k')
fig.ax.text(0+.1, 0-.1, r'$\bf 0$', size = 12, horizontalalignment='left', verticalalignment='top')
fig.plotPoint(1, 0, 'g')
fig.ax.text(1+.1, 0-.1, r'${\bf b}_1$', size = 12, horizontalalignment='left', verticalalignment='top')
fig.plotPoint(b1[0], b1[1], 'g')
fig.ax.text(b1[0]+.1, b1[1]-.1, r'${\bf b}_2$', size = 12, horizontalalignment='left', verticalalignment='top')
fig.plotPoint(1, 6, 'b')
fig.ax.text(1+.1, 6-.1, r'${\bf x}$', size = 12, horizontalalignment='left', verticalalignment='top')
fig.ax.axis('off')
fig.ax.set_title(r'B-Basis. $[\mathbf{x}]_\mathcal{B}$ = (-2, 3)', size = 16);

Notice that the location of $\mathbf{x}$ relative to the origin does not change.

However, using the $\mathcal{B}$-basis, it has different coordinates. The new coordinates are $[\mathbf{x}]_{\mathcal{B}} = \begin{bmatrix} -2 \\ 3 \end{bmatrix}$.
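A quick check of these coordinates (a sketch; the claim is just that $-2\mathbf{b}_1 + 3\mathbf{b}_2$ lands on the point $(1, 6)$):

In [ ]:
# Sketch: the B-coordinates (-2, 3) reproduce the point x = (1, 6).
b1 = np.array([1, 0])
b2 = np.array([1, 2])
print(-2*b1 + 3*b2)    # [1 6]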

Finding the Coordinates of a Point in a Basis¶

Suppose we are given a particular basis. How do we find the coordinates of a point in that basis?

Let's consider a specific example.

Let $\mathbf{v}_1 = \begin{bmatrix} 3 \\ 6 \\ 2 \end{bmatrix}, \quad \mathbf{v}_2 = \begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix}, \quad \mathbf{x} = \begin{bmatrix} 3 \\ 12 \\ 7 \end{bmatrix},$ and $\mathcal{B} = \{\mathbf{v}_1, \mathbf{v}_2\}$.

Then $\mathcal{B}$ is a basis for $H = \operatorname{Span}\{\mathbf{v}_1, \mathbf{v}_2\}$, because $\mathbf{v}_1$ and $\mathbf{v}_2$ are linearly independent.

Problem: Determine if $\mathbf{x}$ is in $H$, and if it is, find the coordinate vector of $\mathbf{x}$ relative to $\mathcal{B}$.

Solution. If $\mathbf{x}$ is in $H$, then the following vector equation is consistent:

$$c_1 \begin{bmatrix} 3 \\ 6 \\ 2 \end{bmatrix} + c_2 \begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 3 \\ 12 \\ 7 \end{bmatrix}.$$

The scalars $c_1$ and $c_2$, if they exist, are the $\mathcal{B}$-coordinates of $\mathbf{x}$.

Row operations show that

$$\begin{bmatrix} 3 & -1 & 3 \\ 6 & 0 & 12 \\ 2 & 1 & 7 \end{bmatrix} \sim \begin{bmatrix} 1 & 0 & 2 \\ 0 & 1 & 3 \\ 0 & 0 & 0 \end{bmatrix}.$$

The reduced row echelon form shows that the system is consistent, so $\mathbf{x}$ is in $H$.

Furthermore, it shows that $c_1 = 2$ and $c_2 = 3$,

so $[\mathbf{x}]_{\mathcal{B}} = \begin{bmatrix} 2 \\ 3 \end{bmatrix}$.
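Here is a numerical version of the same computation (a sketch; `np.linalg.lstsq` returns the least-squares solution, which is the exact solution here because the system is consistent):

In [ ]:
# Sketch: solve for the B-coordinates of x numerically.
V = np.column_stack(([3, 6, 2], [-1, 0, 1]))    # columns are v1 and v2
x = np.array([3, 12, 7])
c, residual, rank, sv = np.linalg.lstsq(V, x, rcond=None)
print(c)    # [2. 3.] -- so [x]_B = (2, 3)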

In this example, the basis $\mathcal{B}$ determines a coordinate system on $H$, which can be visualized like this:

In [2]:
#
fig = ut.three_d_figure((22, 1), fig_desc = 'Basis-B coordinate system on the subspace H', figsize = (8, 8),
                        xmin = -1, xmax = 10, ymin = -1, ymax = 10, zmin = -1, zmax = 14, qr = qr_setting)
v = [3.0, 6, 2]
u = [-1,  0, 1]
#
v = np.array([2, 0.5, 0.5])
u = np.array([1.,  2,   1])
fig.text(v[0], v[1]-0.75, v[2]-2, r'$\bf v_1$', 'v1', size=16)
fig.text(u[0]-0.5, u[1], u[2]+2, r'$\bf v_2$', 'v2', size=16)
fig.text(3*u[0]+2*v[0], 3*u[1]+2*v[1], 3*u[2]+2*v[2]+1, r'$\bf 2v_1+3v_2$', '2 v1 + 3 v2', size=16)
# plotting the span of v
fig.plotSpan(u, v, 'Green')
# blue grid lines
fig.plotPoint(0, 0, 0, 'y')
fig.plotPoint(u[0], u[1], u[2], 'b')
fig.plotPoint(2*u[0], 2*u[1], 2*u[2],'b')
fig.plotPoint(3*u[0], 3*u[1], 3*u[2], 'b')
fig.plotLine([[0, 0, 0], 4*u], color='b')
fig.plotLine([v, v+4*u], color='b')
fig.plotLine([2*v, 2*v+3*u], color='b')
fig.plotLine([3*v, 3*v+2.5*u], color='b')
# red grid lines
fig.plotPoint(v[0], v[1], v[2], 'r')
fig.plotPoint(2*v[0], 2*v[1], 2*v[2], 'r')
fig.plotPoint(3*v[0], 3*v[1], 3*v[2], 'r')
fig.plotLine([[0, 0, 0], 3.5*v], color='r')
fig.plotLine([u, u+3.5*v], color='r')
fig.plotLine([2*u, 2*u+3.5*v], color='r')
fig.plotLine([3*u, 3*u+2*v], color='r')
#
fig.plotPoint(3*u[0]+2*v[0], 3*u[1]+2*v[1], 3*u[2]+2*v[2], color='m')
# plotting the axes
fig.plotIntersection([0,0,1,0], [0,1,0,0], color='Black')
fig.plotIntersection([0,0,1,0], [1,0,0,0], color='Black')
fig.plotIntersection([0,1,0,0], [1,0,0,0], color='Black')
fig.set_title('Basis-B coordinate system on the subspace H', size=16)
fig.save()

Important: Don't be confused by the fact that the "coordinate axes" are not perpendicular.

The whole idea here is that they don't need to be.

As long as the independent vectors span the space, there is only one way to express any point in terms of them.

Thus, every point has a unique coordinate.

Isomorphism¶

Another important idea is that, although points in $H$ are in $\mathbb{R}^3$, they are completely determined by their coordinate vectors, which belong to $\mathbb{R}^2$.

That is, the grid in the previous figure makes $H$ "look like" $\mathbb{R}^2$.

The correspondence $\mathbf{x} \mapsto [\mathbf{x}]_{\mathcal{B}}$ is a one-to-one correspondence between $H$ and $\mathbb{R}^2$ that preserves linear combinations.

In our example, $\begin{bmatrix} 3 \\ 12 \\ 7 \end{bmatrix} \mapsto \begin{bmatrix} 2 \\ 3 \end{bmatrix}$.

Definition. When we have a one-to-one correspondence between two subspaces that preserves linear combinations, we call such a correspondence an isomorphism, and we say that $H$ is isomorphic to $\mathbb{R}^2$.

In general, if $\mathcal{B} = \{\mathbf{b}_1, \dots, \mathbf{b}_p\}$ is a basis for $H$, then the mapping $\mathbf{x} \mapsto [\mathbf{x}]_{\mathcal{B}}$ is a one-to-one correspondence that makes $H$ look and act the same as $\mathbb{R}^p$.

This is even though the vectors in $H$ themselves may have more than $p$ entries.

22.2 Orthogonal Sets¶

Today we deepen our study of geometry. We take on more challenging geometric notions that bring in sets of vectors and subspaces. Within this realm, we will focus on orthogonality and a new notion called projection.

First of all, today we'll study the properties of sets of orthogonal vectors. These can be very useful.

Definition. A set of vectors $\{\mathbf{u}_1, \dots, \mathbf{u}_p\}$ in $\mathbb{R}^n$ is said to be an orthogonal set if each pair of distinct vectors from the set is orthogonal, i.e.,

$$\mathbf{u}_i^\top \mathbf{u}_j = 0 \quad \text{whenever} \quad i \neq j.$$

Example. Show that $\{\mathbf{u}_1, \mathbf{u}_2, \mathbf{u}_3\}$ is an orthogonal set, where

$$\mathbf{u}_1 = \begin{bmatrix} 3 \\ 1 \\ 1 \end{bmatrix}, \quad \mathbf{u}_2 = \begin{bmatrix} -1 \\ 2 \\ 1 \end{bmatrix}, \quad \mathbf{u}_3 = \begin{bmatrix} -1/2 \\ -2 \\ 7/2 \end{bmatrix}.$$

Solution. Take the inner product of each pair of distinct vectors:

$$\begin{aligned} \mathbf{u}_1^\top \mathbf{u}_2 &= 3(-1) + 1(2) + 1(1) = 0, \\ \mathbf{u}_1^\top \mathbf{u}_3 &= 3(-1/2) + 1(-2) + 1(7/2) = 0, \\ \mathbf{u}_2^\top \mathbf{u}_3 &= -1(-1/2) + 2(-2) + 1(7/2) = 0. \end{aligned}$$

Each pair of distinct vectors is orthogonal, and so $\{\mathbf{u}_1, \mathbf{u}_2, \mathbf{u}_3\}$ is an orthogonal set.

In three-space they describe three lines that we say are mutually perpendicular.
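We can verify these inner products numerically (a quick sketch using numpy):

In [ ]:
# Sketch: all pairwise inner products are zero, so the set is orthogonal.
u1 = np.array([3, 1, 1])
u2 = np.array([-1, 2, 1])
u3 = np.array([-1/2, -2, 7/2])
print(u1 @ u2, u1 @ u3, u2 @ u3)    # 0 0.0 0.0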

In [3]:
#
fig = ut.three_d_figure((22, 2), fig_desc = 'An orthogonal set of vectors',
                        xmin = -3, xmax = 3, ymin = -3, ymax = 3, zmin = -3, zmax = 3, 
                        figsize = (12, 8), qr = qr_setting)
u1 = np.array([3, 1, 1])
u2 = np.array([-1, 2, 1])
u3 = np.array([-1/2, -2, 7/2])
origin = np.array([0, 0, 0])
fig.plotLine([origin, u1], 'r', '--')
fig.plotPoint(u1[0], u1[1], u1[2], 'r')
fig.text(u1[0]+.1, u1[1]+.1, u1[2]+.1, r'$\bf u_1$', 'u1', size=16, color='k')
fig.plotLine([origin, u2], 'r', '--')
fig.plotPoint(u2[0], u2[1], u2[2], 'r')
fig.text(u2[0]+.1, u2[1]+.1, u2[2]+.1, r'$\bf u_2$', 'u2', size=16, color='k')
fig.plotLine([origin, u3], 'r', '--')
fig.plotPoint(u3[0], u3[1], u3[2], 'r')
fig.text(u3[0]+.1, u3[1]+.1, u3[2]+.1, r'$\bf u_3$', 'u3', size=16, color = 'k')
fig.text(origin[0]-.45, origin[1]-.45, origin[2]-.45, r'$\bf 0$', 0, size = 16)
fig.plotPerpSym(origin, u1, u2, 0.5)
fig.plotPerpSym(origin, u3, u2, 0.5)
fig.plotPerpSym(origin, u3, u1, 0.5)
fig.set_title(r'An orthogonal set of vectors in $\mathbb{R}^3$', 'An orthogonal set of vectors in R3', size = 16)
fig.save();

Orthogonal Sets Must be Independent¶

Orthogonal sets are very nice to work with. First of all, we will show that any orthogonal set of nonzero vectors must be linearly independent.

Theorem. If $S = \{\mathbf{u}_1, \dots, \mathbf{u}_p\}$ is an orthogonal set of nonzero vectors in $\mathbb{R}^n$, then $S$ is linearly independent.

Proof. We will prove that there is no linear combination of the vectors in $S$ with nonzero coefficients that yields the zero vector.

It is easier to prove the contrapositive. The contrapositive of "if A then B" is "if not-B then not-A".

That is: our proof strategy will be to show that for any linear combination of the vectors in $S$,

if the combination is the zero vector, then all coefficients of the combination must be zero.

Specifically: assume $\mathbf{0} = c_1\mathbf{u}_1 + \cdots + c_p\mathbf{u}_p$ for some scalars $c_1, \dots, c_p$. Then:

$$\mathbf{0} = c_1\mathbf{u}_1 + c_2\mathbf{u}_2 + \cdots + c_p\mathbf{u}_p$$
$$\mathbf{0}^\top \mathbf{u}_1 = (c_1\mathbf{u}_1 + c_2\mathbf{u}_2 + \cdots + c_p\mathbf{u}_p)^\top \mathbf{u}_1$$
$$0 = (c_1\mathbf{u}_1)^\top \mathbf{u}_1 + (c_2\mathbf{u}_2)^\top \mathbf{u}_1 + \cdots + (c_p\mathbf{u}_p)^\top \mathbf{u}_1$$
$$0 = c_1(\mathbf{u}_1^\top \mathbf{u}_1) + c_2(\mathbf{u}_2^\top \mathbf{u}_1) + \cdots + c_p(\mathbf{u}_p^\top \mathbf{u}_1)$$

Because $\mathbf{u}_1$ is orthogonal to $\mathbf{u}_2, \dots, \mathbf{u}_p$:

$$0 = c_1(\mathbf{u}_1^\top \mathbf{u}_1)$$

Since $\mathbf{u}_1$ is nonzero, the inner product $\mathbf{u}_1^\top \mathbf{u}_1$ is not zero, and so $c_1 = 0$.

We can use the same kind of reasoning to show that $c_2, \dots, c_p$ must be zero.

In other words, there is no nonzero combination of the $\mathbf{u}_i$'s that yields the zero vector ...

... so $S$ is linearly independent.

Notice that since $S$ is a linearly independent set, it is a basis for the subspace spanned by $S$.

This leads us to a new kind of basis.

22.3 Orthogonal Basis¶

Definition. An orthogonal basis for a subspace $H$ of $\mathbb{R}^n$ is a basis for $H$ that is also an orthogonal set.

For example, consider

$$\mathbf{u}_1 = \begin{bmatrix} -1/2 \\ 2 \\ 1 \end{bmatrix}, \quad \mathbf{u}_2 = \begin{bmatrix} 8/3 \\ 1/3 \\ 2/3 \end{bmatrix}.$$

Note that $\mathbf{u}_1^\top \mathbf{u}_2 = 0$. Hence they form an orthogonal basis for their span.
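A quick numerical check of this inner product (a sketch):

In [ ]:
# Sketch: u1 and u2 are orthogonal.
u1 = np.array([-1/2, 2, 1])
u2 = np.array([8/3, 1/3, 2/3])
print(u1 @ u2)    # 0.0 (up to floating-point rounding)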

Here is the subspace $W = \operatorname{Span}\{\mathbf{u}_1, \mathbf{u}_2\}$:

In [4]:
#
fig = ut.three_d_figure((22, 3), fig_desc = 'Orthogonal Basis on the subspace H', figsize = (8, 8),
                        xmin = -2, xmax = 10, ymin = -1, ymax = 10, zmin = -1, zmax = 10, qr = qr_setting)

v = 1/2 * np.array([-1, 4, 2])
u = 1/3 * np.array([8, 1, 2])
vpos = v + 0.4 * v - 0.5 * u
upos = u - 0.5 * v + 0.15 * u
fig.text(vpos[0], vpos[1], vpos[2], r'$\bf u_2$', 'v', size=16)
fig.text(upos[0], upos[1], upos[2], r'$\bf u_1$', 'u', size=16)
# fig.text(3*u[0]+2*v[0], 3*u[1]+2*v[1], 3*u[2]+2*v[2]+1, r'$\bf 2v_1+3v_2$', '2 v1 + 3 v2', size=16)
# plotting the span of v
fig.plotSpan(u, v, 'Green')
# blue grid lines
fig.plotPoint(0, 0, 0, 'y')
fig.plotPoint(u[0], u[1], u[2], 'b')
fig.plotPoint(2*u[0], 2*u[1], 2*u[2],'b')
fig.plotPoint(3*u[0], 3*u[1], 3*u[2], 'b')
fig.plotLine([[0, 0, 0], 4*u], color='b')
fig.plotLine([v, v+4*u], color='b')
fig.plotLine([2*v, 2*v+3*u], color='b')
fig.plotLine([3*v, 3*v+2.5*u], color='b')
# red grid lines
fig.plotPoint(v[0], v[1], v[2], 'r')
fig.plotPoint(2*v[0], 2*v[1], 2*v[2], 'r')
fig.plotPoint(3*v[0], 3*v[1], 3*v[2], 'r')
fig.plotLine([[0, 0, 0], 3.5*v], color='r')
fig.plotLine([u, u+3.5*v], color='r')
fig.plotLine([2*u, 2*u+3.5*v], color='r')
fig.plotLine([3*u, 3*u+2*v], color='r')
#
# fig.plotPoint(3*u[0]+2*v[0], 3*u[1]+2*v[1], 3*u[2]+2*v[2], color='m')
# plotting the axes
#fig.plotIntersection([0,0,1,0], [0,1,0,0], color='Black')
#fig.plotIntersection([0,0,1,0], [1,0,0,0], color='Black')
#fig.plotIntersection([0,1,0,0], [1,0,0,0], color='Black')
#
fig.plotPerpSym(np.array([0, 0, 0]), v, u, 1)
fig.plotPerpSym(u, v+u, u+u, 1)
fig.plotPerpSym(2*u, v+2*u, 3*u, 1)
#
fig.plotPerpSym(np.array([0, 0, 0])+v, 2*v, v+u, 1)
fig.plotPerpSym(u+v, 2*v+u, v+2*u, 1)
#
fig.set_title(r'Orthogonal Basis on the subspace $H$', 'Orthogonal Basis on the subspace H', size=16)
fig.save()

Coordinates in an orthogonal basis¶

We have seen that for any subspace $H$, there may be many different sets of vectors that can serve as a basis.

For example, let's say we have a basis $\mathcal{B} = \{\mathbf{u}_1, \mathbf{u}_2, \mathbf{u}_3\}$.

We know that to compute the coordinates of $\mathbf{y}$ in this basis, denoted $[\mathbf{y}]_{\mathcal{B}}$, we need to solve the linear system:

$$c_1\mathbf{u}_1 + c_2\mathbf{u}_2 + c_3\mathbf{u}_3 = \mathbf{y}$$

or

$$U\mathbf{c} = \mathbf{y}.$$

In general, we'd need to perform Gaussian Elimination, or matrix inversion, or some other computationally slow method to do this.

However an orthogonal basis is a particularly nice basis, because the coordinates of any point can be computed easily and simply. Let's see how.

Theorem. Let $\{\mathbf{u}_1, \dots, \mathbf{u}_p\}$ be an orthogonal basis for a subspace $H$ of $\mathbb{R}^n$. For each $\mathbf{y}$ in $H$, the weights of the linear combination

$$c_1\mathbf{u}_1 + \cdots + c_p\mathbf{u}_p = \mathbf{y}$$

are given by

$$c_j = \frac{\mathbf{y}^\top \mathbf{u}_j}{\mathbf{u}_j^\top \mathbf{u}_j}, \qquad j = 1, \dots, p.$$

Proof. Let's consider the inner product of $\mathbf{y}$ and one of the $\mathbf{u}$ vectors --- say, $\mathbf{u}_1$.

As we saw in the last proof, the orthogonality of $\{\mathbf{u}_1, \dots, \mathbf{u}_p\}$ means that

$$\mathbf{y}^\top \mathbf{u}_1 = (c_1\mathbf{u}_1 + c_2\mathbf{u}_2 + \cdots + c_p\mathbf{u}_p)^\top \mathbf{u}_1 = c_1(\mathbf{u}_1^\top \mathbf{u}_1).$$

Since $\mathbf{u}_1^\top \mathbf{u}_1$ is not zero (why?), the equation above can be solved for $c_1$:

$$c_1 = \frac{\mathbf{y}^\top \mathbf{u}_1}{\mathbf{u}_1^\top \mathbf{u}_1}$$

To find any other $c_j$, compute $\mathbf{y}^\top \mathbf{u}_j$ and solve for $c_j$.

Example. The set $S$ that we saw earlier, i.e.,

$$\mathbf{u}_1 = \begin{bmatrix} 3 \\ 1 \\ 1 \end{bmatrix}, \quad \mathbf{u}_2 = \begin{bmatrix} -1 \\ 2 \\ 1 \end{bmatrix}, \quad \mathbf{u}_3 = \begin{bmatrix} -1/2 \\ -2 \\ 7/2 \end{bmatrix},$$

is an orthogonal basis for $\mathbb{R}^3$.

Let's express the vector $\mathbf{y} = \begin{bmatrix} 6 \\ 1 \\ -8 \end{bmatrix}$ as a linear combination of the vectors in $S$.

That is, find $\mathbf{y}$'s coordinates in the basis $S$ --- i.e., in the coordinate system $S$.

In [5]:
#
fig = ut.three_d_figure((22, 4), fig_desc = 'y in an orthogonal basis',
                        xmin = -3, xmax = 7, ymin = -5, ymax = 5, zmin = -8, zmax = 4, 
                        figsize = (12, 8), qr = qr_setting, equalAxes = False)
u1 = np.array([3, 1, 1])
u2 = np.array([-1, 2, 1])
u3 = np.array([-1/2, -2, 7/2])
origin = np.array([0, 0, 0])
#
fig.plotLine([origin, u1], 'r', '--')
fig.plotPoint(u1[0], u1[1], u1[2], 'r')
fig.text(u1[0]+.1, u1[1]+.1, u1[2]+.1, r'$\bf u_1$', 'u1', size=16, color='k')
#
fig.plotLine([origin, u2], 'r', '--')
fig.plotPoint(u2[0], u2[1], u2[2], 'r')
fig.text(u2[0]+.1, u2[1]+.1, u2[2]+.1, r'$\bf u_2$', 'u2', size=16, color='k')
#
fig.plotLine([origin, u3], 'r', '--')
fig.plotPoint(u3[0], u3[1], u3[2], 'r')
fig.text(u3[0]+.1, u3[1]+.1, u3[2]+.1, r'$\bf u_3$', 'u3', size=16, color = 'k')
#
fig.text(origin[0]-.45, origin[1]-.45, origin[2]-.45, r'$\bf 0$', 0, size = 16)
#
fig.plotPerpSym(origin, u1, u2, 0.5)
fig.plotPerpSym(origin, u3, u2, 0.5)
fig.plotPerpSym(origin, u3, u1, 0.5)
#
y = u1 - 2 * u2 - 2 * u3
# print(y)
fig.plotPoint(y[0], y[1], y[2], 'b')
fig.text(y[0]-2, y[1]+.1, y[2]+.1, r'$\bf y$ = (6, 1, -8)', 'y = (6, 1, -8)', size=16, color = 'b')
fig.text(y[0]-2, y[1]+.1, y[2]-2.5, r'${\bf y} = 1{\bf u}_1 -2 {\bf u}_2 -2 {\bf u}_3$', 'y = (6, 1, -8)', size=16, color = 'b')
#
fig.set_title(r'${\bf y}$ in an Orthogonal Basis', 'y in an Orthogonal Basis', size = 16)
fig.save();

Solution. Compute

$$\mathbf{y}^\top \mathbf{u}_1 = 11, \quad \mathbf{y}^\top \mathbf{u}_2 = -12, \quad \mathbf{y}^\top \mathbf{u}_3 = -33,$$
$$\mathbf{u}_1^\top \mathbf{u}_1 = 11, \quad \mathbf{u}_2^\top \mathbf{u}_2 = 6, \quad \mathbf{u}_3^\top \mathbf{u}_3 = 33/2.$$

So

$$\mathbf{y} = \frac{\mathbf{y}^\top \mathbf{u}_1}{\mathbf{u}_1^\top \mathbf{u}_1}\mathbf{u}_1 + \frac{\mathbf{y}^\top \mathbf{u}_2}{\mathbf{u}_2^\top \mathbf{u}_2}\mathbf{u}_2 + \frac{\mathbf{y}^\top \mathbf{u}_3}{\mathbf{u}_3^\top \mathbf{u}_3}\mathbf{u}_3 = \frac{11}{11}\mathbf{u}_1 + \frac{-12}{6}\mathbf{u}_2 + \frac{-33}{33/2}\mathbf{u}_3 = \mathbf{u}_1 - 2\mathbf{u}_2 - 2\mathbf{u}_3.$$

As a result, $[\mathbf{y}]_{\mathcal{B}} = \begin{bmatrix} 1 \\ -2 \\ -2 \end{bmatrix}$.
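The same computation in code (a sketch using plain numpy inner products, following the formula $c_j = \mathbf{y}^\top\mathbf{u}_j / \mathbf{u}_j^\top\mathbf{u}_j$):

In [ ]:
# Sketch: coordinates in an orthogonal basis, one inner-product ratio per coefficient.
u1 = np.array([3, 1, 1])
u2 = np.array([-1, 2, 1])
u3 = np.array([-1/2, -2, 7/2])
y  = np.array([6, 1, -8])
coords = [(y @ u) / (u @ u) for u in (u1, u2, u3)]
print(coords)    # [1.0, -2.0, -2.0]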

Note how much simpler it is to find the coordinates of $\mathbf{y}$ in the orthogonal basis, because:

  • There are no matrix operations involved, so the overall computational work is quadratic rather than cubic ($n$ inner products, each of which takes time $2n$).
  • Each coefficient $c_i$ can be found separately, so with enough computers available you can perform the calculation in linear time.