In [1]:
#
import numpy as np
import scipy as sp
import pandas as pd
import matplotlib as mp
import matplotlib.pyplot as plt
import seaborn as sns
import sklearn
import laUtilities as ut
import slideUtilities as sl
import demoUtilities as dm
from matplotlib import animation
from importlib import reload
from datetime import datetime
from IPython.display import Image, display_html, display, Math, HTML;
qr_setting = None

mp.rcParams['animation.html'] = 'jshtml';

Announcements

  • Homework 4 is out, due 3/3

  • Upcoming office hours

    • Today: Prof. McDonald from 4:30-6pm in CCDS 1341
    • Tomorrow: peer tutor Rohan Anand from 1:30-3pm on the CCDS 16th floor
  • Reading: Aggarwal, Sections 2.4-2.5

  • Watch 3Blue1Brown video 5 and video 6

Lecture 17: Matrix Composition

In [104]:
#
B = np.array(
    [[ 1, 0],
     [ 0,-1]])
A = np.array(
    [[-1, 0],
     [ 0, 1]])
#print("A ="); print(A)
#print("B ="); print(B)
dm.plotSetup()
dm.plotShape(note)                 # `note` (a musical-note shape) is defined in a cell below
dm.plotShape(B @ note,'g')         # reflect through the x_1 axis
dm.plotShape(A @ (B @ note),'r')   # then reflect through the x_2 axis

[This lecture is based on Prof. Crovella's CS 132 lecture notes.]

Recap of the previous lecture

Last time, we discussed a geometric interpretation of matrix-vector multiplication $Ax = b$.

We think of $A$ as a linear transformation that maps the vector $x$ in the domain to a new vector $b$ in the codomain. The resulting vector $b$ is called the image of $x$ under the transformation $A$.

In [4]:
#
display(Image("images/05-3dtransform.jpg", width=1000))

We saw some example transformations, like the horizontal shear matrix $A = \begin{bmatrix} 1 & 1.5 \\ 0 & 1 \end{bmatrix}$.

To understand how a matrix transforms space, it suffices to understand how it transforms the standard basis vectors $e_1 = \begin{bmatrix} 1 \\ 0 \end{bmatrix}$ and $e_2 = \begin{bmatrix} 0 \\ 1 \end{bmatrix}$, along with the unit square containing them.
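
A quick numerical check (a minimal sketch in plain numpy, independent of the `dm` plotting utilities): multiplying $A$ by $e_1$ and $e_2$ simply reads off the columns of $A$.

In [ ]:
import numpy as np

A = np.array([[1.0, 1.5],
              [0.0, 1.0]])    # the horizontal shear from above

e1 = np.array([1.0, 0.0])
e2 = np.array([0.0, 1.0])

print(A @ e1)    # [1. 0.]  -- first column of A: e1 is unchanged
print(A @ e2)    # [1.5 1.] -- second column of A: e2 is sheared to the right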

In [2]:
#
ax = dm.plotSetup()
square = np.array([[0, 1, 1, 0],
                   [1, 1, 0, 0]])
dm.plotSquare(square)
A = np.array([[1.0, 1.5],[0.0,1.0]])
ssquare = np.zeros(np.shape(square))
for i in range(4):
    ssquare[:,i] = dm.AxVS(A,square[:,i])
dm.plotSquare(ssquare,'r')
ax.plot([0],[1],'ro',markersize=8)
ax.arrow(0, 0, 1, 0, head_width=0.2, head_length=0.2, length_includes_head = True)
ax.arrow(0, 0, 0, 1, head_width=0.2, head_length=0.2, length_includes_head = True)
ax.text(-0.5,0.35,'$e_2$',size=20)
ax.plot([1],[0],'ro',markersize=8)
ax.text(0.35,-0.35,'$e_1$',size=20);

Then, we can understand how the matrix applies to all of $\mathbb{R}^2$ using the principle of linearity.

Definition. A transformation $T$ is linear if:

$$T(cu + dv) = cT(u) + dT(v)$$

for all $u, v$ in the domain of $T$ and for all scalars $c, d \in \mathbb{R}$.

We proved last time that the ideas of matrix multiplication and linear transformation of vector spaces are equivalent.

  • Multiplying by an $m \times n$ matrix $A$ always produces a linear transformation from $\mathbb{R}^n$ to $\mathbb{R}^m$.
  • Every linear transformation $T$ from vectors to vectors is equivalent to multiplication by the matrix $A = [T(e_1) \; \dots \; T(e_n)]$.
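
The second bullet is a recipe we can actually run: apply $T$ to each standard basis vector and stack the results as columns. Here is a minimal sketch in plain numpy (the function `T` below is a made-up example, not from the lecture code):

In [ ]:
import numpy as np

def T(x):
    # an example linear map R^2 -> R^2: rotate 90 degrees counterclockwise
    return np.array([-x[1], x[0]])

n = 2
# the columns of A are T(e_1), ..., T(e_n)
A = np.column_stack([T(e) for e in np.eye(n)])
print(A)                          # [[ 0. -1.]
                                  #  [ 1.  0.]]

x = np.array([2.0, 3.0])
print(np.allclose(A @ x, T(x)))   # True: multiplying by A reproduces T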

Then, we found the matrix corresponding to several linear transformations of 2-dimensional space. For example, rotating counterclockwise by $\theta = 90$ degrees corresponds to the matrix $A = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}$.

In [3]:
#
note = np.array(
        [[193,47],
        [140,204],
        [123,193],
        [99,189],
        [74,196],
        [58,213],
        [49,237],
        [52,261],
        [65,279],
        [86,292],
        [113,295],
        [135,282],
        [152,258],
        [201,95],
        [212,127],
        [218,150],
        [213,168],
        [201,185],
        [192,200],
        [203,214],
        [219,205],
        [233,191],
        [242,170],
        [244,149],
        [242,131],
        [233,111]])
note = note.T/150.0
dm.plotSetup()
note = dm.mnote()   # overwrite the hand-built coordinates with the utility's note shape
dm.plotShape(note)
angle = 90
theta = (angle/180) * np.pi
A = np.array(
    [[np.cos(theta), -np.sin(theta)],
     [np.sin(theta), np.cos(theta)]])
rnote = A @ note
dm.plotShape(rnote,'r')

17.1 Visualizing Linear Transformations of $\mathbb{R}^2$

Let's examine several linear transformations from $\mathbb{R}^2$ to $\mathbb{R}^2$, both geometrically and computationally.

Rotation

Let $T: \mathbb{R}^2 \to \mathbb{R}^2$ be the transformation that rotates each point in $\mathbb{R}^2$ about the origin through an angle $\theta$, with counterclockwise rotation for a positive angle.

Let's find the standard matrix $A$ of this transformation.

Question. The columns of $I$ are $e_1 = \begin{bmatrix} 1 \\ 0 \end{bmatrix}$ and $e_2 = \begin{bmatrix} 0 \\ 1 \end{bmatrix}$. Where does $T$ map the standard basis vectors?

Referring to the diagram below, we can see that $\begin{bmatrix} 1 \\ 0 \end{bmatrix}$ rotates to $\begin{bmatrix} \cos\theta \\ \sin\theta \end{bmatrix}$, and $\begin{bmatrix} 0 \\ 1 \end{bmatrix}$ rotates to $\begin{bmatrix} -\sin\theta \\ \cos\theta \end{bmatrix}$.

In [18]:
#
import matplotlib.patches as patches
ax = dm.plotSetup(-1.2, 1.2, -0.5, 1.2)
# red circle portion
arc = patches.Arc([0., 0.], 2., 2., 0., 340., 200.,
                 linewidth = 2, color = 'r',
                 linestyle = '-.')
ax.add_patch(arc)
#
# labels
ax.text(1.1, 0.1, r'$\mathbf{e}_1 = (1, 0)$', size = 20)
ax.text(0.1, 1.1, r'$\mathbf{e}_2 = (0, 1)$', size = 20)
#
# angle of rotation and rotated points
theta = np.pi / 6
e1t = [np.cos(theta), np.sin(theta)]
e2t = [-np.sin(theta), np.cos(theta)]
#
# theta labels
ax.text(0.5, 0.08, r'$\theta$', size = 20)
ax.text(-0.25, 0.5, r'$\theta$', size = 20)
#
# arrows from origin
ax.arrow(0, 0, e1t[0], e1t[1],
        length_includes_head = True,
        width = .02)
ax.arrow(0, 0, e2t[0], e2t[1],
        length_includes_head = True,
        width = .02)
#
# new point labels
ax.text(e1t[0]+.05, e1t[1]+.05, r'$[\cos\; \theta, \sin \;\theta]$', size = 20)
ax.text(e2t[0]-1.1, e2t[1]+.05, r'$[-\sin\; \theta, \cos \;\theta]$', size = 20)
#
# curved arrows showing rotation
ax.annotate("",
            xytext=(0.7, 0), xycoords='data',
            xy=(0.7*e1t[0], 0.7*e1t[1]), textcoords='data',
            size=10, va="center", ha="center",
            arrowprops=dict(arrowstyle="simple",
                            connectionstyle="arc3,rad=0.3"),
            )
ax.annotate("",
            xytext=(0, 0.7), xycoords='data',
            xy=(0.7*e2t[0], 0.7*e2t[1]), textcoords='data',
            size=10, va="center", ha="center",
            arrowprops=dict(arrowstyle="simple",
                            connectionstyle="arc3,rad=0.3"),
            )
#
# new points
plt.plot([e1t[0], e2t[0]], [e1t[1], e2t[1]], 'bo', markersize = 10)
plt.plot([0, 1], [1, 0], 'go', markersize = 10);

So by the Theorem above,

$$A = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}.$$

In particular, a 90 degree counterclockwise rotation corresponds to the matrix

$$\begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}.$$
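
We can check the formula numerically (a small sketch in plain numpy): build the rotation matrix for a given $\theta$ and confirm that $\theta = 90$ degrees gives the matrix above.

In [ ]:
import numpy as np

def rotation_matrix(theta):
    # standard matrix of counterclockwise rotation by theta radians
    return np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])

R90 = rotation_matrix(np.pi / 2)
print(np.round(R90))            # [[ 0. -1.]
                                #  [ 1.  0.]]
print(np.round(R90 @ [1, 0]))   # e1 rotates to [0, 1]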

Dilation

First, let's recall the linear transformation

$$T(x) = rx.$$

With $r > 1$, this is a dilation. It moves every vector further from the origin.

Let's say the dilation is by a factor of $r = 2.5$.

To construct the matrix $A$ that implements this transformation, we ask: where do $e_1$ and $e_2$ go?

In [5]:
#
ax = dm.plotSetup()
ax.plot([0],[1],'ro',markersize=8)
ax.text(0.25,1,'(0,1)',size=20)
ax.plot([1],[0],'ro',markersize=8)
ax.text(1.25,0.25,'(1,0)',size=20);

Under the action of $A$, $e_1$ goes to $\begin{bmatrix} 2.5 \\ 0 \end{bmatrix}$ and $e_2$ goes to $\begin{bmatrix} 0 \\ 2.5 \end{bmatrix}$.

So the matrix $A$ must be $\begin{bmatrix} 2.5 & 0 \\ 0 & 2.5 \end{bmatrix}$. Let's see this visually.

In [6]:
#
square = np.array(
    [[0,1,1,0],
     [1,1,0,0]])
A = np.array(
    [[2.5, 0],
     [0, 2.5]])
# display(Latex(rf"$A = {ltx_array_fmt(A, '{:1.1f}')}$"))
dm.plotSetup()
dm.plotSquare(square)
dm.plotSquare(A @ square,'r')
In [7]:
#
dm.plotSetup(-7,7,-7, 7)
dm.plotShape(note)
dm.plotShape(A @ note,'r')

Reflections

OK, now let's reflect through the $x_1$ axis. Where do $e_1$ and $e_2$ go?

In [15]:
#
A = np.array(
    [[1,  0],
     [0, -1]])
print("A = "); print(A)
#display(Latex(rf"$A = {ltx_array_fmt(A, '{:d}')}$"))
dm.plotSetup()
dm.plotSquare(square)
dm.plotSquare(A @ square,'r')
#plt.title(r'Reflection through the $x_1$ axis', size = 20);
A = 
[[ 1  0]
 [ 0 -1]]
In [10]:
#
dm.plotSetup()
dm.plotShape(note)
dm.plotShape(A @ note,'r')

What about reflection through the $x_2$ axis?

In [24]:
#
A = np.array(
    [[-1,0],
     [0, 1]])
print("A = "); print(A)
dm.plotSetup()
dm.plotSquare(square)
dm.plotSquare(A @ square,'r')
A = 
[[-1  0]
 [ 0  1]]
In [19]:
#
dm.plotSetup()
dm.plotShape(note)
dm.plotShape(A @ note,'r')

What about reflection through the line $x_1 = x_2$?

In [27]:
#
A = np.array(
    [[0,1],
     [1,0]])
print("A = "); print(A)
dm.plotSetup()
dm.plotSquare(square)
dm.plotSquare(A @ square,'r')
plt.plot([-2,2],[-2,2],'b--');
A = 
[[0 1]
 [1 0]]
In [26]:
#
dm.plotSetup()
dm.plotShape(note)
dm.plotShape(A @ note,'r')

What about reflection through the line $x_1 = -x_2$?

In [28]:
#
A = np.array(
    [[ 0,-1],
     [-1, 0]])
print("A = "); print(A)
dm.plotSetup()
dm.plotSquare(square)
dm.plotSquare(A @ square,'r')
plt.plot([-2,2],[2,-2],'b--');
A = 
[[ 0 -1]
 [-1  0]]
In [29]:
#
dm.plotSetup()
dm.plotShape(note)
dm.plotShape(A @ note,'r')

Other transformations

What about rotation by 180 degrees about the origin? Note that this is different from reflection through the line $x_1 = -x_2$ (how?).

We could use our rotation formula to compute this matrix, but let's instead just think about how this transformation affects $e_1$ and $e_2$.
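
One way to see the difference (a quick numpy check, not from the lecture code): the 180 degree rotation is the composition of the two axis reflections, while reflection through $x_1 = -x_2$ is a different matrix. The reflection fixes every point on that line; the rotation fixes only the origin.

In [ ]:
import numpy as np

Rx1 = np.array([[1,  0], [0, -1]])   # reflection through the x1 axis
Rx2 = np.array([[-1, 0], [0,  1]])   # reflection through the x2 axis
R180 = Rx2 @ Rx1                     # composing the two reflections...
print(R180)                          # [[-1  0]
                                     #  [ 0 -1]]  ...gives the 180 degree rotation

F = np.array([[0, -1], [-1, 0]])     # reflection through the line x1 = -x2
v = np.array([1, -1])                # a point on that line
print(F @ v, R180 @ v)               # [ 1 -1] [-1  1]: F fixes v, R180 does not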

In [30]:
#
A = np.array(
    [[-1, 0],
     [ 0,-1]])
print("A = "); print(A)
ax = dm.plotSetup()
dm.plotSquare(square)
dm.plotSquare(A @ square,'r');
A = 
[[-1  0]
 [ 0 -1]]
In [31]:
#
dm.plotSetup()
dm.plotShape(note)
dm.plotShape(A @ note,'r')

We can expand or shrink one dimension independently of the other. Suppose we want to contract the $x_1$ dimension by a ratio of $r = 0.45$, but leave the $x_2$ dimension unchanged. What matrix performs this operation?

In [32]:
#
A = np.array(
    [[0.45, 0],
     [0,    1]])
print("A = "); print(A)
ax = dm.plotSetup()
dm.plotSquare(square)
dm.plotSquare(A @ square,'r')
ax.arrow(1.0,1.5,-1.0,0,head_width=0.15, head_length=0.1, length_includes_head=True);
A = 
[[0.45 0.  ]
 [0.   1.  ]]
In [33]:
#
dm.plotSetup()
dm.plotShape(note)
dm.plotShape(A @ note,'r')

We can similarly construct a horizontal expansion by a ratio of $r = 2.5$.

In [40]:
#
A = np.array(
    [[2.5,0],
     [0,  1]])
print("A = "); print(A)
dm.plotSetup()
dm.plotSquare(square)
dm.plotSquare(A @ square,'r')
A = 
[[2.5 0. ]
 [0.  1. ]]
In [38]:
#
dm.plotSetup()
dm.plotShape(note)
dm.plotShape(A @ note,'r')

Matrices of the form $\begin{bmatrix} 1 & a \\ 0 & 1 \end{bmatrix}$ or $\begin{bmatrix} 1 & 0 \\ a & 1 \end{bmatrix}$ are called shears.

We have already seen a horizontal shear. Here is a vertical shear.

In [41]:
#
A = np.array(
    [[   1, 0],
     [-1.5, 1]])
print("A = "); print(A)
dm.plotSetup()
dm.plotSquare(square)
dm.plotSquare(A @ square,'r')
A = 
[[ 1.   0. ]
 [-1.5  1. ]]
In [42]:
#
dm.plotSetup()
dm.plotShape(note)
dm.plotShape(A @ note,'r')

Now let's look at a particular kind of transformation called a projection.

Imagine we took any given point and 'dropped' it onto the $x_1$-axis.

In [55]:
#
A = np.array(
    [[1,0],
     [0,0]])
print("A = "); print(A)
ax = dm.plotSetup()
dm.plotSquare(square)
dm.plotSquare(A @ square,'r')
ax.arrow(1.0,1.0,0,-0.9,head_width=0.15, head_length=0.1, length_includes_head=True);
ax.arrow(0.0,1.0,0,-0.9,head_width=0.15, head_length=0.1, length_includes_head=True);
A = 
[[1 0]
 [0 0]]
In [51]:
#
dm.plotSetup()
dm.plotShape(note)
dm.plotShape(A @ note,'r')

Question. What happens to the shape of the point set? What about its area?

Here is a projection onto the $x_2$ axis.

In [56]:
#
A = np.array(
    [[0,0],
     [0,1]])
print("A = "); print(A)
ax = dm.plotSetup()
dm.plotSquare(square)
dm.plotSquare(A @ square,'r')
ax.arrow(1.0,1.0,-0.9,0,head_width=0.15, head_length=0.1, length_includes_head=True);
ax.arrow(1.0,0.0,-0.9,0,head_width=0.15, head_length=0.1, length_includes_head=True);
A = 
[[0 0]
 [0 1]]
In [57]:
#
dm.plotSetup()
dm.plotShape(note)
dm.plotShape(A @ note,'r')

17.2 Existence and Uniqueness

Notice that some of these transformations map multiple inputs to the same output, and some are incapable of generating certain outputs.

For example, the projections above can send multiple different points to the same point.

We need some terminology to understand these properties of linear transformations.

Onto

Definition. A mapping $T: \mathbb{R}^n \to \mathbb{R}^m$ is said to be onto $\mathbb{R}^m$ if each $b$ in $\mathbb{R}^m$ is the image of at least one $x$ in $\mathbb{R}^n$.

In other words, $T$ is onto if every element of its codomain is in its range.

Another (important) way of thinking about this is that $T$ is onto if there is a solution $x$ of

$$T(x) = b$$

for every possible $b$.

This is asking an existence question about solutions of the equation $T(x) = b$, for all $b$.

Question. For the transformation TT pictured below, is TT onto?

In [2]:
#
display(Image("images/05-image.jpg", width=1000))

Here, we see that $T$ maps points in $\mathbb{R}^2$ to a plane lying within $\mathbb{R}^3$. That is, the range of $T$ is a strict subset of the codomain of $T$.

So $T$ is not onto $\mathbb{R}^3$.

Examples

Consider the following two transformations (a reflection and a projection). In each case, the red points are the images of the blue points. Is each transformation onto $\mathbb{R}^2$?

In [76]:
#
A = np.array(
    [[ 0,-1],
     [-1, 0]])
print("A = "); print(A)
dm.plotSetup()
dm.plotShape(note)
dm.plotShape(A @ note,'r')
A = 
[[ 0 -1]
 [-1  0]]
In [77]:
#
A = np.array(
    [[1,0],
     [0,0]])
print("A = "); print(A)
dm.plotSetup()
dm.plotShape(note)
dm.plotShape(A @ note,'r')
A = 
[[1 0]
 [0 0]]
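
We can also answer this computationally. A standard fact (stated here without proof, but consistent with the definitions above): $T(x) = Ax$ is onto $\mathbb{R}^m$ exactly when the columns of $A$ span $\mathbb{R}^m$, that is, when $A$ has rank $m$. A quick numpy sketch:

In [ ]:
import numpy as np

reflection = np.array([[0, -1], [-1, 0]])
projection = np.array([[1, 0], [0, 0]])

print(np.linalg.matrix_rank(reflection))   # 2: the columns span R^2, so onto
print(np.linalg.matrix_rank(projection))   # 1: the image is only a line, so not onto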

One-to-one

Definition. A mapping $T: \mathbb{R}^n \to \mathbb{R}^m$ is said to be one-to-one if each $b$ in $\mathbb{R}^m$ is the image of at most one $x$ in $\mathbb{R}^n$.

If $T$ is one-to-one, then for each $b$, the equation $T(x) = b$ has either a unique solution or none at all. This is asking a uniqueness question about solutions of the equation $T(x) = b$, for all $b$.
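
By the same token (again a standard fact, stated without proof): $T(x) = Ax$ is one-to-one exactly when the columns of $A$ are linearly independent, that is, when $A$ has rank $n$. A sketch for the same two matrices:

In [ ]:
import numpy as np

reflection = np.array([[0, -1], [-1, 0]])
projection = np.array([[1, 0], [0, 0]])

n = 2   # dimension of the domain
print(np.linalg.matrix_rank(reflection) == n)   # True: one-to-one
print(np.linalg.matrix_rank(projection) == n)   # False: (0,1) and (0,2) both map to (0,0)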

In [3]:
#
display(Image("images/06-onto.jpeg", width=1000))

Relationship to consistency

Let's examine the relationship between these ideas and some previous definitions.

Question. If the equation $Ax = b$ is consistent (i.e., has at least one solution) for all $b$, is $T(x) = Ax$ onto? Is it one-to-one?

  • $T(x)$ is onto.
  • $T(x)$ may or may not be one-to-one. If the system has multiple solutions for some $b$, then $T(x)$ is not one-to-one.

Question. If $Ax = b$ is consistent and has a unique solution for all $b$, is $T(x) = Ax$ onto? Is it one-to-one?

  • Yes to both.

Question. If $Ax = b$ is not consistent for all $b$, is $T(x) = Ax$ onto? One-to-one?

  • $T(x)$ is not onto.
  • $T(x)$ may or may not be one-to-one.

Questions. (Think about these on your own.)

  • If $T(x) = Ax$ is onto, is $Ax = b$ consistent for all $b$? Is the solution unique for all $b$?

  • If $T(x) = Ax$ is one-to-one, is $Ax = b$ consistent for all $b$? Is the solution unique for all $b$?

17.3 Area is Scaled by the Determinant

Notice that in some of the transformations above, the "size" of a shape grows or shrinks.

Let's look at how the area (or volume) of a shape is affected by a linear transformation.

Thanks to linearity, it suffices to ask what happens to the unit square (or cube, or hypercube). All other areas or volumes will scale similarly.

In [61]:
#
A = np.array([[2, 1],[0.75, 1.5]])
print("A = "); print(A)
ax = dm.plotSetup(-1, 4, -1, 4)
dm.plotSquare(square, 'r')
dm.plotSquare(A @ square);
A = 
[[2.   1.  ]
 [0.75 1.5 ]]

Let's denote the matrix of a general 2-dimensional linear transformation as

$$A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}.$$

Then, here is what happens to the unit square:

In [62]:
#
v1 = np.array([2, 0.75])
v2 = np.array([1, 1.5])
A = np.column_stack([v1, v2])
ax = dm.plotSetup(-1, 4, -1, 4)
dm.plotSquare(square, 'r')
dm.plotSquare(A @ square)
plt.text(1.1, 0.1, '(1, 0)', size = 12)
plt.text(1.05, 1.05, '(1, 1)', size = 12)
plt.text(-0.6, 1.2, '(0, 1)', size = 12)
plt.text(v1[0]+0.2, v1[1]+0.1, '(a, c)', size = 12)
plt.text((v1+v2)[0]+0.1, (v1+v2)[1]+0.1, '(a+b, c+d)', size = 12)
plt.text(v2[0]-0.5, v2[1]+0.1, '(b, d)', size = 12);

Now, let's determine the area of the blue diamond in terms of $a$, $b$, $c$, and $d$.

To do that, we'll use this diagram:

In [63]:
#
a = 2
b = 1
c = 0.75
d = 1.5
v1 = np.array([a, c])
v2 = np.array([b, d])
A = np.column_stack([v1, v2])
ax = dm.plotSetup(-1, 4, -1, 4)
# red triangles
dm.plotShape(np.array([[0, 0], [0, d], [b, d]]).T, 'r')
plt.text(0.2, 1, r'$\frac{1}{2}bd$', size = 12)
dm.plotShape(np.array([[a, c], [a+b, c+d], [a+b, c]]).T, 'r')
plt.text(2.5, 1, r'$\frac{1}{2}bd$', size = 12)
# gray triangles
dm.plotShape(np.array([[b, d], [b, c+d], [a+b, c+d]]).T, 'k')
plt.text(1.2, 1.9, r'$\frac{1}{2}ac$', size = 12)
dm.plotShape(np.array([[0, 0], [a, c], [a, 0]]).T, 'k')
plt.text(1.2, 0.15, r'$\frac{1}{2}ac$', size = 12)
# green squares
dm.plotShape(np.array([[a, 0], [a, c], [a+b, c], [a+b, 0]]).T, 'g')
plt.text(0.2, 1.9, r'$bc$', size = 12)
dm.plotShape(np.array([[0, d], [0, c+d], [b, c+d], [b, d]]).T, 'g')
plt.text(2.5, 0.15, r'$bc$', size = 12)
#
dm.plotSquare(A @ square)
plt.text(v1[0]-0.5, v1[1]+0.05, '(a, c)', size = 12)
plt.text((v1+v2)[0]+0.1, (v1+v2)[1]+0.1, '(a+b, c+d)', size = 12)
plt.text(v2[0]+0.1, v2[1]-0.15, '(b, d)', size = 12);

Each of the triangles and rectangles has an area we can determine in terms of $a$, $b$, $c$, and $d$.

  • The large rectangle has sides $(a+b)$ and $(c+d)$, so its area is $(a+b)(c+d)$.
  • From this we subtract $bd$ (the two red triangles), $ac$ (the two gray triangles), and $2bc$ (the two green rectangles).

So the area of the blue diamond is:

$$(a+b)(c+d) - bd - ac - 2bc = ad - bc.$$
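
As a sanity check, here is a small numpy sketch (the shoelace formula for polygon area is standard, but not part of the lecture): transform the unit square by a concrete $A$ and compare the resulting area to $ad - bc$.

In [ ]:
import numpy as np

def polygon_area(P):
    # shoelace formula; P is 2 x k, with vertices in order around the polygon
    x, y = P
    return 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))

unit_square = np.array([[0, 1, 1, 0],
                        [0, 0, 1, 1]])   # vertices in counterclockwise order
A = np.array([[2, 1],
              [0.75, 1.5]])              # the example matrix from above

print(polygon_area(A @ unit_square))     # 2.25
print(A[0,0]*A[1,1] - A[0,1]*A[1,0])     # ad - bc = 2(1.5) - 1(0.75) = 2.25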

Determinant

So we conclude that when we apply the linear transformation

$$A = \begin{bmatrix} a & b \\ c & d \end{bmatrix},$$

the area of the unit square (or of any shape) is scaled by a factor of $ad - bc$.

This quantity is a fundamental property of the matrix $A$.

So, we give it a name: it is the determinant of $A$.

We denote it as $\det(A)$.

So, for a $2 \times 2$ matrix $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$,

$$\det(A) = ad - bc.$$

The determinant can be defined for any $n \times n$ square matrix. For a square matrix $A$ larger than $2 \times 2$, the determinant tells us how the volume of a unit (hyper)cube is scaled when it is linearly transformed by $A$.

We will see how to compute these determinants in a later lecture.

For non-square matrices, the determinant is not defined, because we aren't scaling the area of a shape; we are moving it to a different space entirely.

In [4]:
#
display(Image("images/05-3dtransform.jpg", width=1000))

Zero determinant

A special case to consider is when the determinant of a matrix is zero.

Question. When does it happen that $\det(A) = 0$?

Consider when $A$ is the matrix of a projection, as shown below.

The unit square has been collapsed onto the $x_1$-axis, resulting in a shape with area of zero.

This is confirmed by the determinant, which is

$$\det\left(\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}\right) = (1 \cdot 0) - (0 \cdot 0) = 0.$$
In [66]:
#
A = np.array(
    [[1,0],
     [0,0]])
print("A = "); print(A)
ax = dm.plotSetup()
dm.plotSquare(square)
dm.plotSquare(A @ square,'r')
ax.arrow(1.0,1.0,0,-0.9,head_width=0.15, head_length=0.1, length_includes_head=True);
ax.arrow(0.0,1.0,0,-0.9,head_width=0.15, head_length=0.1, length_includes_head=True);
A = 
[[1 0]
 [0 0]]

17.4 Inverse of a matrix

Next, we investigate the idea of the "inverse" of a matrix. It is analogous to inverses of real numbers: each nonzero number $r$ has a multiplicative inverse $r^{-1}$ such that $r \cdot r^{-1} = 1$.

This inverse does not exist for all matrices.

  • It only exists for square matrices (why?)
  • And not even for all square matrices.
In [5]:
#
display(Image("images/06-inverse.png", width=1000))

Definition. A matrix $A$ is called invertible if there exists a matrix $C$ such that

$$AC = I \quad \text{and} \quad CA = I.$$

In that case, $C$ is called the inverse of $A$.

Clearly, $C$ must also be square and the same size as $A$.

The inverse of $A$ is denoted $A^{-1}$.

A matrix that is not invertible is called a singular matrix. A matrix that is invertible is called nonsingular.

Example. As in the previous example, let $A = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}$ and $B = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}$. Show that $B$ is the inverse of $A$.

We know that $AB$ corresponds to performing a 270 degree rotation $T_{270}$ followed by a 90 degree rotation $T_{90}$. The result is a 360 degree rotation -- that is, every vector ends up back where it started, which is the identity transformation. Hence, $AB = I$.

Similarly, $BA$ corresponds to performing a 90 degree rotation followed by a 270 degree rotation, so once again the result $BA = I$ is the identity matrix.
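
A one-line check with numpy (a quick sketch, not part of the lecture code):

In [ ]:
import numpy as np

A = np.array([[0, -1], [1,  0]])   # rotation by 90 degrees
B = np.array([[0,  1], [-1, 0]])   # rotation by 270 degrees
print(A @ B)                       # [[1 0]
print(B @ A)                       #  [0 1]] both times: B is the inverse of A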

Here are some nice properties of matrix inversion.

Theorem.

  • If $A$ is an invertible matrix, then $A^{-1}$ is invertible, and
    $$(A^{-1})^{-1} = A.$$
  • If $A$ is an invertible matrix, then so is $A^T$, and the inverse of $A^T$ is the transpose of $A^{-1}$:
    $$(A^T)^{-1} = (A^{-1})^T.$$
  • If $A$ and $B$ are $n \times n$ invertible matrices, then so is $AB$, and the inverse of $AB$ is the product of the inverses of $A$ and $B$ in the reverse order:
    $$(AB)^{-1} = B^{-1}A^{-1}.$$

(I will let you verify these properties on your own.)
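
For instance, here is a numerical spot check of the third property (a sketch using random matrices, which are invertible with probability 1; not part of the lecture code):

In [ ]:
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
B = rng.standard_normal((3, 3))

lhs = np.linalg.inv(A @ B)
rhs = np.linalg.inv(B) @ np.linalg.inv(A)   # note the reversed order
print(np.allclose(lhs, rhs))                # True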

Example.

If $A = \begin{bmatrix} 2 & 5 \\ -3 & -7 \end{bmatrix}$ and $C = \begin{bmatrix} -7 & -5 \\ 3 & 2 \end{bmatrix}$, then:

$$AC = \begin{bmatrix} 2 & 5 \\ -3 & -7 \end{bmatrix}\begin{bmatrix} -7 & -5 \\ 3 & 2 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix},$$

and:

$$CA = \begin{bmatrix} -7 & -5 \\ 3 & 2 \end{bmatrix}\begin{bmatrix} 2 & 5 \\ -3 & -7 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix},$$

so we conclude that $C = A^{-1}$.

17.5 Finding the inverse of a matrix

How do we find $A^{-1}$? Let's first look at the special case of $2 \times 2$ matrices and then consider the general case of any square matrix.

The $2 \times 2$ case

Theorem. Let $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$.

If $\det(A) \neq 0$, then

  • $A$ is invertible, and
  • $A^{-1} = \frac{1}{\det(A)} \begin{bmatrix} d & -b \\ -c & a \end{bmatrix}$.

If $\det(A) = 0$, then

  • $A$ is not invertible.
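
This formula is easy to turn into code (a minimal sketch; the helper `inverse_2x2` is our own name, not a library routine):

In [ ]:
import numpy as np

def inverse_2x2(A):
    # 2x2 formula: swap the diagonal, negate the off-diagonal, divide by det
    (a, b), (c, d) = A
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is not invertible")
    return np.array([[d, -b], [-c, a]]) / det

A = np.array([[2.0, 5.0], [-3.0, -7.0]])
print(inverse_2x2(A))                                  # [[-7. -5.]
                                                       #  [ 3.  2.]]
print(np.allclose(inverse_2x2(A), np.linalg.inv(A)))   # True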

Notice that this theorem tells us, for $2 \times 2$ matrices, exactly which ones are invertible... namely, those that have nonzero determinant.

Example. Given a $2 \times 2$ matrix $A$, if the columns of $A$ are linearly dependent, is $A$ invertible?

Solution. If the columns of $A$ are linearly dependent, then at least one of the columns is a multiple of the other.

Let the multiplier be $m$.

Then we can express $A$ as $\begin{bmatrix} a & ma \\ b & mb \end{bmatrix}$.

The determinant of $A$ is $a(mb) - b(ma) = 0$.

So a $2 \times 2$ matrix with linearly dependent columns is not invertible.

Matrices larger than $2 \times 2$

Now let's look at a general method for computing the inverse of $A$. Recall our definition of matrix multiplication: $AB$ is the matrix formed by multiplying $A$ times each column of $B$:

$$AB = [Ab_1 \; \dots \; Ab_n].$$

Let's look at the equation

$$AA^{-1} = I.$$

Let's call the columns of $A^{-1}$: $x_1, x_2, \dots, x_n$.

We know what the columns of $I$ are: $e_1, e_2, \dots, e_n$.

So:

$$AA^{-1} = A[x_1 \; x_2 \; \dots \; x_n] = [e_1 \; e_2 \; \dots \; e_n].$$

Notice that we can break this up into $n$ separate problems:

$$Ax_1 = e_1, \quad Ax_2 = e_2, \quad \dots, \quad Ax_n = e_n.$$

(This is a common trick ... make sure you understand why it works!)

So here is a general way to compute the inverse of $A$:

  • Solve the linear system $Ax_1 = e_1$ to get the first column of $A^{-1}$.
  • Solve the linear system $Ax_2 = e_2$ to get the second column of $A^{-1}$.
  • ...
  • Solve the linear system $Ax_n = e_n$ to get the last column of $A^{-1}$.

If any of these systems is inconsistent or has an infinite solution set, then $A^{-1}$ does not exist.

This general strategy leads to an algorithm for inverting any matrix: run Gaussian elimination on the augmented matrix $[A \mid I]$. If $A$ is invertible, the result in reduced row echelon form will be $[I \mid A^{-1}]$. The column-by-column version is sketched below.
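
Here is that strategy in code (a sketch; we let np.linalg.solve handle each system rather than doing the row reduction by hand):

In [ ]:
import numpy as np

def inverse_by_columns(A):
    # solve A x_i = e_i for each column e_i of the identity matrix,
    # then assemble the solutions as the columns of A^{-1}
    n = A.shape[0]
    I = np.eye(n)
    cols = [np.linalg.solve(A, I[:, i]) for i in range(n)]
    return np.column_stack(cols)

A = np.array([[ 2.0,  5.0],
              [-3.0, -7.0]])
print(inverse_by_columns(A))                                   # [[-7. -5.]
                                                               #  [ 3.  2.]]
print(np.allclose(inverse_by_columns(A), np.linalg.inv(A)))    # True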

Computing inverse matrices

But solving all of these linear systems by hand is complicated.

Any time you need to invert a matrix larger than $2 \times 2$, you may need to use a calculator or computer.

To invert a matrix in Python/numpy, use the function np.linalg.inv(). For example:

In [87]:
import numpy as np
A = np.array(
    [[ 2.0, 5.0],
     [-3.0,-7.0]])
print('A =\n',A)
B = np.linalg.inv(A)
print('B = \n',B)
A =
 [[ 2.  5.]
 [-3. -7.]]
B = 
 [[-7. -5.]
 [ 3.  2.]]

What do you think happens if you call np.linalg.inv() on a matrix that is not invertible?

In [10]:
A = np.array([[2.,4.],[2.,4.]])
np.linalg.inv(A)   # this matrix is singular, so this raises an error
---------------------------------------------------------------------------
LinAlgError                               Traceback (most recent call last)
<ipython-input-10-55215ad851ba> in <module>
      1 A = np.array([[2.,4.],[2.,4.]])
----> 2 np.linalg.inv(A)   # this matrix is singular, so this raises an error

~/.local/lib/python3.8/site-packages/numpy/core/overrides.py in inv(*args, **kwargs)

~/.local/lib/python3.8/site-packages/numpy/linalg/linalg.py in inv(a)
    543     signature = 'D->D' if isComplexType(t) else 'd->d'
    544     extobj = get_linalg_error_extobj(_raise_linalgerror_singular)
--> 545     ainv = _umath_linalg.inv(a, signature=signature, extobj=extobj)
    546     return wrap(ainv.astype(result_t, copy=False))
    547 

~/.local/lib/python3.8/site-packages/numpy/linalg/linalg.py in _raise_linalgerror_singular(err, flag)
     86 
     87 def _raise_linalgerror_singular(err, flag):
---> 88     raise LinAlgError("Singular matrix")
     89 
     90 def _raise_linalgerror_nonposdef(err, flag):

LinAlgError: Singular matrix

The right way to handle this is:

In [3]:
A = np.array([[2.,4.],[2.,4.]])
try:
    np.linalg.inv(A)
except np.linalg.LinAlgError:
    print('Oops, looks like A is singular!')
Oops, looks like A is singular!