In [1]:
#
import numpy as np
import scipy as sp
import pandas as pd
import matplotlib as mp
import matplotlib.pyplot as plt
import seaborn as sns
import sklearn
import laUtilities as ut
import slideUtilities as sl
import demoUtilities as dm
from matplotlib import animation
from importlib import reload
from datetime import datetime
from IPython.display import Image, display_html, display, Math, HTML;
qr_setting = None

mp.rcParams['animation.html'] = 'jshtml';

Announcements¶

  • Homework 4 is out, due 3/3

  • Upcoming office hours

    • Tomorrow: Peer tutor Rohan Anand from 1:30-3pm in CCDS 16th floor
    • Tomorrow: Abhishek Tiwari from 3:30-4:30pm on CCDS 13th floor
  • Aggarwal Section 2.4-2.5

  • Watch 3Blue1Brown video 5 and video 6

Lecture 16: Linear Transformations¶

In [3]:
#
display(Image("images/01-modes.jpg", width=600))

[This lecture is based on Prof. Crovella's CS 132 lecture notes.]

16.1 Matrices as functions¶

So far we've been treating the matrix equation

Ax=bAx=b

as simply another way of writing the vector equation

x1a1+⋯+xnan=b.x1a1+⋯+xnan=b.

However, we'll now think of the matrix equation in a new way:

We will think of AA as a function that is "acting on" the vector xx to create a new vector bb.

In [18]:
#
display(Image("images/05-2dtransform.jpg", width=1000))

For example, consider A=[21131−1].A=[21131−1]. Then we find:

A⎡⎢⎣1−4−3⎤⎥⎦=[−52]A[1−4−3]=[−52]

In other words, if x=⎡⎢⎣1−4−3⎤⎥⎦x=[1−4−3] and b=[−52]b=[−52], then AA transforms xx into bb.

In [19]:
#
display(Image("images/05-3dtransform.jpg", width=1000))

Notice what AA has done: it took a vector in R3R3 and transformed it into a vector in R2R2.

Question. How does this fact relate to the shape of AA?

AA is 2×32×3 -- that is, A∈R2×3A∈R2×3.

This gives a new way of thinking about solving Ax=bAx=b.

When we solve Ax=bAx=b, we are finding the vector(s) xx in R3R3 that are transformed into bb in R2R2 under the "action" of AA.

Transformations of different sizes¶

For a different AA, the mapping might be from R1R1 to R3R3:

In [20]:
#
display(Image("images/05-1dtransform.jpg", width=1000))

Question. What would the shape of AA be in the above case?

Since AA maps from R1R1 to R3R3, A∈R3×1A∈R3×1.

That is, AA has 3 rows and 1 column.

In another case, AA could be a square 2×22×2 matrix.

Then, it would map from R2R2 to R2R2.

In [21]:
#
display(Image("images/05-2dtransform.jpg", width=1000))

We have moved out of the familiar world of functions of one variable: we are now thinking about functions that transform a vector into a vector.

Or, put another way, functions that transform multiple variables into multiple variables.

16.2 Transformations¶

Some terminology:

A transformation (or function or mapping) TT from RnRn to RmRm is a rule that assigns to each vector xx in RnRn a vector T(x)T(x) in RmRm.

The set RnRn is called the domain of TT, and RmRm is called the codomain of TT.

The notation:

T:Rn→RmT:Rn→Rm

indicates that the domain of TT is RnRn and the codomain is RmRm.

For xx in Rn,Rn, the vector T(x)T(x) is called the image of xx (under TT). We may sometimes write that TT maps x↦T(x)x↦T(x).

The set of all images T(x)T(x) is called the range of TT.

In [23]:
#
display(Image("images/05-image.jpg", width=1000))

Here, the green plane is the set of all points that are possible outputs of TT for some input xx.

So in this example:

  • The domain of TT is R2R2
  • The codomain of TT is R3R3
  • The range of TT is the green plane.

Visualizing a transformation¶

Let's do an example. Say that I have these points in R2R2:

[01],[11],[10],[00][01],[11],[10],[00]

These points form a square in 2-dimensional space, where each side is of length 1.

In [3]:
#
square = np.array([[0, 1, 1, 0],
                   [1, 1, 0, 0]])
dm.plotSetup()
# print(square)
dm.plotSquare(square)

Now let's transform each of these points according to the following rule. Consider the following matrix, which is called a shear matrix.

A=[11.501].A=[11.501].

We define T(x)=AxT(x)=Ax. Then we have

T:R2→R2.T:R2→R2.

What is the image of each of these points under TT?

A[01]=[1.51]A[01]=[1.51]A[11]=[2.51]A[11]=[2.51]A[10]=[10]A[10]=[10]A[00]=[00]A[00]=[00]
In [4]:
#
ax = dm.plotSetup()
# print("square = "); print(square)
dm.plotSquare(square)
#
# create the A matrix
A = np.array([[1.0, 1.5],[0.0,1.0]])
# print("A matrix = "); print(A)
#
# apply the shear matrix to the square
ssquare = np.zeros(np.shape(square))
for i in range(4):
    ssquare[:,i] = dm.AxVS(A,square[:,i])
# print("sheared square = "); print(ssquare)
dm.plotSquare(ssquare,'r')

This sort of transformation, where points are successively slid sideways, is called a shear transformation.

16.3 Linear Transformations¶

By the algebraic properties of matrix-vector multiplication, we know that the transformation x↦Axx↦Ax has the properties that

A(u+v)=Au+AvandA(cu)=cAuA(u+v)=Au+AvandA(cu)=cAu

for all u,vu,v in RnRn and all scalars cc.

We are now ready to define one of the most fundamental concepts in the course: the concept of a linear transformation. (This is why the subject is called linear algebra!)

Definition. A transformation TT is linear if:

  1. T(u+v)=T(u)+T(v)T(u+v)=T(u)+T(v) for all u,vu,v in the domain of TT; and
  2. T(cu)=cT(u)T(cu)=cT(u) for all scalars cc and all uu in the domain of TT.

To fully grasp the significance of what a linear transformation is, don't think of just matrix-vector multiplication. Think of TT as a function in more general terms.

The definition above captures a lot of transformations that are not matrix-vector multiplication. For example, think of:

T(f)=∫10f(t)dtT(f)=∫01f(t)dt

TT is a linear transformation!

Properties of Linear Transformations¶

A key aspect of a linear transformation is that it preserves the operations of vector addition and scalar multiplication.

For example: for vectors uu and vv, one can either:

  1. Transform them both according to T()T(), then add them, or:
  2. Add them, and then transform the result according to T()T().

One gets the same result either way. The transformation and the addition "commute" with each other.

This leads to two important facts.

If TT is a linear transformation, then

T(0)=T(0)+T(0),soT(0)=0T(0)=T(0)+T(0),soT(0)=0

and

T(cu+dv)=cT(u)+dT(v).T(cu+dv)=cT(u)+dT(v).

In fact, if a transformation satisfies the second equation for all u,vu,v and c,d,c,d, then it must be a linear transformation.

Both of the rules defining a linear transformation derive from this single equation.

Example.

Given a scalar rr, define T:R2→R2T:R2→R2 by T(x)=rxT(x)=rx.

(TT is called a contraction when 0≤r≤10≤r≤1 and a dilation when r>1r>1.)

Let r=3r=3, and show that TT is a linear transformation.

Solution.

Let u,vu,v be in R2R2 and let c,dc,d be scalars. Then

T(cu+dv)=3(cu+dv)T(cu+dv)=3(cu+dv)
=3cu+3dv=3cu+3dv
=c(3u)+d(3v)=c(3u)+d(3v)
=cT(u)+dT(v)=cT(u)+dT(v)

Thus TT is a linear transformation because it satisfies the rule T(cu+dv)=cT(u)+dT(v)T(cu+dv)=cT(u)+dT(v).

Example.

Let T(x)=x+bT(x)=x+b for some b≠0b≠0.

What sort of operation does TT implement?

Answer: translation.

Is TT a linear transformation?

Solution.

We only need to compare

T(u+v)T(u+v)

to

T(u)+T(v).T(u)+T(v).

So:

T(u+v)=u+v+bT(u+v)=u+v+b

and

T(u)+T(v)=(u+b)+(v+b)=u+v+2bT(u)+T(v)=(u+b)+(v+b)=u+v+2b

If b≠0b≠0, then the above two expressions are not equal.

So TT is not a linear transformation.

A Non-Geometric Example: Manufacturing¶

A company manufactures two products, B and C. To do so, it requires materials, labor, and overhead.

For one dollar's worth of product B, it spends 45 cents on materials, 25 cents on labor, and 15 cents on overhead.

For one dollar's worth of product C, it spends 40 cents on materials, 30 cents on labor, and 15 cents on overhead.

Let us construct a "unit cost" matrix:

U=BC⎡⎢⎣.45.40.25.30.15.15⎤⎥⎦MaterialsLaborOverheadU=BC[.45.40.25.30.15.15]MaterialsLaborOverhead

Let x=[x1x2]x=[x1x2] be a production vector, corresponding to x1x1 dollars of product B and x2x2 dollars of product C.

Then define T:R2→R3T:R2→R3 by

T(x)=Ux=x1⎡⎢⎣.45.25.15⎤⎥⎦+x2⎡⎢⎣.40.30.15⎤⎥⎦=⎡⎢⎣Total cost of materialsTotal cost of laborTotal cost of overhead⎤⎥⎦T(x)=Ux=x1[.45.25.15]+x2[.40.30.15]=[Total cost of materialsTotal cost of laborTotal cost of overhead]

The mapping TT transforms a list of production quantities into a list of total costs.

The linearity of this mapping is reflected in two ways:

  1. If production is increased by a factor of, say, 4, ie, from xx to 4x4x, then the costs increase by the same factor, from T(x)T(x) to 4T(x)4T(x).
  2. If xx and yy are production vectors, then the total cost vector associated with combined production of x+yx+y is precisely the sum of the cost vectors T(x)T(x) and T(y)T(y).

16.4 The Matrix of a Linear Transformation¶

We have seen that every matrix multiplication is a linear transformation from vectors to vectors.

But, are there any other possible linear transformations from vectors to vectors?

No! In other words, the reverse statement is also true:

Every linear transformation from vectors to vectors is a matrix multiplication.

We'll now prove this fact. We'll do it constructively, meaning we'll actually show how to find the matrix corresponding to any given linear transformation TT.

Theorem. Let T:Rn→RmT:Rn→Rm be a linear transformation. Then there is (always) a unique matrix AA such that:

T(x)=Axfor allx∈Rn.T(x)=Axfor allx∈Rn.

Let ejej denote the jjth column of the identity matrix II in RnRn. These vectors are called the standard basis vectors.

In [15]:
#
ax = dm.plotSetup(-3,3,-3,3,size=(6,6))
ax.plot([0],[1],'ro',markersize=8)
ax.arrow(0, 0, 1, 0, head_width=0.2, head_length=0.2, length_includes_head = True)
ax.arrow(0, 0, 0, 1, head_width=0.2, head_length=0.2, length_includes_head = True)
ax.text(0.25,1,'(0,1)',size=20)
ax.plot([1],[0],'ro',markersize=8)
ax.text(1.25,0.25,'(1,0)',size=20);

Then, AA is the m×nm×n matrix whose jjth column is the vector T(ej)T(ej), where

A=[T(e1)…T(en)].A=[T(e1)…T(en)].

AA is called the standard matrix of TT.

Proof. Write

x=Ix=[e1…en]xx=Ix=[e1…en]x
=x1e1+⋯+xnen.=x1e1+⋯+xnen.

Because TT is linear, we have:

T(x)=T(x1e1+⋯+xnen)T(x)=T(x1e1+⋯+xnen)
=x1T(e1)+⋯+xnT(en)=x1T(e1)+⋯+xnT(en)
=[T(e1)…T(en)]⎡⎢ ⎢⎣x1⋮xn⎤⎥ ⎥⎦=Ax.=[T(e1)…T(en)][x1⋮xn]=Ax.

So ... we see that the ideas of matrix multiplication and linear transformation of vector spaces are essentially equivalent.

In this course, we will use the term linear transformation when we want to focus on a property of the mapping, while the term matrix multiplication focuses on how such a mapping is implemented.

Recap¶

To find the standard matrix of a linear tranformation, ask what the transformation does to the columns of II.

Now, in R2R2, I=[1001]I=[1001]. So:

e1=[10]ande2=[01].e1=[10]ande2=[01].

To find the matrix of any given linear transformation of vectors in R2R2, we only have to know what that transformation does to these two points.

This is a hugely powerful tool. If we start from some given linear transformation, we can use this idea to find the matrix that implements that linear transformation.

16.5 Rotations¶

For example, let's consider rotation about the origin as a kind of transformation.

In [16]:
#
u = np.array([1.5, 0.75])
v = np.array([0.25, 1])
diamond = np.array([[0,0], u,  u+v, v]).T
ax = dm.plotSetup()
plt.plot(u[0], u[1], 'go')
plt.plot(v[0], v[1], 'yo')
ax.text(u[0]+.25,u[1],r'$\bf{u}$',size=20)
ax.text(v[0],v[1]-.35,r'$\bf{v}$',size=20)
rotation = np.array([[0, -1],[1, 0]])
up = rotation @ u
vp = rotation @ v
plt.plot(up[0], up[1], 'go')
plt.plot(vp[0], vp[1], 'yo')
ax.text(up[0]-1,up[1],r'$T(\mathbf{u})$',size=20)
ax.text(vp[0]-.9,vp[1]+.15,r'$T(\mathbf{v})$',size=20)
ax.text(0.75, 1.75, r'$\theta$', size = 20)
ax.text(-.5, 0.25, r'$\theta$', size = 20)
ax.annotate("",
            xy=((up)[0]+.1, (up)[1]+.1), xycoords='data',
            xytext=((u)[0]-.1, (u)[1]+.1), textcoords='data',
            size=20, va="center", ha="center",
            arrowprops=dict(arrowstyle="simple",
                            connectionstyle="arc3,rad=0.3"),
            )
ax.annotate("",
            xy=((vp)[0]+.1, (vp)[1]+.1), xycoords='data',
            xytext=((v)[0]-.1, (v)[1]), textcoords='data',
            size=20, va="center", ha="center",
            arrowprops=dict(arrowstyle="simple",
                            connectionstyle="arc3,rad=0.3"),
            );

First things first: Is rotation a linear transformation?

Recall that a for a transformation to be linear, it must be true that T(u+v)=T(u)+T(v).T(u+v)=T(u)+T(v).

I'm going to show you a "geometric proof."

This figure shows that "the rotation of u+vu+v is the sum of the rotation of uu and the rotation of vv".

In [17]:
#
u = np.array([1.5, 0.75])
v = np.array([0.25, 1])
diamond = np.array([[0,0], u,  u+v, v]).T
ax = dm.plotSetup()
dm.plotSquare(diamond)
ax.text(u[0]+.25,u[1],r'$\bf{u}$',size=20)
ax.text(v[0],v[1]+.25,r'$\bf{v}$',size=20)
ax.text(u[0]+v[0]+.25,u[1]+v[1],r'$\bf{u + v}$',size=20)
rotation = np.array([[0, -1],[1, 0]])
up = rotation @ u
vp = rotation @ v
diamond = np.array([[0,0], up,  up+vp, vp]).T
dm.plotSquare(diamond)
ax.text(0.75, 2.4, r'$\theta$', size = 20)
ax.annotate("",
            xy=((up+vp)[0]+.1, (up+vp)[1]+.1), xycoords='data',
            xytext=((u+v)[0]-.1, (u+v)[1]+.1), textcoords='data',
            size=20, va="center", ha="center",
            arrowprops=dict(arrowstyle="simple",
                            connectionstyle="arc3,rad=0.3"),
            );

So the answer is yes, rotation is a linear transformation.

Let's see how to compute the linear transformation that is a rotation.

Specifically: Let T:R2→R2T:R2→R2 be the transformation that rotates each point in R2R2 about the origin through an angle θθ, with counterclockwise rotation for a positive angle.

Let's find the standard matrix AA of this transformation.

Question. The columns of II are e1=[10]e1=[10] and e2=[01].e2=[01]. Where does TT map the standard basis vectors?

Referring to the diagram below, we can see that [10][10] rotates into [cosθsinθ],[cos⁡θsin⁡θ], and [01][01] rotates into [−sinθcosθ].[−sin⁡θcos⁡θ].

In [18]:
#
import matplotlib.patches as patches
ax = dm.plotSetup(-1.2, 1.2, -0.5, 1.2)
# red circle portion
arc = patches.Arc([0., 0.], 2., 2., 0., 340., 200.,
                 linewidth = 2, color = 'r',
                 linestyle = '-.')
ax.add_patch(arc)
#
# labels
ax.text(1.1, 0.1, r'$\mathbf{e}_1 = (1, 0)$', size = 20)
ax.text(0.1, 1.1, r'$\mathbf{e}_2 = (0, 1)$', size = 20)
#
# angle of rotation and rotated points
theta = np.pi / 6
e1t = [np.cos(theta), np.sin(theta)]
e2t = [-np.sin(theta), np.cos(theta)]
#
# theta labels
ax.text(0.5, 0.08, r'$\theta$', size = 20)
ax.text(-0.25, 0.5, r'$\theta$', size = 20)
#
# arrows from origin
ax.arrow(0, 0, e1t[0], e1t[1],
        length_includes_head = True,
        width = .02)
ax.arrow(0, 0, e2t[0], e2t[1],
        length_includes_head = True,
        width = .02)
#
# new point labels
ax.text(e1t[0]+.05, e1t[1]+.05, r'$[\cos\; \theta, \sin \;\theta]$', size = 20)
ax.text(e2t[0]-1.1, e2t[1]+.05, r'$[-\sin\; \theta, \cos \;\theta]$', size = 20)
#
# curved arrows showing rotation
ax.annotate("",
            xytext=(0.7, 0), xycoords='data',
            xy=(0.7*e1t[0], 0.7*e1t[1]), textcoords='data',
            size=10, va="center", ha="center",
            arrowprops=dict(arrowstyle="simple",
                            connectionstyle="arc3,rad=0.3"),
            )
ax.annotate("",
            xytext=(0, 0.7), xycoords='data',
            xy=(0.7*e2t[0], 0.7*e2t[1]), textcoords='data',
            size=10, va="center", ha="center",
            arrowprops=dict(arrowstyle="simple",
                            connectionstyle="arc3,rad=0.3"),
            )
#
# new points
plt.plot([e1t[0], e2t[0]], [e1t[1], e2t[1]], 'bo', markersize = 10)
plt.plot([0, 1], [1, 0], 'go', markersize = 10);

So by the Theorem above,

A=[cosθ−sinθsinθcosθ].A=[cos⁡θ−sin⁡θsin⁡θcos⁡θ].

In particular, a 90 degree counterclockwise rotation corresponds to the matrix

[0−110].[0−110].

Computing the matrix corresponding to a rotation¶

To demonstrate computationally the use of a rotation matrix, let's rotate the following shape note. It is constructed using an array of 26 vectors in R2R2 that define its outline. In other words, it is a 2 ×× 26 matrix.

In [9]:
note = np.array(
        [[193,47],
        [140,204],
        [123,193],
        [99,189],
        [74,196],
        [58,213],
        [49,237],
        [52,261],
        [65,279],
        [86,292],
        [113,295],
        [135,282],
        [152,258],
        [201,95],
        [212,127],
        [218,150],
        [213,168],
        [201,185],
        [192,200],
        [203,214],
        [219,205],
        [233,191],
        [242,170],
        [244,149],
        [242,131],
        [233,111]])
note = note.T/150.0
In [8]:
#
dm.plotSetup()
dm.plotShape(note)

To rotate note we need to multiply each column of note by the rotation matrix AA.

Recall that: in Python matrix multiplication is performed using the @ operator. That is, if A and B are matrices, then

A @ B

will multiply A by every column of B, and the resulting vectors will be formed into a matrix.

In [10]:
angle = 90
theta = (angle/180) * np.pi
A = np.array(
    [[np.cos(theta), -np.sin(theta)],
     [np.sin(theta), np.cos(theta)]])
rnote = A @ note
In [11]:
#
dm.plotSetup()
dm.plotShape(rnote)