Eigenvalue Decomposition

I wrote about eigenvalues and eigenvectors a while back, here. In this post, I’ll show how determining the eigenvalues and eigenvectors of a matrix (2 by 2 in this case) is pretty much all of the work of what’s called eigenvalue decomposition. We’ll start with this matrix, which represents a linear transformation: \[\begin{bmatrix}\mathtt{\,\,\,\,0}&\mathtt{\,\,\,\,1}\\\mathtt{-2}&\mathtt{-3}\end{bmatrix}\]

You can see the action of this matrix at the right (sort of). It sends the (1, 0) vector to (0, –2) and the (0, 1) vector to (1, –3).

The eigenvectors of this transformation are any nonzero vectors that do not change their direction during this transformation, but only scale up or down (or stay the same) by a factor of \(\mathtt{\lambda}\) as a result of the transformation. So,

\(\begin{bmatrix}\mathtt{\,\,\,\,0}&\mathtt{\,\,\,\,1}\\\mathtt{-2}&\mathtt{-3}\end{bmatrix}\begin{bmatrix}\mathtt{r_1}\\\mathtt{r_2}\end{bmatrix}=\lambda \begin{bmatrix}\mathtt{r_1}\\\mathtt{r_2}\end{bmatrix}\)

Using our calculations from the previous post linked above, we calculate the eigenvalues to be \(\mathtt{\lambda_1=-2}\) and \(\mathtt{\lambda_2=-1}\). And the corresponding eigenvectors are of the form \(\mathtt{(r, -2r)}\) and \(\mathtt{(-r, r)}\), respectively.

The red vector (representing the eigenvector \(\mathtt{(-r, r)}\)) at right starts at \(\mathtt{(-1, 1)}\). It is scaled by the eigenvalue of \(\mathtt{-1}\) during the transformation—meaning it simply turns in the opposite direction and its magnitude doesn’t change. Any vector of the form \(\mathtt{(-r, r)}\) will behave this way during this transformation.

The purple vector (representing the eigenvector \(\mathtt{(r, -2r)}\)) starts at \(\mathtt{(-1, 2)}\). It is scaled by the eigenvalue of \(\mathtt{-2}\) during the transformation—meaning it turns in the opposite direction and is scaled by a factor of \(\mathtt{2}\). Any vector of the form \(\mathtt{(r, -2r)}\) will behave this way during this transformation.

And Now for the Decomposition

We can now use the equation above and plug in each eigenvalue and its corresponding eigenvector to create two matrix equations.

\(\begin{bmatrix}\mathtt{\,\,\,\,0}&\mathtt{\,\,\,\,1}\\\mathtt{-2}&\mathtt{-3}\end{bmatrix}\begin{bmatrix}\mathtt{-1}\\\mathtt{\,\,\,\,2}\end{bmatrix}=\mathtt{-2}\begin{bmatrix}\mathtt{-1}\\\mathtt{\,\,\,\,2}\end{bmatrix}\) \[\begin{bmatrix}\mathtt{\,\,\,\,0}&\mathtt{\,\,\,\,1}\\\mathtt{-2}&\mathtt{-3}\end{bmatrix}\begin{bmatrix}\mathtt{-1}\\\mathtt{\,\,\,\,1}\end{bmatrix}=\mathtt{-1}\begin{bmatrix}\mathtt{-1}\\\mathtt{\,\,\,\,1}\end{bmatrix}\]

We can combine the items on the left side of each equation and the items on the right side of each equation into one matrix equation.

\(\begin{bmatrix}\mathtt{\,\,\,\,0}&\mathtt{\,\,\,\,1}\\\mathtt{-2}&\mathtt{-3}\end{bmatrix}\begin{bmatrix}\mathtt{-1}&\mathtt{-1}\\\mathtt{\,\,\,\,2}&\mathtt{\,\,\,\,1}\end{bmatrix}=\begin{bmatrix}\mathtt{-1}&\mathtt{-1}\\\mathtt{\,\,\,\,2}&\mathtt{\,\,\,\,1}\end{bmatrix}\begin{bmatrix}\mathtt{-2}&\mathtt{\,\,\,\,0}\\\mathtt{\,\,\,\,0}&\mathtt{-1}\end{bmatrix}\)

This leaves us with [original matrix][eigenvector matrix] = [eigenvector matrix][eigenvalue matrix]. Finally, we multiply both sides on the right by the inverse of the eigenvector matrix, in order to remove it from the left side of the equation. We can’t cancel it on the right side, because matrix multiplication is not commutative. That leaves us with the final decomposition (hat tip to Math the Beautiful for some of the ideas in this post): \[\begin{bmatrix}\mathtt{\,\,\,\,0}&\mathtt{\,\,\,\,1}\\\mathtt{-2}&\mathtt{-3}\end{bmatrix}=\begin{bmatrix}\mathtt{-1}&\mathtt{-1}\\\mathtt{\,\,\,\,2}&\mathtt{\,\,\,\,1}\end{bmatrix}\begin{bmatrix}\mathtt{-2}&\mathtt{\,\,\,\,0}\\\mathtt{\,\,\,\,0}&\mathtt{-1}\end{bmatrix}\begin{bmatrix}\mathtt{-1}&\mathtt{-1}\\\mathtt{\,\,\,\,2}&\mathtt{\,\,\,\,1}\end{bmatrix}^{\mathtt{-1}}\]

Multiplying these three matrices together, or combining the transformations represented by the matrices as we showed here, will result in the original matrix.
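If you want to verify the decomposition numerically, here is a quick sketch in Python with NumPy:

```python
import numpy as np

A = np.array([[ 0.0,  1.0],
              [-2.0, -3.0]])

# Eigenvalues and eigenvectors. NumPy returns unit-length eigenvectors,
# which are just scalar multiples of (-1, 1) and (-1, 2).
eigenvalues, Q = np.linalg.eig(A)
print(eigenvalues)                    # -1 and -2 (order may vary)

# Rebuild the original matrix: A = Q * diag(eigenvalues) * Q^(-1)
Lambda = np.diag(eigenvalues)
print(Q @ Lambda @ np.linalg.inv(Q))                   # recovers A
print(np.allclose(A, Q @ Lambda @ np.linalg.inv(Q)))   # True
```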

Making Parallelepipeds

We have talked about the cross product (here and here), so let’s move on to doing something marginally interesting with it: we’ll make a rectangular prism, or the more general parallelepiped.

A parallelepiped is shown at the right. It is defined by three vectors: u, v, and w. The cross product vector \(\mathtt{v \wedge w}\) is perpendicular to both v and w, and its magnitude, \(\mathtt{||v \wedge w||}\), is equal to the area of the parallelogram formed by the vectors v and w (something I didn’t mention in the previous two posts).

The perpendicular height of the skewed prism, or parallelepiped, is given by \(\mathtt{||u||cos(θ)}\).

The volume of the parallelepiped can thus be written as the area of the base times the height, or \(\mathtt{V = (||v \wedge w||)(||u||\text{cos}(θ))}\).

We can write \(\mathtt{\text{cos}(θ)}\) in this case as \[\mathtt{\text{cos}(θ) = \frac{(v \wedge w) \cdot u}{(||v \wedge w||)(||u||)}}\]

Which means that, after simplifying the volume equation, we’re left with \(\mathtt{V = (v \wedge w) \cdot u}\): the dot product of the vector perpendicular to the base and the slanted edge vector u. The result is a scalar value, of course, for the volume of the parallelepiped, and, because it is a dot product, it is a signed value. We can get negative volumes, which doesn’t mean the volume is negative but tells us something about the orientation of the parallelepiped.
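Here is what that signed-volume calculation looks like in code, as a minimal Python/NumPy sketch with made-up edge vectors:

```python
import numpy as np

# Three edge vectors of a parallelepiped (made-up values for the example)
u = np.array([1.0, 1.0, 3.0])
v = np.array([2.0, 0.0, 0.0])
w = np.array([0.0, 2.0, 0.0])

# Signed volume: (v ∧ w) · u
signed_volume = np.dot(np.cross(v, w), u)
print(signed_volume)         # 12.0
print(abs(signed_volume))    # the geometric volume
```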

Creating Some Parallelepipeds

Creating prisms and skewed prisms can be done in Geogebra 3D, but here I’ll show how to create these figures from scratch using Three.js. Click and drag on the window to rotate the scene below. Right click and drag to pan left, right, up, or down. Scroll to zoom in and out.

Click on the Pencil icon above in the lovely Trinket window and navigate to the parallelepiped.js tab to see the code that makes the cubes. You can see that vectors are used to create the vertices (position vectors, so just points). Each face is composed of 2 triangles: (0, 1, 2) means to create a face from the 0th, 1st, and 2nd vertices from the vertices list. Make some copies of the box in the code and play around!

To determine the volume of each cube: \[\left(\begin{bmatrix}\mathtt{1}\\\mathtt{0}\\\mathtt{0}\end{bmatrix} \wedge \begin{bmatrix}\mathtt{0}\\\mathtt{1}\\\mathtt{0}\end{bmatrix}\right) \cdot \begin{bmatrix}\mathtt{0}\\\mathtt{0}\\\mathtt{1}\end{bmatrix} \mathtt{= 1}\]
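Here is a rough Python sketch of the same construction idea (it only mirrors the idea; the actual parallelepiped.js in the trinket differs). The eight vertices come from sums of the three edge vectors, and each face is listed as two triangles of vertex indices:

```python
import numpy as np
from itertools import product

# Edge vectors of the box -- here the unit cube from the calculation above
u = np.array([1.0, 0.0, 0.0])
v = np.array([0.0, 1.0, 0.0])
w = np.array([0.0, 0.0, 1.0])

# The eight vertices are all sums a*u + b*v + c*w with a, b, c in {0, 1}
vertices = [a * u + b * v + c * w for a, b, c in product((0, 1), repeat=3)]

# One face of the box as two triangles of vertex indices
# (the other five faces are listed the same way)
faces = [(0, 1, 3), (0, 3, 2)]

# Volume check, as above
print(abs(np.dot(np.cross(v, w), u)))   # 1.0
```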

Making New Functions

Algebra students usually learn at some point in their studies that you can dilate a function like \(\mathtt{f(x)=x^2}\) by multiplying it by a constant value. Usually the multiplier is written as \(\mathtt{A}\), so you get \(\mathtt{f(x)=Ax^2}\), which can be \(\mathtt{f(x)=2x^2}\) or \(\mathtt{f(x)=3x^2}\) or \(\mathtt{f(x)=\frac{1}{2}x^2}\), and so on.

What we focus on at the beginning is how this change affects the graph of the function—and, importantly, how we can consistently describe that change as it applies to any function.

So, for example, the brightest (orange-brown) curve in the image at the right represents \(\mathtt{f(x)=x^2}\), or \(\mathtt{f(x)=1x^2}\). And when we increase the A-value, first to 2 and then to 3, the curve gets narrower. Decreasing the A-value, to \(\mathtt{\frac{1}{2}}\) for example, causes the curve to get wider. The same kind of “squishing” happens with every function type. (Check out this article by Better Explained for a really nice explanation.)

Making New Functions

We can also make higher-degree functions from lower-degree functions using dilations. The only change we make to the process of dilation shown above is that now we multiply each point of a function by a non-constant value. More specifically, we can multiply the y-coordinate of each point by its x-coordinate to get its new y-coordinate.

At the left, we transform the constant function \(\mathtt{f(x)=5}\) into a higher-degree function in this way. By multiplying the y-coordinate of each original point by its x-coordinate, we change the function from \(\mathtt{f(x)=5}\) to \(\mathtt{f(x)=(x)(5)}\), or \(\mathtt{f(x)=5x}\). Another multiplication of all the points by x would get us \(\mathtt{f(x)=5x^2}\). In that case, you can see that all the points to the left of the y-axis have to reflect across the x-axis, since each y-coordinate would be a negative times a negative.
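Here is that pointwise process as a small Python sketch, sampling just a handful of x-values (no plotting, just the coordinates):

```python
# Start with points on the constant function f(x) = 5, then multiply
# each y-coordinate by its x-coordinate -- twice.
xs = [-2, -1, 0, 1, 2]

constant  = [(x, 5) for x in xs]               # f(x) = 5
linear    = [(x, y * x) for x, y in constant]  # f(x) = 5x
quadratic = [(x, y * x) for x, y in linear]    # f(x) = 5x^2

print(linear)      # [(-2, -10), (-1, -5), (0, 0), (1, 5), (2, 10)]
print(quadratic)   # [(-2, 20), (-1, 5), (0, 0), (1, 5), (2, 20)]
```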

Another idea that becomes clearer when working with non-constant dilations in this way is that zeros start to make a little more sense.

Try it with other dilations (say, \(\mathtt{x \cdot (x+3)}\) or even \(\mathtt{x \cdot (x-1)^2}\)) and pay attention to what happens to those points that wind up getting multiplied by 0.

Making Sense of the Cross Product

Last time, we saw that the cross product is a product of two 3d vectors which delivers a vector perpendicular to those two factor vectors.

The cross product is built using three determinants. To determine the x-component of the cross product from the factor vectors (1, 3, 0) and (–2, 0, 0), you find the determinant of the vectors (3, 0) and (0, 0)—the vectors built from the “not-x” components (y- and z-components) of the factors. Repeat this process for the other two components of the cross product, making sure to reverse the sign of the result for the y-component.

But why does this work? How does the cross product make itself perpendicular to the two factor vectors by just using determinants? Below, we’ll still be using magic, but we get a little closer to making our understanding magic free.

Getting the Result We Want

We can actually start with a result we definitely want from the cross product and go from there. (1) The result we want is that when we determine the cross product of a “pure” x-vector (\(\mathtt{1,0,0}\)) and a “pure” y-vector (\(\mathtt{0,1,0}\)), we should get a “pure” z-vector (\(\mathtt{0,0,1}\)). The same goes for other pairings as well. Thus:

\(\begin{bmatrix}\mathtt{1}\\\mathtt{0}\\\mathtt{0}\end{bmatrix} \otimes \begin{bmatrix}\mathtt{0}\\\mathtt{1}\\\mathtt{0}\end{bmatrix} = \begin{bmatrix}\mathtt{0}\\\mathtt{0}\\\mathtt{1}\end{bmatrix} \quad \quad \) \(\begin{bmatrix}\mathtt{0}\\\mathtt{1}\\\mathtt{0}\end{bmatrix} \otimes \begin{bmatrix}\mathtt{0}\\\mathtt{0}\\\mathtt{1}\end{bmatrix} = \begin{bmatrix}\mathtt{1}\\\mathtt{0}\\\mathtt{0}\end{bmatrix} \quad \quad \begin{bmatrix}\mathtt{0}\\\mathtt{0}\\\mathtt{1}\end{bmatrix} \otimes \begin{bmatrix}\mathtt{1}\\\mathtt{0}\\\mathtt{0}\end{bmatrix} = \begin{bmatrix}\mathtt{0}\\\mathtt{1}\\\mathtt{0}\end{bmatrix} \)

A simpler way to write this is to use \(\mathtt{i}\), \(\mathtt{j}\), and \(\mathtt{k}\) to represent the pure x-, y-, and z-vectors, respectively. So, \(\mathtt{i \otimes j = k}\) and so on.

Another thing we want—and here comes some (more) magic—is for (2) the cross product to be antisymmetric, which means that when we change the order of the factors, the cross product keeps its magnitude but changes its sign. So, we want \(\mathtt{i \otimes j = k}\), but then \(\mathtt{j \otimes i = -k}\). And, as before, the same goes for the other pairings as well: \(\mathtt{j \otimes k = i}\), \(\mathtt{k \otimes j = -i}\), \(\mathtt{k \otimes i = j}\), \(\mathtt{i \otimes k = -j}\). This property allows us to use the cross product to get a sense of how two vectors are oriented relative to each other in 3d space.

With those two magic beans in hand (and a third and fourth to come in just a second), we can go back to notice that any vector can be written as a linear combination of \(\mathtt{i}\), \(\mathtt{j}\), and \(\mathtt{k}\). The two vectors at the end of the previous post on this topic, for example, (0, 4, 1) and (–2, 0, 0) can be written as \(\mathtt{4j + k}\) and \(\mathtt{-2i}\), respectively.

The cross product, then, of any two 3d vectors \(\mathtt{v = (v_x,v_y,v_z)}\) and \(\mathtt{w = (w_x,w_y,w_z)}\) can be written as: \[\mathtt{(v_{x}i+v_{y}j+v_{z}k) \otimes (w_{x}i+w_{y}j+w_{z}k)}\]

For the final bits of magic, we (3) assume that the cross product distributes over addition as we would expect it to, and (4) decide that the cross product of a “pure” vector (i, j, or k) with itself is 0. If that all works out, then we get this: \[\mathtt{v_{x}w_{x}i^2 + v_{x}w_{y}ij + v_{x}w_{z}ik + v_{y}w_{x}ji + v_{y}w_{y}j^2 + v_{y}w_{z}jk + v_{z}w_{x}ki + v_{z}w_{y}kj + v_{z}w_{z}k^2}\]

Then, by applying the ideas in (1), (2), and (4), we simplify to this: \[\mathtt{(v_{y}w_{z} - v_{z}w_{y})i + (v_{z}w_{x} - v_{x}w_{z})j + (v_{x}w_{y} - v_{y}w_{x})k}\]

And that’s our cross product vector that we saw before. The cross product of the vectors shown in the image above would be the vector (0, –2, 8).
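That formula translates directly into code. Here is a small Python sketch, checked against the ramp vectors (the helper names are just for illustration):

```python
def cross(v, w):
    # Cross product, written straight from the expansion above
    vx, vy, vz = v
    wx, wy, wz = w
    return (vy * wz - vz * wy,
            vz * wx - vx * wz,
            vx * wy - vy * wx)

def dot(v, w):
    return sum(a * b for a, b in zip(v, w))

v, w = (0, 4, 1), (-2, 0, 0)
p = cross(v, w)
print(p)                       # (0, -2, 8)
print(dot(p, v), dot(p, w))    # 0 0 -- perpendicular to both factors
```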

The Cross Product

The cross product of two vectors is another vector (whereas the dot product was just another number—a scalar). The cross product vector is perpendicular to both of the factor vectors. Typically, books will say that we need 3d vectors (vectors with 3 components) to talk about the cross product, which is true, sort of, but we can give 2d vectors a third component of zero to see how the cross product works with 2d-ish vectors, like below.

At the right, we show the vector (1, 3, 0), the vector (–2, 0, 0), and the cross product of those two vectors (in that order), which is the cross product vector (0, 0, 6).

Since we’re calling it a product, we’ll want to know how we built that product. So, let’s talk about that.

Deconstructing the Cross Product

The cross product vector is built using three determinants, as shown below.

For the x-component of the cross product vector, we deconstruct the factor vectors into 2d vectors made up of the y- and z-components. Then we find the determinant of those two 2d vectors (the area of the parallelogram they form, if any). We do the same for each of the other components of the cross product vector—if we’re working on the y-component of the cross product vector, then we create two 2d vectors from the x- and z-components of the factor vectors and find their parallelogram area, or determinant. And the same for the third component of the cross product vector. (Notice, though, that we reverse the sign of the second component of the cross product vector. It’s not evident here, because it’s zero.)
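Here is that recipe as a small Python sketch, with each component written as a 2 × 2 determinant (NumPy’s built-in cross product is included only as a check):

```python
import numpy as np

def det2(a, b, c, d):
    # Determinant of the 2x2 matrix [[a, b], [c, d]]
    return a * d - b * c

v = (1, 3, 0)     # the factor vectors from the example above
w = (-2, 0, 0)

x =  det2(v[1], w[1], v[2], w[2])    # y- and z-components of the factors
y = -det2(v[0], w[0], v[2], w[2])    # x- and z-components, sign reversed
z =  det2(v[0], w[0], v[1], w[1])    # x- and y-components of the factors

print((x, y, z))        # (0, 0, 6)
print(np.cross(v, w))   # [0 0 6] -- same answer
```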

We’ll look more into the intuition behind this later. It is not immediately obvious why three simple area calculations (the determinants) should be able to deliver a vector that is exactly perpendicular to the two factor vectors (which is an indication that we don’t know everything there is to know about the seemingly dirt-simple concept of area!). But the cross product has a lot of fascinating connections to and uses in physics and engineering—and computer graphics.

I’ll leave you with this exercise to determine the cross product, or a vector perpendicular to this little ramp. The blue vector is (0, 4, 1), and the red vector is (–2, 0, 0).


Vectors and Complex Numbers

Vectors share a lot of characteristics with complex numbers. They are both multi-dimensional objects, so to speak. Position vectors with 2 components \(\mathtt{(x_1, x_2)}\) behave in much the same way geometrically as complex numbers \(\mathtt{a + bi}\). At the right, you can see that Geogebra displays the position vectors as arrows and the complex numbers as points. In some sense, though, we could use both the vector and the complex number to refer to the same object if we wanted.

You’ll have no problem finding out about how to multiply two complex numbers, though a similar product result for multiplying 2 vectors seems to be hard to come by. For complex numbers, we just use the Distributive Property: \[\mathtt{(a + bi)(c + di) = ac + adi + bci + bdi^2 = ac – bd + (ad + bc)i}\] In fact, we are told that we can think of multiplying complex numbers as rotating points on the complex plane. Since \(\mathtt{0 + i}\) is at a 90° angle to the x-axis, multiplying \(\mathtt{3 + 2i}\) by \(\mathtt{0 + i}\) will rotate the point \(\mathtt{3 + 2i}\) ninety degrees about the origin: \[\mathtt{(3 + 2i)(0 + 1i) = (3)(0) + (3)(1)i + (2)(0)i + (2)(1)i^2 = -2 + 3i}\]
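Python’s built-in complex numbers make this easy to check for yourself (a quick sketch):

```python
z = 3 + 2j

# Multiplying by i (0 + 1i) rotates the point 90 degrees about the origin
print(z * 1j)           # (-2+3j)

# Multiplying by i twice rotates it 180 degrees
print(z * 1j * 1j)      # (-3-2j)

# The length of 3 + 2i -- the scale factor discussed below
print(abs(z))           # 3.6055... (the square root of 13)
```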

We’ll get the same result after changing the order of the factors too, of course, since complex multiplication is commutative, but now we have to say that \(\mathtt{0 + i}\) was not only rotated (by the angle that \(\mathtt{3 + 2i}\) makes with the positive x-axis) but scaled as well.

By what was it scaled? Well, since the straight vertical vector has a length of 1, it was scaled by the length of the vector represented by the complex number \(\mathtt{3 + 2i}\), or \(\mathtt{\sqrt{13}}\).

Multiplying Vectors in the Same Way

It seems that we can multiply vectors in the same way that you can multiply complex numbers, though I’m hard pressed to find a source which describes this possibility.

That is, we can rotate the position vector (a, b) so many degrees (\(\mathtt{tan^{-1}(\frac{d}{c})}\)) counterclockwise by multiplying by the position vector (c, d) of unit length, like so: \[\begin{bmatrix}\mathtt{a}\\\mathtt{b}\end{bmatrix}\begin{bmatrix}\mathtt{c}\\\mathtt{d}\end{bmatrix} = \begin{bmatrix}\mathtt{ac – bd}\\\mathtt{ad + bc}\end{bmatrix}\]

Want to rotate the vector (5, 2) by 19°? First we determine the unit vector which forms a 19° angle with the x-axis. That’s (cos(19°), sin(19°)). Then multiply as above:

\[\begin{bmatrix}\mathtt{5}\\\mathtt{2}\end{bmatrix}\begin{bmatrix}\mathtt{cos(19^\circ)}\\\mathtt{sin(19^\circ)}\end{bmatrix} = \begin{bmatrix}\mathtt{5cos(19^\circ) – 2sin(19^\circ)}\\\mathtt{5sin(19^\circ) + 2cos(19^\circ)}\end{bmatrix}\]
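Here is that rotation as a small Python sketch (the helper name is just for illustration):

```python
import math

def rotate(vec, degrees):
    # Rotate a 2d vector counterclockwise by multiplying, complex-style,
    # by the unit vector (cos(angle), sin(angle))
    a, b = vec
    c, d = math.cos(math.radians(degrees)), math.sin(math.radians(degrees))
    return (a * c - b * d, a * d + b * c)

rotated = rotate((5, 2), 19)
print(rotated)                                   # about (4.0765, 3.5189)
print(math.hypot(*rotated), math.hypot(5, 2))    # both sqrt(29): length is preserved
```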

Seems like a perfectly satisfactory way of multiplying vectors to me. We have some issues with undefined values and generality, etc., but for cobbling some things together, multiplying vectors in this admittedly crazy way seems easier to think about than hauling out full-blown matrices to do the job.

Word Vectors and Dot Products

A really cool thing about vectors is that they are used to represent and compare a lot of different things that don’t, at first glance, appear to be mathematically representable or comparable. And a lot of this power comes from working with vectors that are “bigger” than the 2-component vectors we have looked at thus far.

\(\begin{bmatrix}\mathtt{1}\\\mathtt{0}\\\mathtt{1}\\\mathtt{0}\\\mathtt{1}\\\mathtt{0}\\\mathtt{0}\\\mathtt{0}\\\mathtt{1}\\\mathtt{0}\\\mathtt{0}\\\mathtt{1}\\\mathtt{0}\\\mathtt{1}\\\mathtt{0}\\\mathtt{1}\\\mathtt{0}\\\mathtt{0}\\\mathtt{0}\\\mathtt{0}\\\mathtt{0}\\\mathtt{0}\\\mathtt{0}\\\mathtt{0}\\\mathtt{0}\\\mathtt{0}\end{bmatrix}\) \(\begin{bmatrix}\mathtt{1}\\\mathtt{0}\\\mathtt{1}\\\mathtt{0}\\\mathtt{0}\\\mathtt{0}\\\mathtt{0}\\\mathtt{0}\\\mathtt{0}\\\mathtt{0}\\\mathtt{0}\\\mathtt{1}\\\mathtt{0}\\\mathtt{0}\\\mathtt{0}\\\mathtt{1}\\\mathtt{0}\\\mathtt{0}\\\mathtt{0}\\\mathtt{0}\\\mathtt{0}\\\mathtt{0}\\\mathtt{0}\\\mathtt{0}\\\mathtt{0}\\\mathtt{0}\end{bmatrix}\)

For example, we could have a vector with 26 components. Some would say that this is a vector with 26 dimensions, but I don’t see the need to talk about dimensions—for the most part, if we’re talking about 26-component vectors, we’re probably not talking about dimensions in any helpful sense, except to help us look smart.

At the right are two possible 26-component vectors. We can say that the vector on the left represents the word pelican. The vector on the right represents the word clap. Each component of a vector counts how many times a letter from a to z appears in the word. So, each vector may not be unique to the word it represents. The one on the left could also be the vector for capelin, a kind of fish, or panicle, which is a loose cluster of flowers.

The words, however, are similar in that the longer word pelican contains all the letters of the shorter word clap. We might be able to see this similarity show up if we measure the cosine between the two vectors. The cosine can be had, recall, by determining the dot product of the vectors (multiply each pair of corresponding components and add all the products) and dividing the result by the product of their lengths (each length being the square root of \(\mathtt{component_1^2 + component_2^2 + \ldots}\)). What we get for the two vectors on the right is: \[\mathtt{\frac{4}{\sqrt{7}\sqrt{4}} \approx 0.756}\]

This is fairly close to 1. The angle measure between the two words would be about 41°. Now let’s compare pelican and plenty. These two words are also fairly similar—there is the same 4-letter overlap between the words—but should yield a smaller cosine because of the divergent letters. Confirm for yourself, but for these two words I get: \[\mathtt{\frac{4}{\sqrt{7}\sqrt{6}} \approx 0.617}\]

And that’s about a 52-degree angle between the words. An even more different word, like sausage (the a and s components have 2s), produces a cosine (with pelican) of about 0.342, which is about a 70° angle.
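Here is the whole comparison as a small Python sketch, building the 26-component letter-count vectors and computing the cosines:

```python
import math
from collections import Counter

ALPHABET = "abcdefghijklmnopqrstuvwxyz"

def letter_vector(word):
    # 26-component vector of letter counts, a through z
    counts = Counter(word)
    return [counts.get(letter, 0) for letter in ALPHABET]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    length_u = math.sqrt(sum(a * a for a in u))
    length_v = math.sqrt(sum(a * a for a in v))
    return dot / (length_u * length_v)

print(cosine(letter_vector("pelican"), letter_vector("clap")))      # ~0.756
print(cosine(letter_vector("pelican"), letter_vector("plenty")))    # ~0.617
print(cosine(letter_vector("pelican"), letter_vector("sausage")))   # ~0.342
```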

So, we see that with vectors we can apply a numeric measurement to the similarity of words, with anagrams having cosines of 1 and words sharing no letters at all being at right angles to each other (having a dot product and cosine of 0).

Combining Matrix Transformations

Something that stands out in my mind as I have learned more linear algebra recently is how much more sane it feels to do a lot of forward thinking before getting into the backward “solving” thinking—to, for example, create a bunch of linear transformations and strengthen my ability to do stuff with the mathematics before throwing a wrench in the works and having me wonder what would happen if I didn’t know the starting vectors.

So, we’ll continue that forward thinking here by looking at the effect of combining transformations. Or, if we think about a 2 × 2 matrix as representing a linear transformation, then we’ll look at combining matrices.

How about this one, then? This is a transformation in which the (1, 0) basis vector goes to \(\mathtt{(1, \frac{1}{3})}\) and the (0, 1) basis vector goes to (–2, 1). You can see the effect this transformation has on the unshaded triangle (producing the shaded triangle).

Before we combine this with another transformation, notice that the horizontal base of the original triangle, which was parallel to the horizontal basis vector, appears to be, in its transformed form, now parallel to the transformed horizontal basis vector. Let’s test this. \[\begin{bmatrix}\mathtt{1} & \mathtt{-2}\\\mathtt{\frac{1}{3}} & \mathtt{\,\,\,\,1}\end{bmatrix}\begin{bmatrix}\mathtt{2}\\\mathtt{2}\end{bmatrix} = \begin{bmatrix}\mathtt{-2}\\\mathtt{2\frac{2}{3}}\end{bmatrix} \quad\text{and}\quad\begin{bmatrix}\mathtt{1} & \mathtt{-2}\\\mathtt{\frac{1}{3}} & \mathtt{\,\,\,\,1}\end{bmatrix}\begin{bmatrix}\mathtt{4}\\\mathtt{2}\end{bmatrix} = \begin{bmatrix}\mathtt{0}\\\mathtt{3\frac{1}{3}}\end{bmatrix}\]

The slope of the originally horizontal but now transformed base is, then, \(\mathtt{\frac{3\frac{1}{3}\, - \,2\frac{2}{3}}{0\,-\,(-2)} = \frac{\frac{2}{3}}{2} = \frac{1}{3}}\), which is the same slope as the transformed horizontal basis vector \(\mathtt{(1, \frac{1}{3})}\).

Transform the Transformation

Okay, so let’s transform the transformation, as shown at the right, under this matrix: \[\begin{bmatrix}\mathtt{-1} & \mathtt{0}\\\mathtt{\,\,\,\,0} & \mathtt{\frac{1}{2}}\end{bmatrix}\]

Is it possible to multiply the two matrices to get our final (purple) transformation? Here’s how to multiply the two matrices and the result: \[\begin{bmatrix}\mathtt{-1} & \mathtt{0}\\\mathtt{\,\,\,\,0} & \mathtt{\frac{1}{2}}\end{bmatrix}\begin{bmatrix}\mathtt{1} & \mathtt{-2}\\\mathtt{\frac{1}{3}} & \mathtt{\,\,\,\,1}\end{bmatrix} = \begin{bmatrix}\mathtt{-1} & \mathtt{\,\,\,\,2}\\\mathtt{\,\,\,\,\frac{1}{6}} & \mathtt{\,\,\,\,\frac{1}{2}}\end{bmatrix}\]

You should be able to check that, yes indeed, the last matrix takes the original triangle to the purple triangle. You should also be able to test that reversing the order of the multiplication of the two matrices changes the answer completely, so matrix multiplication is not commutative. Notice also that the determinant is approximately \(\mathtt{-0.8333…}\). This tells us that the area of the new triangle is 5 sixths that of the original. And the negative indicates the reflection the triangle underwent. The determinant of the first matrix is –0.5, and that of the second is 5 thirds. Multiply those together and you get the determinant of the combined transformations matrix.
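Here is a quick NumPy check of the combined transformation and the determinants (the variable names are mine, just for readability):

```python
import numpy as np

original = np.array([[1.0, -2.0],
                     [1/3,  1.0]])    # the first transformation above
second   = np.array([[-1.0, 0.0],
                     [ 0.0, 0.5]])    # the reflect-and-squash transformation

combined = second @ original          # apply `original`, then `second`
print(combined)                       # [[-1, 2], [1/6, 1/2]]

print(np.linalg.det(combined))                            # -0.8333...
print(np.linalg.det(second) * np.linalg.det(original))    # the same value
print(np.allclose(original @ second, combined))           # False -- order matters
```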

Inverse of a Scaling Matrix

Well, we should be pretty comfortable moving things around with vectors and matrices. We’re good on some of the forward thinking. We can think of a matrix \(\mathtt{A}\) as a mapping of one vector (or an entire set of vectors) to another vector (or to another set of vectors). Then we can think of \(\mathtt{B}\) as the matrix which undoes the mapping of \(\mathtt{A}\). So, \(\mathtt{B}\) is the inverse of \(\mathtt{A}\).

How do we figure out what \(\mathtt{A}\) and \(\mathtt{B}\) are?

\[\mathtt{A}\color{green}{\begin{bmatrix}\mathtt{\,\,3\,\,} \\\mathtt{\,\,3\,\,} \end{bmatrix}} \,= \color{green}{\begin{bmatrix}\mathtt{-4} \\\mathtt{\,\,\,\,1} \end{bmatrix}}\]
\[\mathtt{B}\color{green}{\begin{bmatrix}\mathtt{-4} \\\mathtt{\,\,\,\,\,1} \end{bmatrix}} = \color{green}{\begin{bmatrix}\mathtt{\,\,3\,\,} \\\mathtt{\,\,3\,\,} \end{bmatrix}}\]

Eyeballing Is a Lost Art in Mathematics Education

It is! We can figure out the matrix \(\mathtt{A}\) without doing any calculations. Break down the movement of the green point into horizontal and vertical components. Horizontally, the green point is reflected across the “y-axis” and then stretched another third of its distance from the y-axis. This corresponds to multiplying the horizontal component of the green point by –1.333…. For the vertical component, the green point starts at 3 and ends at 1, so the vertical component is dilated by a factor of 0.333…. We can see both of these transformations shown in the change in the sizes and directions of the blue and orange basis vectors. So, our transformation matrix \(\mathtt{A}\) is shown below. When we multiply the vector (3, 3) by this transformation matrix, we get the point, or position vector, (–4, 1). \[\begin{bmatrix}\mathtt{-\frac{4}{3}} & \mathtt{0}\\\mathtt{\,\,\,\,0} & \mathtt{\frac{1}{3}}\end{bmatrix}\begin{bmatrix}\mathtt{3}\\\mathtt{3}\end{bmatrix} = \begin{bmatrix}\mathtt{-4}\\\mathtt{\,\,\,\,1}\end{bmatrix}\]

You can see that \(\mathtt{A}\) is a scaling matrix, which is why it can be eyeballed, more or less. And what is the inverse matrix? We can use similar reasoning and work backward from (–4, 1) to (3, 3). For the horizontal component, reflect across the y-axis and scale down by three fourths. For the vertical component, multiply by 3. So, the inverse matrix, \(\mathtt{B}\), when multiplied to the vector, produces the correct starting vector: \[\begin{bmatrix}\mathtt{-\frac{3}{4}} & \mathtt{0}\\\mathtt{\,\,\,\,0} & \mathtt{3}\end{bmatrix}\begin{bmatrix}\mathtt{-4}\\\mathtt{\,\,\,\,1}\end{bmatrix} = \begin{bmatrix}\mathtt{3}\\\mathtt{3}\end{bmatrix}\]

You’ll notice that we use the reciprocals of the non-zero scaling numbers in the original matrix to produce the inverse matrix. You can do the calculations with the other points on the animation above to test it out.
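A quick NumPy check of the eyeballed matrices:

```python
import numpy as np

A = np.array([[-4/3, 0.0],
              [ 0.0, 1/3]])   # the eyeballed scaling matrix
B = np.array([[-3/4, 0.0],
              [ 0.0, 3.0]])   # reciprocals of the diagonal entries

print(A @ np.array([3.0, 3.0]))            # [-4.  1.]
print(B @ np.array([-4.0, 1.0]))           # [3.  3.]
print(np.allclose(B, np.linalg.inv(A)))    # True -- B undoes A
```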

Incidentally, we can also eyeball the eigenvectors—those vectors which don’t change direction but are merely scaled as a result of the transformations—and even the eigenvalues (the scale factor of each transformed eigenvector). The vector (1, 0) is an eigenvector, with an eigenvalue of \(\mathtt{-\frac{4}{3}}\) for the original transformation and an eigenvalue of –0.75 for the inverse, and the vector (0, 1) is an eigenvector, with an eigenvalue of \(\mathtt{\frac{1}{3}}\) for the original transformation and an eigenvalue of 3 for the inverse.

Rotations, Reflections, Scalings

I just wanted to pause briefly to showcase how some of the linear transformations we have been looking into can be represented in computerese (or at least one version of computerese). You can click on the pencil icon and then on the matrix_transform.js file in the trinket below and look for the word matrix. Change the numbers in those lines to check the effects on the transformations. You can get some fairly wild stuff.

By the way, trinket is an incredibly beautiful product if you like tinkering with all kinds of code. Grab a free account and share your work!

For this demo, I stuck with simple transformations centered at the origin of a coordinate system (so to speak). As you can imagine, there are much more elaborate things you can do when you combine transformations and move the center point around.