## The Determinant, Briefly

I want to get to moving stuff around using vectors and matrices, but I’ll stop for a second and touch on the determinant, since linear algebra seems to think it’s important. And, to be honest, it is kind of interesting.

The determinant is the area of the parallelogram created by two vectors. Two vectors will always create a parallelogram like the one shown below, unless they are just scaled versions of each other—but we’ll get to that.

The two vectors shown here are $$\color{blue}{\mathtt{u} = \begin{bmatrix}\mathtt{u_1}\\\mathtt{u_2}\end{bmatrix}}$$ and $$\color{red}{\mathtt{v} = \begin{bmatrix}\mathtt{v_1}\\\mathtt{v_2}\end{bmatrix}}$$.

We can determine the area of the parallelogram by first determining the area of the large rectangle and then subtracting the triangle areas. Note, by the way, that there are two pairs of two congruent triangles.

So, the area of the large rectangle is $$\mathtt{(u_1 + -v_1)(u_2 + v_2)}$$. The negative is interesting. We need it because we want to use positive values when calculating the area of the rectangle. If you play around with different pairs of vectors and different rectangles, you will notice that one of the vector components will always have to be negative in the area calculation, if a parallelogram is formed.

The two large congruent right triangles have a combined area of $$\mathtt{u_{1}u_{2}}$$. And the two smaller congruent right triangles have a combined area of $$\mathtt{-v_{1}v_{2}}$$. Thus, distributing and subtracting, we get $\mathtt{u_{1}u_{2} + u_{1}v_{2} – v_{1}u_{2} – v_{1}v_{2} – u_{1}u_{2} – (-v_{1}v_{2})}$

Then, after simplifying, we have $$\mathtt{u_{1}v_{2} – u_{2}v_{1}}$$. If the two vectors u and v represented a linear transformation and were written as column vectors in a matrix, then we could say that there is a determinant of the matrix and show the determinant of the matrix in the way it is usually presented: $\begin{vmatrix}\mathtt{u_1} & \mathtt{v_1}\\\mathtt{u_2} & \mathtt{v_2}\end{vmatrix} = \mathtt{u_{1}v_{2} – u_{2}v_{1}}$

One thing to note is that this is a signed area. The sign records a change in orientation that we won’t go into at the moment. In fact, describing the determinant as an area is a little misleading. When you look at transformations, the determinant tells you the scale factor of the change in area. A determinant of 1 would mean that areas did not change, etc. Also, if we have vectors that are simply scaled versions of one another—the components of one vector are scaled versions of the other—then the determinant will be zero, which is pretty much what we want, since the area will be zero. Let’s use lambda ($$\mathtt{\lambda}$$) as our scalar to be cool. $\,\,\,\,\,\,\quad\,\,\,\,\,\begin{vmatrix}\mathtt{u_1} & \mathtt{\lambda u_1}\\\mathtt{u_2} & \mathtt{\lambda u_2}\end{vmatrix} = \mathtt{\lambda u_{1}u_{2} – \lambda u_{1}u_{2} = 0}$
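Here is a minimal Python sketch of the formula above. The function name `det2` and the sample vectors are illustrative choices, not values taken from the diagram.

```python
def det2(u, v):
    """Signed area of the parallelogram spanned by 2D vectors u and v."""
    return u[0] * v[1] - u[1] * v[0]


u = (3, 1)
v = (1, 2)

print(det2(u, v))  # 3*2 - 1*1 = 5
print(det2(v, u))  # swapping the vectors flips the sign: -5

# A scaled copy of u spans no area at all.
lam = 2.5
scaled_u = (lam * u[0], lam * u[1])
print(det2(u, scaled_u))  # 0.0
```

Swapping the two vectors flips the sign, which is the orientation change mentioned above.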

## A Matrix and a Transformation

So, we’ve jumped around a bit in what is turning into an introduction to linear algebra. The posts here, here, here, here, here, and here show the ground we’ve covered so far—although, saying it that way implies that we’ve moved along continuous patches of ground, which is certainly not true. We skipped over adding and scaling vectors and have focused on concepts which have close analogs to current high school algebra and geometry topics.

Now we’ll jump to the concept of a matrix. A matrix gives you information about two arrows—the x-axis arrow, if you will, and the y-axis arrow. The matrix below, for example, tells you that you are in the familiar xy coordinate plane, with the x arrow, or x vector, extending from the origin to (1, 0) and the y arrow, or y vector, going from the origin to (0, 1).

$\begin{bmatrix}\mathtt{\color{blue}{1}} & \mathtt{\color{orange}{0}}\\\mathtt{\color{blue}{0}} & \mathtt{\color{orange}{1}}\end{bmatrix}$

This is a kind of home-base matrix, and it is called the identity matrix. If we multiply a vector by this matrix, we’ll always get back the vector we put in. The equation below shows how this matrix-vector multiplication is done with the identity matrix and the vector (1, 2), as shown at the right.

$$\begin{bmatrix}\mathtt{1} & \mathtt{0}\\\mathtt{0} & \mathtt{1}\end{bmatrix}\begin{bmatrix}\mathtt{1}\\\mathtt{2}\end{bmatrix} = \mathtt{1}\begin{bmatrix}\mathtt{1}\\\mathtt{0}\end{bmatrix} + \mathtt{2}\begin{bmatrix}\mathtt{0}\\\mathtt{1}\end{bmatrix} = \begin{bmatrix}\mathtt{(1)(1) + (2)(0)}\\\mathtt{(1)(0) + (2)(1)}\end{bmatrix}$$

As you can see on the far right of the equation, the result is (1 + 0, 0 + 2), or (1, 2), the vector we started with.

### A Linear Transformation

Now let’s take the vector at (1, 2) and map it to (0, 2). We’re looking for a matrix that can accomplish this—a transformation of the coordinate system that will map (1, 2) to (0, 2). If we shrink the horizontal vector to (0, 0) and keep the vertical vector the same, that would seem to do the trick.

$$\begin{bmatrix}\mathtt{0} & \mathtt{0}\\\mathtt{0} & \mathtt{1}\end{bmatrix}\begin{bmatrix}\mathtt{1}\\\mathtt{2}\end{bmatrix} = \mathtt{1}\begin{bmatrix}\mathtt{0}\\\mathtt{0}\end{bmatrix} + \mathtt{2}\begin{bmatrix}\mathtt{0}\\\mathtt{1}\end{bmatrix} = \begin{bmatrix}\mathtt{(1)(0) + (2)(0)}\\\mathtt{(1)(0) + (2)(1)}\end{bmatrix}$$

And it does! This matrix is called a projection matrix, and it takes any vector and shmooshes it onto the y-axis. We could do the same for any vector and the x-axis by starting from the identity matrix, zeroing out its second column, and keeping its first column the same.
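The multiplication used in these examples, where each column of the matrix is scaled by one component of the vector and the results are added, can be sketched in a few lines of Python. Storing a matrix as a pair of column vectors and the names `matvec`, `onto_y`, and `onto_x` are assumptions made for this illustration.

```python
def matvec(columns, x):
    """Multiply a 2x2 matrix, stored as a pair of column vectors, by the vector x."""
    c1, c2 = columns
    # Scale the first column by x[0], the second by x[1], then add.
    return (x[0] * c1[0] + x[1] * c2[0],
            x[0] * c1[1] + x[1] * c2[1])


identity = ((1, 0), (0, 1))
onto_y = ((0, 0), (0, 1))  # first column shrunk to (0, 0)
onto_x = ((1, 0), (0, 0))  # second column shrunk to (0, 0)

print(matvec(identity, (1, 2)))  # (1, 2)
print(matvec(onto_y, (1, 2)))    # (0, 2)
print(matvec(onto_x, (1, 2)))    # (1, 0)
```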

You can try out all kinds of different numbers to see their effects. You can do rotations, reflections, and scalings, among other things. The transformation shown at right, for example, where the two column vectors are taken to (1, 1) and (–1, 1), respectively, maps the vector (1, 2) to the vector (–1, 3).

$$\begin{bmatrix}\mathtt{1} & \mathtt{-1}\\\mathtt{1} & \mathtt{\,\,\,\,1}\end{bmatrix}\begin{bmatrix}\mathtt{1}\\\mathtt{2}\end{bmatrix} = \mathtt{1}\begin{bmatrix}\mathtt{1}\\\mathtt{1}\end{bmatrix} + \mathtt{2}\begin{bmatrix}\mathtt{-1}\\\mathtt{\,\,\,\,1}\end{bmatrix} = \begin{bmatrix}\mathtt{(1)(1) + (2)(-1)}\\\mathtt{(1)(1) + (2)(1)}\end{bmatrix}$$

You may notice, by the way, that what we did with the matrix above was to first rotate the column vectors by 45° and then scale them up by a factor of $$\mathtt{\sqrt{2}}$$. We can do each of these transformations with just one matrix. $\begin{bmatrix}\mathtt{\frac{\sqrt{2}}{\,\,2}} & \mathtt{\frac{-\sqrt{2}}{\,\,2}}\\\mathtt{\frac{\sqrt{2}}{\,\,2}} & \mathtt{\,\,\,\,\frac{\sqrt{2}}{2}}\end{bmatrix} \leftarrow \textrm{Rotate by 45}^\circ \textrm{.} \quad \quad \begin{bmatrix}\mathtt{\sqrt{2}} & \mathtt{0}\\\mathtt{0} & \mathtt{\sqrt{2}}\end{bmatrix} \leftarrow \textrm{Scale up by }\sqrt{2}\textrm{.}$

Then, we can combine these matrices by multiplying them to produce the transformation matrix we needed. Each column of the matrix on the right is multiplied by the matrix on the left, in the same way we multiplied vectors above, to get the two column vectors of the resulting matrix. We’ll look at that more in the future.
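As a quick check of that claim, here is a small Python sketch that multiplies the scaling matrix by the rotation matrix. The helper `matmul` and the choice to store matrices as lists of rows are assumptions made for this example. Because of floating-point rounding, the entries come out only approximately equal to 1 and –1.

```python
import math


def matmul(a, b):
    """Product of two 2x2 matrices, each given as a list of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


h = math.sqrt(2) / 2
rotate_45 = [[h, -h],
             [h, h]]
scale_up = [[math.sqrt(2), 0],
            [0, math.sqrt(2)]]

# Rotate first, then scale: the rotation sits on the right.
combined = matmul(scale_up, rotate_45)
for row in combined:
    print([round(entry, 10) for entry in row])
```

A uniform scaling commutes with a rotation, so multiplying in the other order gives the same matrix here. That is not true of matrix products in general.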

## Distance to a Line

I’d almost always prefer to solve a problem using what I already know, if that can be done, than learn something I don’t know in order to solve the problem. After that, I’m happy to see how the new learning relates to what I already know. That’s what I’ll do here. There is a way to use the dot product efficiently to determine the distance of a point to a line, but we already know enough to get at it another way, so let’s start there.

So, suppose we know this information about the diagram at the right: $\mathtt{p=}\begin{bmatrix}\mathtt{4}\\\mathtt{2}\end{bmatrix}, \,\,\,\mathtt{x=}\begin{bmatrix}\mathtt{2}\\\mathtt{1}\end{bmatrix}, \,\,\,\mathtt{r=}\begin{bmatrix}\mathtt{-1}\\\mathtt{-3}\end{bmatrix}$ And we want to know how far $$\mathtt{r}$$ is from the line.

An equation for the distance of $$\mathtt{r}$$ to the line, then—a symbolic way to identify this distance—might be given in words as follows: go to point $$\mathtt{p}$$, then scale to some point on the line. From that point, scale to some point on the vector that is perpendicular to the line until you get to point $$\mathtt{r}$$. In symbols, that could be written as: $\begin{bmatrix}\mathtt{4}\\\mathtt{2}\end{bmatrix}\mathtt{+\,\,\,\, j}\begin{bmatrix}\mathtt{2}\\\mathtt{1}\end{bmatrix}\mathtt{+\,\,\,\,k}\begin{bmatrix}\mathtt{-1}\\\mathtt{\,\,\,\,\,2}\end{bmatrix}\mathtt{\,\,=\,\,}\begin{bmatrix}\mathtt{-1}\\\mathtt{-3}\end{bmatrix}$ With the vector and scalar names, we could write this as $$\mathtt{p + j(p – x) + ka = r}$$. The distance to the line depends on our figuring out what $$\mathtt{k}$$ is. Once we have that, then the distance is just $$\mathtt{\sqrt{(ka_1)^2 + (ka_2)^2}}$$.

We can subtract vectors from both sides of an equation just like we do with scalar values. Subtracting the vector (4, 2) from both sides, we get an equation which can be rewritten as a system of two equations: $\mathtt{j}\begin{bmatrix}\mathtt{2}\\\mathtt{1}\end{bmatrix}\mathtt{+\,\,\,\,k}\begin{bmatrix}\mathtt{-1}\\\mathtt{\,\,\,\,\,2}\end{bmatrix}\mathtt{\,\,=\,\,}\begin{bmatrix}\mathtt{-5}\\\mathtt{-5}\end{bmatrix} \rightarrow \left\{\begin{array}{c}\mathtt{2j – k = -5} \\ \mathtt{j + 2k = -5}\end{array}\right.$

Solving that system gives us $$\mathtt{j = -3}$$ and $$\mathtt{k = -1}$$. So, the distance of $$\mathtt{r}$$ to the line is $$\mathtt{\sqrt{5}.}$$
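If you would rather let a computer solve the system, here is a sketch in Python. It uses Cramer's rule, which is one of several ways to solve a two-by-two system and not necessarily how you would do it by hand. The function name `solve_2x2` is made up for this example.

```python
import math


def solve_2x2(c1, c2, rhs):
    """Solve j*c1 + k*c2 = rhs for the scalars (j, k) using Cramer's rule."""
    det = c1[0] * c2[1] - c1[1] * c2[0]
    if det == 0:
        raise ValueError("the two vectors are parallel, so there is no unique solution")
    j = (rhs[0] * c2[1] - rhs[1] * c2[0]) / det
    k = (c1[0] * rhs[1] - c1[1] * rhs[0]) / det
    return j, k


p = (4, 2)
along_line = (2, 1)   # p - x
a = (-1, 2)           # perpendicular to the line
r = (-1, -3)

rhs = (r[0] - p[0], r[1] - p[1])        # (-5, -5)
j, k = solve_2x2(along_line, a, rhs)
distance = math.hypot(k * a[0], k * a[1])

print(j, k)       # -3.0 -1.0
print(distance)   # the square root of 5, about 2.236
```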

### Can We Get to the Dot Product?

Maybe we can get to the dot product. I’m not sure at this point. But there are some interesting things to point out about what we’ve already done. First, we can see that the vector $$\mathtt{j(p-x)}$$ is a scaling of vector $$\mathtt{(p-x)}$$ along the line, which, when added to $$\mathtt{p}$$, brings us to the right point on the line where some scaling of the perpendicular $$\mathtt{a}$$ can intersect to give us the distance. The scalar $$\mathtt{j=-3}$$ tells us to reverse the vector (2, 1) and stretch it by a factor of 3. Adding to $$\mathtt{p}$$ means that all of that happens starting at point $$\mathtt{p}$$.

Then the scalar $$\mathtt{k=-1}$$ reverses the direction of $$\mathtt{a}$$ to take us to $$\mathtt{r}$$.

We can then use this diagram to at least show how the dot product gets us there. We modify it a little to include the parts we will need and talk about.

Okay, here we go. Let’s consider the dot product $$\mathtt{-a \cdot (r – p)}$$. We know that since $$\mathtt{-a}$$ and $$\mathtt{x-p}$$ are perpendicular, their dot product is 0, but this is $$\mathtt{r-p}$$, not $$\mathtt{x-p}$$. So, $$\mathtt{-a \cdot (r – p)}$$ will likely have some nonzero value. That dot product is $\mathtt{-a \cdot (r – p) = |-a||r-p|\textrm{cos}(θ)}$ We got this by rearranging the formula we saw here.

We also know, however, that we can use the cosine of the same angle in representing the distance, d: $\mathtt{d=|r-p|\textrm{cos}(θ)}$

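Putting those two equations together, we get $$\mathtt{d = \frac{-a \cdot (r – p)}{|-a|}}$$, which for our vectors is $$\mathtt{\frac{5}{\sqrt{5}} = \sqrt{5}}$$, the same distance we found before.

Since $$\mathtt{|-a| = |a|}$$, dropping the negative in front of $$\mathtt{a}$$ changes only the sign of the result, giving $$\mathtt{d = \frac{a \cdot (r – p)}{|a|}}$$. You may want to play around with it to convince yourself of that. A nice feature of determining the distance this way is that the distance is signed: with $$\mathtt{a}$$ pointing above the line, as it does here, the result is negative for points below the line and positive for points above it.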
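A short Python sketch of the signed-distance formula $$\mathtt{d = \frac{a \cdot (r – p)}{|a|}}$$ follows. The function name `signed_distance` and the two extra test points are assumptions added for illustration; only p, a, and r come from the example above.

```python
import math


def signed_distance(a, p, r):
    """Signed distance from point r to the line through p that is perpendicular to a."""
    diff = (r[0] - p[0], r[1] - p[1])
    return (a[0] * diff[0] + a[1] * diff[1]) / math.hypot(a[0], a[1])


a = (-1, 2)   # points to the upper side of the line
p = (4, 2)

print(signed_distance(a, p, (-1, -3)))  # negative: r is below the line
print(signed_distance(a, p, (1, 3)))    # positive: this point is above the line
print(signed_distance(a, p, (2, 1)))    # 0.0: this point is on the line
```

The absolute value of the first result is the square root of 5, matching the distance found by solving the system.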

## Makin’ Copies


At the heart of many calls to improve education is the taken-for-granted notion that because the world is now changing so rapidly, it is better for schools to focus on producing innovative and critical thinkers and ‘not just’ knowledgable students. The common instructional approach deployed, at all scales, to produce this effect—whether it is inquiry learning or personalized learning—is to remove or dramatically lessen the influence of knowledgable others.


But important research on learning strategies in the wild shows that, at the very least, different intuitions are possible here. Researchers discovered—much to their surprise—that, in a rapidly changing environment, copying the effective behaviors of knowledgable others (social learning) could be a much more effective learning strategy than learning directly from the environment (asocial learning). This result held even when social learning was “noisy” and asocial learning was noise free.

The team has gone on to further investigate and apply their findings to other animal studies, and a book, Darwin’s Unfinished Symphony, was released just last year, detailing their work.

### Social Learning Strategies Tournament

The method used for this research was a tournament in which the researchers designed a computer simulation environment and entrants to the tournament (104 in all) designed ‘agents’ that competed to survive in the generated environment by learning behaviors and applying them to receive payoffs for those behaviors. Each agent had three possible moves it could play: Observe, Innovate, or Exploit. The first two of these moves—Observe and Innovate—were learning moves, which allowed the agent to acquire new behaviors (or not in some cases), and the third move, Exploit, allowed agents to apply their acquired behaviors to receive a payoff (or not, depending on the environment and the behavior). As was mentioned above, Observe moves were “noisy,” whereas Innovate moves were noise free:

Innovate represented asocial learning, that is, individual learning stemming solely through direct interaction with the environment, for example, through trial and error. An Innovate move always returned accurate information about the payoff of a randomly selected behavior previously unknown to the agent. Observe represented any form of social learning or copying through which an agent could acquire a behavior performed by another individual, whether by observation of or interaction with that individual. An Observe move returned noisy information about the behavior and payoff currently being demonstrated in the population by one or more other agents playing Exploit. Playing Observe could return no behavior if none was demonstrated or if a behavior that was already in the agent’s repertoire is observed and always occurred with error, such that the wrong behavior or wrong payoff could be acquired. The probabilities of these errors occurring and the number of agents observed were parameters we varied.

### Some Key Findings


It was not effective to play a lot of learning moves. But when learning moves were played, agents which relied almost exclusively on Observe outperformed the rest, and an increase in copying was strongly positively correlated with higher payoffs. When the winning agent (called DISCOUNTMACHINE) was modified to learn only through Innovate moves, it placed last.

Even when learning by copying was made noisier—the probability and size of copying errors increased—agents which relied on it heavily still did best.

Finally, agents who combined asocial and social learning in more balanced ways (winning agents used social learning at least 95% of the time) performed worse than those who opted for social learning most of the time.

### Why Copying Is Effective

It must be underscored, again, that, in more naturalistic environments there is a cost to asocial learning that copying does not have. Learning by observation is safer than learning by interacting directly with the environment, alone. But in this simulation, that cost was erased. And social learning (copying) STILL outperformed innovation, even when social learning was noisy (Observe “failed to introduce new behavior into an agent’s repertoire in 53% of all the Observe moves in the first tournament phase, overwhelmingly because agents observed behaviors they already knew”).

So, why was copying effective? The researchers boiled it down to being surrounded by rational agents, which I choose to rephrase as “knowledgable adults”:

Social learning proved advantageous because other agents were rational in demonstrating the behavior in their repertoire with the highest payoff, thereby making adaptive information available for others to copy. This is confirmed by modified simulations wherein social learners could not benefit from this filtering process and in which social learning performed poorly. Under any random payoff distribution, if one observes an agent using the best of several behaviors that it knows about, then the expected payoff of this behavior is much higher than the average payoff of all behaviors, which is the expected return for innovating. Previous theory has proposed that individuals should critically evaluate which form of learning to adopt in order to ensure that social learning is only used adaptively, but a conclusion from our tournament is that this may not be necessary. Provided the copied individuals themselves have selected the best behavior to perform from at least two possible options, social learning will be adaptive.
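The expected-payoff argument in that quote can be checked with a toy simulation. This is only a sketch under assumptions of my own: payoffs are drawn uniformly between 0 and 1, each demonstrator knows two behaviors and shows the better one, and copying is noise free. It is not the tournament's code, and `average_payoffs` is a made-up name.

```python
import random


def average_payoffs(known=2, trials=100_000, seed=0):
    """Compare innovating (one random behavior) with copying a demonstrator
    who performs the best of `known` random behaviors."""
    rng = random.Random(seed)
    innovate_total = 0.0
    copy_total = 0.0
    for _ in range(trials):
        # Innovating returns the payoff of one randomly chosen behavior.
        innovate_total += rng.random()
        # Copying returns the best payoff the demonstrator knows about.
        copy_total += max(rng.random() for _ in range(known))
    return innovate_total / trials, copy_total / trials


innovate_avg, copy_avg = average_payoffs()
print(innovate_avg, copy_avg)
```

With uniform payoffs the two averages should land near 1/2 and 2/3, so even a demonstrator choosing between just two options makes copying the better bet in this simplified setting.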

Any takeaways for education from this will be stretches. The research was a computer simulation, after all. But, whatever. My takeaway from all this is that, as long as there are knowledgable adults around, we should encourage students to learn directly from them. A milder takeaway (or maybe stronger, depending on your point of view): regardless of how adept you feel yourself to be in your social world, social worlds are not intuitive. What seems to make sense to you as a strong connection between ideas A and B (in this case, changing world → promote innovation) will not necessarily be effective just because a lot of people believe it and it makes intuitive sense. The way to change that is not to stop making those arguments, because few people do. The way to change it is to stop forwarding those kinds of arguments along when they are made. That way, the behavior won’t be copied. : )

### Coda

I should add, by way of the quote below from Darwin’s Unfinished Symphony, that, although copying was a more successful strategy than innovating, it was not, by itself, the reason for success. What made the difference was better, more efficient, more accurate copying behaviors:

The tournament teaches us that natural selection will tend to favor those individuals who exhibit more efficient, more strategic, and higher-fidelity (i.e., more accurate) copying over others who either display less efficient or exact copying, or are reliant on asocial learning.

## Dot Product Deep(ish) Dive

The dot product is helpful in finding the distance of a point to a line. The dot product, as we mentioned here, is the sum of the element-wise products of the vector components. Given two vectors $$\mathtt{v}$$ and $$\mathtt{w}$$, their dot product is $\begin{bmatrix}\mathtt{v_1}\\\mathtt{v_2}\end{bmatrix} \cdot \begin{bmatrix}\mathtt{w_1}\\\mathtt{w_2}\end{bmatrix}\mathtt{= v_1w_1 + v_2w_2}$

The result of this computation is not another vector, but just a number, a scalar quantity. And, given that the dot product of two perpendicular vectors is 0, it would be nice if the dot product were related to cosine in some way, since the cosine of 90° is also 0. So let’s take a look at some vector pairs and their dot products and think about any patterns we see.

$$\mathtt{v \cdot w=-4}$$     $$\mathtt{θ=180^{\circ}}$$     $$\mathtt{\textrm{cos}(θ)=-1}$$

$$\mathtt{v \cdot w=-2}$$     $$\mathtt{θ=120^{\circ}}$$     $$\mathtt{\textrm{cos}(θ)=-\frac{1}{2}}$$

$$\mathtt{v \cdot w=4}$$     $$\mathtt{θ=45^{\circ}}$$     $$\mathtt{\textrm{cos}(θ)=\frac{\sqrt{2}}{2}}$$

$$\mathtt{v \cdot w=2}$$     $$\mathtt{θ=60^{\circ}}$$     $$\mathtt{\textrm{cos}(θ)=\frac{1}{2}}$$

Well, so, the dot products have the same signs as the cosines. That’s a start. And in all but one case shown, we can divide the dot product by 4 to get the cosine. What makes the 45° case different?

Each of the vectors shown, with the exception of the vector (2, 2), has a length, a magnitude, of 2. To determine the magnitude, or length, of a vector, you treat the components of the vector as the legs of a right triangle and the vector itself as the hypotenuse. So, $|\begin{bmatrix}\mathtt{-1}\\\mathtt{\sqrt{3}}\end{bmatrix}|=\sqrt{(-1)^2+(\sqrt{3})^2}=2$

But the length of (2, 2) is $$\mathtt{\sqrt{8}}$$. If we were to give that vector a length of 2, without changing the angle between v and w, then the vector would become ($$\mathtt{\sqrt{2}, \sqrt{2}}$$). And, lo, the dot product would become $$\mathtt{2\sqrt{2}}$$, which, when divided by 4, would yield the cosine.

The 4 that we divide by isn’t random. It’s the product of the lengths of the vectors. If we leave the 45° angled vectors alone, the product of their lengths is $$\mathtt{2\sqrt{8}}$$. Dividing 4 by this product does indeed yield the correct cosine. So, we have an initial conjecture that the dot product of two vectors v and w relates to cosine like this: $\mathtt{\frac{v \cdot w}{|v||w|} = cos(θ)}$

Perpendicular vectors will still have a dot product of 0 with this formula, so that’s good. And we can scale the vectors however we want and the cosine should remain the same—as it should be—though it may take a little manipulation to see that that’s true. But we are still left with the puzzle of proving this conjecture, more or less, or at least demonstrating to our satisfaction that the result is general.

Although the derivation doesn’t go beyond the Pythagorean Theorem, really, it gets a little symbol heavy, so let’s start with something simpler. We can write the cosine of θ at the right as $\mathtt{\textrm{cos}(θ)=\frac{|w|}{|v|}}$ If we think of w here as truly horizontal, its length is simply $$\mathtt{v_1}$$, the length of the horizontal component of v. Combining this fact with the length of v, we can rewrite the cosine equation above as $\mathtt{\textrm{cos}(θ)=\frac{v_1}{\sqrt{v_{1}^2+v_{2}^2}}}$

Since w is horizontal (has a second component of 0), the dot product $$\mathtt{v \cdot w}$$ becomes simply $$\mathtt{v_{1}^2}$$. Dividing this by the product of the lengths of the vectors v and w (where the length of w is just $$\mathtt{v_1}$$), we get this equation for cosine: $\mathtt{\,\,\,\,\,\textrm{cos}(θ)=\frac{v_{1}^2}{(v_1)(\sqrt{v_{1}^2+v_{2}^2})}}$ And that’s clearly equal to the above. So, while it is by no means definitive, we can have a little more confidence at this point that we have the right equation for cosine using the dot product. We can get more formal and sure about it later. Next time we’ll look at how it can help us determine the distance from a point to a line.
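To test the conjectured formula numerically, here is a Python sketch. The vectors (2, 2) and (–1, √3) are mentioned in the text. The other vectors, including the horizontal vector (2, 0) paired with each of them, are my guesses at what the figures showed, chosen so that the dot products match the values listed earlier.

```python
import math


def dot(v, w):
    """Sum of the element-wise products of two 2D vectors."""
    return v[0] * w[0] + v[1] * w[1]


def length(v):
    """Magnitude of a 2D vector, from the Pythagorean Theorem."""
    return math.hypot(v[0], v[1])


def angle_degrees(v, w):
    """Angle between v and w, using cos(theta) = (v . w) / (|v||w|)."""
    cos_theta = dot(v, w) / (length(v) * length(w))
    cos_theta = max(-1.0, min(1.0, cos_theta))  # guard against rounding error
    return math.degrees(math.acos(cos_theta))


w = (2, 0)  # assumed horizontal vector
others = [(-2, 0), (-1, math.sqrt(3)), (2, 2), (1, math.sqrt(3))]

for v in others:
    print(round(dot(v, w), 10), round(angle_degrees(v, w), 6))
```

The loop should report dot products of –4, –2, 4, and 2 alongside angles of 180°, 120°, 45°, and 60°, up to rounding.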

## Implicit Equations for Lines

So, I’ve covered parametric lines already. Another form in which we can write equations for lines using linear algebra is implicit form.

The parameter in the parametric form of a line was a scalar $$\mathtt{k}$$. We built the parametric form using a position vector to get us to a starting point on the line. Then we added this to the product of the slope vector and the parameter $$\mathtt{k}$$ to get all the other points on the line. The implicitness of the implicit form comes from the fact that we build the equation using the slope vector and a vector perpendicular to the slope vector.

I mentioned back here that perpendicular vectors always have a dot product of 0. So, thinking of $$\mathtt{x-p}$$ as the (slope) vector of our line and $$\mathtt{a}$$ as a vector perpendicular to it, we have $$\mathtt{a \cdot (x-p) = 0}$$. With the actual values shown here, we have $\begin{bmatrix}\mathtt{1}\\\mathtt{3}\end{bmatrix} \cdot \begin{bmatrix}\mathtt{\,\,\,\,3}\\\mathtt{-1}\end{bmatrix}\mathtt{ = (1)(3) + (3)(-1) = 0}$ If we know one point $$\mathtt{p}$$ on the line, then the dot product equation is true for every point $$\mathtt{x}$$ on the line, and only for those points, so it identifies a unique line. Let’s represent all the parts here as vectors, and more generally. $\begin{bmatrix}\mathtt{a_1}\\\mathtt{a_2}\end{bmatrix} \cdot (\begin{bmatrix}\mathtt{x_1}\\\mathtt{x_2}\end{bmatrix} \mathtt{- }\begin{bmatrix}\mathtt{p_1}\\\mathtt{p_2}\end{bmatrix}) \mathtt{\,\,= 0 \rightarrow a_1x_1 + a_2x_2 + (-a_1p_1 – a_2p_2) = 0}$

What’s cool about this equation is that we are all familiar with its form, $$\mathtt{ax_1 + bx_2 + c = 0}$$, so long as we let $$\mathtt{a = a_1, b = a_2,}$$ and $$\mathtt{c = -a_1p_1 – a_2p_2}$$. This is what is called the general form or standard form of a linear equation. Even more interesting is that the coefficients in this form help to describe a vector perpendicular to the line.

Knowing the above, and writing $$\mathtt{x:=x_1}$$ and $$\mathtt{y:=x_2}$$, we can write the equation for the line at the right, which passes through the point (–5, 6) and is perpendicular to a = (4, 3), as $\mathtt{4x+3y-(4)(-5)-(3)(6)=0}$ And then, working out that c-value, we get $$\mathtt{4x + 3y + 2 = 0}$$. The vector a = (4, 3) is perpendicular to the line, so its components, rewritten as the ratio –4 : 3, give the slope of the line.
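Building the general form from a perpendicular vector and a point takes only a couple of lines of Python. This is a sketch; the function names `implicit_line` and `is_on_line` are illustrative, and the point (–5, 6) is read off from the c-value computation above.

```python
def implicit_line(a, p):
    """Coefficients (a1, a2, c) of a1*x + a2*y + c = 0 for the line through
    point p that is perpendicular to the vector a."""
    c = -(a[0] * p[0] + a[1] * p[1])
    return a[0], a[1], c


def is_on_line(coeffs, point):
    """True when the point satisfies a1*x + a2*y + c = 0."""
    a1, a2, c = coeffs
    return a1 * point[0] + a2 * point[1] + c == 0


coeffs = implicit_line((4, 3), (-5, 6))
print(coeffs)                        # (4, 3, 2)
print(is_on_line(coeffs, (-5, 6)))   # True
print(is_on_line(coeffs, (-2, 2)))   # True: 3 right and 4 down from (-5, 6)
print(is_on_line(coeffs, (0, 0)))    # False
```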

Now we can easily slide back and forth between linear algebra and plain old current high school algebra with certain linear equations.


## Line Segments with Linear Algebra

We saw last time that the parametric equation of a line is given by $$\mathtt{l(k) = p + kv}$$, where p is a point on the line (written as a vector), v is a free vector indicating the slope of the line, and k is a scalar value called the parameter. Turning the knob to change k gives you different points on the line. At the right is the line

$\mathtt{l(k) =} \begin{bmatrix}\mathtt{1}\\\mathtt{3}\end{bmatrix} + \begin{bmatrix}\mathtt{\,\,\,\,1}\\\mathtt{-1}\end{bmatrix}\mathtt{k}$

Substituting different numbers for k gives us different points on the line. These resolve into position vectors.

This setup makes it fairly easy to make a line segment, and to partition that line segment into any ratio you want (this will be our ‘current high school connection’ for this post).

When $$\mathtt{k = 0}$$, we get our position vector back: $$\begin{bmatrix}\mathtt{1}\\\mathtt{3}\end{bmatrix}$$. This is the point (1, 3). When $$\mathtt{k = 1}$$, we have … $\begin{bmatrix}\mathtt{1}\\\mathtt{3}\end{bmatrix} + \begin{bmatrix}\mathtt{\,\,\,\,1 \cdot 1}\\\mathtt{-1 \cdot 1}\end{bmatrix} = \begin{bmatrix}\mathtt{1 + 1}\\\mathtt{3 + (-1)}\end{bmatrix}$

… which is the point (2, 2). And so on to generate all the points on the line. To generate a line segment from (1, 3) to, say, the point where the line crosses the x-axis, we first have to figure out where the line crosses the x-axis. We can do this by inspection to see that it crosses at (4, 0), but let’s set it up too. We start by setting the line equal to the point (x, 0) and solving the resulting system of equations: $\begin{bmatrix}\mathtt{1}\\\mathtt{3}\end{bmatrix} + \begin{bmatrix}\mathtt{\,\,\,\,1}\\\mathtt{-1}\end{bmatrix}\mathtt{k} = \begin{bmatrix}\mathtt{x}\\\mathtt{0}\end{bmatrix} \rightarrow \left\{\begin{array}{c}\mathtt{1+k=x}\\\mathtt{3-k=0}\end{array}\right.$

Adding the equations, we get x = 4, so (4, 0) is indeed where the line crosses the x-axis. To generate points on the line segment from (1, 3) to (4, 0), we use position vectors for both endpoints. Then we can use what’s called a convex combination of the two endpoints, which is just extremely fancy wording for a weighted sum whose coefficients are nonnegative and add up to 1. We scale the second position vector, (4, 0), by some k between 0 and 1, and we scale the first position vector, (1, 3), by 1 – k. $\mathtt{l(k) =} \begin{bmatrix}\mathtt{1}\\\mathtt{3}\end{bmatrix}\mathtt{(1-k)} + \begin{bmatrix}\mathtt{4}\\\mathtt{0}\end{bmatrix}\mathtt{k}$

Want the line segment divided into fifths? Then just use k values in steps of one fifth, from 0 to 5 fifths. Swap the coefficients k and 1 – k to partition the line segment starting from the other endpoint.
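Here is a sketch of that partitioning in Python, using exact fractions so the fifths stay readable. The function name `segment_point` is an assumption for this example.

```python
from fractions import Fraction


def segment_point(start, end, k):
    """Point (1 - k)*start + k*end on the segment, for k between 0 and 1."""
    return tuple((1 - k) * s + k * e for s, e in zip(start, end))


start, end = (1, 3), (4, 0)
fifths = [segment_point(start, end, Fraction(i, 5)) for i in range(6)]

for x, y in fifths:
    print(x, y)
```

The first and last points printed are the endpoints (1, 3) and (4, 0), and the points in between sit at equal steps of (3/5, –3/5) along the segment.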

## Lines the Linear Algebra Way

Let’s continue with the idea of reinterpreting some high school algebra concepts in the light of linear algebra. For example, we learn even before high school in some cases that a line on a coordinate plane can be defined by two points or it can be defined by a point and the slope of the line.

When we have two points, $$\mathtt{(x_1, y_1)}$$ and $$\mathtt{(x_2, y_2)}$$, we can determine the slope with $\mathtt{\frac{y_2 – y_1}{x_2 – x_1}}$

and then do some substitutions to work out the y-intercept.

The linear algebra way uses vectors, of course. And all we need is a point and a vector to define a line. Or, really, two vectors, since the point can be described as a position vector and the slope is also a vector.

We have the line here defined as a vector plus a scaled vector—scaled by k. (See here for adding vectors and here for scaling them.) $\color{brown}{\begin{bmatrix}\mathtt{1}\\\mathtt{3}\end{bmatrix}} + \color{blue}{\begin{bmatrix}\mathtt{\,\,\,\,1}\\\mathtt{-1}\end{bmatrix}\mathtt{k}}$ That second, scaled, vector looks like it could do the job of defining the line all by itself, but free vectors like that don’t have a fixed location, so we need a position vector to “fix” that. In general terms, thinking about the free vector as extending from $$\mathtt{(x_1, y_1)}$$ to $$\mathtt{(x_2, y_2)}$$, we can write the equation for a line as $\mathtt{l(k) = }\begin{bmatrix}\mathtt{x_1}\\\mathtt{y_1}\end{bmatrix} + \begin{bmatrix}\mathtt{x_2 – x_1}\\\mathtt{y_2 – y_1}\end{bmatrix}\mathtt{k}$ That form is called the parametric form of an equation and can be written as $$\mathtt{l(k) = p + kv}$$, where p is a point (or position vector), v is the free vector, and k is a scalar value—the parameter that we change to get different points on the line.

Let’s put this into the context of a (reworded) word problem:

In 2014, County X had 783 miles of paved roads. Starting in 2015, the county has been building 8 miles of new paved roads each year. At this rate, if n is the number of years after 2014, what function gives the number of miles of paved road there will be in County X? (Assume that no paved roads go out of service.)

The equation we’re after is $$\mathtt{f(n) = 783 + 8n}$$. As a vector function this can be written as $\mathtt{f(n) = }\begin{bmatrix}\mathtt{0}\\\mathtt{783}\end{bmatrix} + \begin{bmatrix}\mathtt{1}\\\mathtt{8}\end{bmatrix}\mathtt{n}$ We can see here, perhaps a little more clearly with the vector representation, that our domain is restricted by the situation. Our parameter n is, at the very least, a nonnegative real number, and really a nonnegative integer, since n = 0 corresponds to 2014 itself.
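The parametric form $$\mathtt{l(k) = p + kv}$$ translates almost directly into code. The sketch below uses function names of my own choosing (`line_point`, `line_through`, `paved_miles`) and treats the domain restriction as a simple check on n.

```python
def line_point(p, v, k):
    """Point on the line l(k) = p + k*v, with p and v given as (x, y) pairs."""
    return (p[0] + k * v[0], p[1] + k * v[1])


def line_through(point_1, point_2):
    """Position vector p and free vector v for the line through two points."""
    v = (point_2[0] - point_1[0], point_2[1] - point_1[1])
    return point_1, v


p, v = line_through((1, 3), (2, 2))
print(v)                     # (1, -1)
print(line_point(p, v, 3))   # (4, 0)


def paved_miles(n):
    """Miles of paved road in County X, n whole years after 2014."""
    if n < 0 or n != int(n):
        raise ValueError("n must be a nonnegative whole number of years")
    years_after_2014, miles = line_point((0, 783), (1, 8), n)
    return miles


print(paved_miles(0))   # 783
print(paved_miles(5))   # 823
```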

It seems to me that here is at least one other example of a close relationship between linear algebra and current high school algebra instruction that would make absorbing linear algebra into some high school material feasible.

If you’d like to practice with some items related to this post, visit Linear Algebra Exercises I.

## How to Use Guzinta Math

• Adults and students work together to complete modules—at school, at home, or both.
• Students check in regularly. When their Practice Meter for a lesson is in the red, they should complete one or more modules in that lesson to get their Practice Meter to blue (or at least out of red).
• Adults check in 1 day, 1 week, and 1 month after first going through a lesson module and require that all practice levels be out of the red or in the blue on that day.
• Your Practice Meter levels are saved, even if you uninstall and then reinstall the application. To reset a Practice Meter for a lesson, click on the Guzinta Math logo at the center of the lesson homepage.

First things first: head over to the download page and install the application. When you start the application each time, you start at the homepage, which shows all 15 lessons for Grade 6.

Completing at least one module in one of the 15 lessons activates your Practice Meter (explained below) for that lesson, and you will see a meter level to the right of the lesson on the homepage. This lets you see, at a glance, which concepts need your attention. Above, you can see that I’ve completed at least one module in the Ratio Names lesson (a green bar is shown to the right of the lesson). The Instructor Notes link at the bottom right lets you download the complete PDF of all the Instructor Notes for the grade level (279 pp).

Lesson Structure

Click on one of the lessons to be taken to the lesson homepage. Every lesson (with the exception of Plotting Ratios) in Grade 6 contains five modules. At the right, the lesson homepage for Ratio Names is shown. Notice the green Practice Meter level in the center icon. This mirrors the level shown on the main home page. Hover over this icon to see the numeric Practice Meter level (the one shown is at 36 currently).

Notice also the Home button in the bottom right corner of the lesson homepage. This can be found on all lesson homepages and will take you back to the main home page.

The 3 main modules, located in the center quad-panel, are numbered 1 to 3 (Algebraic Expressions shown at right). The fourth square in the quad-panel is a link to the Instructor Notes for the lesson. Click on that to download a PDF of these notes.

The modules do not necessarily have to be completed in any particular order. However, completing them in the order given is recommended.

The first module in every lesson is labeled Guided Practice (Equations and Inequalities shown at left). This means that if it is used in a classroom, it should be used as an activity centerpiece involving teachers and students. If it is used at home, the first module in particular should be the focus of both parent(s) and student.

The Instructor Notes provide an outline for interactive teaching and learning discussions with this material. It is recommended that every module—when first completed—be done together with adult and student.

To use Guzinta Math at home, a teacher may assign a module for homework or practice. Parent and student then discuss and complete the module together and, if requested, the student returns to school the next day with the completion certificate or emails it. (Parents can follow the Instructor Notes for each module as well.) The material can also be used both at home and at school, or solely at school. In that case, some modules can be completed as a class and others assigned for homework. Adults should not be doing all or even most of the work during these interactions, but they should attend to them rather than plop students down in front of a monitor to complete these activities alone.

Practice Meter

At 25 and below, the Practice Meter color is red. Between 26 and 79, the color is green. And a level of 80 or above makes the color blue. Once a module has been completed, either at school with a teacher or at home with a parent or caregiver, the time is recorded for that lesson, along with the Practice Meter level. As time passes, the Practice Meter level decreases to represent a forgetting of the content.

In the first 7 days, the meter decreases at a rate of about 54% each day. That is, it loses a little more than half its value each day. The Practice Meter level of 36 (green) mentioned above would be about 16 or 17 (red) a day later, if no work is done in the lesson. From 7 days to 28 days, the meter only loses about 17% of its value each day. From 28 to 90 days, only 6% is lost each day. And from 90 days on, only 1% is lost each day.
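Here is a rough Python sketch of the color bands and the decay rates described above. It is only a model under stated assumptions: the text does not say how the application actually computes the level, so I assume the loss compounds once per whole day, that each rate applies to the days inside its band, and that fractional levels between 25 and 26 count as green. The function names `meter_color`, `daily_loss`, and `decayed_level` are hypothetical.

```python
def meter_color(level):
    """Color bands from the text: red at 25 and below, green up to 79, blue at 80 or above."""
    if level <= 25:
        return "red"
    if level < 80:
        return "green"  # assumption: fractional levels below 80 are still green
    return "blue"


def daily_loss(day):
    """Fraction of the level lost on a given day (day 1 is the first day after the work)."""
    if day <= 7:
        return 0.54
    if day <= 28:
        return 0.17
    if day <= 90:
        return 0.06
    return 0.01


def decayed_level(level, days):
    """Assumed model: apply the daily loss once per whole day, compounding."""
    for day in range(1, days + 1):
        level *= 1 - daily_loss(day)
    return level


after_one_day = decayed_level(36, 1)
print(round(after_one_day, 2), meter_color(after_one_day))  # 16.56 red
```

The one-day result agrees with the "about 16 or 17 (red)" figure given in the text for a starting level of 36.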

The purpose of the Practice Meter is to provide a visual indication of forgetting and to alert students, teachers, and parents when it is time to revisit a lesson. Forgetting is very useful for learning, so it is important to allow time for the Practice Meter level to decrease before recharging it. A good schedule to keep would be to use a module together as a class or with parents and students as homework and then check in 1 day, then 1 week, and then each month after this first start. Have students get their meters out of the red or in the blue, at least for these check-ins. The goal is to keep this content alive throughout the year—yes, even if students are repeating the same questions. Repetition is excellent for novice learners!

Time-Released Practice Questions

Because the application timestamps the beginning of a student’s work in a lesson, it can reveal new practice questions over time. The table below shows the number of questions (excluding Module 0) asked in each lesson on Day 1, followed by the extra questions revealed on Days 4, 9, and 22. These days are measured separately for each lesson, and the timer doesn’t start until after adults and students together complete at least one module in the lesson.

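| Lesson | Day 1 | Day 4+ | Day 9+ | Day 22+ |
| --- | --- | --- | --- | --- |
| Ratio Names | 31 | +5 | +8 | +6 |
| Ratio Tables | 28 | +10 | +7 | +5 |
| Comparing Ratios | 45 | +0 | +0 | +0 |
| Plotting Ratios | 12 | +0 | +0 | +0 |
| Measure Conversions | 26 | +6 | +12 | +6 |
| Fraction Division | 28 | +6 | +12 | +4 |
| Long Division | 22 | +6 | +10 | +6 |
| GCF and LCM | 33 | +6 | +6 | +5 |
| Negative Numbers | 31 | +0 | +0 | +0 |
| Order and Absolute Value | 34 | +7 | +6 | +3 |
| Numeric Expressions | 28 | +8 | +8 | +6 |
| Algebraic Expressions | 46 | +0 | +0 | +0 |
| Equations & Inequalities | 29 | +0 | +0 | +0 |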

Across all of Grade 6, students have 424 practice questions on Day 1. Then, over time, this number grows to 651 total practice questions. The zeros in the table mark the 5 lessons that do not release new questions over time (at the moment). The other 10 lessons do.

There are a few reasons for time-releasing new questions: (1) This ratchets up the challenge level for a lesson. The answers for questions revealed on and after Day 4 are not included in the Instructor Notes. And more questions in a lesson means that it becomes slightly more difficult to raise one’s Practice Meter up to any given level (though the difficulty increase is very minor in most cases). (2) It helps break the repetition a little. (3) Transfer is facilitated when students revisit a previously learned topic in a slightly new context.

Contact and Future Work

I’ve written here about work on Guzinta Math that I’ll be getting to in the near and farther future. If you have any questions or technical issues, please email me at qanda[at]guzintamath.com.

Some Polite Suggestions

Here are some recommendations for how to think and behave around this material.

• Ideally, provide guidance in one form or another on EVERY module the first time students go through it. The Instructor Notes provide the answers and some guiding questions for adults for the original practice questions (not the time-released ones). Use those notes to help you guide students. Guidance doesn’t mean you are perched on top of their shoulder, making sure they don’t get anything wrong.
• Take. Your. Time. Students should live with this content—revisiting it regularly to maintain their Practice Meter levels—throughout the entire year. They don’t need to “get through” the content fast.
• It’s possible to revisit the same module of a lesson every time to raise one’s Practice Meter level. That’s okay! But encourage students to complete other modules over time as well.
• Forget whatever label you’ve assigned to your student. “Quick learner”? They still need to work through and revisit this material throughout the year, just like everyone else. Revisiting deepens their understanding and often reveals patches of misunderstanding—where they learned to play the game well but don’t really get it. “Slow learner”? They should be challenged and given high expectations like other students. Again, revisiting over the entire year is key.
• If your student has memorized the answers to questions, that’s fine, at least once. That’s a good cue to let that module sit and allow forgetting to set in for a while and do other things. Also, if you suspect that a student is moving through stuff without thinking, that’s a clue to SIT DOWN WITH THEM AGAIN and work together. Watch the videos again and discuss. Have them explain things to you. Read the Notes. Have them generalize. Work through the exploratory module even when it doesn’t have questions to answer. In short, make them think and elaborate when otherwise they wouldn’t.

## Zukei and Dot Products

Zukei puzzles that ask students to find right triangles seem to rely on an understanding of perpendicularity that is situated more comfortably in linear algebra than in Euclidean geometry. Consider the following, which has a hidden isosceles right triangle in it. Your job is to find the vertices of that isosceles right triangle.

High school students would be expected to look at perpendicularity either intuitively, by searching for square corners, or by using the slope criterion that perpendicular lines have slopes which are negative reciprocals of each other. But it seems a bit much to start treating this puzzle as a coordinate plane and determining equations of lines.

The Dot Product

In all fairness, the dot product is a bit much too. Instead, we can operationalize slopes with negative reciprocals by, for example, starting from any point, counting 1, 2, . . . n to the left or right and then 1, 2, . . . n up or down to get to the next point. From that point, we have to count left-rights in the way we previously counted up-downs and up-downs in the way we counted left-rights, and we have to reverse one of those directions. For the puzzle above, we count 1, 2 to the right from a point and then 1, 2 up to the next point. From that second point, it’s 1, 2 right and then 1, 2 down. It’s a little harder to see our counting and direction-switching rule at work when the slopes are 1 and –1, but, in the Zukei context at least, the slopes have to be 1 and –1, I think, to get an isosceles right triangle if we’re not talking about square corners.

This kind of counting is really treating the possible triangle sides as vectors. And, with perpendicular vectors, we can see that we can get something like one of these two pairs (though perpendicular vectors don’t have to look like this): $\mathtt{\begin{bmatrix} x_1\\x_2 \end{bmatrix} \textrm{and} \begin{bmatrix} -x_2\\ \,\,\,\,x_1 \end{bmatrix} \textrm{or} \begin{bmatrix} x_1\\x_2 \end{bmatrix} \textrm{and} \begin{bmatrix} \,\,\,\,x_2\\-x_1 \end{bmatrix}}$

The dot product is defined as the sum of the element-wise products of the vector components. In the case of perpendicular vectors, the dot product is 0. Here is the dot product of our vectors: $\mathtt{(x_1 \cdot -x_2) + (x_2 \cdot x_1) = 0}$
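As a quick check of that definition, here is a minimal Python sketch. The helper names `dot` and `quarter_turns` are my own, and the sample vector (2, 2) is just an illustration.

```python
def dot(u, v):
    """Sum of the element-wise products of two vectors of equal length."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same number of components")
    return sum(a * b for a, b in zip(u, v))


def quarter_turns(v):
    """The two perpendicular partners of (x1, x2) shown above: (-x2, x1) and (x2, -x1)."""
    x1, x2 = v
    return (-x2, x1), (x2, -x1)


v = (2, 2)
for w in quarter_turns(v):
    print(w, dot(v, w))
# (-2, 2) 0
# (2, -2) 0
```

Swapping the components and negating one of them is the same move as the counting rule described earlier: trade left-rights for up-downs and reverse one direction.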

Some Programming

One reason why this way of defining perpendicularity (with a single value) is helpful is that we avoid nasty zero denominators and, therefore, undefined slopes. With the two vectors at the right, we get $\mathtt{\begin{bmatrix}0\\5\end{bmatrix} \cdot \begin{bmatrix}4\\0\end{bmatrix} = (0)(4) + (5)(0) = 0}$
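To see the difference in practice, here is a small Python sketch comparing the two approaches on the same pair of vectors. The function names `slope` and `is_perpendicular` are hypothetical.

```python
def slope(v):
    """Rise over run; fails with ZeroDivisionError for a vertical vector."""
    return v[1] / v[0]


def is_perpendicular(u, v):
    """Two 2D vectors are perpendicular when their dot product is 0."""
    return u[0] * v[0] + u[1] * v[1] == 0


u, v = (0, 5), (4, 0)

try:
    slope(u)
except ZeroDivisionError:
    print("slope of", u, "is undefined")

print(is_perpendicular(u, v))  # True
```

The slope test needs a special case for vertical vectors, while the dot product test handles every pair with the same arithmetic.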

We can take all of the points and run them through a program to find all the connected perpendicular vectors. The result ((0, 1), (1, 0)), ((0, 1), (2, 3)) below means that the vector connecting (0, 1) and (1, 0) and the vector connecting (0, 1) and (2, 3) are perpendicular.
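A program along these lines might look like the following Python sketch. It is not the original program, and the point list is a hypothetical stand-in for the puzzle's dots, chosen so that it includes the three points named above.

```python
from itertools import combinations


def perpendicular_pairs(points):
    """Find pairs of vectors that share a starting point and have a dot product of 0.

    Each result has the form ((p, q), (p, r)): the vector from p to q is
    perpendicular to the vector from p to r.
    """
    results = []
    for p in points:
        others = [q for q in points if q != p]
        for q, r in combinations(others, 2):
            u = (q[0] - p[0], q[1] - p[1])  # vector from p to q
            v = (r[0] - p[0], r[1] - p[1])  # vector from p to r
            if u[0] * v[0] + u[1] * v[1] == 0:
                results.append(((p, q), (p, r)))
    return results


# Hypothetical dots, not the actual puzzle.
points = [(0, 1), (1, 0), (2, 3), (3, 0)]
print(perpendicular_pairs(points))
# [(((0, 1), (1, 0)), ((0, 1), (2, 3)))]
```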

This gives us all the perpendicular vector pairs, though it doesn’t filter out the pairs with unequal magnitudes, a step we would still need in order to identify the isosceles right triangle.
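One way to add the missing filter is to also compare squared lengths, which avoids square roots. The sketch below is my own, with a hypothetical point list rather than the actual puzzle, and the function name `isosceles_right_corners` is an assumption.

```python
from itertools import combinations


def isosceles_right_corners(points):
    """Return (p, q, r) triples where the angle at p is a right angle
    and the two legs from p to q and from p to r have equal length."""
    found = []
    for p in points:
        others = [q for q in points if q != p]
        for q, r in combinations(others, 2):
            u = (q[0] - p[0], q[1] - p[1])
            v = (r[0] - p[0], r[1] - p[1])
            perpendicular = u[0] * v[0] + u[1] * v[1] == 0
            # Compare squared magnitudes so everything stays in integers.
            equal_legs = u[0] ** 2 + u[1] ** 2 == v[0] ** 2 + v[1] ** 2
            if perpendicular and equal_legs:
                found.append((p, q, r))
    return found


# Hypothetical dots, not the actual puzzle.
points = [(0, 1), (1, 0), (2, 3), (1, 2)]
print(isosceles_right_corners(points))
# [((0, 1), (1, 0), (1, 2))]
```

In this sample set, the vectors from (0, 1) to (1, 0) and from (0, 1) to (2, 3) are perpendicular but have different lengths, so that pair is dropped and only the triangle with equal legs remains.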

There are some Zukei solvers available, though I confess I haven’t looked at any of them. No doubt, one or all of them use linear algebra rather than ordinary coordinate plane geometry to do their magic. It’s about time, I think, we start weaving linear algebra into high school algebra and geometry standards.

Solving Zukei puzzles is not the best justification for bringing linear algebra down into high school, of course. But I hope it can be a salient example of how connected linear algebra can be to a lot of high school content standards.