The Law of Total Probability

Where has this been all my life? The Law of Total Probability is really cool, and it seems accessible enough to be presented in high school, where I think it would be very useful as well, although I’ve never seen it there. For example, from the book Causal Inference in Statistics we get this nice problem (in addition to the quote further below): “Suppose we roll two dice, and we want to know the probability that the second roll (R2) is higher than the first (R1).” The Law of Total Probability can make answering this much more straightforward.

To understand this ‘law,’ we should start by understanding two simple things about probability. First, for any two mutually exclusive events (the events can’t happen together), the probability of $$\mathtt{A}$$ or $$\mathtt{B}$$ is the sum of the probability of $$\mathtt{A}$$ and the probability of $$\mathtt{B}$$:

$$\mathtt{P(A\text{ or }B)\,\,\,\,=\,\,\,\,\,\,P(A)\,\,\,\,+\,\,\,\,P(B)}$$

Second, thinking now of two mutually exclusive events “A and B” and “A and not-B”, we can write the probability of $$\mathtt{A}$$ this way, since if $$\mathtt{A}$$ is true, then either “A and B” or “A and not-B” must be true:

$$\mathtt{P(A)=P(A,B)\,\,+\,\,P(A,\text{not-}B)}$$

In different situations, however, $$\mathtt{B}$$ could take on many different values—for example, the six possible values of one die roll, $$\mathtt{B_1}$$–$$\mathtt{B_6}$$—even while we’re considering just one value for the mutually exclusive event $$\mathtt{A}$$—for example, rolling a 4. The Law of Total Probability tells us that

$$\mathtt{P(A)=P(A,B_1)+\cdots+P(A,B_n)}$$.

If we pull a random card from a standard deck, the probability that the card is a Jack [$$\mathtt{P(J)}$$] will be equal to the probability that it’s a Jack and a spade [$$\mathtt{P(J,C_S)}$$], plus the probability that it’s a Jack and a heart [$$\mathtt{P(J,C_H)}$$], plus the probability that it’s a Jack and a club [$$\mathtt{P(J,C_C)}$$], plus the probability that it’s a Jack and a diamond [$$\mathtt{P(J,C_D)}$$].
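This four-way split can be checked by brute force over a deck. Here is a quick sketch (not from the original post):

```python
from fractions import Fraction

# Build a standard 52-card deck as (rank, suit) pairs.
ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["spades", "hearts", "clubs", "diamonds"]
deck = [(r, s) for r in ranks for s in suits]

# P(J) computed directly.
p_jack = Fraction(sum(1 for r, s in deck if r == "J"), len(deck))

# P(J) computed as a sum over the four mutually exclusive suit events:
# P(J, spade) + P(J, heart) + P(J, club) + P(J, diamond).
p_jack_total = sum(
    Fraction(sum(1 for r, s in deck if r == "J" and s == suit), len(deck))
    for suit in suits
)

print(p_jack, p_jack_total)  # both 1/13
```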

Now with Conditional Probabilities

Where this gets good is when we throw conditional probabilities into the mix. We can make use of the fact that $$\mathtt{P(A,B)=P(A|B)P(B)}$$, where $$\mathtt{P(A|B)}$$ means “the probability of A given B.” For example, the probability of randomly pulling a Jack, given that you pulled spades, is $$\mathtt{\frac{1}{13}}$$, and the probability of randomly pulling a spade is $$\mathtt{\frac{1}{4}}$$. Thus, the probability of pulling the Jack of spades is $$\mathtt{\frac{1}{13}\cdot \frac{1}{4}=\frac{1}{52}}$$. We can, therefore, rewrite the Law of Total Probability this way:

$$\mathtt{P(A)=P(A|B_1)P(B_1)+\cdots+P(A|B_n)P(B_n)}$$

And now we’re ready to determine the probability given in the opening paragraph, $$\mathtt{P(R2>R1)}$$, the probability that a second die roll is greater than the first die roll: $\mathtt{P(R2>R1)=P(R2>R1|R1=1)P(R1=1)+\cdots+P(R2>R1|R1=6)P(R1=6)}$

The final result is $$\mathtt{\frac{5}{6}\cdot \frac{1}{6}+\frac{4}{6}\cdot \frac{1}{6}+\frac{3}{6}\cdot \frac{1}{6}+\frac{2}{6}\cdot \frac{1}{6}+\frac{1}{6}\cdot \frac{1}{6}+\frac{0}{6}\cdot \frac{1}{6}=\frac{5}{12}}$$.
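A short enumeration of all 36 outcomes confirms the computation (a sketch, not part of the original post):

```python
from fractions import Fraction

# Enumerate all 36 equally likely (r1, r2) outcomes of two die rolls.
outcomes = [(r1, r2) for r1 in range(1, 7) for r2 in range(1, 7)]
p_direct = Fraction(sum(1 for r1, r2 in outcomes if r2 > r1), len(outcomes))

# Law of Total Probability: condition on the value of the first roll.
p_total = sum(
    Fraction(sum(1 for r2 in range(1, 7) if r2 > r1), 6) * Fraction(1, 6)
    for r1 in range(1, 7)
)

print(p_direct, p_total)  # both 5/12
```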

Monty Hall and the Colliders

Reading through Judea Pearl’s Book of Why, along with Causal Inference in Statistics, co-authored by Pearl, I came across a nifty, new-to-me explanation of the famous Monty Hall Problem.

To make it clearer, we can start with a toy mathematical problem, $$\mathtt{a+b=c}$$, which I model with the causal diagram below. A causal diagram is, of course, a little inappropriate for modeling a mathematical equation, but it’s also one that students may first use implicitly to think about operations and equations (and, without proper instruction, it’s one adults may use all their lives). Here, we will think of the variables $$\mathtt{a}$$ and $$\mathtt{b}$$ as independent. Changing the number we substitute for $$\mathtt{a}$$ does not affect our choice for $$\mathtt{b}$$ and vice versa. However, $$\mathtt{a}$$ and $$\mathtt{c}$$ are dependent, and $$\mathtt{b}$$ and $$\mathtt{c}$$ are dependent as well. Increasing or decreasing $$\mathtt{a}$$ or $$\mathtt{b}$$ alone (or together) will have an effect on $$\mathtt{c}$$.

But once we fix $$\mathtt{c}$$, or “condition on $$\mathtt{c}$$” as Pearl would write, then $$\mathtt{a}$$ and $$\mathtt{b}$$ become dependent variables. That’s at once clear as a bell and pretty wacky. If we fix $$\mathtt{c}$$ at 10, then changing $$\mathtt{a}$$ will change $$\mathtt{b}$$ and vice versa. But $$\mathtt{a}$$ and $$\mathtt{b}$$ were entirely independent prior to knowing what $$\mathtt{c}$$ was. Afterwards, they’re dependent on each other. The diagram that represents this situation (above) is what Pearl calls a “collider.”

Pearl et al. also use a more everyday example:

Suppose a certain college gives scholarships to two types of students: those with unusual musical talents and those with extraordinary grade point averages. Ordinarily, musical talent and scholastic achievement are independent traits, so, in the population at large, finding a person with musical talent tells us nothing about that person’s grades. However, discovering that a person is on a scholarship changes things; knowing that the person lacks musical talent then tells us immediately that he is likely to have high grade point average. Thus, two variables that are marginally independent become dependent upon learning the value of a third variable (scholarship) that is a common effect of the first two.

For the Monty Hall problem, the collider model looks identical. And the correct model helps us see two things (forgiving some [I hope minor] mathematical sloppiness).

First, the model helps us see that Monty opening the door does not change the probability that you have made the correct initial choice (which is $$\mathtt{\frac{1}{3}}$$). It’s even accurate to say that the probability that the car is behind any of the three doors hasn’t changed from $$\mathtt{\frac{1}{3}}$$. The arrows are pointing the wrong way for those things to be possibilities. In and of itself, the correct model should prevent us from upgrading the probability from $$\mathtt{\frac{1}{3}}$$ to $$\mathtt{\frac{1}{2}}$$ after the freebie goat door is opened.

Second, when Monty opens a door to reveal a goat (thus fixing the value of “door Monty opens”), now changing your choice of door changes the probability that the car is behind that door, since these two variables are now dependent.

Thus, since the probability of being correct must change when I change my door selection, it must change from $$\mathtt{\frac{1}{3}}$$ to something else. And since the other door is the only other option, and all the probabilities in the situation must add to $$\mathtt{1}$$, switching must change the probability to $$\mathtt{\frac{2}{3}}$$.
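A quick simulation backs this up (a sketch of my own, not from the original post):

```python
import random

def play(switch, rng):
    """Play one round of Monty Hall; return True if the final pick wins the car."""
    doors = [0, 1, 2]
    car = rng.choice(doors)
    pick = rng.choice(doors)
    # Monty opens a goat door that is neither your pick nor the car.
    opened = rng.choice([d for d in doors if d != pick and d != car])
    if switch:
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == car

rng = random.Random(0)
n = 100_000
stay_wins = sum(play(False, rng) for _ in range(n)) / n
switch_wins = sum(play(True, rng) for _ in range(n)) / n
print(stay_wins, switch_wins)  # close to 1/3 and 2/3
```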

K-Means Clustering

K-means clustering is one way of taking some data and allowing a computer to do what you do pretty naturally with your eyes and brain—separate the data into distinguishable clusters. For example, in the graph shown below, you can very easily see two clumps of points (points A and D in a clump and points B and C in a clump). A computer, to the extent it sees anything, sees just four points and their coordinates.

Why not just use our eyes and brain? Because once we teach a computer to approximate our ability to cluster 2D or 3D data, it can cluster data with many more than just 2 or 3 components. And then its “seeing” outpaces ours by quite a lot.

Let’s take a look at the instructions a computer could follow to do k-means clustering, and then we’ll dress it all up in linear algebra symbolism some other time. To start, I’ve just made 2 clusters of points (which we know about, but the computer doesn’t) with 2 components each (an x-component and a y-component, i.e., 2D data).

Determining Least Distances

To start, we select a $$\mathtt{k}$$, a number of clusters that we want, remembering that we know right now how many clusters there are but in most situations we would not. We choose $$\mathtt{k=2}$$. Then we place the two cluster points at random locations—here I’ve put $$\mathtt{\color{blue}{k_1}}$$ at $$\mathtt{(2,7)}$$ and $$\mathtt{\color{red}{k_2}}$$ at $$\mathtt{(4,2)}$$.

Next, we calculate the distance from each point to each center. This is the good ol’ Pythagorean Theoremish Euclidean distance of $$\mathtt{\sqrt{(x_2-x_1)^{2}+(y_2-y_1)^{2}}}$$. The cluster that we assign to each point is given by the closest center to that point. You can run the code below to print out the 8 distances.
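The post’s snippet isn’t included here; below is a minimal stand-in that prints the 8 distances. The point coordinates are taken from the discussion that follows (A at (5, 6); B, C, D read off the component means):

```python
from math import dist  # Euclidean distance (Python 3.8+)

# Points (assumed from the post) and the two randomly placed centers.
points = {"A": (5, 6), "B": (2, 4), "C": (2, 3), "D": (6, 5)}
centers = {"k1": (2, 7), "k2": (4, 2)}

distances = {}
for name, p in points.items():
    for cname, c in centers.items():
        distances[(name, cname)] = dist(p, c)
        print(f"{name} to {cname}: {dist(p, c):.2f}")
```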

Going by shorter distances, our first result, then, is to group point A in cluster $$\mathtt{\color{blue}{k_1}}$$, because the first number in that pair is smaller, and points B, C, and D in cluster $$\mathtt{\color{red}{k_2}}$$, because the second number in each of those pairs is the smaller one.

Moving the Centers Based on the Means

The next step—and last before repeating the process—is to move each center to the mean of the points in the cluster to which it is currently assigned. The mean of the points is determined by calculating the mean of the components separately. So, for our current cluster $$\mathtt{\color{red}{k_2}}$$, the points B, C, and D have a mean of ($$\mathtt{\frac{2 + 2 + 6}{3}}$$, $$\mathtt{\frac{4 + 3 + 5}{3}}$$), or ($$\mathtt{\frac{10}{3}}$$, $$\mathtt{4}$$). And since $$\mathtt{\color{blue}{k_1}}$$ has just one point, A, it will move to smack dab on top of that point, at (5, 6). Now we can do another round of distance comparisons, given the new center locations. These calculations give us what we can see automatically—that points A and D belong to one cluster and points B and C belong to another cluster. In this case, A and D belong to cluster $$\mathtt{\color{blue}{k_1}}$$ and B and C belong to cluster $$\mathtt{\color{red}{k_2}}$$.
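Collecting the assign-then-update steps into a loop gives a minimal k-means sketch (my own code, not the post’s; point coordinates as above):

```python
from math import dist

def kmeans(points, centers, iters=10):
    # Repeat: assign each point to its nearest center, then move each
    # center to the componentwise mean of its assigned points.
    clusters = {}
    for _ in range(iters):
        clusters = {i: [] for i in range(len(centers))}
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: dist(p, centers[i]))
            clusters[nearest].append(p)
        centers = [
            tuple(sum(c) / len(pts) for c in zip(*pts)) if pts else centers[i]
            for i, pts in clusters.items()
        ]
    return centers, clusters

points = [(5, 6), (2, 4), (2, 3), (6, 5)]  # A, B, C, D
centers, clusters = kmeans(points, [(2, 7), (4, 2)])
print(centers)  # k1 lands between A and D; k2 between B and C
```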

The cluster centers now move to the means of each pair of points, placing them where we would likely place them to begin with (directly between the two points in the cluster). Further calculations won’t change these assignments, so the k-means algorithm is done when it stops changing things drastically (or at all).

Linear Algebra Exercises I

We’ve done a fair amount of landscaping, as it were—running out a long way in the field of linear algebra to mark interesting points. Now it seems like a good time to turn back and start tending to each area a little more closely. An excellent way to do that—to make things more secure—is to practice.

The sets below can be completed after reading Lines the Linear Algebra Way. I would suggest working on a piece of paper and then checking your answers—one at a time at first, and then going a stretch before checking. Try to complete each exercise before either checking back with the original post or looking at my answer. My answers appear below each exercise.

Convert each equation of a line to a vector equation in parametric form, $$\mathtt{l(k)=p+kv}$$, where $$\mathtt{p}$$ and $$\mathtt{v}$$ are 2D vectors and $$\mathtt{k}$$ is a scalar variable.

1. $$\mathtt{y=x+3}$$

$$\mathtt{l(k)=}\begin{bmatrix}\mathtt{0}\\\mathtt{3}\end{bmatrix}+\begin{bmatrix}\mathtt{1}\\\mathtt{1}\end{bmatrix}\mathtt{k}$$

2. $$\mathtt{y=2x+3}$$

$$\mathtt{l(k)=}\begin{bmatrix}\mathtt{0}\\\mathtt{3}\end{bmatrix}+\begin{bmatrix}\mathtt{1}\\\mathtt{2}\end{bmatrix}\mathtt{k}$$

3. $$\mathtt{y=-2x}$$

$$\mathtt{l(k)=}\begin{bmatrix}\mathtt{0}\\\mathtt{0}\end{bmatrix}+\begin{bmatrix}\mathtt{\,\,\,\,1}\\\mathtt{-2}\end{bmatrix}\mathtt{k}$$,    or just $$\begin{bmatrix}\mathtt{\,\,\,\,1}\\\mathtt{-2}\end{bmatrix}\mathtt{k}$$

4. $$\mathtt{y=x}$$

$$\mathtt{l(k)=}\begin{bmatrix}\mathtt{0}\\\mathtt{0}\end{bmatrix}+\begin{bmatrix}\mathtt{1}\\\mathtt{1}\end{bmatrix}\mathtt{k}$$,    or just $$\begin{bmatrix}\mathtt{1}\\\mathtt{1}\end{bmatrix}\mathtt{k}$$

5. $$\mathtt{y=10}$$

$$\mathtt{l(k)=}\begin{bmatrix}\mathtt{0}\\\mathtt{10}\end{bmatrix}+\begin{bmatrix}\mathtt{1}\\\mathtt{0}\end{bmatrix}\mathtt{k}$$

6. $$\mathtt{x+y=1}$$

$$\mathtt{l(k)=}\begin{bmatrix}\mathtt{0}\\\mathtt{1}\end{bmatrix}+\begin{bmatrix}\mathtt{\,\,\,\,1}\\\mathtt{-1}\end{bmatrix}\mathtt{k}$$

7. $$\mathtt{2x+3y=9}$$

$$\mathtt{l(k)=}\begin{bmatrix}\mathtt{0}\\\mathtt{3}\end{bmatrix}+\begin{bmatrix}\mathtt{\,\,\,\,3}\\\mathtt{-2}\end{bmatrix}\mathtt{k}$$

8. $$\mathtt{x=5}$$

$$\mathtt{l(k)=}\begin{bmatrix}\mathtt{5}\\\mathtt{0}\end{bmatrix}+\begin{bmatrix}\mathtt{0}\\\mathtt{1}\end{bmatrix}\mathtt{k}$$

9. $$\mathtt{-3x-\frac{1}{4}y=15}$$

$$\mathtt{l(k)=}\begin{bmatrix}\mathtt{\,\,\,\,0}\\\mathtt{-60}\end{bmatrix}+\begin{bmatrix}\mathtt{\,\,\,\,1}\\\mathtt{-12}\end{bmatrix}\mathtt{k}$$

10. Devise an algorithm you could use to convert an equation for a line in slope-intercept form to a vector equation for the line in parametric form.

For slope-intercept form, $$\mathtt{y=mx+b}$$, the conversion $$\begin{bmatrix}\mathtt{0}\\\mathtt{b}\end{bmatrix}+\begin{bmatrix}\mathtt{1}\\\mathtt{m}\end{bmatrix}\mathtt{k}\,$$ works for any line that can be written in slope-intercept form. (Vertical lines, which can’t be written that way, need a vertical direction vector instead.)

There are many ways to write the above vector equations. In particular, the intercept vector does not have to have a first component of $$\mathtt{0}$$, and the slope vector does not have to be in simplest form. All that is required is for the intercept vector to get you to some point on the line and for the slope vector to correctly represent the slope of the line.
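The algorithm from the last exercise is short enough to code directly (a sketch; the function name is mine, not from the post):

```python
def slope_intercept_to_parametric(m, b):
    """Convert y = mx + b to l(k) = p + k*v, with p = (0, b) and v = (1, m)."""
    return (0, b), (1, m)

p, v = slope_intercept_to_parametric(2, 3)  # y = 2x + 3
print(p, v)  # (0, 3) (1, 2)
```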

Now let’s go the other way. Identify the slope, y-intercept, and x-intercept of each line.

1. $$\mathtt{l_{1}(k)=}\begin{bmatrix}\mathtt{1}\\\mathtt{2}\end{bmatrix}+\begin{bmatrix}\mathtt{\,\,\,\,2}\\\mathtt{-1}\end{bmatrix}\mathtt{k}$$

slope → $$\mathtt{-\frac{1}{2}}$$, y-intercept → $$\mathtt{\frac{5}{2}}$$, x-intercept → $$\mathtt{5}$$

2. $$\mathtt{l_{2}(k)=}\begin{bmatrix}\mathtt{-3}\\\mathtt{-5}\end{bmatrix}+\begin{bmatrix}\mathtt{0}\\\mathtt{4}\end{bmatrix}\mathtt{k}$$

slope → undefined, y-intercept → none, x-intercept → $$\mathtt{-3}$$

3. $$\mathtt{l_{3}(k)=}\begin{bmatrix}\mathtt{\,\,\,\,0}\\\mathtt{-1}\end{bmatrix}+\begin{bmatrix}\mathtt{2}\\\mathtt{2}\end{bmatrix}\mathtt{k}$$

slope → $$\mathtt{1}$$, y-intercept → $$\mathtt{-1}$$, x-intercept → $$\mathtt{1}$$

4. $$\mathtt{l_{4}(k)=}\begin{bmatrix}\mathtt{\,\,\,\,3}\\\mathtt{-9}\end{bmatrix}+\begin{bmatrix}\mathtt{7}\\\mathtt{0}\end{bmatrix}\mathtt{k}$$

slope → $$\mathtt{0}$$, y-intercept → $$\mathtt{-9}$$, x-intercept → none

5. $$\mathtt{l_{5}(k)=}\begin{bmatrix}\mathtt{-5}\\\mathtt{\,\,\,\,1}\end{bmatrix}+\begin{bmatrix}\mathtt{-3}\\\mathtt{\,\,\,\,4}\end{bmatrix}\mathtt{k}$$

slope → $$\mathtt{-\frac{4}{3}}$$, y-intercept → $$\mathtt{-\frac{17}{3}}$$, x-intercept → $$\mathtt{-\frac{17}{4}}$$

And just two more. Determine whether each point is on the line mentioned above.

1. Is $$\mathtt{(-11,3)}$$ on the line $$\mathtt{l_1}$$?

No: $$\begin{bmatrix}\mathtt{1}\\\mathtt{2}\end{bmatrix}+\begin{bmatrix}\mathtt{\,\,\,\,2}\\\mathtt{-1}\end{bmatrix}\mathtt{k=}\begin{bmatrix}\mathtt{-11}\\\mathtt{\,\,\,\,3}\end{bmatrix}$$.
No value of $$\mathtt{k}$$ solves both $$\mathtt{2k+1=-11}$$ and $$\mathtt{-k+2=3}$$.

2. Is $$\mathtt{(5,5)}$$ on the line $$\mathtt{l_3}$$?

No. Since the line has a slope of $$\mathtt{1}$$, it would contain $$\mathtt{(5,5)}$$ if the line passed through the origin, but it does not.
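Both checks can be automated by solving $$\mathtt{p+kv=point}$$ componentwise and seeing whether one value of $$\mathtt{k}$$ works for every component (a sketch; the helper name is mine, and exact comparison is fine for the integer data here):

```python
def on_line(p, v, point):
    """Return True if point lies on the line l(k) = p + k*v."""
    ks = []
    for pi, vi, ti in zip(p, v, point):
        if vi == 0:
            if ti != pi:       # this component can never match
                return False
        else:
            ks.append((ti - pi) / vi)
    # The same k must solve every component equation.
    return all(k == ks[0] for k in ks)

print(on_line((1, 2), (2, -1), (-11, 3)))  # False: k = -6 vs. k = -1
print(on_line((0, -1), (2, 2), (5, 5)))    # False
print(on_line((1, 2), (2, -1), (3, 1)))    # True, with k = 1
```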

More with Least Squares

One very cool thing about our formula for the least squares regression line, $$\mathtt{\left(X^{T}X\right)^{-1}X^{T}y}$$, is that it is the same no matter whether we have one independent variable (univariate) or many independent variables (multivariate).

Consider these data, showing the selling prices of some grandfather clocks at auction. The first scatter plot shows the age of the clock in years on the x-axis (100–200), and the second shows the number of bidders on the x-axis (0–20). Price (in pounds or dollars) is on the y-axis on each plot (500–2500).

| Age (years) | Bidders | Price (\$) |
|---|---|---|
| 127 | 13 | 1235 |
| 115 | 12 | 1080 |
| 127 | 7 | 845 |
| 150 | 9 | 1522 |
| 156 | 6 | 1047 |
| 182 | 11 | 1979 |
| 156 | 12 | 1822 |
| 132 | 10 | 1253 |
| 137 | 9 | 1297 |
| 113 | 9 | 946 |
| 137 | 15 | 1713 |
| 117 | 11 | 1024 |
| 137 | 8 | 1147 |
| 153 | 6 | 1092 |
| 117 | 13 | 1152 |
| 126 | 10 | 1336 |
| 170 | 14 | 2131 |
| 182 | 8 | 1550 |
| 162 | 11 | 1884 |
| 184 | 10 | 2041 |
| 143 | 6 | 854 |
| 159 | 9 | 1483 |
| 108 | 14 | 1055 |
| 175 | 8 | 1545 |
| 108 | 6 | 729 |
| 179 | 9 | 1792 |
| 111 | 15 | 1175 |
| 187 | 8 | 1593 |
| 111 | 7 | 785 |
| 115 | 7 | 744 |
| 194 | 5 | 1356 |
| 168 | 7 | 1262 |

The first regression line, for the price of a clock as a function of its age, is approximately $$\mathtt{10.5x-192}$$. The second regression line, for the price of a clock as a function of the number of bidders at auction, is approximately $$\mathtt{55x+806}$$. As mentioned above, each of these univariate least squares regression lines can be calculated with the formula $$\mathtt{\left(X^{T}X\right)^{-1}X^{T}y}$$.

Combining both age and number of bidders together, we can calculate, using the same formula, a multivariate least squares regression equation. This of course is no longer a line. In the case of two input variables as we have here, our line becomes a plane.
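Here is a sketch of all three calculations with NumPy, using the data from the table above (the `fit` helper is mine, just for illustration):

```python
import numpy as np

# Grandfather clock auction data from the table above: age, bidders, price.
data = np.array([
    [127, 13, 1235], [115, 12, 1080], [127,  7,  845], [150,  9, 1522],
    [156,  6, 1047], [182, 11, 1979], [156, 12, 1822], [132, 10, 1253],
    [137,  9, 1297], [113,  9,  946], [137, 15, 1713], [117, 11, 1024],
    [137,  8, 1147], [153,  6, 1092], [117, 13, 1152], [126, 10, 1336],
    [170, 14, 2131], [182,  8, 1550], [162, 11, 1884], [184, 10, 2041],
    [143,  6,  854], [159,  9, 1483], [108, 14, 1055], [175,  8, 1545],
    [108,  6,  729], [179,  9, 1792], [111, 15, 1175], [187,  8, 1593],
    [111,  7,  785], [115,  7,  744], [194,  5, 1356], [168,  7, 1262],
], dtype=float)
y = data[:, 2]

def fit(*columns):
    # Least squares via (X^T X)^{-1} X^T y, with X padded by a column of 1s.
    X = np.column_stack([np.ones(len(y)), *columns])
    return np.linalg.inv(X.T @ X) @ X.T @ y

b1, m_age = fit(data[:, 0])                     # price vs. age
b2, m_bid = fit(data[:, 1])                     # price vs. bidders
b3, c_age, c_bid = fit(data[:, 0], data[:, 1])  # price vs. both
print(f"price ~ {m_age:.1f}(age) {b1:+.0f}")
print(f"price ~ {m_bid:.1f}(bidders) {b2:+.0f}")
print(f"price ~ {c_age:.2f}(age) + {c_bid:.2f}(bidders) {b3:+.2f}")
```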

Our final regression equation becomes $$\mathtt{12.74x_{1}+85.82x_{2}-1336.72}$$, with $$\mathtt{x_1}$$ representing the age of a clock and $$\mathtt{x_2}$$ representing the number of bidders.

Linear Algebra Connections

There are so many connections within and applications of linear algebra—I can only imagine that this will be a series of posts, to the extent that I continue writing about the subject. Here are a few connections that I’ve come across in my reading recently.

Compound Interest and Matrix Powers

We can multiply a matrix by itself $$\mathtt{n}$$ times. The result is the matrix to the power $$\mathtt{n}$$. We can use this when setting up a compound interest situation. For example, suppose we have three accounts, which each have a different interest rate compounded annually—say, 5%, 3%, and 2%. Without linear algebra, the amount in the first account can be modeled by the equation $\mathtt{A(t)=p \cdot 1.05^t}$ where $$\mathtt{p}$$ represents the starting amount in the account, and $$\mathtt{t}$$ represents the time in years. With linear algebra, we can group all of the account interest rates into a matrix. The first year for each account would look like this: $\mathtt{A(1)}=\begin{bmatrix}\mathtt{1.05}&\mathtt{0}&\mathtt{0}\\\mathtt{0}&\mathtt{1.03}&\mathtt{0}\\\mathtt{0}&\mathtt{0}&\mathtt{1.02}\end{bmatrix}^\mathtt{1}\begin{bmatrix}\mathtt{p_1}\\\mathtt{p_2}\\\mathtt{p_3}\end{bmatrix}=\begin{bmatrix}\mathtt{1.05p_1}\\\mathtt{1.03p_2}\\\mathtt{1.02p_3}\end{bmatrix}$

For years beyond the first year, all we have to do is raise the matrix to the appropriate power. Since it’s diagonal, squaring it, cubing it, etc., will square, cube, etc., each entry. This makes the computation a little more organized—and more straightforward for programming. (A matrix has to be square, $$\mathtt{m \times m}$$, in order to raise it to a power.) Below we calculate the amount in each account after 100 years.
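A sketch of that calculation with NumPy (the starting amounts of 1000 in each account are an assumption for illustration):

```python
import numpy as np

# Annual growth factors for the three accounts on the diagonal.
rates = np.diag([1.05, 1.03, 1.02])
p = np.array([1000.0, 1000.0, 1000.0])  # assumed starting amounts

# A(100) = R^100 p; powering a diagonal matrix just powers each entry.
amounts = np.linalg.matrix_power(rates, 100) @ p
print(amounts.round(2))
```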

Centroids and Areas

We have seen that the determinant can be thought of as the area of the parallelogram formed by two vectors. We can use this fact to determine the area of a complex shape like the one shown below. Since determinants are signed areas, as we move around the shape counterclockwise, calculating the determinant of each vector pair (and multiplying each determinant by one half, so that we get the area of each triangle), we get the total area of the shape.$\frac{1}{2}\left(\begin{vmatrix}\mathtt{6}&\mathtt{6}\\\mathtt{0}&\mathtt{4}\end{vmatrix}+\begin{vmatrix}\mathtt{6}&\mathtt{3}\\\mathtt{4}&\mathtt{4}\end{vmatrix}+\begin{vmatrix}\mathtt{3}&\mathtt{3}\\\mathtt{4}&\mathtt{6}\end{vmatrix}+\begin{vmatrix}\mathtt{3}&\mathtt{-2}\\\mathtt{6}&\mathtt{\,\,\,\,6}\end{vmatrix}+\begin{vmatrix}\mathtt{-2}&\mathtt{-2}\\\mathtt{\,\,\,\,6}&\mathtt{\,\,\,\,3}\end{vmatrix}+\begin{vmatrix}\mathtt{-2}&\mathtt{0}\\\mathtt{\,\,\,\,3}&\mathtt{3}\end{vmatrix}\right)=\mathtt{36}\text{ units}\mathtt{^2}$

That’s pretty hand-wavy, but it’s something that you can probably figure out with a little experimentation.
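A short function makes the recipe concrete: this is the shoelace formula. The vertex list below is read off the determinant pairs above, with one vertex placed at the origin (an assumption on my part):

```python
def polygon_area(vertices):
    # Half the sum of determinants of consecutive position-vector pairs,
    # traversed counterclockwise (the shoelace formula).
    n = len(vertices)
    total = 0.0
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return total / 2

# Vertices inferred from the determinants above, starting at the origin.
shape = [(0, 0), (6, 0), (6, 4), (3, 4), (3, 6), (-2, 6), (-2, 3), (0, 3)]
print(polygon_area(shape))  # 36.0
```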

Another counterclockwise-moving calculation (though this one can be clockwise without changing the answer) is the calculation of the centroid of a closed shape. All that is required here is to calculate the sum of the position vectors of the vertices of the figure and then divide by the number of vertices (here, five vertices, one of which is at the origin).   $\mathtt{\frac{1}{5}}\left(\begin{bmatrix}\mathtt{0}\\\mathtt{0}\end{bmatrix}+\begin{bmatrix}\mathtt{3}\\\mathtt{4}\end{bmatrix}+\begin{bmatrix}\mathtt{0}\\\mathtt{6}\end{bmatrix}+\begin{bmatrix}\mathtt{-3}\\\mathtt{\,\,\,\,4}\end{bmatrix}+\begin{bmatrix}\mathtt{-1}\\\mathtt{\,\,\,\,3}\end{bmatrix}\right)=\begin{bmatrix}\mathtt{-\frac{1}{5}}\\\mathtt{\,\,\,\,\frac{17}{5}}\end{bmatrix}$
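As a sketch in Python (assuming, as in the area example, that one vertex of the five-vertex figure sits at the origin):

```python
def vertex_centroid(vertices):
    """Mean of the vertex position vectors, computed componentwise."""
    n = len(vertices)
    return (sum(x for x, _ in vertices) / n, sum(y for _, y in vertices) / n)

# The five vertices of the figure, taking the fifth to be the origin.
shape = [(0, 0), (3, 4), (0, 6), (-3, 4), (-1, 3)]
print(vertex_centroid(shape))  # (-0.2, 3.4)
```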

Here again, it’s just magic, but you can figure it out with a little play. In each case—for both complex areas and centroids—we assign a point to be the origin and go from there.

Least Squares with Linear Algebra

In brief, linear regression is about finding the line of best fit to a data set. If you’re looking for the linear algebra way of doing this, you will most likely find it searching for the term least squares.

In the basic scenario, you’ve got some two-dimensional data, $$\mathtt{(x, y)}$$ coordinates, and you want to find the equation for a straight line that is as close as possible to each point. One such data set is shown below.

| x | y |
|---|---|
| 10 | 8.04 |
| 8 | 6.95 |
| 13 | 7.58 |
| 9 | 8.81 |
| 11 | 8.33 |
| 14 | 9.96 |
| 6 | 7.24 |
| 4 | 4.26 |
| 12 | 10.84 |
| 7 | 4.82 |
| 5 | 5.68 |

So, let’s imagine that we haven’t found this line yet. We know what we are looking for is a line of the form $$\mathtt{y=mx+b}$$. In linear algebra terms, we want the vector $$\mathtt{y}$$ (11 rows, 1 column) to equal the vector $$\mathtt{x}$$ (11 rows, 1 column) scaled by an unknown slope $$\mathtt{m}$$, plus an intercept vector $$\mathtt{b}$$ (11 rows, 1 column, with every entry equal to the unknown intercept).

The first problem with this that we have to fix is that $$\mathtt{x}$$ the 11 × 1 vector needs to be $$\mathtt{X}$$ the 11 × 2 matrix so that we get our equations right. So we’ll pad $$\mathtt{x}$$ with some 1s and then we’ll be able to call it $$\mathtt{X}$$. (We can pad on either side, left or right, just remembering to interchange $$\mathtt{b}$$ and $$\mathtt{m}$$ to keep the equations straight.) The second problem we can fix is that we don’t need a separate intercept vector and a separate slope vector. We can combine things so that we form an equivalent matrix equation that means the same thing as $$\mathtt{y=mx+b}$$. We need the equation to look like this: $\begin{bmatrix}\mathtt{8.04}\\\mathtt{6.95}\\\mathtt{7.58}\\\mathtt{8.81}\\\mathtt{8.33}\\\mathtt{9.96}\\\mathtt{7.24}\\\mathtt{4.26}\\\mathtt{10.84}\\\mathtt{4.82}\\\mathtt{5.68}\end{bmatrix} = \begin{bmatrix}\mathtt{1}&\mathtt{10}\\\mathtt{1}&\mathtt{8}\\\mathtt{1}&\mathtt{13}\\\mathtt{1}&\mathtt{9}\\\mathtt{1}&\mathtt{11}\\\mathtt{1}&\mathtt{14}\\\mathtt{1}&\mathtt{6}\\\mathtt{1}&\mathtt{4}\\\mathtt{1}&\mathtt{12}\\\mathtt{1}&\mathtt{7}\\\mathtt{1}&\mathtt{5}\end{bmatrix}\begin{bmatrix}\mathtt{b}\\\mathtt{m}\end{bmatrix}$

Multiply the matrix and vector on the right side of that equation, and you get, for the first equation, $$\mathtt{8.04=b+(m)(10)}$$. That’s equivalent to $$\mathtt{y=mx+b}$$ for the first $$\mathtt{(x, y)}$$ data point $$\mathtt{(10, 8.04)}$$. The matrix and vector setup ensures that all the equations for all the points are of the correct form.

In linear algebra terms, we have rewritten the equation to be $$\mathtt{y=Xv}$$, where $$\mathtt{y}$$ is an 11 × 1 vector, $$\mathtt{X}$$ is an 11 × 2 matrix (padded with some 1s), and $$\mathtt{v}$$ is a 2 × 1 vector which contains the unknown slope $$\mathtt{m}$$ of the best fit line and the unknown intercept $$\mathtt{b}$$.

It’s an Approximation

At this point in the explanation, it’s important to realize that we will make another shift. The first was from the real data to the matrix algebra setup. We shift again below, away from that setup, per se, and toward just finding out what that unknown $$\mathtt{v}$$ is.

As an analogy, the system shown below features a 3D vector (2, 0, –2) which does not live in the plane formed by the two column vectors of the matrix. Thus, there is no solution $$\mathtt{(j, k)}$$. $\begin{bmatrix}\mathtt{1}&\mathtt{\,\,\,\,1}\\\mathtt{1}&\mathtt{-3}\\\mathtt{1}&\mathtt{\,\,\,\,1}\end{bmatrix}\begin{bmatrix}\mathtt{j}\\\mathtt{k}\end{bmatrix}=\begin{bmatrix}\mathtt{\,\,\,\,2}\\\mathtt{\,\,\,\,0}\\\mathtt{-2}\end{bmatrix}\longleftarrow\text{no solutions}$

Similarly, the columns of our $$\mathtt{X}$$ matrix form a plane in 11-dimensional space. In order for $$\mathtt{v}$$ to be a solution to our original matrix equation above, $$\mathtt{y}$$ has to live in this plane too. But we already know that it doesn’t. If it did, the points would lie along some line. It’s true that $$\mathtt{y}$$ is an 11-dimensional vector, just like each column of $$\mathtt{X}$$, but $$\mathtt{y}$$ doesn’t live in the same plane. The closest approximation we can get to $$\mathtt{y}$$ in the plane of $$\mathtt{X}$$ is $$\mathtt{Xv=p}$$, where $$\mathtt{p}$$ is the projection of $$\mathtt{y}$$ onto the plane (another thing we’ll have to come back to). The vector $$\mathtt{q}$$ connecting $$\mathtt{y}$$ with its projection $$\mathtt{p}$$ is perpendicular to $$\mathtt{p}$$ (and, thus, to the plane).

Now we can write some more equations. For example, we know we want $$\mathtt{Xv=p}$$, but it’s also true that $$\mathtt{p+q=y}$$. And we can do some manipulation to show that all the column vectors of $$\mathtt{X}$$ are perpendicular to $$\mathtt{q}$$. Assuming for a second that our $$\mathtt{X}$$ is a 2 × 2 matrix, we pad $$\mathtt{X}$$ with a row of zeros to make it three-dimensional, transpose it (so that multiplication is defined), and multiply it by $$\mathtt{q}$$. The same basic idea applies to our original $$\mathtt{X}$$ (11 × 2) matrix. $\begin{bmatrix}\mathtt{x_{11}}&\mathtt{x_{12}}\\\mathtt{x_{21}}&\mathtt{x_{22}}\end{bmatrix}\rightarrow\begin{bmatrix}\mathtt{x_{11}}&\mathtt{x_{12}}\\\mathtt{x_{21}}&\mathtt{x_{22}}\\\mathtt{0}&\mathtt{0}\end{bmatrix}\rightarrow\begin{bmatrix}\mathtt{x_{11}}&\mathtt{x_{21}}&\mathtt{0}\\\mathtt{x_{12}}&\mathtt{x_{22}}&\mathtt{0}\end{bmatrix}$ $\begin{bmatrix}\mathtt{x_{11}}&\mathtt{x_{21}}&\mathtt{0}\\\mathtt{x_{12}}&\mathtt{x_{22}}&\mathtt{0}\end{bmatrix}\begin{bmatrix}\mathtt{0}\\\mathtt{0}\\\mathtt{q_3}\end{bmatrix}=\begin{bmatrix}\mathtt{0}\\\mathtt{0}\end{bmatrix}$ This gives us a key equation: $$\mathtt{X^{T}q=0}$$.

Since $$\mathtt{q=y-p}$$, substituting for $$\mathtt{q}$$ gets us $$\mathtt{X^{T}(y-p)=0}$$. Then since $$\mathtt{Xv=p}$$, substituting for $$\mathtt{p}$$ brings us to $$\mathtt{X^{T}(y-Xv)=0}$$. Distribute to get $$\mathtt{X^{T}y-X^{T}Xv=0}$$, which means $$\mathtt{X^{T}y=X^{T}Xv}$$. Multiply both sides on the left by the inverse of $$\mathtt{X^{T}X}$$, and the vector $$\mathtt{v}$$ that we’re after, then, is $\mathtt{v=(X^{T}X)^{-1}X^{T}y}$

That formula gives us the slope and intercept of our best fit line. Below is one way the least squares can be calculated with a little Python. The calculation I used for the above interactive to get the best fit line is way more complicated. If I had known about the linear algebra way when I made it, I would have definitely gone with that instead.
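The Python snippet isn’t embedded here; a reconstruction, using the data from the table above:

```python
import numpy as np

# The (x, y) data from the table above.
x = np.array([10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5], dtype=float)
y = np.array([8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68])

X = np.column_stack([np.ones(len(x)), x])  # pad x with 1s on the left
v = np.linalg.inv(X.T @ X) @ X.T @ y       # v = (X^T X)^{-1} X^T y
b, m = v
print(f"y = {m:.2f}x + {b:.2f}")  # y = 0.50x + 3.00
```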

The Singular Value Decomposition

We’ve now seen the eigenvalue decomposition of a linear transformation (in the form of a matrix). We can think of what we did in that decomposition as breaking up the original transformation into three transformations. If we multiply the rightmost matrix by any vector, and then multiply the middle matrix by that product, and then multiply the leftmost matrix on the right-hand side by that product, we would see that starting vector be transformed three times. That process would be equivalent to multiplying the starting vector by the original matrix. We can also say that, in that original transformation matrix, which we’ll call $$\mathtt{A}$$, we mapped a set of orthogonal vectors, or vectors at right angles to each other, (1, 0) and (0, 1), onto a set of non-orthogonal vectors (0, –2) and (1, –3). We don’t have to multiply each vector by the transformation matrix one at a time. We can multiply the set of vectors, as a matrix, by the transformation matrix, like so.

$$\begin{bmatrix}\mathtt{\,\,\,\,0}&\mathtt{\,\,\,\,1}\\\mathtt{-2}&\mathtt{-3}\end{bmatrix}\begin{bmatrix}\mathtt{1}&\mathtt{0}\\\mathtt{0}&\mathtt{1}\end{bmatrix}=\begin{bmatrix}\mathtt{1}\begin{bmatrix}\mathtt{\,\,\,\,0}\\\mathtt{-2}\end{bmatrix}+\mathtt{0}\begin{bmatrix}\mathtt{\,\,\,\,1}\\\mathtt{-3}\end{bmatrix}&\mathtt{0}\begin{bmatrix}\mathtt{\,\,\,\,0}\\\mathtt{-2}\end{bmatrix}+\mathtt{1}\begin{bmatrix}\mathtt{\,\,\,\,1}\\\mathtt{-3}\end{bmatrix}\end{bmatrix}=\begin{bmatrix}\mathtt{\,\,\,\,0}&\mathtt{\,\,\,\,1}\\\mathtt{-2}&\mathtt{-3}\end{bmatrix}$$

Of course, we just multiplied the original matrix by the identity matrix, so it spit out the original matrix again. But the above interpretation is different, though it gives the same results.

Okay, great, but we humans seem to love our right angles. So, this question arises about linear transformations: could we use any matrix to map some pair of orthogonal vectors (vectors at right angles to each other) to a different set of orthogonal vectors? That is, could we transform our way from one orthogonal “system” to another with any single transformation matrix?

Could we find a pair of orthogonal vectors which, after undergoing our transformation $$\mathtt{A}$$, were mapped to a different pair of orthogonal vectors? If we could, that would mean that if we multiplied $$\mathtt{A}$$ by a set of orthogonal vectors $$\mathtt{V}$$ (i.e., transformed the set of vectors $$\mathtt{V}$$ by the action of $$\mathtt{A}$$), it would be equivalent to just starting with a different set of orthogonal vectors already in position (we’ll call these orthogonal vectors $$\mathtt{U}$$) and just scaling them by a scaling matrix $$\mathtt{\Sigma}$$. In notation, we’ll write this hypothesis as $\mathtt{AV=U \Sigma}$

The word hypothesis is important here. We’re not really writing down something we know is equivalent—that is, we don’t really know if there is a $$\mathtt{V}$$, $$\mathtt{U}$$, and $$\mathtt{\Sigma}$$ which will make this equivalence true. Students (and I) have a hard time not seeing the equals sign as meaning “I know that it is true that these are equivalent.” But that’s not what it means here, and it’s good to get used to that flexibility. What it means here is that we are supposing for the time being that these two products are equivalent. If some contradiction falls out of our algebra (or we get some kind of infinity), we’ll know that the equivalence fails—at least insofar as we want just one matrix to pop out for each unknown matrix.

Let’s first see how this equivalence appears before we dig into figuring out what $$\mathtt{V}$$, $$\mathtt{U}$$, and $$\mathtt{\Sigma}$$ are. That is, let me show you that the matrices $$\mathtt{V}$$, $$\mathtt{U}$$, and $$\mathtt{\Sigma}$$ are possible before we look at what they are. On the left is our original transformation matrix acting on a set of orthogonal vectors $$\mathtt{V}$$ (purple and red), so this transformation shows $$\mathtt{AV}$$. (You can see that the original (1, 0) and (0, 1) vectors go to where they’re supposed to.)

$\mathtt{AV=U \Sigma}$

On the right is a set of two different orthogonal vectors $$\mathtt{\Sigma}$$ (purple and red) which start out aligned to the horizontal and vertical grid lines and then are rotated and reflected (which is a part of scaling) by a matrix $$\mathtt{U}$$. And the two transformations are equivalent! (In the demo they are very close.) We can squeeze an orthogonal transformation out of that weird matrix we saw in the eigenvalue decomposition, which took a square and rotated and stretched it into a funky parallelogram.

I note that, above, I called $$\mathtt{\Sigma}$$ a scaling matrix, but here I’m using it as just a set of orthogonal vectors. Luckily for us, both things are true. It just depends on how you look at them. A square matrix like the ones we are using can represent either a pair of 2-dimensional vectors or a linear transformation. We get to decide how to interpret the matrix in any given situation.
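Before digging into the algebra, we can check the hypothesis numerically. Here is a small sketch in Python with NumPy (my own addition, not part of the original post), using the matrix $$\mathtt{A}$$ from the worked example below:

```python
import numpy as np

# The transformation matrix used in this post's worked example.
A = np.array([[0.0, 1.0],
              [-2.0, -3.0]])

# NumPy's SVD hands back U, the singular values s, and V-transpose,
# such that A = U @ diag(s) @ Vt.
U, s, Vt = np.linalg.svd(A)
V = Vt.T
Sigma = np.diag(s)

# The hypothesis AV = U.Sigma holds: A acting on the orthogonal
# columns of V equals the orthogonal columns of U scaled by Sigma.
print(np.allclose(A @ V, U @ Sigma))   # True
```

So a suitable $$\mathtt{V}$$, $$\mathtt{U}$$, and $$\mathtt{\Sigma}$$ do exist for this matrix; the rest of the post is about finding them by hand.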

Getting $$\mathtt{V}$$

To begin to know what these matrices are, we can start by writing the equation above like this. $\mathtt{A=U \Sigma V^T}$ We can do this because $$\mathtt{V}$$ is orthogonal, and I briefly mentioned last time that the transpose of an orthogonal matrix is the same as its inverse. So, in effect, we multiplied both sides of the equation on the right by $$\mathtt{V^{-1}}$$ (that is, by $$\mathtt{V^T}$$), which removed the $$\mathtt{V}$$ from the left-hand side.

Okay, now we pull a little transpose magic, but let’s walk through it. Start by multiplying the expression on each side, on the left, by its transpose. (We’ll circle back to grounding all this some other time.) So, we have $\mathtt{A^TA=(U\Sigma V^T)^T(U\Sigma V^T)}$ We multiplied $$\mathtt{A}$$ by its transpose by multiplying to the left of $$\mathtt{A}$$, and we multiplied $$\mathtt{U\Sigma V^T}$$ by its transpose by multiplying to the left of $$\mathtt{U\Sigma V^T}$$. The product $$\mathtt{A^TA}$$ is simple: $\begin{bmatrix}\mathtt{0}&\mathtt{-2}\\\mathtt{1}&\mathtt{-3}\end{bmatrix}\begin{bmatrix}\mathtt{\,\,\,\,0}&\mathtt{\,\,\,\,1}\\\mathtt{-2}&\mathtt{-3}\end{bmatrix}=\begin{bmatrix}\mathtt{4}&\mathtt{6}\\\mathtt{6}&\mathtt{10}\end{bmatrix}$ But let’s take some time to simplify $$\mathtt{(U\Sigma V^T)^T(U\Sigma V^T)}$$. As an example of what to do when we have the transpose of a product of matrices, consider these products.

$\left(\begin{bmatrix}\mathtt{1}&\mathtt{3}\\\mathtt{2}&\mathtt{4}\end{bmatrix}\begin{bmatrix}\mathtt{4}&\mathtt{6}\\\mathtt{5}&\mathtt{7}\end{bmatrix}\right)^\mathtt{T} \longrightarrow$
$\begin{bmatrix}\mathtt{19}&\mathtt{27}\\\mathtt{28}&\mathtt{40}\end{bmatrix}^\mathtt{T}$
$\left(\begin{bmatrix}\mathtt{4}&\mathtt{5}\\\mathtt{6}&\mathtt{7}\end{bmatrix}\begin{bmatrix}\mathtt{1}&\mathtt{2}\\\mathtt{3}&\mathtt{4}\end{bmatrix}\right)^\mathtt{T} \longrightarrow$
$\begin{bmatrix}\mathtt{19}&\mathtt{28}\\\mathtt{27}&\mathtt{40}\end{bmatrix}\,\,\,$

Each matrix product on the left is equal to the expression on its right. Test them out for yourself. But the expressions on the right are equal to each other, which means the products on the left are equal to each other. This example shows that $$\mathtt{(AB)^{T}=B^{T}A^{T}}$$. The transpose of a product is equal to the product of the transposes, multiplied in reverse order. Again, we’ll ground this later, but this suggests, correctly, that we can rewrite the transposed product above, $$\mathtt{(U\Sigma V^T)^T}$$, as $$\mathtt{V\Sigma^{T}U^{T}}$$, considering that $$\mathtt{(V^{T})^{T}=V}$$. Multiplying that by the remaining part of the right-hand side, $$\mathtt{U\Sigma V^T}$$, we get $$\mathtt{V\Sigma^{T}U^{T}U\Sigma V^T}$$. Since $$\mathtt{U}$$ is orthogonal, its transpose is its inverse, so the $$\mathtt{U}$$ terms cancel, leaving us with $$\mathtt{V\Sigma^{T}\Sigma V^T}$$. The transpose of a scaling matrix, $$\mathtt{\Sigma}$$ in this case, is itself (try it out!), so the middle can be written more simply as $$\mathtt{\Sigma^{2}}$$. So, finally (pretty far from finally, actually), we have $\mathtt{A^{T}A=V\Sigma^{2}V^T}$
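The reverse-order rule is easy to sanity-check numerically. A quick sketch in Python with NumPy (my addition), using the two example matrices above:

```python
import numpy as np

# The two matrices from the transpose example above.
P = np.array([[1, 3], [2, 4]])
Q = np.array([[4, 6], [5, 7]])

# The transpose of a product equals the product of the transposes
# in reverse order: (PQ)^T = Q^T P^T.
print(np.array_equal((P @ Q).T, Q.T @ P.T))   # True
```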

And this is an eigenvalue decomposition for $$\mathtt{A^{T}A}$$! We’ve got a scaling matrix as the lunchmeat, sandwiched by a matrix and its inverse (which is the same as the transpose in this case). So, now we can figure out $$\mathtt{V}$$ and $$\mathtt{\Sigma}$$ by doing the eigenvalue decomposition like we did previously. Here it is: $\begin{bmatrix}\mathtt{4}&\mathtt{6}\\\mathtt{6}&\mathtt{10}\end{bmatrix}=\begin{bmatrix}\mathtt{\color{purple}{-\frac{521}{991}}}&\mathtt{\color{red}{-\frac{3725}{4379}}}\\\mathtt{\color{purple}{-\frac{3725}{4379}}}&\mathtt{\,\,\,\,\color{red}{\frac{521}{991}}}\end{bmatrix}\begin{bmatrix}\mathtt{\frac{4510}{329}}&\mathtt{0}\\\mathtt{0}&\mathtt{\frac{658}{2255}}\end{bmatrix}\begin{bmatrix}\mathtt{-\frac{521}{991}}&\mathtt{-\frac{3725}{4379}}\\\mathtt{-\frac{3725}{4379}}&\mathtt{\,\,\,\,\frac{521}{991}}\end{bmatrix}$

The purple and red vectors are our starting vectors from the animation on the left above: (–0.53, –0.85) and (–0.85, 0.53). When we ask which orthogonal starting vectors we can pick such that matrix $$\mathtt{A}$$ will transform them into a different pair of orthogonal vectors, matrix $$\mathtt{V}$$—the purple and red vectors above—is our answer. The square roots of the diagonal values of $$\mathtt{\Sigma^2}$$ above (remember, this matrix was squared) are called the singular values of $$\mathtt{A}$$. To me, getting $$\mathtt{V}$$ is the coolest part of this decomposition. The rest, below, is just gravy.
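The same eigenvalue decomposition can be done in a couple of lines with NumPy (my own sketch, not the post’s). For a symmetric matrix like $$\mathtt{A^{T}A}$$, `eigh` returns real eigenvalues and orthonormal eigenvectors:

```python
import numpy as np

A = np.array([[0.0, 1.0],
              [-2.0, -3.0]])
AtA = A.T @ A                  # the [[4, 6], [6, 10]] matrix above

# eigh returns eigenvalues in ascending order for a symmetric matrix,
# with the orthonormal eigenvectors as the columns of V.
eigvals, V = np.linalg.eigh(AtA)

# The singular values of A are the square roots of these eigenvalues.
singular_values = np.sqrt(eigvals)[::-1]   # largest first
print(singular_values.round(2))            # approximately [3.7, 0.54]
```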

And Now for $$\mathtt{U}$$ and $$\mathtt{\Sigma}$$

In this case, we multiply both sides of the equation $$\mathtt{A=U\Sigma V^T}$$ on the right by the transpose of $$\mathtt{A}$$ and get an eigenvalue decomposition of $$\mathtt{AA^T}$$. $\,\,\mathtt{AA^T=(U\Sigma V^T)(U\Sigma V^T)^T=U\Sigma V^{T}V\Sigma^{T}U^T=U\Sigma^{2}U^T}$

$\begin{bmatrix}\mathtt{\,\,\,\,1}&\mathtt{-3}\\\mathtt{-3}&\mathtt{\,\,\,\,13}\end{bmatrix}=\begin{bmatrix}\mathtt{-\frac{400}{1741}}&\mathtt{\frac{764}{785}}\\\mathtt{\,\,\,\,\frac{764}{785}}&\mathtt{\frac{400}{1741}}\end{bmatrix}\begin{bmatrix}\mathtt{\color{purple}{\frac{4510}{329}}}&\mathtt{\color{red}{0}}\\\mathtt{\color{purple}{0}}&\mathtt{\color{red}{\frac{658}{2255}}}\end{bmatrix}\begin{bmatrix}\mathtt{-\frac{400}{1741}}&\mathtt{\frac{764}{785}}\\\mathtt{\,\,\,\,\frac{764}{785}}&\mathtt{\frac{400}{1741}}\end{bmatrix}$

Now the square roots of the purple and red diagonal values give us our starting vectors from the animation on the right above: (3.7, 0) and (0, 0.54). You’ll notice, of course, that $$\mathtt{\Sigma^2}$$ is the same as above, so we really found it earlier. This step is only to pin down what $$\mathtt{U}$$ is. At any rate, we are done, and we can write down the full singular value decomposition (SVD) of the original matrix $$\mathtt{A}$$. $\quad\begin{bmatrix}\mathtt{\,\,\,\,0}&\mathtt{\,\,\,\,1}\\\mathtt{-2}&\mathtt{-3}\end{bmatrix}=\begin{bmatrix}\mathtt{-\frac{400}{1741}}&\mathtt{\frac{764}{785}}\\\mathtt{\,\,\,\,\frac{764}{785}}&\mathtt{\frac{400}{1741}}\end{bmatrix}\begin{bmatrix}\mathtt{\frac{1655}{447}}&\mathtt{0}\\\mathtt{0}&\mathtt{\frac{894}{1655}}\end{bmatrix}\begin{bmatrix}\mathtt{-\frac{521}{991}}&\mathtt{-\frac{3725}{4379}}\\\mathtt{-\frac{3725}{4379}}&\mathtt{\,\,\,\,\frac{521}{991}}\end{bmatrix}$
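As a final numerical check (again, a NumPy sketch of my own), multiplying the three factors back together should recover $$\mathtt{A}$$:

```python
import numpy as np

A = np.array([[0.0, 1.0],
              [-2.0, -3.0]])
U, s, Vt = np.linalg.svd(A)

# The three SVD factors multiply back to A (up to floating-point
# error): A = U Sigma V^T.
print(np.allclose(U @ np.diag(s) @ Vt, A))   # True
```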

We’re finding an orthogonal-to-orthogonal transformation in a hurricane, so it shouldn’t be surprising to get weird numbers. One thing the SVD makes clear (the eigenvalue decomposition does this too) is that linear transformations can be described as combinations of rotations and scalings (the latter of which include reflections), and that’s it.

Inverses and Transposes

This will now be my 22nd post on linear algebra, and I hope it’ll be noticeable, looking at all of them together so far, that we haven’t talked about systems of equations. And there’s a good reason for that: they suck you down an ugly hole of mindless calculation, meaning-challenged tedium, and fruitless, pointless backward thinking. Systems are awesome and important, and I’m sure someone can make a strong argument for introducing them early, but, for me, they can wait.

The inverses of matrices are pretty interesting. The inverse of a 2 × 2 matrix is the matrix that, when multiplied by the original matrix, gives the identity matrix as the product. The inverse is given by the middle matrix below: $\begin{bmatrix}\mathtt{a} & \mathtt{b}\\\mathtt{c} & \mathtt{d}\end{bmatrix}\begin{bmatrix}\mathtt{\,\,\,\,\frac{d}{ad-bc}} & \mathtt{-\frac{b}{ad-bc}}\\\mathtt{-\frac{c}{ad-bc}} & \mathtt{\,\,\,\,\frac{a}{ad-bc}}\end{bmatrix}\mathtt{=}\begin{bmatrix}\mathtt{1} & \mathtt{0}\\\mathtt{0} & \mathtt{1}\end{bmatrix}$

There is an easy-to-follow derivation of this formula here if you’re interested—one that requires only some high school algebra.
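Here is the formula as a short Python function (my own illustration; the helper name `inverse_2x2` is made up for this sketch):

```python
import numpy as np

def inverse_2x2(m):
    """Invert a 2x2 matrix using the ad - bc formula above."""
    (a, b), (c, d) = m
    det = a * d - b * c        # must be nonzero for the inverse to exist
    return np.array([[d, -b], [-c, a]]) / det

M = np.array([[1.0, 3.0],
              [-2.0, 4.0]])

# Multiplying a matrix by its inverse gives the identity matrix.
print(np.allclose(M @ inverse_2x2(M), np.eye(2)))   # True
```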

You can notice, given this setup, that the identity matrix is its own inverse. Are there any others that fit this description? One thought: we need $$\mathtt{ad-bc}$$ to be equal to 1, and we also want $$\mathtt{a=d}$$. The values on the other diagonal, $$\mathtt{b}$$ and $$\mathtt{c}$$, should both be 0, since they have to be equal to their opposites.

In that case, $$\begin{bmatrix}\mathtt{-1} & \mathtt{\,\,\,\,0}\\\mathtt{\,\,\,\,0} & \mathtt{-1}\end{bmatrix}$$ would be its own inverse too. What does that mean?

For the identity matrix, it means that if we apply the do-nothing transformation to the home-base matrix, we stay on home base, with the “x” vector pointed to (1, 0) and the “y” vector pointed to (0, 1). The matrix above represents a reflection across the origin. So, applying a reflection across the origin to it should return us to home base.

Wouldn’t a reflection across the x-axis be an inverse of itself, then? Yes! The criteria above for an inverse need to be amended a little to allow for negatives effectively canceling each other out to make positives. $\begin{bmatrix}\mathtt{1} & \mathtt{\,\,\,\,0}\\\mathtt{0} & \mathtt{-1}\end{bmatrix}\begin{bmatrix}\mathtt{1} & \mathtt{\,\,\,\,0}\\\mathtt{0} & \mathtt{-1}\end{bmatrix}\mathtt{=}\begin{bmatrix}\mathtt{1} & \mathtt{0}\\\mathtt{0} & \mathtt{1}\end{bmatrix}$
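Both self-inverse reflections are easy to confirm numerically (a NumPy sketch of mine): applying either one twice lands back on the identity.

```python
import numpy as np

# Reflection across the x-axis, and reflection across the origin:
# each is its own inverse, since applying it twice does nothing.
for F in (np.diag([1.0, -1.0]), np.diag([-1.0, -1.0])):
    print(np.array_equal(F @ F, np.eye(2)))   # prints True twice
```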

The inverse of a matrix is represented with a superscript $$\mathtt{-1}$$. So, the inverse of the matrix $$\mathtt{A}$$ is written as $$\mathtt{A^{-1}}$$.

The Transpose

The transpose of a matrix is the matrix you get when you take the rows of the matrix and turn them into columns instead. The transpose of a matrix is represented with a superscript $$\mathtt{T}$$. So: $\begin{bmatrix}\mathtt{\,\,\,\,1} & \mathtt{3}\\\mathtt{-2} & \mathtt{4}\end{bmatrix}^{\mathtt{T}}\mathtt{=}\begin{bmatrix}\mathtt{1} & \mathtt{-2}\\\mathtt{3} & \mathtt{\,\,\,\,4}\end{bmatrix}$

If a 2 × 2 matrix $$\mathtt{V}$$ is orthogonal—meaning that its column vectors are perpendicular and each column vector has a length of 1—then its transpose is the same as its inverse, or $$\mathtt{V^{-1}=V^T}$$.

Eigenvectors for Reflections

Having just finished a post on eigenvalue decomposition, involving eigenvalues and eigenvectors, I was flipping through this nifty little textbook, where I saw a description of geometric reflections using eigenvectors, and I thought, “Oh my gosh, of course!”

We have seen reflections here, yet back then we had to find something called the foot of a point to figure out the reflection. But we can construct a reflection matrix (same as a scaling matrix) using only knowledge about eigenvectors.

When we talked about eigenvectors before—those vectors which do not change direction under a transformation (except if they reverse direction)—we were looking for them. But in a reflection, we already know them. In any reflection across a line, we already know that the vector that matches the line of reflection will not change direction, and the vector perpendicular to the line of reflection will only be scaled by $$\mathtt{-1}$$. So, both the vector which describes the line of reflection and the vector perpendicular to that line are eigenvectors. So let’s say we want to reflect point $$\mathtt{C}$$ across the line described by the vector $$\mathtt{\alpha(2, -1)}$$.

Our eigenvector matrix will be $\begin{bmatrix}\mathtt{\,\,\,\,2}&\mathtt{-1}\\\mathtt{-1}&\mathtt{-2}\end{bmatrix}$ Here we happen to know the second eigenvector, but if we didn’t, we could simply use a vector perpendicular to the first.

Our eigenvalue matrix will be $\begin{bmatrix}\mathtt{1}&\mathtt{\,\,\,\,0}\\\mathtt{0}&\mathtt{-1}\end{bmatrix}$ since we will keep the first eigenvector—the line of reflection—fixed (eigenvalue of 1) and just flip the second eigenvector (eigenvalue of –1).

As a penultimate step, we calculate the inverse of the eigenvector matrix (we’ll get into inverses fairly soon), and then finally multiply all those matrices together (right to left), as we saw with the eigenvalue decomposition, to get the reflection matrix. $\begin{bmatrix}\mathtt{\,\,\,\,2}&\mathtt{-1}\\\mathtt{-1}&\mathtt{-2}\end{bmatrix}\begin{bmatrix}\mathtt{1}&\mathtt{\,\,\,\,0}\\\mathtt{0}&\mathtt{-1}\end{bmatrix}\begin{bmatrix}\mathtt{\,\,\,\,\frac{2}{5}}&\mathtt{-\frac{1}{5}}\\\mathtt{-\frac{1}{5}}&\mathtt{-\frac{2}{5}}\end{bmatrix}\mathtt{=}\begin{bmatrix}\mathtt{\,\,\,\,\frac{3}{5}}&\mathtt{-\frac{4}{5}}\\\mathtt{-\frac{4}{5}}&\mathtt{-\frac{3}{5}}\end{bmatrix}$

The final matrix on the right side is the reflection matrix for reflections across the line represented by the vector $$\mathtt{\alpha(2, -1)}$$, or essentially all lines with a slope of $$\mathtt{-\frac{1}{2}}$$.
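The whole construction can be sketched in a few lines of Python with NumPy (my addition, mirroring the triple product above):

```python
import numpy as np

# Eigenvector matrix: the line-of-reflection vector (2, -1) and a
# perpendicular vector (-1, -2) as columns.
E = np.array([[2.0, -1.0],
              [-1.0, -2.0]])

# Eigenvalue matrix: keep the line fixed (1), flip the perpendicular (-1).
D = np.diag([1.0, -1.0])

# Reflection matrix = E D E^{-1}, multiplied right to left.
R = E @ D @ np.linalg.inv(E)
print(R)   # approximately [[0.6, -0.8], [-0.8, -0.6]]
```

Reflecting the point (2, 1) across the line, for instance, is then just `R @ np.array([2.0, 1.0])`, which gives (0.4, –2.2).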

It’s worth playing around with building these matrices and using them to find reflections. A lot of what’s here is pretty obvious, but reflection matrices can come in handy in some non-obvious ways too. And note that the sign of the perpendicular vector doesn’t change the final matrix: flipping it cancels out in the triple product, so either perpendicular direction works.