WEBVTT
00:00:01.160 --> 00:00:13.440
Hey everyone!
00:00:13.440 --> 00:00:26.400
If I had to choose just one topic that makes all of the others in linear algebra start to click, and which too often goes unlearned the first time a student takes linear algebra, it would be this one: the idea of a linear transformation and its relation to matrices.
00:00:27.120 --> 00:00:35.080
For this video, I’m just going to focus on what these transformations look like in the case of two dimensions and how they relate to the idea of matrix-vector multiplication.
00:00:35.880 --> 00:00:42.000
In particular, I want to show you a way to think about matrix-vector multiplication that doesn’t rely on memorization.
00:00:43.200 --> 00:00:46.560
To start, let’s just parse this term “linear transformation”.
00:00:47.360 --> 00:00:49.880
“Transformation” is essentially a fancy word for “function”.
00:00:50.400 --> 00:00:53.800
It’s something that takes in inputs and spits out an output for each one.
00:00:54.480 --> 00:01:01.080
Specifically in the context of linear algebra, we like to think about transformations that take in some vector and spit out another vector.
00:01:02.480 --> 00:01:06.240
So why use the word “transformation” instead of “function” if they mean the same thing?
00:01:07.120 --> 00:01:11.320
Well, it’s to be suggestive of a certain way to visualize this input-output relation.
00:01:11.840 --> 00:01:15.960
You see, a great way to understand functions of vectors is to use movement.
00:01:16.920 --> 00:01:24.840
If a transformation takes some input vector to some output vector, we imagine that input vector moving over to the output vector.
00:01:25.680 --> 00:01:34.080
Then to understand the transformation as a whole, we might imagine watching every possible input vector move over to its corresponding output vector.
00:01:35.080 --> 00:01:39.160
It gets really crowded to think about all of the vectors all at once, each one as an arrow.
00:01:39.480 --> 00:01:47.400
So, as I mentioned last video, a nice trick is to conceptualize each vector, not as an arrow, but as a single point: the point where its tip sits.
00:01:48.080 --> 00:01:56.240
That way, to think about a transformation taking every possible input vector to some output vector, we watch every point in space moving to some other point.
00:01:57.280 --> 00:02:05.840
In the case of transformations in two dimensions, to get a better feel for the whole “shape” of the transformation, I like to do this with all of the points on an infinite grid.
00:02:06.560 --> 00:02:12.840
I also sometimes like to keep a copy of the grid in the background, just to help keep track of where everything ends up relative to where it starts.
00:02:15.000 --> 00:02:21.160
The effect for various transformations, moving around all of the points in space, is, you’ve gotta admit, beautiful.
00:02:21.880 --> 00:02:24.680
It gives the feeling of squishing and morphing space itself.
00:02:25.600 --> 00:02:38.360
As you can imagine though, arbitrary transformations can look pretty complicated, but luckily linear algebra limits itself to a special type of transformation, ones that are easier to understand, called “linear” transformations.
00:02:39.200 --> 00:02:49.640
Visually speaking, a transformation is linear if it has two properties: all lines must remain lines, without getting curved, and the origin must remain fixed in place.
00:02:50.680 --> 00:02:55.480
For example, this right here would not be a linear transformation since the lines get all curvy.
00:02:56.160 --> 00:03:01.840
And this one right here, although it keeps the lines straight, is not a linear transformation because it moves the origin.
00:03:02.680 --> 00:03:09.280
This one here fixes the origin and it might look like it keeps lines straight, but that’s just because I’m only showing the horizontal and vertical grid lines.
00:03:09.600 --> 00:03:15.280
When you see what it does to a diagonal line, it becomes clear that it’s not at all linear since it turns that line all curvy.
00:03:16.840 --> 00:03:22.200
In general, you should think of linear transformations as keeping grid lines parallel and evenly spaced.
00:03:23.520 --> 00:03:27.520
Some linear transformations are simple to think about, like rotations about the origin.
00:03:28.120 --> 00:03:30.640
Others are a little trickier to describe with words.
00:03:32.200 --> 00:03:35.440
So how do you think you could describe these transformations numerically?
00:03:36.000 --> 00:03:47.320
If you were, say, programming some animations to make a video teaching the topic, what formula do you give the computer so that if you give it the coordinates of a vector, it can give you the coordinates of where that vector lands?
00:03:48.520 --> 00:03:54.520
It turns out that you only need to record where the two basis vectors, 𝑖-hat and 𝑗-hat, each land.
00:03:54.880 --> 00:03:56.520
And everything else will follow from that.
00:03:57.480 --> 00:04:05.640
For example, consider the vector 𝐯 with coordinates negative one, two, meaning that it equals negative one times 𝑖-hat plus two times 𝑗-hat.
00:04:09.080 --> 00:04:25.360
If we play some transformation and follow where all three of these vectors go, the property that grid lines remain parallel and evenly spaced has a really important consequence: the place where 𝐯 lands will be negative one times the vector where 𝑖-hat landed plus two times the vector where 𝑗-hat landed.
00:04:26.080 --> 00:04:34.600
In other words, it started off as a certain linear combination of 𝑖-hat and 𝑗-hat, and it ends up as that same linear combination of where those two vectors landed.
00:04:35.600 --> 00:04:40.960
This means you can deduce where 𝐯 must go based only on where 𝑖-hat and 𝑗-hat each land.
00:04:41.640 --> 00:04:55.000
This is why I like keeping a copy of the original grid in the background; for the transformation shown here, we can read off that 𝑖-hat lands on the coordinates one, negative two and 𝑗-hat lands on the 𝑥-axis over at the coordinates three, zero.
00:04:55.920 --> 00:05:06.160
This means that the vector represented by negative one 𝑖-hat plus two times 𝑗-hat ends up at negative one times the vector one, negative two plus two times the vector three, zero.
00:05:07.080 --> 00:05:11.600
Adding that all together, you can deduce that it has to land on the vector five, two.
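As a quick sketch of this computation in Python (not from the video; the variable names are just illustrative):

```python
# Where does v = (-1, 2) land, given that i-hat lands on (1, -2)
# and j-hat lands on (3, 0)?
i_hat_lands = (1, -2)
j_hat_lands = (3, 0)
v = (-1, 2)

# v lands on: (first coordinate of v) * (where i-hat lands)
#           + (second coordinate of v) * (where j-hat lands)
landed = (v[0] * i_hat_lands[0] + v[1] * j_hat_lands[0],
          v[0] * i_hat_lands[1] + v[1] * j_hat_lands[1])
print(landed)  # (5, 2)
```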
00:05:14.640 --> 00:05:17.240
This is a good point to pause and ponder, because it’s pretty important.
00:05:18.520 --> 00:05:25.200
Now, given that I’m actually showing you the full transformation, you could have just looked to see that 𝐯 has the coordinates five, two.
00:05:25.880 --> 00:05:37.200
But the cool part here is that this gives us a technique to deduce where any vectors land, so long as we have a record of where 𝑖-hat and 𝑗-hat each land, without needing to watch the transformation itself.
00:05:38.640 --> 00:05:50.560
Write the vector with more general coordinates 𝑥 and 𝑦, and it will land on 𝑥 times the vector where 𝑖-hat lands — one, negative two — plus 𝑦 times the vector where 𝑗-hat lands — three, zero.
00:05:51.960 --> 00:06:00.080
Carrying out that sum, you see that it lands at one 𝑥 plus three 𝑦, negative two 𝑥 plus zero 𝑦.
00:06:00.360 --> 00:06:03.600
If I give you any vector, you can tell me where that vector lands using this formula.
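The general-coordinates version could be sketched as a small Python function (illustrative, not from the video), hard-coding the landing spots of 𝑖-hat and 𝑗-hat from this example:

```python
# This particular transformation sends i-hat to (1, -2) and j-hat to (3, 0),
# so a general vector (x, y) lands at (1x + 3y, -2x + 0y).
def transform(x, y):
    return (1 * x + 3 * y, -2 * x + 0 * y)

print(transform(-1, 2))  # (5, 2), matching the worked example above
```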
00:06:05.000 --> 00:06:16.520
What all of this is saying is that a two-dimensional linear transformation is completely described by just four numbers: the two coordinates for where 𝑖-hat lands and the two coordinates for where 𝑗-hat lands.
00:06:17.080 --> 00:06:17.640
Isn’t that cool?!
00:06:18.480 --> 00:06:30.920
It’s common to package these coordinates into a two-by-two grid of numbers, called a two-by-two matrix, where you can interpret the columns as the two special vectors where 𝑖-hat and 𝑗-hat each land.
00:06:30.920 --> 00:06:47.240
If you’re given a two-by-two matrix describing a linear transformation and some specific vector and you want to know where that linear transformation takes that vector, you can take the coordinates of the vector multiply them by the corresponding columns of the matrix, then add together what you get.
00:06:48.200 --> 00:06:52.760
This corresponds with the idea of adding the scaled versions of our new basis vectors.
00:06:54.960 --> 00:07:00.520
Let’s see what this looks like in the most general case, where your matrix has entries 𝑎, 𝑏, 𝑐, 𝑑.
00:07:01.160 --> 00:07:06.240
And remember, this matrix is just a way of packaging the information needed to describe a linear transformation.
00:07:06.720 --> 00:07:16.440
Always remember to interpret that first column — 𝑎, 𝑐 — as the place where the first basis vector lands and that second column — 𝑏, 𝑑 — as the place where the second basis vector lands.
00:07:17.560 --> 00:07:22.240
When we apply this transformation to some vector — 𝑥, 𝑦 — what do you get?
00:07:22.800 --> 00:07:27.000
Well, it’ll be 𝑥 times 𝑎, 𝑐 plus 𝑦 times 𝑏, 𝑑.
00:07:28.160 --> 00:07:37.560
Putting this together, you get a vector 𝑎𝑥 plus 𝑏𝑦, 𝑐𝑥 plus 𝑑𝑦.
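The fully general rule can be written down as one small function (a sketch in Python; the parameter names follow the 𝑎, 𝑏, 𝑐, 𝑑 labels above):

```python
# The matrix has columns (a, c) and (b, d): the places where the
# first and second basis vectors land. Applying it to (x, y) gives
# x copies of the first column plus y copies of the second.
def apply_matrix(a, b, c, d, x, y):
    return (a * x + b * y, c * x + d * y)

# The earlier transformation, with columns (1, -2) and (3, 0):
print(apply_matrix(1, 3, -2, 0, -1, 2))  # (5, 2)
```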
00:07:38.000 --> 00:07:40.880
You can even define this as matrix-vector multiplication when you put the matrix on the left of the vector like it’s a function.
00:07:41.600 --> 00:07:46.760
Then, you could make high schoolers memorize this, without showing them the crucial part that makes it feel intuitive.
00:07:48.320 --> 00:07:58.040
But, isn’t it more fun to think about these columns as the transformed versions of your basis vectors and to think about the results as the appropriate linear combination of those vectors?
00:08:01.000 --> 00:08:03.720
Let’s practice describing a few linear transformations with matrices.
00:08:04.560 --> 00:08:12.340
For example, if we rotate all of space 90 degrees counterclockwise, then 𝑖-hat lands on the coordinates zero, one.
00:08:14.240 --> 00:08:17.200
And 𝑗-hat lands on the coordinates negative one, zero.
00:08:18.080 --> 00:08:22.000
So the matrix we end up with has columns zero, one; negative one, zero.
00:08:23.280 --> 00:08:29.680
To figure out what happens to any vector after a 90-degree rotation, you could just multiply its coordinates by this matrix.
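In Python, that rotation might be sketched like so (illustrative; the columns are the landing spots just described):

```python
# 90-degree counterclockwise rotation: i-hat -> (0, 1), j-hat -> (-1, 0),
# so the matrix has columns (0, 1) and (-1, 0).
def rotate90(x, y):
    return (0 * x + (-1) * y, 1 * x + 0 * y)

print(rotate90(1, 0))  # (0, 1): i-hat lands where it should
print(rotate90(3, 2))  # (-2, 3)
```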
00:08:31.520 --> 00:08:34.320
Here’s a fun transformation with a special name, called a “shear”.
00:08:35.000 --> 00:08:45.360
In it, 𝑖-hat remains fixed so the first column of the matrix is one, zero, but 𝑗-hat moves over to the coordinates one, one, which become the second column of the matrix.
00:08:46.200 --> 00:08:54.080
And, at the risk of being redundant here, figuring out how a shear transforms a given vector comes down to multiplying this matrix by that vector.
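A quick sketch of the shear in the same style (illustrative Python, using the columns named above):

```python
# Shear: i-hat stays at (1, 0), j-hat moves to (1, 1),
# so the matrix has columns (1, 0) and (1, 1).
def shear(x, y):
    return (1 * x + 1 * y, 0 * x + 1 * y)

print(shear(0, 1))  # (1, 1): j-hat slides over
print(shear(2, 3))  # (5, 3)
```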
00:08:55.840 --> 00:09:01.620
Let’s say we wanna go the other way around, starting with the matrix, say, with columns one, two and three, one.
00:09:02.000 --> 00:09:04.400
And we want to deduce what its transformation looks like.
00:09:04.920 --> 00:09:06.960
Pause and take a moment to see if you can imagine it.
00:09:08.600 --> 00:09:12.160
One way to do this is to first move 𝑖-hat to one, two.
00:09:12.840 --> 00:09:14.880
Then, move 𝑗-hat to three, one.
00:09:15.440 --> 00:09:20.160
Always moving the rest of space in such a way that keeps grid lines parallel and evenly spaced.
00:09:22.000 --> 00:09:42.440
If the vectors that 𝑖-hat and 𝑗-hat land on are linearly dependent which, if you recall from last video, means that one is a scaled version of the other, it means that the linear transformation squishes all of 2D space onto the line where those two vectors sit, also known as the one-dimensional span of those two linearly dependent vectors.
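You can check this squishing numerically. Here's a sketch with made-up columns (2, 1) and (4, 2), where the second is twice the first:

```python
# Linearly dependent columns: (2, 1) and (4, 2).
def squish(x, y):
    return (2 * x + 4 * y, 1 * x + 2 * y)

# Every output (u, v) satisfies u == 2 * v, so all of 2D space
# lands on the one-dimensional line spanned by (2, 1).
for x, y in [(1, 0), (0, 1), (3, -5), (7, 2)]:
    u, v = squish(x, y)
    assert u == 2 * v
```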
00:09:45.000 --> 00:09:53.840
To sum up, linear transformations are a way to move around space such that the grid lines remain parallel and evenly spaced and such that the origin remains fixed.
00:09:54.440 --> 00:09:58.640
Delightfully, these transformations can be described using only a handful of numbers.
00:09:59.080 --> 00:10:02.000
The coordinates of where each basis vector lands.
00:10:02.640 --> 00:10:14.680
Matrices give us a language to describe these transformations where the columns represent those coordinates and matrix-vector multiplication is just a way to compute what that transformation does to a given vector.
00:10:15.360 --> 00:10:21.920
The important takeaway here is that, every time you see a matrix, you can interpret it as a certain transformation of space.
00:10:22.600 --> 00:10:27.280
Once you really digest this idea, you’re in a great position to understand linear algebra deeply.
00:10:27.720 --> 00:10:40.520
Almost all of the topics coming up, from matrix multiplication to determinants, change of basis, eigenvalues, all of these will become easier to understand once you start thinking about matrices as transformations of space.
00:10:41.440 --> 00:10:56.160
Most immediately, in the next video, I’ll be talking about multiplying two matrices together.
00:10:56.400 --> 00:10:57.680
See you then!