Summary of GAMES101 Series (1) Linear Algebra and Model Transformation

Posted on 2023-03-30 Edited on 2025-07-02 In Unity

I recently started relearning Computer Graphics, so I turned to GAMES101. I took notes and worked through the exercises while watching the lectures. This post summarizes the basic linear algebra and how model transformations are done in a game, and finishes with homework 1.

Basic knowledge of linear algebra

Vector

A vector is a directed line segment with a direction and a length, written in mathematical notation as $\vec{a}$
It can also be expressed as a line segment from a start point to an end point: given two points A and B, the vector from A to B is $\vec{AB} = B - A$
Vectors have no absolute starting point, that is, moving a vector in space leaves the vector itself unchanged
The length of a vector is written $||\vec{a}||$, and a vector divided by its own length is the unit vector in the same direction: $\hat{a} = \frac{\vec{a}}{||\vec{a}||}$
Vector addition follows the parallelogram rule or the triangle rule

Coordinates can be represented by vectors

If there are two mutually perpendicular unit vectors, and we choose one of them to be $\vec{X}$ and the other to be $\vec{Y}$, we can write a vector as $\vec{A} = \binom{x}{y}$, or $\vec{A}^T = (x, y)$, meaning $\vec{A} = x\vec{X} + y\vec{Y}$

If we place the starting point of this vector at the origin, its end point is the point (x, y) we usually talk about in geometry

Multiplication of a vector

Vectors can be multiplied with the dot product. Suppose there are two vectors $\vec{a}, \vec{b}$ in different directions. If their starting points are placed together, there is an angle between them, call it $\theta$

Then $\vec{a} \cdot \vec{b} = ||\vec{a}|| \, ||\vec{b}|| \cos\theta$

The dot product is commutative, distributes over vector addition, and is compatible with scalar multiplication:

\vec{a} \cdot \vec{b} = \vec{b} \cdot \vec{a} \\ \vec{a} \cdot (\vec{b} + \vec{c}) = \vec{a} \cdot \vec{b} + \vec{a} \cdot \vec{c} \\ (k\vec{a}) \cdot \vec{b} = \vec{a} \cdot (k\vec{b}) = k(\vec{a} \cdot \vec{b})

In coordinates, the dot product is computed component by component:

\vec{a} \cdot \vec{b} = \binom{x_a}{y_a} \cdot \binom{x_b}{y_b} = x_a x_b + y_a y_b \\ \vec{a} \cdot \vec{b} = \begin{bmatrix} x_a \\ y_a \\ z_a \end{bmatrix} \cdot \begin{bmatrix} x_b \\ y_b \\ z_b \end{bmatrix} = x_a x_b + y_a y_b + z_a z_b

Applications of the dot product include:

Compare the dot product of two vectors with 0 to determine whether the angle between them is acute (positive), right (zero), or obtuse (negative).
Find the angle between two vectors
Find the projection of one vector on another vector
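
The three applications above can be sketched in a few lines of Python. This is only an illustration using the standard library; the helper names `dot`, `length`, `angle_between`, and `project` are made up for this example and do not come from the course.

```python
import math

def dot(a, b):
    """Dot product of two coordinate sequences of equal length."""
    return sum(x * y for x, y in zip(a, b))

def length(a):
    """Length of a vector, using a . a = ||a||^2."""
    return math.sqrt(dot(a, a))

def angle_between(a, b):
    """Angle in radians, from a . b = ||a|| ||b|| cos(theta)."""
    c = dot(a, b) / (length(a) * length(b))
    return math.acos(max(-1.0, min(1.0, c)))  # clamp against rounding error

def project(b, a):
    """Projection of b onto a: (a . b / a . a) * a."""
    k = dot(a, b) / dot(a, a)
    return [k * x for x in a]

a = [1.0, 0.0]
b = [1.0, 1.0]
print(dot(a, b) > 0)                      # a positive dot product means an acute angle
print(math.degrees(angle_between(a, b)))  # roughly 45 degrees
print(project(b, a))                      # the part of b that lies along a
```

The clamp in `angle_between` is a defensive choice: rounding can push the cosine slightly outside [-1, 1], which would make `math.acos` raise an error.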

Cross product of vectors

The other way to multiply vectors is the cross product

The direction of the cross product follows the right-hand rule. For $\vec{a} \times \vec{b}$, curl the four fingers of the right hand from the direction of a toward the direction of b and extend the thumb; the thumb points in the direction of the result. That is, the result of the cross product is perpendicular to the plane containing a and b

The length of the cross product is $||\vec{a} \times \vec{b}|| = ||\vec{a}|| \, ||\vec{b}|| \sin\theta$

The cross product is not commutative. To be precise, swapping the operands reverses the direction of the result: $\vec{a} \times \vec{b} = -\vec{b} \times \vec{a}$

The cross product can be written in coordinates, and also as a matrix acting on $\vec{b}$:

\vec{a} \times \vec{b} = \begin{bmatrix} y_az_b - y_bz_a \\ z_ax_b - x_az_b \\ x_ay_b - x_by_a \end{bmatrix} \\ \vec{a} \times \vec{b} = A * \vec{b} = \begin{bmatrix} 0 & -z_a & y_a \\ z_a & 0 & -x_a \\ -y_a & x_a & 0 \\ \end{bmatrix} \begin{bmatrix} x_b \\ y_b \\ z_b \end{bmatrix}
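
As a quick check, the component form and the matrix form of the cross product can be compared numerically. The sketch below uses plain Python lists; the names `cross`, `skew`, and `mat_vec` are assumptions made for this example.

```python
def cross(a, b):
    """Cross product of two 3D vectors, written out component by component."""
    ax, ay, az = a
    bx, by, bz = b
    return [ay * bz - az * by,
            az * bx - ax * bz,
            ax * by - ay * bx]

def skew(a):
    """The matrix A built from a, so that A * b equals a x b."""
    ax, ay, az = a
    return [[0, -az, ay],
            [az, 0, -ax],
            [-ay, ax, 0]]

def mat_vec(m, v):
    """Multiply a matrix (list of rows) by a column vector."""
    return [sum(r * x for r, x in zip(row, v)) for row in m]

a, b = [1, 2, 3], [4, 5, 6]
print(cross(a, b))           # [-3, 6, -3]
print(mat_vec(skew(a), b))   # [-3, 6, -3], same as the component form
print(cross(b, a))           # [3, -6, 3], swapping the operands flips the sign
```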

The cross product can determine whether one vector is to the left or to the right of another. Under the right-hand rule, rotating from a to b counterclockwise or clockwise gives results pointing in opposite directions, so the sign of the result tells the two cases apart.

Another use is to determine whether a point is inside a triangle

In the figure above, if the signs of AB × AP, BC × BP, and CA × CP are all the same, the point P is on the same side of all three edges, which means that P is inside the triangle
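
The inside test can be sketched in 2D, where the cross product of two vectors in the plane has only a z component, so its sign is all we need. The function names below are made up for this example, and treating points exactly on an edge as inside is a choice of this sketch, not something the course prescribes.

```python
def cross2d(u, v):
    """z component of the cross product of two 2D vectors."""
    return u[0] * v[1] - u[1] * v[0]

def sub(p, q):
    """Vector from q to p."""
    return (p[0] - q[0], p[1] - q[1])

def inside_triangle(p, a, b, c):
    d1 = cross2d(sub(b, a), sub(p, a))  # AB x AP
    d2 = cross2d(sub(c, b), sub(p, b))  # BC x BP
    d3 = cross2d(sub(a, c), sub(p, c))  # CA x CP
    has_neg = d1 < 0 or d2 < 0 or d3 < 0
    has_pos = d1 > 0 or d2 > 0 or d3 > 0
    # Inside when the three signs agree; zeros (points on an edge) count as inside.
    return not (has_neg and has_pos)

A, B, C = (0, 0), (4, 0), (0, 4)
print(inside_triangle((1, 1), A, B, C))  # True
print(inside_triangle((5, 5), A, B, C))  # False
```

Comparing signs instead of requiring all three to be positive makes the test independent of whether the vertices are given clockwise or counterclockwise.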

Matrix

A matrix is a two-dimensional array of m rows and n columns.

The premise that two matrices can be multiplied is that the number of columns in the first matrix and the number of rows in the second matrix are the same.

That is, a matrix with M rows and N columns can be multiplied by a matrix with N rows and P columns, resulting in a matrix with M rows and P columns.

Suppose two matrices A and B are multiplied to give a matrix C, where the entries of A are $a_{ij}$, the entries of B are $b_{ij}$, and the entries of C are $c_{ij}$; then $c_{ij} = \sum_{k=1}^{N} a_{ik} b_{kj}$, that is, the dot product of row i of A with column j of B
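
The summation formula translates almost directly into code. The following is a minimal sketch with matrices stored as lists of rows; `matmul` is a name chosen for this example, and Python indices start at 0 where the formula starts at 1.

```python
def matmul(A, B):
    """Multiply an M x N matrix by an N x P matrix, giving an M x P matrix."""
    m, n, p = len(A), len(B), len(B[0])
    if any(len(row) != n for row in A):
        raise ValueError("columns of A must equal rows of B")
    # c[i][j] is the dot product of row i of A with column j of B.
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(p)]
            for i in range(m)]

A = [[1, 2, 3],
     [4, 5, 6]]        # 2 rows, 3 columns
B = [[7, 8],
     [9, 10],
     [11, 12]]         # 3 rows, 2 columns
print(matmul(A, B))    # [[58, 64], [139, 154]], 2 rows and 2 columns
```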

The important point here is how to write the equations of a coordinate transformation in matrix form

For example, how do we express reflecting points of a two-dimensional coordinate system about the y-axis?

First write it as a system of equations

\begin{cases} x' = -x; \\ y' = y \end{cases}

Write it in matrix form

\begin{bmatrix} -1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} -x \\ y \end{bmatrix}

Every matrix has a transpose, and every invertible square matrix has an inverse.

The transpose of A is written as $A^T$, and $(AB)^T = B^T A^T$

The inverse of A is written as $A^{-1}$, and $AA^{-1} = A^{-1}A = I$, where I is the identity matrix. Multiplying any matrix by the identity matrix leaves it unchanged, so any change produced by multiplying by A can be undone by multiplying by the inverse of A

How to use a matrix to transform (Transform)

2D transformation

Scale

When the image is scaled by a factor of s, the equations are

\begin{cases} x' = sx \\ y' = sy \\ \end{cases}

The corresponding scaling matrix is:

\begin{bmatrix} s & 0 \\ 0 & s \end{bmatrix}

Reflection (about the y-axis)

\begin{bmatrix} -1 & 0 \\ 0 & 1 \\ \end{bmatrix}

Shear

\begin{bmatrix} 1 & a \\ 0 & 1 \end{bmatrix}

Rotate

\begin{bmatrix} cos\theta & -sin\theta \\ sin\theta & cos\theta \end{bmatrix}

So far, all our transformations can be expressed in matrix form, because our previous transformations can be expressed in the following equation:

\begin{cases} x' = ax + by \\ y' = cx + dy \end{cases}

Expressed as a matrix is

\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} a & b \\ c & d \\ \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}

But the problem is that you can’t represent translation in this way, because translation can’t be written in this form

Translation

The system of equations for translation is like this:

\begin{cases} x' = x + t_x;\\ y' = y + t_y; \end{cases}

If we want to combine a linear transformation such as scaling or rotation with a translation, we have to add the translation as a separate term:

\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} + \begin{bmatrix} t_x \\ t_y \end{bmatrix}

To fold the translation into the matrix, we introduce homogeneous coordinates, that is, we add a third component w. A 2D point is then represented by (x, y, 1), and a 2D vector by (x, y, 0)

With w = 1 for points and w = 0 for vectors, the arithmetic works out neatly: subtracting two points gives w = 0, which is a vector; adding a vector to a point gives w = 1, which is again a point; and because w = 0, a vector is unaffected by translation

Using homogeneous coordinates, we can express rotation, scaling, and translation uniformly as a single matrix multiplication

The translation is expressed in homogeneous coordinates as:

\begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 & t_x \\ 0 & 1 & t_y \\ 0 & 0 & 1 \\ \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} = \begin{bmatrix} x + t_x \\ y + t_y \\ 1 \end{bmatrix}
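
To see the role of w, the sketch below applies one homogeneous translation matrix to a point (w = 1) and to a vector (w = 0). The helper names `translation` and `apply` are made up for this example.

```python
def translation(tx, ty):
    """3x3 homogeneous translation matrix for 2D."""
    return [[1, 0, tx],
            [0, 1, ty],
            [0, 0, 1]]

def apply(m, v):
    """Left-multiply the homogeneous column vector v by the matrix m."""
    return [sum(r * x for r, x in zip(row, v)) for row in m]

T = translation(3, 4)
point = [1, 2, 1]    # w = 1: a position
vector = [1, 2, 0]   # w = 0: a direction
print(apply(T, point))   # [4, 6, 1], the point is moved
print(apply(T, vector))  # [1, 2, 0], the vector is unchanged
```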

Mixing

We can now use homogeneous coordinates to represent rotation, translation, and scaling respectively

S(s_x, s_y) = \begin{bmatrix} s_x & 0 & 0 \\ 0 & s_y & 0 \\ 0 & 0 & 1 \end{bmatrix} \\ R(\theta) = \begin{bmatrix} cos\theta & -sin\theta & 0 \\ sin\theta & cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix} \\ T(t_x, t_y) = \begin{bmatrix} 1 & 0 & t_x \\ 0 & 1 & t_y \\ 0 & 0 & 1 \end{bmatrix}

So how do we mix these operations?

As you can see in the figure above, translating first and then rotating gives a different result from rotating first and then translating, because our rotation matrix rotates around the origin

Therefore, we must fix the order when combining operations. The matrix of the first operation is multiplied on the left of the coordinates of the current point, and the new point is then left-multiplied by the matrix of the next operation, so the matrices are applied from right to left.

Although the order cannot be changed because matrix multiplication is not commutative, matrix multiplication is associative, that is:

A_n(\dots A_2(A_1 \vec{x})) = (A_n \cdots A_2 A_1)\vec{x}

So we can first multiply the rotation, scaling, and translation matrices together into a single transformation matrix, and then left-multiply each point by it

There is another problem here: what if we want a shape to rotate around its lower left corner instead of the origin?

It's simple: translate the lower left corner to the origin, then rotate, and finally translate the lower left corner back. If the corner is at $(c_x, c_y)$, this is:

T(c_x, c_y) \cdot R(\theta) \cdot T(-c_x, -c_y)
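
The translate, rotate, translate back recipe can be sketched with 3x3 homogeneous matrices as follows. The function names are hypothetical and the pivot (1, 1) is just an example value; the point of the sketch is that the three matrices are multiplied once and the product is then applied to each point.

```python
import math

def matmul(A, B):
    """Product of two 3x3 matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def apply(M, p):
    """Left-multiply the homogeneous column vector p by M."""
    return [sum(M[i][k] * p[k] for k in range(3)) for i in range(3)]

def translation(tx, ty):
    return [[1.0, 0.0, tx], [0.0, 1.0, ty], [0.0, 0.0, 1.0]]

def rotation(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rotate_about(cx, cy, theta):
    """T(c) * R(theta) * T(-c); the rightmost matrix is applied first."""
    return matmul(translation(cx, cy),
                  matmul(rotation(theta), translation(-cx, -cy)))

M = rotate_about(1.0, 1.0, math.pi / 2)
print(apply(M, [1.0, 1.0, 1.0]))  # the pivot itself stays (approximately) in place
print(apply(M, [2.0, 1.0, 1.0]))  # ends up close to (1, 2)
```

Because of floating-point rounding in `math.cos(math.pi / 2)`, the printed coordinates are only approximately the exact values.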

3D transformation

The transformation of 3D is actually no different from 2D, except that homogeneous coordinates have four dimensions

The transformation matrix is as follows:

\begin{bmatrix} x' \\ y' \\ z' \\ 1 \end{bmatrix} = \begin{bmatrix} a & b & c & t_x \\ d & e & f & t_y \\ g & h & o & t_z \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix}

The scaling matrix can be written as:

S(s_x, s_y, s_z) = \begin{bmatrix} s_x & 0 & 0 & 0\\ 0 & s_y & 0 & 0\\ 0 & 0 & s_z & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}

The translation matrix can be expressed as:

T(t_x, t_y, t_z) = \begin{bmatrix} 1 & 0 & 0 & t_x \\ 0 & 1 & 0 & t_y \\ 0 & 0 & 1 & t_z \\ 0 & 0 & 0 & 1 \end{bmatrix}

Rotation is more complicated because it can be divided into rotation around different axes

R_x(\theta) = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & cos\theta & -sin\theta & 0 \\ 0 & sin\theta & cos\theta & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \\ R_y(\theta) = \begin{bmatrix} cos\theta & 0 & sin\theta & 0 \\ 0 & 1 & 0 & 0 \\ -sin\theta & 0 & cos\theta & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \\ R_z(\theta) = \begin{bmatrix} cos\theta & -sin\theta & 0 & 0 \\ sin\theta & cos\theta & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}

Any rotation can be decomposed into rotations about the three coordinate axes. There is also a formula, Rodrigues' rotation formula, that directly gives the rotation by an angle $\alpha$ around an arbitrary axis $\vec{n}$ through the origin, where $\vec{n}$ is a unit vector:

R(\vec{n}, \alpha) = cos\alpha\vec{I} + (1 - cos\alpha)\vec{n}\vec{n}^T + sin\alpha\begin{bmatrix} 0 & -n_z & n_y \\ n_z & 0 & -n_x \\ -n_y & n_x & 0 \\ \end{bmatrix}
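
The formula can be evaluated term by term. The sketch below builds the 3x3 matrix with plain Python lists; `rodrigues` is a name chosen for this example, and normalizing the axis inside the function is an added convenience, since the formula itself assumes a unit axis.

```python
import math

def rodrigues(n, alpha):
    """3x3 rotation by angle alpha (radians) around the axis n through the origin."""
    norm = math.sqrt(sum(x * x for x in n))
    nx, ny, nz = (x / norm for x in n)      # the formula needs a unit axis
    c, s = math.cos(alpha), math.sin(alpha)
    axis = [nx, ny, nz]
    N = [[0.0, -nz, ny],
         [nz, 0.0, -nx],
         [-ny, nx, 0.0]]                    # cross-product matrix of n
    # cos(alpha) * I + (1 - cos(alpha)) * n n^T + sin(alpha) * N
    return [[c * (1.0 if i == j else 0.0)
             + (1.0 - c) * axis[i] * axis[j]
             + s * N[i][j]
             for j in range(3)] for i in range(3)]

R = rodrigues([0.0, 0.0, 1.0], math.pi / 2)
for row in R:
    print(row)   # close to the z-axis rotation [[0, -1, 0], [1, 0, 0], [0, 0, 1]]
```

With the axis set to a coordinate axis, the result reduces to the upper-left 3x3 block of the corresponding axis rotation matrix, which is a handy sanity check.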

Observation Transformation in Graphics

When we are doing game development and writing Shader, we often use something called an MVP matrix to change the coordinates of points on the model to the coordinates on the screen.

The MVP here refers to Model, View, and Projection, which means model transformation, view transformation, and projection transformation

Model transformation is to change the coordinates from the coordinate system of the model itself to the coordinates of the game world coordinate system

View transformation is to change the coordinates on the world coordinate system into the coordinates of the observation space

The projection transformation changes the coordinates from the observation space into the clipping space. In fact, this step does not itself project onto the two-dimensional plane. That final projection is carried out by the GPU later in the rendering pipeline, and is generally not handled in the Shader.

The observation transformation we are talking about here is just one step in the rendering pipeline. It is performed vertex by vertex in the vertex shader, which gives us the coordinates of each point in the clipping space. The later stages of the rendering pipeline, such as rasterization, sampling, per-fragment shading to output color, and depth testing, finally produce the image on the two-dimensional plane.

By analogy with how we usually take a picture, the Model matrix is like finding a good place and arranging the subjects, and the View matrix is like finding a good angle for the camera

We don't discuss the Model matrix separately here, because like the View matrix it simply changes from one coordinate system to another. The Projection matrix is different in that its target space is the cube $[-1, 1]^3$, so we also need to consider scaling

View matrix

The View matrix takes a point from world coordinates into the observation space, i.e., it turns coordinates relative to the origin of the world coordinate system into coordinates relative to the camera.

After the Model transformation, we have the coordinates of the point in the world. Now we need to define the position and orientation of the camera in space.

The position and orientation of the camera are given relative to the world coordinate system, and our object coordinates are currently also relative to the world coordinate system.

Now all we have to do is change the object coordinates from being relative to the world coordinate system to being relative to the camera.

Here we introduce a common physical concept - relative motion, that is, if the same transformation operation is performed on the camera and the object, the relative position of the two remains unchanged.

So we can move the camera to the origin, with its viewing direction along the negative z-axis of the world coordinate system and its up direction along the positive y-axis, find the matrix of this transformation, and then apply that matrix to every point. From the point of view of the object coordinates, the object has moved within the world coordinate system, but this movement does not change the relative position of the object and the camera, and the camera now sits at the origin of the world coordinates. The result is therefore the same as expressing the object in the observation space.

So how do you get this view matrix?

Moving the camera to the origin is a simple translation, but writing down directly the rotation that takes the camera axes to the world axes is more complicated

We can use a handy property here: a rotation matrix is an orthogonal matrix, and the inverse of an orthogonal matrix is equal to its transpose. So we first write down the inverse rotation, which takes the world coordinate axes to the camera coordinate axes (its columns are simply the camera axes expressed in world coordinates), and then take its transpose. The transpose is the inverse, which is the rotation we want, taking the camera coordinate axes to the world coordinate axes
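
A minimal sketch of this construction is shown below. It assumes the camera is described by a position `eye`, a viewing direction `gaze`, and an approximate `up` direction; these parameter names, and the re-orthogonalization of `up`, are assumptions of this example rather than part of the text above.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def normalize(a):
    n = math.sqrt(dot(a, a))
    return [x / n for x in a]

def view_matrix(eye, gaze, up):
    """4x4 matrix taking world coordinates to camera (observation) coordinates."""
    g = normalize(gaze)
    right = normalize(cross(g, up))  # camera x-axis in world coordinates
    true_up = cross(right, g)        # camera y-axis, perpendicular to right and gaze
    back = [-x for x in g]           # camera z-axis; the camera looks along -z
    # The rows are the camera axes: this is the transpose (= inverse) of the
    # matrix whose columns are the camera axes. The last column applies the
    # rotation to the translation by -eye.
    rows = [right, true_up, back]
    return [row + [-dot(row, eye)] for row in rows] + [[0.0, 0.0, 0.0, 1.0]]

def apply(M, p):
    return [sum(M[i][k] * p[k] for k in range(4)) for i in range(4)]

M = view_matrix([5.0, 0.0, 0.0], [-1.0, 0.0, 0.0], [0.0, 1.0, 0.0])
print(apply(M, [5.0, 0.0, 0.0, 1.0]))  # the camera position maps to the origin
print(apply(M, [0.0, 0.0, 0.0, 1.0]))  # the world origin lands 5 units down the -z axis
```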

Projection matrix

After the View matrix has been applied, the camera and the objects have moved together, with their relative positions unchanged, so that the camera sits at the origin of the world coordinate system.

What is the purpose of this? One advantage is that it is easy to understand, although for the computer this makes no difference, because either way it is a matrix multiplication and the amount of calculation is the same.

Another advantage is that it simplifies the derivation of the projection matrix.

There are two types of projection: parallel (orthographic) projection and perspective projection:

Parallel projection

Let’s first look at the relatively simple parallel projection

A relatively simple way to understand this projection is to just throw away the z coordinate, which leaves the position of the point on the screen, and then translate and scale both x and y into the range [-1, 1]. The reason we can just throw away z is that our camera has been moved to the origin and looks in the negative direction of the z-axis.

However, we cannot really drop z yet. We still need the z information to do depth tests later. What we need to do is normalize x, y, z into the cube $[-1, 1]^3$

As shown in the figure below, in the parallel projection we start with a cuboid in the observation space. We need to move the center of this cuboid to the origin and scale it to a cube with side length 2
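
As a sketch of this normalization, suppose the cuboid spans [l, r] in x, [b, t] in y, and [f, n] in z, where n and f are the z coordinates of the near and far planes (both negative, since the camera looks along -z). These six parameter names are assumptions of this example. Translating the center to the origin and then scaling each side to length 2 gives:

```python
def ortho_matrix(l, r, b, t, n, f):
    """Map the cuboid [l, r] x [b, t] x [f, n] onto the cube [-1, 1]^3.

    This is scale(2/(r-l), 2/(t-b), 2/(n-f)) times
    translate(-(r+l)/2, -(t+b)/2, -(n+f)/2), multiplied out by hand.
    """
    return [[2.0 / (r - l), 0.0, 0.0, -(r + l) / (r - l)],
            [0.0, 2.0 / (t - b), 0.0, -(t + b) / (t - b)],
            [0.0, 0.0, 2.0 / (n - f), -(n + f) / (n - f)],
            [0.0, 0.0, 0.0, 1.0]]

def apply(M, p):
    return [sum(M[i][k] * p[k] for k in range(4)) for i in range(4)]

M = ortho_matrix(-2.0, 4.0, 0.0, 2.0, -1.0, -5.0)
print(apply(M, [1.0, 1.0, -3.0, 1.0]))  # the center of the cuboid maps to the origin
print(apply(M, [4.0, 2.0, -1.0, 1.0]))  # a near corner maps close to (1, 1, 1)
```

With this convention the near plane ends up at z = 1 and the far plane at z = -1; other conventions flip or shift the depth range.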

Perspective projection

The difference between perspective projection and parallel projection is that objects appear larger when near and smaller when far, and its initial observation space is not a cuboid but a frustum (a truncated pyramid)

We find the normalization matrix for this frustum in two steps:

Squeeze the frustum into a cuboid, so that the observation space becomes the same as for a parallel projection
Reuse the normalization matrix of the parallel projection

So we’re focusing on the first step right now

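Because our camera faces the negative direction of z, the squeeze scales x and y toward the z-axis, with points farther away scaled more. Here n is the z coordinate of the near plane and f is the z coordinate of the far plane. Taking the y coordinate as an example, similar triangles give $y' = \frac{n}{z} y$

Similarly, x becomes $x' = \frac{n}{z} x$

Now we can write the transformation matrix that takes the perspective projection space into the parallel projection space. In homogeneous coordinates we may multiply every component by z, so the result must have the form

M_{persp-ortho}\begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix} = \begin{bmatrix} \frac{n}{z}x \\ \frac{n}{z}y \\ \text{?} \\ 1 \end{bmatrix} = \begin{bmatrix} nx \\ ny \\ \text{?} \\ z \end{bmatrix}

The matrix entries must be constants, so the third row has the form $(0, 0, A, B)$. Points on the near plane keep their z, so $An + B = n^2$; points on the far plane keep their z, so $Af + B = f^2$. Solving gives $A = n + f$ and $B = -nf$, that is

M_{persp-ortho} = \begin{bmatrix} n & 0 & 0 & 0 \\ 0 & n & 0 & 0 \\ 0 & 0 & n + f & -nf \\ 0 & 0 & 1 & 0 \\ \end{bmatrix}

Then the full perspective projection matrix is the parallel projection matrix multiplied by $M_{persp-ortho}$, with $M_{persp-ortho}$ applied first: $M_{persp} = M_{ortho} M_{persp-ortho}$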
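
The squeeze step can be tried out numerically. The sketch below assumes the common convention that points on the near plane (z = n) and on the far plane (z = f) keep their z coordinate, which leads to a third row of (0, 0, n + f, -nf); the function names and the sample values n = -1, f = -5 are made up for this example.

```python
def persp_to_ortho(n, f):
    """Matrix that squeezes the frustum between z = n and z = f into a cuboid."""
    return [[n, 0.0, 0.0, 0.0],
            [0.0, n, 0.0, 0.0],
            [0.0, 0.0, n + f, -n * f],
            [0.0, 0.0, 1.0, 0.0]]

def apply(M, p):
    return [sum(M[i][k] * p[k] for k in range(4)) for i in range(4)]

def squeeze(point, n, f):
    """Apply the matrix to a 3D point, then divide by w to get back to w = 1."""
    x, y, z, w = apply(persp_to_ortho(n, f), list(point) + [1.0])
    return [x / w, y / w, z / w]

n, f = -1.0, -5.0
print(squeeze([1.0, 1.0, -1.0], n, f))  # on the near plane: unchanged
print(squeeze([2.0, 2.0, -5.0], n, f))  # on the far plane: x, y scaled by n/f, z kept
```

The division by w is what turns the linear matrix into the nonlinear scaling by n/z; after it, the result can be passed to the parallel projection matrix.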

At this point, we have completed the transformation of a point from model space, through the world coordinate system and the observation space, to the normalized space