Matrix math for 3D graphics

This document focuses on 3D affine transformation matrices, which take the form of a 4x4 matrix:

\begin{Bmatrix} c_{00} & c_{01} & c_{02} & c_{03} \\ c_{10} & c_{11} & c_{12} & c_{13} \\ c_{20} & c_{21} & c_{22} & c_{23} \\ c_{30} & c_{31} & c_{32} & c_{33} \end{Bmatrix}

$c_{\text{row}\enspace\text{column}}$ is the element at the given row and column index (0-based).

A standard affine matrix, represented in column-major form, looks like this:

\begin{bmatrix} \color{red}{\hat{x}_0} & \color{green}{\hat{y}_0} & \color{blue}{\hat{z}_0} & \mathbf{t}_x \\ \color{red}{\hat{x}_1} & \color{green}{\hat{y}_1} & \color{blue}{\hat{z}_1} & \mathbf{t}_y \\ \color{red}{\hat{x}_2} & \color{green}{\hat{y}_2} & \color{blue}{\hat{z}_2} & \mathbf{t}_z \\ 0 & 0 & 0 & 1 \end{bmatrix}

Here $\color{red}{\hat{x}}$, $\color{green}{\hat{y}}$ and $\color{blue}{\hat{z}}$ are the basis vectors (axes) of the coordinate system, which equal $\color{red}{\hat{\imath}}$, $\color{green}{\hat{\jmath}}$ and $\color{blue}{\hat{k}}$ in the case of an identity matrix, and $\mathbf{t}$ represents the translation.

Column-Major and Row-Major

Whether a matrix is column-major or row-major is its most important property, because it determines both how the matrix is read and how operations on it are performed.

Column-Major

Column-major matrices use column-vectors, and the matrix is composed of three column-vector bases plus a translation column. A column-vector takes the form:

\hat{a} = \begin{bmatrix}x \\ y \\ z \\ w\end{bmatrix}

This is a four-component vector because affine transformations use homogeneous coordinates, which require a fourth $w$ component (typically set to 1 for points).

The columns of the matrix represent the bases (or axes) of the coordinate system, and the translation values are stored in the $c_{03}$, $c_{13}$ and $c_{23}$ elements.

\begin{bmatrix} \color{red}{\hat{x}_0} & \color{green}{\hat{y}_0} & \color{blue}{\hat{z}_0} & \mathbf{t}_x \\ \color{red}{\hat{x}_1} & \color{green}{\hat{y}_1} & \color{blue}{\hat{z}_1} & \mathbf{t}_y \\ \color{red}{\hat{x}_2} & \color{green}{\hat{y}_2} & \color{blue}{\hat{z}_2} & \mathbf{t}_z \\ 0 & 0 & 0 & 1 \end{bmatrix}

Vectors are right- or post-multiplied to matrices:

\begin{bmatrix} \color{red}{\hat{x}_0} & \color{green}{\hat{y}_0} & \color{blue}{\hat{z}_0} & \mathbf{t}_x \\ \color{red}{\hat{x}_1} & \color{green}{\hat{y}_1} & \color{blue}{\hat{z}_1} & \mathbf{t}_y \\ \color{red}{\hat{x}_2} & \color{green}{\hat{y}_2} & \color{blue}{\hat{z}_2} & \mathbf{t}_z \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix}x \\ y \\ z \\ 1\end{bmatrix}

\mathbf{v}' = \mathbf{M}\mathbf{v}
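As a minimal sketch of the column-vector product above, the following C++ computes $\mathbf{v}' = \mathbf{M}\mathbf{v}$ for a matrix indexed as `m[row][column]`. The type aliases and the helper names `transformColumnVector` and `makeTranslationColumnMajor` are assumptions made for illustration, not part of any particular library.

```cpp
#include <array>
#include <cstddef>

using Vec4 = std::array<float, 4>;
using Mat4 = std::array<std::array<float, 4>, 4>;  // indexed as m[row][column]

// Column-vector convention: v' = M v.
// Each output component is the dot product of one matrix row with v.
Vec4 transformColumnVector(const Mat4& m, const Vec4& v) {
    Vec4 out{};
    for (std::size_t row = 0; row < 4; ++row) {
        for (std::size_t col = 0; col < 4; ++col) {
            out[row] += m[row][col] * v[col];
        }
    }
    return out;
}

// Hypothetical helper: identity axes, with the translation in the last column.
Mat4 makeTranslationColumnMajor(float tx, float ty, float tz) {
    return {{
        {1, 0, 0, tx},
        {0, 1, 0, ty},
        {0, 0, 1, tz},
        {0, 0, 0, 1},
    }};
}
```

With this sketch, transforming the point `{1, 2, 3, 1}` by `makeTranslationColumnMajor(10, 20, 30)` moves it to `{11, 22, 33, 1}`, while a vector with $w = 0$ is left untranslated because the last column is multiplied by zero.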

Matrix multiplication call order is the reverse of the order the transforms are applied, reading from right-to-left: "take $\mathbf{P}$, transform by $\mathbf{T}$, transform by $\mathbf{R_z}$, transform by $\mathbf{R_y}$" is written as:

\mathbf{P'} = \mathbf{R_y} \mathbf{R_z} \mathbf{T} \mathbf{P}
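The right-to-left ordering can be checked with a small sketch. It assumes a right-handed system, a translation of +1 along X, and a hard-coded 90-degree rotation about Z so the arithmetic stays exact; `kTranslate`, `kRotateZ` and the function names are hypothetical.

```cpp
#include <array>
#include <cstddef>

using Vec4 = std::array<float, 4>;
using Mat4 = std::array<std::array<float, 4>, 4>;  // indexed as m[row][column]

// Standard matrix product: out = a * b.
Mat4 multiply(const Mat4& a, const Mat4& b) {
    Mat4 out{};
    for (std::size_t r = 0; r < 4; ++r)
        for (std::size_t c = 0; c < 4; ++c)
            for (std::size_t k = 0; k < 4; ++k)
                out[r][c] += a[r][k] * b[k][c];
    return out;
}

// Column-vector convention: v' = M v.
Vec4 transform(const Mat4& m, const Vec4& v) {
    Vec4 out{};
    for (std::size_t r = 0; r < 4; ++r)
        for (std::size_t c = 0; c < 4; ++c)
            out[r] += m[r][c] * v[c];
    return out;
}

// Translation by +1 along X (translation in the last column).
const Mat4 kTranslate = {{{1, 0, 0, 1}, {0, 1, 0, 0}, {0, 0, 1, 0}, {0, 0, 0, 1}}};
// 90-degree rotation about Z, mapping +X onto +Y.
const Mat4 kRotateZ = {{{0, -1, 0, 0}, {1, 0, 0, 0}, {0, 0, 1, 0}, {0, 0, 0, 1}}};

// "Translate, then rotate" is written Rz * T: the rightmost matrix applies first.
Vec4 translateThenRotate(const Vec4& p) {
    return transform(multiply(kRotateZ, kTranslate), p);
}

// "Rotate, then translate" is written T * Rz.
Vec4 rotateThenTranslate(const Vec4& p) {
    return transform(multiply(kTranslate, kRotateZ), p);
}
```

Applied to the origin `{0, 0, 0, 1}`, `translateThenRotate` gives `{0, 1, 0, 1}` (moved to +X, then swung onto +Y), whereas `rotateThenTranslate` gives `{1, 0, 0, 1}`.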

Row-Major

The primary properties of row-major matrices are the opposite of column-major matrices.

They use row-vectors:

\hat{a} = \begin{bmatrix}x & y & z & w\end{bmatrix}

The rows of the matrix represent the bases (or axes), and the translation values are stored in $c_{30}$, $c_{31}$ and $c_{32}$.

\begin{bmatrix} \color{red}{\hat{x}_0} & \color{red}{\hat{x}_1} & \color{red}{\hat{x}_2} & 0 \\ \color{green}{\hat{y}_0} & \color{green}{\hat{y}_1} & \color{green}{\hat{y}_2} & 0 \\ \color{blue}{\hat{z}_0} & \color{blue}{\hat{z}_1} & \color{blue}{\hat{z}_2} & 0\\ \mathbf{t}_x & \mathbf{t}_y & \mathbf{t}_z & 1 \end{bmatrix}

Vectors are left- or pre-multiplied to matrices:

\begin{bmatrix}x & y & z & 1\end{bmatrix} \begin{bmatrix} \color{red}{\hat{x}_0} & \color{red}{\hat{x}_1} & \color{red}{\hat{x}_2} & 0 \\ \color{green}{\hat{y}_0} & \color{green}{\hat{y}_1} & \color{green}{\hat{y}_2} & 0 \\ \color{blue}{\hat{z}_0} & \color{blue}{\hat{z}_1} & \color{blue}{\hat{z}_2} & 0\\ \mathbf{t}_x & \mathbf{t}_y & \mathbf{t}_z & 1 \end{bmatrix}

\mathbf{v}' = \mathbf{v}\mathbf{M}

Matrix multiplication call order and the order the transforms are applied are the same, reading from left-to-right: "take $\mathbf{P}$, transform by $\mathbf{T}$, transform by $\mathbf{R_z}$, transform by $\mathbf{R_y}$" is written as:

\mathbf{P'} = \mathbf{P} \mathbf{T} \mathbf{R_z} \mathbf{R_y}
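A minimal sketch of the row-vector convention follows, assuming a translation of +1 along X and a hard-coded 90-degree rotation about Z (the transpose of the matrix a column-vector convention would use). The names `transformRowVector`, `kTranslateRow` and `kRotateZRow` are hypothetical.

```cpp
#include <array>
#include <cstddef>

using Vec4 = std::array<float, 4>;
using Mat4 = std::array<std::array<float, 4>, 4>;  // indexed as m[row][column]

// Row-vector convention: v' = v M.
// Each output component is the dot product of v with one matrix column.
Vec4 transformRowVector(const Vec4& v, const Mat4& m) {
    Vec4 out{};
    for (std::size_t col = 0; col < 4; ++col)
        for (std::size_t row = 0; row < 4; ++row)
            out[col] += v[row] * m[row][col];
    return out;
}

// Standard matrix product: out = a * b.
Mat4 multiply(const Mat4& a, const Mat4& b) {
    Mat4 out{};
    for (std::size_t r = 0; r < 4; ++r)
        for (std::size_t c = 0; c < 4; ++c)
            for (std::size_t k = 0; k < 4; ++k)
                out[r][c] += a[r][k] * b[k][c];
    return out;
}

// Translation by +1 along X: the translation lives in the last row.
const Mat4 kTranslateRow = {{{1, 0, 0, 0}, {0, 1, 0, 0}, {0, 0, 1, 0}, {1, 0, 0, 1}}};
// 90-degree rotation about Z, mapping +X onto +Y; the basis vectors are rows.
const Mat4 kRotateZRow = {{{0, 1, 0, 0}, {-1, 0, 0, 0}, {0, 0, 1, 0}, {0, 0, 0, 1}}};

// "Translate, then rotate" reads left-to-right: P * T * Rz.
Vec4 translateThenRotateRow(const Vec4& p) {
    return transformRowVector(p, multiply(kTranslateRow, kRotateZRow));
}
```

Applied to the origin `{0, 0, 0, 1}`, this yields `{0, 1, 0, 1}`: the point is moved to +X first and then rotated onto +Y, in the same order the matrices are written.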

C-based languages access arrays in row-major order:

const float M[4][4] = {
    {  0,  1,  2,  3 },
    {  4,  5,  6,  7 },
    {  8,  9, 10, 11 },
    { 12, 13, 14, 15 },
};

// Accessed as M[row][column].
EXPECT_EQ(0, M[0][0]);
EXPECT_EQ(1, M[0][1]);
EXPECT_EQ(4, M[1][0]);

// In-memory, the array is laid out like so:
// { 0, 1, 2, 3, 4, 5, 6, 7, ... }

// So to calculate the position in memory, C-based languages perform:
// (Take the address of the first element; M itself decays to a pointer-to-row.)
const float* M_ptr = &M[0][0];
float getValue(size_t row, size_t column) {
    return M_ptr[row * 4 + column];
}

EXPECT_EQ(M[0][0], getValue(0, 0));
EXPECT_EQ(M[0][1], getValue(0, 1));
EXPECT_EQ(M[1][0], getValue(1, 0));

Left Handed vs. Right Handed

For left-handed vs. right-handed, a picture is worth a thousand words. For a given coordinate system, orient the thumb along the positive X-axis, the index finger along the positive Y-axis, and the middle finger along the positive Z-axis. If it's possible to do this with the right hand, the coordinate system is right-handed, and vice versa.

Left vs. Right Handed Coordinates Diagram

Image credit: Primalshell

Left-handed or right-handed coordinate systems determine how the axes relate to each other, specifically for the Z-axis compared to X and Y. There are multiple orientations that are valid for each rule, and sometimes it's difficult to orient the hand to align, but the rule still holds true for each orientation.

To determine if a given coordinate system is left-handed or right-handed, the cross product can be used.

For right-handed coordinate systems, the cross product of the $\hat{\imath}$ and $\hat{\jmath}$ bases yields $\hat{k}$. The cross product is naturally right-handed:

\hat{\imath} \times \hat{\jmath} = \hat{k}

For left-handed coordinate systems, the z-axis is negated:

\hat{\imath} \times \hat{\jmath} = -\hat{k}
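The cross-product test can be sketched as follows. It assumes the three axes are written out in a right-handed reference frame, so the sign of $(\hat{x} \times \hat{y}) \cdot \hat{z}$ tells whether the basis keeps or flips that orientation; `isRightHanded` is a hypothetical helper name.

```cpp
#include <array>

using Vec3 = std::array<float, 3>;

// Standard (right-handed) cross product.
Vec3 cross(const Vec3& a, const Vec3& b) {
    return {
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    };
}

float dot(const Vec3& a, const Vec3& b) {
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}

// Assumption: x, y and z are expressed in a right-handed reference frame.
// If x cross y points the same way as z, the basis is right-handed;
// if it points the opposite way, the basis is left-handed.
bool isRightHanded(const Vec3& x, const Vec3& y, const Vec3& z) {
    return dot(cross(x, y), z) > 0.0f;
}
```

For example, the axes `{1, 0, 0}`, `{0, 1, 0}`, `{0, 0, 1}` are reported as right-handed, and negating the Z-axis to `{0, 0, -1}` flips the result.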

Handedness and Rotations

Handedness also determines which direction counts as a positive rotation around an axis. For an observer looking down the axis of rotation with the positive axis pointing toward them (for Z-axis rotations, looking at the tip of your middle finger), positive rotations are:

Right-Handed	Left-Handed
Counter-clockwise	Clockwise

Handedness and rotations diagram

Image credit: mufnull

This means that the standard 2D coordinate system is actually right-handed, with a virtual Z-axis extending out of the paper.
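As a small illustration of the 2D case, the sketch below rotates a vector with the standard 2D rotation formula, in which a positive angle turns +X toward +Y (counter-clockwise on a graph with +Y drawn upward). The name `rotate2D` is hypothetical.

```cpp
#include <array>
#include <cmath>

using Vec2 = std::array<double, 2>;

// Standard 2D rotation: a positive angle turns +X toward +Y.
// On a graph with +X right and +Y up, that is counter-clockwise.
Vec2 rotate2D(const Vec2& v, double radians) {
    const double c = std::cos(radians);
    const double s = std::sin(radians);
    return {c * v[0] - s * v[1], s * v[0] + c * v[1]};
}
```

Rotating `{1, 0}` by a quarter turn lands on approximately `{0, 1}`, up to floating-point rounding.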

Other Differences

Y-Up vs. Z-Up

Why the difference? It depends on the 3D library's use case and how it perceives the world.
Imagine each method starts in 2D, and the Z-axis is grafted onto it.
With +Y up, the 2D graph is placed on a monitor, with +X to the right, and depth is added by going into the monitor.
With +Z up, the XY coordinates are placed horizontally, like a piece of graph paper on the table, and Z changes the altitude.
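Converting between the two conventions amounts to relabeling axes. The sketch below assumes both systems are right-handed, with Z-up meaning +X right, +Y forward, +Z up, and Y-up meaning +X right, +Y up, +Z toward the viewer. These axis assignments and the function names are assumptions for illustration; a real library's conventions may differ.

```cpp
#include <array>

using Vec3 = std::array<float, 3>;

// Assumed conventions (not taken from any specific library):
//   Z-up: +X right, +Y forward (away from the viewer), +Z up; right-handed.
//   Y-up: +X right, +Y up, +Z toward the viewer; right-handed.

// Altitude (Z) becomes Y; "forward" (+Y) becomes "into the monitor" (-Z).
Vec3 zUpToYUp(const Vec3& v) {
    return {v[0], v[2], -v[1]};
}

// Inverse of the mapping above.
Vec3 yUpToZUp(const Vec3& v) {
    return {v[0], -v[2], v[1]};
}
```

Under these assumptions the Z-up "up" vector `{0, 0, 1}` maps to the Y-up "up" vector `{0, 1, 0}`, and applying the two functions in sequence returns the original vector.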

Who Uses What?

Here is a table which lists the default properties of matrices and coordinate systems for various libraries. Note that for handedness it may be possible to use either, but the default is listed below.

	OpenGL	DirectX	Unity	Unreal
Handedness	Right-handed¹	Left-handed	Left-handed	Left-handed
Up Direction	+Y up	+Y up	+Y up	+Z up
Layout	Column-major²	Row-major	Column-major	Row-major

OpenGL

As noted above, OpenGL notation is typically column-major, but the in-memory layout matches row-major storage: each basis vector occupies four consecutive floats, followed by the translation.

With row-major storage, matrix access in the form of $M_{\text{[row][column]}}$ reads memory laid out in this form:

\begin{bmatrix} M_{\text{[0][0]}} & M_{\text{[0][1]}} & M_{\text{[0][2]}} & M_{\text{[0][3]}} \\ M_{\text{[1][0]}} & M_{\text{[1][1]}} & M_{\text{[1][2]}} & M_{\text{[1][3]}} \\ M_{\text{[2][0]}} & M_{\text{[2][1]}} & M_{\text{[2][2]}} & M_{\text{[2][3]}} \\ M_{\text{[3][0]}} & M_{\text{[3][1]}} & M_{\text{[3][2]}} & M_{\text{[3][3]}} \end{bmatrix}

\left[ M_{\text{[0][0]}}, M_{\text{[0][1]}}, M_{\text{[0][2]}}, M_{\text{[0][3]}}, M_{\text{[1][0]}}, M_{\text{[1][1]}}, M_{\text{[1][2]}}, M_{\text{[1][3]}}, \ldots \right]

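Put in terms of an affine transform, the column-major matrix below is stored one column at a time (equivalently, its transpose is stored in row-major order):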

\begin{bmatrix} \color{red}{\hat{x}_0} & \color{green}{\hat{y}_0} & \color{blue}{\hat{z}_0} & \mathbf{t}_x \\ \color{red}{\hat{x}_1} & \color{green}{\hat{y}_1} & \color{blue}{\hat{z}_1} & \mathbf{t}_y \\ \color{red}{\hat{x}_2} & \color{green}{\hat{y}_2} & \color{blue}{\hat{z}_2} & \mathbf{t}_z \\ 0 & 0 & 0 & 1 \end{bmatrix}

\left[ \color{red}{\hat{x}_0}, \color{red}{\hat{x}_1}, \color{red}{\hat{x}_2}, 0, \color{green}{\hat{y}_0}, \color{green}{\hat{y}_1}, \color{green}{\hat{y}_2}, 0, \color{blue}{\hat{z}_0}, \color{blue}{\hat{z}_1}, \color{blue}{\hat{z}_2}, 0, \mathbf{t}_x, \mathbf{t}_y, \mathbf{t}_z, 1 \right]
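The flat array above can be produced with a short sketch that walks the column-vector matrix one column at a time, so element (row, column) lands at index `column * 4 + row`. The names `flattenBasisContiguous` and `elementAt` are hypothetical and are not OpenGL API calls.

```cpp
#include <array>
#include <cstddef>

using Mat4 = std::array<std::array<float, 4>, 4>;  // m[row][column], column-vector notation
using Flat16 = std::array<float, 16>;

// Store each column (basis vector, then translation) contiguously.
Flat16 flattenBasisContiguous(const Mat4& m) {
    Flat16 out{};
    for (std::size_t col = 0; col < 4; ++col)
        for (std::size_t row = 0; row < 4; ++row)
            out[col * 4 + row] = m[row][col];
    return out;
}

// Read element (row, column) of the notational matrix back out of the flat array.
float elementAt(const Flat16& flat, std::size_t row, std::size_t col) {
    return flat[col * 4 + row];
}
```

With this layout the translation components occupy indices 12, 13 and 14 of the 16-element array, and index 15 holds the final 1.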

DirectX

DirectX also uses a row-major memory layout, but because it also uses row-major conventions, memory maps directly onto the matrix, reading left-to-right and top-to-bottom:

\begin{bmatrix} \color{red}{\hat{x}_0} & \color{red}{\hat{x}_1} & \color{red}{\hat{x}_2} & 0 \\ \color{green}{\hat{y}_0} & \color{green}{\hat{y}_1} & \color{green}{\hat{y}_2} & 0 \\ \color{blue}{\hat{z}_0} & \color{blue}{\hat{z}_1} & \color{blue}{\hat{z}_2} & 0\\ \mathbf{t}_x & \mathbf{t}_y & \mathbf{t}_z & 1 \end{bmatrix}

\left[ \color{red}{\hat{x}_0}, \color{red}{\hat{x}_1}, \color{red}{\hat{x}_2}, 0, \color{green}{\hat{y}_0}, \color{green}{\hat{y}_1}, \color{green}{\hat{y}_2}, 0, \color{blue}{\hat{z}_0}, \color{blue}{\hat{z}_1}, \color{blue}{\hat{z}_2}, 0, \mathbf{t}_x, \mathbf{t}_y, \mathbf{t}_z, 1 \right]
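To see why the same 16 floats can describe either convention, the sketch below flattens a row-vector matrix row by row and its transpose (the column-vector form) column by column, then compares the results. All names here are hypothetical helpers, not DirectX or OpenGL API calls.

```cpp
#include <array>
#include <cstddef>

using Mat4 = std::array<std::array<float, 4>, 4>;  // indexed as m[row][column]
using Flat16 = std::array<float, 16>;

Mat4 transpose(const Mat4& m) {
    Mat4 out{};
    for (std::size_t r = 0; r < 4; ++r)
        for (std::size_t c = 0; c < 4; ++c)
            out[c][r] = m[r][c];
    return out;
}

// Row-major storage: rows are contiguous (index = row * 4 + column).
Flat16 flattenRowMajor(const Mat4& m) {
    Flat16 out{};
    for (std::size_t r = 0; r < 4; ++r)
        for (std::size_t c = 0; c < 4; ++c)
            out[r * 4 + c] = m[r][c];
    return out;
}

// Column-major storage: columns are contiguous (index = column * 4 + row).
Flat16 flattenColumnMajor(const Mat4& m) {
    Flat16 out{};
    for (std::size_t r = 0; r < 4; ++r)
        for (std::size_t c = 0; c < 4; ++c)
            out[c * 4 + r] = m[r][c];
    return out;
}
```

For any matrix `m`, `flattenRowMajor(m)` equals `flattenColumnMajor(transpose(m))`, which is why a row-vector matrix stored by rows and a column-vector matrix stored by columns share one memory image.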

Citations

¹ OpenGL is right-handed in object-space and world-space, but left-handed in screen-space. See this Stack Overflow post.
² OpenGL notation is typically column-major, as seen in the OpenGL specification and the OpenGL reference manual, but in memory the basis vectors are stored contiguously, which matches a row-major layout. See question 9.005 in the FAQ on this page.
