5. Matrix operations#

An \(m\times n\)-matrix is an array of \(m\) rows and \(n\) columns containing (usually real or complex) numbers called the entries (or coefficients) of the matrix. The entries of a matrix \(A\) are denoted by \(A_{i,j}\), \(a_{ij}\) or similar, where the first index refers to the row and the second to the column. That is, the entries are laid out like

(5.1)#\[\begin{split} A = \begin{pmatrix} a_{1,1} & a_{1,2} & \cdots & a_{1,n}\\ a_{2,1} & a_{2,2} & \cdots & a_{2,n}\\ \vdots& \vdots& \ddots & \vdots\\ a_{m,1} & a_{m,2} & \cdots & a_{m,n} \end{pmatrix}. \end{split}\]

You will often see notation like \(A=(a_{i,j})\) or \(A=(a_{i,j})_{1\le i\le m\atop 1\le j\le n}\) to indicate how the entries are denoted and/or how many rows and columns the matrix has.


In quantum physics, matrices are very often square and have some additional properties that make them suitable as operators on a physical system or observables of such a system. We will come back to this in the section on Hilbert spaces and operators.

In this section we focus on the basic operations that can be performed with matrices.

5.1. Addition and scalar multiplication#

Like vectors, matrices can be added or subtracted entrywise. They can also be multiplied entrywise by a given scalar. For example,

\[\begin{split} 4\begin{pmatrix}1& 0\\ 3& -1\end{pmatrix} - 2\begin{pmatrix}0& 2\\ 1& -1\end{pmatrix} = \begin{pmatrix}4& 0\\ 12& -4\end{pmatrix} - \begin{pmatrix}0& 4\\ 2& -2\end{pmatrix} = \begin{pmatrix}4& -4\\ 10& -2\end{pmatrix}. \end{split}\]
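As a quick numerical check, these entrywise operations map directly onto array arithmetic in NumPy (the use of NumPy here is our own illustration; the text itself does not depend on any library):

```python
import numpy as np

# The matrices from the example above; NumPy applies +, - and scalar
# multiplication entrywise, exactly as defined in this section.
A = np.array([[1, 0], [3, -1]])
B = np.array([[0, 2], [1, -1]])

result = 4 * A - 2 * B
print(result)  # [[ 4 -4]
               #  [10 -2]]
```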

5.2. Special matrices#

A few (types of) matrices occur very often and have their own names:

  • the \(m\times n\) zero matrix \(\begin{pmatrix} 0 & 0 & \cdots & 0\\ 0 & 0 & \cdots & 0\\ \vdots& \vdots& \ddots & \vdots\\ 0 & 0 & \cdots & 0 \end{pmatrix}\)

  • the \(n\times n\) identity matrix \(\begin{pmatrix} 1 & 0 & \cdots & 0\\ 0 & 1 & \cdots & 0\\ \vdots& \vdots& \ddots & \vdots\\ 0 & 0 & \cdots & 1 \end{pmatrix}\)

  • diagonal matrices \(\begin{pmatrix} a_{1,1} & 0 & \cdots & 0\\ 0 & a_{2,2} & \cdots & 0\\ \vdots& \vdots& \ddots & \vdots\\ 0 & 0 & \cdots & a_{n,n} \end{pmatrix}\)

  • upper triangular matrices \(\begin{pmatrix} a_{1,1} & a_{1,2} & \cdots & a_{1,n}\\ 0 & a_{2,2} & \cdots & a_{2,n}\\ \vdots& \ddots& \ddots & \vdots\\ 0 & \cdots & 0 & a_{n,n} \end{pmatrix}\)

  • lower triangular matrices \(\begin{pmatrix} a_{1,1} & 0 & \cdots & 0\\ a_{2,1} & a_{2,2} & \ddots & \vdots\\ \vdots& \vdots& \ddots & 0\\ a_{n,1} & a_{n,2} & \cdots & a_{n,n} \end{pmatrix}\)
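All of the special matrices above have ready-made constructors in NumPy; a small sketch (the variable names are ours):

```python
import numpy as np

Z = np.zeros((2, 3))          # the 2x3 zero matrix
I = np.eye(3)                 # the 3x3 identity matrix
D = np.diag([1.0, 2.0, 3.0])  # diagonal matrix with the given diagonal entries

A = np.arange(1.0, 10.0).reshape(3, 3)
U = np.triu(A)  # upper triangular part of A (entries below the diagonal zeroed)
L = np.tril(A)  # lower triangular part of A
```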

5.3. Matrices as linear maps#

Very often, an \(m\times n\)-matrix \(A\) represents a linear map from the space \(\mathbb{R}^n\) of vectors of length \(n\) to the space \(\mathbb{R}^m\) of vectors of length \(m\). More precisely, a matrix \(A\) as in (5.1) sends a vector

\[\begin{split} \vv = \begin{pmatrix}v_1\\ \vdots\\ v_n\end{pmatrix} \in\mathbb{R}^n \end{split}\]

to the vector

\[\begin{split} A\vv=\begin{pmatrix}a_{1,1}v_1 + \cdots + a_{1,n}v_n\\ \vdots\\ a_{m,1}v_1 + \cdots + a_{m,n}v_n\end{pmatrix} \in\mathbb{R}^m. \end{split}\]

Equivalently, we can write

(5.2)#\[ (A\vv)_i = \sum_{j=1}^n A_{i,j} v_j\quad(1\le i\le m). \]

This operation is called matrix-vector multiplication.
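Formula (5.2) can be spelled out directly in code and compared with NumPy's built-in matrix-vector product; the specific matrix and vector below are just an illustration:

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])   # a 2x3 matrix, so m = 2, n = 3
v = np.array([1, 0, -1])

# Entry by entry, following (5.2): (Av)_i = sum_j A[i, j] * v[j]
Av = np.array([sum(A[i, j] * v[j] for j in range(3)) for i in range(2)])

print(Av, A @ v)  # both give [-2 -2]
```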

Property 5.1 (Properties of matrix-vector multiplication)

For matrices \(A\) and \(B\), vectors \(\vv\) and \(\vw\), and a scalar \(c\), the following hold whenever the expressions are defined:

  • \(A(\vv+\vw)= A\vv + A\vw\)

  • \(A(c\vv) = c(A\vv)\)

  • \((A+B)\vv = A\vv+B\vv\)

  • \((cA)\vv = c(A\vv)\)
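These four identities are exact for integer matrices, so they can be verified numerically without rounding concerns; a sketch using randomly chosen integer matrices (our own illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.integers(-3, 3, size=(2, 3))
B = rng.integers(-3, 3, size=(2, 3))
v = rng.integers(-3, 3, size=3)
w = rng.integers(-3, 3, size=3)
c = 5

assert (A @ (v + w) == A @ v + A @ w).all()  # linearity in the vector
assert (A @ (c * v) == c * (A @ v)).all()
assert ((A + B) @ v == A @ v + B @ v).all()  # linearity in the matrix
assert ((c * A) @ v == c * (A @ v)).all()
```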


Here are two other useful ways to think about matrix-vector products.

First, in terms of inner products (see Hilbert spaces and operators): the vector \(A\vv\) consists of the inner products of the rows of \(A\) with the vector \(\vv\). Namely, if

\[\begin{split} A = \begin{pmatrix}\va_1\\ \vdots\\ \va_m\end{pmatrix}, \end{split}\]

then

\[\begin{split} A\vv = \begin{pmatrix}\va_1\cdot\vv\\ \vdots\\ \va_m\cdot\vv\end{pmatrix}. \end{split}\]

Second, in terms of linear combinations: the vector \(A\vv\) is a linear combination of the columns of \(A\), with coefficients given by the entries of \(\vv\). Namely, if

\[\begin{split} A = (\va_1\ \va_2\ \cdots\ \va_n)\quad\text{and}\quad \vv = \begin{pmatrix}v_1\\ \vdots\\ v_n\end{pmatrix}, \end{split}\]

then

\[ A\vv = v_1 \va_1 + v_2 \va_2 + \cdots + v_n \va_n. \]
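The column-combination viewpoint is easy to test numerically: summing the columns of \(A\) weighted by the entries of \(\vv\) reproduces the matrix-vector product. A small sketch (the particular numbers are ours):

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4],
              [5, 6]])
v = np.array([10, -1])

# A v as a linear combination of the columns of A, with coefficients v_j.
combo = v[0] * A[:, 0] + v[1] * A[:, 1]

assert (combo == A @ v).all()
print(combo)  # [ 8 26 44]
```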

5.4. Matrix multiplication#

Two matrices \(A\) and \(B\) can be multiplied if the number of columns of \(A\) equals the number of rows of \(B\). More precisely, if \(A\) is an \(m\times n\)-matrix and \(B\) is an \(n\times p\)-matrix, then the (matrix) product of \(A\) and \(B\) is the \(m\times p\)-matrix \(AB\) defined as follows:

\[ (AB)_{i,k} = \sum_{j=1}^n A_{i,j} B_{j,k} \quad(1\le i\le m,1\le k\le p). \]

This can be written as \(p\) matrix-vector multiplications: if \(B=(\vb_1\ \vb_2\ \cdots\ \vb_p)\), then

\[ AB = (A\vb_1\ A\vb_2\ \cdots\ A\vb_p). \]


Matrix multiplication is defined so that the product \(AB\) represents the linear map obtained by first applying \(B\) and then \(A\). In other words, we have

(5.3)#\[ (AB)\vv = A(B\vv). \]

Note that on the left we have one matrix multiplication and one matrix-vector multiplication, while on the right we just have two matrix-vector multiplications.
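Identity (5.3) is exact in integer arithmetic, so we can check it numerically; a sketch with randomly chosen integer matrices (our own illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.integers(-5, 5, size=(2, 3))
B = rng.integers(-5, 5, size=(3, 4))
v = rng.integers(-5, 5, size=4)

# One matrix product plus one matrix-vector product ...
left = (A @ B) @ v
# ... versus two matrix-vector products.
right = A @ (B @ v)

assert (left == right).all()
```

Although both sides are equal, they are not equally expensive: evaluating \(A(B\vv)\) costs two matrix-vector products, whereas \((AB)\vv\) first forms the full product \(AB\), which is much more work when the matrices are large.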


Matrix multiplication is not commutative! That is, in general we have

\[AB\ne BA.\]

Definition 5.1 (Commutator)

Given two \(n\times n\)-matrices \(A\) and \(B\), the commutator of \(A\) and \(B\) is

\[ [A,B]=AB-BA. \]

Commutators play an extremely important role in quantum physics.
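A short sketch of the commutator in code; the matrices below are our own choice, picked so that the commutator is visibly non-zero:

```python
import numpy as np

def commutator(A, B):
    """[A, B] = AB - BA."""
    return A @ B - B @ A

A = np.array([[0, 1], [0, 0]])
B = np.array([[0, 0], [1, 0]])
print(commutator(A, B))  # [[ 1  0]
                         #  [ 0 -1]]
```

That the result is non-zero confirms that these two matrices do not commute.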

5.5. Transpose and adjoint#

Definition 5.2 (Transpose)

The transpose of an \(m\times n\)-matrix \(A\) is the \(n\times m\)-matrix \(A^\top\) defined by

\[ (A^\top)_{i,j}=A_{j,i}. \]

Definition 5.3 (Adjoint)

The adjoint of an \(m\times n\)-matrix \(A\) is the \(n\times m\)-matrix \(A^\dagger\) obtained by applying the complex conjugate and the transpose operation:

\[ (A^\dagger)_{i,j}=\overline{A_{j,i}}. \]

The adjoint of \(A\) is also known as the conjugate transpose or Hermitian transpose of \(A\). The term "adjoint" is sometimes also used for the adjugate of a matrix (the transpose of its cofactor matrix), but we will not use the term in that sense.
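In NumPy, the transpose and adjoint correspond to `.T` and `.conj().T` respectively; a sketch with a complex matrix of our own choosing:

```python
import numpy as np

A = np.array([[1 + 2j, 0],
              [3j,     4],
              [5,     -1j]])  # a 3x2 complex matrix

A_T = A.T            # transpose: (A^T)_{i,j} = A_{j,i}, shape (2, 3)
A_dag = A.conj().T   # adjoint: entries are also complex-conjugated

assert A_T.shape == (2, 3)
assert A_dag[0, 1] == -3j  # conjugate of A[1, 0] = 3j
```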

5.6. Trace, determinant and characteristic polynomial#

Definition 5.4 (Trace)

The trace of an \(n\times n\)-matrix \(A\) is the sum of the diagonal entries of \(A\):

\[ \tr A = \sum_{i=1}^n A_{i,i}. \]
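In code the trace is just a sum over the diagonal; NumPy provides it directly (the example matrix is ours):

```python
import numpy as np

A = np.array([[2, 7],
              [0, -3]])

# tr A = sum of the diagonal entries = 2 + (-3) = -1
assert np.trace(A) == sum(A[i, i] for i in range(2)) == -1
```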

The determinant of an \(n\times n\)-matrix \(A\) can be defined in various ways; you may have seen a different definition than the one below. We will make use of permutations.

We write \(S_n\) for the set of all permutations of \(\{1,\ldots,n\}\); then \(S_n\) has \(n!\) elements. To any \(\sigma\in S_n\) we can attach a sign \(\sign(\sigma)\in\{\pm1\}\), defined as \((-1)^{N(\sigma)}\), where \(N(\sigma)\) is the number of transpositions in any decomposition of \(\sigma\) into transpositions. (This is well defined: the decomposition itself is not unique, but the parity of \(N(\sigma)\) is.)

Definition 5.5 (Determinant)

The determinant of an \(n\times n\)-matrix \(A\) is defined as

\[ \det A = \sum_{\sigma\in S_n} \sign(\sigma) \prod_{i=1}^n A_{i,\sigma(i)}. \]

Using the Einstein summation convention and the Levi-Civita symbol commonly used in physics, we can also write this as

\[ \det A=\epsilon^{i_1\cdots i_n} A_{1,i_1}\cdots A_{n,i_n}. \]

Note that except in small cases, this is usually not how one computes a determinant. One option is to expand along a row or column, but this is not much more efficient than using the definition directly. To compute the determinant of a larger matrix \(A\), it is most convenient to apply row (or column) operations to reduce \(A\) to triangular (or echelon) form and to use properties 4–6 below.
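For small matrices the permutation formula of Definition 5.5 can nonetheless be implemented directly and compared with a library routine; a sketch (the helper names `sign` and `det_by_permutations` are ours, and the sign is computed by counting inversions, which has the same parity as the transposition count):

```python
import numpy as np
from itertools import permutations

def sign(sigma):
    """Sign of a permutation: (-1)**(number of inversions),
    i.e. pairs i < j with sigma[i] > sigma[j]."""
    n = len(sigma)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n)
                     if sigma[i] > sigma[j])
    return (-1) ** inversions

def det_by_permutations(A):
    """Determinant via the sum over S_n in Definition 5.5."""
    n = A.shape[0]
    return sum(sign(sigma) * np.prod([A[i, sigma[i]] for i in range(n)])
               for sigma in permutations(range(n)))

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
print(det_by_permutations(A))          # -2.0
print(round(np.linalg.det(A), 10))     # -2.0
```

Since the sum runs over all \(n!\) permutations, this is hopeless for large \(n\); `np.linalg.det` instead works via a matrix factorisation, in the spirit of the row-reduction strategy described above.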

Property 5.2 (Properties of the determinant)

  1. A square matrix \(A\) is invertible if and only if \(\det(A)\) is non-zero.

  2. Given two \(n\times n\)-matrices \(A\) and \(B\), we have

    \[ \det(AB)=(\det A)(\det B). \]
  3. The determinant of a diagonal matrix (and more generally of an upper or lower triangular matrix) is the product of the diagonal entries.

  4. If \(B\) is obtained from \(A\) by scaling a row by a scalar \(c\), then \(\det B=c\det A\).

  5. If \(B\) is obtained from \(A\) by swapping two rows, then \(\det B=-\det A\).

  6. If \(B\) is obtained from \(A\) by adding a multiple of a row to another row, then \(\det B=\det A\).

Properties 4–6 also hold for columns instead of rows.
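Properties 4–6 can be checked numerically on a concrete matrix; a sketch (the matrix and the row operations are our own choices, and `np.isclose` absorbs floating-point rounding):

```python
import numpy as np

A = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 3.0],
              [4.0, 0.0, 1.0]])
d = np.linalg.det(A)

# Property 4: scaling a row by c multiplies the determinant by c.
B = A.copy(); B[0] *= 5.0
assert np.isclose(np.linalg.det(B), 5.0 * d)

# Property 5: swapping two rows flips the sign.
C = A[[1, 0, 2]]
assert np.isclose(np.linalg.det(C), -d)

# Property 6: adding a multiple of one row to another leaves det unchanged.
D = A.copy(); D[2] += 7.0 * D[0]
assert np.isclose(np.linalg.det(D), d)
```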

Finally, we define the characteristic polynomial of a square matrix; this will be used in Diagonalisation.

Definition 5.6 (Characteristic polynomial)

The characteristic polynomial of an \(n\times n\)-matrix \(A\) is the polynomial of degree \(n\) in a variable \(t\) defined by

\[ \chi_A(t) = \det(t\id-A). \]


An alternative definition of the characteristic polynomial (which agrees with the one above up to a factor \((-1)^n\)) is

\[ \chi_A(t) = \det(A-t\id). \]
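NumPy can produce the coefficients of \(\chi_A(t)=\det(t\id-A)\) via `np.poly`, which computes them from the eigenvalues of \(A\) (so the result is numerical, not exact); a sketch on a matrix of our own choosing:

```python
import numpy as np

A = np.array([[0.0, 1.0],
              [1.0, 0.0]])

# Coefficients of det(t*I - A), highest power of t first.
coeffs = np.poly(A)  # expect t^2 - 1, i.e. approximately [1, 0, -1]
assert np.allclose(coeffs, [1.0, 0.0, -1.0])
```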

5.7. Exercises#

Exercise 5.1

Show that the definition of the matrix product \(AB\) is the only one for which (5.3) holds.

Exercise 5.2

Compute \([A,B]\) for \(A=\begin{pmatrix}1& 2\\0& 1\end{pmatrix}\) and \(B=\begin{pmatrix}1& 0\\-1& 1\end{pmatrix}\).

Exercise 5.3

Show that if \(A\) is an \(m\times n\)-matrix and \(B\) is an \(n\times m\)-matrix, then we have \(\tr(AB)=\tr(BA)\).

Exercise 5.4

Show that if \(A\) is an \(n\times n\)-matrix and \(c\) is a scalar, then

\[ \tr(cA)=c\tr A \quad\text{and}\quad \det(cA)=c^n\det A. \]

Exercise 5.5

Compute the characteristic polynomial of the matrices

\[\begin{split} \begin{pmatrix}2& -1\\ 3& 2\end{pmatrix} \end{split}\]

and

\[\begin{split} \begin{pmatrix}1& 0& a\\ 0& -1& b\\ 1& 0& c\end{pmatrix} \end{split}\]

(where \(a\), \(b\), \(c\) are scalars).