3.2. Matrix operations

3.2.1. Introduction

In Chapter 2 matrices were introduced to represent systems of linear equations. The coefficients of a linear system were put into the coefficient matrix $A$, and a system as a whole could be squeezed into the augmented matrix. In Section 3.1 we used matrices to construct linear transformations. In this chapter we will study matrices as entities on their own, though every now and then we will keep in mind their role in the two contexts just mentioned.

3.2.2. Sum, scalar multiple and transpose

In this section we will define the sum and the product of two matrices, and the transpose of a matrix. Recall that an $m\times n$-matrix has $m$ (horizontal) rows of size $n$ or, equivalently, $n$ (vertical) columns of size $m$.

Definition 3.2.1 (Equality of matrices)

Two matrices are said to have the same size if they have the same number of rows and the same number of columns.

Two matrices $A$ and $B$ are equal if they have the same size, say $m$ rows and $n$ columns, and all the corresponding entries are equal, i.e.

\[ a_{ij} = b_{ij},\,\quad \text{for} \quad i = 1,\ldots,m, \quad j = 1,\ldots,n. \]

Definition 3.2.2

A zero matrix $O$ is a matrix with all entries equal to 0. If the context requires clarity as to its size it may be denoted by $O_{mn}$.

Definition 3.2.3 (Scalar multiplication)

If $A$ is an $m\times n$-matrix and $c$ is a scalar, then $cA$ is the $m \times n$-matrix that is the result of multiplying each entry of $A$ by $c$:

\[\begin{split} c \left(\begin{array}{cccc} a_{11} & a_{12}& \cdots& a_{1n} \\ a_{21} & a_{22}& \cdots& a_{2n} \\ \vdots & \vdots& & \vdots \\ a_{m1} & a_{m2}& \cdots& a_{mn} \end{array} \right)= \left(\begin{array}{cccc} ca_{11} & ca_{12}& \cdots& ca_{1n} \\ ca_{21} & ca_{22}& \cdots& ca_{2n} \\ \vdots & \vdots& & \vdots \\ ca_{m1} & ca_{m2}& \cdots& ca_{mn} \end{array} \right). \end{split}\]

We then say that $cA$ is a scalar multiple of $A$, or simply a multiple of $A$.

Definition 3.2.4 (The sum of two matrices)

If $A$ and $B$ are two $m\times n$-matrices then the sum $A+B$ is the $m\times n$-matrix of which the entry on the position $(i,j)$ is the sum of the corresponding entries of $A$ and $B$:

\[\begin{split} \left(\begin{array}{cccc} a_{11} & a_{12}& \cdots& a_{1n} \\ a_{21} & a_{22}& \cdots& a_{2n} \\ \vdots & \vdots& & \vdots \\ a_{m1} & a_{m2}& \cdots& a_{mn} \end{array} \right) + \left(\begin{array}{cccc} b_{11} & b_{12}& \cdots& b_{1n} \\ b_{21} & b_{22}& \cdots& b_{2n} \\ \vdots & \vdots& & \vdots \\ b_{m1} & b_{m2}& \cdots& b_{mn} \end{array} \right)= \end{split}\]

\[\begin{split} = \left(\begin{array}{cccc} a_{11}+b_{11} & a_{12}+b_{12}& \cdots& a_{1n}+b_{1n} \\ a_{21}+b_{21} & a_{22}+b_{22}& \cdots& a_{2n}+b_{2n} \\ \vdots & \vdots& & \vdots \\ a_{m1}+b_{m1} & a_{m2}+b_{m2}& \cdots& a_{mn}+b_{mn} \end{array} \right). \end{split}\]

If $A$ and $B$ are not of the same size their sum is not defined.

Example 3.2.1

\[\begin{split} \begin{pmatrix} 1 & 3 \\ 5 & 2 \\ 6 & -4 \end{pmatrix} + \begin{pmatrix} 3 & 2 \\ 4 & -5 \\ 2 & 5 \end{pmatrix} = \begin{pmatrix} 4 & 5 \\ 9 & -3 \\ 8 & 1 \end{pmatrix}, \end{split}\]

\[\begin{split} \begin{pmatrix} 1 & 3 \\ 5 & 2 \\ 6 & -4 \end{pmatrix} + \begin{pmatrix} 0 & 0 \\ 0 & 0 \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \\ 0 & 0 \end{pmatrix} + \begin{pmatrix} 1 & 3 \\ 5 & 2 \\ 6 & -4 \end{pmatrix} = \begin{pmatrix} 1 & 3 \\ 5 & 2 \\ 6 & -4 \end{pmatrix}, \end{split}\]

\[\begin{split} \begin{array}{lcl} \begin{pmatrix} 1 & 3 & 5 \\ 2 & 4 & 1 \end{pmatrix} + (-1)\begin{pmatrix} 1 & 3 & 5 \\ 2 & 4 & 1 \end{pmatrix} &=& \begin{pmatrix} 1 & 3 & 5 \\ 2 & 4 & 1 \end{pmatrix} + \begin{pmatrix} -1 & -3 &-5 \\ -2 & -4 & -1 \end{pmatrix} \\ &=& \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}. \end{array} \end{split}\]

The multiple $(-1)A$ is also written as $-A$. An obvious property, illustrated in the third example, is:

\[ A + (-A) = O, \]

where $O$ is the zero matrix.

Example 3.2.2

\[\begin{split} \begin{pmatrix} 1 & 3 \\ 5 & 2 \\ 6 & -4 \end{pmatrix} + \begin{pmatrix} 1 & 3 & 5 \\ 2 & 4 & 1 \end{pmatrix} \end{split}\]

is not defined. This is because the matrices do not have the same size.
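The componentwise definitions translate directly into code. The following is a minimal sketch in plain Python, assuming a matrix is stored as a list of rows; the helper names `same_size`, `add` and `scale` are illustrative choices, not notation from this text.

```python
def same_size(A, B):
    """True if A and B have the same number of rows and of columns."""
    return len(A) == len(B) and all(len(ra) == len(rb) for ra, rb in zip(A, B))


def add(A, B):
    """Entrywise sum A + B; raises ValueError if the sizes differ."""
    if not same_size(A, B):
        raise ValueError("sum is not defined: the matrices differ in size")
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def scale(c, A):
    """Scalar multiple cA: every entry of A is multiplied by c."""
    return [[c * a for a in row] for row in A]


A = [[1, 3], [5, 2], [6, -4]]
B = [[3, 2], [4, -5], [2, 5]]
M = [[1, 3, 5], [2, 4, 1]]

print(add(A, B))              # first sum of Example 3.2.1
print(add(M, scale(-1, M)))   # M + (-M): the 2 x 3 zero matrix
```

Calling `add(A, M)` raises a `ValueError`, which mirrors the fact that the sum in Example 3.2.2 is not defined.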

Remark 3.2.1

The two definitions of sum and scalar multiple are called componentwise definitions. They are completely analogous to the definitions of the scalar multiples of a vector and the sum of two vectors. Hence it is not surprising that they obey exactly the same rules, as is summarised in the next proposition (cf. Section Vectors).

Proposition 3.2.1

Suppose $A, B$ and $C$ are $m\times n$-matrices and let $c_{1},c_{2}$ be two real numbers. Then we have:

i. $A+O_{mn}=A=O_{mn}+A$.
ii. $(A+B)+C=A+(B+C)$.
iii. $A+B=B+A$.
iv. $A+(-A)=O$.
v. $1A=A$.
vi. $c_{1}(A+B)=c_{1}A+c_{1}B$.
vii. $(c_{1}+c_{2})A=c_{1}A+c_{2}A$.
viii. $c_{1}(c_{2}A)=(c_{1}c_{2})A$.

An operator of which the usefulness is not immediately clear, but which fits well in this section with matrix operations, is the following:

Definition 3.2.5

The transpose of an $m \times n$-matrix $A$ with entries $a_{ij}$ is the $n \times m$-matrix $B$ with entries $b_{ij}$ defined by

\[ b_{ij} = a_{ji}, \quad i = 1,\ldots,n,\,\,j=1,\ldots, m. \]

It is denoted by $B = A^T$.

Example 3.2.3

\[\begin{split} \begin{pmatrix} 1 & 3 \\ 5 & 2 \\ 6 & 4 \end{pmatrix}^T = \begin{pmatrix} 1 & 5 & 6 \\ 3 & 2 & 4 \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} -1 & 2 & -4 & 0\end{pmatrix}^T = \begin{pmatrix} -1 \\ 2 \\-4 \\ 0\end{pmatrix}. \end{split}\]
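The index swap $b_{ij} = a_{ji}$ in the definition of the transpose can be sketched in a few lines of Python. This assumes matrices stored as lists of rows; the function name `transpose` is an illustrative choice.

```python
def transpose(A):
    """Return A^T: entry (i, j) of the result is entry (j, i) of A."""
    rows, cols = len(A), len(A[0])
    return [[A[j][i] for j in range(rows)] for i in range(cols)]


A = [[1, 3], [5, 2], [6, 4]]
row = [[-1, 2, -4, 0]]          # a 1 x 4 matrix

print(transpose(A))             # [[1, 5, 6], [3, 2, 4]]
print(transpose(row))           # [[-1], [2], [-4], [0]]
```

Transposing twice returns the original matrix, in line with the rule $(A^T)^T = A$.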

The following rules involving the three operators defined so far in this section are easy to prove:

Proposition 3.2.2

Let $A$ and $B$ be $m\times n$-matrices and $c$ a scalar. Then we have

(i) $(cA)^T = c A^T$.
(ii) $(A+B)^T = A^T + B^T$.
(iii) $(A^T)^T = A$.

Proof of Proposition 3.2.2

We will prove the second statement and leave the other two to the diligent reader. See Exercise 3.2.1.

So, suppose $A$ and $B$ are two $m \times n$-matrices. Then $A+B$ is an $m \times n$-matrix too, hence $(A+B)^T$ is an $n \times m$-matrix. The matrix $A^T + B^T$ on the right-hand side of the equation is the sum of two $n \times m$-matrices, which is again an $n \times m$-matrix. So the matrices on both sides of the equation have the same size.

Next we have to show that they have equal entries on the corresponding positions. If we put

\[ E = (A+B)^T \quad \text{and}\quad F = A^T + B^T \]

we see that

\[ e_{ij} = \text{ entry of } (A+B) \,\text{ on position }(j,i) \]

and

\[\begin{split} \begin{array}{rl} f_{ij} &= \text{ entry of } A^T \,\text{ on position }(i,j)\,+ \text{ entry of } B^T \,\text{ on position }(i,j) \\ &= \text{ entry of } A \,\text{ on position }(j,i)\,+ \text{ entry of } B \,\text{ on position }(j,i)\\ &= \text{ entry of } (A+B) \,\text{ on position }(j,i)\\ &= \,\,\,\,e_{ij}, \end{array} \end{split}\]

so we are done.

If you are lost in the forest of indices, have a look at Example 3.2.4.

Example 3.2.4

We check property (ii) for two general $3\times 4$-matrices $A$ and $B$ on the position $(2,3)$. Let

\[\begin{split} A = \begin{pmatrix} a_{11}& a_{12} & a_{13} & a_{14} \\ a_{21}& a_{22} & a_{23} & a_{24} \\ a_{31} & \fbox{$a_{32}$} & a_{33} & a_{34} \end{pmatrix} \quad \text{and} \quad B = \begin{pmatrix} b_{11}& b_{12} & b_{13} & b_{14} \\ b_{21}& b_{22} & b_{23} & b_{24} \\ b_{31} & \fbox{$b_{32}$} & b_{33} & b_{34} \end{pmatrix}. \end{split}\]

Then

\[\begin{split} E = (A+B)^T = \begin{pmatrix} a_{11}+b_{11}& a_{12}+b_{12} & a_{13}+b_{13} & a_{14} +b_{14}\\ a_{21}+b_{21}& a_{22}+b_{22} & a_{23}+b_{23} & a_{24}+b_{24} \\ a_{31}+b_{31} & \fbox{$a_{32}+b_{32}$} & a_{33}+b_{33} & a_{34}+b_{34} \end{pmatrix}^T \end{split}\]

so

\[\begin{split} E = \begin{pmatrix} a_{11}+b_{11}& a_{21}+b_{21} & a_{31}+b_{31} \\ a_{12}+b_{12}& a_{22}+b_{22} & \fbox{$a_{32}+b_{32}$} \\ a_{13}+b_{13}& a_{23}+b_{23} & a_{33}+b_{33} \\ a_{14} +b_{14}& a_{24}+b_{24} & a_{34}+b_{34} \end{pmatrix}, \end{split}\]

and on position $(2,3)$ we have $a_{32}+b_{32}$.

On the other hand

\[\begin{split} F = A^T + B^T = \begin{pmatrix} a_{11}& a_{21} & a_{31} \\ a_{12}& a_{22} & \fbox{$a_{32}$} \\ a_{13}& a_{23} & a_{33}\\ a_{14}& a_{24} & a_{34}\end{pmatrix} + \begin{pmatrix} b_{11}& b_{21} & b_{31} \\ b_{12}& b_{22} & \fbox{$b_{32}$} \\ b_{13}& b_{23} & b_{33}\\ b_{14}& b_{24} & b_{34} \end{pmatrix}, \end{split}\]

which has the value $a_{32} + b_{32}$ on position $(2,3)$ as well.

Exercise 3.2.1

Prove statements (i) and (iii) of Proposition 3.2.2.

Solution to Exercise 3.2.1

Suppose $A = \left(\begin{array}{cccc} a_{11} & a_{12}& \cdots& a_{1n} \\ a_{21} & a_{22}& \cdots& a_{2n} \\ \vdots & \vdots& & \vdots \\ a_{m1} & a_{m2}& \cdots& a_{mn} \end{array} \right)$ is an arbitrary $m \times n$-matrix. Then

\[\begin{split} \begin{array}{rcl} (cA)^T &=& \left(\begin{array}{cccc} ca_{11} & ca_{12}& \cdots& ca_{1n} \\ ca_{21} & ca_{22}& \cdots& ca_{2n} \\ \vdots & \vdots& & \vdots \\ ca_{m1} & ca_{m2}& \cdots& ca_{mn} \end{array} \right)^T = \left(\begin{array}{cccc} ca_{11} & ca_{21}& \cdots& ca_{m1} \\ ca_{12} & ca_{22}& \cdots& ca_{m2} \\ \vdots & \vdots& & \vdots \\ ca_{1n} & ca_{2n}& \cdots& ca_{mn} \end{array} \right) \\ &=& c \left(\begin{array}{cccc} a_{11} & a_{21}& \cdots& a_{m1} \\ a_{12} & a_{22}& \cdots& a_{m2} \\ \vdots & \vdots& & \vdots \\ a_{1n} & a_{2n}& \cdots& a_{mn} \end{array} \right) = c\,A^T \end{array} \end{split}\]

As regards the other statement, i.e., $(A^T)^T = A$, suppose $A$ is an $m\times n$-matrix, $B = A^T$, and $C = B^T$. We have to show that $C = A$.

Now first of all, if $A$ is an $m\times n$-matrix, then $B$ is an $n\times m$-matrix, and transposing again gives an $m \times n$-matrix $C$, so $C$ has the same shape as $A$.

Furthermore, transposing means ‘flipping’ the indices. We quickly see that $C_{ij} = B_{ji} = A_{ij}$, for $1 \leq i \leq m$, $1 \leq j \leq n$, so entry by entry $A$ and $C$ are equal.

Example 3.2.5

We will solve the equation $A + 2X^T + B = C$ for $X$, where

\[\begin{split} A = \begin{pmatrix} 1 & 1 & 2 \\ 3 & 1 & 0 \end{pmatrix}, \quad B = \begin{pmatrix} 2 & 0 & 3 \\ 2 & 3 & 4 \end{pmatrix}, \text{ and} \quad C = \begin{pmatrix} 7 & 5 & 1 \\ 1 & 4 & 2 \end{pmatrix}. \end{split}\]

We will extricate $X$ step by step:

\[ A + 2X^T + B = C \,\, \iff \,\,2X^T = C-A-B \,\, \iff \,\, X^T = \tfrac12(C-A-B). \]

Next we transpose both sides to find

\[\begin{split} X = \tfrac12(C-A-B)^T = \frac12\begin{pmatrix} 4 & 4 & -4 \\ -4 & 0 & -2 \end{pmatrix}^T = \begin{pmatrix} 2 & -2 \\ 2 & 0 \\ -2 & -1 \end{pmatrix}. \end{split}\]
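As a sanity check, the solution of Example 3.2.5 can be recomputed and substituted back into the equation. The sketch below uses Python's `fractions.Fraction` to keep the factor $\tfrac12$ exact; the helpers `transpose` and `combine` are hypothetical names introduced only for this illustration.

```python
from fractions import Fraction


def transpose(M):
    """Return M^T for a matrix stored as a list of rows."""
    return [list(col) for col in zip(*M)]


def combine(c1, M1, c2, M2):
    """Entrywise c1*M1 + c2*M2 for two matrices of the same size."""
    return [[c1 * a + c2 * b for a, b in zip(r1, r2)] for r1, r2 in zip(M1, M2)]


A = [[1, 1, 2], [3, 1, 0]]
B = [[2, 0, 3], [2, 3, 4]]
C = [[7, 5, 1], [1, 4, 2]]

# X^T = (1/2)(C - A - B); transposing gives X.
rhs = combine(1, combine(1, C, -1, A), -1, B)
half = Fraction(1, 2)
X = transpose([[half * entry for entry in row] for row in rhs])

# Substitute back: A + 2 X^T + B should reproduce C.
lhs = combine(1, combine(1, A, 2, transpose(X)), 1, B)
print(X == [[2, -2], [2, 0], [-2, -1]])
print(lhs == C)
```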

3.2.3. Grasple exercises (1)

Grasple Exercise 3.2.1

https://embed.grasple.com/exercises/bc898154-3f5e-45bd-8993-28a74bf34b5f?id=70278

To compute the sum of two matrices.

Grasple Exercise 3.2.2

https://embed.grasple.com/exercises/bf170c2b-127b-4ce7-bd75-c9c9bdfb12f9?id=70277

To compute $c_1A + c_2B$.

Grasple Exercise 3.2.3

https://embed.grasple.com/exercises/dd83bd83-0ce4-4dd7-84de-3472c24acbc0?id=70279

To compute $c_1A + c_2B$.

Grasple Exercise 3.2.4

https://embed.grasple.com/exercises/3e5f0674-1e9f-4349-867f-6b1d638e744b?id=82934

To solve equations involving sum and transpose.

Grasple Exercise 3.2.5

https://embed.grasple.com/exercises/52a8c0e8-09e8-4c46-aa43-0cee2a93e7c4?id=82931

True/False questions involving sum and transpose.

3.2.4. The product of two matrices

Next we turn our attention to the most important matrix operation, namely the product $AB$ of two matrices. In the previous chapter we have already seen the special case where $B$ is a matrix of just one column, i.e.

\[\begin{split} B = \mathbf{x} = \begin{pmatrix}x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}, \end{split}\]

a vector in $\mathbb{R}^n$, which we can identify with an $n \times 1$-matrix. We want of course the definition of the general matrix product to be consistent with this.

Definition 3.2.6

The product of an $m\times n$-matrix $A$ and an $n\times p$-matrix $B = (\,\mathbf{b}_1\quad \mathbf{b}_2\quad \cdots \quad \mathbf{b}_p)$ is defined by

\[ AB = (\,A\mathbf{b}_1\quad A\mathbf{b}_2\quad \cdots \quad A\mathbf{b}_p). \]

So we have

\[ j\text{-th column of } AB = A\text{ times $j$-th column of } B, \quad j = 1,2,\ldots,p. \]

Note that this makes $AB$ an $m \times p$-matrix.

If the number of columns of $A$ is not equal to the number of rows of $B$ the product $AB$ is not defined.

Example 3.2.6

\[\begin{split} \begin{pmatrix} 1 & -3 \\ -1 & 2 \\ 3& -2 \end{pmatrix} \begin{pmatrix} 2 & 1 & 1\\ 3 & 0 & 2 \end{pmatrix} = \begin{pmatrix} -7 & 1 & -5 \\ 4 & -1 & 3 \\ 0 & 3 &-1 \end{pmatrix}. \end{split}\]

For instance, the third column is computed as

\[\begin{split} \begin{pmatrix} 1 & -3 \\ -1 & 2 \\ 3& -2 \end{pmatrix} \begin{pmatrix} 1\\ 2 \end{pmatrix} = 1\begin{pmatrix} 1 \\ -1 \\ 3\end{pmatrix} + 2\begin{pmatrix} -3 \\ 2 \\ -2 \end{pmatrix} \,\,=\, \begin{pmatrix} -5 \\ 3 \\ -1\end{pmatrix}. \end{split}\]
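The column-by-column definition of the product can be mirrored in code. The sketch below assumes matrices stored as lists of rows and vectors as flat lists; `mat_vec` and `mat_mul_by_columns` are hypothetical helper names, and the approach is chosen to follow Definition 3.2.6 rather than for efficiency.

```python
def mat_vec(A, x):
    """A x as a linear combination of the columns of A with weights x_j."""
    m, n = len(A), len(A[0])
    result = [0] * m
    for j in range(n):                  # add x_j times the j-th column of A
        for i in range(m):
            result[i] += x[j] * A[i][j]
    return result


def mat_mul_by_columns(A, B):
    """AB, where the j-th column of AB is A times the j-th column of B."""
    n, p = len(B), len(B[0])
    if len(A[0]) != n:
        raise ValueError("product is not defined: columns of A != rows of B")
    columns = [mat_vec(A, [B[k][j] for k in range(n)]) for j in range(p)]
    # The columns were computed one by one; store the result as a list of rows.
    return [[columns[j][i] for j in range(p)] for i in range(len(A))]


A = [[1, -3], [-1, 2], [3, -2]]
B = [[2, 1, 1], [3, 0, 2]]

print(mat_vec(A, [1, 2]))         # third column of AB
print(mat_mul_by_columns(A, B))   # the product of Example 3.2.6
```

With these helpers, `mat_mul_by_columns(A, A)` raises a `ValueError`, since a $3\times 2$-matrix cannot be multiplied by a $3\times 2$-matrix.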

Proposition 3.2.3

The product of the $m\times n$-matrix $A$ and the $n\times p$-matrix $B$ is the $m\times p$-matrix $C$ for which the entry on the position $(i,j)$ is given by

\[\begin{split} c_{ij} = a_{i1}b_{1j} + a_{i2}b_{2j} + \cdots + a_{in}b_{nj} = \begin{pmatrix}a_{i1} & a_{i2} & \cdots & a_{in} \end{pmatrix} \begin{pmatrix} b_{1j} \\ b_{2j} \\ \vdots \\ b_{nj}\end{pmatrix}. \end{split}\]

This is sometimes called the row-column expansion of the product.

Proof of Proposition 3.2.3

We already saw this row-column expansion in Section 2.4.

The following scheme visualises the row-column expansion:

\[\begin{split} \begin{array}{ccc} & \begin{pmatrix} b_{11} & b_{12}& \cdots& \class{red}{b_{1j}} & \cdots& b_{1p} \\ b_{21} & b_{22}& \cdots& \class{red}{b_{2j}} & \cdots& b_{2p} \\ \vdots & \vdots& & \class{red}{\vdots} & & \vdots \\ b_{n1} & b_{n2}& \cdots& \class{red}{b_{nj}} & \cdots& b_{np} \end{pmatrix} \\ \begin{pmatrix} a_{11} & a_{12}& \cdots& a_{1n} \\ a_{21} & a_{22}& \cdots& a_{2n} \\ \vdots & \vdots& & \vdots \\ \class{red}{a_{i1}} & \class{red}{a_{i2}}& \class{red}{\cdots}& \class{red}{a_{in}} \\ \vdots & \vdots& & \vdots \\ a_{m1} & a_{m2}& \cdots & a_{mn} \end{pmatrix} \!\! & \! \begin{pmatrix} c_{11} & c_{12}& \cdots& c_{1j} &\cdots& c_{1p} \\ c_{21} & c_{22}& \cdots& c_{2j} &\cdots& c_{2p} \\ \vdots & \vdots& & \vdots & & \vdots \\ c_{i1} & c_{i2}& \cdots&\class{red}{c_{ij}} &\cdots& c_{ip} \\ \vdots & \vdots& &\vdots & & \vdots \\ c_{m1} & c_{m2}& \cdots& c_{mj} &\cdots& c_{mp} \end{pmatrix}. \end{array}\end{split}\]

Example 3.2.7

Let us again consider the matrix product

\[\begin{split} \begin{pmatrix} 1 & -3 \\ -1 & 2 \\ 3& -2 \end{pmatrix} \begin{pmatrix} 2 & 1 & 1\\ 3 & 0 & 2 \end{pmatrix} = \begin{pmatrix} -7 & 1 & -5 \\ 4 & -1 & 3 \\ 0 & 3 &-1 \end{pmatrix}. \end{split}\]

The $-5$ on position $(1,3)$ and the $3$ on position $(3,2)$ in the product come from

\[\begin{split} -5 = \begin{pmatrix} 1 & -3 \end{pmatrix} \begin{pmatrix} 1\\ 2 \end{pmatrix} \quad \text{and} \quad 3 = \begin{pmatrix} 3 & -2 \end{pmatrix} \begin{pmatrix} 1\\ 0 \end{pmatrix}. \end{split}\]
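The row-column expansion gives each entry of the product as a single sum. A minimal Python sketch, assuming matrices stored as lists of rows and using 1-based positions $(i,j)$ as in the text (the names `entry` and `mat_mul` are illustrative):

```python
def entry(A, B, i, j):
    """Entry (i, j) of AB, with 1-based i and j: row i of A times column j of B."""
    n = len(B)
    return sum(A[i - 1][k] * B[k][j - 1] for k in range(n))


def mat_mul(A, B):
    """The full product, computed entry by entry."""
    m, p = len(A), len(B[0])
    return [[entry(A, B, i, j) for j in range(1, p + 1)] for i in range(1, m + 1)]


A = [[1, -3], [-1, 2], [3, -2]]
B = [[2, 1, 1], [3, 0, 2]]

print(entry(A, B, 1, 3))   # -5
print(entry(A, B, 3, 2))   # 3
print(mat_mul(A, B))
```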

Exercise 3.2.2

Explain why the product

\[\begin{split} \begin{pmatrix} 1 & -3 \\ -1 & 2 \\ 3& -2 \end{pmatrix}\begin{pmatrix} 1 & -3 \\ -1 & 2 \\ 3& -2 \end{pmatrix} \end{split}\]

is not defined.

Solution to Exercise 3.2.2

The first matrix is $3\times 2$, just as the second matrix. The product is not defined, as the number of columns ($2$) of the first matrix does not match the number of rows ($3$) of the second matrix.

Remark 3.2.2

The product of a matrix $A$ with itself is only defined if $A$ is an $n \times n$-matrix. In that case we use the obvious notation

\[ A^2 = A A. \]

Example 3.2.8

\[\begin{split} \begin{pmatrix} a_{11}& a_{12} & a_{13} \\ a_{21}& a_{22} & a_{23} \\ a_{31}& a_{32} & a_{33} \\ a_{41} & a_{42} & a_{43} \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0\\ 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} a_{11}& a_{12} & a_{13} \\ a_{21}& a_{22} & a_{23} \\ a_{31}& a_{32} & a_{33} \\ a_{41} & a_{42} & a_{43} \end{pmatrix}. \end{split}\]

This example illustrates the existence of a unit element with respect to the multiplication. To identify it we first introduce some more terminology.

Definition 3.2.7

An $n\times n $-matrix $A$ is called a square matrix. So it is a matrix where the number of columns is equal to the number of rows.

For a square matrix $A$ we call the elements $a_{ii}$ the diagonal elements. Together the diagonal elements form the (main) diagonal of $A$.

A square matrix where all non-diagonal elements are equal to $0$ is called a diagonal matrix.

Remark 3.2.3

The other diagonal of a square matrix, the one from bottom left to top right, plays a minor role. For this reason we don’t reserve a name for it. By ‘diagonal’ we will always mean: main diagonal.

Example 3.2.9

Consider the matrices

\[\begin{split} A = \begin{pmatrix} 2 & 2 \\ 3 & 3 \end{pmatrix}, \quad B = \begin{pmatrix} 2 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 6 \end{pmatrix}, \quad C = \begin{pmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{pmatrix}. \end{split}\]

The matrices $A$ and $B$ are square, and only $B$ is a diagonal matrix.

Exercise 3.2.3

Is the following statement true or false?

The $n \times n$ zero matrix $O_{nn}$ is a diagonal matrix.

Solution to Exercise 3.2.3

Recall Definition 3.2.7 and Definition 3.2.2.

$O_{nn}$ is a square matrix that has $0$s everywhere, so definitely all non-diagonal elements are equal to zero.

Therefore, the statement is true.

Exercise 3.2.4

Suppose $A = \begin{pmatrix} \mathbf{a}_1 & \mathbf{a}_2 & \cdots & \mathbf{a}_n \end{pmatrix} $ is an $m\times n$-matrix and
$B= \begin{pmatrix} \mathbf{b}_1 & \mathbf{b}_2 & \cdots & \mathbf{b}_p \end{pmatrix} $ an $m\times p$-matrix. Show that

\[\begin{split} A^TB = \begin{pmatrix} \mathbf{a}_1\ip \mathbf{b}_1 & \mathbf{a}_1\ip\mathbf{b}_2 & \cdots & \mathbf{a}_1\ip \mathbf{b}_p \\ \mathbf{a}_2\ip \mathbf{b}_1 & \mathbf{a}_2\ip\mathbf{b}_2 & \cdots & \mathbf{a}_2\ip \mathbf{b}_p \\ \vdots & \vdots & & \vdots \\ \mathbf{a}_n\ip \mathbf{b}_1 & \mathbf{a}_n\ip\mathbf{b}_2 & \cdots & \mathbf{a}_n\ip \mathbf{b}_p \\ \end{pmatrix},\end{split}\]

where $\mathbf{a}\ip\mathbf{b}$ is the dot product of the vectors $\mathbf{a}$ and $\mathbf{b}$.

Hint

Note that now we can also write the dot product of two (column) vectors in $\mathbb{R}^n$ as a matrix product. Namely

\[ \mathbf{a}\ip\mathbf{b} = \mathbf{a}^T\mathbf{b}. \]

Solution to Exercise 3.2.4

\[\begin{split} \begin{align*} A^TB &= \begin{pmatrix} \mathbf{a}_1 & \mathbf{a}_2 & \cdots & \mathbf{a}_n \end{pmatrix}^T\begin{pmatrix} \mathbf{b}_1 & \mathbf{b}_2 & \cdots & \mathbf{b}_p \end{pmatrix} \\ &= \begin{pmatrix} \mathbf{a}_1^T \\ \mathbf{a}_2^T \\ \vdots \\ \mathbf{a}_n^T \end{pmatrix}\begin{pmatrix} \mathbf{b}_1 & \mathbf{b}_2 & \cdots & \mathbf{b}_p \end{pmatrix} \\ &= \begin{pmatrix} \mathbf{a}_1^T \mathbf{b}_1 & \mathbf{a}_1^T\mathbf{b}_2 & \cdots & \mathbf{a}_1^T \mathbf{b}_p \\ \mathbf{a}_2^T \mathbf{b}_1 & \mathbf{a}_2^T\mathbf{b}_2 & \cdots & \mathbf{a}_2^T \mathbf{b}_p \\ \vdots & \vdots & & \vdots \\ \mathbf{a}_n^T \mathbf{b}_1 & \mathbf{a}_n^T\mathbf{b}_2 & \cdots & \mathbf{a}_n^T \mathbf{b}_p \\ \end{pmatrix} \\ &= \begin{pmatrix} \mathbf{a}_1\ip \mathbf{b}_1 & \mathbf{a}_1\ip\mathbf{b}_2 & \cdots & \mathbf{a}_1\ip \mathbf{b}_p \\ \mathbf{a}_2\ip \mathbf{b}_1 & \mathbf{a}_2\ip\mathbf{b}_2 & \cdots & \mathbf{a}_2\ip \mathbf{b}_p \\ \vdots & \vdots & & \vdots \\ \mathbf{a}_n\ip \mathbf{b}_1 & \mathbf{a}_n\ip\mathbf{b}_2 & \cdots & \mathbf{a}_n\ip \mathbf{b}_p \\ \end{pmatrix}. \end{align*} \end{split}\]

Exercise 3.2.5

The special case of the previous exercise where $A = B$ will become very important when we look at orthogonal projections. For now, show that the columns of a matrix $A$ are orthogonal if and only if the matrix $A^TA$ is a diagonal matrix.

Solution to Exercise 3.2.5

First recognise that

\[\begin{split} A^TA = \begin{pmatrix} \mathbf{a}_1\ip \mathbf{a}_1 & \mathbf{a}_1\ip\mathbf{a}_2 & \cdots & \mathbf{a}_1\ip \mathbf{a}_n \\ \mathbf{a}_2\ip \mathbf{a}_1 & \mathbf{a}_2\ip\mathbf{a}_2 & \cdots & \mathbf{a}_2\ip \mathbf{a}_n \\ \vdots & \vdots & & \vdots \\ \mathbf{a}_n\ip \mathbf{a}_1 & \mathbf{a}_n\ip\mathbf{a}_2 & \cdots & \mathbf{a}_n\ip \mathbf{a}_n \\ \end{pmatrix}. \end{split}\]

If $A$ has orthogonal columns, then $\mathbf{a}_i\ip\mathbf{a}_j=0$ for all $i\neq j$. This gives

\[\begin{split} A^TA = \begin{pmatrix} \mathbf{a}_1\ip \mathbf{a}_1 & 0 & \cdots & 0\\ 0 & \mathbf{a}_2\ip\mathbf{a}_2 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & \mathbf{a}_n\ip \mathbf{a}_n \\ \end{pmatrix}. \end{split}\]

So $A^TA$ is a diagonal matrix.

Conversely, if $A^TA$ is a diagonal matrix, then all its non-diagonal entries are zero, i.e., $\mathbf{a}_i\ip\mathbf{a}_j=0$ for all $i\neq j$, so $A$ has orthogonal columns.
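The equivalence can be tried out on small examples. The sketch below is plain Python with matrices as lists of rows; the two matrices `A` and `B` are hypothetical test cases that do not come from the text, and the helper names are illustrative.

```python
def transpose(A):
    """Return A^T for a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]


def mat_mul(A, B):
    """Row-column product of A (m x n) and B (n x p)."""
    n = len(B)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(len(B[0]))]
            for i in range(len(A))]


def is_diagonal(M):
    """True if M is square and every non-diagonal entry is 0."""
    n = len(M)
    return all(len(row) == n for row in M) and all(
        M[i][j] == 0 for i in range(n) for j in range(n) if i != j)


A = [[1, 2], [2, -1], [2, 0]]   # columns (1,2,2) and (2,-1,0): dot product 0
B = [[1, 1], [0, 1]]            # columns (1,0) and (1,1): dot product 1

print(mat_mul(transpose(A), A), is_diagonal(mat_mul(transpose(A), A)))
print(mat_mul(transpose(B), B), is_diagonal(mat_mul(transpose(B), B)))
```

The diagonal entries of $A^TA$ are the dot products $\mathbf{a}_i\ip\mathbf{a}_i$, so for the first matrix they are $9$ and $5$.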

Definition 3.2.8

The identity matrix $I_n$ is the $n \times n$ diagonal matrix with $1$s on the diagonal. If the size is irrelevant or clear from the context, we denote it simply by $I$.

Exercise 3.2.6

Let

\[\begin{split} I = I_4 = \begin{pmatrix}1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \quad \text{and} \quad A = \begin{pmatrix} a_{11}& a_{12} & a_{13} \\ a_{21}& a_{22} & a_{23} \\ a_{31}& a_{32} & a_{33} \\ a_{41} & a_{42} & a_{43} \end{pmatrix}. \end{split}\]

Show that $IA = A$.

Solution to Exercise 3.2.6

We first focus on the first column of $A=\begin{pmatrix}\mathbf{a}_1&\mathbf{a}_2&\mathbf{a}_3\end{pmatrix}=\begin{pmatrix} a_{11}& a_{12} & a_{13} \\ a_{21}& a_{22} & a_{23} \\ a_{31}& a_{32} & a_{33} \\ a_{41} & a_{42} & a_{43} \end{pmatrix}$:

\[\begin{split} \begin{align*} I\mathbf{a}_1 &= \begin{pmatrix}1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}\begin{pmatrix} a_{11} \\ a_{21}\\ a_{31} \\ a_{41} \end{pmatrix} \\ &= a_{11}\begin{pmatrix}1 \\ 0 \\ 0 \\0 \end{pmatrix} + a_{21}\begin{pmatrix}0 \\ 1 \\ 0 \\0 \end{pmatrix} +a_{31}\begin{pmatrix}0\\ 0 \\ 1 \\0 \end{pmatrix} +a_{41}\begin{pmatrix}0 \\ 0 \\ 0 \\1 \end{pmatrix} \\ &= \begin{pmatrix}a_{11} \\ 0 \\ 0 \\0 \end{pmatrix} + \begin{pmatrix}0 \\ a_{21} \\ 0 \\0 \end{pmatrix} +\begin{pmatrix}0\\ 0 \\ a_{31} \\0 \end{pmatrix} +\begin{pmatrix}0 \\ 0 \\ 0 \\a_{41} \end{pmatrix}. \end{align*} \end{split}\]

Adding these four vectors gives $\mathbf{a}_1$, so $I\mathbf{a}_1=\mathbf{a}_1$. Similarly, $I\mathbf{a}_2=\mathbf{a}_2$ and $I\mathbf{a}_3=\mathbf{a}_3$.

Therefore,

\[ IA = I\begin{pmatrix}\mathbf{a}_1&\mathbf{a}_2&\mathbf{a}_3\end{pmatrix} = \begin{pmatrix}I\mathbf{a}_1&I\mathbf{a}_2&I\mathbf{a}_3\end{pmatrix} = \begin{pmatrix}\mathbf{a}_1&\mathbf{a}_2&\mathbf{a}_3\end{pmatrix} = A. \]

The definition of the product of two matrices and the earlier definition of the product of a matrix and a vector (Definition 2.4.1) immediately imply that the columns of the product of two matrices are linear combinations of the columns of the first matrix.

As is often the case in linear algebra, things can be looked at from a different perspective. From Proposition 3.2.3 it follows that, as far as $A$ is concerned, the elements $c_{i1},c_{i2},\ldots,c_{ip}$ of the $i$-th row of the product $C = AB$ depend only on the elements $a_{ik}$ of the $i$-th row of $A$. The following proposition explains in which way.

Proposition 3.2.4

The $i$-th row of the product $AB$ is the linear combination of the rows of the second matrix, $B$, with the entries of the $i$-th row of $A$ as coefficients.

Proof of Proposition 3.2.4

The indicated linear combination yields:

\[ a_{i1} \begin{pmatrix}b_{11} & b_{12} & \cdots &b_{1p} \end{pmatrix} + a_{i2} \begin{pmatrix}b_{21} & b_{22} & \,\, \cdots \,\, &b_{2p} \end{pmatrix} + \cdots + a_{in} \begin{pmatrix}b_{n1} & b_{n2} & \cdots &b_{np} \end{pmatrix} \]

\[ = \begin{pmatrix} (a_{i1}b_{11} + a_{i2}b_{21}+ \cdots +a_{in}b_{n1}) & \quad\cdots\quad & (a_{i1}b_{1p} + a_{i2} b_{2p} + \cdots + a_{in}b_{np}) \end{pmatrix}. \]

This is a row vector with on the $j$-th position the number

\[ (a_{i1}b_{1j} + a_{i2} b_{2j} + \cdots + a_{in}b_{nj}), \]

and that is precisely the entry $c_{ij}$ of the matrix $C = AB$.
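The row perspective of Proposition 3.2.4 can be illustrated with the matrices of Example 3.2.6. This is a minimal sketch, assuming matrices stored as lists of rows; `row_of_product` is a hypothetical helper name.

```python
def row_of_product(A, B, i):
    """Row i (1-based) of AB as a linear combination of the rows of B."""
    p = len(B[0])
    result = [0] * p
    # Row k of B (0-based k) gets the weight a_{i,k+1}, taken from row i of A.
    for k, weight in enumerate(A[i - 1]):
        result = [r + weight * b for r, b in zip(result, B[k])]
    return result


A = [[1, -3], [-1, 2], [3, -2]]
B = [[2, 1, 1], [3, 0, 2]]

# Row 1 of AB equals 1*(2, 1, 1) + (-3)*(3, 0, 2).
print(row_of_product(A, B, 1))   # [-7, 1, -5]
print([row_of_product(A, B, i) for i in (1, 2, 3)])
```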

Interestingly this opens the way to describe the row operations of Chapter 2 via matrix multiplication. The following example illustrates this for the three basic row operations.

Example 3.2.10

The following multiplication adds the first row of the matrix

\[\begin{split} A = \begin{pmatrix} a_{11}& a_{12} & a_{13} \\ a_{21}& a_{22} & a_{23} \\ a_{31}& a_{32} & a_{33} \end{pmatrix} \end{split}\]

four times to the second row:

\[\begin{split} \begin{pmatrix} 1 & 0 & 0 \\ 4 & 1 & 0 \\ 0 & 0 & 1\end{pmatrix} \begin{pmatrix} a_{11}& a_{12} & a_{13} \\ a_{21}& a_{22} & a_{23} \\ a_{31}& a_{32} & a_{33} \\ \end{pmatrix} = \begin{pmatrix} a_{11}& a_{12} & a_{13} \\ 4a_{11}+a_{21}&4a_{12} +a_{22}& 4a_{13}+a_{23} \\ a_{31}& a_{32} & a_{33} \\ \end{pmatrix}. \end{split}\]

Here the third row is scaled with a factor $5$:

\[\begin{split} \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 5\end{pmatrix} \begin{pmatrix} a_{11}& a_{12} & a_{13} \\ a_{21}& a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix} = \begin{pmatrix} a_{11}& a_{12} & a_{13} \\ a_{21}& a_{22} & a_{23} \\ 5a_{31}& 5a_{32} & 5a_{33} \\ \end{pmatrix}. \end{split}\]

And with the following multiplication the first and third row of $A$ are swapped:

\[\begin{split} \begin{pmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 0\end{pmatrix} \begin{pmatrix} a_{11}& a_{12} & a_{13} \\ a_{21}& a_{22} & a_{23} \\ a_{31}& a_{32} & a_{33} \end{pmatrix} = \begin{pmatrix}a_{31}& a_{32} & a_{33} \\ a_{21}& a_{22} & a_{23} \\ a_{11}& a_{12} & a_{13} \end{pmatrix}. \end{split}\]

For future reference we give these matrices a name:

Definition 3.2.9

The matrices $E$ that perform one single row operation (row replacement, row scaling, or row exchange) via $A \mapsto EA$ are called elementary matrices.
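To see elementary matrices at work on numbers, the three matrices of Example 3.2.10 can be applied to a concrete matrix. In the sketch below the matrix `A` is a hypothetical choice made for illustration, and `mat_mul` is an illustrative helper for the row-column product.

```python
def mat_mul(A, B):
    """Row-column product of A (m x n) and B (n x p), stored as lists of rows."""
    n = len(B)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(len(B[0]))]
            for i in range(len(A))]


A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]          # hypothetical example matrix

E_replace = [[1, 0, 0], [4, 1, 0], [0, 0, 1]]  # add 4 times row 1 to row 2
E_scale = [[1, 0, 0], [0, 1, 0], [0, 0, 5]]    # multiply row 3 by 5
E_swap = [[0, 0, 1], [0, 1, 0], [1, 0, 0]]     # exchange rows 1 and 3

print(mat_mul(E_replace, A))   # [[1, 2, 3], [8, 13, 18], [7, 8, 9]]
print(mat_mul(E_scale, A))     # [[1, 2, 3], [4, 5, 6], [35, 40, 45]]
print(mat_mul(E_swap, A))      # [[7, 8, 9], [4, 5, 6], [1, 2, 3]]
```

Each elementary matrix here is the identity matrix with the same row operation applied to it, which is one way to write such a matrix down quickly.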

Exercise 3.2.7

Describe in words which row operations are the effect of pre-multiplying a $4\times n$-matrix $A$ with the following elementary matrices:

\[\begin{split} E_1 = \begin{pmatrix} 1 & 0 & 0 & 0\\ 0 & 1 & 0 & -1\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1 \end{pmatrix}, \quad \quad E_2 = \begin{pmatrix} 1 & 0 & 0 & 0\\ 0 & 0 & 0 & 1\\ 0 & 0 & 1 & 0\\ 0 & 1 & 0 & 0 \end{pmatrix}. \end{split}\]

Solution to Exercise 3.2.7

$E_1$ is the elementary matrix that subtracts row $4$ of $A$ from row $2$ of $A$.

$E_2$ is the elementary matrix that switches row $2$ of $A$ and row $4$ of $A$.

Example 3.2.11

The following product may at first sight seem a bit odd, but it is exactly according to the definition:

\[\begin{split} \begin{pmatrix} 1 \\-2\\3\\4 \end{pmatrix}\begin{pmatrix} 2&4&0& -1 \end{pmatrix} = \begin{pmatrix} 2 & 4 & 0 & -1 \\ -4 & -8 & 0 & 2 \\ 6 & 12 & 0 & -3 \\ 8 & 16 & 0 & -4 \end{pmatrix}. \end{split}\]

The column-row product in the last example is the building block for yet another way to look at the matrix product. The next exercise explains how.

Exercise 3.2.8 (Column-row expansion of the product)

Denote the columns of the $m\times n$-matrix $A$ by $A_{(1)}, \ldots, A_{(n)}$, and the rows of the $n\times p$-matrix $B$ by $B^{(1)}, \ldots, B^{(n)}$, so

\[\begin{split} A_{(j)} = \begin{pmatrix} a_{1j} \\ \vdots \\ a_{mj}\end{pmatrix} \quad \text{and} \quad B^{(i)} = \begin{pmatrix} b_{i1} & b_{i2} & \cdots & b_{ip}\end{pmatrix}. \end{split}\]

Show that

\[ AB = A_{(1)} B^{(1)} + A_{(2)} B^{(2)} + \cdots + A_{(n)} B^{(n)}, \]

i.e., $AB$ is the sum of $n$ column-row products (as in Example 3.2.11).

Solution to Exercise 3.2.8

By Proposition 3.2.3, the entries of $C=AB$ are given by

\[ c_{ij} = a_{i1}b_{1j} + a_{i2}b_{2j} + \cdots + a_{in}b_{nj}. \]

Now consider one column-row product:

\[\begin{split} \begin{align*} A_{(k)}B^{(k)} &= \begin{pmatrix} a_{1k} \\ \vdots \\ a_{mk}\end{pmatrix}\begin{pmatrix} b_{k1} & b_{k2} & \cdots & b_{kp}\end{pmatrix} \\ &= \begin{pmatrix} a_{1k}b_{k1} & a_{1k}b_{k2} & \cdots & a_{1k}b_{kp} \\ a_{2k}b_{k1} & a_{2k}b_{k2} & \cdots & a_{2k}b_{kp} \\ \vdots & \vdots & & \vdots \\ a_{mk}b_{k1} & a_{mk}b_{k2} & \cdots & a_{mk}b_{kp} \end{pmatrix}. \end{align*} \end{split}\]

This shows that $\left(A_{(k)}B^{(k)}\right)_{ij}=a_{ik}b_{kj}$. It follows that

\[\begin{split} \begin{align*} \left(A_{(1)} B^{(1)} + A_{(2)} B^{(2)} + \cdots + A_{(n)} B^{(n)}\right)_{ij} &= \left(A_{(1)} B^{(1)}\right)_{ij} + \left(A_{(2)} B^{(2)}\right)_{ij} + \cdots + \left(A_{(n)} B^{(n)}\right)_{ij} \\ &= a_{i1}b_{1j}+a_{i2}b_{2j}+\cdots+a_{in}b_{nj} \\ &= c_{ij}. \end{align*} \end{split}\]

This shows that

\[ AB = A_{(1)} B^{(1)} + A_{(2)} B^{(2)} + \cdots + A_{(n)} B^{(n)}.\]
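The column-row expansion can also be checked numerically. The following sketch, in plain Python with matrices as lists of rows, builds the product as a sum of column-row products; `outer` and `mat_mul_column_row` are hypothetical helper names.

```python
def outer(col, row):
    """Column-row product: an m x p matrix with entry (i, j) = col[i] * row[j]."""
    return [[c * r for r in row] for c in col]


def mat_mul_column_row(A, B):
    """AB as the sum over k of (k-th column of A) times (k-th row of B)."""
    m, n, p = len(A), len(B), len(B[0])
    total = [[0] * p for _ in range(m)]
    for k in range(n):
        col_k = [A[i][k] for i in range(m)]
        term = outer(col_k, B[k])
        total = [[t + s for t, s in zip(rt, rs)] for rt, rs in zip(total, term)]
    return total


# The single column-row product of Example 3.2.11.
print(outer([1, -2, 3, 4], [2, 4, 0, -1]))

# The product of Example 3.2.6 as a sum of two column-row products.
A = [[1, -3], [-1, 2], [3, -2]]
B = [[2, 1, 1], [3, 0, 2]]
print(mat_mul_column_row(A, B))
```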

3.2.5. Properties of the matrix product

Now let us see which of the rules for products of numbers also hold for products of matrices, and which do not.

Proposition 3.2.5

For all $m \times n$-matrices $A,A_1,A_2$, all $n \times p$-matrices $B,B_1,B_2$, all $p \times q$-matrices $C$ and all real numbers $c$ the following are true:

i. $A(B_1+B_2) = AB_1 + AB_2$ and $(A_1+A_2)B = A_1B+A_2B$;
ii. $A(cB) = c(AB) = (cA)B$;
iii. $AI_n = A$ and $I_mA = A$ (the identity matrix $I$ acts as a unit element);
iv. $A(BC) = (AB)C$.

Example 3.2.12

As an illustration of rule iv. we compute the two triple products for the three matrices

\[\begin{split} A = \begin{pmatrix} 3 & 1 \\ 2 & 1 \\ 0 & 5 \end{pmatrix}, \quad B = \begin{pmatrix} 1 & 2 \\ 3 & 0 \end{pmatrix}, \quad C = \begin{pmatrix} 1 & 2 & 0 \\ 2 & 1 & 2 \end{pmatrix}. \end{split}\]

On the one hand

\[\begin{split} A(BC) = \begin{pmatrix} 3 & 1 \\ 2 & 1 \\ 0 & 5 \end{pmatrix} \begin{pmatrix} 5 & 4 & 4\\ 3 & 6 & 0 \end{pmatrix} = \begin{pmatrix} 18 & 18 & 12\\ 13 & 14 & 8 \\ 15 & 30 & 0 \end{pmatrix}, \end{split}\]

and on the other hand

\[\begin{split} (AB)C = \begin{pmatrix} 6 & 6 \\ 5 & 4 \\ 15 & 0 \end{pmatrix} \begin{pmatrix} 1 & 2 & 0 \\ 2 & 1 & 2 \end{pmatrix} = \begin{pmatrix} 18 & 18 & 12 \\ 13 & 14 & 8 \\ 15 & 30 & 0 \end{pmatrix}. \end{split}\]

So the products are indeed equal. But it is not immediately clear how. For instance, the value $14$ on position $(2,2)$ comes about in two ways

\[ \text{via } A(BC)\!: \,14 = 2\cdot4 + 1\cdot 6, \quad \,\, \text{via } (AB)C\!: \,14 = 5\cdot2 + 4\cdot1. \]

We need a good perspective to give a proof of the general case.
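Before turning to the proof, the computation in Example 3.2.12 can be reproduced in a few lines. This is a sketch in plain Python with matrices as lists of rows; `mat_mul` is an illustrative helper for the row-column product, and a check on one triple of matrices is of course not a proof.

```python
def mat_mul(A, B):
    """Row-column product of A (m x n) and B (n x p), stored as lists of rows."""
    n = len(B)
    if len(A[0]) != n:
        raise ValueError("product is not defined")
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(len(B[0]))]
            for i in range(len(A))]


A = [[3, 1], [2, 1], [0, 5]]
B = [[1, 2], [3, 0]]
C = [[1, 2, 0], [2, 1, 2]]

BC = mat_mul(B, C)        # intermediate product used in A(BC)
AB = mat_mul(A, B)        # intermediate product used in (AB)C
left = mat_mul(A, BC)
right = mat_mul(AB, C)

print(BC, AB)
print(left == right)      # True: both triple products agree
```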

Proof of Proposition 3.2.5

Rules i. and ii. are checked in a straightforward way. See Exercise 3.2.9.

Rule iii. We saw instances of this property already in Example 3.2.8 and Exercise 3.2.6. For the general case, one way to show validity of the first statement is to note that the $j$-th column of $AI_n$ is $A\mathbf{e}_j$ where $\mathbf{e}_j$ is the $j$-th column of the identity matrix $I_n$. This gives the linear combination

\[ A\mathbf{e}_j = 0\mathbf{a}_1 + 0\mathbf{a}_2 + \cdots + 1\mathbf{a}_j +\dots + 0\mathbf{a}_n = \mathbf{a}_j \]

which shows that the $j$-th column of $AI_n$ is equal to the $j$-th column of $A$. And this holds for any column.

The identity $I_mA = A$ is shown in an analogous way, working row by row.

Rule iv. First we observe that both triple products yield $m \times q$-matrices. Then the identity can be proved ‘column by column’, like the previous one.

We are done if we can show that

\[\begin{split} \begin{array}{rcl} k\text{-th column of }A(BC) &=& k\text{-th column of }(AB)C \\ &=& (AB)( k\text{-th column of }C) = (AB)\mathbf{c}_k, \end{array} \end{split}\]

for $ k = 1,2,\ldots, q $.

Now recall that (by definition)

\[ k\text{-th column of }BC = B\mathbf{c}_k, \]

so

\[ k\text{-th column of }A(BC) = A\,(B\mathbf{c}_k). \]

Making extensive use of the rule

\[ A(c_1\mathbf{x} + c_2\mathbf{y}) = c_1A\mathbf{x} + c_2A\mathbf{y}, \]

we find

\[\begin{split} \begin{array}{ccl} A\,(B\mathbf{c}_k) & = & A \,(c_{1k}\mathbf{b}_1 +c_{2k}\mathbf{b}_2 + \cdots + c_{pk}\mathbf{b}_p)\\ & = & c_{1k}(A\mathbf{b}_1) +c_{2k}(A\mathbf{b}_2) + \cdots + c_{pk}(A\mathbf{b}_p)\\ & = & \begin{pmatrix} A\mathbf{b}_1 & A\mathbf{b}_2 & \cdots & A\mathbf{b}_p \end{pmatrix} \begin{pmatrix} c_{1k} \\ \vdots \\ c_{pk} \end{pmatrix} \\ & = & (AB)\mathbf{c}_k. \end{array} \end{split}\]

Exercise 3.2.9

Prove rules i. and ii. of Proposition 3.2.5.

Recall that matrices are equal when they have the same size and the entries on corresponding positions are equal (which may be checked column by column or row by row).

Solution to Exercise 3.2.9

We show the first identity, assuming $B_1=\begin{pmatrix}\mathbf{b}^1_1&\mathbf{b}^1_2&\cdots&\mathbf{b}^1_q\end{pmatrix}$ and $B_2=\begin{pmatrix}\mathbf{b}^2_1&\mathbf{b}^2_2&\cdots&\mathbf{b}^2_q\end{pmatrix}$:

\[\begin{split} \begin{align*} A(B_1+B_2) &= A\begin{pmatrix}\mathbf{b}^1_1+\mathbf{b}^2_1&\mathbf{b}^1_2+\mathbf{b}^2_2&\cdots&\mathbf{b}^1_q+\mathbf{b}^2_q\end{pmatrix} \\ &= \begin{pmatrix}A\left(\mathbf{b}^1_1+\mathbf{b}^2_1\right)&A\left(\mathbf{b}^1_2+\mathbf{b}^2_2\right)&\cdots&A\left(\mathbf{b}^1_q+\mathbf{b}^2_q\right)\end{pmatrix} \\ &= \begin{pmatrix}A\mathbf{b}^1_1+A\mathbf{b}^2_1&A\mathbf{b}^1_2+A\mathbf{b}^2_2&\cdots&A\mathbf{b}^1_q+A\mathbf{b}^2_q\end{pmatrix} \\ &= \begin{pmatrix}A\mathbf{b}^1_1&A\mathbf{b}^1_2&\cdots&A\mathbf{b}^1_q\end{pmatrix}+\begin{pmatrix}A\mathbf{b}^2_1&A\mathbf{b}^2_2&\cdots&A\mathbf{b}^2_q\end{pmatrix} \\ &= A\begin{pmatrix}\mathbf{b}^1_1&\mathbf{b}^1_2&\cdots&\mathbf{b}^1_q\end{pmatrix}+A\begin{pmatrix}\mathbf{b}^2_1&\mathbf{b}^2_2&\cdots&\mathbf{b}^2_q\end{pmatrix} \\ &= AB_1+AB_2. \end{align*} \end{split}\]

The second identity can be shown similarly.

For the rule on scalar multiples we only show the first equality, using $B=\begin{pmatrix}\mathbf{b}_1&\mathbf{b}_2&\cdots&\mathbf{b}_q\end{pmatrix}$:

\[\begin{split} \begin{align*} A(cB) &= A\left(c\begin{pmatrix}\mathbf{b}_1&\mathbf{b}_2&\cdots&\mathbf{b}_q\end{pmatrix}\right) \\ &= A\begin{pmatrix}c\mathbf{b}_1&c\mathbf{b}_2&\cdots&c\mathbf{b}_q\end{pmatrix} \\ &= \begin{pmatrix}A\left(c\mathbf{b}_1\right)&A\left(c\mathbf{b}_2\right)&\cdots&A\left(c\mathbf{b}_q\right)\end{pmatrix} \\ &= \begin{pmatrix}c\left(A\mathbf{b}_1\right)&c\left(A\mathbf{b}_2\right)&\cdots&c\left(A\mathbf{b}_q\right)\end{pmatrix} \\ &= c\begin{pmatrix}A\mathbf{b}_1&A\mathbf{b}_2&\cdots&A\mathbf{b}_q\end{pmatrix} \\ &= c\left(AB\right). \end{align*} \end{split}\]

The second equality can be shown similarly.

Remark 3.2.4

The proof of Proposition 3.2.5 iv. can be seen in another light. In the Section Linear transformations we saw that an $m\times n$-matrix $A$ defines a transformation $T$ from $\mathbb{R}^n$ to $\mathbb{R}^m$, namely

\[ \text{for } \mathbf{x} \in \mathbb{R}^n: \quad \mathbf{x} \mapsto T(\mathbf{x}) = A\mathbf{x}. \]

The definition of the product of two matrices then precisely matches the composition of two of such transformations: if $A$ is an $m\times n$-matrix and $B$ is an $n\times p$-matrix

\[ \mathbf{x}\in\mathbb{R}^p \,\,\stackrel{B}{\longrightarrow}\,\, \vect{y}_1 = B\vect{x}\in\mathbb{R}^n \,\, \stackrel{A}{\longrightarrow} \,\,\, \vect{y}_2 = A(B\mathbf{x}) \in \mathbb{R}^m\, \]

and

\[ \mathbf{x}\in\mathbb{R}^p\,\,\,\stackrel{AB}{\longrightarrow} \,\,\,\vect{y}_3 \,=\, (AB)\mathbf{x} \in \mathbb{R}^m \]

yield the same vector:

\[ \vect{y}_2 = \vect{y}_3. \]
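The two routes in Remark 3.2.4 can be traced for a concrete vector. This is only a sketch under assumptions: the matrices `A` (here $2\times3$), `B` (here $3\times2$) and the vector `x` are hypothetical values picked for illustration, and `matmul` and `matvec` are small helpers written for this purpose.

```python
def matmul(A, B):
    """Row-column product of matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]


def matvec(A, x):
    """Product of a matrix with a vector given as a flat list."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


A = [[1, 2, 0], [0, 1, -1]]   # maps R^3 to R^2
B = [[1, 0], [2, 1], [0, 3]]  # maps R^2 to R^3
x = [1, -1]                   # a vector in R^2

y1 = matvec(B, x)             # first apply B
y2 = matvec(A, y1)            # then apply A
y3 = matvec(matmul(A, B), x)  # apply the single matrix AB

print(y1)  # [1, 1, -3]
print(y2)  # [3, 4]
print(y3)  # [3, 4]
```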

So far so good: matrix multiplication behaves like multiplication of numbers. However, in two important respects the concepts deviate. First of all, commutativity no longer holds.

Example 3.2.13

For the matrices

\[\begin{split} A = \begin{pmatrix} 2 & 2 & 1\\ 3 & 3 & 0 \end{pmatrix} \quad \text{and} \quad B = \begin{pmatrix} 1 & 3 \\ 3 & 1 \\ 4 & 0 \end{pmatrix} \end{split}\]

it is clear that

\[ AB \neq BA,\]

simply because the two products are not of the same size: $AB$ is a $2\times 2$-matrix, $BA$ a $3\times3$-matrix.

The following example illustrates that $AB = BA$ is not even guaranteed for two $n\times n$-matrices $A$ and $B$:

\[\begin{split} \begin{pmatrix} 1 & 3 \\ 2 & 1 \end{pmatrix} \begin{pmatrix} 0 & 1 \\ 1 & 2 \end{pmatrix} = \begin{pmatrix} 3 & 7 \\ 1 & 4 \end{pmatrix} \neq \begin{pmatrix} 2 & 1 \\ 5 & 5 \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ 1 & 2 \end{pmatrix} \begin{pmatrix} 1 & 3 \\ 2 & 1 \end{pmatrix}. \end{split}\]

The fact that $AB \neq BA$ can be understood by thinking about the composition of the two transformations corresponding to $A$ and $B$. (See Section Linear transformations.)
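Both parts of Example 3.2.13 can be reproduced with a few lines of plain Python. This is a minimal sketch: the helpers `matmul` and `shape` and the names `P` and `Q` for the two square matrices are introduced here for illustration only.

```python
def matmul(A, B):
    """Row-column product of matrices stored as lists of rows."""
    assert len(A[0]) == len(B), "inner sizes must match"
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]


def shape(M):
    """Number of rows and number of columns of M."""
    return (len(M), len(M[0]))


# The rectangular matrices: the two products differ already in size.
A = [[2, 2, 1], [3, 3, 0]]
B = [[1, 3], [3, 1], [4, 0]]
print(shape(matmul(A, B)), shape(matmul(B, A)))  # (2, 2) (3, 3)

# The two square matrices: same size, but still different products.
P = [[1, 3], [2, 1]]
Q = [[0, 1], [1, 2]]
print(matmul(P, Q))  # [[3, 7], [1, 4]]
print(matmul(Q, P))  # [[2, 1], [5, 5]]
```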

The following example and exercise shed some light on the non-commutativity.

Example 3.2.14

Consider the two matrices

\[\begin{split} A = \begin{pmatrix} 2 & 0 \\ 0 & 1 \end{pmatrix} \quad \text{and} \quad B = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \end{split}\]

and the corresponding linear transformations

\[ S: \mathbb{R}^2 \to \mathbb{R}^2, \quad \mathbf{x} \mapsto S(\mathbf{x}) = A \mathbf{x} \]

and

\[ T: \mathbb{R}^2 \to \mathbb{R}^2, \quad \mathbf{x} \mapsto T(\mathbf{x}) = B \mathbf{x}. \]

We get

\[\begin{split} \mathbf{x} = \begin{pmatrix} x_1\\ x_2 \end{pmatrix} \quad \mapsto \quad A\mathbf{x} = \begin{pmatrix} 2 & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} x_1\\ x_2 \end{pmatrix} \,\, = \,\, \begin{pmatrix} 2x_1\\ x_2 \end{pmatrix} \end{split}\]

and likewise

\[\begin{split} T(\mathbf{x}) = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} x_1\\ x_2 \end{pmatrix} = \begin{pmatrix} x_2\\ x_1 \end{pmatrix}. \end{split}\]

../_images/Fig-MatrixOps-NonCommutativity.svg: Fig. 3.2.1 $ \begin{pmatrix} 2 & 0 \\ 0 & 1 \end{pmatrix}\begin{pmatrix} 0 & 1 \\ 1&0 \end{pmatrix} \neq \begin{pmatrix} 0 & 1 \\ 1&0 \end{pmatrix}\begin{pmatrix} 2 & 0 \\ 0 & 1 \end{pmatrix}$.

Note that $S$ is a transformation that ‘stretches’ horizontally, and $T$ is a reflection. Figure 3.2.1 visualises the transformations corresponding to $AB$ and $BA$. When we apply the transformations one after another, the order in which we do this is important.
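To see the effect of the order on a single vector, the two compositions can be evaluated directly. The sketch below assumes a sample vector `x = [3, 5]`, which is not taken from the text; the functions `S` and `T` mirror the transformations of Example 3.2.14 and `matvec` is a helper written for this illustration.

```python
def matvec(A, x):
    """Product of a matrix with a vector given as a flat list."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


A = [[2, 0], [0, 1]]  # stretches horizontally by a factor 2
B = [[0, 1], [1, 0]]  # swaps the two coordinates


def S(x):
    return matvec(A, x)


def T(x):
    return matvec(B, x)


x = [3, 5]    # hypothetical sample vector
print(S(T(x)))  # corresponds to AB: swap first, then stretch -> [10, 3]
print(T(S(x)))  # corresponds to BA: stretch first, then swap -> [5, 6]
```

In the first composition the stretch acts on what used to be the second coordinate, in the second composition on the first.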

Exercise 3.2.10

Another way to understand why $AB\neq BA$ is the following.

Recall that the matrices

\[\begin{split} E_1 = \begin{pmatrix} 1 & 0 \\ 2 & 1 \end{pmatrix} \quad \text{and} \quad E_2 = \begin{pmatrix} 3 & 0 \\ 0 & 1 \end{pmatrix} \end{split}\]

perform row operations on a $2 \times n$-matrix $A$ when $A$ is multiplied by them from the left.

Describe in words the row operations corresponding to $E_1$ and $E_2$.
Describe in words the combined row operations corresponding to $E_1E_2$ and $E_2E_1$. Can you explain why $E_1E_2 \neq E_2E_1$?
Compute $E_1E_2$ and $E_2E_1$ to double check the last non-identity.

Solution to Exercise 3.2.10

$E_1$ is the elementary matrix corresponding to adding $2$ times row $1$ of $A$ to row $2$ of $A$.

$E_2$ is the elementary matrix corresponding to multiplying row $1$ of $A$ by $3$.
$E_1E_2$ first multiplies row $1$ by $3$ and then adds that new row twice to row $2$. The new first row is thus $3$ times row $1$ and the new second row is thus $6$ times row $1$ plus row $2$.

$E_2E_1$ first adds row $1$ twice to row $2$ and then multiplies row $1$ by $3$ (leaving the second row the same). The new first row is thus $3$ times row $1$ and the new second row is thus $2$ times row $1$ plus row $2$.

$E_1E_2$ and $E_2E_1$ have thus a different second row as a result, so $E_1E_2 \neq E_2E_1$.
$E_1E_2=\begin{pmatrix}3&0\\6&1\end{pmatrix}$.

$E_2E_1=\begin{pmatrix}3&0\\2&1\end{pmatrix}$.
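The effect of the two orders can also be observed on an actual matrix. In the sketch below the $2\times3$-matrix `M` is a hypothetical example, not taken from the exercise, and `matmul` is a helper defined only for this illustration.

```python
def matmul(A, B):
    """Row-column product of matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]


E1 = [[1, 0], [2, 1]]  # adds 2 times row 1 to row 2
E2 = [[3, 0], [0, 1]]  # multiplies row 1 by 3

print(matmul(E1, E2))  # [[3, 0], [6, 1]]
print(matmul(E2, E1))  # [[3, 0], [2, 1]]

# Hypothetical 2x3 matrix to act on.
M = [[1, 2, 3], [4, 5, 6]]

# E1 E2 M: scale row 1 first, then add twice the scaled row to row 2.
print(matmul(E1, matmul(E2, M)))  # [[3, 6, 9], [10, 17, 24]]

# E2 E1 M: add twice the original row 1 to row 2, then scale row 1.
print(matmul(E2, matmul(E1, M)))  # [[3, 6, 9], [6, 9, 12]]
```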

The second major difference between the product of numbers and the product of matrices: for two (e.g. real) numbers $a$ and $b$ it is known that

\[ \text{if} \quad a \neq 0 \quad \text{and} \quad b \neq 0\quad \text{then} \quad ab \neq 0, \]

or, equivalently,

\[ \text{if} \quad ab = 0 \quad \text{then} \quad a = 0 \,\,\text{ or } \,\,b = 0. \]

As the following example shows, things are different in the realm of matrices.

Example 3.2.15

\[\begin{split} \begin{pmatrix} 1 & 2 \\ 2 & 4 \end{pmatrix} \begin{pmatrix} 2 & 6 \\ -1 & -3 \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}. \end{split}\]

So the product of two non-zero matrices may be the zero matrix.

The following example shows that things are even ‘worse’:

Example 3.2.16

\[\begin{split} \begin{pmatrix} 1 & -3 & 2 \\ 1 & -3 & 2 \\ 1 & -3 & 2 \end{pmatrix} \begin{pmatrix}1 & -3 & 2 \\ 1 & -3 & 2 \\ 1 & -3 & 2 \end{pmatrix} = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \end{split}\]

which shows that we cannot even conclude from $A A = O$ that $A$ itself must be the zero matrix.
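Examples 3.2.15 and 3.2.16 are easy to confirm by machine. The following sketch uses a helper `matmul` and the names `A`, `B` and `N`, all introduced here for illustration. It also computes the product in the reverse order, which is not discussed in the examples.

```python
def matmul(A, B):
    """Row-column product of matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]


# Example 3.2.15: two non-zero matrices with zero product.
A = [[1, 2], [2, 4]]
B = [[2, 6], [-1, -3]]
print(matmul(A, B))  # [[0, 0], [0, 0]]
print(matmul(B, A))  # [[14, 28], [-7, -14]]

# Example 3.2.16: a non-zero matrix whose square is the zero matrix.
N = [[1, -3, 2], [1, -3, 2], [1, -3, 2]]
print(matmul(N, N))  # [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
```

Note that for this pair $AB = O$ while $BA \neq O$, so the order matters here as well.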

And here is another example of a non-zero matrix whose square is the zero matrix. In this case it can be seen geometrically what is going on.

Example 3.2.17

For the matrix $A = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$ we again have that

\[\begin{split} A^2 = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}^2 = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}. \end{split}\]

It holds that

\[\begin{split} A = A_2A_1 = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}\begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} \end{split}\]

Now consider the transformations corresponding to the matrices $A_1$ and $A_2$.

\[\begin{split} T_1(\vect{x}) = A_1\vect{x} = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 0\\ x_2 \end{pmatrix} \end{split}\]

is the projection onto the $x_2$-axis, and

\[\begin{split} T_2(\vect{x}) = A_2\vect{x} = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} x_2\\ -x_1 \end{pmatrix} \end{split}\]

is the clockwise rotation through an angle $\frac12\pi$.

Now let us see, step by step, what is the effect of the transformation $T_2T_1T_2T_1$, corresponding to $A^2$.

An arbitrary vector $\vect{x}$ is sent to a vector $T_1(\vect{x})$ on the $x_2$-axis by $T_1$.

The rotation sends this to a vector $T_2(T_1(\vect{x}))$ on the $x_1$-axis. Projecting onto the $x_2$-axis again brings this to $T_1(T_2(T_1(\vect{x}))) = \vect{0}$.
Lastly, applying $T_2$ leaves the zero vector where it is.

So $A^2\vect{x} = T_2(T_1(T_2(T_1(\vect{x})))) = \vect{0}$, for each vector $\vect{x}$.

See Figure 3.2.2.

../_images/Fig-MatrixOps-Nilpotent.svg: Fig. 3.2.2 Visualisation of $\vect{x} \mapsto A^2\vect{x}$.
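The four steps of Example 3.2.17 can be followed for a concrete vector. This is a small sketch in which the starting vector `(3, 5)` is an assumption made for illustration; the functions `T1` and `T2` implement the projection and the rotation from the example.

```python
def T1(x):
    """Projection onto the x2-axis."""
    return (0, x[1])


def T2(x):
    """Clockwise rotation through a quarter turn."""
    return (x[1], -x[0])


x = (3, 5)      # hypothetical starting vector
step1 = T1(x)       # (0, 5): on the x2-axis
step2 = T2(step1)   # (5, 0): rotated onto the x1-axis
step3 = T1(step2)   # (0, 0): projected to the origin
step4 = T2(step3)   # (0, 0): the zero vector stays where it is

print(step1, step2, step3, step4)
```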

Remark 3.2.5

The next list gives six situations where matrix multiplication acts differently than multiplication of numbers.
In fact, all statements can be related to one of the first two.

i. In general, $AB = BA$ does not hold for two $n\times n$-matrices $A$ and $B$.
ii. In general, from $AB = O$ it does not follow that either $A =O$ or $B = O$.
iii. In general, $(A+B)(A+B) = A^2 + 2AB + B^2$ does not hold for two $n\times n$-matrices $A$ and $B$.
iv. In general, $(A+B)(A-B) = A^2 - B^2$ does not hold for two $n\times n$-matrices $A$ and $B$.
v. In general, from $AB = AC$ and $A \neq O$ it does not follow that $B = C$.
vi. In general, from $A^2 = I$ it does not follow that either $A = I$ or $A = -I$.

For each statement counterexamples can be given, as we already did for the first two. To get more insight into what is really going on, we can also try to find out how the third through sixth statements relate to the first two.

For instance, the third statement is closely related to the first. Let us check where ‘things go wrong’.

\[\begin{split} \begin{array}{cl} (A+B)(A+B)& = A(A+B) +B(A+B)\\ & = A^2 + AB + BA + B^2. \end{array} \end{split}\]

The last expression is only equal to

\[ A^2 + 2AB + B^2 \]

if

\[ AB + BA = 2AB. \]

And that is only the case if

\[ BA = AB. \]

So any pair of two matrices $A$ and $B$ with

\[ AB \neq BA \]

provides a counterexample where

\[ (A+B)(A+B) \neq A^2 + 2AB + B^2. \]

Likewise, v. follows from ii. Namely,

\[ AB = AC \iff AB - AC = O \iff A(B-C) = O. \]

According to ii. from the last equation we cannot deduce that

\[ \text{either } A = O \quad \text{or}\quad B-C = O. \]

We can create a counterexample by taking for $A$ and $B$ non-zero matrices for which

\[ AB = O, \]

and let $C$ be the zero matrix. Then $B \neq C$, whereas

\[ AB = AC = O \quad \text{and} \,\,\text{(by assumption)} \quad A \neq O. \]

Statement vi. also relates to ii. Namely,

\[ A^2 = I \quad \iff \quad A^2 - I = (A+I)(A-I) = O. \]

From the last equality we cannot conclude that one of the factors $(A+I)$ or $(A-I)$ must be the zero matrix. In this case we do not get a counterexample for free. You are asked to construct counterexamples in Exercise 3.2.11.
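Two of the counterexamples described in Remark 3.2.5 can be made concrete. The sketch below reuses the matrices of Examples 3.2.13 and 3.2.15; the helpers `matmul` and `add` and all variable names are assumptions introduced for illustration.

```python
def matmul(A, B):
    """Row-column product of matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]


def add(*Ms):
    """Entrywise sum of any number of matrices of equal size."""
    return [[sum(entries) for entries in zip(*rows)] for rows in zip(*Ms)]


# Third statement: a non-commuting pair breaks the binomial formula.
A = [[1, 3], [2, 1]]
B = [[0, 1], [1, 2]]
AB, BA = matmul(A, B), matmul(B, A)
A2, B2 = matmul(A, A), matmul(B, B)

square = matmul(add(A, B), add(A, B))  # (A+B)(A+B)
expanded = add(A2, AB, BA, B2)         # A^2 + AB + BA + B^2
binomial = add(A2, AB, AB, B2)         # A^2 + 2AB + B^2

print(square)    # [[13, 16], [12, 21]]
print(expanded)  # [[13, 16], [12, 21]]
print(binomial)  # [[14, 22], [8, 20]]

# Fifth statement: AB = AC with A non-zero, but B differs from C.
A_z = [[1, 2], [2, 4]]
B_z = [[2, 6], [-1, -3]]
C_z = [[0, 0], [0, 0]]
print(matmul(A_z, B_z) == matmul(A_z, C_z), B_z == C_z)  # True False
```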

Exercise 3.2.11

Give a $2 \times 2$-matrix $A$ not containing any zeros, for which $A^2 = I$.
Give a $2 \times 2$-matrix $B$ for which $B^2 = -I$.

Solution to Exercise 3.2.11

$A=\begin{pmatrix}\frac{1}{\sqrt{2}}&\frac{1}{\sqrt{2}}\\\frac{1}{\sqrt{2}}&-\frac{1}{\sqrt{2}}\end{pmatrix}$.
$B=\begin{pmatrix}0&-1\\1&0\end{pmatrix}$.
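As a check on these two answers (a worked computation added here for illustration), squaring the proposed matrices gives

\[\begin{split} A^2 = \frac12\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} = \frac12\begin{pmatrix} 2 & 0 \\ 0 & 2 \end{pmatrix} = I, \qquad B^2 = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} = \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix} = -I. \end{split}\]

Since this $A$ is neither $I$ nor $-I$, it also serves as a counterexample for the last statement of Remark 3.2.5.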

The following property connects the two operations matrix transposition and matrix multiplication.

Proposition 3.2.6

If $A$ is an $m\times n$-matrix and $B$ an $n\times p$-matrix, then

\[ (AB)^T = B^TA^T. \]

Before we present the proof, we consider a typical example.

Example 3.2.18

We verify the rule for the two matrices

\[\begin{split} A = \begin{pmatrix} 2 & 1 & -1 \\ 1 & -1 & 3 \end{pmatrix} \quad\text{and}\quad B = \begin{pmatrix} 1 & -3 & 0\\ 4 & 2 & -1 \\ 5 & 2 & 1\end{pmatrix}. \end{split}\]

We compute:

\[\begin{split} AB = \begin{pmatrix} 2 & 1 & -1 \\ 1 & -1 & 3 \end{pmatrix} \begin{pmatrix} 1 & -3 & 0 \\ 4 & 2 & -1\\ 5 & 2 & 1 \end{pmatrix} = \begin{pmatrix} 1 & -6 & -2 \\ 12 & 1 & 4\end{pmatrix} \end{split}\]

and

\[\begin{split} B^TA^T = \begin{pmatrix} 1 & 4 & 5 \\ -3 & 2 & 2 \\ 0 & -1 & 1 \end{pmatrix} \begin{pmatrix} 2& 1 \\ 1 & -1 \\ -1 & 3 \end{pmatrix} = \begin{pmatrix} 1 & 12 \\ -6 & 1 \\ -2 & 4 \end{pmatrix}, \end{split}\]

so that indeed

\[\begin{split} B^TA^T = \begin{pmatrix} 1 & 12 \\ -6 & 1 \\ -2 & 4 \end{pmatrix} = \begin{pmatrix} 1 & -6 & -2 \\ 12 & 1 & 4\end{pmatrix}^T = (AB)^T. \end{split}\]

Careful inspection shows that for the two matrix products exactly the same sums and products of numbers have to be computed. For instance, in both products $12$ is the sum of products

\[ 12 = 1\cdot1 +4\cdot(-1) +5\cdot3 = 1\cdot1 +(-1)\cdot4 +3\cdot5. \]

As Example 3.2.18 illustrates, the rule is not restricted to square matrices $A$ and $B$. The proof for general matrices $A$ and $B$ for which the product $AB$ is well defined is as follows.

Proof of Proposition 3.2.6

To show that

\[ (AB)^T = B^TA^T, \]

we have to show that the matrices have the same size, and are equal entry by entry. First, we see that $AB$ is an $m \times p$-matrix, so $(AB)^T$ is a $p \times m$-matrix, and $B^TA^T$, being the product of a $p \times n$-matrix with an $n \times m$-matrix, is also a $p \times m$-matrix.

Second, the $(i,j)$ entry of $(AB)^T$ is the $(j,i)$ entry of $AB$, which is the (row-column) product of the $j$-th row of $A$ and the $i$-th column of $B$:

\[\begin{split} [(AB)^T]_{ij} = \begin{pmatrix} a_{j1} & a_{j2} & \cdots & a_{jn} \end{pmatrix}\begin{pmatrix} b_{1i} \\ b_{2i} \\ \vdots \\ b_{ni} \end{pmatrix}. \end{split}\]

The $(i,j)$ entry of $B^TA^T$ is the product of the $i$-th row of $B^T$ and the $j$-th column of $A^T$.
Now the $i$-th row of $B^T$ is the $i$-th column of $B$ written as a row, and the $j$-th column of $A^T$ is the $j$-th row of $A$ written as a column:

\[\begin{split} [B^TA^T]_{ij} = \begin{pmatrix} b_{1i} & b_{2i} & \cdots & b_{ni} \end{pmatrix}\begin{pmatrix} a_{j1} \\ a_{j2} \\ \vdots \\ a_{jn} \end{pmatrix}. \end{split}\]

Both row-column products end up as the same value

\[ a_{j1}b_{1i} + a_{j2}b_{2i} + \cdots + a_{jn}b_{ni} = b_{1i}a_{j1} + b_{2i}a_{j2} + \cdots + b_{ni}a_{jn}. \]
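The rule can be checked on the matrices of Example 3.2.18 with a short script. This is a minimal sketch; the helpers `matmul` and `transpose` are written here for illustration and are not part of the text.

```python
def matmul(A, B):
    """Row-column product of matrices stored as lists of rows."""
    assert len(A[0]) == len(B), "inner sizes must match"
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]


def transpose(M):
    """Transpose: the columns of M become the rows of the result."""
    return [list(col) for col in zip(*M)]


A = [[2, 1, -1], [1, -1, 3]]
B = [[1, -3, 0], [4, 2, -1], [5, 2, 1]]

lhs = transpose(matmul(A, B))             # (AB)^T
rhs = matmul(transpose(B), transpose(A))  # B^T A^T

print(lhs)  # [[1, 12], [-6, 1], [-2, 4]]
print(rhs)  # [[1, 12], [-6, 1], [-2, 4]]
```

Note that the product in the other order, $A^TB^T$, is not even defined here: $A^T$ has two columns while $B^T$ has three rows.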

Remark 3.2.6

We already defined $A^2$ for a square matrix $A$. We can extend this to higher powers of $A$ in an obvious way:

\[ A^3 = A(A^2),\quad A^4 = A(A^3), \,\,.\,.\,.\,. \]

Since

\[ A(A^2) = A(AA) = (AA)A, \]

we can do without the parentheses:

\[ A^3 = AAA,\quad A^4 = AAAA, \,\,.\,.\,.\,. \]

For the same reason

\[ A^kA^{\ell} = A^{k+\ell}, \quad \text{for integers} \quad k,\ell \geq 1. \]

If we define

\[ A^0 = I, \]

then

\[ A^kA^{\ell} = A^{k+\ell} \quad \text{holds for all integers} \quad \quad k,\ell \geq 0. \]

And what can we say about $A^{-1}$?

We will dedicate Section 3.4 to this topic.
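The rules for non-negative powers in Remark 3.2.6 can be tried out with a small routine. The sketch below is one possible implementation under assumptions: the helper names `matmul`, `identity` and `mat_pow` and the sample matrix `A` are hypothetical, and the power is computed by plain repeated multiplication.

```python
def matmul(A, B):
    """Row-column product of matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]


def identity(n):
    """The n x n identity matrix."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def mat_pow(A, k):
    """A^k for an integer k >= 0, using the convention A^0 = I."""
    result = identity(len(A))
    for _ in range(k):
        result = matmul(A, result)
    return result


A = [[1, 1], [0, 1]]  # hypothetical sample matrix

# Check A^k A^l = A^(k+l) for a range of small exponents.
for k in range(5):
    for l in range(5):
        assert matmul(mat_pow(A, k), mat_pow(A, l)) == mat_pow(A, k + l)

print(mat_pow(A, 0))  # [[1, 0], [0, 1]]
print(mat_pow(A, 5))  # [[1, 5], [0, 1]]
```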

3.2.6. Grasple exercises (2)

Grasple Exercise 3.2.6

https://embed.grasple.com/exercises/262bcea8-548b-45c2-8c37-b4cb3cb03ddc?id=70281

To compute a product $AB$.

Grasple Exercise 3.2.7

https://embed.grasple.com/exercises/718bda8a-9e75-495a-8aea-506788d46432?id=70282

To compute a product $AB$.

Grasple Exercise 3.2.8

https://embed.grasple.com/exercises/e5799b3f-53f6-4095-bb96-bc2f4febde30?id=70284

To compute a product $AB$.

Grasple Exercise 3.2.9

https://embed.grasple.com/exercises/9d1526f4-777b-4a41-8b8e-c0746f7503c9?id=70285

To compute a product $AB$.

Grasple Exercise 3.2.10

https://embed.grasple.com/exercises/d03c79a5-4936-41ae-8129-96ea9dee875a?id=82963

To compute several matrix products.

Grasple Exercise 3.2.11

https://embed.grasple.com/exercises/9fd59a3b-bdc6-42c5-af90-da9b0541437b?id=70291

To compute $\vect{u}^T\vect{v}$ and $\vect{u}\vect{v}^T$.

Grasple Exercise 3.2.12

https://embed.grasple.com/exercises/6e4d152b-1eae-480b-a40c-ca8846ed6612?id=70286

To find $k$ for which $AB=BA$.

Grasple Exercise 3.2.13

https://embed.grasple.com/exercises/65f960ef-01a1-4c81-b053-8c93c66504db?id=70287

To find $k$ for which $AB=BA$.

Grasple Exercise 3.2.14

https://embed.grasple.com/exercises/d2ccfcf5-7aaf-4859-8219-392abad68e79?id=82853

To find two products $AD_1$ and $D_2A$.

Grasple Exercise 3.2.15

https://embed.grasple.com/exercises/2fc08e2c-b3ad-4a2b-8077-ce66abc466d7?id=82936

To find a high power of a special matrix.

Grasple Exercise 3.2.16

https://embed.grasple.com/exercises/6bd96baf-1862-40c7-a21d-24c1dada9078?id=82937

To find a high power of a special matrix.

The remaining exercises are less of a computational character.

Grasple Exercise 3.2.17

https://embed.grasple.com/exercises/786324ef-8706-4f4d-ac06-f6b4360a70d8?id=69285

Is the zero matrix a diagonal matrix?

Grasple Exercise 3.2.18

https://embed.grasple.com/exercises/14b6af51-de5f-4e6c-bfb8-20a018fce053?id=69458

To explain why a certain product does not exist.

Grasple Exercise 3.2.19

https://embed.grasple.com/exercises/cdb5014d-eace-489e-9616-45e03bb6e95e?id=69295

To find a $2\times2$ matrix $A$ for which $A^2 = -I$.

Grasple Exercise 3.2.20

https://embed.grasple.com/exercises/7f91a5d2-e1c9-422e-b0f9-ba0b22936e2a?id=69456

To show that $(cA)^T = cA^T$.

Grasple Exercise 3.2.21

https://embed.grasple.com/exercises/78c129ac-644d-4fbd-bf47-c2283d0e1f7a?id=69460

To show: $A^TA = D \iff A$ has orthogonal columns.

Grasple Exercise 3.2.22

https://embed.grasple.com/exercises/9fe5f92d-54f9-4794-a2e8-c21a24a5a8cf?id=70288

To find the size of $C$ if $AC = B$.

Grasple Exercise 3.2.23

https://embed.grasple.com/exercises/bbed8637-4110-4e90-a1dc-a5960a405caf?id=70289

Number of columns of $C$ if $AC=B$.

Grasple Exercise 3.2.24

https://embed.grasple.com/exercises/eb4b9e6e-0436-466c-bb1f-7e596b43ec34?id=70290

To find the number of rows of $B$ if $BC$ is an $m\times n$-matrix.

Grasple Exercise 3.2.25

https://embed.grasple.com/exercises/deea79ca-ba41-46fc-b75b-4cd109fc0513?id=71118

Finding $E$ such that $EA = M$ (or $AE = M$).

Grasple Exercise 3.2.26

https://embed.grasple.com/exercises/0b27fa70-e097-4090-b57e-7225019a4624?id=78589

A bit like the previous one ‘by inspection’.

Grasple Exercise 3.2.27

https://embed.grasple.com/exercises/958761d7-421f-40f1-b3be-3535bf71422b?id=82968

Two True/False questions about products and symmetric matrices.
