3.2. Matrix operations
3.2.1. Introduction
In Chapter 2 matrices were introduced to represent systems of linear equations. The coefficients of a linear system were put into the coefficient matrix \(A\), and a system as a whole could be squeezed into the augmented matrix. In Section 3.1 we used matrices to construct linear transformations. In this chapter we will study matrices as entities on their own, though every now and then we will keep in mind their role in the two contexts just mentioned.
3.2.2. Sum, scalar multiple and transpose
In this section we will define the sum and the product of two matrices, and the transpose of a matrix. Recall that an \(m\times n\)-matrix has \(m\) (horizontal) rows of size \(n\) or, equivalently, \(n\) (vertical) columns of size \(m\).
Definition 3.2.1 (Equality of matrices)
Two matrices are said to have the same size if they have the same number of rows and the same number of columns.
Two matrices \(A\) and \(B\) are equal if they have the same size, say \(m\) rows and \(n\) columns, and all the corresponding entries are equal, i.e.
\[ a_{ij} = b_{ij} \quad \text{for all } i = 1,\ldots,m \text{ and } j = 1,\ldots,n. \]
Definition 3.2.2
A zero matrix \(O\) is a matrix with all entries equal to 0. If the context requires clarity as to its size it may be denoted by \(O_{mn}\).
Definition 3.2.3 (Scalar multiplication)
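If \(A\) is an \(m\times n\)-matrix and \(c\) is a scalar, then \(cA\) is the \(m \times n\)-matrix that is the result of multiplying each entry of \(A\) by \(c\):
\[ cA = \begin{pmatrix} ca_{11} & ca_{12} & \cdots & ca_{1n} \\ ca_{21} & ca_{22} & \cdots & ca_{2n} \\ \vdots & \vdots & & \vdots \\ ca_{m1} & ca_{m2} & \cdots & ca_{mn} \end{pmatrix}. \]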
We then say that \(cA\) is a scalar multiple of \(A\), or simply a multiple of \(A\).
Definition 3.2.4 (The sum of two matrices)
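If \(A\) and \(B\) are two \(m\times n\)-matrices then the sum \(A+B\) is the \(m\times n\)-matrix of which the entry on the position \((i,j)\) is the sum of the corresponding entries of \(A\) and \(B\):
\[ A + B = \begin{pmatrix} a_{11}+b_{11} & \cdots & a_{1n}+b_{1n} \\ \vdots & & \vdots \\ a_{m1}+b_{m1} & \cdots & a_{mn}+b_{mn} \end{pmatrix}. \]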
If \(A\) and \(B\) are not of the same size their sum is not defined.
Example 3.2.1
The multiple \((-1)A\) is also written as \(-A\). An obvious property, illustrated in the third example, is:
\[ A + (-A) = O, \]
where \(O\) is the zero matrix.
Example 3.2.2
is not defined. This is because the matrices do not have the same size.
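The componentwise definitions of the scalar multiple and the sum translate almost literally into code. The following is a minimal sketch in plain Python, assuming a matrix is stored as a list of rows; the function names and the matrices `A` and `B` are hypothetical and chosen only for illustration.

```python
def scalar_multiple(c, A):
    """Return cA: every entry of A multiplied by the scalar c."""
    return [[c * entry for entry in row] for row in A]


def matrix_sum(A, B):
    """Return A + B, or raise ValueError if the sizes differ."""
    same_size = len(A) == len(B) and all(
        len(row_a) == len(row_b) for row_a, row_b in zip(A, B)
    )
    if not same_size:
        raise ValueError("A + B is not defined: the matrices do not have the same size")
    return [[a + b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]


# Hypothetical 2x3 matrices, chosen only for illustration.
A = [[1, 2, 3], [4, 5, 6]]
B = [[0, -1, 2], [3, 0, -2]]

print(matrix_sum(A, B))                       # [[1, 1, 5], [7, 5, 4]]
print(scalar_multiple(3, A))                  # [[3, 6, 9], [12, 15, 18]]
print(matrix_sum(A, scalar_multiple(-1, A)))  # [[0, 0, 0], [0, 0, 0]]
```

The last line illustrates \(A + (-A) = O\); calling `matrix_sum` on matrices of different sizes raises an error, in line with the sum being undefined in that case.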
Remark 3.2.1
The two definitions of sum and scalar multiple are called componentwise definitions. They are completely analogous to the definitions of the scalar multiples of a vector and the sum of two vectors. Hence it is not surprising that they obey exactly the same rules, as is summarised in the next proposition (cf. Section Vectors).
Proposition 3.2.1
Suppose \(A, B\) and \(C\) are \(m\times n\)-matrices and let \(c_{1},c_{2}\) be two real numbers. Then we have:
i. \(A+O_{mn}=A=O_{mn}+A\).
ii. \((A+B)+C=A+(B+C)\).
iii. \(A+B=B+A\).
iv. \(A+(-A)=O\).
v. \(1A=A\).
vi. \(c_{1}(A+B)=c_{1}A+c_{1}B\).
vii. \((c_{1}+c_{2})A=c_{1}A+c_{2}A\).
viii. \(c_{1}(c_{2}A)=(c_{1}c_{2})A\).
An operation whose usefulness is not immediately clear, but which fits well in this section on matrix operations, is the following:
Definition 3.2.5
The transpose of an \(m \times n\)-matrix \(A\) with entries \(a_{ij}\) is the \(n \times m\)-matrix \(B\) with entries \(b_{ij}\) defined by
\[ b_{ij} = a_{ji}, \quad 1 \leq i \leq n, \quad 1 \leq j \leq m. \]
It is denoted by \(B = A^T\).
Example 3.2.3
The following rules involving the three operators defined so far in this section are easy to prove:
Proposition 3.2.2
Let \(A\) and \(B\) be \(m\times n\)-matrices and \(c\) a scalar. Then we have
i. \((cA)^T = c A^T\).
ii. \((A+B)^T = A^T + B^T\).
iii. \((A^T)^T = A\).
Proof of Proposition 3.2.2
We will prove the second statement and leave the other two to the diligent reader. See Exercise 3.2.1.
So, suppose \(A\) and \(B\) are two \(m \times n\)-matrices. Then \(A+B\) is an \(m \times n\)-matrix too, hence \((A+B)^T\) is an \(n \times m\)-matrix. The matrix \(A^T + B^T\) on the right-hand side of the equation is the sum of two \(n \times m\)-matrices, which is again an \(n \times m\)-matrix. So the matrices on both sides of the equation have the same size.
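Next we have to show that they have equal entries on the corresponding positions. If we put
\[ C = A + B, \quad \text{so that} \quad c_{ij} = a_{ij} + b_{ij}, \]
we see that
\[ \left((A+B)^T\right)_{ij} = \left(C^T\right)_{ij} = c_{ji} = a_{ji} + b_{ji} \]
and
\[ \left(A^T + B^T\right)_{ij} = \left(A^T\right)_{ij} + \left(B^T\right)_{ij} = a_{ji} + b_{ji}, \]
so we are done.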
If you are lost in the forest of indices, have a look at Example 3.2.4.
Example 3.2.4
We check property (ii) for two general \(3\times 4\)-matrices \(A\) and \(B\) on the position \((2,3)\). Let
Then
so
and on position \((2,3)\) we have \(a_{32}+b_{32}\).
On the other hand
with the value \(a_{32}+b_{32}\) on position \((2,3)\).
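The index bookkeeping of Example 3.2.4 can also be checked numerically. The sketch below is plain Python with hypothetical \(3\times 4\)-matrices; it verifies rules (ii) and (iii) of Proposition 3.2.2 for these particular matrices only, so it is an illustration rather than a proof.

```python
def transpose(A):
    """Return the transpose: entry (i, j) of the result is entry (j, i) of A."""
    rows, cols = len(A), len(A[0])
    return [[A[j][i] for j in range(rows)] for i in range(cols)]


def matrix_sum(A, B):
    """Entrywise sum of two matrices of the same size."""
    return [[a + b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]


# Hypothetical 3x4 matrices, chosen only for illustration.
A = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]
B = [[0, 1, 0, 1], [2, 0, 2, 0], [0, 3, 0, 3]]

left = transpose(matrix_sum(A, B))               # (A + B)^T
right = matrix_sum(transpose(A), transpose(B))   # A^T + B^T

print(left == right)                    # True
# Position (2,3) in the text is index [1][2] here, since Python counts from 0.
print(left[1][2], A[2][1] + B[2][1])    # 13 13
print(transpose(transpose(A)) == A)     # True
```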
Exercise 3.2.1
Prove statements (i) and (iii) of Proposition 3.2.2.
Solution to Exercise 3.2.1
Suppose \(A = \left(\begin{array}{cccc} a_{11} & a_{12}& \cdots& a_{1n} \\ a_{21} & a_{22}& \cdots& a_{2n} \\ \vdots & \vdots& & \vdots \\ a_{m1} & a_{m2}& \cdots& a_{mn} \end{array} \right)\) is an arbitrary \(m \times n\)-matrix. Then
As regards the other statement, i.e., \((A^T)^T = A\), suppose \(A\) is an \(m\times n\)-matrix, \(B = A^T\), and \(C = B^T\). We have to show that \(C = A\).
Now first of all, if \(A\) is an \(m\times n\)-matrix, then \(B\) is an \(n\times m\)-matrix, and transposing again gives an \(m \times n\)-matrix \(C\), so \(C\) has the same shape as \(A\).
Furthermore, transposing means ‘flipping’ the indices. We quickly see that \(C_{ij} = B_{ji} = A_{ij}\), for \(1 \leq i \leq m\), \(1 \leq j \leq n\), so entry by entry \(A\) and \(C\) are equal.
Example 3.2.5
We will solve the equation \(A + 2X^T + B = C\) for \(X\), where
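We will extricate \(X\) step by step:
\[ A + 2X^T + B = C \iff 2X^T = C - A - B \iff X^T = \tfrac12(C - A - B). \]
Next we transpose both sides to find
\[ X = \tfrac12(C - A - B)^T. \]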
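As a sanity check of this kind of manipulation, the following sketch solves \(A + 2X^T + B = C\) for hypothetical \(2\times 3\)-matrices \(A\), \(B\) and \(C\) (not the matrices of the example) by isolating \(2X^T\), halving, and transposing. Exact fractions are used so that the division by \(2\) introduces no rounding.

```python
from fractions import Fraction


def transpose(M):
    return [list(col) for col in zip(*M)]


def add(M, N):
    return [[a + b for a, b in zip(rm, rn)] for rm, rn in zip(M, N)]


def scale(c, M):
    return [[c * entry for entry in row] for row in M]


# Hypothetical data, chosen only for illustration.
A = [[1, 0, 2], [3, 1, 0]]
B = [[0, 1, 1], [1, 1, 2]]
C = [[3, 5, 3], [6, 2, 8]]

# 2 X^T = C - A - B, hence X = (1/2) (C - A - B)^T.
R = add(C, scale(-1, add(A, B)))
X = scale(Fraction(1, 2), transpose(R))

# Substitute X back into the left-hand side of the equation.
lhs = add(add(A, scale(2, transpose(X))), B)
print(lhs == C)  # True
```

Note that \(X\) comes out as a \(3\times 2\)-matrix: \(X^T\) must have the same size as \(A\), \(B\) and \(C\) for the sum to be defined.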
3.2.3. Grasple exercises (1)
Grasple Exercise 3.2.1
To compute the sum of two matrices.
Grasple Exercise 3.2.2
To compute \(c_1A + c_2B\).
Grasple Exercise 3.2.3
To compute \(c_1A + c_2B\).
Grasple Exercise 3.2.4
To solve equations involving sum and transpose.
Grasple Exercise 3.2.5
True/False questions involving sum and transpose.
3.2.4. The product of two matrices
Next we turn our attention to the most important matrix operation, namely the product \(AB\) of two matrices. In the previous chapter we have already seen the special case where \(B\) is a matrix of just one column, i.e. a vector in \(\mathbb{R}^n\), which we can identify with an \(n \times 1\)-matrix. We want of course the definition of the general matrix product to be consistent with this.
Definition 3.2.6
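The product of an \(m\times n\)-matrix \(A\) and an \(n\times p\)-matrix \(B = (\,{\mathbf{b}_1}\quad {\mathbf{b}_2}\quad \cdots \quad {\mathbf{b}_p})\) is defined by
\[ AB = (\,A\mathbf{b}_1\quad A\mathbf{b}_2\quad \cdots \quad A\mathbf{b}_p). \]
So the \(j\)-th column of \(AB\) is the matrix-vector product \(A\mathbf{b}_j\), for \(j = 1,\ldots,p\).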
Note that this makes \(AB\) an \(m \times p\)-matrix.
If the number of columns of \(A\) is not equal to the number of rows of \(B\) the product \(AB\) is not defined.
Example 3.2.6
For instance, the third column is computed as
Proposition 3.2.3
The product of the \(m\times n\)-matrix \(A\) and the \(n\times p\)-matrix \(B\) is the \(m\times p\)-matrix \(C\) for which the entry on the position \((i,j)\) is given by
\[ c_{ij} = a_{i1}b_{1j} + a_{i2}b_{2j} + \cdots + a_{in}b_{nj} = \sum_{k=1}^{n} a_{ik}b_{kj}. \]
This is sometimes called the row-column expansion of the product.
Proof of Proposition 3.2.3
We already saw this row-column expansion in Section 2.4.
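To see that the column definition and the row-column expansion describe the same product, both can be implemented side by side. This is a minimal sketch in plain Python; the function names and the matrices are hypothetical, and matrices are assumed to be stored as lists of rows.

```python
def mat_vec(A, x):
    """Return Ax as the linear combination x_1*a_1 + ... + x_n*a_n of the columns of A."""
    result = [0] * len(A)
    for k, x_k in enumerate(x):
        for i in range(len(A)):
            result[i] += x_k * A[i][k]
    return result


def product_by_columns(A, B):
    """Column definition: the j-th column of AB is A times the j-th column of B."""
    if len(A[0]) != len(B):
        raise ValueError("AB is not defined: columns of A must match rows of B")
    columns = [mat_vec(A, [row[j] for row in B]) for j in range(len(B[0]))]
    return [list(row) for row in zip(*columns)]  # turn the list of columns into rows


def product_row_column(A, B):
    """Row-column expansion: c_ij = a_i1*b_1j + ... + a_in*b_nj."""
    if len(A[0]) != len(B):
        raise ValueError("AB is not defined: columns of A must match rows of B")
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]


# Hypothetical matrices: A is 3x2 and B is 2x3, so AB is 3x3.
A = [[1, 2], [3, 4], [5, 6]]
B = [[1, 0, -1], [2, 1, 0]]

print(product_by_columns(A, B))   # [[5, 2, -1], [11, 4, -3], [17, 6, -5]]
print(product_row_column(A, B))   # [[5, 2, -1], [11, 4, -3], [17, 6, -5]]
```

Both functions refuse a product such as `product_row_column(A, A)`, where a \(3\times 2\)-matrix would be multiplied by a \(3\times 2\)-matrix.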
The following scheme nicely visualises the row-column expansion
Example 3.2.7
Let us again consider the matrix product
The \(-5\) on position \((1,3)\) and the \(3\) on position \((3,2)\) in the product come from
Exercise 3.2.2
Explain why the product
is not defined.
Solution to Exercise 3.2.2
The first matrix is \(3\times 2\), just as the second matrix. The product is not defined, as the number of columns (\(2\)) of the first matrix does not match the number of rows (\(3\)) of the second matrix.
Remark 3.2.2
The product of a matrix \(A\) with itself is only defined if \(A\) is an \(n \times n\)-matrix. In that case we use the obvious notation
\[ A^2 = AA. \]
Example 3.2.8
This example illustrates the existence of a unit element with respect to the multiplication. To identify it we first introduce some more terminology.
Definition 3.2.7
An \(n\times n \)-matrix \(A\) is called a square matrix. So it is a matrix where the number of columns is equal to the number of rows.
For a square matrix \(A\) we call the elements \(a_{ii}\) the diagonal elements. Together the diagonal elements form the (main) diagonal of \(A\).
A square matrix where all non-diagonal elements are equal to \(0\) is called a diagonal matrix.
Remark 3.2.3
The other diagonal of a square matrix, the one from bottom left to top right, plays a minor role. For this reason we don’t reserve a name for it. By ‘diagonal’ we will always mean: main diagonal.
Example 3.2.9
Consider the matrices
The matrices \(A\) and \(B\) are square, and only \(B\) is a diagonal matrix.
Exercise 3.2.3
Is the following statement true or false?
The \(n \times n\) zero matrix \(O_{nn}\) is a diagonal matrix.
Solution to Exercise 3.2.3
Recall Definition 3.2.7 and Definition 3.2.2.
\(O_{nn}\) is a square matrix that has \(0\)s everywhere, so definitely all non-diagonal elements are equal to zero.
Therefore, the statement is true.
Exercise 3.2.4
Suppose \(A = \begin{pmatrix} \mathbf{a}_1 & \mathbf{a}_2 & \cdots & \mathbf{a}_n \end{pmatrix} \) is an \(m\times n\)-matrix and
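\(B= \begin{pmatrix} \mathbf{b}_1 & \mathbf{b}_2 & \cdots & \mathbf{b}_p \end{pmatrix} \) an \(m\times p\)-matrix. Show that
\[ A^TB = \begin{pmatrix} \mathbf{a}_1\ip\mathbf{b}_1 & \mathbf{a}_1\ip\mathbf{b}_2 & \cdots & \mathbf{a}_1\ip\mathbf{b}_p \\ \mathbf{a}_2\ip\mathbf{b}_1 & \mathbf{a}_2\ip\mathbf{b}_2 & \cdots & \mathbf{a}_2\ip\mathbf{b}_p \\ \vdots & \vdots & & \vdots \\ \mathbf{a}_n\ip\mathbf{b}_1 & \mathbf{a}_n\ip\mathbf{b}_2 & \cdots & \mathbf{a}_n\ip\mathbf{b}_p \end{pmatrix}, \]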
where \(\mathbf{a}\ip\mathbf{b}\) is the dot product of the vectors \(\mathbf{a}\) and \(\mathbf{b}\).
Hint
Note that now we can also write the dot product of two (column) vectors in \(\R^n\) as a matrix product. Namely
\[ \mathbf{a}\ip\mathbf{b} = \mathbf{a}^T\mathbf{b}. \]
Solution to Exercise 3.2.4
Exercise 3.2.5
The special case in the previous exercise where \(A = B\) will become very important when we look at orthogonal projections. For now, show that the columns of a matrix \(A\) are orthogonal if and only if the matrix \(A^TA\) is a diagonal matrix.
Solution to Exercise 3.2.5
First recognise that
If \(A\) has orthogonal columns, then at least \(\mathbf{a}_i\ip\mathbf{a}_j=0\) for \(i\neq j\). This gives
So \(A^TA\) is a diagonal matrix.
Conversely, if \(A^TA\) is a diagonal matrix, then \(\mathbf{a}_i\ip\mathbf{a}_j=0\) for \(i\neq j\), so \(A\) has orthogonal columns.
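The equivalence can be illustrated with a small computation. In the sketch below (plain Python, hypothetical matrices) the first matrix has orthogonal columns and the second does not; the entries of \(A^TA\) are exactly the dot products of the columns of \(A\).

```python
def transpose(A):
    return [list(col) for col in zip(*A)]


def matmul(A, B):
    """Row-column expansion of the product AB."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]


def is_diagonal(M):
    """True if M is square and all entries off the main diagonal are zero."""
    n = len(M)
    return all(len(row) == n for row in M) and all(
        M[i][j] == 0 for i in range(n) for j in range(n) if i != j
    )


# Columns (1,1,1) and (2,-1,-1) are orthogonal: their dot product is 2 - 1 - 1 = 0.
A = [[1, 2], [1, -1], [1, -1]]
# Columns (1,0) and (1,1) are not orthogonal: their dot product is 1.
B = [[1, 1], [0, 1]]

print(matmul(transpose(A), A), is_diagonal(matmul(transpose(A), A)))  # [[3, 0], [0, 6]] True
print(matmul(transpose(B), B), is_diagonal(matmul(transpose(B), B)))  # [[1, 1], [1, 2]] False
```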
Definition 3.2.8
The identity matrix \(I_n\) is the \(n \times n\) diagonal matrix with \(1\)s on the diagonal. If the size is irrelevant or clear from the context, we denote it simply by \(I\).
Exercise 3.2.6
Let
Show that \(IA = A\).
Solution to Exercise 3.2.6
We first focus on the first column of \(A=\begin{pmatrix}\mathbf{a}_1&\mathbf{a}_2&\mathbf{a}_3\end{pmatrix}=\begin{pmatrix} a_{11}& a_{12} & a_{13} \\ a_{21}& a_{22} & a_{23} \\ a_{31}& a_{32} & a_{33} \\ a_{41} & a_{42} & a_{43} \end{pmatrix}\):
Similarly, \(I\mathbf{a}_2=\mathbf{a}_2\) and \(I\mathbf{a}_3=\mathbf{a}_3\).
Therefore,
The definition of the product of two matrices and the earlier definition of the product of a matrix and a vector (Definition 2.4.1) immediately imply that the columns of the product of two matrices are linear combinations of the columns of the first matrix.
As is often the case in linear algebra, things can be looked at from a different perspective. From Proposition 3.2.3 it follows that the elements \(c_{i1},c_{i2},\ldots,c_{ip}\) of the \(i\)-th row of the product \(C = AB\) depend, as far as \(A\) is concerned, only on the elements \(a_{ik}\) of its \(i\)-th row. The following proposition explains in which way.
Proposition 3.2.4
The \(i\)-th row of the product \(AB\) is the linear combination of the rows of the second matrix, \(B\), with the entries of the \(i\)-th row of \(A\) as coefficients.
Proof of Proposition 3.2.4
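The indicated linear combination yields:
\[ a_{i1}\begin{pmatrix} b_{11} & b_{12} & \cdots & b_{1p} \end{pmatrix} + a_{i2}\begin{pmatrix} b_{21} & b_{22} & \cdots & b_{2p} \end{pmatrix} + \cdots + a_{in}\begin{pmatrix} b_{n1} & b_{n2} & \cdots & b_{np} \end{pmatrix}. \]
This is a row vector with on the \(j\)-th position the number
\[ a_{i1}b_{1j} + a_{i2}b_{2j} + \cdots + a_{in}b_{nj}, \]
and that is precisely the entry \(c_{ij}\) of the matrix \(C = AB\).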
Interestingly this opens the way to describe the row operations of Chapter 2 via matrix multiplication. The following example illustrates this for the three basic row operations.
Example 3.2.10
The following multiplication adds the first row of the matrix
four times to the second row:
Here the third row is scaled with a factor \(5\):
And with the following multiplication the first and third row of \(A\) are swapped:
For future reference we give these matrices a name:
Definition 3.2.9
The matrices \(E\) that perform one single row operation (row replacement, row scaling, or row exchange) via \(A \mapsto EA\) are called elementary matrices.
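The three types of elementary matrices can be tried out on a small matrix. The sketch below uses plain Python and a hypothetical \(3\times 3\)-matrix `A` (not the matrix of Example 3.2.10); each elementary matrix is the identity matrix with the corresponding row operation applied to it.

```python
def matmul(A, B):
    """Row-column expansion of the product AB."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]


# Hypothetical matrix, chosen only for illustration.
A = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]

E_add = [[1, 0, 0], [4, 1, 0], [0, 0, 1]]    # add 4 times row 1 to row 2
E_scale = [[1, 0, 0], [0, 1, 0], [0, 0, 5]]  # scale row 3 by a factor 5
E_swap = [[0, 0, 1], [0, 1, 0], [1, 0, 0]]   # exchange rows 1 and 3

print(matmul(E_add, A))    # [[1, 2, 3], [8, 13, 18], [7, 8, 9]]
print(matmul(E_scale, A))  # [[1, 2, 3], [4, 5, 6], [35, 40, 45]]
print(matmul(E_swap, A))   # [[7, 8, 9], [4, 5, 6], [1, 2, 3]]
```

In each case the \(i\)-th row of \(EA\) is the linear combination of the rows of \(A\) with the entries of the \(i\)-th row of \(E\) as coefficients, exactly as Proposition 3.2.4 states.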
Exercise 3.2.7
Describe in words which row operations are the effect of pre-multiplying a \(4\times n\)-matrix \(A\) with the following elementary matrices:
Solution to Exercise 3.2.7
\(E_1\) is the elementary matrix that subtracts row \(4\) of \(A\) from row \(2\) of \(A\).
\(E_2\) is the elementary matrix that switches row \(2\) of \(A\) and row \(4\) of \(A\).
Example 3.2.11
The following product may at first sight seem a bit odd, but it is exactly according to the definition:
The column-row product in the last example is the building block for yet another way to look at the matrix product. The next exercise explains how.
Exercise 3.2.8 (Column-row expansion of the product)
Denote the columns of the \(m\times n\)-matrix \(A\) by \(A_{(1)}, \ldots, A_{(n)}\), and the rows of the \(n\times p\)-matrix \(B\) by \(B^{(1)}, \ldots, B^{(n)}\), so
Show that
\[ AB = A_{(1)}B^{(1)} + A_{(2)}B^{(2)} + \cdots + A_{(n)}B^{(n)}, \]
i.e., \(AB\) is the sum of \(n\) column-row products (like in Example 3.2.11).
Solution to Exercise 3.2.8
Using Proposition 3.2.3, we see that with \(C=AB\) and
Now consider one column-row product:
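This shows that \(\left(A_{(k)}B^{(k)}\right)_{ij}=a_{ik}b_{kj}\). It now follows that
\[ c_{ij} = \sum_{k=1}^{n} a_{ik}b_{kj} = \sum_{k=1}^{n} \left(A_{(k)}B^{(k)}\right)_{ij}. \]
This shows that
\[ AB = A_{(1)}B^{(1)} + A_{(2)}B^{(2)} + \cdots + A_{(n)}B^{(n)}. \]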
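The column-row expansion can be checked on a concrete case. The following sketch (plain Python, hypothetical matrices and function names) builds each column-row product \(A_{(k)}B^{(k)}\) separately and adds them up.

```python
def outer(column, row):
    """Column-row product: an m x p matrix with entries column[i] * row[j]."""
    return [[c * r for r in row] for c in column]


def add(M, N):
    return [[a + b for a, b in zip(rm, rn)] for rm, rn in zip(M, N)]


def product_column_row(A, B):
    """AB as the sum of n column-row products (column k of A times row k of B)."""
    n = len(B)
    total = [[0] * len(B[0]) for _ in range(len(A))]
    for k in range(n):
        column_k = [row[k] for row in A]
        total = add(total, outer(column_k, B[k]))
    return total


# Hypothetical matrices: A is 3x2 and B is 2x3, so there are n = 2 column-row products.
A = [[1, 2], [3, 4], [5, 6]]
B = [[1, 0, -1], [2, 1, 0]]

print(outer([1, 3, 5], B[0]))    # [[1, 0, -1], [3, 0, -3], [5, 0, -5]]
print(outer([2, 4, 6], B[1]))    # [[4, 2, 0], [8, 4, 0], [12, 6, 0]]
print(product_column_row(A, B))  # [[5, 2, -1], [11, 4, -3], [17, 6, -5]]
```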
3.2.5. Properties of the matrix product
Now let us see which of the rules for products of numbers also hold for products of matrices, and which do not.
Proposition 3.2.5
For all \(m \times n\)-matrices \(A,A_1,A_2\), all \(n \times p\)-matrices \(B,B_1,B_2\), all \(p \times q\)-matrices \(C\) and all real numbers \(c\) the following are true:
i. \(A(B_1+B_2) = AB_1 + AB_2\) and \((A_1+A_2)B = A_1B+A_2B\);
ii. \(A(cB) = c(AB) = (cA)B\);
iii. \(AI_n = A\) and \(I_mA = A\) (the identity matrix \(I\) acts as a unit element);
iv. \(A(BC) = (AB)C\).
Example 3.2.12
As an illustration of rule iv. we compute the two triple products for the three matrices
On the one hand
and on the other hand
So the products are indeed equal. But it is not immediately clear how. For instance, the value \(14\) on position \((2,2)\) comes about in two ways
We need a good perspective to give a proof of the general case.
Proof of Proposition 3.2.5
Rules i. and ii. are checked in a straightforward way. See Exercise 3.2.9.
iii. We saw instances of this property already in Example 3.2.8 and Exercise 3.2.6. For the general case, one way to show validity of the first statement is to note that the \(j\)-th column of \(AI_n\) is \(A\mathbf{e}_j\) where \(\mathbf{e}_j\) is the \(j\)-th column of the identity matrix \(I_n\). This gives the linear combination
\[ A\mathbf{e}_j = 0\mathbf{a}_1 + 0\mathbf{a}_2 + \cdots + 1\mathbf{a}_j +\dots + 0\mathbf{a}_n = \mathbf{a}_j \]
which shows that the \(j\)-th column of \(AI_n\) is equal to the \(j\)-th column of \(A\). And this holds for any column.
The identity \(I_mA = A\) is shown in an analogous way, working row by row.
iv. First we observe that both triple products yield \(m \times q\)-matrices. Then the identity can be proved ‘column by column’, as the previous one.
We are done if we can show that
\[\begin{split} \begin{array}{rcl} k\text{-th column of }A(BC) &=& k\text{-th column of }(AB)C \\ &=& (AB)( k\text{-th column of }C) = (AB)\mathbf{c}_k, \end{array} \end{split}\]
for \( k = 1,2,\ldots, q \).
Now recall that (by definition)
\[ k\text{-th column of }BC = B\mathbf{c}_k, \]
so
\[ k\text{-th column of }A(BC) = A\,(B\mathbf{c}_k). \]
Making extensive use of the rule
\[ A(c_1\mathbf{x} + c_2\mathbf{y}) = c_1A\mathbf{x} + c_2A\mathbf{y}, \]
we find
\[\begin{split} \begin{array}{ccl} A\,(B\mathbf{c}_k) & = & A \,(c_{1k}\mathbf{b}_1 +c_{2k}\mathbf{b}_2 + \cdots + c_{pk}\mathbf{b}_p)\\ & = & c_{1k}(A\mathbf{b}_1) +c_{2k}(A\mathbf{b}_2) + \cdots + c_{pk}(A\mathbf{b}_p)\\ & = & \begin{pmatrix} A\mathbf{b}_1 & A\mathbf{b}_2 & \cdots & A\mathbf{b}_p \end{pmatrix} \begin{pmatrix} c_{1k} \\ \vdots \\ c_{pk} \end{pmatrix} \\ & = & (AB)\mathbf{c}_k. \end{array} \end{split}\]
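A numerical check of rule iv. for one triple of matrices is sketched below in plain Python. The matrices are hypothetical, with sizes \(2\times 3\), \(3\times 2\) and \(2\times 2\); the check illustrates the rule but does not replace the proof.

```python
def matmul(A, B):
    """Row-column expansion of the product AB."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]


# Hypothetical matrices, chosen only for illustration.
A = [[1, 0, 2], [0, 1, 1]]    # 2x3
B = [[1, 2], [0, 1], [1, 0]]  # 3x2
C = [[1, 1], [0, 2]]          # 2x2

BC = matmul(B, C)
AB = matmul(A, B)
print(BC)              # [[1, 5], [0, 2], [1, 1]]
print(AB)              # [[3, 2], [1, 1]]
print(matmul(A, BC))   # [[3, 7], [1, 3]]
print(matmul(AB, C))   # [[3, 7], [1, 3]]
```

The intermediate products \(BC\) and \(AB\) look quite different, and they even have different sizes, yet both routes end in the same matrix.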
Exercise 3.2.9
Prove rules i. and ii. of Proposition 3.2.5.
Recall that matrices are equal when they have the same size and the entries on corresponding positions are equal (which may be checked column by column or row by row).
Solution to Exercise 3.2.9
i. We show the first identity, assuming \(B_1=\begin{pmatrix}\mathbf{b}^1_1&\mathbf{b}^1_2&\cdots&\mathbf{b}^1_p\end{pmatrix}\) and \(B_2=\begin{pmatrix}\mathbf{b}^2_1&\mathbf{b}^2_2&\cdots&\mathbf{b}^2_p\end{pmatrix}\):
\[\begin{split} \begin{align*} A(B_1+B_2) &= A\begin{pmatrix}\mathbf{b}^1_1+\mathbf{b}^2_1&\mathbf{b}^1_2+\mathbf{b}^2_2&\cdots&\mathbf{b}^1_p+\mathbf{b}^2_p\end{pmatrix} \\ &= \begin{pmatrix}A\left(\mathbf{b}^1_1+\mathbf{b}^2_1\right)&A\left(\mathbf{b}^1_2+\mathbf{b}^2_2\right)&\cdots&A\left(\mathbf{b}^1_p+\mathbf{b}^2_p\right)\end{pmatrix} \\ &= \begin{pmatrix}A\mathbf{b}^1_1+A\mathbf{b}^2_1&A\mathbf{b}^1_2+A\mathbf{b}^2_2&\cdots&A\mathbf{b}^1_p+A\mathbf{b}^2_p\end{pmatrix} \\ &= \begin{pmatrix}A\mathbf{b}^1_1&A\mathbf{b}^1_2&\cdots&A\mathbf{b}^1_p\end{pmatrix}+\begin{pmatrix}A\mathbf{b}^2_1&A\mathbf{b}^2_2&\cdots&A\mathbf{b}^2_p\end{pmatrix} \\ &= A\begin{pmatrix}\mathbf{b}^1_1&\mathbf{b}^1_2&\cdots&\mathbf{b}^1_p\end{pmatrix}+A\begin{pmatrix}\mathbf{b}^2_1&\mathbf{b}^2_2&\cdots&\mathbf{b}^2_p\end{pmatrix} \\ &= AB_1+AB_2. \end{align*} \end{split}\]
The second identity can be shown similarly.
ii. We only show the first equality, using \(B=\begin{pmatrix}\mathbf{b}_1&\mathbf{b}_2&\cdots&\mathbf{b}_p\end{pmatrix}\):
\[\begin{split} \begin{align*} A(cB) &= A\left(c\begin{pmatrix}\mathbf{b}_1&\mathbf{b}_2&\cdots&\mathbf{b}_p\end{pmatrix}\right) \\ &= A\begin{pmatrix}c\mathbf{b}_1&c\mathbf{b}_2&\cdots&c\mathbf{b}_p\end{pmatrix} \\ &= \begin{pmatrix}A\left(c\mathbf{b}_1\right)&A\left(c\mathbf{b}_2\right)&\cdots&A\left(c\mathbf{b}_p\right)\end{pmatrix} \\ &= \begin{pmatrix}c\left(A\mathbf{b}_1\right)&c\left(A\mathbf{b}_2\right)&\cdots&c\left(A\mathbf{b}_p\right)\end{pmatrix} \\ &= c\begin{pmatrix}A\mathbf{b}_1&A\mathbf{b}_2&\cdots&A\mathbf{b}_p\end{pmatrix} \\ &= c\left(AB\right). \end{align*} \end{split}\]
The second equality can be shown similarly.
Remark 3.2.4
The proof of Proposition 3.2.5 iv. can be seen in another light. In the Section Linear transformations we saw that an \(m\times n\)-matrix \(A\) defines a transformation \(T\) from \(\mathbb{R}^n\) to \(\mathbb{R}^m\), namely
\[ T(\mathbf{x}) = A\mathbf{x}. \]
The definition of the product of two matrices then precisely matches the composition of two such transformations: if \(A\) is an \(m\times n\)-matrix and \(B\) is an \(n\times p\)-matrix
and
yield the same vector:
So far so good: matrix multiplication behaves as multiplication of numbers. However, in two important respects the concepts deviate. First of all, commutativity no longer holds.
Example 3.2.13
For the matrices
it is clear that
simply because the two products are not of the same size: \(AB\) is a \(2\times 2\)-matrix, \(BA\) a \(3\times3\)-matrix.
The following example illustrates that \(AB = BA\) is not even guaranteed for two \(n\times n\)-matrices \(A\) and \(B\):
The fact that \(AB \neq BA\) can be understood by thinking about the composition of the two transformations corresponding to \(A\) and \(B\). (See Section Linear transformations.)
The following example and exercise shed some light on the non-commutativity.
Example 3.2.14
Consider the two matrices
and the corresponding linear transformations
and
We get
and likewise
Fig. 3.2.1 \( \begin{pmatrix} 2 & 0 \\ 0 & 1 \end{pmatrix}\begin{pmatrix} 0 & 1 \\ 1&0 \end{pmatrix} \neq \begin{pmatrix} 0 & 1 \\ 1&0 \end{pmatrix}\begin{pmatrix} 2 & 0 \\ 0 & 1 \end{pmatrix}\).
Note that \(S\) is a transformation that ‘stretches’ horizontally, and \(T\) is a reflection. Figure 3.2.1 visualises the transformations corresponding to \(AB\) and \(BA\). When we apply the transformations one after another, the order in which we do this is important.
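The two products in the caption of Figure 3.2.1 are quickly computed. In the sketch below (plain Python) the names `stretch` and `reflect` are hypothetical labels for the two matrices in that caption; the unit vector \(\mathbf{e}_1\) is followed through both compositions to show that the order matters.

```python
def matmul(A, B):
    """Row-column expansion of the product AB."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]


stretch = [[2, 0], [0, 1]]  # stretches horizontally by a factor 2
reflect = [[0, 1], [1, 0]]  # reflection in the line x_1 = x_2

print(matmul(stretch, reflect))  # [[0, 2], [1, 0]]
print(matmul(reflect, stretch))  # [[0, 1], [2, 0]]

# Follow e_1 = (1, 0), written as a 2x1 matrix, through both compositions.
e1 = [[1], [0]]
print(matmul(stretch, matmul(reflect, e1)))  # [[0], [1]]: reflect first, then stretch
print(matmul(reflect, matmul(stretch, e1)))  # [[0], [2]]: stretch first, then reflect
```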
Exercise 3.2.10
Another way to understand why \(AB\neq BA\) is the following.
Recall that the matrices
perform row operations, when multiplied with a \(2 \times n\)-matrix \(A\).
i. Describe in words the row operations corresponding to \(E_1\) and \(E_2\).
ii. Describe in words the combined row operations corresponding to \(E_1E_2\) and \(E_2E_1\). Can you explain why \(E_1E_2 \neq E_2E_1\)?
iii. Compute \(E_1E_2\) and \(E_2E_1\) to double check the last non-identity.
Solution to Exercise 3.2.10
i. \(E_1\) is the elementary matrix corresponding to adding \(2\) times row \(1\) of \(A\) to row \(2\) of \(A\).
\(E_2\) is the elementary matrix corresponding to multiplying row \(1\) of \(A\) by \(3\).
ii. \(E_1E_2\) first multiplies row \(1\) by \(3\) and then adds that new row twice to row \(2\). The new first row is thus \(3\) times row \(1\) and the new second row is thus \(6\) times row \(1\) plus row \(2\).
\(E_2E_1\) first adds row \(1\) twice to row \(2\) and then multiplies row \(1\) by \(3\) (leaving the second row the same). The new first row is thus \(3\) times row \(1\) and the new second row is thus \(2\) times row \(1\) plus row \(2\).
\(E_1E_2\) and \(E_2E_1\) thus produce a different second row, so \(E_1E_2 \neq E_2E_1\).
iii. \(E_1E_2=\begin{pmatrix}3&0\\6&1\end{pmatrix}\).
\(E_2E_1=\begin{pmatrix}3&0\\2&1\end{pmatrix}\).
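The second major difference between the product of numbers and the product of matrices: for two (e.g. real) numbers \(a\) and \(b\) it is known that
\[ ab = 0 \quad \Longrightarrow \quad a = 0 \text{ or } b = 0, \]
or, equivalently,
\[ a \neq 0 \text{ and } b \neq 0 \quad \Longrightarrow \quad ab \neq 0. \]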
As the following example shows, things are different in the realm of matrices.
Example 3.2.15
So the product of two non-zero matrices may be the zero matrix.
The following example shows that things are even ‘worse’:
Example 3.2.16
which shows that we cannot even conclude from \(A A = O\) that \(A\) itself must be the zero matrix.
And here is another example of a non-zero matrix whose square is the zero matrix. In this case it can be seen geometrically what is going on.
Example 3.2.17
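For the matrix \(A = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}\) we again have that
\[ A^2 = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}\begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}. \]
It holds that
\[ A = A_2A_1, \quad \text{where} \quad A_1 = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} \quad \text{and} \quad A_2 = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}. \]
Now consider the transformations corresponding to the matrices \(A_1\) and \(A_2\).
\[ T_1(\vect{x}) = A_1\vect{x} \]
is the projection onto the \(x_2\)-axis, and
\[ T_2(\vect{x}) = A_2\vect{x} \]
is the clockwise rotation about the origin through an angle \(\frac12\pi\).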
Now let us see, step by step, what is the effect of the transformation \(T_2T_1T_2T_1\), corresponding to \(A^2\).
An arbitrary vector \(\vect{x}\) is sent to a vector \(T_1(\vect{x})\) on the \(x_2\)-axis by \(T_1\).
The rotation sends this to a vector \(T_2(T_1(\vect{x}))\) on the \(x_1\)-axis. Projecting onto the \(x_2\)-axis again brings this to \(T_1(T_2(T_1(\vect{x}))) = \vect{0}\).
Lastly, applying \(T_2\) leaves the zero vector where it is.
So \(A^2\vect{x} = T_2(T_1(T_2(T_1(\vect{x})))) = \vect{0}\), for each vector \(\vect{x}\).
See Figure 3.2.2.
Fig. 3.2.2 Visualisation of \(\vect{x} \mapsto A^2\vect{x}\).
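The geometric argument can be mirrored in a few lines of code. The sketch below is plain Python and assumes that \(A_1\) and \(A_2\) are the standard matrices of the projection onto the \(x_2\)-axis and of the clockwise rotation through \(\frac12\pi\); these explicit matrices, the names `A1` and `A2`, and the test vector are assumptions made for the illustration.

```python
def matmul(A, B):
    """Row-column expansion of the product AB."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]


A1 = [[0, 0], [0, 1]]   # assumed: projection onto the x_2-axis
A2 = [[0, 1], [-1, 0]]  # assumed: clockwise rotation through an angle pi/2

A = matmul(A2, A1)
print(A)             # [[0, 1], [0, 0]]
print(matmul(A, A))  # [[0, 0], [0, 0]]

# Follow a hypothetical vector x = (3, 4) step by step.
x = [[3], [4]]
step1 = matmul(A1, x)      # projected onto the x_2-axis
step2 = matmul(A2, step1)  # rotated onto the x_1-axis
step3 = matmul(A1, step2)  # projected again: the zero vector
print(step1, step2, step3)  # [[0], [4]] [[4], [0]] [[0], [0]]
```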
Remark 3.2.5
The next list gives six situations where matrix multiplication acts differently than multiplication of numbers.
In fact, all statements can be related to one of the first two.
i. In general, \(AB = BA\) does not hold for two \(n\times n\)-matrices \(A\) and \(B\).
ii. In general, from \(AB = O\) it does not follow that either \(A =O\) or \(B = O\).
iii. In general, \((A+B)(A+B) = A^2 + 2AB + B^2\) does not hold for two \(n\times n\)-matrices \(A\) and \(B\).
iv. In general, \((A+B)(A-B) = A^2 - B^2\) does not hold for two \(n\times n\)-matrices \(A\) and \(B\).
v. In general, from \(AB = AC\) and \(A \neq O\) it does not follow that \(B = C\).
vi. In general, from \(A^2 = I\) it does not follow that either \(A = I\) or \(A = -I\).
For each statement counterexamples can be given, as we already did for the first two. To get more insight into what is really going on, we can also try to find out how the third through sixth statements relate to the first two.
For instance, the third statement is closely related to the first. Let us check where ‘things go wrong’.
\[ (A+B)(A+B) = A(A+B) + B(A+B) = A^2 + AB + BA + B^2. \]
The last expression is only equal to
\[ A^2 + 2AB + B^2 \]
if
\[ AB + BA = 2AB. \]
And that is only the case if
\[ BA = AB. \]
So any pair of two matrices \(A\) and \(B\) with
\[ AB \neq BA \]
provides a counterexample where
\[ (A+B)(A+B) \neq A^2 + 2AB + B^2. \]
Likewise, v. follows from ii. Namely,
\[ AB = AC \iff AB - AC = O \iff A(B - C) = O. \]
According to ii. from the last equation we cannot deduce that
\[ A = O \quad \text{or} \quad B - C = O. \]
We can create a counterexample by taking for \(A\) and \(B\) non-zero matrices for which
\[ AB = O \]
and let \(C\) be the zero matrix. Then \(B \neq C\), whereas
\[ AB = O = AC. \]
Statement vi. also relates to ii. Namely,
\[ A^2 = I \iff A^2 - I = O \iff (A+I)(A-I) = O. \]
From the last equality we cannot conclude that one of the factors \((A+I)\) or \((A-I)\) must be the zero matrix. In this case we do not get a counterexample for free. You are asked to construct counterexamples in Exercise 3.2.11.
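Concrete counterexamples for statements iii. and v. can be produced along the lines just described. The sketch below is plain Python; all matrices in it are hypothetical and chosen only because they do not commute (for iii.) or have product zero (for v.).

```python
def matmul(A, B):
    """Row-column expansion of the product AB."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]


def add(*matrices):
    """Entrywise sum of any number of matrices of the same size."""
    return [[sum(entries) for entries in zip(*rows)] for rows in zip(*matrices)]


def scale(c, M):
    return [[c * entry for entry in row] for row in M]


# Statement iii.: a pair with AB != BA.
A = [[1, 1], [0, 1]]
B = [[1, 0], [1, 1]]
AB, BA = matmul(A, B), matmul(B, A)
print(AB, BA)  # [[2, 1], [1, 1]] [[1, 1], [1, 2]]

square = matmul(add(A, B), add(A, B))
print(square)                                                # [[5, 4], [4, 5]]
print(add(matmul(A, A), AB, BA, matmul(B, B)))               # [[5, 4], [4, 5]]
print(add(matmul(A, A), scale(2, AB), matmul(B, B)))         # [[6, 4], [4, 4]]

# Statement v.: P and Q are non-zero with PQ = O; take C = O.
P = [[1, 0], [0, 0]]
Q = [[0, 0], [0, 1]]
C = [[0, 0], [0, 0]]
print(matmul(P, Q) == matmul(P, C), Q == C)  # True False
```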
Exercise 3.2.11
i. Give a \(2 \times 2\)-matrix \(A\) not containing any zeros, for which \(A^2 = I\).
ii. Give a \(2 \times 2\)-matrix \(B\) for which \(B^2 = -I\).
Solution to Exercise 3.2.11
i. \(A=\begin{pmatrix}\frac{1}{\sqrt{2}}&\frac{1}{\sqrt{2}}\\\frac{1}{\sqrt{2}}&-\frac{1}{\sqrt{2}}\end{pmatrix}\).
ii. \(B=\begin{pmatrix}0&-1\\1&0\end{pmatrix}\).
The following property connects the two operations matrix transposition and matrix multiplication.
Proposition 3.2.6
If \(A\) is an \(m\times n\)-matrix and \(B\) an \(n\times p\)-matrix, then
\[ (AB)^T = B^TA^T. \]
Before we present the proof, we consider a typical example.
Example 3.2.18
We verify the rule for the two matrices
We compute:
and
so that indeed
Careful inspection shows that for the two matrix products exactly the same sums and products of numbers have to be computed. For instance, in both products \(12\) is the sum of products
As Example 3.2.18 illustrates, the rule is not restricted to square matrices \(A\) and \(B\). The proof for general matrices \(A\) and \(B\) for which the product \(AB\) is well defined is as follows.
Proof of Proposition 3.2.6
To show that
\[ (AB)^T = B^TA^T \]
we have to show that the matrices have the same size, and are equal entry by entry. First, we see that \(AB\) is an \(m \times p\)-matrix, so \((AB)^T\) is a \(p \times m\)-matrix, and \(B^TA^T\), being the product of a \(p \times n\)-matrix with an \(n \times m\)-matrix, is also a \(p \times m\)-matrix.
Second, the \((i,j)\) entry of \((AB)^T\) is the \((j,i)\) entry of \(AB\), which is the (row-column) product of the \(j\)-th row of \(A\) and the \(i\)-th column of \(B\):
\[ \begin{pmatrix} a_{j1} & a_{j2} & \cdots & a_{jn} \end{pmatrix} \begin{pmatrix} b_{1i} \\ b_{2i} \\ \vdots \\ b_{ni} \end{pmatrix}. \]
The \((i,j)\) entry of \(B^TA^T\) is the product of the \(i\)-th row of \(B^T\) and the \(j\)-th column of \(A^T\).
Now the \(i\)-th row of \(B^T\) is the \(i\)-th column of \(B\) written as a row, and the \(j\)-th column of \(A^T\) is the \(j\)-th row of \(A\) written as a column:
\[ \begin{pmatrix} b_{1i} & b_{2i} & \cdots & b_{ni} \end{pmatrix} \begin{pmatrix} a_{j1} \\ a_{j2} \\ \vdots \\ a_{jn} \end{pmatrix}. \]
Both row-column products end up as the same value
\[ a_{j1}b_{1i} + a_{j2}b_{2i} + \cdots + a_{jn}b_{ni}. \]
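The rule can be verified for a non-square pair of matrices with a short computation. This sketch is plain Python with hypothetical matrices; it also shows that the sizes only fit when the order of the factors is reversed.

```python
def transpose(A):
    return [list(col) for col in zip(*A)]


def matmul(A, B):
    """Row-column expansion of the product AB."""
    if len(A[0]) != len(B):
        raise ValueError("product not defined: columns of A must match rows of B")
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]


# Hypothetical matrices: A is 3x2 and B is 2x3.
A = [[1, 2], [3, 4], [5, 6]]
B = [[1, 0, -1], [2, 1, 0]]

left = transpose(matmul(A, B))             # (AB)^T
right = matmul(transpose(B), transpose(A)) # B^T A^T

print(left)           # [[5, 11, 17], [2, 4, 6], [-1, -3, -5]]
print(left == right)  # True

# With the factors in the original order, A^T B^T is a 2x2-matrix here,
# so it cannot be equal to the 3x3-matrix (AB)^T.
wrong_order = matmul(transpose(A), transpose(B))
print(len(wrong_order), len(wrong_order[0]))  # 2 2
```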
Remark 3.2.6
We already defined \(A^2\) for a square matrix \(A\). We can extend this to higher powers of \(A\) in an obvious way:
Since
we can do without the parentheses:
For the same reason
If we define
then
And what can we say about \(A^{-1}\)?
We will dedicate Section 3.4 to this topic.
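Powers of a square matrix can be computed by repeated multiplication. The sketch below is plain Python; it assumes the convention \(A^0 = I\), and the function names and the matrix `A` are hypothetical. It checks the rule \(A^kA^l = A^{k+l}\) for one choice of \(k\) and \(l\).

```python
def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def matmul(A, B):
    """Row-column expansion of the product AB."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]


def mat_pow(A, k):
    """Return A^k for a square matrix A and an integer k >= 0, with A^0 = I (assumed convention)."""
    if k < 0:
        raise ValueError("negative powers are not covered here")
    result = identity(len(A))
    for _ in range(k):
        result = matmul(result, A)
    return result


# Hypothetical matrix, chosen only for illustration.
A = [[1, 1], [0, 1]]

print(mat_pow(A, 0))  # [[1, 0], [0, 1]]
print(mat_pow(A, 5))  # [[1, 5], [0, 1]]
print(matmul(mat_pow(A, 2), mat_pow(A, 3)) == mat_pow(A, 5))  # True
```

Because the product is associative, it does not matter whether the factors are accumulated from the left, as in `mat_pow`, or from the right.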
3.2.6. Grasple exercises (2)
Grasple Exercise 3.2.6
To compute a product \(AB\).
Grasple Exercise 3.2.7
To compute a product \(AB\).
Grasple Exercise 3.2.8
To compute a product \(AB\).
Grasple Exercise 3.2.9
To compute a product \(AB\).
Grasple Exercise 3.2.10
To compute several matrix products.
Grasple Exercise 3.2.11
To compute \(\vect{u}^T\vect{v}\) and \(\vect{u}\vect{v}^T\).
Grasple Exercise 3.2.12
To find \(k\) for which \(AB=BA\).
Grasple Exercise 3.2.13
To find \(k\) for which \(AB=BA\).
Grasple Exercise 3.2.14
To find two products \(AD_1\) and \(D_2A\).
Grasple Exercise 3.2.15
To find a high power of a special matrix.
Grasple Exercise 3.2.16
To find a high power of a special matrix.
The remaining exercises are less of a computational character.
Grasple Exercise 3.2.17
Is the zero matrix a diagonal matrix?
Grasple Exercise 3.2.18
To explain why a certain product does not exist.
Grasple Exercise 3.2.19
To find a \(2\times2\) matrix \(A\) for which \(A^2 = -I\).
Grasple Exercise 3.2.20
To show that \((cA)^T = cA^T\).
Grasple Exercise 3.2.21
To show: \(A^TA = D \iff A\) has orthogonal columns.
Grasple Exercise 3.2.22
To find the size of \(C\) if \(AC = B\).
Grasple Exercise 3.2.23
Number of columns of \(C\) if \(AC=B\).
Grasple Exercise 3.2.24
To find the number of rows of \(B\) if \(BC\) is an \(m\times n\)-matrix.
Grasple Exercise 3.2.25
Finding \(E\) such that \(EA = M\) (or \(AE = M\)).
Grasple Exercise 3.2.26
A bit like the previous one ‘by inspection’.
Grasple Exercise 3.2.27
Two True/False questions about products and symmetric matrices.