3.4. The Inverse of a Matrix#

3.4.1. Introduction#

In Section 3.2 we defined the sum and product of matrices (of compatible sizes), and we saw that to a certain extent matrix algebra is guided by the same rules as the arithmetic of real numbers. We can also subtract two matrices via

\[ A - B = A + (-1)B, \]

but we did not mention division of matrices.
For two numbers \(a\) and \(b\), with \(a \neq 0\), the equation

\[ ax = b \]

has the unique solution

\[ x = \frac{b}{a} = a^{-1}b = ba^{-1}, \]

where

\[ a^{-1} = \frac1a \]

is the (unique) solution of the equation

\[ ax = 1. \]

The bad news:

\[ \frac{A}{B} \quad \text{cannot be defined in any useful way!} \]

First of all the corresponding matrix equation

\[ AX = B, \,\, A\neq O \]

does not always have a solution, and when it does, the solution need not be unique, not even in the case of two \(n \times n\) matrices \(A\) and \(B\). Two examples to illustrate this:

Example 3.4.1

The matrix equation

\[ AX = B \]

where

\[\begin{split} A = \begin{bmatrix} 1 & 2 \\ 1 & 2 \end{bmatrix}, \quad B = \begin{bmatrix} 1 & 1 \\ 0 & 0 \end{bmatrix} \end{split}\]

does not have a solution. Why? Well, any column of \(AX\) is a linear combination of the columns of \(A\), and the columns of \(B\) obviously cannot be written as such linear combinations:

\[\begin{split} \begin{bmatrix} 1 \\ 0 \end{bmatrix} \neq c_1 \begin{bmatrix} 1 \\ 1 \end{bmatrix} + c_2 \begin{bmatrix} 2 \\ 2 \end{bmatrix} \quad \text{for all } c_1,c_2 \quad\text{in } \mathbb{R}. \end{split}\]

Example 3.4.2

The matrix equation

\[ AX = B \]

where

\[\begin{split} A = \begin{bmatrix} 1 & 2 \\ 1 & 2 \end{bmatrix}, \quad B = \begin{bmatrix} 1 & 4 \\ 1 & 4 \end{bmatrix} \end{split}\]

has infinitely many solutions. Two of those are for instance

\[\begin{split} X_1 = \begin{bmatrix} 1 & 0 \\ 0 & 2 \end{bmatrix} \quad \text{and} \quad X_2 = \begin{bmatrix} -1 & 2 \\ 1 & 1 \end{bmatrix}. \end{split}\]

And lastly, if there is a matrix \(C\) for which

\[ CA = I \]

and we would adopt the notation

\[ C = A^{-1} \]

then

\[\begin{split} \begin{array}{rcl} AX = B &\Rightarrow& C(AX) = CB \\ &\Rightarrow& (CA)X = IX = X = CB = A^{-1}B. \end{array} \end{split}\]

So \(X = A^{-1}B\). However, it is in no way clear why \(A^{-1}B\) and \(BA^{-1}\) should be equal, and in general indeed they are not. So the notation

\[ \dfrac{B}{A} \]

will still be ambiguous.

For non-square matrices things are even worse. In this section we will only consider square matrices.

3.4.2. Definition and Basic Properties of the Inverse#

Definition 3.4.1

A square matrix \(A\) is called invertible if there exists a matrix \(B\) for which

\[ AB = BA = I. \]

In this situation the matrix \(B\) is called the inverse of \(A\) and we write

\[ B = A^{-1}. \]

A matrix that is invertible is also called a regular matrix, and a non-invertible matrix is also called a singular matrix.

Note the use of the definite article ‘the’ in the sentence ‘\(B\) is called the inverse of \(A\)’. The following proposition justifies this choice of word.

Proposition 3.4.1

If an inverse of a matrix \(A\) exists, then it is unique.

The proof is very short, once we plug in the right idea at the right place.

Proof. Suppose \(B\) and \(C\) are two matrices that satisfy the properties of being an inverse of \(A\), i.e.

\[ AB = BA = I \quad \text{and} \quad AC = CA = I. \]

Then the following chain of identities proves that \(B\) and \(C\) must be equal:

\[ B = B\,I \,= B\,(AC) = (BA)\,C= I\,C = C. \]

Remark 3.4.1

Actually, the proof shows slightly more, as the assumptions

\[ CA= I, \quad AB = I \]

are not used. In fact it shows that for three \(n \times n\) matrices \(A\), \(B\) and \(C\)

\[ \text{if} \quad BA = I \quad \text{ and }\quad AC = I \quad\text{ then } \quad B = C. \]

Example 3.4.3

For the matrices

\[\begin{split} A = \begin{bmatrix} 1 & 2 \\ 3 & 5 \end{bmatrix} \quad \text{and} \quad B = \begin{bmatrix} -5 & 2 \\ 3 & -1 \end{bmatrix} \end{split}\]

we see

\[\begin{split} \begin{bmatrix} 1 & 2 \\ 3 & 5 \end{bmatrix} \begin{bmatrix} -5 & 2 \\ 3 & -1 \end{bmatrix} = \begin{bmatrix} -5 & 2 \\ 3 & -1 \end{bmatrix} \begin{bmatrix} 1 & 2 \\ 3 & 5 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}. \end{split}\]

So \(A\) and \(B\) are each other’s inverse.

Another example:

\[\begin{split} \begin{bmatrix} 1 & 1 & 0 \\ 1 & 1 & 1 \\ 0 & 1 & 1 \end{bmatrix} \begin{bmatrix} 0 & 1 & -1 \\ 1 & -1 & 1 \\ -1 & 1 & 0 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0\\ 0 & 1 & 0 \\ 0 & 0 & 1\end{bmatrix}. \end{split}\]

You may check for yourself that the product in the other order also gives \(I\), so

\[\begin{split} \begin{bmatrix} 1 & 1 & 0 \\ 1 & 1 & 1 \\ 0 & 1 & 1 \end{bmatrix}^{-1} = \begin{bmatrix} 0 & 1 & -1 \\ 1 & -1 & 1 \\ -1 & 1 & 0 \end{bmatrix} \end{split}\]
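Such verifications lend themselves well to a quick machine check. Below is a minimal sketch in Python, assuming NumPy is available (`@` denotes the matrix product, and `np.allclose` compares up to rounding errors):

```python
import numpy as np

A = np.array([[1, 1, 0],
              [1, 1, 1],
              [0, 1, 1]])
B = np.array([[ 0,  1, -1],
              [ 1, -1,  1],
              [-1,  1,  0]])

# both products should equal the 3x3 identity matrix
print(np.allclose(A @ B, np.eye(3)))   # True
print(np.allclose(B @ A, np.eye(3)))   # True
```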

It will turn out (Remark 3.4.3) that for square matrices a one-sided inverse is automatically a two-sided inverse, by which we mean

\[ \text{if} \quad AB = I \quad \text{then also}\quad BA = I. \]

The first example can be generalized:

Proposition 3.4.2

If \(A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}\), then \(A^{-1}\) exists if and only if

\[ ad - bc \neq 0. \]

In that case

\[\begin{split} A^{-1} = \begin{bmatrix} a & b \\ c & d \end{bmatrix}^{-1} = \begin{bmatrix} \dfrac{d}{ad - bc} & \dfrac{-b}{ad - bc} \\ \dfrac{-c}{ad - bc} & \dfrac{a}{ad - bc} \end{bmatrix} = \frac{1}{ad-bc}\begin{bmatrix} d &- b \\ -c & a \end{bmatrix}. \end{split}\]

We leave the verification as an exercise.
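For readers who like to experiment, the formula of Proposition 3.4.2 can also be turned into a small routine. The following is a minimal NumPy sketch (the function name `inverse_2x2` is ours, merely illustrative; comparing \(ad-bc\) with exact zero only makes sense for exact, e.g. integer, entries):

```python
import numpy as np

def inverse_2x2(a, b, c, d):
    """Inverse of [[a, b], [c, d]] via Proposition 3.4.2, or None if ad - bc = 0."""
    det = a * d - b * c
    if det == 0:
        return None        # the matrix is not invertible
    return np.array([[d, -b],
                     [-c, a]]) / det

print(inverse_2x2(1, 2, 3, 5))   # [[-5.  2.] [ 3. -1.]], as in Example 3.4.3
```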

Exercise 3.4.1

Verify that the matrix \(B=A^{-1}\) proposed in Proposition 3.4.2 indeed satisfies

\[ AB = BA = I. \]

Also check that the first matrix in Example 3.4.3 illustrates the formula.

Solution to Exercise 3.4.1 (click to show)
\[\begin{split} \begin{array}{rcl} BA &=& \dfrac{1}{ad-bc}\begin{bmatrix} d &-b \\ -c & a \end{bmatrix} \begin{bmatrix} a & b \\ c & d \end{bmatrix}\\ &=& \dfrac{1}{ad-bc}\begin{bmatrix} da-bc &db- bd \\ -ca+ac & -cb+ad \end{bmatrix} \\ &=& \begin{bmatrix} \dfrac{da-bc}{ad-bc} &0 \\ 0 & \dfrac{-cb+ad}{ad-bc} \end{bmatrix} = \begin{bmatrix} 1&0 \\ 0 & 1 \end{bmatrix}. \end{array} \end{split}\]

This establishes one of the two identities; the computation of \(AB\) is completely analogous.

Applying the formula of Proposition 3.4.2 to the matrix \(A = \begin{bmatrix} 1 & 2 \\ 3 & 5 \end{bmatrix}\) of Example 3.4.3 gives

\[\begin{split} A^{-1} = \dfrac{1}{1\cdot5 - 2\cdot 3}\begin{bmatrix} 5 & -2 \\ -3&1 \end{bmatrix} = -\begin{bmatrix} 5 & -2 \\ -3&1 \end{bmatrix} = \begin{bmatrix} -5 & 2 \\ 3&-1 \end{bmatrix}, \end{split}\]

which is indeed the matrix \(B\) that was proposed there.

Remark 3.4.2

The condition

\[ ad - bc \neq 0 \]

is equivalent to the statement

\[\begin{split} \text{the vectors } \begin{bmatrix} a \\ c \end{bmatrix} \text{ and } \begin{bmatrix} b \\ d \end{bmatrix} \text{ are linearly independent.} \end{split}\]

First we show that

\[\begin{split} ad - bc = 0 \text{ implies that } \begin{bmatrix} a &b\\ c&d \end{bmatrix} \text{ has linearly dependent columns.} \end{split}\]

It is best to split this in two cases:

\[ ad = 0 \quad \text{and} \quad ad \neq 0. \]

If we assume

\[ ad - bc = 0 \quad \text{and} \quad ad = 0, \]

then \(bc = ad = 0\), so

\[ b = 0 \quad \text{or} \quad c = 0, \quad \text{and also} \quad a = 0 \quad \text{or} \quad d = 0, \]

which leads to a matrix

\[\begin{split} \begin{bmatrix} a & b \\ c & d \end{bmatrix} \end{split}\]

with either a zero row or a zero column, which will indeed have linearly dependent columns. Second, if we assume

\[ ad - bc = 0 \quad\text{and} \quad ad \neq 0 \]

then both \(a \neq 0\) and \(d \neq 0\), in which case

\[\begin{split} d = \frac{bc}{a}, \quad \text{so } \begin{bmatrix} b \\ d \end{bmatrix} = \begin{bmatrix} b \\ \frac{bc}{a} \end{bmatrix} = \dfrac{b}{a}\begin{bmatrix} a \\ c \end{bmatrix}, \end{split}\]

hence the columns are again linearly dependent. Thus we have shown:

\[\begin{split} ad-bc = 0 \quad \Longrightarrow \quad \begin{bmatrix} a & b \\ c & d \end{bmatrix} \text{ has linearly dependent columns.} \end{split}\]

Next let us consider the converse, i.e.

\[\begin{split} \begin{bmatrix} a & b \\ c&d \end{bmatrix} \text{ has linearly dependent columns} \quad \text{implies: } \quad ad - bc = 0. \end{split}\]

If a \(2 \times 2\) matrix has two linearly dependent columns, then one of the columns is a multiple of the other column, that is,

\[\begin{split} \text{either } \quad \begin{bmatrix} a \\ c \end{bmatrix} = k \begin{bmatrix} b \\ d \end{bmatrix} \quad \text{or}\quad \begin{bmatrix} b \\ d \end{bmatrix} = k \begin{bmatrix} a \\ c \end{bmatrix} . \end{split}\]

In both cases it is easily checked that

\[ ad-bc = 0. \]

The following proposition shows that the above considerations can be generalized.

Proposition 3.4.3

If \(A\) is a square matrix, then

\[ AX = I \]

has a unique solution if and only if

\[ A \text{ has linearly independent columns.} \]

Proof. As in the proof in Remark 3.4.2 we have to prove two implications:

\[ AX = I \text{ has a unique solution } \quad\Longrightarrow \quad A \text{ has linearly independent columns} \]

and

\[ A \text{ has linearly independent columns} \quad\Longrightarrow \quad AX = I \text{ has a unique solution.} \]

For the first part, assume that

\[ AX = I \quad \text{has a (unique) solution.} \]

That means that every column \(\mathbf{e_j}\) of the identity matrix is a linear combination of the columns \(\mathbf{a_1}, \ldots, \mathbf{a_n}\) of \(A\). So the span of the columns of \(A\) contains the span of \(\mathbf{e_1}, \ldots, \mathbf{e_n}\), which is the whole of \(\mathbb{R}^n\). Thus every linear system

\[ A\mathbf{x} =\mathbf{b}, \quad \mathbf{b} \in \mathbb{R}^{n} \]

has a solution. Then the reduced echelon form of \(A\) must have a pivot in every row, and, since it is a square matrix, it must be the identity matrix. Consequently, it has a pivot in every column, so the linear system

\[ A\mathbf{x} =\mathbf{0} \]

only has the trivial solution, which proves that indeed the columns of \(A\) are linearly independent.

For the converse, suppose that \(A\) has linearly independent columns. Then the reduced echelon form of \(A\) must be the identity matrix. This implies that for each \(\mathbf{b}\) in \(\mathbb{R}^n\)

\[ [\,A\,|\,\mathbf{b}\,] \sim[\,I\,|\,\mathbf{b'}\,], \]

and in particular, each linear system

\[ A\mathbf{x} =\mathbf{e_j} \]

has a unique solution. If we denote this solution by \(\mathbf{c_j}\) we have that

\[ A[\,\mathbf{c_1}\,\,\mathbf{c_2}\,\, \ldots \,\, \mathbf{c_n}\,] = [\,A\mathbf{c_1}\,\,A\mathbf{c_2}\,\, \ldots \,\, A\mathbf{c_n}\,] = [\,\mathbf{e_1}\,\,\mathbf{e_2}\,\, \ldots \,\, \mathbf{e_n}\,] = I. \]

Since all solutions \(\mathbf{c_j}\) are unique, the solution of the equation

\[ AX = I\]

is unique as well.
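The construction in this proof can be imitated numerically: solve \(A\mathbf{x} = \mathbf{e_j}\) for each \(j\) and collect the solutions as columns. A small NumPy sketch, using the \(2 \times 2\) matrix of Example 3.4.3 (here `np.linalg.solve` plays the role of the row reduction):

```python
import numpy as np

A = np.array([[1, 2],
              [3, 5]], dtype=float)
n = A.shape[0]
I = np.eye(n)

# column j of X is the unique solution of A x = e_j
X = np.column_stack([np.linalg.solve(A, I[:, j]) for j in range(n)])
print(X)                      # [[-5.  2.] [ 3. -1.]]
print(np.allclose(A @ X, I))  # True
```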

It makes sense that the solution \(B\) of this matrix equation will be the inverse of \(A\), and it is, but it takes some effort to show that the other requirement,

\[ BA = I \]

is also fulfilled. In the next subsection we will see that the matrix equation

\[ AX = I \]

will lead the way to an algorithm to compute the inverse of a matrix. Before we go there we will look at some general properties of invertible matrices.

Proposition 3.4.4

If the \(n \times n\) matrix \(A\) is invertible and \(B\) is an \(n \times p\) matrix, then the solution of the matrix equation

\[ AX = B \]

is unique, and given by

\[ X = A^{-1}B. \]

In particular, if the matrix \(B\) has only one column, i.e., if it is a vector, then

\[ A\mathbf{x} = \mathbf{b} \quad \text{has the unique solution} \quad \mathbf{x} = A^{-1}\mathbf{b}. \]

Proof. We multiply both sides of the equation

\[ AX = B \]

by \(A^{-1}\) and use the fact that the matrix product has the associative property:

\[\begin{split} \begin{array}{rl} AX = B \quad\Longrightarrow\quad A^{-1}(AX) = A^{-1}B & \Longrightarrow\quad (A^{-1}A)X = IX = A^{-1}B \\ &\Longrightarrow \quad X = A^{-1}B. \end{array} \end{split}\]

We illustrate the proposition by an example.

Example 3.4.4

Suppose the matrix \(A\) and the vectors \(\mathbf{b}_1\) and \(\mathbf{b}_2\) are given by

\[\begin{split} A=\begin{bmatrix}1 & 2 \\ 3 & 4 \end{bmatrix}, \quad \mathbf{b}_1= \begin{bmatrix}-1 \\ 1 \end{bmatrix} \quad \text{and} \quad \mathbf{b}_2=\begin{bmatrix}2 \\ 10 \end{bmatrix}. \end{split}\]

Consider the two linear systems

\[ A\mathbf{x} =\mathbf{b}_1\quad \text{and} \quad A\mathbf{x} = \mathbf{b}_2. \]

Using the inverse matrix

\[\begin{split} A^{-1} = \frac{1}{-2}\begin{bmatrix}4 & -2 \\ -3 & 1 \end{bmatrix} = \frac{1}{2}\begin{bmatrix}-4 & 2 \\ 3 & -1 \end{bmatrix}, \end{split}\]

the two solutions are quickly written down:

\[\begin{split} \mathbf{x_1}= A^{-1}\mathbf{b_1}= \frac{1}{2}\begin{bmatrix}-4 & 2 \\ 3 & -1 \end{bmatrix} \begin{bmatrix}-1 \\ 1 \end{bmatrix} = \begin{bmatrix}3 \\ -2 \end{bmatrix} \end{split}\]

and likewise

\[\begin{split} \mathbf{x_2}= A^{-1}\mathbf{b_2}= \frac{1}{2}\begin{bmatrix}-4 & 2 \\ 3 & -1 \end{bmatrix} \begin{bmatrix}2 \\ 10 \end{bmatrix} = \begin{bmatrix}6 \\ -2 \end{bmatrix}. \end{split}\]
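In NumPy the same computation looks as follows; this is a sketch only, since in numerical practice one usually calls `np.linalg.solve` directly instead of forming \(A^{-1}\) explicitly:

```python
import numpy as np

A  = np.array([[1, 2],
               [3, 4]], dtype=float)
b1 = np.array([-1, 1], dtype=float)
b2 = np.array([2, 10], dtype=float)

Ainv = np.linalg.inv(A)        # compute A^{-1} once ...
print(Ainv @ b1)               # [ 3. -2.]   ... and reuse it for every right-hand side
print(Ainv @ b2)               # [ 6. -2.]
print(np.linalg.solve(A, b1))  # same solution, without the explicit inverse
```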

A note of warning: the proof of Proposition 3.4.4 is based on the existence of the inverse of the matrix \(A\). Beware of this: never start using the expression \(A^{-1}\) unless you have made sure first that the matrix \(A\) is indeed invertible. Otherwise you may run into inconsistencies, as in the following example:

Example 3.4.5

What goes wrong in the following ‘proof’ of the statement:

\[ \text{ if } \quad A^2 = A \quad\text{ and } \quad A\neq O, \quad\text{ then } \quad A = I. \]

‘Fallacious proof’:

Assume \(A^2 = A\).

Then

\[ A^{-1}A^2 = A^{-1}A = I. \]

On the other hand

\[ A^{-1}A^2 = A^{-1}(A\,A) = (A^{-1}A)A = IA = A. \]

So

\[ I = A^{-1}A^2 = A, \]

which ‘proves’ that \(A=I\).

Somewhere something must have gone wrong, as the following counterexample shows.

For the matrix \(B = \begin{bmatrix} \frac12 & \frac12 \\ \frac12 & \frac12 \end{bmatrix}\)

it can be checked that

\[ B^2 = B, \]

whereas obviously

\[ B \neq O, \quad B \neq I. \]

So, where exactly did it go wrong?!
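The counterexample itself is easy to check by machine. A small NumPy sketch, where the attempt to form \(B^{-1}\) should fail, which is precisely the point:

```python
import numpy as np

B = np.array([[0.5, 0.5],
              [0.5, 0.5]])
print(np.allclose(B @ B, B))   # True: B^2 = B, while B is neither O nor I

try:
    np.linalg.inv(B)
except np.linalg.LinAlgError:
    print("B is singular")     # so B^{-1} may not be used in the 'proof'
```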

The next proposition contains a few rules to manipulate inverse matrices.

Proposition 3.4.5

If \(A\) is invertible and \(c \neq 0\), then the following statements hold.

  1. The matrix \(cA\) is invertible, and

    \[ (cA)^{-1} = \dfrac1c A^{-1}. \]
  2. The matrix \(A^T\) is invertible, and

    \[ (A^T)^{-1} = (A^{-1})^T. \]
  3. The matrix \(A^{-1}\) is invertible, and

    \[ (A^{-1})^{-1} = A. \]

Proof. All statements can be proved by verifying that the relevant products are equal to \(I\).

  1. The matrix \(A^{-1}\) exists, and so does \(\dfrac1c A^{-1}\). We find:

    \[ (cA) \cdot \dfrac1c A^{-1} = c\cdot \dfrac1c A\cdot A^{-1} = 1 \cdot I = I, \]

    and likewise \(\dfrac1c A^{-1}\cdot (cA) = I\),

    which proves that indeed \(\dfrac1c A^{-1} = (cA)^{-1}\).

  2. Since it is given that \(A^{-1}\) exists we can proceed as follows, where we make use of the product rule for the transpose, \((AB)^T = B^TA^T\).


    \[ (A^{-1})^TA^T = ( AA^{-1})^T = I^T = I \]

    and

    \[ A^T(A^{-1})^T =( A^{-1}A)^T = I^T = I, \]

    which settles the second statement. For a proof of the third statement, see Exercise 3.4.2.

Exercise 3.4.2

Prove the last statement of the previous proposition.

Solution to Exercise 3.4.2 (click to show)

For the inverse \(C = (A^{-1})^{-1}\) of \(A^{-1}\), it should hold that

\[ CA^{-1} = A^{-1}C = I. \]

The matrix \(C = A\) has these properties.

The next example gives an illustration of the second statement of Proposition 3.4.5.

Example 3.4.6

We consider the matrix

\[\begin{split} A = \begin{bmatrix} 2 & 6 & 5 \\ 0 & 2 & 2 \\ 0 & 0 & 3 \end{bmatrix}. \end{split}\]

It has the inverse matrix

\[\begin{split} B = \begin{bmatrix} 1/2 & -3/2 & 1/6 \\ 0 & 1/2 & -1/3 \\ 0 & 0 & 1/3 \end{bmatrix}, \end{split}\]

which can be checked by showing that \(AB\) and \(BA\) are equal to \(I\).

So \(B = A^{-1}\), and \(B^T = (A^{-1})^T\).

We also have

\[\begin{split} A^TB^T = \begin{bmatrix} 2 & 0 & 0 \\ 6 & 2 & 0 \\ 5 & 2 & 3 \end{bmatrix} \begin{bmatrix} 1/2 & 0 & 0 \\ -3/2 & 1/2 & 0 \\ 1/6 & -1/3 & 1/3 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1\end{bmatrix}, \end{split}\]

as well as \(B^TA^T = I\), which proves that \(B^T = (A^T)^{-1}\).

As we already saw that \(B^T = (A^{-1})^{T}\), the matter is settled:

\[ (A^{-1})^T = (A^T)^{-1}. \]
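A quick numerical confirmation of this identity for the same matrix, assuming NumPy is available (`.T` is the transpose):

```python
import numpy as np

A = np.array([[2, 6, 5],
              [0, 2, 2],
              [0, 0, 3]], dtype=float)

lhs = np.linalg.inv(A.T)       # (A^T)^{-1}
rhs = np.linalg.inv(A).T       # (A^{-1})^T
print(np.allclose(lhs, rhs))   # True
```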

The last property we mention and prove is the product rule for the matrix inverse.

Proposition 3.4.6

If \(A\) and \(B\) are invertible \(n \times n\) matrices then the matrix \(AB\) is also invertible, and

\[ (AB)^{-1} = B^{-1}A^{-1}. \]

Proof. Again we just check that the properties of the definition hold.

Suppose that \(A\) and \(B\) are invertible with inverses \(A^{-1}\) and \(B^{-1}\).

Then using the associative property we find

\[ (B^{-1}A^{-1})(AB) = B^{-1}A^{-1}AB = B^{-1}(A^{-1}A)B = B^{-1}IB = B^{-1}B = I, \]

and along the same lines

\[ (AB) B^{-1}A^{-1} = I. \]

This shows that \(B^{-1}A^{-1}\) is indeed the inverse of \(AB\).
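As a sanity check, the product rule, and the fact that the order of the factors must be reversed, can be verified numerically. A sketch with two invertible matrices from earlier examples:

```python
import numpy as np

A = np.array([[1, 2],
              [3, 5]], dtype=float)
B = np.array([[1, 4],
              [2, 6]], dtype=float)

lhs = np.linalg.inv(A @ B)
rhs = np.linalg.inv(B) @ np.linalg.inv(A)                        # reversed order
print(np.allclose(lhs, rhs))                                     # True
print(np.allclose(lhs, np.linalg.inv(A) @ np.linalg.inv(B)))     # False: order matters
```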

Exercise 3.4.3

Is the identity

\[ ((AB)^T)^{-1} = (A^T)^{-1}(B^T)^{-1} \]

true or false?

In case it is true, give an argument, when false, give a counterexample.

Solution to Exercise 3.4.3 (click to show)

The statement is true.
From the two properties

\[ (AB)^T = B^TA^T, \quad (AB)^{-1} = B^{-1}A^{-1} \]

it follows that

\[ ((AB)^T)^{-1} = (B^TA^T)^{-1} = (A^T)^{-1}(B^T)^{-1}. \]

3.4.3. How to Compute the Inverse#

The construction of the inverse of a matrix was already present implicitly in Proposition 3.4.3.

The inverse of the matrix \(A\) must satisfy the equation \(AX = I\).
Written out column by column this means that

\[ AX = I \quad \iff \quad A[\,\mathbf{x_1}\,\,\mathbf{x_2}\, \ldots\, \mathbf{x_n}\,] = [\,\mathbf{e_1}\,\mathbf{e_2}\, \ldots\, \mathbf{e_n}\,]. \]

For the existence of a solution of this equation Proposition 3.4.3 tells us it is necessary that \(A\) has linearly independent columns, and we can furthermore read off that the columns of the matrix \(X\) will be the (unique) solutions of the linear systems

\[ A\mathbf{x_k} = \mathbf{e_k}, \]

where \(k = 1,2,\ldots, n\).

Let us first focus on this equation by considering a fairly general \(3\times 3\) matrix \(A\).

Example 3.4.7

For the matrix

\[\begin{split} A = \begin{bmatrix} 1 & 1 & 4 \\ 1 & -1 & -1 \\ 2 & -2 & -4 \end{bmatrix} \end{split}\]

we find the solution \(B\) of the matrix equation (which will turn out to exist)

\[ AX = I \]

and then check whether

\[ BA = I \]

also holds, in which case we can truthfully assert that \(B = A^{-1}\).

Instead of finding the solution \(X\) column by column, which gives three linear systems with the same coefficient matrix,

\[\begin{split} \left[\begin{array}{rrr|r}1 & 1 & 4 & 1\\1 & -1 & -1 & 0\\2 & -2 & -4 & 0\\\end{array}\right], \quad \left[\begin{array}{rrr|r}1 & 1 & 4 & 0\\1 & -1 & -1 & 1\\2 & -2 & -4 & 0\\\end{array}\right], \quad \left[\begin{array}{rrr|r}1 & 1 & 4 & 0\\1 & -1 & -1 & 0\\2 & -2 & -4 & 1\\\end{array}\right], \end{split}\]

we can solve the three linear systems simultaneously using a combined augmented matrix which we may denote by either

\[\begin{split} \left[\begin{array}{rrr|r|r|r}1 & 1 & 4 & 1 & 0 & 0\\1 & -1 & -1 & 0 & 1 & 0\\2 & -2 & -4 & 0 & 0 & 1\\\end{array}\right] \quad \text{or} \quad \left[\begin{array}{rrr|rrr}1 & 1 & 4 & 1 & 0 & 0\\1 & -1 & -1 & 0 & 1 & 0\\2 & -2 & -4 & 0 & 0 & 1\\\end{array}\right]= \left[\, A \rule[-1.5ex]{0ex}{4ex}\,\,\,\,\rule[-.5ex]{0.1ex}{2.5ex}\,\,\rule[-1.5ex]{0ex}{4ex}\,\,I\,\right]. \end{split}\]

Let us first row reduce this matrix and then draw conclusions:

\[\begin{split} \left[\, A \rule[-1.5ex]{0ex}{4ex}\,\,\,\,\rule[-.5ex]{0.1ex}{2.5ex}\,\,\rule[-1.5ex]{0ex}{4ex}\,\,I\,\right]= \left[\begin{array}{rrr|rrr}1 & 1 & 4 & 1 & 0 & 0\\1 & -1 & -1 & 0 & 1 & 0\\2 & -2 & -4 & 0 & 0 & 1 \end{array}\right]\begin{array}{l} [R_1] \\ {[R_2-1R_1]} \\ {[R_3-2R_1]} \end{array} \end{split}\]
\[\begin{split} \sim \left[\begin{array}{rrr|rrr}1 & 1 & 4 & 1 & 0 & 0\\0 & -2 & -5 & -1 & 1 & 0\\0 & -4 & -12 & -2 & 0 & 1 \end{array}\right]\begin{array}{l} [R_1+\nicefrac12R_2] \\ {[R_2]} \\ {[R_3-2R_2]} \end{array} \end{split}\]
\[\begin{split} \sim \left[\begin{array}{rrr|rrr}1 & 0 & 3/2 & 1/2 & 1/2 & 0\\0 & -2 & -5 & -1 & 1 & 0\\0 & 0 & -2 & 0 & -2 & 1 \end{array}\right]\begin{array}{l} [R_1+\nicefrac34R_3] \\ {[R_2-\nicefrac52R_3]} \\ {[R_3]} \end{array} \end{split}\]
\[\begin{split} \sim \left[\begin{array}{rrr|rrr}1 & 0 & 0 & 1/2 & -1 & 3/4\\0 & -2 & 0 & -1 & 6 & -5/2\\0 & 0 & -2 & 0 & -2 & 1 \end{array}\right]\begin{array}{l} [R_1] \\ {[(-\nicefrac12)R_2]} \\ {[(-\nicefrac12)R_3]} \end{array} \end{split}\]
\[\begin{split} \sim \left[\begin{array}{rrr|rrr}1 & 0 & 0 & 1/2 & -1 & 3/4\\0 & 1 & 0 & 1/2 & -3 & 5/4\\0 & 0 & 1 & 0 & 1 & -1/2 \end{array}\right] = \left[\, I \rule[-1.5ex]{0ex}{4ex}\,\,\,\,\rule[-.5ex]{0.1ex}{2.5ex}\,\,\rule[-1.5ex]{0ex}{4ex}\,\,B\,\right]. \end{split}\]

By construction we have that the matrix

\[\begin{split} B = \begin{bmatrix} 1/2 & -1 & 3/4 \\ 1/2 & -3 & 5/4 \\ 0 & 1 & -1/2 \end{bmatrix} = \frac14 \begin{bmatrix} 2 & -4 & 3 \\ 2 & -12 & 5 \\ 0 & 4 & -2 \end{bmatrix} \end{split}\]

satisfies

\[ AB = I. \]

Let us check the product in the other order

\[\begin{split} BA = \frac14 \begin{bmatrix} 2 & -4 & 3 \\ 2 & -12 & 5 \\ 0 & 4 & -2 \end{bmatrix} \begin{bmatrix} 1 & 1 & 4 \\ 1 & -1 & -1 \\ 2 & -2 & -4 \end{bmatrix} = \frac14 \begin{bmatrix} 4 & 0 & 0 \\0 & 4 & 0 \\ 0 & 0 & 4 \end{bmatrix} = I. \end{split}\]

So indeed we can conclude

\[\begin{split} \begin{bmatrix} 1 & 1 & 4 \\ 1 & -1 & -1 \\ 2 & -2 & -4 \end{bmatrix}^{-1} \,=\, \frac14 \begin{bmatrix} 2 & -4 & 3 \\ 2 & -12 & 5 \\ 0 & 4 & -2 \end{bmatrix}\,. \end{split}\]

Now was this just beginners’ luck? It wasn’t, as the next proposition shows.

Proposition 3.4.7

A square matrix \(A\) is invertible if and only if it has linearly independent columns.

In that case the inverse can be found by reducing the matrix

\[ \left[\, A \rule[-1.5ex]{0ex}{4ex}\,\,\,\,\rule[-.5ex]{0.1ex}{2.5ex}\,\,\rule[-1.5ex]{0ex}{4ex}\,\,I\,\right] \]

to the reduced echelon form

\[ \left[\, I \rule[-1.5ex]{0ex}{4ex}\,\,\,\,\rule[-.5ex]{0.1ex}{2.5ex}\,\,\rule[-1.5ex]{0ex}{4ex}\,\,B\,\right], \]

and then

\[ B = A^{-1}. \]

Proof. We have already seen (Proposition 3.4.3) that an invertible matrix has linearly independent columns, which implies that the reduced echelon form of \(A\) is indeed the identity matrix. And then it is clear that via row operations we get

\[ \left[\, A \rule[-1.5ex]{0ex}{4ex}\,\,\,\,\rule[-.5ex]{0.1ex}{2.5ex}\,\,\rule[-1.5ex]{0ex}{4ex}\,\,I\,\right]\sim \quad .\,.\,.\,.\,. \quad \sim \left[\, I \rule[-1.5ex]{0ex}{4ex}\,\,\,\,\rule[-.5ex]{0.1ex}{2.5ex}\,\,\rule[-1.5ex]{0ex}{4ex}\,\,B\,\right], \]

where the matrix \(B\) satisfies \(AB = I\).

What we have to show is that

\[ BA = I \]

as well.

To understand that this is indeed true, we recall (Definition 3.2.9) that row operations can be effectuated via multiplications with elementary matrices. Furthermore, since the matrix product is defined column by column, i.e.

\[ MX = M\left[\begin{array}{cccc}\mathbf{x_1} &\mathbf{x_2} &\ldots &\mathbf{x_p} \end{array}\right]= \left[\begin{array}{cccc}M\mathbf{x_1} &M\mathbf{x_2} &\ldots &M\mathbf{x_p} \end{array}\right], \]

we also have

\[ E\left[\, A_1 \,\, \rule[-.5ex]{0.1ex}{2.5ex}\,\,\,A_2\,\right]= \left[\, EA_1 \,\, \rule[-.5ex]{0.1ex}{2.5ex}\,\,\,\,EA_2\,\right].\]

A series of \(k\) row operations can be mimicked by \(k\) multiplications with elementary matrices:

\[\begin{split} \begin{array}{ccl} \left[\, A \,\, \rule[-.5ex]{0.1ex}{2.5ex}\,\, I\,\right]&\sim& \left[\, E_1A \,\, \rule[-.5ex]{0.1ex}{2.5ex}\,\, E_1I\,\right] \sim \left[\,E_2 E_1A \,\, \rule[-.5ex]{0.1ex}{2.5ex}\,\, \,E_2E_1I\,\right]\sim \ldots \sim \\ &\sim& \left[\,E_k\cdots E_2 E_1A \,\,\, \rule[-.5ex]{0.1ex}{2.5ex}\,\,\, E_k\cdots E_2E_1I\,\right] = \left[\, I \,\, \rule[-.5ex]{0.1ex}{2.5ex}\,\,B\,\right]. \end{array} \end{split}\]

So the matrix \(B\) that was found as the solution of the matrix equation

\[ AX = I \]

is the product of all the elementary matrices by which \(A\) is reduced to the identity matrix. Thus we have shown that indeed

\[ BA = (E_k\cdots E_2 E_1)A = I. \]

Remark 3.4.3

In the proof we in fact showed that for a square matrix \(A\):

\[ \text{if} \quad AB = I \quad \text{then} \quad BA = I. \]

For non-square matrices this statement is not correct. The interested reader is invited to take a look at the last exercises in the Grasple subsection (Section 3.4.5).

Remark 3.4.4

If \(A\) is not invertible, then the outcome of the row reduction of

\[ \left[\, A \,\,\rule[-.5ex]{0.1ex}{2.5ex}\,\,\,I\,\right] \]

will also lead to the correct answer: as soon as it is clear that \(A\) cannot be row reduced to \(I\) we can conclude that \(A\) is not invertible.
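The whole procedure, including the detection of non-invertible matrices described in the remark above, can be captured in a few lines of code. Below is a minimal NumPy sketch (the function name `inverse_via_row_reduction` and the partial pivoting, added for numerical stability, are our own choices and not part of the text):

```python
import numpy as np

def inverse_via_row_reduction(A, tol=1e-12):
    """Row reduce [A | I] to [I | B] and return B = A^{-1}, or None if A is singular."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    M = np.hstack([A, np.eye(n)])              # the augmented matrix [A | I]
    for j in range(n):
        p = j + np.argmax(np.abs(M[j:, j]))    # largest candidate pivot in column j
        if abs(M[p, j]) < tol:
            return None                        # no pivot: A cannot be reduced to I
        M[[j, p]] = M[[p, j]]                  # row exchange
        M[j] = M[j] / M[j, j]                  # scale the pivot row
        for i in range(n):
            if i != j:
                M[i] = M[i] - M[i, j] * M[j]   # eliminate the other entries in column j
    return M[:, n:]                            # the right half is now A^{-1}

A = np.array([[1, 1, 4], [1, -1, -1], [2, -2, -4]])
B = inverse_via_row_reduction(A)
print(np.allclose(A @ B, np.eye(3)), np.allclose(B @ A, np.eye(3)))   # True True
```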

To help understand the above exposition, let us run through the whole procedure for a specific matrix.

Example 3.4.8

We want to compute the inverse of the matrix

\[\begin{split} A = \begin{bmatrix} 1 & 4 \\ 2 & 6 \end{bmatrix}. \end{split}\]

The short way:

\[\begin{split} \begin{array}{rcl} \left[\begin{array}{rr|rr}1 & 4 & 1 & 0\\2 & 6 & 0 & 1 \end{array}\right] \begin{array}{l} [R_1] \\ {[R_2-2R_1]} \\ \end{array} \!\!\! &\sim& \left[\begin{array}{rr|rr} 1 & 4 & 1 & 0 \\ 0 & -2 & -2 & 1 \end{array}\right] \begin{array}{l} [R_1+2R_2] \\ {[R_2]} \\ \end{array} \\ &\sim& \left[\begin{array}{rr|rr}1 & 0 & -3 & 2\\0 & -2 & -2 & 1 \end{array}\right] \begin{array}{l} [R_1] \\ {[(-\frac12)R_2]} \\ \end{array} \\ &\sim& \left[\begin{array}{rr|rr}1 & 0 & -3 & 2\\0 & 1 & 1 & -\nicefrac12 \end{array}\right] \end{array} \end{split}\]

So:

\[\begin{split} A^{-1} = \begin{bmatrix} -3 & 2 \\ 1 & -\frac12 \end{bmatrix}. \end{split}\]

End of story.

To see how the proof of Proposition 3.4.7 works for this specific matrix, we will give a derivation using elementary matrices.

First step: row replacement with the entry on position (1,1) as a first pivot:

\[\begin{split} \begin{bmatrix} 1 & 0 \\ -2 & 1 \end{bmatrix}\, \left[\begin{array}{rr|rr}1 & 4 & 1 & 0\\2 & 6 & 0 & 1 \end{array}\right] \,\,=\,\, \left[\begin{array}{rr|rr}1 & 4 & 1 & 0\\0 & -2 & -2 & 1 \end{array}\right], \quad E_1 = \begin{bmatrix} 1 & 0 \\ -2 & 1 \end{bmatrix}. \end{split}\]

Second step: another row replacement, using the entry on position (2,2) as pivot:

\[\begin{split} \begin{bmatrix} 1 & 2 \\ 0 & 1 \end{bmatrix}\, \left[\begin{array}{rr|rr}1 & 4 & 1 & 0\\0 & -2 & -2 & 1 \end{array}\right] \,\,=\,\, \left[\begin{array}{rr|rr}1 & 0 & -3 & 2\\0 & -2 & -2 & 1 \end{array}\right], \quad E_2 = \begin{bmatrix} 1 & 2 \\ 0 & 1 \end{bmatrix}. \end{split}\]

Third step: the scaling of the second row:

\[\begin{split} \begin{bmatrix} 1 & 0 \\ 0 & -\nicefrac12 \end{bmatrix}\, \left[\begin{array}{rr|rr}1 & 0 & -3 & 2\\0 & -2 & -2 & 1 \end{array}\right] \,\,=\,\, \left[\begin{array}{rr|rr}1 & 0 & -3 & 2\\0 & 1 & 1 & -\nicefrac12 \end{array}\right],\quad E_3 = \begin{bmatrix} 1 & 0 \\ 0 & -\nicefrac12 \end{bmatrix}. \end{split}\]

All in all

\[\begin{split} (E_3E_2E_1)A = \left(\begin{bmatrix} 1 & 0 \\ 0 & -\nicefrac12 \end{bmatrix} \begin{bmatrix} 1 & 2 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ -2 & 1 \end{bmatrix}\right)\,A = \begin{bmatrix} -3 & 2 \\ 1 & -\nicefrac12 \end{bmatrix}A = I, \end{split}\]

which reconfirms

\[\begin{split} E_3E_2E_1 = A^{-1} = \begin{bmatrix} -3 & 2 \\ 1 & -\tfrac12 \end{bmatrix}. \end{split}\]
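The same bookkeeping with elementary matrices can be reproduced numerically; a short NumPy sketch:

```python
import numpy as np

A  = np.array([[1, 4], [2, 6]])
E1 = np.array([[1, 0], [-2, 1]])     # R2 -> R2 - 2 R1
E2 = np.array([[1, 2], [0, 1]])      # R1 -> R1 + 2 R2
E3 = np.array([[1, 0], [0, -0.5]])   # R2 -> (-1/2) R2

B = E3 @ E2 @ E1                     # the product of the elementary matrices
print(B)                             # [[-3.   2. ] [ 1.  -0.5]]
print(np.allclose(B @ A, np.eye(2)), np.allclose(A @ B, np.eye(2)))   # True True
```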

Exercise 3.4.4

Prove the following converse of Proposition 3.4.6.

If \(A\) and \(B\) are \(n\times n\) matrices for which the product \(AB\) is invertible, then \(A\) and \(B\) are both invertible.

Make sure that you do not use \(A^{-1}\) or \(B^{-1}\) prematurely, i.e., before you have established that they exist.

Solution to Exercise 3.4.4 (click to show)

Suppose \(A\) and \(B\) are two \(n \times n\) matrices for which \(AB\) is invertible. Let \(C=(AB)^{-1}\) be the inverse of \(AB\). We claim that \(BC\) is the inverse of \(A\).

Now since

\[ A(BC) = (AB)C = AB(AB)^{-1} = I, \]

it follows from Remark 3.4.3 that \((BC)A=I\) also holds. So we have

\[ A(BC) = (BC)A = I, \]

which means that \(A\) is invertible and has as inverse the matrix \(BC\).

In the same vein it is shown that \(CA\) is the inverse of \(B\).

3.4.4. Characterizations of Invertibility#

In the previous subsections quite a few properties of invertible matrices came along, either explicitly or implicitly. For future reference we list them in a theorem.

Recall that by definition a (square) matrix \(A\) is invertible (or regular) if and only if there exists a matrix \(B\) for which

\[ AB = BA= I. \]

Theorem 3.4.1

For an \(n\times n\) matrix \(A\), the following statements are equivalent.
That is, each of the following properties is a characterization of invertibility of a square matrix \(A\):

  1. \(A\) is invertible;

  2. there exists a matrix \(B\) for which \(AB = I\);

  3. for each \(\mathbf{b}\in\mathbb{R}^n\) the linear system \(A\mathbf{x} = \mathbf{b}\) has a unique solution;

  4. \(A\) is row equivalent to the identity matrix \(I_n\);

  5. \(A\) has linearly independent columns;

  6. the equation \(A\mathbf{x} = \mathbf{0}\) has only the trivial solution \(\mathbf{x} = \mathbf{0}\);

  7. \(A\) can be written as a product of elementary matrices: \(A = E_1E_2\cdots E_k\).

Proof. It is a good exercise to find out where the evidence for each characterization can be found, and, wherever necessary, to fill in the missing details.

There are many variations on Theorem 3.4.1. The following exercise contains a few.

Exercise 3.4.5

Show that invertibility of an \(n\times n\) matrix \(A\) is also equivalent to

  • there exists a matrix \(B\) such that \(BA = I\);

  • \(A\) has linearly independent rows;

  • each column of the matrix \(A\) is a pivot column;

  • the columns of \(A\) span the whole \(\mathbb{R}^n\).

Again it may very well be that you have to resort to previous sections.

3.4.5. Grasple Exercises#

The first exercises are quite straightforwardly computational. The remaining exercises tend to be more theoretical.

Grasple Exercise 3.4.1

https://embed.grasple.com/exercises/6683a2f9-7b6b-4dd1-bec1-1e8b894fa3bb?id=71086

To compute the inverse of a \(2 \times 2\) matrix.

Grasple Exercise 3.4.2

https://embed.grasple.com/exercises/1bbca38b-a734-4049-b8a2-f79d4bf1b098?id=71087

To compute the inverse of a \(2 \times 2\) matrix.

Grasple Exercise 3.4.3

https://embed.grasple.com/exercises/045cd1

To compute the inverse of a \(2 \times 2\) matrix.

Grasple Exercise 3.4.4

https://embed.grasple.com/exercises/82c06a56-8ee8-4f36-8173-e5d56da1e8e3?id=71073

To compute step by step the inverse of a \(3 \times 3\) matrix.

Grasple Exercise 3.4.5

https://embed.grasple.com/exercises/551172d9-861c-4958-9b17-dfa828acdabe?id=71088

To compute the inverse of a \(3 \times 3\) matrix.

Grasple Exercise 3.4.6

https://embed.grasple.com/exercises/9174c68c-e2d5-4c23-af96-e3fe3dd36f42?id=71089

To compute the inverse of a \(3 \times 3\) matrix.

Grasple Exercise 3.4.7

https://embed.grasple.com/exercises/800dc2f9-227e-401b-818b-093fc9647dd9?id=83083

To compute the inverse of a \(4 \times 4\) matrix.

Grasple Exercise 3.4.8

https://embed.grasple.com/exercises/9146f49d-74a5-4fda-a641-181c4536fe01?id=83086

To find \(p\) for which a matrix \(A\) is singular.

The remaining exercises have a more theoretical flavour.

Grasple Exercise 3.4.9

https://embed.grasple.com/exercises/677aa3ee-4594-4d77-ace6-583a1efcba59?id=71090

True/False question about invertibility versus

Grasple Exercise 3.4.10

https://embed.grasple.com/exercises/f789ebd5-171b-4556-83a9-eefc5ef830ef?id=71092

To show: if \(A\) is invertible, then so is \(A^T\).

Grasple Exercise 3.4.11

https://embed.grasple.com/exercises/29dc7c2f-6636-493e-9c97-da1847a336b7?id=68908

To show: if \(AB\) is invertible, then so are \(A\) and \(B\).

Grasple Exercise 3.4.12

https://embed.grasple.com/exercises/8f3feb75-b41b-42e0-b574-f6442da253ce?id=70272

What about \((AB)^{-1} = A^{-1}B^{-1}\)?

Grasple Exercise 3.4.13

https://embed.grasple.com/exercises/5185c5c0-4d92-4e0e-92a7-6dc5eed8f7cf?id=68896

What about \(((AB)^T)^{-1} = (A^T)^{-1}(B^T)^{-1}\)?

Grasple Exercise 3.4.14

https://embed.grasple.com/exercises/ee4bb61e-6939-4074-a556-b82f3d0e8c28?id=71091

True/False: Every elementary matrix is invertible.

Grasple Exercise 3.4.15

https://embed.grasple.com/exercises/1732d75b-2027-4a92-b8bb-c98bda62475d?id=71093

True/False: If \(A\) and \(B\) are invertible, then so is \(A+B\).

Grasple Exercise 3.4.16

https://embed.grasple.com/exercises/f8602d4f-57b7-4752-9edc-69c83069fe36?id=71095

True/False: If \(A\) and \(B\) are singular, then so is \(A+B\).

Grasple Exercise 3.4.17

https://embed.grasple.com/exercises/a8ea864d-1164-4afc-9a24-c0a126ee8e54?id=71097

True/False: If \(A\) is row equivalent to \(I\), then so is \(A^2\).

Grasple Exercise 3.4.18

https://embed.grasple.com/exercises/73a16f62-28d7-4a4c-baf5-7ce3be9272ce?id=71104

To find ‘by inspection’ the inverses of elementary matrices.

Grasple Exercise 3.4.19

https://embed.grasple.com/exercises/a8c2b8ed-9961-4779-8841-491a9529b71c?id=71466

To find the inverses of \(AE\) and \(EA\), when \(A^{-1}\) is given.

Grasple Exercise 3.4.20

https://embed.grasple.com/exercises/dfe429bd-1ab9-47f7-8f6c-06150c468645?id=71468

Finding the inverses of (almost) elementary matrices.

Grasple Exercise 3.4.21

https://embed.grasple.com/exercises/9af5928a-7ecb-478e-a896-7c66d16d9d09?id=71463

Distilling \(A^{-1}\) from a relation \(c_2A^2 + c_1A + c_0I = O\).

In the last two exercises (non-)invertibility of non-square matrices is considered.

Grasple Exercise 3.4.22

https://embed.grasple.com/exercises/ca504661-cc62-454f-8035-04a9bef85f91?id=61170

To explore invertibility for a \(2\times 3\) matrix.

Grasple Exercise 3.4.23

https://embed.grasple.com/exercises/4e9b4ec1-f775-430f-b81f-c76c42fcbc76?id=60136

To explore invertibility for a \(3\times 2\) matrix.