3.4. The Inverse of a Matrix
3.4.1. Introduction
In Section 3.2 we defined the sum and product of matrices (of compatible sizes), and we saw that to a certain extent matrix algebra is guided by the same rules as the arithmetic of real numbers. We can also subtract two matrices via
\[ A - B = A + (-1)B, \]
but we did not mention division of matrices.
For two numbers \(a\) and \(b\), with \(a \neq 0\), the equation
\[ ax = b \]
has the unique solution
\[ x = \dfrac{b}{a} = a^{-1}b, \]
where
\[ a^{-1} = \dfrac{1}{a} \]
is the (unique) solution of the equation
\[ ax = 1. \]
The bad news: for matrices \(A\) and \(B\) a quotient
\[ \dfrac{B}{A} \]
cannot be defined in any useful way!
First of all the corresponding matrix equation
\[ AX = B \]
does not always have a solution, or the solution is not unique, not even in the case of two \(n \times n\) matrices \(A\) and \(B\). Two examples to illustrate this:
The matrix equation
where
does not have a solution. Why? Well, any column of \(AX\) is a linear combination of the columns of \(A\), and the columns of \(B\) obviously cannot be written as such linear combinations:
The matrix equation
where
has infinitely many solutions. Two of those are for instance
And lastly, if there is a matrix \(C\) for which
\[ CA = I \]
and we would adopt the notation
\[ C = A^{-1}, \]
then
\[ AX = B \quad\Longrightarrow\quad X = IX = (CA)X = C(AX) = CB = A^{-1}B. \]
So \(X = A^{-1}B\). However, it is in no way clear why \(A^{-1}B\) and \(BA^{-1}\) should be equal, and in general indeed they are not. So the notation
\[ \dfrac{B}{A} \]
will still be ambiguous.
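The claim that \(A^{-1}B\) and \(BA^{-1}\) generally differ can be checked on a small example. The sketch below uses illustrative matrices that are not taken from the text; the matrix `C` plays the role of \(A^{-1}\), and the helper name `matmul` is an assumption of this sketch.

```python
def matmul(P, Q):
    """Product of two matrices given as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Q)] for row in P]

# Illustrative matrices (assumed for this sketch, not from the text).
A = [[1, 2], [3, 5]]
C = [[-5, 2], [3, -1]]   # satisfies CA = I and AC = I
B = [[0, 1], [1, 0]]
I2 = [[1, 0], [0, 1]]

assert matmul(C, A) == I2 and matmul(A, C) == I2

left = matmul(C, B)    # plays the role of A^{-1} B
right = matmul(B, C)   # plays the role of B A^{-1}

print(left)                    # [[2, -5], [-1, 3]]
print(right)                   # [[3, -1], [-5, 2]]
print(matmul(A, left) == B)    # True: X = A^{-1} B solves AX = B
print(matmul(A, right) == B)   # False: B A^{-1} does not
```

For this choice only the product with the inverse on the left solves \(AX = B\), which is why a quotient notation would be ambiguous.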
For non-square matrices things are even worse. In this section we will only consider square matrices.
3.4.2. Definition and Basic Properties of the Inverse
A square matrix \(A\) is called invertible if there exists a matrix \(B\) for which
\[ AB = I \quad\text{and}\quad BA = I. \]
In this situation the matrix \(B\) is called the inverse of \(A\) and we write
\[ B = A^{-1}. \]
A matrix that is invertible is also called a regular matrix, and a non-invertible matrix is also called a singular matrix.
Note the use of the definite article the in the sentence ‘\(B\) is called the inverse of \(A\)’. The following proposition justifies this choice of word.
If an inverse of a matrix \(A\) exists, then it is unique.
The proof is very short, when we plug in the right idea at the right place.
Proof of Proposition 3.4.1
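Suppose \(B\) and \(C\) are two matrices that satisfy the properties of being an inverse of \(A\), i.e.
\[ AB = BA = I \quad\text{and}\quad AC = CA = I. \]
Then the following chain of identities proves that \(B\) and \(C\) must be equal:
\[ B = BI = B(AC) = (BA)C = IC = C. \]
Actually, the proof shows slightly more, as the assumptions
\[ AB = I \quad\text{and}\quad CA = I \]
are not used. In fact it shows that for three \(n \times n\) matrices \(A\), \(B\) and \(C\)
\[ \text{if}\quad BA = I \quad\text{and}\quad AC = I, \quad\text{then}\quad B = C. \]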
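For the matrices
\[ A = \begin{bmatrix} 1 & 2 \\ 3 & 5 \end{bmatrix} \quad\text{and}\quad B = \begin{bmatrix} -5 & 2 \\ 3 & -1 \end{bmatrix} \]
we see
\[ AB = \begin{bmatrix} 1 & 2 \\ 3 & 5 \end{bmatrix}\begin{bmatrix} -5 & 2 \\ 3 & -1 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \quad\text{and}\quad BA = \begin{bmatrix} -5 & 2 \\ 3 & -1 \end{bmatrix}\begin{bmatrix} 1 & 2 \\ 3 & 5 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}. \]
So \(A\) and \(B\) are each other’s inverse.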
Another example:
You may check for yourself that the product in the other order also gives \(I\), so
It will appear (Remark 3.4.3) that for square matrices, a one-sided inverse is automatically a two-sided inverse, by which we mean
\[ \text{if}\quad AB = I, \quad\text{then also}\quad BA = I. \]
The first example can be generalized:
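If \(A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}\), then \(A^{-1}\) exists if and only if
\[ ad - bc \neq 0. \]
In that case
\[ A^{-1} = \dfrac{1}{ad-bc}\begin{bmatrix} d & -b \\ -c & a \end{bmatrix}. \]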
We leave the verification as an exercise.
Verify that the matrix \(B=A^{-1}\) proposed in Proposition 3.4.2 indeed satisfies
\[ AB = I \quad\text{and}\quad BA = I. \]
Also check that the first matrix in Example 3.4.3 illustrates the formula.
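Solution to Exercise 3.4.1
For \(ad - bc \neq 0\) we find
\[ AB = \dfrac{1}{ad-bc}\begin{bmatrix} a & b \\ c & d \end{bmatrix}\begin{bmatrix} d & -b \\ -c & a \end{bmatrix} = \dfrac{1}{ad-bc}\begin{bmatrix} ad-bc & 0 \\ 0 & ad-bc \end{bmatrix} = I, \]
which is one of the two identities.
Applying the formula of Proposition 3.4.2 to the matrix \(A = \begin{bmatrix} 1 & 2 \\ 3 & 5 \end{bmatrix}\) of Example 3.4.3 gives
\[ A^{-1} = \dfrac{1}{1\cdot 5 - 2\cdot 3}\begin{bmatrix} 5 & -2 \\ -3 & 1 \end{bmatrix} = \begin{bmatrix} -5 & 2 \\ 3 & -1 \end{bmatrix}, \]
which is indeed the matrix \(B\) that was proposed there.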
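The formula for the inverse of a \(2 \times 2\) matrix translates directly into a few lines of code. The following is a minimal sketch using exact rational arithmetic; the function name `inverse_2x2` and the convention of returning `None` when \(ad - bc = 0\) are choices made for this illustration.

```python
from fractions import Fraction

def inverse_2x2(A):
    """Inverse of [[a, b], [c, d]] via the ad - bc formula, or None if ad - bc = 0."""
    (a, b), (c, d) = A
    det = a * d - b * c
    if det == 0:
        return None              # no inverse exists
    f = Fraction(1) / det        # exact value of 1 / (ad - bc)
    return [[d * f, -b * f], [-c * f, a * f]]

print(inverse_2x2([[1, 2], [3, 5]]) == [[-5, 2], [3, -1]])   # True
print(inverse_2x2([[1, 2], [2, 4]]))                         # None
```

The second call uses a matrix whose second column is twice the first, so \(ad - bc = 0\) and no inverse is returned.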
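The condition
\[ ad - bc \neq 0 \]
is equivalent to the statement
\[ \text{the columns of } \begin{bmatrix} a & b \\ c & d \end{bmatrix} \text{ are linearly independent.} \]
First we show that
\[ ad - bc = 0 \quad\Longrightarrow\quad \text{the columns are linearly dependent.} \]
It is best to split this in two cases:
If we assume
\[ ad = 0, \]
then we have
\[ bc = ad = 0, \]
which leads to a matrix
\[ \begin{bmatrix} 0 & 0 \\ c & d \end{bmatrix}, \quad \begin{bmatrix} 0 & b \\ 0 & d \end{bmatrix}, \quad \begin{bmatrix} a & 0 \\ c & 0 \end{bmatrix} \quad\text{or}\quad \begin{bmatrix} a & b \\ 0 & 0 \end{bmatrix} \]
with either a zero row or a zero column, which will indeed have linearly dependent columns. Second, if we assume
\[ ad \neq 0, \]
then both \(a \neq 0\) and \(d \neq 0\), in which case
\[ \begin{bmatrix} b \\ d \end{bmatrix} = \dfrac{b}{a}\begin{bmatrix} a \\ c \end{bmatrix}, \]
hence the columns are again linearly dependent. Thus we have shown:
\[ ad - bc = 0 \quad\Longrightarrow\quad \text{the columns are linearly dependent.} \]
Next let us consider the converse, i.e.
\[ \text{the columns are linearly dependent} \quad\Longrightarrow\quad ad - bc = 0. \]
If a \(2 \times 2\) matrix has two linearly dependent columns, then one of the columns will be a multiple of the other column, e.g.
\[ \begin{bmatrix} b \\ d \end{bmatrix} = k\begin{bmatrix} a \\ c \end{bmatrix} \quad\text{or}\quad \begin{bmatrix} a \\ c \end{bmatrix} = k\begin{bmatrix} b \\ d \end{bmatrix}. \]
In both cases it is easily checked that
\[ ad - bc = 0. \]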
The following proposition shows that the above considerations can be generalized.
If \(A\) is a square matrix, then
\[ AX = I \]
has a unique solution if and only if
\[ A \text{ has linearly independent columns.} \]
Proof of Proposition 3.4.3
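As in the proof in Remark 3.4.2 we have to prove two implications:
\[ AX = I \text{ has a solution} \quad\Longrightarrow\quad A \text{ has linearly independent columns} \]
and
\[ A \text{ has linearly independent columns} \quad\Longrightarrow\quad AX = I \text{ has a unique solution.} \]
For the first part, assume that
\[ AX = I \text{ has a solution.} \]
That means that every column \(\mathbf{e}_j\) of the identity matrix is a linear combination of the columns \(\mathbf{a}_1, \ldots, \mathbf{a}_n\) of \(A\). So the span of the columns of \(A\) contains the span of \(\mathbf{e}_1, \ldots, \mathbf{e}_n\), which is the whole \(\mathbb{R}^n\). Thus every linear system
\[ A\mathbf{x} = \mathbf{b}, \quad \mathbf{b} \in \mathbb{R}^n, \]
has a solution. Then the reduced echelon form of \(A\) must have a pivot in every row, and, since it is a square matrix, it must be the identity matrix. Consequently, it has a pivot in every column, so the linear system
\[ A\mathbf{x} = \mathbf{0} \]
only has the trivial solution, which proves that indeed the columns of \(A\) are linearly independent.
For the converse, suppose that \(A\) has linearly independent columns. Then the reduced echelon form of \(A\) must be the identity matrix. This implies that for each \(\mathbf{b}\) in \(\mathbb{R}^n\)
\[ \text{the linear system } A\mathbf{x} = \mathbf{b} \text{ has a unique solution,} \]
and in particular, each linear system
\[ A\mathbf{x} = \mathbf{e}_j \]
has a unique solution. If we denote this solution by \(\mathbf{c}_j\) we have that
\[ A\begin{bmatrix} \mathbf{c}_1 & \cdots & \mathbf{c}_n \end{bmatrix} = \begin{bmatrix} A\mathbf{c}_1 & \cdots & A\mathbf{c}_n \end{bmatrix} = \begin{bmatrix} \mathbf{e}_1 & \cdots & \mathbf{e}_n \end{bmatrix} = I. \]
Since all solutions \(\mathbf{c}_j\) are unique, the solution of the equation
\[ AX = I \]
is unique as well.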
It makes sense that the solution \(B\) of this matrix equation will be the inverse of \(A\), and it is, but it takes some effort to show that the other requirement,
\[ BA = I, \]
is also fulfilled. In the next subsection we will see that the matrix equation
\[ AX = I \]
will lead the way to an algorithm to compute the inverse of a matrix. Before we go there we will look at some general properties of invertible matrices.
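If the \(n \times n\) matrix \(A\) is invertible and \(B\) is an \(n \times p\) matrix, then the solution of the matrix equation
\[ AX = B \]
is unique, and given by
\[ X = A^{-1}B. \]
In particular, if the matrix \(B\) has only one column, i.e., if it is a vector, then
\[ A\mathbf{x} = \mathbf{b} \quad\Longleftrightarrow\quad \mathbf{x} = A^{-1}\mathbf{b}. \]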
Proof of Proposition 3.4.4
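We multiply both sides of the equation
\[ AX = B \]
by \(A^{-1}\) and use the fact that the matrix product has the associative property:
\[ AX = B \quad\Longrightarrow\quad A^{-1}(AX) = A^{-1}B \quad\Longrightarrow\quad (A^{-1}A)X = IX = X = A^{-1}B. \]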
We illustrate the proposition by an example.
Suppose the matrix \(A\) and the vectors \(\mathbf{b}_1\) and \(\mathbf{b}_2\) are given by
Consider the two linear systems
Using the inverse matrix
the two solutions are quickly written down:
and likewise
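Once an inverse is known, it can be reused for several right-hand sides. The sketch below illustrates this with a matrix and two vectors chosen for this illustration (they are not the data of the example above); `matvec` is a hypothetical helper name.

```python
def matvec(M, v):
    """Product of a matrix (list of rows) with a vector (list)."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

# Illustrative data (assumed for this sketch).
A = [[1, 2], [3, 5]]
A_inv = [[-5, 2], [3, -1]]   # inverse of A, e.g. from the 2x2 formula
b1 = [1, 1]
b2 = [2, -1]

x1 = matvec(A_inv, b1)   # solution of A x = b1
x2 = matvec(A_inv, b2)   # solution of A x = b2

print(x1, matvec(A, x1) == b1)   # [-3, 2] True
print(x2, matvec(A, x2) == b2)   # [-12, 7] True
```

Each solution costs only one matrix-vector product; the work of finding \(A^{-1}\) is done once.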
A note of warning: the proof of Proposition 3.4.4 is based on the existence of the inverse of the matrix \(A\). Beware of this: never start using the expression \(A^{-1}\) unless you have made sure first that the matrix \(A\) is indeed invertible. If not, you may lead yourself into inconsistencies like in the following example:
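What goes wrong in the following ‘proof’ of the statement:
\[ \text{if}\quad A^2 = A, \quad\text{then}\quad A = I\,? \]
‘Fallacious proof’:
Assume \(A^2 = A\).
Then
\[ A^{-1}A^2 = A^{-1}A = I. \]
On the other hand
\[ A^{-1}A^2 = (A^{-1}A)A = IA = A. \]
So
\[ A = A^{-1}A^2 = I, \]
which ‘proves’ that \(A=I\).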
Somewhere something must have gone wrong, as the following counterexample shows.
For the matrix \(B = \begin{bmatrix} \frac12 & \frac12 \\ \frac12 & \frac12 \end{bmatrix}\)
it can be checked that
\[ B^2 = B, \]
whereas obviously
\[ B \neq I. \]
So, where exactly did it go wrong?!
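The counterexample can be checked with exact arithmetic. The sketch below verifies \(B^2 = B\) and \(B \neq I\), and also evaluates \(ad - bc\) for \(B\), which may help in locating the flaw; the helper name `matmul` is an assumption of this sketch.

```python
from fractions import Fraction

def matmul(P, Q):
    """Product of two matrices given as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Q)] for row in P]

half = Fraction(1, 2)
B = [[half, half], [half, half]]
I2 = [[1, 0], [0, 1]]

B2 = matmul(B, B)
det = B[0][0] * B[1][1] - B[0][1] * B[1][0]   # ad - bc for the matrix B

print(B2 == B)   # True
print(B == I2)   # False
print(det)       # 0
```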
The next proposition contains a few rules to manipulate inverse matrices.
If \(A\) is invertible and \(c \neq 0\), then the following is true:
i. The matrix \(cA\) is invertible, and
\[ (cA)^{-1} = \dfrac1c A^{-1}. \]
ii. The matrix \(A^T\) is invertible, and
\[ (A^T)^{-1} = (A^{-1})^T. \]
iii. The matrix \(A^{-1}\) is invertible, and
\[ (A^{-1})^{-1} = A. \]
Proof of Proposition 3.4.5
All statements can be proved by verifying that the relevant products are equal to \(I\).
i. The matrix \(A^{-1}\) exists, and so does \(\dfrac1c A^{-1}\). We find:
\[ (cA) \cdot \dfrac1c A^{-1} = c\cdot \dfrac1c A\cdot A^{-1} = 1 \cdot I = I, \]
and likewise \(\dfrac1c A^{-1}\cdot (cA) = I\), which proves that indeed \(\dfrac1c A^{-1} = (cA)^{-1}\).
ii. Since it is given that \(A^{-1}\) exists we can proceed as follows, where we make use of the characteristic property \( B^TA^T = (AB)^T\):
\[ (A^{-1})^TA^T = ( AA^{-1})^T = I^T = I \]
and
\[ A^T(A^{-1})^T =( A^{-1}A)^T = I^T = I, \]
which settles the second statement.
To prove iii., see Exercise 3.4.2.
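The three rules of Proposition 3.4.5 can be spot-checked on a concrete matrix. This is only a sanity check on one illustrative matrix with \(c = 2\), not a proof; the helper names `inverse_2x2`, `transpose` and `scale` are assumptions of this sketch.

```python
from fractions import Fraction

def inverse_2x2(A):
    """Inverse of an invertible 2x2 matrix via the ad - bc formula."""
    (a, b), (c, d) = A
    f = Fraction(1) / (a * d - b * c)
    return [[d * f, -b * f], [-c * f, a * f]]

def transpose(A):
    return [list(col) for col in zip(*A)]

def scale(c, A):
    return [[c * x for x in row] for row in A]

A = [[1, 2], [3, 5]]   # illustrative invertible matrix
c = 2

rule_i = inverse_2x2(scale(c, A)) == scale(Fraction(1, c), inverse_2x2(A))
rule_ii = inverse_2x2(transpose(A)) == transpose(inverse_2x2(A))
rule_iii = inverse_2x2(inverse_2x2(A)) == A

print(rule_i, rule_ii, rule_iii)   # True True True
```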
Prove the last statement of the previous proposition.
Solution to Exercise 3.4.2
For the inverse \(C = (A^{-1})^{-1}\) of \(A^{-1}\), it should hold that
\[ A^{-1}C = I \quad\text{and}\quad CA^{-1} = I. \]
The matrix \(C = A\) has these properties.
The next example gives an illustration of ii. in Proposition 3.4.5.
We consider the matrix
It has the inverse matrix
which can be checked by showing that \(AB\) and \(BA\) are equal to \(I\).
So \(B = A^{-1}\), and \(B^T = (A^{-1})^T\).
We also have
as well as \(B^TA^T = I\), which proves that \(B^T = (A^T)^{-1}\).
As we already saw that \(B^T = (A^{-1})^{T}\), the matter is settled:
\[ (A^T)^{-1} = (A^{-1})^T. \]
The last property we mention and prove is the product rule for the matrix inverse.
If \(A\) and \(B\) are invertible \(n \times n\) matrices then the matrix \(AB\) is also invertible, and
\[ (AB)^{-1} = B^{-1}A^{-1}. \]
Again we just check that the properties of the definition hold.
Suppose that \(A\) and \(B\) are invertible with inverses \(A^{-1}\) and \(B^{-1}\).
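Then using the associative property we find
\[ (AB)(B^{-1}A^{-1}) = A(BB^{-1})A^{-1} = AIA^{-1} = AA^{-1} = I, \]
and along the same lines
\[ (B^{-1}A^{-1})(AB) = B^{-1}(A^{-1}A)B = B^{-1}IB = B^{-1}B = I. \]
This shows that \(B^{-1}A^{-1}\) is indeed the inverse of \(AB\).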
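The order of the factors in the product rule matters. The sketch below compares \((AB)^{-1}\) with both \(B^{-1}A^{-1}\) and \(A^{-1}B^{-1}\) for two illustrative matrices (chosen for this sketch, not taken from the text); `matmul` and `inverse_2x2` are hypothetical helper names.

```python
from fractions import Fraction

def matmul(P, Q):
    """Product of two matrices given as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Q)] for row in P]

def inverse_2x2(A):
    """Inverse of an invertible 2x2 matrix via the ad - bc formula."""
    (a, b), (c, d) = A
    f = Fraction(1) / (a * d - b * c)
    return [[d * f, -b * f], [-c * f, a * f]]

A = [[1, 2], [3, 5]]
B = [[2, 1], [1, 1]]

inv_AB = inverse_2x2(matmul(A, B))
right_order = matmul(inverse_2x2(B), inverse_2x2(A))   # B^{-1} A^{-1}
wrong_order = matmul(inverse_2x2(A), inverse_2x2(B))   # A^{-1} B^{-1}

print(inv_AB == right_order)   # True
print(inv_AB == wrong_order)   # False
```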
Is the identity
true or false?
In case it is true, give an argument, when false, give a counterexample.
Solution to Exercise 3.4.3
The statement is true.
From the two properties
it follows that
3.4.3. How to Compute the Inverse
The construction of the inverse of a matrix was already present implicitly in Proposition 3.4.3.
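The inverse of the matrix \(A\) must satisfy the equation \(AX = I\).
Written out column by column this means that
\[ A\begin{bmatrix} \mathbf{x}_1 & \mathbf{x}_2 & \cdots & \mathbf{x}_n \end{bmatrix} = \begin{bmatrix} A\mathbf{x}_1 & A\mathbf{x}_2 & \cdots & A\mathbf{x}_n \end{bmatrix} = \begin{bmatrix} \mathbf{e}_1 & \mathbf{e}_2 & \cdots & \mathbf{e}_n \end{bmatrix}. \]
For the existence of a solution of this equation, Proposition 3.4.3 tells us that it is necessary that \(A\) has linearly independent columns. We can furthermore read off that the columns of the matrix \(X\) will be the (unique) solutions of the linear systems
\[ A\mathbf{x}_k = \mathbf{e}_k, \]
where \(k = 1,2,\ldots, n\).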
Let us first focus on this equation by considering a fairly general \(3\times 3\) matrix \(A\).
For the matrix
we find the solution \(B\) of the matrix equation (which will appear to exist)
and then check whether
also holds. In which case we can truthfully assert that \(B = A^{-1}\).
Instead of finding the solution \(X\) column by column, which gives three linear systems with the same coefficient matrix,
we can solve the three linear systems simultaneously using a combined augmented matrix which we may denote by either
Let us first row reduce this matrix and then draw conclusions:
By construction we have that the matrix
satisfies
Let us check the product in the other order
So indeed we can conclude
Now was this just beginners’ luck? It wasn’t, as the next proposition shows.
A square matrix \(A\) is invertible if and only if it has linearly independent columns.
In that case the inverse can be found by reducing the matrix
\[ [\,A\,|\,I\,] \]
to the reduced echelon form
\[ [\,I\,|\,B\,], \]
and then
\[ A^{-1} = B. \]
Proof of Proposition 3.4.7
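We have already seen (Proposition 3.4.3) that an invertible matrix has linearly independent columns, which implies that the reduced echelon form of \(A\) is indeed the identity matrix. And then it is clear that via row operations we get
\[ [\,A\,|\,I\,] \;\sim\; \cdots \;\sim\; [\,I\,|\,B\,], \]
where the matrix \(B\) satisfies \(AB = I\).
What we have to show is that
\[ BA = I \]
as well.
To understand that this is indeed true, we recall (Definition 3.2.9) that row operations can be effectuated via multiplications with elementary matrices. Furthermore, since the matrix product is defined column by column, i.e.
\[ MX = M\begin{bmatrix} \mathbf{x}_1 & \cdots & \mathbf{x}_p \end{bmatrix} = \begin{bmatrix} M\mathbf{x}_1 & \cdots & M\mathbf{x}_p \end{bmatrix}, \]
we also have
\[ E\,[\,A\,|\,I\,] = [\,EA\,|\,EI\,] = [\,EA\,|\,E\,]. \]
A series of \(k\) row operations can be mimicked by \(k\) multiplications with elementary matrices:
\[ [\,A\,|\,I\,] \sim [\,E_1A\,|\,E_1\,] \sim [\,E_2E_1A\,|\,E_2E_1\,] \sim \cdots \sim [\,E_k\cdots E_1A\,|\,E_k\cdots E_1\,] = [\,I\,|\,B\,]. \]
So the matrix \(B\) that was found as the solution of the matrix equation
\[ AX = I \]
is the product of all the elementary matrices by which \(A\) is reduced to the identity matrix. Thus we have shown that indeed
\[ BA = (E_k\cdots E_1)A = I. \]
In the proof we in fact showed that for a square matrix \(A\):
\[ \text{if}\quad AB = I, \quad\text{then}\quad BA = I. \]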
For non-square matrices this statement is not correct. The interested reader is invited to take a look at the last exercises in the Grasple subsection (Section 3.4.5).
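The key idea in the proof of Proposition 3.4.7, that the product of the elementary matrices used in the row reduction is itself the inverse, can be traced on a small case. The matrix and the three row operations below are chosen for this illustration and are not the example of the text.

```python
def matmul(P, Q):
    """Product of two matrices given as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Q)] for row in P]

A = [[1, 2], [3, 5]]
I2 = [[1, 0], [0, 1]]

# Elementary matrices for three successive row operations on A.
E1 = [[1, 0], [-3, 1]]   # row 2 <- row 2 - 3 * row 1
E2 = [[1, 0], [0, -1]]   # row 2 <- (-1) * row 2
E3 = [[1, -2], [0, 1]]   # row 1 <- row 1 - 2 * row 2

reduced = matmul(E3, matmul(E2, matmul(E1, A)))   # E3 E2 E1 A
B = matmul(E3, matmul(E2, E1))                    # E3 E2 E1

print(reduced == I2)         # True: A is row reduced to I
print(B)                     # [[-5, 2], [3, -1]]
print(matmul(A, B) == I2)    # True: B is also a right inverse
```

By construction \(BA = I\); the last line confirms that the same \(B\) satisfies \(AB = I\) for this matrix.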
If \(A\) is not invertible, then the outcome of the row reduction of
\[ [\,A\,|\,I\,] \]
will also lead to the correct answer: as soon as it is clear that \(A\) cannot be row reduced to \(I\) we can conclude that \(A\) is not invertible.
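The row reduction of \([\,A\,|\,I\,]\) can be sketched as a short routine. The version below is one possible implementation with exact rational arithmetic; the function name `invert`, the pivot search, and the choice to return `None` when no pivot is found are assumptions of this sketch rather than a prescribed algorithm.

```python
from fractions import Fraction

def invert(A):
    """Row reduce [A | I] to [I | B]; return B, or None if A cannot be reduced to I."""
    n = len(A)
    # Build the augmented matrix [A | I] with exact rational entries.
    M = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for col in range(n):
        # Look for a nonzero entry in this column, at or below the diagonal.
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return None                      # no pivot: A is not invertible
        M[col], M[pivot] = M[pivot], M[col]  # row swap
        p = M[col][col]
        M[col] = [x / p for x in M[col]]     # scale the pivot to 1
        for r in range(n):                   # clear the rest of the column
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    return [row[n:] for row in M]            # the right-hand block B

print(invert([[1, 1, 0], [0, 1, 1], [0, 0, 1]]) == [[1, -1, 1], [0, 1, -1], [0, 0, 1]])  # True
print(invert([[1, 2], [2, 4]]))                                                           # None
```

The second call stops as soon as the second column yields no pivot, which matches the remark that a failed reduction already settles non-invertibility.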
To help understand the above exposition let us run through the whole procedure for a specific matrix.
We want to compute the inverse of the matrix
The short way:
So:
End of story.
To see how the proof of Proposition 3.4.7 works for this specific matrix, we will give a derivation using elementary matrices.
First step: row replacement with the entry on position (1,1) as a first pivot:
Second step: another row replacement, using the entry on position (2,2) as pivot:
Third step: the scaling of the second row:
All in all
which reconfirms
Prove the following converse of Proposition 3.4.6.
If \(A\) and \(B\) are \(n\times n\) matrices for which the product \(AB\) is invertible, then \(A\) and \(B\) are both invertible.
Make sure that you do not use \(A^{-1}\) or \(B^{-1}\) prematurely, i.e., before you have established that they exist.
Solution to Exercise 3.4.4
Suppose \(A\) and \(B\) are two \(n \times n\) matrices for which \(AB\) is invertible. Let \(C=(AB)^{-1}\) be the inverse of \(AB\). We claim that \(BC\) is the inverse of \(A\).
Now since
\[ A(BC) = (AB)C = I, \]
it follows from Remark 3.4.3 that \((BC)A=I\) also holds. So we have
\[ A(BC) = (BC)A = I, \]
which means that \(A\) is invertible and has as inverse the matrix \(BC\).
In the same vein it is shown that \(CA\) is the inverse of \(B\).
3.4.4. Characterizations of Invertibility
In the previous subsections quite a few properties of invertible matrices came along, either explicitly or implicitly. For future reference we list them in a theorem.
Recall that by definition a (square) matrix \(A\) is invertible (or regular) if and only if there exists a matrix \(B\) for which
\[ AB = I \quad\text{and}\quad BA = I. \]
For an \(n\times n\) matrix \(A\), the following statements are equivalent.
That is, each of the following properties is a characterization of invertibility of a square matrix \(A\):
- \(A\) is invertible;
- there exists a matrix \(B\) for which \(AB = I\);
- for each \(\mathbf{b}\in\mathbb{R}^n\) the linear system \(A\mathbf{x} = \mathbf{b}\) has a unique solution;
- \(A\) is row equivalent to the identity matrix \(I_n\);
- \(A\) has linearly independent columns;
- the equation \(A\mathbf{x} = \mathbf{0}\) has only the trivial solution \(\mathbf{x} = \mathbf{0}\);
- \(A\) can be written as a product of elementary matrices: \(A = E_1E_2\cdots E_k\).
Proof of Theorem 3.4.1
It is a good exercise to find out where the evidence of each characterization is found, and wherever necessary to fill in the missing details.
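Several of the characterizations can be tested mechanically by counting pivots. The sketch below is a rough numerical companion under the assumption that "each column is a pivot column" is checked by forward elimination; the function name `pivot_count` is hypothetical.

```python
from fractions import Fraction

def pivot_count(A):
    """Number of pivots found by forward elimination on A."""
    M = [[Fraction(x) for x in row] for row in A]
    rows, cols = len(M), len(M[0])
    r = 0                                    # index of the next pivot row
    for c in range(cols):
        pivot = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if pivot is None:
            continue                         # no pivot in this column
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, rows):         # eliminate below the pivot
            factor = M[i][c] / M[r][c]
            M[i] = [x - factor * y for x, y in zip(M[i], M[r])]
        r += 1
        if r == rows:
            break
    return r

regular = [[1, 2], [3, 5]]
singular = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

print(pivot_count(regular) == len(regular))     # True: a pivot in every column
print(pivot_count(singular) == len(singular))   # False: only 2 pivots
```

For an \(n \times n\) matrix, finding \(n\) pivots corresponds to being row equivalent to \(I_n\), and hence to the other properties in the list.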
There are many variations on Theorem 3.4.1. The following exercise contains a few.
Show that invertibility of an \(n\times n\) matrix \(A\) is also equivalent to
- there exists a matrix \(B\) such that \(BA = I\);
- \(A\) has linearly independent rows;
- each column of the matrix \(A\) is a pivot column;
- the columns of \(A\) span the whole \(\mathbb{R}^n\).
Again it may very well be that you have to resort to previous sections.
3.4.5. Grasple Exercises
The first exercises are quite straightforwardly computational. The remaining exercises tend to be more theoretical.
To compute the inverse of a \(2 \times 2\) matrix.
To compute the inverse of a \(2 \times 2\) matrix.
To compute the inverse of a \(2 \times 2\) matrix.
To compute step by step the inverse of a \(3 \times 3\) matrix.
To compute the inverse of a \(3 \times 3\) matrix.
To compute the inverse of a \(3 \times 3\) matrix.
To compute the inverse of a \(4 \times 4\) matrix.
To find \(p\) for which a matrix \(A\) is singular.
The remaining exercises have a more theoretical flavour.
True/False question about invertibility versus consistent linear systems.
To show: if \(A\) is invertible, then so is \(A^T\).
To show: if \(AB\) is invertible, then so are \(A\) and \(B\).
What about \((AB)^{-1} = A^{-1}B^{-1}\)?
What about \(((AB)^T)^{-1} = (A^T)^{-1}(B^T)^{-1}\)?
True/False: Every elementary matrix is invertible.
True/False: If \(A\) and \(B\) are invertible, then so is \(A+B\).
True/False: If \(A\) and \(B\) are singular, then so is \(A+B\).
True/False: If \(A\) is row equivalent to \(I\), then so is \(A^2\).
To find ‘by inspection’ inverses of elementary matrices.
To find the inverses of \(AE\) and \(EA\), when \(A^{-1}\) is given.
Finding the inverses of (almost) elementary matrices.
Distilling \(A^{-1}\) from a relation \(c_2A^2 + c_1A + c_0I = 0\).
In the last two exercises (non-)invertibility of non-square matrices is considered.
To explore invertibility for a \(2\times 3\) matrix.
To explore invertibility for a \(3\times 2\) matrix.