6.2. The Characteristic Polynomial#
So far we have seen how to check whether a given real number is an eigenvalue, but we have not yet developed a method to actually find the eigenvalues (other than trying all real numbers one by one …).
In this section we will show that the eigenvalues are exactly the zeros of a polynomial function that is so intricately connected with the matrix \(A\) that it is called the characteristic polynomial of \(A\).
6.2.1. The definition of the characteristic polynomial#
Proposition 6.2.1. Suppose \(A\) is an \(n\times n\) matrix. Then \(\lambda\) is an eigenvalue of \(A\) if and only if the determinant of the matrix \(A -\lambda I\) is equal to zero.
Proof of Proposition 6.2.1
There’s not much new here.
We have seen in Proposition 6.1.1 that the statement
\(\lambda\) is an eigenvalue of \(A\)
is equivalent to
\((A-\lambda I)\vect{x} = \vect{0}, \quad\) for a nonzero vector \( \vect{x}\).
We also know (Theorem 3.4.1) that such a nonzero solution exists if and only if
the matrix \(A - \lambda I\) is not invertible,
and this, in its turn, is equivalent to

\( \quad \text{det}(A - \lambda I) = 0. \quad \)
Consider the matrix \(A = \begin{bmatrix} 1 & 4 \\ 1 & 1 \end{bmatrix}\). We evaluate det\((A - \lambda I)\):

\( \quad \text{det}(A - \lambda I) = \left|\begin{array}{cc} 1-\lambda & 4 \\ 1 & 1-\lambda \end{array}\right| = (1-\lambda)^2 - 4 = \lambda^2 - 2\lambda - 3. \quad \)

So we see that \(\lambda\) is an eigenvalue if and only if

\( \quad \lambda^2 - 2\lambda - 3 = (\lambda - 3)(\lambda + 1) = 0, \quad \)

and we conclude that the eigenvalues of \(A\) are \(\lambda_1 =3\) and \(\lambda_2 = -1\).
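For readers who want to double-check such computations with software, here is a minimal Python/NumPy sketch (illustrative only; the variable names are our own choice):

```python
import numpy as np

# Matrix from Example 6.2.1.
A = np.array([[1.0, 4.0],
              [1.0, 1.0]])

# numpy returns the zeros of det(A - lambda*I) numerically.
print(np.linalg.eigvals(A))  # expected: 3 and -1 (possibly in another order)
```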
Proposition 6.2.1 can also be used to show that a matrix does not have any (real) eigenvalues.
For the matrix \(R=\begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}\) the determinant of \(R - \lambda I\) becomes

\( \quad \text{det}(R - \lambda I) = \left|\begin{array}{cc} -\lambda & -1 \\ 1 & -\lambda \end{array}\right| = \lambda^2 + 1. \quad \)

Since this polynomial has no real zeros, the matrix \(R\) has no (real) eigenvalues. We have already seen a geometric argument when we considered this matrix in Example 6.1.8: \(R\) is the matrix of a rotation.
In the remark immediately after that example we mentioned that it is possible to treat \(\lambda = \pm i\) as eigenvalues of the matrix \(R\). Of course these are exactly the complex zeros of the polynomial \(p(\lambda) = \lambda^2 + 1\).
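Numerical software agrees with this viewpoint: asked for the eigenvalues of \(R\), NumPy returns exactly the complex zeros \(\pm i\). An illustrative sketch:

```python
import numpy as np

# Rotation matrix from the example; p(lambda) = lambda^2 + 1 has no real zeros.
R = np.array([[0.0, -1.0],
              [1.0,  0.0]])

# For a real matrix without real eigenvalues, numpy returns complex ones.
print(np.linalg.eigvals(R))  # expected: 1j and -1j
```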
In general, computing the characteristic polynomial of an \(n \times n\) matrix becomes cumbersome for \(n > 2\), and finding its zeros exactly is close to impossible. There is one exception, which is when the matrix is of triangular form. Recall that this means that either all entries below the diagonal are zero (in which case the matrix is upper triangular), or all entries above the diagonal are zero (lower triangular).
Consider the matrix \(U = \begin{bmatrix} u_{11} & u_{12} & u_{13} \\ 0 & u_{22} & u_{23} \\ 0 & 0 & u_{33} \end{bmatrix}\), an arbitrary \(3 \times 3\) upper triangular matrix.
Since \(U - \lambda I\) is also an upper triangular matrix, we find that

\( \quad \text{det}(U - \lambda I) = (u_{11}-\lambda)(u_{22}-\lambda)(u_{33}-\lambda). \quad \)

The last expression becomes 0 exactly for the values

\( \quad \lambda = u_{11}, \quad \lambda = u_{22}, \quad \lambda = u_{33}. \quad \)
So for a \(3 \times 3\) upper triangular matrix the eigenvalues are the diagonal entries.
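A quick numerical sanity check, with an arbitrarily chosen upper triangular matrix (illustrative only):

```python
import numpy as np

# An arbitrary upper triangular matrix; its eigenvalues should be
# exactly the diagonal entries 2, 7 and -3.
U = np.array([[2.0, 5.0, -1.0],
              [0.0, 7.0,  4.0],
              [0.0, 0.0, -3.0]])

print(np.linalg.eigvals(U))  # expected: 2, 7, -3 (possibly reordered)
print(np.diag(U))            # the diagonal entries, for comparison
```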
Obviously Example 6.2.3 can be generalized. This leads to the following proposition.
Proposition 6.2.2. For any upper or lower triangular matrix \(A\) the eigenvalues are given by the diagonal entries.
Note that this includes diagonal matrices \(D\).
For the \(2\times 2\) matrix in Example 6.2.1 the expression det\((A - \lambda I)\) eventually comes down to a polynomial of degree 2. For an arbitrary \(2 \times 2\) matrix \(A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}\) we quickly see that

\( \quad \text{det}(A - \lambda I) = \left|\begin{array}{cc} a-\lambda & b \\ c & d-\lambda \end{array}\right| = (a-\lambda)(d-\lambda) - bc = \lambda^2 - (a+d)\lambda + (ad - bc), \quad \)

again a polynomial of degree 2.
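The same computation can be reproduced symbolically, for instance with SymPy (an illustrative sketch; the symbol names are our own):

```python
from sympy import Matrix, symbols, expand

a, b, c, d, lam = symbols('a b c d lambda')

# det(A - lambda*I) for a general 2x2 matrix A = [[a, b], [c, d]].
M = Matrix([[a - lam, b],
            [c, d - lam]])
print(expand(M.det()))  # lambda**2 - a*lambda - d*lambda + a*d - b*c (up to term order)
```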
Likewise, we saw that for \(n\times n\) triangular matrices the characteristic polynomial is a polynomial of degree \(n\). That this can be generalized to arbitrary matrices is the content of the next proposition.
Proposition 6.2.3. For an \(n\times n\) matrix \(A\) the function det\((A - \lambda I)\) is a polynomial of degree \(n\).
Proof of Proposition 6.2.3
We have to dive into the hardware of determinants a bit. If the determinant of an \(n\times n\) matrix \(M\) is computed by iteratively expanding along the first row, i.e., doing it the hard way, we end up with a sum of \(n!\) terms. Each term is the product of \(n\) entries of \(M\), where each row and each column of \(M\) is represented exactly once.
Now we apply this to the matrix \(M = (A - \lambda I)\), where \(A\) is the most general \(n\times n\) matrix. We can then deduce that det\((A - \lambda I)\) is indeed a polynomial function of \(\lambda\), and that the highest power of \(\lambda\) comes from the product of the diagonal entries

\( \quad (a_{11}-\lambda)(a_{22}-\lambda)\cdots(a_{nn}-\lambda). \quad \)

We see that the highest power of \(\lambda\) in this expression is the \(n\)-th power, and that its coefficient is \((-1)^n\).
The function det\((A - \lambda I)\) is of paramount importance. We will see that it reveals important intrinsic properties of the matrix \(A\). It deserves a name.
The function det\((A - \lambda I)\) is called the characteristic polynomial of \(A\). We will sometimes denote it by \(p_A(\lambda)\), so

\( \quad p_A(\lambda) = \text{det}(A - \lambda I). \quad \)
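For hand checks of the examples in this section, a computer algebra system can produce this polynomial symbolically. Below is an illustrative SymPy sketch (not part of the text); note one convention caveat: SymPy's `charpoly` uses det\((\lambda I - A)\), which differs from det\((A - \lambda I)\) by a factor \((-1)^n\).

```python
from sympy import Matrix, symbols

lam = symbols('lambda')

# Matrix from Example 6.2.1; sympy's charpoly uses det(lambda*I - A),
# which equals (-1)^n * det(A - lambda*I). For n = 2 the two agree.
A = Matrix([[1, 4],
            [1, 1]])
print(A.charpoly(lam).as_expr())  # lambda**2 - 2*lambda - 3
```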
Let us put the important properties of the characteristic polynomial that we have seen so far into a theorem.
The characteristic polynomial of the \(n \times n\) matrix \(A\)

- is a polynomial of degree \(n\), and

- its zeros are the eigenvalues of the matrix \(A\).
As a corollary we find a second argument why an \(n\times n\) matrix cannot have more than \(n\) different eigenvalues: a polynomial of degree \(n\) can have at most \(n\) zeros.
A note of warning: to compute the determinant of the matrix \(A - \lambda I\) it may seem helpful to first row reduce the matrix \(A\) to an echelon form \(E\), and then take the determinant of \(E - \lambda I\), which is a triangular matrix. However, that procedure is incorrect. Except for very special cases, in general

\( \quad \text{det}(E - \lambda I) \neq \text{det}(A - \lambda I). \quad \)
For instance, take the earlier example \( A = \left[\begin{array}{cc} 1 & 4 \\ 1 & 1 \end{array}\right]\) with characteristic polynomial

\( \quad p_A(\lambda) = \lambda^2 - 2\lambda - 3 = (\lambda - 3)(\lambda + 1). \quad \)

Since \(A\) is invertible, \(A\) can be row reduced to the echelon matrix \(E = I\), the identity matrix, and

\( \quad \text{det}(E - \lambda I) = (1 - \lambda)^2, \quad \)

which has nothing to do with \(\det{(A - \lambda I)}\).
So, if you want to find the characteristic polynomial det\((A - \lambda I)\) via row reduction, you have to include \(\lambda\) right from the beginning.
Let’s look at one other example before we give the characteristic polynomial a closer look.
We find the characteristic polynomial of the matrix \(A = \begin{bmatrix} 4 & -1 & -2 \\0 & 3 & 0 \\ 1 & 2 & 1 \end{bmatrix}\).
Because of the zeros in the second row a good start is to expand along this row:

\( \quad \text{det}(A - \lambda I) = \left|\begin{array}{ccc} 4-\lambda & -1 & -2 \\ 0 & 3-\lambda & 0 \\ 1 & 2 & 1-\lambda \end{array}\right| = (3-\lambda)\left|\begin{array}{cc} 4-\lambda & -2 \\ 1 & 1-\lambda \end{array}\right|, \quad \)

which can be rewritten as

\( \quad (3-\lambda)\big( (4-\lambda)(1-\lambda) + 2 \big) = (3-\lambda)(\lambda^2 - 5\lambda + 6). \quad \)

We certainly do not eliminate the parentheses here, since in this factorized form the eigenvalue \(\lambda = 3\) is staring us right in the face. From the last expression we read off that \(\lambda\) is an eigenvalue of \(A\) if either \( (3-\lambda)=0 \) or \((\lambda^2 - 5\lambda +6)=0\). Noting that

\( \quad \lambda^2 - 5\lambda + 6 = (\lambda - 2)(\lambda - 3), \quad \)

we may conclude that the eigenvalues of this matrix are

\( \quad \lambda_1 = 3 \ \text{(a double eigenvalue)} \quad \text{and} \quad \lambda_2 = 2. \quad \)
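As a numerical check on this example (illustrative sketch):

```python
import numpy as np

# Matrix from Example 6.2.4.
A = np.array([[4.0, -1.0, -2.0],
              [0.0,  3.0,  0.0],
              [1.0,  2.0,  1.0]])

print(np.linalg.eigvals(A))  # expected: 3, 3 and 2 (up to ordering/rounding)
```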
From the examples so far it seems we have solved the question of how to find the eigenvalues. However, there is a proviso: if we start with a ‘full’ \(3 \times 3\) matrix \(A\), there may be nothing better to do than to compute det\((A - \lambda I)\) by iteratively expanding across columns or rows. We then end up with a cubic polynomial, not in factorized form. In general it will be quite a hard task to compute its zeros. Obviously, things get even worse in higher dimensions.
In the previous example (Example 6.2.4) the eigenvalue \(\lambda_1 = 3\) seems to play a different role than the eigenvalue \(\lambda_2 = 2\). That is, the characteristic polynomial

\( \quad p_A(\lambda) = (3-\lambda)(\lambda - 2)(\lambda - 3) = (3-\lambda)^2(2-\lambda) \quad \)

contains two factors \((\lambda - 3)\) and only one factor \((\lambda - 2)\). In algebra it is then said that \(\lambda = 3\) is a zero/root of multiplicity 2 of the polynomial \(p_A\), and likewise the root \(\lambda = 2\) has multiplicity 1.
Another natural question is how many linearly independent eigenvectors there are for an eigenvalue \(\lambda\). This we will refer to as the geometric multiplicity.
6.2.2. Algebraic and geometric multiplicity#
The algebraic multiplicity of an eigenvalue \(\lambda_k\) is the number of factors \((\lambda - \lambda_k)\) appearing in the characteristic polynomial. It is often abbreviated as a.m.(\(\lambda_k\)).
The geometric multiplicity of an eigenvalue \(\lambda_k\), with short notation g.m.(\(\lambda_k\)), is the dimension of the eigenspace corresponding to \(\lambda_k\). In other words, it is the maximal number of linearly independent eigenvectors for \(\lambda_k\).
We continue with the matrix \(A = \left[\begin{array}{ccc} 4 & -1 & -2 \\0 & 3 & 0 \\ 1 & 2 & 1 \end{array}\right]\) of Example 6.2.4 and find the geometric multiplicities of the eigenvalues. The characteristic polynomial of \(A\) was found to be \(p_A(\lambda) = (3-\lambda)^2(2-\lambda)\), so \(A\) has the eigenvalues \(\lambda_1 = 3\) with algebraic multiplicity \(2\) and \(\lambda_2 = 2\) with algebraic multiplicity \(1\).
To find the geometric multiplicities we proceed as follows.
The eigenspace for \(\lambda_{1} = 3\) is the null space of \(A - \lambda_{1} I = A -3I\):

\( \quad A - 3I = \left[\begin{array}{ccc} 1 & -1 & -2 \\ 0 & 0 & 0 \\ 1 & 2 & -2 \end{array}\right]. \quad \)

This is a \(3 \times 3\) matrix of rank 2, so its null space has dimension \(3-2 =1\), and we conclude that the geometric multiplicity of the eigenvalue \(\lambda_1 = 3\) is equal to 1. For the other eigenvalue we perform a similar computation:

\( \quad A - 2I = \left[\begin{array}{ccc} 2 & -1 & -2 \\ 0 & 1 & 0 \\ 1 & 2 & -1 \end{array}\right] \quad \)

also has rank 2, from which we deduce that the geometric multiplicity of the eigenvalue \(\lambda_2 = 2\) is also equal to 1.
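The rank computations above are easily replicated with a computer algebra system: the length of a null space basis is exactly the geometric multiplicity. A minimal SymPy sketch (illustrative; names are our own):

```python
from sympy import Matrix, eye

# Matrix from Example 6.2.4.
A = Matrix([[4, -1, -2],
            [0,  3,  0],
            [1,  2,  1]])

# g.m.(lambda) = dimension of Nul(A - lambda*I).
for ev in (3, 2):
    basis = (A - ev * eye(3)).nullspace()
    print(ev, len(basis))  # expected: geometric multiplicity 1 for both
```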
At this moment it is not so easy to prove the following proposition, of which the previous example gives an illustration. (For the proof: see the section ‘Similar Matrices’, right after formula (6.3.1).)
Proposition 6.2.4. For every eigenvalue \(\lambda\) of a matrix \(A\), the geometric multiplicity is at most equal to the algebraic multiplicity. So we always have

\( \quad 1 \ \leq\ \text{g.m.}(\lambda) \ \leq\ \text{a.m.}(\lambda). \quad \)
A matrix \(A\) that has an eigenvalue \(\lambda\) for which the geometric multiplicity is strictly smaller than the algebraic multiplicity is called a defective matrix.
As a consequence of Proposition 6.2.4 there is one case where the geometric multiplicity follows immediately from the algebraic multiplicity. Namely, if \(\lambda\) is an eigenvalue of algebraic multiplicity 1, then the geometric multiplicity must be 1 too: it cannot be larger, because of Proposition 6.2.4, and it cannot be smaller either, since for any eigenvalue there must be at least one eigenvector.
The matrix \(A = \begin{bmatrix} 1 & 2 & 2 \\ 2 & 1 & 2 \\ 2 & 2 & 1 \end{bmatrix}\) has the two independent eigenvectors \(\vect{v}_1 = \begin{bmatrix} 1 \\ 0 \\ -1 \end{bmatrix}, \vect{v}_2 =\begin{bmatrix} 0 \\ 1 \\ -1 \end{bmatrix}\) for the eigenvalue \(\lambda = -1\), and the eigenvector \(\vect{v}_3 = \begin{bmatrix} 1 \\1\\ 1 \end{bmatrix}\) for the eigenvalue 5. (We studied this matrix in Example 6.1.7.) From Proposition 6.2.4 we can deduce that the characteristic polynomial must contain at least two factors \((\lambda - (-1)) = (\lambda + 1)\) and one factor \((\lambda - 5)\). Since its degree is equal to 3, and the coefficient of \(\lambda^3\) is equal to \((-1)^3 = -1\), we may conclude that

\( \quad p_A(\lambda) = -(\lambda + 1)^2(\lambda - 5). \quad \)
The eigenvalue \(\lambda = -1\) has both algebraic multiplicity and geometric multiplicity equal to 2.
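Both the factorization of the characteristic polynomial and the geometric multiplicity can be verified symbolically. An illustrative SymPy sketch:

```python
from sympy import Matrix, symbols, factor, eye

lam = symbols('lambda')

# Symmetric matrix from Example 6.1.7.
A = Matrix([[1, 2, 2],
            [2, 1, 2],
            [2, 2, 1]])

# Characteristic polynomial det(A - lambda*I), factored.
p = (A - lam * eye(3)).det()
print(factor(p))  # expected: -(lambda - 5)*(lambda + 1)**2

# Geometric multiplicity of lambda = -1: dimension of Nul(A + I).
print(len((A + eye(3)).nullspace()))  # expected: 2
```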
The following exercise, which is meant to shed some more light on the concepts just introduced, requires few technicalities.
Finding eigenvalues and their multiplicities for an upper triangular matrix \(A\).
6.2.3. Some special properties of the characteristic polynomial#
In the proof of Proposition 6.2.3 it was mentioned that for an \(n \times n\) matrix \(A\) the coefficient of the highest power \(\lambda^n\) is equal to \((-1)^n\). In this subsection we will find expressions for two other coefficients of the characteristic polynomial. The results we mention are interesting in themselves, but they are not essential for the sequel.
Proposition 6.2.5. Suppose the characteristic polynomial of the \(n \times n\) matrix \(A\) is given by

\( \quad p_A(\lambda) = \text{det}(A - \lambda I) = c_n\lambda^n + c_{n-1}\lambda^{n-1} + \cdots + c_1\lambda + c_0. \quad \) (6.2.1)

Then the following identities hold for the coefficients \(c_n, c_{n-1}\) and \(c_0\):

\( \quad c_n = (-1)^n, \qquad c_{n-1} = (-1)^{n-1}(a_{11} + a_{22} + \cdots + a_{nn}), \quad \)

and

\( \quad c_0 = \det{A}. \quad \)
(Sketch of the) Proof of Proposition 6.2.5
For \(n=2\) we have already seen that the characteristic polynomial of the most general \(2 \times 2\) matrix \(A = \left[\begin{array}{cc} a_{11} & a_{12} \\ a_{21} & a_{22} \end{array}\right] = \left[\begin{array}{cc} a & b \\ c & d \end{array}\right] \) is given by

\( \quad p_A(\lambda) = \lambda^2 - (a+d)\lambda + (ad - bc), \quad \)

which is indeed equal to

\( \quad (-1)^2\lambda^2 + (-1)^1(a_{11} + a_{22})\lambda + \det{A}. \quad \)
The value of \(c_0\) is the easiest to establish: just plug in \(\lambda=0\) in Equation (6.2.1):

\( \quad c_0 = p_A(0) = \text{det}(A - 0\cdot I) = \det{A}. \quad \)
For the other coefficient we will not give the slightly technical argument for an \(n\times n\) matrix. The idea will be pretty much clear when we consider a general \(3 \times 3\) matrix

\( \quad A = \left[\begin{array}{ccc} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{array}\right]. \quad \)
When we expand
\( \quad \text{det}(A - \lambda I) = \left|\begin{array}{ccc} a_{11}- \lambda & a_{12} & a_{13} \\ a_{21} & a_{22}- \lambda & a_{23} \\ a_{31} & a_{32} & a_{33}- \lambda \end{array}\right|\quad \)
along its first row we find

\( \quad \text{det}(A - \lambda I) = (a_{11}-\lambda)\left|\begin{array}{cc} a_{22}-\lambda & a_{23} \\ a_{32} & a_{33}-\lambda \end{array}\right| - a_{12}\left|\begin{array}{cc} a_{21} & a_{23} \\ a_{31} & a_{33}-\lambda \end{array}\right| + a_{13}\left|\begin{array}{cc} a_{21} & a_{22}-\lambda \\ a_{31} & a_{32} \end{array}\right|. \quad \)
The highest power of \(\lambda\) coming from the second and the third terms is \(\lambda^1\), so the coefficients of \(\lambda^3\) and of \(\lambda^2\) are completely determined by the first term. A closer look at that term yields that these two coefficients in fact come from the product \((a_{11}-\lambda)(a_{22}-\lambda)(a_{33}-\lambda)\). For a general \(n\times n\) matrix \(A\) the first two coefficients also come from the ‘diagonal product’ of the complete expansion of \(\det{(A - \lambda I)}\).
Expanding this product further we see that

\( \quad (a_{11}-\lambda)(a_{22}-\lambda)(a_{33}-\lambda) = -\lambda^3 + (a_{11}+a_{22}+a_{33})\lambda^2 - (a_{11}a_{22}+a_{11}a_{33}+a_{22}a_{33})\lambda + a_{11}a_{22}a_{33}. \quad \)

Thus the coefficients \(c_3\) and \(c_2\) of \(\lambda^3\) and \(\lambda^2\) are indeed as stated, i.e.,

\( \quad c_3 = (-1)^3 = -1, \qquad c_2 = (-1)^2(a_{11}+a_{22}+a_{33}) = a_{11}+a_{22}+a_{33}. \quad \)
The sum of the diagonal entries of an \(n\times n\) matrix \(A\) is called the trace of \(A\):

\( \quad \text{tr}(A) = a_{11} + a_{22} + \cdots + a_{nn}. \quad \)

With this new terminology we can restate the second property in Proposition 6.2.5 as follows: for an \(n\times n\) matrix \(A\) the coefficient \(c_{n-1}\) of \(\lambda^{n-1}\) satisfies

\( \quad c_{n-1} = (-1)^{n-1}\,\text{tr}(A). \quad \)
Proposition 6.2.6. Let \(A\) be an \(n\times n\) matrix with \(n\) eigenvalues \(\lambda_1,\lambda_2, \ldots , \lambda_n\), where eigenvalues/zeros of multiplicity \(k\) are counted \(k\) times. Then the sum of the eigenvalues is equal to the trace of \(A\) and the product of the eigenvalues equals the determinant of \(A\). For short:

\( \quad \lambda_1 + \lambda_2 + \cdots + \lambda_n = \text{tr}(A), \qquad \lambda_1 \lambda_2 \cdots \lambda_n = \det{A}. \quad \) (6.2.2)
Proof of Proposition 6.2.6
This is more a statement about algebra, in particular about polynomials, than about linear algebra. In Section 6.4 we will see that it also holds for matrices with complex eigenvalues.
If \(A\) has \(n\) real eigenvalues \(\lambda_1, \ldots, \lambda_n\), the characteristic polynomial \(p_A(\lambda)\) of \(A\) must contain the \(n\) factors \((\lambda - \lambda_i)\). Since the ‘leading’ coefficient \(c_n = (-1)^n\), we may deduce that

\( \quad p_A(\lambda) = (-1)^n(\lambda - \lambda_1)(\lambda - \lambda_2)\cdots(\lambda - \lambda_n). \quad \)

Now we focus on the coefficient of \(\lambda^{n-1}\) and the constant term:

\( \quad p_A(\lambda) = (-1)^n\big(\lambda^n - (\lambda_1 + \cdots + \lambda_n)\lambda^{n-1} + \cdots + (-1)^n\lambda_1\lambda_2\cdots\lambda_n\big), \quad \)

so that

\( \quad c_{n-1} = (-1)^{n-1}(\lambda_1 + \cdots + \lambda_n) \quad\text{and}\quad c_0 = \lambda_1\lambda_2\cdots\lambda_n. \quad \)
Comparing this with the expressions for the coefficients we found in Proposition 6.2.5,

\( \quad c_{n-1} = (-1)^{n-1}(a_{11} + \cdots + a_{nn}) = (-1)^{n-1}\,\text{tr}(A) \quad\text{and}\quad c_0 = \det{A}, \quad \)

we readily read off the identities put forward in Equation (6.2.2).
The identity involving the sum gives an easy check on computed eigenvalues; to use the other identity as a check one has to do a bit more work (namely, compute a determinant). The next example gives an illustration.
In Example 6.2.4 we found that the eigenvalues of the matrix \(A = \begin{bmatrix} 4 & -1 & -2 \\0 & 3 & 0 \\ 1 & 2 & 1 \end{bmatrix}\) are \(\lambda_{1,3} = 3\) and \(\lambda_{2} = 2\). We see that indeed

\( \quad \lambda_1 + \lambda_2 + \lambda_3 = 3 + 2 + 3 = 8 = 4 + 3 + 1 = \text{tr}(A), \quad \)

and also

\( \quad \lambda_1 \lambda_2 \lambda_3 = 3 \cdot 2 \cdot 3 = 18 = \det{A}. \quad \)
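Such checks are easily automated; here is an illustrative NumPy snippet (names are our own):

```python
import numpy as np

# Matrix from Example 6.2.4.
A = np.array([[4.0, -1.0, -2.0],
              [0.0,  3.0,  0.0],
              [1.0,  2.0,  1.0]])

evs = np.linalg.eigvals(A)
print(evs.sum(), np.trace(A))        # both should be 8
print(evs.prod(), np.linalg.det(A))  # both should be 18 (up to rounding)
```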
The last ‘mind-blowing’ property of the characteristic polynomial we will only illustrate by an example.
Consider the matrix \(A = \begin{bmatrix} 1 & 2 \\ 4 & 5 \end{bmatrix}\). Its characteristic polynomial is computed as

\( \quad p_A(\lambda) = \text{det}(A - \lambda I) = (1-\lambda)(5-\lambda) - 8 = \lambda^2 - 6\lambda - 3. \quad \)

Now if we replace in this last expression \(\lambda\) by the matrix \(A\), where we then interpret the term \(3\) as the matrix \(3I\), we get

\( \quad A^2 - 6A - 3I = \begin{bmatrix} 9 & 12 \\ 24 & 33 \end{bmatrix} - \begin{bmatrix} 6 & 12 \\ 24 & 30 \end{bmatrix} - \begin{bmatrix} 3 & 0 \\ 0 & 3 \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}. \quad \)

So, interpreting the constant term as we did, we have for this matrix \(A\) established the property

\( \quad p_A(A) = O, \quad \)

the zero matrix.
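A quick numerical confirmation of this computation (an illustrative sketch, not part of the text):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [4.0, 5.0]])

# Evaluate p_A(A) = A^2 - 6A - 3I; we expect the zero matrix.
P = A @ A - 6 * A - 3 * np.eye(2)
print(P)  # expected: the 2x2 zero matrix
```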
That this is not a coincidence is the content of the following theorem, the proof of which is beyond the scope of this linear algebra book. (The interested reader can of course search the internet for “Cayley-Hamilton”.)
(Cayley-Hamilton)
Every square matrix \(A\) is a ‘zero’ of its own characteristic polynomial: \(p_A(A) = O\).
6.2.4. Grasple Exercises#
To find the characteristic polynomial of a \(2\times 2\) matrix \(A\).
To find the eigenvalues of a \(2\times 2\) matrix \(A\).
To find the eigenvalues of a \(3\times 3\) matrix \(A\).
To find the eigenvalues of a \(3\times 3\) matrix \(A\).
To find the eigenvalues of a \(4\times 4\) matrix \(A\).
Is an eigenvalue of a matrix \(A\) always an eigenvalue of \(A^T\)?
Finding the geometric multiplicities of the eigenvalues of a \(3\times3\) matrix \(A\).
Finding the geometric multiplicity of the eigenvalue of a \(3\times3\) matrix \(A\).
To find the eigenvalues and their multiplicities of an almost upper triangular \(4 \times 4\) matrix \(A\).
To find the geometric multiplicity of the single eigenvalue of an almost diagonal matrix \(A\).
To find the geometric multiplicities of the eigenvalues of an almost diagonal matrix \(A\).
To find the 2nd and 3rd eigenvalue of a \(3\times 3\) matrix with known determinant and trace.