8.1. Symmetric Matrices#
8.1.1. Introduction#
A matrix \(A\) is called a symmetric matrix if
Note that this definition implies that a symmetric matrix must be a square matrix.
The matrices
are symmetric. The matrices
are not symmetric.
Symmetric matrices appear in many different contexts. In statistics the covariance matrix is an example of a symmetric matrix. In engineering the so-called elastic strain matrix and the moment of inertia tensor provide examples.
The crucial thing about symmetric matrices is stated in the main theorem of this section.
Every symmetric matrix \(A\) is orthogonally diagonalizable.
By this we mean: there exist an orthogonal matrix \(Q\) and a diagonal matrix \(D\) for which
Conversely, every orthogonally diagonalizable matrix is symmetric.
This theorem is known as the Spectral Theorem for Symmetric Matrices. The word spectrum refers to the set of eigenvalues of a transformation (or matrix).
So, for a symmetric matrix an orthonormal basis of eigenvectors always exists. For the inertia tensor of a 3D body such a basis corresponds to the (perpendicular) principal axes.
Proof of the converse of Theorem 8.1.1
Recall that an orthogonal matrix is a matrix \(Q\) for which \(Q^{-1} = Q^T\).
With this reminder it becomes a one-line proof. If \(A = QDQ^{-1} = QDQ^T\),
then \(A^T = (QDQ^{-1})^T = (Q^{-1})^TD^TQ^T = (Q^T)^TD^TQ^T = QDQ^T = A\), where we used that \(D^T = D\) since \(D\) is diagonal.
We postpone the proof of the other implication until Subsection 8.1.3.
We end this introductory section with one representative example.
Let \(A\) be given by \(A = \begin{bmatrix} 1&2\\2&-2 \end{bmatrix}\).
The eigenvalues are found via
They are \(\lambda_1 = 2\) and \(\lambda_2 = -3\).
Corresponding eigenvectors are \(\mathbf{v}_1 = \begin{bmatrix} 2\\1 \end{bmatrix}\) for \(\lambda_1\), and \(\mathbf{v}_2 = \begin{bmatrix} 1\\-2 \end{bmatrix}\) for \(\lambda_2\).
The eigenvectors are orthogonal,
and \(A\) can be diagonalized as
In Figure 8.1.1 the image of the unit circle under the transformation \(\vect{x} \mapsto A\vect{x}\) is shown. Here \(\vect{q}_1\) and \(\vect{q}_2\) are two orthonormal eigenvectors.
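For readers who like to verify such computations numerically, here is a minimal sketch, assuming Python with the numpy library is available (any comparable tool works just as well):

```python
import numpy as np

# The matrix of this example, with the eigenvectors found above as the columns of P.
A = np.array([[1.0, 2.0],
              [2.0, -2.0]])
P = np.array([[2.0, 1.0],
              [1.0, -2.0]])   # columns: v1 (eigenvalue 2) and v2 (eigenvalue -3)
D = np.diag([2.0, -3.0])

print(np.allclose(A, P @ D @ np.linalg.inv(P)))  # True: A = P D P^{-1}
print(P[:, 0] @ P[:, 1])                         # 0.0: v1 and v2 are orthogonal
```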
Furthermore, if we normalize the eigenvectors, i.e., the columns of \(P\), we find the following diagonalization of \(A\) with an orthogonal matrix \(Q\):
8.1.2. The essential properties of symmetric matrices#
Suppose \(A\) is a symmetric matrix.
If \(\mathbf{v}_1\) and \(\mathbf{v}_2\) are eigenvectors of \(A\) for different eigenvalues, then \(\mathbf{v}_1\perp \mathbf{v}_2\).
Proof of Proposition 8.1.1
Suppose \(\mathbf{v}_1\) and \(\mathbf{v}_2\) are eigenvectors of the symmetric matrix \(A\) for the different eigenvalues \(\lambda_1,\lambda_2\). We want to show that \(\mathbf{v}_1 \ip \mathbf{v}_2 = 0\).
The trick is to consider the expression
On the one hand
On the other hand
Since we assumed that \(A^T = A\) we can extend the chain of identities:
So we have shown that
Since
it follows that indeed
as was to be shown.
Prove the following slight generalization of Proposition 8.1.1.
If \(\vect{u}\) is an eigenvector of \(A\) for the eigenvalue \(\lambda\), and \(\vect{v}\) is an eigenvector of \(A^T\) for a different eigenvalue \(\mu\), then \(\vect{u} \perp \vect{v}\).
Solution to Exercise 8.1.1
The proof is completely analogous to the proof of Proposition 8.1.1. Suppose
We consider the expression \(\mathbf{u}\ip A \mathbf{v} = \mathbf{u}^T A \mathbf{v}\).
On the one hand
On the other hand
Comparing (8.1.2) and (8.1.3) we can conclude that \(\mathbf{u}\ip\mathbf{v} = 0\), i.e., \(\mathbf{u}\) and \(\mathbf{v}\) are indeed orthogonal.
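A small numerical illustration of this generalization, using an ad hoc non-symmetric matrix (not taken from the text) and assuming numpy is available:

```python
import numpy as np

# An ad hoc non-symmetric matrix, chosen only for illustration.
A = np.array([[2.0, 1.0],
              [0.0, 3.0]])

u = np.array([1.0, 0.0])   # eigenvector of A   for lambda = 2:  A u   = 2 u
v = np.array([0.0, 1.0])   # eigenvector of A^T for mu     = 3:  A^T v = 3 v

print(np.allclose(A @ u, 2 * u))     # True
print(np.allclose(A.T @ v, 3 * v))   # True
print(u @ v)                         # 0.0: u and v are orthogonal
```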
All eigenvalues of symmetric matrices are real.
The easiest proof is via complex numbers. Feel free to skip it, especially if you don’t feel comfortable with complex numbers.
Proof of Proposition 8.1.2
For two vectors \(\mathbf{u},\mathbf{v}\) in \(\C^n\) we consider the expression
If we take \(\mathbf{v}\) equal to \(\mathbf{u}\) we get
where \(|u_i|\) denotes the modulus of the complex number \(u_i\). This sum of squares (of real numbers) is a non-negative real number. We also see that \(\overline{\mathbf{u}}^{T}\mathbf{u} = 0\) only holds if \(\mathbf{u} = \mathbf{0}\).
It can also be verified that
Now suppose that \(\lambda\) is an eigenvalue of the symmetric matrix \(A\), and \(\mathbf{v}\) is a nonzero (possibly complex) eigenvector of \(A\) for the eigenvalue \(\lambda\). Note that, since \(A\) is real and symmetric, \(\overline{{A}^T} = \overline{A} = A\). To prove that \(\lambda\) is real, we will show that \(\overline{\lambda} = \lambda\).
We use much the same ‘trick’ as in the proof of Proposition 8.1.1, Equation (8.1.1).
On the one hand
On the other hand,
So we have that
Since we assumed that \(\mathbf{v}\) is not the zero vector, we have \(\overline{\mathbf{v}}^T \mathbf{v} \neq 0\), and so it follows that \(\overline{\lambda} = \lambda\), which is equivalent to \(\lambda\) being real.
Let \(A = \begin{bmatrix} a&b\\b&d \end{bmatrix} \).
Then the characteristic polynomial is computed as
The discriminant of this second order polynomial is given by
The discriminant is non-negative, so the characteristic polynomial has only real roots, and consequently the eigenvalues of the matrix are real.
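The same computation can be delegated to a computer algebra system; a minimal sketch, assuming Python with the sympy library is available:

```python
from sympy import symbols, Matrix, discriminant, factor, expand

a, b, d, lam = symbols('a b d lambda', real=True)

A = Matrix([[a, b],
            [b, d]])
p = (A - lam * Matrix.eye(2)).det()   # characteristic polynomial in lambda

disc = discriminant(p, lam)
print(expand(disc))                   # a**2 - 2*a*d + 4*b**2 + d**2
print(factor(disc - 4*b**2))          # (a - d)**2, so disc = (a - d)^2 + 4 b^2 >= 0
```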
Obviously, an elementary approach like this will soon get very complicated for larger \(n \times n\) matrices.
Lastly we come to the third of three essential properties of symmetric matrices.
For each eigenvalue of a symmetric matrix the geometric multiplicity is equal to the algebraic multiplicity.
We will incorporate the proof of this proposition into the proof of the main theorem in Subsection 8.1.3. For now, we will look at a few examples.
We will verify that the symmetric matrix \(A = \begin{bmatrix} 1 & 0 & 1\\0 & 1 & 2 \\ 1 & 2 & 5 \end{bmatrix}\) is diagonalizable and has mutually orthogonal eigenvectors.
We first compute the characteristic polynomial.
Expansion along the first column gives
So \(A\) has the real eigenvalues \(\lambda_{1} = 1\), \(\lambda_2 = 6\) and \(\lambda_3 = 0\). Since all eigenvalues have algebraic multiplicity 1, the corresponding eigenvectors will give a basis of eigenvectors, and we can immediately conclude that \(A\) is diagonalizable.
The eigenvectors are found to be
We see that the three eigenvectors are mutually orthogonal, in accordance with Proposition 8.1.1.
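A quick numerical cross-check, assuming numpy is available; the eigenvectors below are one possible choice, each only determined up to a nonzero scalar:

```python
import numpy as np

A = np.array([[1.0, 0, 1],
              [0, 1, 2],
              [1, 2, 5]])

print(np.linalg.eigvalsh(A))   # approximately [0., 1., 6.]

# One possible choice of eigenvectors.
v1 = np.array([2.0, -1, 0])    # eigenvalue 1
v2 = np.array([1.0, 2, 5])     # eigenvalue 6
v3 = np.array([1.0, 2, -1])    # eigenvalue 0

for lam, v in [(1, v1), (6, v2), (0, v3)]:
    assert np.allclose(A @ v, lam * v)

print(v1 @ v2, v1 @ v3, v2 @ v3)   # 0.0 0.0 0.0: mutually orthogonal
```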
Consider the matrix \(A = \begin{bmatrix} 2&2&4\\2 & -1 & 2 \\ 4&2&2 \end{bmatrix}\).
A (rather involved) computation yields the eigenvalues \(\lambda_{1,2} = -2\) and \(\lambda_3 = 7\). Indeed all eigenvalues are real, conforming to Proposition 8.1.2.
Next we find the eigenvectors and the geometric multiplicities of the eigenvalues.
For \(\lambda = -2\) we find via row reduction
the two linearly independent eigenvectors \(\mathbf{v}_1 = \begin{bmatrix} 1 \\ 0 \\ -1\end{bmatrix}\) and \(\mathbf{v}_2 = \begin{bmatrix} 1 \\ -2 \\ 0\end{bmatrix}\). The geometric multiplicity of \(\lambda_{1,2}\) is equal to 2. The other eigenvalue has algebraic multiplicity 1, so its geometric multiplicity has to be 1 as well. With this Proposition 8.1.3 is verified.
Lastly we leave it to you to check that an eigenvector for \(\lambda_3 = 7\) is given by \(\mathbf{v}_3 = \begin{bmatrix} 2 \\ 1 \\ 2\end{bmatrix}\), and that both \(\mathbf{v}_3 \perp \mathbf{v}_1\) and \(\mathbf{v}_3 \perp \mathbf{v}_2\) hold, so that Proposition 8.1.1 is satisfied as well.
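Again a minimal numerical cross-check of these claims, assuming numpy is available:

```python
import numpy as np

A = np.array([[2.0, 2, 4],
              [2, -1, 2],
              [4, 2, 2]])

print(np.round(np.linalg.eigvalsh(A)))   # [-2. -2.  7.]

v1 = np.array([1.0, 0, -1])   # eigenvector for -2
v2 = np.array([1.0, -2, 0])   # eigenvector for -2
v3 = np.array([2.0, 1, 2])    # eigenvector for  7

assert np.allclose(A @ v1, -2 * v1)
assert np.allclose(A @ v2, -2 * v2)
assert np.allclose(A @ v3, 7 * v3)

# Eigenvectors for different eigenvalues are orthogonal (Proposition 8.1.1) ...
print(v3 @ v1, v3 @ v2)   # 0.0 0.0
# ... but v1 and v2, belonging to the same eigenvalue, need not be:
print(v1 @ v2)            # 1.0
```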
8.1.3. Orthogonal Diagonalizability of Symmetric Matrices#
Let us restate the main theorem (Theorem 8.1.1) about symmetric matrices:
A matrix \(A\) is symmetric if and only if it is orthogonally diagonalizable.
Note that this also establishes the property that for each eigenvalue of a symmetric matrix the geometric multiplicity equals the algebraic multiplicity (Proposition 8.1.3).
We will put the intricate proof at the end of the subsection, and first consider two examples.
The first example is a continuation of the earlier Example 8.1.5.
The matrix \(A = \begin{bmatrix} 2&2&4\\2 & -1 & 2 \\ 4&2&2 \end{bmatrix}\) was shown to have the eigenvalues/eigenvectors
The pairs \(\mathbf{v}_1, \mathbf{v}_3\) and \(\mathbf{v}_2, \mathbf{v}_3\) are ‘automatically’ orthogonal, since they are eigenvectors for different eigenvalues.
For the eigenspace \(E_{-2} = \Span{\mathbf{v}_1, \mathbf{v}_2}\) we can use Gram-Schmidt to get an orthogonal basis:
Normalizing the orthogonal basis \(\{\mathbf{u}_1, \mathbf{u}_2, \mathbf{v}_3\}\) and putting them side by side in a matrix yields the orthogonal matrix
The conclusion is that
One more example before we get to the proof (or before you jump ahead to Section 8.1.4).
Let the symmetric matrix \(A\) be given by \( A = \begin{bmatrix} 1 & 2 & 2 & 0 \\ 2 & -1 & 0 & 2 \\ 2 & 0 & -1 & -2 \\ 0 & 2 & -2 & 1 \end{bmatrix}\).
The hard part is to find the eigenvalues. (How does one solve an equation of degree four?) Once we know the eigenvalues, the other steps are ‘routine’.
It turns out that \(A\) has the double eigenvalues \(\lambda_{1,2} = 3\) and \(\lambda_{3,4} = -3\).
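One way to discover these eigenvalues is to let a computer algebra system factor the characteristic polynomial; a minimal sketch, assuming sympy is available:

```python
from sympy import Matrix, symbols, factor

lam = symbols('lambda')

A = Matrix([[1, 2, 2, 0],
            [2, -1, 0, 2],
            [2, 0, -1, -2],
            [0, 2, -2, 1]])

p = A.charpoly(lam).as_expr()   # det(lambda*I - A)
print(factor(p))                # (lambda - 3)**2 * (lambda + 3)**2
```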
To find the eigenvectors for the eigenvalue 3 we row reduce the matrix \((A - 3I)\).
We can read off two linearly independent eigenvectors
As in Example 8.1.6 we can construct an orthogonal basis for the eigenspace \(E_{3}\):
Likewise we can first find a ‘natural’ basis for the eigenspace \(E_{-3}\) by row reducing \(A - (-3)I = A + 3I\):
This yields the two linearly independent eigenvectors \(\vect{v}_3 = \left[\begin{array}{c} -1 \\ 1 \\ 1 \\ 0 \end{array} \right]\) and \(\vect{v}_4 = \left[\begin{array}{c} 1 \\ -2 \\ 0 \\ 1 \end{array} \right]\).
Again these can be orthogonalized, and then we find the following complete set of eigenvectors, i.e., a basis for \(\R^4\):
We conclude that \(A = QDQ^{-1}\), where
And now it’s time for the proof of the main theorem. The proof is rather technical and intricate; skip it if you like.
Proof of Theorem 8.1.1
Suppose that \(A\) is a symmetric \(n \times n\) matrix. We know there are \(n\) real eigenvalues \(\lambda_1, \lambda_2, \ldots, \lambda_n\), counted with multiplicity. Suppose \(\vect{q}_1\) is an eigenvector for \(\lambda_1\) of unit length. We can extend \(\{\vect{q}_1\}\) to an orthonormal basis \(\{\vect{q}_1,\vect{q}_2,\ldots,\vect{q}_n\}\) of \(\R^n\). Let \(Q_1\) be the matrix with columns \(\vect{q}_1,\vect{q}_2,\ldots,\vect{q}_n\).
It can be shown that \(A_1 = Q_1^{-1}AQ_1 = Q_1^TAQ_1\) is of the form
where \(B_1\) is an \((n-1)\times(n-1)\) matrix that is also symmetric.
Namely, the first column of \(A_1\) can be computed as
and \(Q_1^{-1}\vect{q}_1\) is the first column of \(Q_1^{-1}Q_1\), which is \(\vect{e}_1\).
This shows that the first column of \(A_1\) must indeed be \(\lambda_1\vect{e}_1 = \left[\begin{array}{c} \lambda_1 \\ 0 \\ \vdots \\ 0 \end{array}\right]\).
Since \(A\) is symmetric and \(Q_1\) is by construction an orthogonal matrix,
So \(A_1\) is also symmetric. Thus, since the first column of \(A_1\) contains \(n-1\) zeros below \(\lambda_1\), so does its first row.
Since \(A\) and \(A_1\) are similar, they have the same eigenvalues. It follows that \(B_1\) has the eigenvalues \(\lambda_2, \ldots, \lambda_n\).
We can apply the same construction to \(B_1\), yielding
Note that in this formula the matrices have size \((n-1)\) by \((n-1)\).
If we then define
it follows that
Continuing in this fashion we find
This proves that \(A\) is diagonalizable, with \(Q = Q_1Q_2 \cdots Q_{n-1}\) as a diagonalizing matrix.
Moreover, since the product of orthogonal matrices is orthogonal, \(A\) is in fact orthogonally diagonalizable.
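The construction in the proof can also be carried out numerically. The following minimal sketch, assuming numpy is available, performs one deflation step on the matrix of Example 8.1.5, using its unit eigenvector for the eigenvalue \(7\):

```python
import numpy as np

# The matrix of Example 8.1.5.
A = np.array([[2.0, 2, 4],
              [2, -1, 2],
              [4, 2, 2]])

# A unit eigenvector for the eigenvalue 7 (found earlier by hand).
q1 = np.array([2.0, 1, 2]) / 3

# Extend q1 to an orthonormal basis of R^3: apply QR to a full-rank matrix
# whose first column is q1; the Q-factor is an orthogonal matrix whose
# first column is q1 (possibly up to sign).
M = np.column_stack([q1, np.eye(3)[:, 1:]])
Q1, _ = np.linalg.qr(M)

A1 = Q1.T @ A @ Q1
print(np.round(A1, 10))
# Up to rounding, A1 has the block form [[7, 0, 0], [0, *, *], [0, *, *]],
# with a symmetric 2x2 block B1 in the lower right corner, as in the proof.
```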
We will illustrate the proof for the matrix
Since
we have as a starting point the eigenvalue and corresponding eigenvector
An orthogonal basis for \(\mathbb{R}^4\) starting with this first eigenvector is, for instance,
Rescaling and putting them into a matrix yields
Next we compute
This is indeed of the form stated in the proof.
We continue with the matrix \(B_1 = \left[\begin{array}{ccc} 2 & \sqrt{3} & \sqrt{2} \\ \sqrt{3} & 0 & -\sqrt{6} \\ \sqrt{2} & -\sqrt{6} & 1 \end{array} \right]\).
\(B_1\) has eigenvalue \(-3\) with eigenvector \(\vect{u}_1 = \left[\begin{array}{c} 1 \\ -\sqrt{3} \\ -\sqrt{2} \end{array} \right]\).
Again we extend to an orthogonal basis for \(\mathbb{R}^3\). For instance,
If we normalize and use them as the columns of \(\tilde{Q}_2\) as in the proof of Theorem 8.1.1, we find as second matrix in that construction
And then
indeed a diagonal matrix.
In this example the second eigenvalue \(\lambda_{3,4} = 3\) is also a double eigenvalue. Because of that, the construction takes one step fewer than in the general case.
We can rewrite the last identity as
for the matrix \(Q = Q_1Q_2\). This is the matrix
So we see that \(A\) has the ‘simpler’ eigenvectors
Note: given the eigenvalues, these eigenvectors could have been found more efficiently by solving the systems \((A - \lambda_iI)\vect{x} = \vect{0}\) and then orthogonalizing via the Gram-Schmidt procedure, as is done in Example 8.1.6.
The importance of the step-by-step reduction is that it shows that from the ‘minimal’ assumptions of symmetry and the existence of real eigenvalues it is possible to create an orthogonal diagonalization.
In the last subsection we will show how the orthogonal diagonalization can be rewritten in an interesting and meaningful way.
8.1.4. The Spectral Decomposition of a Symmetric Matrix#
Let’s take up an earlier example (Example 8.1.2) to illustrate what the spectral decomposition is about.
For the matrix \(A = \begin{bmatrix} 1&2\\2&-2 \end{bmatrix}\) we found the orthogonal diagonalization
This is of the form
We recall the column-row expansion of the matrix product. For two \(2\times 2\) matrices this reads
Applying this to the last expression for \(A = QDQ^T\) we find
The matrices
represent the orthogonal projections onto the one-dimensional subspaces \(\Span{\mathbf{q}_1}\) and \(\Span{\mathbf{q}_2}\).
Furthermore these one-dimensional subspaces are orthogonal to each other.
So we have that this symmetric matrix can be written as a linear combination of matrices that represent orthogonal projections.
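A quick numerical check of this decomposition, assuming numpy is available:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [2.0, -2.0]])

# Orthonormal eigenvectors from Example 8.1.2 (eigenvalues 2 and -3).
q1 = np.array([2.0, 1.0]) / np.sqrt(5)
q2 = np.array([1.0, -2.0]) / np.sqrt(5)

# Rank-one matrices projecting onto Span{q1} and Span{q2}.
P1 = np.outer(q1, q1)
P2 = np.outer(q2, q2)

print(np.allclose(A, 2 * P1 - 3 * P2))                     # True: spectral decomposition
print(np.allclose(P1 @ P1, P1), np.allclose(P2 @ P2, P2))  # True True: projections
print(np.allclose(P1 @ P2, np.zeros((2, 2))))              # True: mutually orthogonal ranges
```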
The construction we performed in the last example can be generalized; this is the content of the last theorem in this section.
(Spectral Decomposition of Symmetric Matrices)
Every \(n \times n\) symmetric matrix \(A\) is the linear combination
of \(n\) matrices \(P_i\) that represent orthogonal projections onto one-dimensional subspaces that are mutually orthogonal.
Formula (8.1.4) is referred to as a spectral decomposition of the matrix \(A\).
Proof of Theorem 8.1.2
For a general \(n\times n\) symmetric matrix \(A\), there exists an orthogonal diagonalization
Exactly as in Example 8.1.9 we can use the column-row expansion of the matrix product to derive
where the vectors \(\mathbf{q}_i\) of course are the (orthonormal) columns of the diagonalizing matrix \(Q\). This is indeed a linear combination of orthogonal projections, as was to be shown.
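The proof translates directly into a short computation. The sketch below, assuming numpy (the function name spectral_terms and the random test matrix are just for illustration), builds the rank-one terms \(\lambda_i \mathbf{q}_i\mathbf{q}_i^T\) from an orthogonal diagonalization and checks that they add up to \(A\):

```python
import numpy as np

def spectral_terms(A):
    """Return the rank-one terms lambda_i * q_i q_i^T of a symmetric matrix A."""
    lams, Q = np.linalg.eigh(A)   # orthonormal eigenvectors in the columns of Q
    return [lam * np.outer(Q[:, i], Q[:, i]) for i, lam in enumerate(lams)]

# Quick check on an arbitrary symmetric matrix.
rng = np.random.default_rng(1)
B = rng.normal(size=(4, 4))
A = B + B.T                       # symmetric by construction
print(np.allclose(sum(spectral_terms(A)), A))   # True: A = sum of lambda_i * P_i
```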
The eigenvalues of the matrix \(A=\begin{bmatrix} 2 & 1 & 0 \\ 1 & 3 & 1\\ 0 & 1& 2 \end{bmatrix}\) are 1, 2 and 4.
Find the spectral decomposition of \(A\).
If in Theorem 8.1.2 the projections onto eigenvectors for the same eigenvalue are grouped together, then the following alternative form of the spectral decomposition results.
(Spectral Theorem, alternative version)
Every symmetric \(n \times n\) matrix \(A\) can be written as a linear combination of the orthogonal projections onto its (orthogonal) eigenspaces.
where \(P_i\) denotes the orthogonal projection onto the eigenspace \(E_{\lambda_i}\).
Proof of Corollary 8.1.1
We know that
If all eigenvalues \(\lambda_1, \ldots, \lambda_n\) are different, this is exactly the statement of the corollary.
If \(\lambda_i\) is an eigenvalue of multiplicity \(m\) with \(m\) orthonormal eigenvectors \(\vect{q}_1, \ldots, \vect{q}_m\), then
\(P_i = Q_iQ_i^T\) is precisely the orthogonal projection onto the eigenspace \(E_{\lambda_i}\).
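Numerically, this grouping amounts to collecting the columns of \(Q\) per eigenvalue; a minimal sketch, assuming numpy, applied to the matrix of Example 8.1.5:

```python
import numpy as np

A = np.array([[2.0, 2, 4],
              [2, -1, 2],
              [4, 2, 2]])           # eigenvalues -2 (twice) and 7

lams, Q = np.linalg.eigh(A)         # orthonormal eigenvectors in the columns of Q

# Group the columns of Q per (rounded) eigenvalue and form P_i = Q_i Q_i^T.
projections = {}
for lam in np.unique(np.round(lams)):
    Qi = Q[:, np.isclose(np.round(lams), lam)]
    projections[lam] = Qi @ Qi.T

A_rebuilt = sum(lam * P for lam, P in projections.items())
print(np.allclose(A_rebuilt, A))                           # True
print(np.allclose(sum(projections.values()), np.eye(3)))   # True: the projections add up to I
```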
The following example provides an illustration.
For the matrix \(A = \begin{bmatrix} 1 & 2 & 2 & 0 \\ 2 & -1 & 0 & 2 \\ 2 & 0 & -1 & -2 \\ 0 & 2 & -2 & 1 \end{bmatrix}\) we had already found the orthogonal diagonalization \(A = QDQ^{-1}= QDQ^T\) with
and
The spectral decomposition according to Corollary 8.1.1 then becomes
8.1.5. Grasple Exercises#
To check whether a matrix \(A\) is symmetric.
Show/Hide Content
To check whether a matrix \(A\) is orthogonal. And, if it is, to give its inverse.
Show/Hide Content
To check whether a matrix \(A\) is orthogonal. And, if it is, to give its inverse.
Show/Hide Content
To give an orthogonal diagonalization of a (2x2) matrix.
Show/Hide Content
To give an orthogonal diagonalization of a (2x2) matrix.
Show/Hide Content
To give an orthogonal diagonalization of a (3x3) matrix.
Show/Hide Content
To give an orthogonal diagonalization of a (3x3) matrix.
Show/Hide Content
To give an orthogonal diagonalization of a (3x3) matrix.
Show/Hide Content
To give an orthogonal diagonalization of a (3x3) matrix.
Show/Hide Content
To give an orthogonal diagonalization of a (4x4) matrix.
Show/Hide Content
One step in an orthogonal diagonalization (as in the proof of the existence of an orthogonal diagonalization)
Show/Hide Content
Sequel to previous question, now for a 4x4 matrix
Show/Hide Content
To give an example of a symmetric 2x2 matrix with one eigenvalue and one eigenvector given.
Show/Hide Content
To give an example of a 3x3 symmetric matrix with given eigenvalues and eigenspace.
Show/Hide Content
Deciding about the spectral decomposition of a 3x3 matrix (with a lot of prerequisites laid out).
Show/Hide Content
The following exercises have a more theoretical flavour.
To think about symmetric versus orthogonally diagonalizable. (true/false questions)
Show/Hide Content
About the (non-)symmetry of \(A + A^T\) and \(A - A^T\).
Show/Hide Content
About the (non-)symmetry of products.
Show/Hide Content
If \(A\) and \(B\) are symmetric, what about \(A^2\), \(A^{-1}\) and \(AB\)?
Show/Hide Content
True or false: if \(A\) is symmetric, then \(A^2\) has nonnegative eigenvalues. (And what if \(A\) is not symmetric?)