8.1. Symmetric Matrices#

8.1.1. Introduction#

Definition 8.1.1

A matrix \(A\) is called a symmetric matrix if

\[ A^T = A. \]

Note that this definition implies that a symmetric matrix must be a square matrix.

Example 8.1.1

The matrices

\[\begin{split} A_1 = \begin{bmatrix} 2&\color{blue}3&\color{red}4\\\color{blue}3&1&\color{green}5 \\\color{red}4&\color{green}5&7 \end{bmatrix} \quad \text{and} \quad A_2 = \begin{bmatrix} 0&2&3&4\\ 2&0&1&5 \\ 3&1&0&6 \\ 4&5&6&7\end{bmatrix} \end{split}\]

are symmetric. The matrices

\[\begin{split} A_3 = \begin{bmatrix} 2&3&4\\2&3&4 \\ 2&3&4 \end{bmatrix} \quad \text{and} \quad A_4 = \begin{bmatrix} 0&2&3&0\\ 2&0&1&0 \\ 3&1&0&0 \\ \end{bmatrix} \end{split}\]

are not symmetric.
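
For readers who want to experiment: symmetry is easy to check numerically. The following sketch (assuming Python with NumPy is available) simply tests whether a matrix equals its own transpose.

```python
import numpy as np

A1 = np.array([[2, 3, 4],
               [3, 1, 5],
               [4, 5, 7]])
A3 = np.array([[2, 3, 4],
               [2, 3, 4],
               [2, 3, 4]])

# A matrix is symmetric precisely when it equals its own transpose.
print(np.array_equal(A1, A1.T))   # True:  A1 is symmetric
print(np.array_equal(A3, A3.T))   # False: A3 is not
```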

Symmetric matrices appear in many different contexts. In statistics the covariance matrix is an example of a symmetric matrix. In engineering the so-called elastic strain matrix and the moment of inertia tensor provide examples.

The crucial thing about symmetric matrices is stated in the main theorem of this section.

Theorem 8.1.1

Every symmetric matrix \(A\) is orthogonally diagonalizable.

By this we mean: there exist an orthogonal matrix \(Q\) and a diagonal matrix \(D\) for which

\[ A = QDQ^{-1} = QDQ^T. \]

Conversely, every orthogonally diagonalizable matrix is symmetric.

This theorem is known as the Spectral Theorem for Symmetric Matrices. In other contexts the word 'spectrum' is used for the set of eigenvalues of a transformation.

So, for a symmetric matrix an orthonormal basis of eigenvectors always exists. For the inertia tensor of a 3D body such a basis corresponds to the (perpendicular) principal axes.

Proof of the converse of Theorem 8.1.1.

Recall that an orthogonal matrix is a matrix \(Q\) for which \(Q^{-1} = Q^T\).

With this reminder, it is just a one-line proof. If \(A = QDQ^{-1} = QDQ^T\),

then \(A^T = (QDQ^{-1} )^T = (Q^{-1} )^TD^TQ^T = (Q^T)^TD^TQ^T = QDQ^T = A\).

We postpone the proof of the other implication to Subsection 8.1.3.

We end this introductory section with one representative example.

Example 8.1.2

Let \(A\) be given by \(A = \begin{bmatrix} 1&2\\2&-2 \end{bmatrix}\).

The eigenvalues are found via

\[\begin{split} \det{(A - \lambda I)} = \begin{vmatrix} 1-\lambda&2\\2&-2-\lambda \end{vmatrix} = (1-\lambda)(-2-\lambda) -4 = \lambda^2 +\lambda -6 = (\lambda-2)(\lambda+3) . \end{split}\]

They are \(\lambda_1 = 2\) and \(\lambda_2 = -3\).

Corresponding eigenvectors are \(\mathbf{v}_1 = \begin{bmatrix} 2\\1 \end{bmatrix}\) for \(\lambda_1\), and \(\mathbf{v}_2 = \begin{bmatrix} 1\\-2 \end{bmatrix}\) for \(\lambda_2\).

The eigenvectors are orthogonal,

\[\begin{split} \mathbf{v}_1 \ip \mathbf{v}_2 = \begin{bmatrix} 2\\1 \end{bmatrix}\ip \begin{bmatrix} 1\\-2 \end{bmatrix} = 2 - 2 = 0, \end{split}\]

and \(A\) can be diagonalized as

\[\begin{split} A = PDP^{-1} = \begin{bmatrix}2&1\\1&-2 \end{bmatrix}\begin{bmatrix}2 & 0\\0& -3 \end{bmatrix} \begin{bmatrix}2&1\\1&-2 \end{bmatrix}^{-1}. \end{split}\]

In Figure 8.1.1 the image of the unit circle under the transformation \(\vect{x} \mapsto A\vect{x}\) is shown. \(\vect{q}_1\) and \(\vect{q}_2\) are two orthonormal eigenvectors.

../_images/Fig-SymmetricMat-Evectors.svg

Fig. 8.1.1 The transformation \(T(\vect{x}) = \begin{bmatrix} 1&2\\2&-2 \end{bmatrix}\vect{x}\).#

Furthermore, if we normalize the eigenvectors, i.e., the columns of \(P\), we find the following diagonalization of \(A\) with an orthogonal matrix \(Q\):

\[\begin{split} A = QDQ^{-1} = \begin{bmatrix}2/\sqrt{5}&1/\sqrt{5}\\1/\sqrt{5}&-2/\sqrt{5} \end{bmatrix}\begin{bmatrix}2 & 0\\0& -3 \end{bmatrix} \begin{bmatrix}2/\sqrt{5}&1/\sqrt{5}\\1/\sqrt{5}&-2/\sqrt{5} \end{bmatrix}^{-1}. \end{split}\]
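
As a numerical cross-check of this example, the following sketch (assuming Python with NumPy) reproduces the orthogonal diagonalization; `np.linalg.eigh` is NumPy's routine for symmetric matrices and returns real eigenvalues together with orthonormal eigenvectors.

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [2.0, -2.0]])

# eigh is meant for symmetric matrices: real eigenvalues (in ascending order)
# and orthonormal eigenvectors in the columns of Q.
eigvals, Q = np.linalg.eigh(A)
D = np.diag(eigvals)

print(eigvals)                              # [-3.  2.]
print(np.allclose(Q.T @ Q, np.eye(2)))      # Q is orthogonal: True
print(np.allclose(Q @ D @ Q.T, A))          # A = Q D Q^T:     True
```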

8.1.2. The essential properties of symmetric matrices#

Proposition 8.1.1

Suppose \(A\) is a symmetric matrix.

If \(\mathbf{v}_1\) and \(\mathbf{v}_2\) are eigenvectors of \(A\) for different eigenvalues, then \(\mathbf{v}_1\perp \mathbf{v}_2\).

Proof. Suppose \(\mathbf{v}_1\) and \(\mathbf{v}_2\) are eigenvectors of the symmetric matrix \(A\) for the different eigenvalues \(\lambda_1,\lambda_2\). We want to show that \(\mathbf{v}_1 \ip \mathbf{v}_2 = 0\).

The trick is to consider the expression

(8.1.1)#\[(A\mathbf{v}_1) \ip \mathbf{v}_2.\]

On the one hand

\[ (A\mathbf{v}_1) \ip \mathbf{v}_2 = (\lambda_1\mathbf{v}_1) \ip \mathbf{v}_2 = \lambda_1(\mathbf{v}_1 \ip \mathbf{v}_2). \nonumber \]

On the other hand

\[ (A\mathbf{v}_1) \ip \mathbf{v}_2 = (A\mathbf{v}_1)^T \mathbf{v}_2 =\mathbf{v}_1^TA^T \mathbf{v}_2. \nonumber \]

Since we assumed that \(A^T = A\) we can extend the chain of identities:

\[ \mathbf{v}_1^TA^T \mathbf{v}_2 = \mathbf{v}_1^T A \mathbf{v}_2 =\mathbf{v}_1^T (A \mathbf{v}_2) = \mathbf{v}_1^T (\lambda_2 \mathbf{v}_2) = \lambda_2(\mathbf{v}_1^T \mathbf{v}_2) = \lambda_2(\mathbf{v}_1 \ip \mathbf{v}_2). \nonumber \]

So we have shown that

\[ (A\mathbf{v}_1) \ip \mathbf{v}_2 = \lambda_1(\mathbf{v}_1 \ip \mathbf{v}_2) = \lambda_2(\mathbf{v}_1 \ip \mathbf{v}_2). \nonumber \]

Since

\[ \lambda_1 \neq \lambda_2, \nonumber \]

it follows that indeed

\[ \mathbf{v}_1\ip \mathbf{v}_2 = 0,\]

as was to be shown.

Exercise 8.1.1

Prove the following slight generalization of Proposition 8.1.1.

If \(\vect{u}\) is an eigenvector of \(A\) for the eigenvalue \(\lambda\), and \(\vect{v}\) is an eigenvector of \(A^T\) for a different eigenvalue \(\mu\), then \(\vect{u} \perp \vect{v}\).

Solution to Exercise 8.1.1 (click to show)

The proof is completely analogous to the proof of Proposition 8.1.1. Suppose

\[ A\mathbf{u} = \lambda\mathbf{u},\quad A^T\mathbf{v} = \mu\mathbf{v},\quad\text{ where} \,\,\,\lambda \neq \mu. \]

We consider the expression \(\mathbf{u} \ip (A^T \mathbf{v}) = \mathbf{u}^T A^T \mathbf{v}\).

On the one hand

(8.1.2)#\[ \mathbf{u}\ip (A^T \mathbf{v}) = \mathbf{u}^T (A^T\mathbf{v}) = \mathbf{u}^T \mu \mathbf{v} = \mu\, \mathbf{u}^T\mathbf{v} = \mu (\mathbf{u}\ip\mathbf{v}).\]

On the other hand

(8.1.3)#\[ \mathbf{u}\ip (A^T \mathbf{v}) = \mathbf{u}^T A^T \mathbf{v} = (A\mathbf{u})^T\mathbf{v} = \lambda\, \mathbf{u}^T\mathbf{v} = \lambda (\mathbf{u}\ip\mathbf{v}).\]

Comparing (8.1.2) and (8.1.3) we see that \((\mu - \lambda)(\mathbf{u}\ip\mathbf{v}) = 0\). Since \(\lambda \neq \mu\), we can conclude that \(\mathbf{u}\ip\mathbf{v} = 0\), i.e., \(\mathbf{u}\) and \(\mathbf{v}\) are indeed orthogonal.
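
A small numerical illustration of this exercise, with an ad hoc non-symmetric matrix (a sketch, assuming Python with NumPy):

```python
import numpy as np

# A non-symmetric matrix with two different (real) eigenvalues 2 and 3.
A = np.array([[2.0, 1.0],
              [0.0, 3.0]])

u = np.array([1.0, 0.0])   # eigenvector of A   for lambda = 2
v = np.array([0.0, 1.0])   # eigenvector of A^T for mu     = 3

print(np.allclose(A @ u, 2 * u))     # True
print(np.allclose(A.T @ v, 3 * v))   # True
print(u @ v)                         # 0.0: u and v are orthogonal
```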

Proposition 8.1.2

All eigenvalues of symmetric matrices are real.

The easiest proof is via complex numbers. Feel free to skip it, in particular when you don’t feel comfortable with complex numbers.

Proof of Proposition 8.1.2

For two vectors \(\mathbf{u},\mathbf{v}\) in \(\C^n\) we consider the expression

\[ \overline{\mathbf{u}}^{T}\mathbf{v} = \overline{u_1}v_1 + \ldots + \overline{u_n}v_n. \]

If we take \(\mathbf{v}\) equal to \(\mathbf{u}\) we get

\[ \overline{\mathbf{u}}^{T}\mathbf{u} = \overline{u_1}u_1 + \overline{u_2}u_2 + \ldots + \overline{u_n}u_n = |u_1|^2 + |u_2|^2 + \ldots + |u_n|^2, \]

where \(|u_i|\) denotes the modulus of the complex number \(u_i\). This sum of squares (of real numbers) is a non-negative real number. We also see that \(\overline{\mathbf{u}}^{T}\mathbf{u} = 0\) only holds if \(\mathbf{u} = \mathbf{0}\).

It can also be verified that

\[ \overline{\overline{\mathbf{u}}^{T}\mathbf{v}} = \overline{\mathbf{v}}^T \mathbf{u}. \]

Now suppose that \(\lambda\) is an eigenvalue of the symmetric matrix \(A\), and \(\mathbf{v}\) is a nonzero (possibly complex) eigenvector of \(A\) for the eigenvalue \(\lambda\). Note that, since \(A\) is real and symmetric, \(\overline{{A}^T} = \overline{A} = A\). To prove that \(\lambda\) is real, we will show that \(\overline{\lambda} = \lambda\).

We use much the same 'trick' as in Equation (8.1.1) in the proof of Proposition 8.1.1.
On the one hand

\[ \overline{(A \mathbf{v})^T} \mathbf{v} = \overline{\lambda\mathbf{v}^T} \mathbf{v} = \overline{\lambda} \overline{\mathbf{v}}^T \mathbf{v}. \]

On the other hand,

\[ \overline{(A \mathbf{v})^T} \mathbf{v} = \overline{\mathbf{v}^T A^T}\mathbf{v} = \overline{\mathbf{v}^T}\,\overline{{A}^T} \mathbf{v} = \overline{\mathbf{v}}^T\,\overline{A} \mathbf{v} = \overline{\mathbf{v}}^T A\mathbf{v} = \overline{\mathbf{v}}^T \lambda\mathbf{v} = \lambda\overline{\mathbf{v}}^T \mathbf{v}. \]

So we have that

\[ \overline{\lambda} \overline{\mathbf{v}}^T \mathbf{v} = \lambda\overline{\mathbf{v}}^T \mathbf{v}. \]

Since we assumed that \(\mathbf{v}\) is not the zero vector, we have that \(\overline{\mathbf{v}}^T \mathbf{v} \neq 0\), and so it follows that \( \overline{\lambda} =\lambda\), which is equivalent to \(\lambda\) being real.
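
Numerically, the realness of the spectrum is easy to observe. In the sketch below (assuming Python with NumPy) we build a random symmetric matrix and feed it to the general eigenvalue routine, which knows nothing about symmetry; all imaginary parts it returns are (numerically) zero.

```python
import numpy as np

rng = np.random.default_rng(1)
B = rng.standard_normal((5, 5))
A = B + B.T                      # B + B^T is always symmetric

eigvals = np.linalg.eig(A)[0]    # general routine, unaware of symmetry
print(np.max(np.abs(np.imag(eigvals))))   # 0.0 (up to rounding)
```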

Example 8.1.3

Let \(A = \begin{bmatrix} a&b\\b&d \end{bmatrix} \).

Then the characteristic polynomial is computed as

\[\begin{split} \begin{vmatrix} a-\lambda&b\\b&d-\lambda \end{vmatrix} = (a-\lambda)(d-\lambda) - b^2 = \lambda^2 - (a+d)\lambda + ad - b^2. \nonumber \end{split}\]

The discriminant of this second order polynomial is given by

\[ D = (a+d)^2 -4(ad -b^2) = a^2+d^2 - 2ad + 4b^2 = (a-d)^2 + 4b^2 \geq 0. \nonumber \]

The discriminant is non-negative, so the characteristic polynomial has only real roots, and consequently the eigenvalues of the matrix are real.

Obviously, an elementary approach like this will soon get very complicated for larger \(n \times n\) matrices.

Lastly we come to the third of three essential properties of symmetric matrices.

Proposition 8.1.3

For each eigenvalue of a symmetric matrix the geometric multiplicity is equal to the algebraic multiplicity.

We will incorporate the proof of this proposition into the proof of the main theorem in Subsection 8.1.3. For now, we will look at a few examples.

Example 8.1.4

We will verify that the symmetric matrix \(A = \begin{bmatrix} 1 & 0 & 1\\0 & 1 & 2 \\ 1 & 2 & 5 \end{bmatrix}\) is diagonalizable and has mutually orthogonal eigenvectors.

We first compute the characteristic polynomial.

Expansion along the first column gives

\[\begin{split} \begin{array}{rcl} \begin{vmatrix} 1-\lambda & 0 & 1\\0 & 1-\lambda & 2 \\ 1 & 2 & 5-\lambda \end{vmatrix} &=& (1-\lambda)\begin{vmatrix} 1-\lambda & 2 \\ 2 & 5-\lambda\end{vmatrix} + 1\cdot\begin{vmatrix} 0 & 1 \\ 1-\lambda & 2\end{vmatrix} \\ &=& (1-\lambda)\big((1-\lambda)(5-\lambda) -4 \big)- (1-\lambda) \\ &=& (1-\lambda) (\lambda^2-6\lambda) = (1-\lambda) (\lambda-6)\lambda. \end{array} \nonumber \end{split}\]

So \(A\) has the real eigenvalues \(\lambda_{1} = 1\), \(\lambda_2 = 6\) and \(\lambda_3 = 0\). Since all eigenvalues have algebraic multiplicity 1, the corresponding eigenvectors will give a basis of eigenvectors, and we can immediately conclude that \(A\) is diagonalizable.

The eigenvectors are found to be

\[\begin{split} \mathbf{v}_1 = \begin{bmatrix} 2 \\-1 \\ 0 \end{bmatrix} \text{ for } \lambda_1 = 1, \quad \mathbf{v}_2 = \begin{bmatrix} 1 \\ 2 \\ 5 \end{bmatrix} \text{ for } \lambda_2, \quad \mathbf{v}_3 = \begin{bmatrix} 1 \\ 2 \\ -1 \end{bmatrix} \text{ for } \lambda_3. \nonumber \end{split}\]

We see that the three eigenvectors are mutually orthogonal, in accordance with Proposition 8.1.1.

Example 8.1.5

Consider the matrix \(A = \begin{bmatrix} 2&2&4\\2 & -1 & 2 \\ 4&2&2 \end{bmatrix}\).

A (rather involved) computation yields the eigenvalues \(\lambda_{1,2} = -2\) and \(\lambda_3 = 7\). Indeed all eigenvalues are real, conforming to Proposition 8.1.2.

Next we find the eigenvectors and the geometric multiplicities of the eigenvalues.

For \(\lambda = -2\) we find via row reduction

\[\begin{split} [A - (-2)I\,|\,\mathbf{0}] = \left[\begin{array}{ccc|c} 4&2&4&0\\2 & 1 & 2 &0\\ 4&2&4&0\end{array}\right] \sim \left[\begin{array}{ccc|c} 2&1&2&0\\0&0&0&0 \\0&0&0&0\end{array}\right] \nonumber \end{split}\]

the two linearly independent eigenvectors \(\mathbf{v}_1 = \begin{bmatrix} 1 \\ 0 \\ -1\end{bmatrix}\) and \(\mathbf{v}_2 = \begin{bmatrix} 1 \\ -2 \\ 0\end{bmatrix}\). The geometric multiplicity of \(\lambda_{1,2}\) is equal to 2. The other eigenvalue has algebraic multiplicity 1, so its geometric multiplicity has to be 1 as well. With this, Proposition 8.1.3 is verified.

Lastly we leave it to you to check that an eigenvector for \(\lambda_3 = 7\) is given by \(\mathbf{v}_3 = \begin{bmatrix} 2 \\ 1 \\ 2\end{bmatrix}\), and that both \(\mathbf{v}_3 \perp \mathbf{v}_1\) and \(\mathbf{v}_3 \perp \mathbf{v}_2\), so that Proposition 8.1.1 is satisfied as well.
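
The multiplicities in this example can also be checked numerically. In the sketch below (assuming Python with NumPy) the geometric multiplicity of \(\lambda\) is computed as \(3\) minus the rank of \(A - \lambda I\).

```python
import numpy as np

A = np.array([[2.0, 2.0, 4.0],
              [2.0, -1.0, 2.0],
              [4.0, 2.0, 2.0]])
n = A.shape[0]

# geometric multiplicity = dim Nul(A - lambda I) = n - rank(A - lambda I)
for lam in (-2.0, 7.0):
    print(lam, n - np.linalg.matrix_rank(A - lam * np.eye(n)))
# -2.0 2
#  7.0 1
```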

8.1.3. Orthogonal Diagonalizability of Symmetric Matrices#

Let us restate the main theorem (Theorem 8.1.1) about symmetric matrices:

A matrix \(A\) is symmetric if and only if it is orthogonally diagonalizable.

Note that this also establishes the property that for each eigenvalue of a symmetric matrix the geometric multiplicity equals the algebraic multiplicity (Proposition 8.1.3).

We will put the intricate proof at the end of the subsection, and first consider two examples.

The first example is a continuation of the earlier Example 8.1.5.

Example 8.1.6

The matrix \(A = \begin{bmatrix} 2&2&4\\2 & -1 & 2 \\ 4&2&2 \end{bmatrix}\) was shown to have the eigenvalues/eigenvectors

\[\begin{split} \lambda_{1,2} = -2, \quad \mathbf{v}_1 = \begin{bmatrix} 1 \\ 0 \\ -1\end{bmatrix}, \, \mathbf{v}_2 = \begin{bmatrix} 1 \\ -2 \\ 0\end{bmatrix}, \quad \lambda_3 = 7, \quad \mathbf{v}_3 = \begin{bmatrix} 2 \\ 1 \\ 2\end{bmatrix}. \end{split}\]

The pairs \(\mathbf{v}_1, \mathbf{v}_3\) and \(\mathbf{v}_2, \mathbf{v}_3\) are ‘automatically’ orthogonal.

For the eigenspace \(E_{-2} = \Span{\mathbf{v}_1, \mathbf{v}_2}\) we can use Gram-Schmidt to get an orthogonal basis:

\[\begin{split} \mathbf{u}_1 = \mathbf{v}_1, \quad \mathbf{u}_2 = \mathbf{v}_2 - \dfrac{\mathbf{v}_2 \ip \mathbf{u}_1}{\mathbf{u}_1 \ip \mathbf{u}_1} \mathbf{u}_1 = \dfrac12\begin{bmatrix} 1 \\ -4 \\ 1\end{bmatrix}. \end{split}\]

Normalizing the orthogonal basis \(\{\mathbf{u}_1, \mathbf{u}_2, \mathbf{v}_3\}\) and putting them side by side in a matrix yields the orthogonal matrix

\[\begin{split} Q = \begin{bmatrix} \dfrac{1}{\sqrt{2}} & \dfrac{1}{\sqrt{18}} & \dfrac{2}{3} \\ 0 & \dfrac{-4}{\sqrt{18}} & \dfrac{1}{3}\\ \dfrac{-1}{\sqrt{2}} & \dfrac{1}{\sqrt{18}} & \dfrac{2}{3} \end{bmatrix}. \end{split}\]

The conclusion becomes that

\[\begin{split} A = QDQ^{-1} = QDQ^T, \quad \text{where still} \,\,\, D = \begin{bmatrix} -2 & 0 & 0 \\ 0 & -2 & 0 \\ 0 & 0 & 7\end{bmatrix}. \end{split}\]
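
This factorization can be confirmed numerically; the sketch below (assuming Python with NumPy) builds \(Q\) from the normalized vectors \(\mathbf{u}_1, \mathbf{u}_2, \mathbf{v}_3\) and checks both claims.

```python
import numpy as np

A = np.array([[2.0, 2.0, 4.0],
              [2.0, -1.0, 2.0],
              [4.0, 2.0, 2.0]])
Q = np.column_stack([
    np.array([1.0, 0.0, -1.0]) / np.sqrt(2),    # u1, normalized
    np.array([1.0, -4.0, 1.0]) / np.sqrt(18),   # u2, normalized
    np.array([2.0, 1.0, 2.0]) / 3.0,            # v3, normalized
])
D = np.diag([-2.0, -2.0, 7.0])

print(np.allclose(Q.T @ Q, np.eye(3)))   # Q is orthogonal: True
print(np.allclose(Q @ D @ Q.T, A))       # A = Q D Q^T:     True
```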

One more example before we get to the proof (or you jump over to Section 8.1.4).

Example 8.1.7

Let the symmetric matrix \(A\) be given by \( A = \begin{bmatrix} 1 & 2 & 2 & 0 \\ 2 & -1 & 0 & 2 \\ 2 & 0 & -1 & -2 \\ 0 & 2 & -2 & 1 \end{bmatrix}\).

The hard part is to find the eigenvalues (i.e., how do we solve an equation of degree four?). Once we know the eigenvalues, the other steps are 'routine'.

It turns out that \(A\) has the double eigenvalues \(\lambda_{1,2} = 3\) and \(\lambda_{3,4} = -3\).

To find the eigenvectors for the eigenvalue 3 we row reduce the matrix \((A - 3I)\).

\[\begin{split} \left[\begin{array}{cccc}1-3 & 2 & 2 & 0\\ 2 & -1-3 & 0 & 2 \\ 2 & 0 & -1-3 & -2 \\ 0 & 2 & -2 & 1-3 \end{array} \right] \,\, \sim \,\,\ldots\,\, \sim \,\, \left[\begin{array}{cccc}1 & 0 & -2 & -1\\ 0 & 1 & -1 & -1 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{array} \right]. \end{split}\]

We can read off two linearly independent eigenvectors

\[\begin{split} \vect{v}_1 = \left[\begin{array}{c} 1 \\ 1 \\ 0 \\ 1 \end{array} \right], \quad \vect{v}_2 = \left[\begin{array}{c} 2 \\ 1 \\ 1 \\ 0 \end{array} \right]. \end{split}\]

As in Example 8.1.6 we can construct an orthogonal basis for the eigenspace \(E_{3}\):

\[\begin{split} \mathbf{u}_1 = \mathbf{v}_1, \quad \mathbf{u}_2 = \mathbf{v}_2 - \dfrac{\mathbf{v}_2 \ip \mathbf{u}_1}{\mathbf{u}_1 \ip \mathbf{u}_1} \mathbf{u}_1 = \begin{bmatrix} 1 \\ 0 \\ 1\\ -1\end{bmatrix} \end{split}\]

Likewise we can first find a ‘natural’ basis for the eigenspace \(E_{-3}\) by row reducing \((A - (-3I))\):

\[\begin{split} (A - (-3I)) = \left[\begin{array}{cccc}4 & 2 & 2 & 0\\ 2 & 2 & 0 & 2 \\ 2 & 0 & 2 & -2 \\ 0 & 2 & -2 & 4 \end{array} \right] \quad \sim \ldots \sim \quad \left[\begin{array}{cccc}1 & 0 & 1 & -1\\ 0 & 1 & -1 & 2 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{array} \right]. \end{split}\]

Two independent eigenvectors: \(\vect{v}_3 = \left[\begin{array}{c} -1 \\ 1 \\ 1 \\ 0 \end{array} \right]\) and \(\vect{v}_4 = \left[\begin{array}{c} 1 \\ -2 \\ 0 \\ 1 \end{array} \right]\).

Again these can be orthogonalized, and then we find the following complete set of eigenvectors, i.e., a basis for \(\R^4\):

\[\begin{split} \vect{u}_1 = \begin{bmatrix} 1 \\ 1 \\ 0\\ 1\end{bmatrix}, \quad \vect{u}_2 = \begin{bmatrix} 1 \\ 0 \\ 1\\ -1\end{bmatrix}, \quad \vect{u}_3 = \begin{bmatrix} -1 \\ 1 \\ 1\\ 0\end{bmatrix}, \quad \vect{u}_4 = \begin{bmatrix} 0 \\ -1 \\ 1 \\ 1\end{bmatrix}. \end{split}\]

We conclude that \(A = QDQ^{-1}\), where

\[\begin{split} D = \left[\begin{array}{cccc}3 & 0 & 0 & 0\\ 0 & 3 & 0 & 0 \\ 0 & 0 & -3 & 0 \\ 0 & 0 & 0 & -3 \end{array} \right], \quad Q = \dfrac{1}{\sqrt{3}} \left[\begin{array}{cccc}1 & 1 & -1 & 0\\ 1 & 0 & 1 & -1 \\ 0 & 1 & 1 & 1 \\ 1 & -1 & 0 & 1 \end{array} \right]. \end{split}\]
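
Again the result can be verified with a few lines of code (a sketch, assuming Python with NumPy):

```python
import numpy as np

A = np.array([[1.0, 2.0, 2.0, 0.0],
              [2.0, -1.0, 0.0, 2.0],
              [2.0, 0.0, -1.0, -2.0],
              [0.0, 2.0, -2.0, 1.0]])
Q = np.array([[1.0, 1.0, -1.0, 0.0],
              [1.0, 0.0, 1.0, -1.0],
              [0.0, 1.0, 1.0, 1.0],
              [1.0, -1.0, 0.0, 1.0]]) / np.sqrt(3)
D = np.diag([3.0, 3.0, -3.0, -3.0])

print(np.allclose(Q.T @ Q, np.eye(4)))   # Q is orthogonal: True
print(np.allclose(Q @ D @ Q.T, A))       # A = Q D Q^T:     True
```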

And now it is time for the proof of the main theorem. The proof is technical and intricate; skip it if you like.

Proof of Theorem 8.1.1

Suppose that \(A\) is a symmetric \(n \times n\) matrix. By Proposition 8.1.2 it has \(n\) real eigenvalues \(\lambda_1, \lambda_2, \ldots, \lambda_n\), counted with their multiplicities. Suppose \(\vect{q}_1\) is an eigenvector for \(\lambda_1\) with unit length. We can extend \(\{\vect{q}_1\}\) to an orthonormal basis \(\{\vect{q}_1,\vect{q}_2,\ldots,\vect{q}_n\}\). Let \(Q_1\) be the matrix with the columns \(\vect{q}_1,\vect{q}_2,\ldots,\vect{q}_n\).

It can be shown that \(A_1 = Q_1^{-1}AQ_1 = Q_1^TAQ_1\) is of the form

\[\begin{split} \left[\begin{array}{cccc} \lambda_1 & 0 & \ldots & 0 \\ 0 & \\ \vdots & & B_1 & \\ 0 & \end{array}\right] \end{split}\]

where \(B_1\) is an \((n-1)\times(n-1)\) matrix that is also symmetric.

Namely, the first column of \(A_1\) can be computed as

\[ A_1\vect{e}_1 = Q_1^{-1}AQ_1\vect{e}_1 = Q_1^{-1}A\vect{q}_1 = Q_1^{-1}\lambda_1\vect{q}_1 = \lambda_1Q_1^{-1}\vect{q}_1 \]

and \(Q_1^{-1}\vect{q}_1\) is the first column of \(Q_1^{-1}Q_1\), which is \(\vect{e}_1\).

This shows that the first column of \(A_1\) must indeed be \(\lambda_1\vect{e}_1 = \left[\begin{array}{c} \lambda_1 \\ 0 \\ \vdots \\ 0 \end{array}\right]\).

Since \(A\) is symmetric and \(Q_1\) is by construction an orthogonal matrix,

\[ A_1^T = (Q_1^{T}AQ_1)^T = Q_1^T A^T (Q_1^T)^T = Q_1^{T}AQ_1 = A_1. \]

So \(A_1\) is also symmetric. Since the first column of \(A_1\) has zeros in its last \(n-1\) entries, so does its first row.

Since \(A\) and \(A_1\) are similar, they have the same eigenvalues. It follows that \(B_1\) has the eigenvalues \(\lambda_2, \ldots, \lambda_n\).

We can apply the same construction to \(B_1\), yielding

\[\begin{split} B_2 = (\tilde{Q}_2)^{-1}B_1\tilde{Q}_2 = \left[\begin{array}{cccc} \lambda_2 & 0 & \ldots & 0 \\ 0 & \\ \vdots & & \tilde{B}_2 & \\ 0 & \end{array}\right]. \end{split}\]

Note that in this formula the matrices have size \((n-1)\) by \((n-1)\).

If we then define

\[\begin{split} Q_2 = \left[\begin{array}{cccc} 1 & 0 & \ldots & 0 \\ 0 & \\ \vdots & & \tilde{Q}_2 & \\ 0 & \end{array}\right]\end{split}\]

it follows that

\[\begin{split} A_2 = Q_2^{-1}A_1Q_2 = \left[\begin{array}{cccccc} \lambda_1 & 0 & 0 & \ldots & 0 \\ 0 & \lambda_2 & 0 & \ldots & 0 \\ 0 & 0 \\ \vdots & \vdots & & \tilde{B_2} \\ 0 & 0 & \end{array}\right]. \end{split}\]

Continuing in this fashion we find

\[\begin{split} A_{n-1} = Q_{n-1}^{-1} \cdots Q_2^{-1}Q_1^{-1}A Q_1 Q_2 \cdots Q_{n-1} = \left[\begin{array}{ccccc} \lambda_1 & 0 & 0 & \ldots &0 \\ 0 & \lambda_2 & 0 &\ldots &0 \\ \vdots & & \ddots & & \vdots\\ \vdots & & & \ddots & \vdots\\ 0 & 0 & \ldots & 0 &\lambda_n \end{array}\right].\end{split}\]

This proves that \(A\) is diagonalizable, with \(Q = Q_1Q_2 \cdots Q_{n-1}\) as a diagonalizing matrix.

Moreover, since the product of orthogonal matrices is orthogonal, \(A\) is in fact orthogonally diagonalizable.

Example 8.1.8

We will illustrate the proof for the matrix

\[\begin{split} A = \begin{bmatrix} 1 & 2 & 2 & 0 \\ 2 & -1 & 0 & 2 \\ 2 & 0 & -1 & -2 \\ 0 & 2 & -2 & 1 \end{bmatrix}. \end{split}\]

Since

\[\begin{split} \begin{bmatrix} 1 & 2 & 2 & 0 \\ 2 & -1 & 0 & 2 \\ 2 & 0 & -1 & -2 \\ 0 & 2 & -2 & 1 \end{bmatrix} \begin{bmatrix} 1 \\-1\\-1\\0 \end{bmatrix} = \begin{bmatrix} -3 \\3\\3\\0 \end{bmatrix} \end{split}\]

we have as a starter the eigenvalue and corresponding eigenvector

\[\begin{split} \lambda_1 = -3, \quad \vect{v}_1 = \begin{bmatrix} 1 \\-1\\-1\\0 \end{bmatrix}. \end{split}\]

An orthogonal basis for \(\mathbb{R}^4\), starting with this first eigenvector, is for instance

\[\begin{split} \vect{v}_1 = \begin{bmatrix} 1 \\-1\\-1\\0 \end{bmatrix}, \quad \vect{v}_2 = \begin{bmatrix} 1 \\1\\0\\0 \end{bmatrix}, \quad \vect{v}_3 = \begin{bmatrix} 1 \\-1\\2\\0 \end{bmatrix}, \quad \vect{v}_4 = \begin{bmatrix} 0\\0\\0\\1 \end{bmatrix}. \quad \end{split}\]

Rescaling and putting them into a matrix yields

\[\begin{split} Q_1 = \begin{bmatrix} 1/\sqrt{3} & 1/\sqrt{2} & 1/\sqrt{6} & 0 \\ -1/\sqrt{3} & 1/\sqrt{2} & -1/\sqrt{6} & 0 \\ -1/\sqrt{3} & 0 & 2/\sqrt{6} & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}. \end{split}\]

Next we compute

\[\begin{split} A_1 = Q_1^{-1}AQ_1 = Q_1^TAQ_1 = \begin{bmatrix} -3 & 0 & 0 & 0 \\ 0 & 2 & \sqrt{3} & \sqrt{2} \\ 0 & \sqrt{3} & 0 & -\sqrt{6} \\ 0 & \sqrt{2} & -\sqrt{6} & 1 \end{bmatrix}. \end{split}\]

This is indeed of the form stated in the proof.

We continue with the matrix \(B_1 = \left[\begin{array}{ccc} 2 & \sqrt{3} & \sqrt{2} \\ \sqrt{3} & 0 & -\sqrt{6} \\ \sqrt{2} & -\sqrt{6} & 1 \end{array} \right]\).

\(B_1\) has eigenvalue \(-3\) with eigenvector \(\vect{u}_1 = \left[\begin{array}{c} 1 \\ -\sqrt{3} \\ -\sqrt{2} \end{array} \right]\).

Again we extend to an orthogonal basis for \(\mathbb{R}^3\). For instance,

\[\begin{split} \vect{u}_1, \quad \vect{u}_2 = \left[\begin{array}{c} \sqrt{2} \\ 0\\ 1 \end{array} \right], \quad \vect{u}_3 = \left[\begin{array}{c} 1 \\ \sqrt{3} \\ -\sqrt{2} \end{array} \right]. \end{split}\]

If we normalize and use them as the columns of \(\tilde{Q}_2\) as in the proof of Theorem 8.1.1, we find as the second matrix in that construction

\[\begin{split} Q_2 = \left[\begin{array}{cccc} 1 & 0 & 0 & 0 \\ 0 & \dfrac{1}{\sqrt{6}} & \dfrac{\sqrt{2}}{\sqrt{3}} & \dfrac{1}{\sqrt{6}} \\ 0 & \dfrac{-1}{\sqrt{2}} & 0 & \dfrac{1}{\sqrt{2}} \\ 0 & \dfrac{-1}{\sqrt{3}} & \dfrac{1}{\sqrt{3}} & -\dfrac{1}{\sqrt{3}} \end{array} \right].\end{split}\]

And then

\[\begin{split} A_2 = Q_2^TQ_1^T A Q_1Q_2 = \left[\begin{array}{cccc} -3 & 0 & 0 & 0 \\ 0 &-3 & 0 & 0 \\ 0 & 0 & 3 & 0 \\ 0 & 0 & 0 & 3 \end{array} \right] = D, \end{split}\]

indeed a diagonal matrix.
For this example the matrix has a second double eigenvalue \(\lambda_{3,4} = 3\). Because of that, the construction takes one step fewer than in the general case.
Defining \(Q = Q_1Q_2\), we can rewrite the last identity as

\[ Q^{-1}AQ = D, \,\,\text{ so }\,\, A = QDQ^{-1} = QDQ^T \]

This is the matrix

\[\begin{split} Q = \left[\begin{array}{cccc} \dfrac{1}{\sqrt{3}} & 0 & \dfrac{1}{\sqrt{3}} & \dfrac{1}{\sqrt{3}} \\ -\dfrac{1}{\sqrt{3}} & \dfrac{1}{\sqrt{3}} & \dfrac{1}{\sqrt{3}} & 0 \\ -\dfrac{1}{\sqrt{3}} & -\dfrac{1}{\sqrt{3}} & 0 & \dfrac{1}{\sqrt{3}} \\ 0 & -\dfrac{1}{\sqrt{3}} & \dfrac{1}{\sqrt{3}} & -\dfrac{1}{\sqrt{3}} \end{array} \right] = \dfrac{1}{\sqrt{3}}\left[\begin{array}{cccc} 1 & 0 & 1 & 1 \\ -1 & 1 & 1 & 0 \\ -1 & -1 & 0 & 1 \\ 0 & -1 & 1 & -1 \end{array} \right]. \end{split}\]

So we see that \(A\) has the ‘simpler’ eigenvectors

\[\begin{split} \vect{v}_1 = \left[\begin{array}{c} 1 \\ -1 \\ -1 \\ 0 \end{array} \right], \quad \vect{v}_2 = \left[\begin{array}{c} 0 \\ 1 \\ -1 \\ -1 \end{array} \right], \quad \vect{v}_3 = \left[\begin{array}{c} 1 \\ 1 \\ 0 \\ 1 \end{array} \right], \quad \vect{v}_4 = \left[\begin{array}{c} 1 \\ 0 \\ 1 \\ -1 \end{array} \right]. \end{split}\]

Note: given the eigenvalues, these eigenvectors could have been found more efficiently by solving the systems \((A - \lambda_iI)\vect{x} = \vect{0}\) and then orthogonalizing via the Gram-Schmidt procedure, as is done in Example 8.1.6.
The importance of the step-by-step reduction is that it shows that from the ‘minimal’ assumptions of symmetry and the existence of real eigenvalues it is possible to create an orthogonal diagonalization.
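
On a computer, one deflation step of the proof can be carried out as in the following sketch (assuming Python with NumPy; the QR factorization is used to extend the unit eigenvector to an orthonormal basis, so the resulting \(Q_1\) will in general differ from the one chosen by hand above).

```python
import numpy as np

A = np.array([[1.0, 2.0, 2.0, 0.0],
              [2.0, -1.0, 0.0, 2.0],
              [2.0, 0.0, -1.0, -2.0],
              [0.0, 2.0, -2.0, 1.0]])
q1 = np.array([1.0, -1.0, -1.0, 0.0]) / np.sqrt(3)   # unit eigenvector for lambda_1 = -3

# Extend q1 to an orthonormal basis of R^4: take the QR factorization of a
# matrix that has q1 as its first column.
M = np.column_stack([q1, np.eye(4)[:, 1:]])
Q1, _ = np.linalg.qr(M)

A1 = Q1.T @ A @ Q1
print(np.round(A1, 10))
# The first row and column are (-3, 0, 0, 0); the lower-right 3x3 block
# is the symmetric matrix B1 of the proof (expressed in some orthonormal basis).
```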

In the last subsection we will show how the orthogonal diagonalization can be rewritten in an interesting and meaningful way.

8.1.4. The Spectral Decomposition of a Symmetric Matrix#

Let’s take up an earlier example (Example 8.1.2) to illustrate what the spectral decomposition is about.

Example 8.1.9

For the matrix \(A = \begin{bmatrix} 1&2\\2&-2 \end{bmatrix}\) we found the orthogonal diagonalization

\[\begin{split} A = QDQ^T = \begin{bmatrix} 2/\sqrt{5}& 1/\sqrt{5}\\1/\sqrt{5}& -2/\sqrt{5} \end{bmatrix} \begin{bmatrix} 2 & 0 \\ 0 & -3 \end{bmatrix} \begin{bmatrix} 2/\sqrt{5}& 1/\sqrt{5}\\1/\sqrt{5}& -2/\sqrt{5} \end{bmatrix}^T. \end{split}\]

This is of the form

\[\begin{split} \begin{array}{rcl} A &=& [\,\mathbf{q}_1\,\,\mathbf{q}_2\,]\begin{bmatrix} 2 & 0 \\ 0 & -3 \end{bmatrix} \big[\,\mathbf{q}_1\,\,\mathbf{q}_2\,\big]^T = \big[\,2\mathbf{q}_1\,\,(-3)\mathbf{q}_2\big]\begin{bmatrix}\mathbf{q}_1^T \\ \mathbf{q}_2^T \end{bmatrix}. \end{array} \end{split}\]

Recall the column-row expansion of the matrix product. For two \(2\times 2\) matrices this reads

\[\begin{split} \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \begin{bmatrix} b_{11} &b_{12} \\ b_{21} & b_{22} \end{bmatrix} = \begin{bmatrix} a_{11} \\ a_{21} \end{bmatrix} \begin{bmatrix} b_{11} &b_{12} \end{bmatrix} + \begin{bmatrix} a_{12} \\ a_{22} \end{bmatrix} \begin{bmatrix} b_{21} &b_{22} \end{bmatrix}. \end{split}\]

Applying this to the last expression for \(A = QDQ^T\) we find

\[ A = 2 \mathbf{q}_1\mathbf{q}_1^T + (-3)\mathbf{q}_2\mathbf{q}_2^T . \]

The matrices

\[\begin{split} \mathbf{q}_1\mathbf{q}_1^T = \frac15 \begin{bmatrix} 4 & 2 \\ 2 & 1 \end{bmatrix} \quad \text{and} \quad \mathbf{q}_2\mathbf{q}_2^T = \frac15 \begin{bmatrix} 1 & -2 \\ -2 & 4 \end{bmatrix} \end{split}\]

represent the orthogonal projections onto the one-dimensional subspaces \(\Span{\mathbf{q}_1}\) and \(\Span{\mathbf{q}_2}\).

Furthermore these one-dimensional subspaces are orthogonal to each other.

So this symmetric matrix can be written as a linear combination of matrices that represent orthogonal projections.
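
A quick numerical check of this decomposition (a sketch, assuming Python with NumPy):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [2.0, -2.0]])
q1 = np.array([2.0, 1.0]) / np.sqrt(5)
q2 = np.array([1.0, -2.0]) / np.sqrt(5)

P1 = np.outer(q1, q1)   # projection onto span{q1}
P2 = np.outer(q2, q2)   # projection onto span{q2}

print(np.allclose(2 * P1 - 3 * P2, A))          # A = 2 P1 - 3 P2:      True
print(np.allclose(P1 @ P1, P1))                 # P1 is a projection:   True
print(np.allclose(P1 @ P2, np.zeros((2, 2))))   # ranges are orthogonal: True
```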

The construction we performed in the last example can be generalized; this is the content of the last theorem in this section.

Theorem 8.1.2 (Spectral Decomposition of Symmetric Matrices)

Every \(n \times n\) symmetric matrix \(A\) is the linear combination

(8.1.4)#\[ A = \lambda_1P_1 + \lambda_2P_2 + \ldots + \lambda_nP_n\]

of \(n\) matrices \(P_i\) that represent orthogonal projections onto one-dimensional subspaces that are mutually orthogonal.

Formula (8.1.4) is referred to as being a spectral decomposition of the matrix \(A\).

Proof. For a general \(n\times n\) symmetric matrix \(A\), there exists an orthogonal diagonalization

\[ A = QDQ^{-1} = QDQ^{T}. \]

Exactly as in Example 8.1.9 we can use the column-row expansion of the matrix product to derive

(8.1.5)#\[A = \lambda_1 \mathbf{q}_1\mathbf{q}_1^T + \lambda_2\mathbf{q}_2\mathbf{q}_2^T + \ldots + \lambda_n\mathbf{q}_n\mathbf{q}_n^T,\]

where the vectors \(\mathbf{q}_i\) of course are the (orthonormal) columns of the diagonalizing matrix \(Q\). This is indeed a linear combination of orthogonal projections, as was to be shown.

Exercise 8.1.2

The eigenvalues of the matrix \(A=\begin{bmatrix} 2 & 1 & 0 \\ 1 & 3 & 1\\ 0 & 1& 2 \end{bmatrix}\) are 1, 2 and 4.

Find the spectral decomposition of \(A\).

If in Theorem 8.1.2 the projections corresponding to the same eigenvalue are grouped together, then the following alternative form of the spectral decomposition results.

Corollary 8.1.1 (Spectral Theorem, alternative version)

Every symmetric \(n \times n\) matrix \(A\) can be written as a linear combination of the orthogonal projections onto its (orthogonal) eigenspaces.

\[ A = \lambda_1 P_1 + \, \ldots \, + \lambda_k P_k, \]

where \(\lambda_1, \ldots, \lambda_k\) are the distinct eigenvalues of \(A\) and \(P_i\) denotes the orthogonal projection onto the eigenspace \(E_{\lambda_i}\).

Proof. We know that

\[ A = \lambda_1P_1 + \ldots + \lambda_nP_n = \lambda_1\vect{q}_1\vect{q}_1^T + \ldots + \lambda_n\vect{q}_n\vect{q}_n^T. \]

If all eigenvalues \(\lambda_1, \ldots, \lambda_n\) are different, this is already the decomposition we are after.

If \(\lambda_i\) is an eigenvalue of multiplicity \(m\) with \(m\) orthonormal eigenvectors \(\vect{q}_1, \ldots, \vect{q}_m\), then

\[ \lambda_i\vect{q}_1\vect{q}_1^T + \,\ldots\,+ \lambda_i\vect{q}_m\vect{q}_m^T = \lambda_i [\,\vect{q}_1\,\,\cdots\,\,\vect{q}_m] [\,\vect{q}_1\,\,\cdots\,\,\vect{q}_m]^T = \lambda_i Q_iQ_i^T. \]

\(P_i = Q_iQ_i^T\) is precisely the orthogonal projection onto the eigenspace \(E_{\lambda_i}\).

The following example provides an illustration.

Example 8.1.10

For the matrix \(A = \begin{bmatrix} 1 & 2 & 2 & 0 \\ 2 & -1 & 0 & 2 \\ 2 & 0 & -1 & -2 \\ 0 & 2 & -2 & 1 \end{bmatrix}\) we had already found the orthogonal diagonalization \(A = QDQ^{-1}= QDQ^T\) with

\[\begin{split} Q = \left[\,\vect{q}_1\,\,\vect{q}_2\,\rule[-2ex]{0ex}{5ex}\,\vect{q}_3\,\,\vect{q}_4\,\right] = \dfrac{1}{\sqrt{3}}\left[\begin{array}{cccc} 1 & 0 & 1 & 1 \\ -1 & 1 & 1 & 0 \\ -1 & -1 & 0 & 1 \\ 0 & -1 & 1 & -1 \end{array} \right] \end{split}\]

and

\[\begin{split} D = \left[\begin{array}{cccc} -3 & 0 & 0 & 0 \\ 0 & -3 & 0 & 0 \\ 0 & 0 & 3 & 0\\ 0 & 0 & 0 & 3 \end{array}\right]. \end{split}\]

The spectral decomposition according to Corollary 8.1.1 then becomes

\[ A = (-3) \left[\vect{q}_1\,\rule[-2ex]{0ex}{5ex}\,\vect{q}_2\,\right]\left[\vect{q}_1\,\rule[-2ex]{0ex}{5ex}\,\vect{q}_2\,\right]^T + 3 \left[\vect{q}_3\,\rule[-2ex]{0ex}{5ex}\,\vect{q}_4\,\right]\left[\vect{q}_3\,\rule[-2ex]{0ex}{5ex}\,\vect{q}_4\,\right]^T = \,\,\ldots\,\, = \]
\[\begin{split} = (-3)\begin{bmatrix} 1/3 & -1/3 & -1/3 & 0 \\ -1/3 & 2/3 & 0 & -1/3 \\ -1/3 & 0 & 2/3 & 1/3 \\ 0 & -1/3 & 1/3 & 1/3 \end{bmatrix} + 3 \begin{bmatrix} 2/3 & 1/3 & 1/3 & 0 \\ 1/3 & 1/3 & 0 & 1/3 \\ 1/3 & 0 & 1/3 & -1/3 \\ 0 & 1/3 & -1/3 & 2/3 \end{bmatrix}. \end{split}\]
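
Numerically, the grouping into eigenspace projections can be checked as follows (a sketch, assuming Python with NumPy; the columns of \(Q\) are grouped per eigenvalue).

```python
import numpy as np

A = np.array([[1.0, 2.0, 2.0, 0.0],
              [2.0, -1.0, 0.0, 2.0],
              [2.0, 0.0, -1.0, -2.0],
              [0.0, 2.0, -2.0, 1.0]])
Q = np.array([[1.0, 0.0, 1.0, 1.0],
              [-1.0, 1.0, 1.0, 0.0],
              [-1.0, -1.0, 0.0, 1.0],
              [0.0, -1.0, 1.0, -1.0]]) / np.sqrt(3)

P_min3 = Q[:, :2] @ Q[:, :2].T    # projection onto E_{-3} = span{q1, q2}
P_plus3 = Q[:, 2:] @ Q[:, 2:].T   # projection onto E_{3}  = span{q3, q4}

print(np.allclose(-3 * P_min3 + 3 * P_plus3, A))   # spectral decomposition: True
print(np.allclose(P_min3 + P_plus3, np.eye(4)))    # projections add up to I: True
```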

8.1.5. Grasple Exercises#

Grasple Exercise 8.1.1

https://embed.grasple.com/exercises/f76823e6-8936-4edf-bd0b-fa3a2aa7246f?id=88040

To check whether a matrix \(A\) is symmetric.

Grasple Exercise 8.1.2

https://embed.grasple.com/exercises/9828a4b4-98f7-46c3-8dab-74ac04fc1955?id=88032

To check whether a matrix \(A\) is orthogonal. And, if it is, to give its inverse.

Grasple Exercise 8.1.3

https://embed.grasple.com/exercises/8af926a0-80d8-459f-af55-c37a492a18c6?id=88045

To check whether a matrix \(A\) is orthogonal. And, if it is, to give its inverse.

Grasple Exercise 8.1.4

https://embed.grasple.com/exercises/03d75a31-7e1b-4dd2-be0a-5e9a93a0ef09?id=94940

To give an orthogonal diagonalization of a (2x2) matrix.

Grasple Exercise 8.1.5

https://embed.grasple.com/exercises/926933aa-a33e-40f5-8e70-84bb9ed63fc8?id=87465

To give an orthogonal diagonalization of a (2x2) matrix.

Grasple Exercise 8.1.6

https://embed.grasple.com/exercises/9aac9c37-aa3b-4d5a-bb92-f00c09e5f052?id=94943

To give an orthogonal diagonalization of a (3x3) matrix.

Grasple Exercise 8.1.7

https://embed.grasple.com/exercises/a6a95823-15e4-4354-b89d-559306a5a7fa?id=94941

To give an orthogonal diagonalization of a (3x3) matrix.

Grasple Exercise 8.1.8

https://embed.grasple.com/exercises/0403af25-edba-4bc6-b077-3de227253419?id=56931

To give an orthogonal diagonalization of a (3x3) matrix.

Grasple Exercise 8.1.9

https://embed.grasple.com/exercises/3a45e358-4898-4d1d-b6f4-ba9679dd13e0?id=87765

To give an orthogonal diagonalization of a (3x3) matrix.

Grasple Exercise 8.1.10

https://embed.grasple.com/exercises/eb8b0e2f-d909-47ce-8ef1-50ad67e2b0f6?id=87905

To give an orthogonal diagonalization of a (4x4) matrix.

Grasple Exercise 8.1.11

https://embed.grasple.com/exercises/5ce15529-61a7-43d0-9fd3-5ad5469618e8?id=89131

One step in an orthogonal diagonalization (as in the proof of the existence of an orthogonal diagonalization)

Grasple Exercise 8.1.12

https://embed.grasple.com/exercises/5511e064-f22d-4601-9156-f00545d59f80?id=88649

Sequel to previous question, now for a 4x4 matrix

Grasple Exercise 8.1.13

https://embed.grasple.com/exercises/c994fa76-f723-4700-922b-2f05ff0ef822?id=87760

To give an example of a symmetric 2x2 matrix with one eigenvalue and one eigenvector given.

Grasple Exercise 8.1.14

https://embed.grasple.com/exercises/4fd8d027-0e63-46ec-aaf5-f2d10d8707c9?id=87038

To give an example of a 3x3 symmetric matrix with given eigenvalues and eigenspace.

The following exercises have a more theoretical flavour.

Grasple Exercise 8.1.15

https://embed.grasple.com/exercises/6e0ebf73-fba2-46d0-aaa8-44e53ea07e53?id=88034

To think about symmetric versus orthogonally diagonalizable. (true/false questions)

Grasple Exercise 8.1.16

https://embed.grasple.com/exercises/73c272d7-dbb0-47c9-8bee-074b1f8cc154?id=82845

About the (non-)symmetry of \(A + A^T\) and \(A - A^T\).

Grasple Exercise 8.1.17

https://embed.grasple.com/exercises/7959665f-09d0-4362-a0e8-c0a3e613399f?id=82848

About the (non-)symmetry of products.

Grasple Exercise 8.1.18

https://embed.grasple.com/exercises/33f5be5a-1cfa-4056-ac91-c2282de234b1?id=87864

If \(A\) and \(B\) are symmetric, what about \(A^2\), \(A^{-1}\) and \(AB\)?

Grasple Exercise 8.1.19

https://embed.grasple.com/exercises/59c4c327-1603-4cc1-8b92-7415c691098b?id=87873

True or false. If \(A\) is symmetric, then \(A^2\) has nonnegative eigenvalues. (and what if \(A\) is not symmetric?)