8.1. Symmetric Matrices#

8.1.1. Introduction#

Definition 8.1.1

A matrix \(A\) is called a symmetric matrix if

\[ A^T = A. \]

Note that this definition implies that a symmetric matrix must be a square matrix.

Example 8.1.1

The matrices

\[\begin{split} A_1 = \begin{bmatrix} 2&\color{blue}3&\color{red}4\\\color{blue}3&1&\color{green}5 \\\color{red}4&\color{green}5&7 \end{bmatrix} \quad \text{and} \quad A_2 = \begin{bmatrix} 0&2&3&4\\ 2&0&1&5 \\ 3&1&0&6 \\ 4&5&6&7\end{bmatrix} \end{split}\]

are symmetric. The matrices

\[\begin{split} A_3 = \begin{bmatrix} 2&3&4\\2&3&4 \\ 2&3&4 \end{bmatrix} \quad \text{and} \quad A_4 = \begin{bmatrix} 0&2&3&0\\ 2&0&1&0 \\ 3&1&0&0 \\ \end{bmatrix} \end{split}\]

are not symmetric.
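
For readers who want to experiment: symmetry is easy to check numerically. The following sketch (assuming Python with NumPy is available) simply tests whether a matrix equals its own transpose.

```python
import numpy as np

A1 = np.array([[2, 3, 4],
               [3, 1, 5],
               [4, 5, 7]])
A3 = np.array([[2, 3, 4],
               [2, 3, 4],
               [2, 3, 4]])

# A matrix is symmetric precisely when it equals its own transpose.
print(np.array_equal(A1, A1.T))   # True:  A1 is symmetric
print(np.array_equal(A3, A3.T))   # False: A3 is not
```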

Symmetric matrices appear in many different contexts. In statistics the covariance matrix is an example of a symmetric matrix. In engineering the so-called elastic strain matrix and the moment of inertia tensor provide examples.

The crucial thing about symmetric matrices is stated in the main theorem of this section.

Theorem 8.1.1

Every symmetric matrix \(A\) is orthogonally diagonalizable.

By this we mean: there exist an orthogonal matrix \(Q\) and a diagonal matrix \(D\) for which

\[ A = QDQ^{-1} = QDQ^T. \]

Conversely, every orthogonally diagonalizable matrix is symmetric.

This theorem is known as the Spectral Theorem for Symmetric Matrices. In other contexts the word 'spectrum' is used for the set of eigenvalues of a transformation.

So, for a symmetric matrix an orthonormal basis of eigenvectors always exists. For the inertia tensor of a 3D body such a basis corresponds to the (perpendicular) principal axes.

Proof of the converse of Theorem 8.1.1.

Recall that an orthogonal matrix is a matrix \(Q\) for which \(Q^{-1} = Q^T\).

With this reminder, it is just a one-line proof. If \(A = QDQ^{-1} = QDQ^T\),

then \(A^T = (QDQ^{-1} )^T = (Q^{-1} )^TD^TQ^T = (Q^T)^TD^TQ^T = QDQ^T = A\).

We postpone the proof of the other implication to Subsection 8.1.3.

We end this introductory section with one representative example.

Example 8.1.2

Let \(A\) be given by \(A = \begin{bmatrix} 1&2\\2&-2 \end{bmatrix}\).

The eigenvalues are found via

\[\begin{split} \det{(A - \lambda I)} = \begin{vmatrix} 1-\lambda&2\\2&-2-\lambda \end{vmatrix} = (1-\lambda)(-2-\lambda) -4 = \lambda^2 +\lambda -6 = (\lambda-2)(\lambda+3) . \end{split}\]

They are \(\lambda_1 = 2\) and \(\lambda_2 = -3\).

Corresponding eigenvectors are \(\mathbf{v}_1 = \begin{bmatrix} 2\\1 \end{bmatrix}\) for \(\lambda_1\), and \(\mathbf{v}_2 = \begin{bmatrix} 1\\-2 \end{bmatrix}\) for \(\lambda_2\).

The eigenvectors are orthogonal,

\[\begin{split} \mathbf{v}_1 \ip \mathbf{v}_2 = \begin{bmatrix} 2\\1 \end{bmatrix}\ip \begin{bmatrix} 1\\-2 \end{bmatrix} = 2 - 2 = 0, \end{split}\]

and \(A\) can be diagonalized as

\[\begin{split} A = PDP^{-1} = \begin{bmatrix}2&1\\1&-2 \end{bmatrix}\begin{bmatrix}2 & 0\\0& -3 \end{bmatrix} \begin{bmatrix}2&1\\1&-2 \end{bmatrix}^{-1}. \end{split}\]

In Figure 8.1.1 the image of the unit circle under the transformation \(\vect{x} \mapsto A\vect{x}\) is shown. \(\vect{q}_1\) and \(\vect{q}_2\) are two orthonormal eigenvectors.

../_images/Fig-SymmetricMat-Evectors.svg

Fig. 8.1.1 The transformation \(T(\vect{x}) = \begin{bmatrix} 1&2\\2&-2 \end{bmatrix}\vect{x}\).#

Furthermore, if we normalize the eigenvectors, i.e., the columns of \(P\), we find the following diagonalization of \(A\) with an orthogonal matrix \(Q\):

\[\begin{split} A = QDQ^{-1} = \begin{bmatrix}2/\sqrt{5}&1/\sqrt{5}\\1/\sqrt{5}&-2/\sqrt{5} \end{bmatrix}\begin{bmatrix}2 & 0\\0& -3 \end{bmatrix} \begin{bmatrix}2/\sqrt{5}&1/\sqrt{5}\\1/\sqrt{5}&-2/\sqrt{5} \end{bmatrix}^{-1}. \end{split}\]
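
As a numerical cross-check of this example, the following sketch (assuming Python with NumPy) reproduces the orthogonal diagonalization; `np.linalg.eigh` is NumPy's routine for symmetric matrices and returns real eigenvalues together with orthonormal eigenvectors.

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [2.0, -2.0]])

# eigh is meant for symmetric matrices: real eigenvalues (in ascending order)
# and orthonormal eigenvectors in the columns of Q.
eigvals, Q = np.linalg.eigh(A)
D = np.diag(eigvals)

print(eigvals)                              # [-3.  2.]
print(np.allclose(Q.T @ Q, np.eye(2)))      # Q is orthogonal: True
print(np.allclose(Q @ D @ Q.T, A))          # A = Q D Q^T:     True
```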

8.1.2. The essential properties of symmetric matrices#

Proposition 8.1.1

Suppose \(A\) is a symmetric matrix.

If \(\mathbf{v}_1\) and \(\mathbf{v}_2\) are eigenvectors of \(A\) for different eigenvalues, then \(\mathbf{v}_1\perp \mathbf{v}_2\).

Proof. Suppose \(\mathbf{v}_1\) and \(\mathbf{v}_2\) are eigenvectors of the symmetric matrix \(A\) for the different eigenvalues \(\lambda_1,\lambda_2\). We want to show that \(\mathbf{v}_1 \ip \mathbf{v}_2 = 0\).

The trick is to consider the expression

(8.1.1)#\[(A\mathbf{v}_1) \ip \mathbf{v}_2.\]

On the one hand

\[ (A\mathbf{v}_1) \ip \mathbf{v}_2 = (\lambda_1\mathbf{v}_1) \ip \mathbf{v}_2 = \lambda_1(\mathbf{v}_1 \ip \mathbf{v}_2). \nonumber \]

On the other hand

\[ (A\mathbf{v}_1) \ip \mathbf{v}_2 = (A\mathbf{v}_1)^T \mathbf{v}_2 =\mathbf{v}_1^TA^T \mathbf{v}_2. \nonumber \]

Since we assumed that \(A^T = A\) we can extend the chain of identities:

\[ \mathbf{v}_1^TA^T \mathbf{v}_2 = \mathbf{v}_1^T A \mathbf{v}_2 =\mathbf{v}_1^T (A \mathbf{v}_2) = \mathbf{v}_1^T (\lambda_2 \mathbf{v}_2) = \lambda_2(\mathbf{v}_1^T \mathbf{v}_2) = \lambda_2(\mathbf{v}_1 \ip \mathbf{v}_2). \nonumber \]

So we have shown that

\[ (A\mathbf{v}_1) \ip \mathbf{v}_2 = \lambda_1(\mathbf{v}_1 \ip \mathbf{v}_2) = \lambda_2(\mathbf{v}_1 \ip \mathbf{v}_2). \nonumber \]

Since

\[ \lambda_1 \neq \lambda_2, \nonumber \]

it follows that indeed

\[ \mathbf{v}_1\ip \mathbf{v}_2 = 0,\]

as was to be shown.

Exercise 8.1.1

Prove the following slight generalization of Proposition 8.1.1.

If \(\vect{u}\) is an eigenvector of \(A\) for the eigenvalue \(\lambda\), and \(\vect{v}\) is an eigenvector of \(A^T\) for a different eigenvalue \(\mu\), then \(\vect{u} \perp \vect{v}\).

Solution to Exercise 8.1.1 (click to show)

The proof is completely analogous to the proof of Proposition 8.1.1. Suppose

\[ A\mathbf{u} = \lambda\mathbf{u},\quad A^T\mathbf{v} = \mu\mathbf{v},\quad\text{ where} \,\,\,\lambda \neq \mu. \]

We consider the expression \(\mathbf{u} \ip (A^T \mathbf{v}) = \mathbf{u}^T A^T \mathbf{v}\).

On the one hand

(8.1.2)#\[ \mathbf{u}\ip (A^T \mathbf{v}) = \mathbf{u}^T (A^T\mathbf{v}) = \mathbf{u}^T \mu \mathbf{v} = \mu\, \mathbf{u}^T\mathbf{v} = \mu (\mathbf{u}\ip\mathbf{v}).\]

On the other hand

(8.1.3)#\[ \mathbf{u}\ip (A^T \mathbf{v}) = \mathbf{u}^T A^T \mathbf{v} = (A\mathbf{u})^T\mathbf{v} = \lambda\, \mathbf{u}^T\mathbf{v} = \lambda (\mathbf{u}\ip\mathbf{v}).\]

Comparing (8.1.2) and (8.1.3) we see that \((\mu - \lambda)(\mathbf{u}\ip\mathbf{v}) = 0\). Since \(\lambda \neq \mu\), we can conclude that \(\mathbf{u}\ip\mathbf{v} = 0\), i.e., \(\mathbf{u}\) and \(\mathbf{v}\) are indeed orthogonal.
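
A small numerical illustration of this exercise, with an ad hoc non-symmetric matrix (a sketch, assuming Python with NumPy):

```python
import numpy as np

# A non-symmetric matrix with two different (real) eigenvalues 2 and 3.
A = np.array([[2.0, 1.0],
              [0.0, 3.0]])

u = np.array([1.0, 0.0])   # eigenvector of A   for lambda = 2
v = np.array([0.0, 1.0])   # eigenvector of A^T for mu     = 3

print(np.allclose(A @ u, 2 * u))     # True
print(np.allclose(A.T @ v, 3 * v))   # True
print(u @ v)                         # 0.0: u and v are orthogonal
```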

Proposition 8.1.2

All eigenvalues of symmetric matrices are real.

The easiest proof is via complex numbers. Feel free to skip it, in particular when you don’t feel comfortable with complex numbers.

Proof of Proposition 8.1.2

For two vectors \(\mathbf{u},\mathbf{v}\) in \(\C^n\) we consider the expression

\[ \overline{\mathbf{u}}^{T}\mathbf{v} = \overline{u_1}v_1 + \ldots + \overline{u_n}v_n. \]

If we take \(\mathbf{v}\) equal to \(\mathbf{u}\) we get

\[ \overline{\mathbf{u}}^{T}\mathbf{u} = \overline{u_1}u_1 + \overline{u_2}u_2 + \ldots + \overline{u_n}u_n = |u_1|^2 + |u_2|^2 + \ldots + |u_n|^2, \]

where \(|u_i|\) denotes the modulus of the complex number \(u_i\). This sum of squares (of real numbers) is a non-negative real number. We also see that \(\overline{\mathbf{u}}^{T}\mathbf{u} = 0\) only holds if \(\mathbf{u} = \mathbf{0}\).

It can also be verified that

\[ \overline{\overline{\mathbf{u}}^{T}\mathbf{v}} = \overline{\mathbf{v}}^T \mathbf{u}. \]

Now suppose that \(\lambda\) is an eigenvalue of the symmetric matrix \(A\), and \(\mathbf{v}\) is a nonzero (possibly complex) eigenvector of \(A\) for the eigenvalue \(\lambda\). Note that, since \(A\) is real and symmetric, \(\overline{{A}^T} = \overline{A} = A\). To prove that \(\lambda\) is real, we will show that \(\overline{\lambda} = \lambda\).

We use much the same 'trick' as in Equation (8.1.1) in the proof of Proposition 8.1.1.
On the one hand

\[ \overline{(A \mathbf{v})^T} \mathbf{v} = \overline{\lambda\mathbf{v}^T} \mathbf{v} = \overline{\lambda} \overline{\mathbf{v}}^T \mathbf{v}. \]

On the other hand,

\[ \overline{(A \mathbf{v})^T} \mathbf{v} = \overline{\mathbf{v}^T A^T}\mathbf{v} = \overline{\mathbf{v}^T}\,\overline{{A}^T} \mathbf{v} = \overline{\mathbf{v}}^T\,\overline{A} \mathbf{v} = \overline{\mathbf{v}}^T A\mathbf{v} = \overline{\mathbf{v}}^T \lambda\mathbf{v} = \lambda\overline{\mathbf{v}}^T \mathbf{v}. \]

So we have that

\[ \overline{\lambda} \overline{\mathbf{v}}^T \mathbf{v} = \lambda\overline{\mathbf{v}}^T \mathbf{v}. \]

Since we assumed that \(\mathbf{v}\) is not the zero vector, we have that \(\overline{\mathbf{v}}^T \mathbf{v} \neq 0\), and so it follows that \( \overline{\lambda} =\lambda\), which is equivalent to \(\lambda\) being real.
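
Numerically, the realness of the spectrum is easy to observe. In the sketch below (assuming Python with NumPy) we build a random symmetric matrix and feed it to the general eigenvalue routine, which knows nothing about symmetry; all imaginary parts it returns are (numerically) zero.

```python
import numpy as np

rng = np.random.default_rng(1)
B = rng.standard_normal((5, 5))
A = B + B.T                      # B + B^T is always symmetric

eigvals = np.linalg.eig(A)[0]    # general routine, unaware of symmetry
print(np.max(np.abs(np.imag(eigvals))))   # 0.0 (up to rounding)
```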

Example 8.1.3

Let \(A = \begin{bmatrix} a&b\\b&d \end{bmatrix} \).

Then the characteristic polynomial is computed as

\[\begin{split} \begin{vmatrix} a-\lambda&b\\b&d-\lambda \end{vmatrix} = (a-\lambda)(d-\lambda) - b^2 = \lambda^2 - (a+d)\lambda + ad - b^2. \nonumber \end{split}\]

The discriminant of this second order polynomial is given by

\[ D = (a+d)^2 -4(ad -b^2) = a^2+d^2 - 2ad + 4b^2 = (a-d)^2 + 4b^2 \geq 0. \nonumber \]

The discriminant is non-negative, so the characteristic polynomial has only real roots, and consequently the eigenvalues of the matrix are real.

Obviously, an elementary approach like this will soon get very complicated for larger \(n \times n\) matrices.

Lastly we come to the third of three essential properties of symmetric matrices.

Proposition 8.1.3

For each eigenvalue of a symmetric matrix the geometric multiplicity is equal to the algebraic multiplicity.

We will incorporate the proof of this proposition into the proof of the main theorem in Subsection 8.1.3. For now, we will look at a few examples.

Example 8.1.4

We will verify that the symmetric matrix \(A = \begin{bmatrix} 1 & 0 & 1\\0 & 1 & 2 \\ 1 & 2 & 5 \end{bmatrix}\) is diagonalizable and has mutually orthogonal eigenvectors.

We first compute the characteristic polynomial.

Expansion along the first column gives

\[\begin{split} \begin{array}{rcl} \begin{vmatrix} 1-\lambda & 0 & 1\\0 & 1-\lambda & 2 \\ 1 & 2 & 5-\lambda \end{vmatrix} &=& (1-\lambda)\begin{vmatrix} 1-\lambda & 2 \\ 2 & 5-\lambda\end{vmatrix} + 1\cdot\begin{vmatrix} 0 & 1 \\ 1-\lambda & 2\end{vmatrix} \\ &=& (1-\lambda)\big((1-\lambda)(5-\lambda) -4 \big)- (1-\lambda) \\ &=& (1-\lambda) (\lambda^2-6\lambda) = (1-\lambda) (\lambda-6)\lambda. \end{array} \nonumber \end{split}\]

So \(A\) has the real eigenvalues \(\lambda_{1} = 1\), \(\lambda_2 = 6\) and \(\lambda_3 = 0\). Since all eigenvalues have algebraic multiplicity 1, the corresponding eigenvectors will give a basis of eigenvectors, and we can immediately conclude that \(A\) is diagonalizable.

The eigenvectors are found to be

\[\begin{split} \mathbf{v}_1 = \begin{bmatrix} 2 \\-1 \\ 0 \end{bmatrix} \text{ for } \lambda_1 = 1, \quad \mathbf{v}_2 = \begin{bmatrix} 1 \\ 2 \\ 5 \end{bmatrix} \text{ for } \lambda_2, \quad \mathbf{v}_3 = \begin{bmatrix} 1 \\ 2 \\ -1 \end{bmatrix} \text{ for } \lambda_3. \nonumber \end{split}\]

We see that the three eigenvectors are mutually orthogonal, in accordance with Proposition 8.1.1.

Example 8.1.5

Consider the matrix \(A = \begin{bmatrix} 2&2&4\\2 & -1 & 2 \\ 4&2&2 \end{bmatrix}\).

A (rather involved) computation yields the eigenvalues \(\lambda_{1,2} = -2\) and \(\lambda_3 = 7\). Indeed all eigenvalues are real, conforming to Proposition 8.1.2.

Next we find the eigenvectors and the geometric multiplicities of the eigenvalues.

For \(\lambda = -2\) we find via row reduction

\[\begin{split} [A - (-2)I\,|\,\mathbf{0}] = \left[\begin{array}{ccc|c} 4&2&4&0\\2 & 1 & 2 &0\\ 4&2&4&0\end{array}\right] \sim \left[\begin{array}{ccc|c} 2&1&2&0\\0&0&0&0 \\0&0&0&0\end{array}\right] \nonumber \end{split}\]

the two linearly independent eigenvectors \(\mathbf{v}_1 = \begin{bmatrix} 1 \\ 0 \\ -1\end{bmatrix}\) and \(\mathbf{v}_2 = \begin{bmatrix} 1 \\ -2 \\ 0\end{bmatrix}\). The geometric multiplicity of \(\lambda_{1,2}\) is equal to 2. The other eigenvalue has algebraic multiplicity 1, so its geometric multiplicity has to be 1 as well. With this, Proposition 8.1.3 is verified.

Lastly we leave it to you to check that an eigenvector for \(\lambda_3 = 7\) is given by \(\mathbf{v}_3 = \begin{bmatrix} 2 \\ 1 \\ 2\end{bmatrix}\), and that both \(\mathbf{v}_3 \perp \mathbf{v}_1\) and \(\mathbf{v}_3 \perp \mathbf{v}_2\), so that Proposition 8.1.1 is satisfied as well.
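
The multiplicities in this example can also be checked numerically. In the sketch below (assuming Python with NumPy) the geometric multiplicity of \(\lambda\) is computed as \(3\) minus the rank of \(A - \lambda I\).

```python
import numpy as np

A = np.array([[2.0, 2.0, 4.0],
              [2.0, -1.0, 2.0],
              [4.0, 2.0, 2.0]])
n = A.shape[0]

# geometric multiplicity = dim Nul(A - lambda I) = n - rank(A - lambda I)
for lam in (-2.0, 7.0):
    print(lam, n - np.linalg.matrix_rank(A - lam * np.eye(n)))
# -2.0 2
#  7.0 1
```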

8.1.3. Orthogonal Diagonalizability of Symmetric Matrices#

Let us restate the main theorem (Theorem 8.1.1) about symmetric matrices:

A matrix \(A\) is symmetric if and only if it is orthogonally diagonalizable.

Note that this also establishes the property that for each eigenvalue of a symmetric matrix the geometric multiplicity equals the algebraic multiplicity (Proposition 8.1.3).

We will put the intricate proof at the end of the subsection, and first consider two examples.

The first example is a continuation of the earlier Example 8.1.5.

Example 8.1.6

The matrix \(A = \begin{bmatrix} 2&2&4\\2 & -1 & 2 \\ 4&2&2 \end{bmatrix}\) was shown to have the eigenvalues/eigenvectors

\[\begin{split} \lambda_{1,2} = -2, \quad \mathbf{v}_1 = \begin{bmatrix} 1 \\ 0 \\ -1\end{bmatrix}, \, \mathbf{v}_2 = \begin{bmatrix} 1 \\ -2 \\ 0\end{bmatrix}, \quad \lambda_3 = 7, \quad \mathbf{v}_3 = \begin{bmatrix} 2 \\ 1 \\ 2\end{bmatrix}. \end{split}\]

The pairs \(\mathbf{v}_1, \mathbf{v}_3\) and \(\mathbf{v}_2, \mathbf{v}_3\) are ‘automatically’ orthogonal.

For the eigenspace \(E_{-2} = \Span{\mathbf{v}_1, \mathbf{v}_2}\) we can use Gram-Schmidt to get an orthogonal basis:

\[\begin{split} \mathbf{u}_1 = \mathbf{v}_1, \quad \mathbf{u}_2 = \mathbf{v}_2 - \dfrac{\mathbf{v}_2 \ip \mathbf{u}_1}{\mathbf{u}_1 \ip \mathbf{u}_1} \mathbf{u}_1 = \dfrac12\begin{bmatrix} 1 \\ -4 \\ 1\end{bmatrix}. \end{split}\]

Normalizing the orthogonal basis \(\{\mathbf{u}_1, \mathbf{u}_2, \mathbf{v}_3\}\) and putting them side by side in a matrix yields the orthogonal matrix

\[\begin{split} Q = \begin{bmatrix} \dfrac{1}{\sqrt{2}} & \dfrac{1}{\sqrt{18}} & \dfrac{2}{3} \\ 0 & \dfrac{-4}{\sqrt{18}} & \dfrac{1}{3}\\ \dfrac{-1}{\sqrt{2}} & \dfrac{1}{\sqrt{18}} & \dfrac{2}{3} \end{bmatrix}. \end{split}\]

The conclusion becomes that

\[\begin{split} A = QDQ^{-1} = QDQ^T, \quad \text{where still} \,\,\, D = \begin{bmatrix} -2 & 0 & 0 \\ 0 & -2 & 0 \\ 0 & 0 & 7\end{bmatrix}. \end{split}\]
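
This factorization can be confirmed numerically; the sketch below (assuming Python with NumPy) builds \(Q\) from the normalized vectors \(\mathbf{u}_1, \mathbf{u}_2, \mathbf{v}_3\) and checks both claims.

```python
import numpy as np

A = np.array([[2.0, 2.0, 4.0],
              [2.0, -1.0, 2.0],
              [4.0, 2.0, 2.0]])
Q = np.column_stack([
    np.array([1.0, 0.0, -1.0]) / np.sqrt(2),    # u1, normalized
    np.array([1.0, -4.0, 1.0]) / np.sqrt(18),   # u2, normalized
    np.array([2.0, 1.0, 2.0]) / 3.0,            # v3, normalized
])
D = np.diag([-2.0, -2.0, 7.0])

print(np.allclose(Q.T @ Q, np.eye(3)))   # Q is orthogonal: True
print(np.allclose(Q @ D @ Q.T, A))       # A = Q D Q^T:     True
```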

One more example before we get to the proof (or you jump over to Section 8.1.4).

Example 8.1.7

Let the symmetric matrix \(A\) be given by \( A = \begin{bmatrix} 1 & 2 & 2 & 0 \\ 2 & -1 & 0 & 2 \\ 2 & 0 & -1 & -2 \\ 0 & 2 & -2 & 1 \end{bmatrix}\).

The hard part is to find the eigenvalues (i.e., how do we solve an equation of degree four?). Once we know the eigenvalues, the other steps are 'routine'.

It turns out that \(A\) has the double eigenvalues \(\lambda_{1,2} = 3\) and \(\lambda_{3,4} = -3\).

To find the eigenvectors for the eigenvalue 3 we row reduce the matrix \((A - 3I)\).

\[\begin{split} \left[\begin{array}{cccc}1-3 & 2 & 2 & 0\\ 2 & -1-3 & 0 & 2 \\ 2 & 0 & -1-3 & -2 \\ 0 & 2 & -2 & 1-3 \end{array} \right] \,\, \sim \,\,\ldots\,\, \sim \,\, \left[\begin{array}{cccc}1 & 0 & -2 & -1\\ 0 & 1 & -1 & -1 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{array} \right]. \end{split}\]

We can read off two linearly independent eigenvectors

\[\begin{split} \vect{v}_1 = \left[\begin{array}{c} 1 \\ 1 \\ 0 \\ 1 \end{array} \right], \quad \vect{v}_2 = \left[\begin{array}{c} 2 \\ 1 \\ 1 \\ 0 \end{array} \right]. \end{split}\]

As in Example 8.1.6 we can construct an orthogonal basis for the eigenspace \(E_{3}\):

\[\begin{split} \mathbf{u}_1 = \mathbf{v}_1, \quad \mathbf{u}_2 = \mathbf{v}_2 - \dfrac{\mathbf{v}_2 \ip \mathbf{u}_1}{\mathbf{u}_1 \ip \mathbf{u}_1} \mathbf{u}_1 = \begin{bmatrix} 1 \\ 0 \\ 1\\ -1\end{bmatrix} \end{split}\]

Likewise we can first find a ‘natural’ basis for the eigenspace \(E_{-3}\) by row reducing \((A - (-3I))\):

\[\begin{split} (A - (-3I)) = \left[\begin{array}{cccc}4 & 2 & 2 & 0\\ 2 & 2 & 0 & 2 \\ 2 & 0 & 2 & -2 \\ 0 & 2 & -2 & 4 \end{array} \right] \quad \sim \ldots \sim \quad \left[\begin{array}{cccc}1 & 0 & 1 & -1\\ 0 & 1 & -1 & 2 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{array} \right]. \end{split}\]

Two independent eigenvectors: \(\vect{v}_3 = \left[\begin{array}{c} -1 \\ 1 \\ 1 \\ 0 \end{array} \right]\) and \(\vect{v}_4 = \left[\begin{array}{c} 1 \\ -2 \\ 0 \\ 1 \end{array} \right]\).

Again these can be orthogonalized, and then we find the following complete set of eigenvectors, i.e., a basis for \(\R^4\):

\[\begin{split} \vect{u}_1 = \begin{bmatrix} 1 \\ 1 \\ 0\\ 1\end{bmatrix}, \quad \vect{u}_2 = \begin{bmatrix} 1 \\ 0 \\ 1\\ -1\end{bmatrix}, \quad \vect{u}_3 = \begin{bmatrix} -1 \\ 1 \\ 1\\ 0\end{bmatrix}, \quad \vect{u}_4 = \begin{bmatrix} 0 \\ -1 \\ 1 \\ 1\end{bmatrix}. \end{split}\]

We conclude that \(A = QDQ^{-1}\), where

\[\begin{split} D = \left[\begin{array}{cccc}3 & 0 & 0 & 0\\ 0 & 3 & 0 & 0 \\ 0 & 0 & -3 & 0 \\ 0 & 0 & 0 & -3 \end{array} \right], \quad Q = \dfrac{1}{\sqrt{3}} \left[\begin{array}{cccc}1 & 1 & -1 & 0\\ 1 & 0 & 1 & -1 \\ 0 & 1 & 1 & 1 \\ 1 & -1 & 0 & 1 \end{array} \right]. \end{split}\]
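
Again the result can be verified with a few lines of code (a sketch, assuming Python with NumPy):

```python
import numpy as np

A = np.array([[1.0, 2.0, 2.0, 0.0],
              [2.0, -1.0, 0.0, 2.0],
              [2.0, 0.0, -1.0, -2.0],
              [0.0, 2.0, -2.0, 1.0]])
Q = np.array([[1.0, 1.0, -1.0, 0.0],
              [1.0, 0.0, 1.0, -1.0],
              [0.0, 1.0, 1.0, 1.0],
              [1.0, -1.0, 0.0, 1.0]]) / np.sqrt(3)
D = np.diag([3.0, 3.0, -3.0, -3.0])

print(np.allclose(Q.T @ Q, np.eye(4)))   # Q is orthogonal: True
print(np.allclose(Q @ D @ Q.T, A))       # A = Q D Q^T:     True
```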

And now it is time for the proof of the main theorem. The proof is technical and intricate; skip it if you like.

Proof of Theorem 8.1.1

Suppose that \(A\) is a symmetric \(n \times n\) matrix. By Proposition 8.1.2 it has \(n\) real eigenvalues \(\lambda_1, \lambda_2, \ldots, \lambda_n\), counted with their multiplicities. Suppose \(\vect{q}_1\) is an eigenvector for \(\lambda_1\) with unit length. We can extend \(\{\vect{q}_1\}\) to an orthonormal basis \(\{\vect{q}_1,\vect{q}_2,\ldots,\vect{q}_n\}\). Let \(Q_1\) be the matrix with the columns \(\vect{q}_1,\vect{q}_2,\ldots,\vect{q}_n\).

It can be shown that \(A_1 = Q_1^{-1}AQ_1 = Q_1^TAQ_1\) is of the form

\[\begin{split} \left[\begin{array}{cccc} \lambda_1 & 0 & \ldots & 0 \\ 0 & \\ \vdots & & B_1 & \\ 0 & \end{array}\right] \end{split}\]

where \(B_1\) is an \((n-1)\times(n-1)\) matrix that is also symmetric.

Namely, the first column of \(A_1\) can be computed as

\[ A_1\vect{e}_1 = Q_1^{-1}AQ_1\vect{e}_1 = Q_1^{-1}A\vect{q}_1 = Q_1^{-1}\lambda_1\vect{q}_1 = \lambda_1Q_1^{-1}\vect{q}_1 \]

and \(Q_1^{-1}\vect{q}_1\) is the first column of \(Q_1^{-1}Q_1\), which is \(\vect{e}_1\).

This shows that the first column of \(A_1\) must indeed be \(\lambda_1\vect{e}_1 = \left[\begin{array}{c} \lambda_1 \\ 0 \\ \vdots \\ 0 \end{array}\right]\).

Since \(A\) is symmetric and \(Q_1\) is by construction an orthogonal matrix,

\[ A_1^T = (Q_1^{T}AQ_1)^T = Q_1^T A^T (Q_1^T)^T = Q_1^{T}AQ_1 = A_1. \]

So \(A_1\) is also symmetric. Since the first column of \(A_1\) has zeros in its last \(n-1\) entries, so does its first row.

Since \(A\) and \(A_1\) are similar, they have the same eigenvalues. It follows that \(B_1\) has the eigenvalues \(\lambda_2, \ldots, \lambda_n\).

We can apply the same construction to \(B_1\), yielding

\[\begin{split} B_2 = (\tilde{Q}_2)^{-1}B_1\tilde{Q}_2 = \left[\begin{array}{cccc} \lambda_2 & 0 & \ldots & 0 \\ 0 & \\ \vdots & & \tilde{B}_2 & \\ 0 & \end{array}\right]. \end{split}\]

Note that in this formula the matrices have size \((n-1)\) by \((n-1)\).

If we then define

\[\begin{split} Q_2 = \left[\begin{array}{cccc} 1 & 0 & \ldots & 0 \\ 0 & \\ \vdots & & \tilde{Q}_2 & \\ 0 & \end{array}\right]\end{split}\]

it follows that

\[\begin{split} A_2 = Q_2^{-1}A_1Q_2 = \left[\begin{array}{cccccc} \lambda_1 & 0 & 0 & \ldots & 0 \\ 0 & \lambda_2 & 0 & \ldots & 0 \\ 0 & 0 \\ \vdots & \vdots & & \tilde{B_2} \\ 0 & 0 & \end{array}\right]. \end{split}\]

Continuing in this fashion we find

\[\begin{split} A_{n-1} = Q_{n-1}^{-1} \cdots Q_2^{-1}Q_1^{-1}A Q_1 Q_2 \cdots Q_{n-1} = \left[\begin{array}{ccccc} \lambda_1 & 0 & 0 & \ldots &0 \\ 0 & \lambda_2 & 0 &\ldots &0 \\ \vdots & & \ddots & & \vdots\\ \vdots & & & \ddots & \vdots\\ 0 & 0 & \ldots & 0 &\lambda_n \end{array}\right].\end{split}\]

This proves that \(A\) is diagonalizable, with \(Q = Q_1Q_2 \cdots Q_{n-1}\) as a diagonalizing matrix.

Moreover, since the product of orthogonal matrices is orthogonal, \(A\) is in fact orthogonally diagonalizable.

Example 8.1.8

We will illustrate the proof for the matrix

\[\begin{split} A = \begin{bmatrix} 1 & 2 & 2 & 0 \\ 2 & -1 & 0 & 2 \\ 2 & 0 & -1 & -2 \\ 0 & 2 & -2 & 1 \end{bmatrix}. \end{split}\]

Since

\[\begin{split} \begin{bmatrix} 1 & 2 & 2 & 0 \\ 2 & -1 & 0 & 2 \\ 2 & 0 & -1 & -2 \\ 0 & 2 & -2 & 1 \end{bmatrix} \begin{bmatrix} 1 \\-1\\-1\\0 \end{bmatrix} = \begin{bmatrix} -3 \\3\\3\\0 \end{bmatrix} \end{split}\]

we have as a starter the eigenvalue and corresponding eigenvector

\[\begin{split} \lambda_1 = -3, \quad \vect{v}_1 = \begin{bmatrix} 1 \\-1\\-1\\0 \end{bmatrix}. \end{split}\]

An orthogonal basis for \(\mathbb{R}^4\), starting with this first eigenvector, is for instance

\[\begin{split} \vect{v}_1 = \begin{bmatrix} 1 \\-1\\-1\\0 \end{bmatrix}, \quad \vect{v}_2 = \begin{bmatrix} 1 \\1\\0\\0 \end{bmatrix}, \quad \vect{v}_3 = \begin{bmatrix} 1 \\-1\\2\\0 \end{bmatrix}, \quad \vect{v}_4 = \begin{bmatrix} 0\\0\\0\\1 \end{bmatrix}. \quad \end{split}\]

Rescaling and putting them into a matrix yields

\[\begin{split} Q_1 = \begin{bmatrix} 1/\sqrt{3} & 1/\sqrt{2} & 1/\sqrt{6} & 0 \\ -1/\sqrt{3} & 1/\sqrt{2} & -1/\sqrt{6} & 0 \\ -1/\sqrt{3} & 0 & 2/\sqrt{6} & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}. \end{split}\]

Next we compute

\[\begin{split} A_1 = Q_1^{-1}AQ_1 = Q_1^TAQ_1 = \begin{bmatrix} -3 & 0 & 0 & 0 \\ 0 & 2 & \sqrt{3} & \sqrt{2} \\ 0 & \sqrt{3} & 0 & -\sqrt{6} \\ 0 & \sqrt{2} & -\sqrt{6} & 1 \end{bmatrix}. \end{split}\]

This is indeed of the form stated in the proof.

We continue with the matrix \(B_1 = \left[\begin{array}{ccc} 2 & \sqrt{3} & \sqrt{2} \\ \sqrt{3} & 0 & -\sqrt{6} \\ \sqrt{2} & -\sqrt{6} & 1 \end{array} \right]\).

\(B_1\) has eigenvalue \(-3\) with eigenvector \(\vect{u}_1 = \left[\begin{array}{c} 1 \\ -\sqrt{3} \\ -\sqrt{2} \end{array} \right]\).

Again we extend to an orthogonal basis for \(\mathbb{R}^3\). For instance,

\[\begin{split} \vect{u}_1, \quad \vect{u}_2 = \left[\begin{array}{c} \sqrt{2} \\ 0\\ 1 \end{array} \right], \quad \vect{u}_3 = \left[\begin{array}{c} 1 \\ \sqrt{3} \\ -\sqrt{2} \end{array} \right]. \end{split}\]

If we normalize and use them as the columns of \(\tilde{Q}_2\) as in the proof of Theorem 8.1.1, we find as the second matrix in that construction

\[\begin{split} Q_2 = \left[\begin{array}{cccc} 1 & 0 & 0 & 0 \\ 0 & \dfrac{1}{\sqrt{6}} & \dfrac{\sqrt{2}}{\sqrt{3}} & \dfrac{1}{\sqrt{6}} \\ 0 & \dfrac{-1}{\sqrt{2}} & 0 & \dfrac{1}{\sqrt{2}} \\ 0 & \dfrac{-1}{\sqrt{3}} & \dfrac{1}{\sqrt{3}} & -\dfrac{1}{\sqrt{3}} \end{array} \right].\end{split}\]

And then

\[\begin{split} A_2 = Q_2^TQ_1^T A Q_1Q_2 = \left[\begin{array}{cccc} -3 & 0 & 0 & 0 \\ 0 &-3 & 0 & 0 \\ 0 & 0 & 3 & 0 \\ 0 & 0 & 0 & 3 \end{array} \right] = D, \end{split}\]

indeed a diagonal matrix.
For this example the matrix has a second double eigenvalue \(\lambda_{3,4} = 3\). Because of that, the construction takes one step fewer than in the general case.
Defining \(Q = Q_1Q_2\), we can rewrite the last identity as

\[ Q^{-1}AQ = D, \,\,\text{ so }\,\, A = QDQ^{-1} = QDQ^T \]

This is the matrix

\[\begin{split} Q = \left[\begin{array}{cccc} \dfrac{1}{\sqrt{3}} & 0 & \dfrac{1}{\sqrt{3}} & \dfrac{1}{\sqrt{3}} \\ -\dfrac{1}{\sqrt{3}} & \dfrac{1}{\sqrt{3}} & \dfrac{1}{\sqrt{3}} & 0 \\ -\dfrac{1}{\sqrt{3}} & -\dfrac{1}{\sqrt{3}} & 0 & \dfrac{1}{\sqrt{3}} \\ 0 & -\dfrac{1}{\sqrt{3}} & \dfrac{1}{\sqrt{3}} & -\dfrac{1}{\sqrt{3}} \end{array} \right] = \dfrac{1}{\sqrt{3}}\left[\begin{array}{cccc} 1 & 0 & 1 & 1 \\ -1 & 1 & 1 & 0 \\ -1 & -1 & 0 & 1 \\ 0 & -1 & 1 & -1 \end{array} \right]. \end{split}\]

So we see that \(A\) has the ‘simpler’ eigenvectors

\[\begin{split} \vect{v}_1 = \left[\begin{array}{c} 1 \\ -1 \\ -1 \\ 0 \end{array} \right], \quad \vect{v}_2 = \left[\begin{array}{c} 0 \\ 1 \\ -1 \\ -1 \end{array} \right], \quad \vect{v}_3 = \left[\begin{array}{c} 1 \\ 1 \\ 0 \\ 1 \end{array} \right], \quad \vect{v}_4 = \left[\begin{array}{c} 1 \\ 0 \\ 1 \\ -1 \end{array} \right]. \end{split}\]

Note: given the eigenvalues, these eigenvectors could have been found more efficiently by solving the systems \((A - \lambda_iI)\vect{x} = \vect{0}\) and then orthogonalizing via the Gram-Schmidt procedure, as is done in Example 8.1.6.
The importance of the step-by-step reduction is that it shows that from the ‘minimal’ assumptions of symmetry and the existence of real eigenvalues it is possible to create an orthogonal diagonalization.
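
On a computer, one deflation step of the proof can be carried out as in the following sketch (assuming Python with NumPy; the QR factorization is used to extend the unit eigenvector to an orthonormal basis, so the resulting \(Q_1\) will in general differ from the one chosen by hand above).

```python
import numpy as np

A = np.array([[1.0, 2.0, 2.0, 0.0],
              [2.0, -1.0, 0.0, 2.0],
              [2.0, 0.0, -1.0, -2.0],
              [0.0, 2.0, -2.0, 1.0]])
q1 = np.array([1.0, -1.0, -1.0, 0.0]) / np.sqrt(3)   # unit eigenvector for lambda_1 = -3

# Extend q1 to an orthonormal basis of R^4: take the QR factorization of a
# matrix that has q1 as its first column.
M = np.column_stack([q1, np.eye(4)[:, 1:]])
Q1, _ = np.linalg.qr(M)

A1 = Q1.T @ A @ Q1
print(np.round(A1, 10))
# The first row and column are (-3, 0, 0, 0); the lower-right 3x3 block
# is the symmetric matrix B1 of the proof (expressed in some orthonormal basis).
```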

In the last subsection we will show how the orthogonal diagonalization can be rewritten in an interesting and meaningful way.

8.1.4. The Spectral Decomposition of a Symmetric Matrix#

Let’s take up an earlier example (Example 8.1.2) to illustrate what the spectral decomposition is about.

Example 8.1.9

For the matrix \(A = \begin{bmatrix} 1&2\\2&-2 \end{bmatrix}\) we found the orthogonal diagonalization

\[\begin{split} A = QDQ^T = \begin{bmatrix} 2/\sqrt{5}& 1/\sqrt{5}\\1/\sqrt{5}& -2/\sqrt{5} \end{bmatrix} \begin{bmatrix} 2 & 0 \\ 0 & -3 \end{bmatrix} \begin{bmatrix} 2/\sqrt{5}& 1/\sqrt{5}\\1/\sqrt{5}& -2/\sqrt{5} \end{bmatrix}^T. \end{split}\]

This is of the form

\[\begin{split} \begin{array}{rcl} A &=& [\,\mathbf{q}_1\,\,\mathbf{q}_2\,]\begin{bmatrix} 2 & 0 \\ 0 & -3 \end{bmatrix} \big[\,\mathbf{q}_1\,\,\mathbf{q}_2\,\big]^T = \big[\,2\mathbf{q}_1\,\,(-3)\mathbf{q}_2\big]\begin{bmatrix}\mathbf{q}_1^T \\ \mathbf{q}_2^T \end{bmatrix}. \end{array} \end{split}\]

Recall the column-row expansion of the matrix product. For two \(2\times 2\) matrices this reads

\[\begin{split} \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \begin{bmatrix} b_{11} &b_{12} \\ b_{21} & b_{22} \end{bmatrix} = \begin{bmatrix} a_{11} \\ a_{21} \end{bmatrix} \begin{bmatrix} b_{11} &b_{12} \end{bmatrix} + \begin{bmatrix} a_{12} \\ a_{22} \end{bmatrix} \begin{bmatrix} b_{21} &b_{22} \end{bmatrix}. \end{split}\]

Applying this to the last expression for \(A = QDQ^T\) we find

\[ A = 2 \mathbf{q}_1\mathbf{q}_1^T + (-3)\mathbf{q}_2\mathbf{q}_2^T . \]

The matrices

\[\begin{split} \mathbf{q}_1\mathbf{q}_1^T = \frac15 \begin{bmatrix} 4 & 2 \\ 2 & 1 \end{bmatrix} \quad \text{and} \quad \mathbf{q}_2\mathbf{q}_2^T = \frac15 \begin{bmatrix} 1 & -2 \\ -2 & 4 \end{bmatrix} \end{split}\]

represent the orthogonal projections onto the one-dimensional subspaces \(\Span{\mathbf{q}_1}\) and \(\Span{\mathbf{q}_2}\).

Furthermore these one-dimensional subspaces are orthogonal to each other.

So this symmetric matrix can be written as a linear combination of matrices that represent orthogonal projections.
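
A quick numerical check of this decomposition (a sketch, assuming Python with NumPy):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [2.0, -2.0]])
q1 = np.array([2.0, 1.0]) / np.sqrt(5)
q2 = np.array([1.0, -2.0]) / np.sqrt(5)

P1 = np.outer(q1, q1)   # projection onto span{q1}
P2 = np.outer(q2, q2)   # projection onto span{q2}

print(np.allclose(2 * P1 - 3 * P2, A))          # A = 2 P1 - 3 P2:      True
print(np.allclose(P1 @ P1, P1))                 # P1 is a projection:   True
print(np.allclose(P1 @ P2, np.zeros((2, 2))))   # ranges are orthogonal: True
```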

The construction we performed in the last example can be generalized; this is the content of the last theorem in this section.

Theorem 8.1.2 (Spectral Decomposition of Symmetric Matrices)

Every \(n \times n\) symmetric matrix \(A\) is the linear combination

(8.1.4)#\[ A = \lambda_1P_1 + \lambda_2P_2 + \ldots + \lambda_nP_n\]

of \(n\) matrices \(P_i\) that represent orthogonal projections onto one-dimensional subspaces that are mutually orthogonal.

Formula (8.1.4) is referred to as being a spectral decomposition of the matrix \(A\).

Proof. For a general \(n\times n\) symmetric matrix \(A\), there exists an orthogonal diagonalization

\[ A = QDQ^{-1} = QDQ^{T}. \]

Exactly as in Example 8.1.9 we can use the column-row expansion of the matrix product to derive

(8.1.5)#\[A = \lambda_1 \mathbf{q}_1\mathbf{q}_1^T + \lambda_2\mathbf{q}_2\mathbf{q}_2^T + \ldots + \lambda_n\mathbf{q}_n\mathbf{q}_n^T,\]

where the vectors \(\mathbf{q}_i\) of course are the (orthonormal) columns of the diagonalizing matrix \(Q\). This is indeed a linear combination of orthogonal projections, as was to be shown.

Exercise 8.1.2

The eigenvalues of the matrix \(A=\begin{bmatrix} 2 & 1 & 0 \\ 1 & 3 & 1\\ 0 & 1& 2 \end{bmatrix}\) are 1, 2 and 4.

Find the spectral decomposition of \(A\).

If in Theorem 8.1.2 the projections corresponding to the same eigenvalue are grouped together, then the following alternative form of the spectral decomposition results.

Corollary 8.1.1 (Spectral Theorem, alternative version)

Every symmetric \(n \times n\) matrix \(A\) can be written as a linear combination of the orthogonal projections onto its (orthogonal) eigenspaces.

\[ A = \lambda_1 P_1 + \, \ldots \, + \lambda_k P_k, \]

where \(\lambda_1, \ldots, \lambda_k\) are the distinct eigenvalues of \(A\) and \(P_i\) denotes the orthogonal projection onto the eigenspace \(E_{\lambda_i}\).

Proof. We know that

\[ A = \lambda_1P_1 + \ldots + \lambda_nP_n = \lambda_1\vect{q}_1\vect{q}_1^T + \ldots + \lambda_n\vect{q}_n\vect{q}_n^T. \]

If all eigenvalues \(\lambda_1, \ldots, \lambda_n\) are different, this is already the decomposition we are after.

If \(\lambda_i\) is an eigenvalue of multiplicity \(m\) with \(m\) orthonormal eigenvectors \(\vect{q}_1, \ldots, \vect{q}_m\), then

\[ \lambda_i\vect{q}_1\vect{q}_1^T + \,\ldots\,+ \lambda_i\vect{q}_m\vect{q}_m^T = \lambda_i [\,\vect{q}_1\,\,\cdots\,\,\vect{q}_m] [\,\vect{q}_1\,\,\cdots\,\,\vect{q}_m]^T = \lambda_i Q_iQ_i^T. \]

\(P_i = Q_iQ_i^T\) is precisely the orthogonal projection onto the eigenspace \(E_{\lambda_i}\).

The following example provides an illustration.

Example 8.1.10

For the matrix \(A = \begin{bmatrix} 1 & 2 & 2 & 0 \\ 2 & -1 & 0 & 2 \\ 2 & 0 & -1 & -2 \\ 0 & 2 & -2 & 1 \end{bmatrix}\) we had already found the orthogonal diagonalization \(A = QDQ^{-1}= QDQ^T\) with

\[\begin{split} Q = \left[\,\vect{q}_1\,\,\vect{q}_2\,\rule[-2ex]{0ex}{5ex}\,\vect{q}_3\,\,\vect{q}_4\,\right] = \dfrac{1}{\sqrt{3}}\left[\begin{array}{cccc} 1 & 0 & 1 & 1 \\ -1 & 1 & 1 & 0 \\ -1 & -1 & 0 & 1 \\ 0 & -1 & 1 & -1 \end{array} \right] \end{split}\]

and

\[\begin{split} D = \left[\begin{array}{cccc} -3 & 0 & 0 & 0 \\ 0 & -3 & 0 & 0 \\ 0 & 0 & 3 & 0\\ 0 & 0 & 0 & 3 \end{array}\right]. \end{split}\]

The spectral decomposition according to Corollary 8.1.1 then becomes

\[ A = (-3) \left[\vect{q}_1\,\rule[-2ex]{0ex}{5ex}\,\vect{q}_2\,\right]\left[\vect{q}_1\,\rule[-2ex]{0ex}{5ex}\,\vect{q}_2\,\right]^T + 3 \left[\vect{q}_3\,\rule[-2ex]{0ex}{5ex}\,\vect{q}_4\,\right]\left[\vect{q}_3\,\rule[-2ex]{0ex}{5ex}\,\vect{q}_4\,\right]^T = \,\,\ldots\,\, = \]
\[\begin{split} = (-3)\begin{bmatrix} 1/3 & -1/3 & -1/3 & 0 \\ -1/3 & 2/3 & 0 & -1/3 \\ -1/3 & 0 & 2/3 & 1/3 \\ 0 & -1/3 & 1/3 & 1/3 \end{bmatrix} + 3 \begin{bmatrix} 2/3 & 1/3 & 1/3 & 0 \\ 1/3 & 1/3 & 0 & 1/3 \\ 1/3 & 0 & 1/3 & -1/3 \\ 0 & 1/3 & -1/3 & 2/3 \end{bmatrix}. \end{split}\]
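
Numerically, the grouping into eigenspace projections can be checked as follows (a sketch, assuming Python with NumPy; the columns of \(Q\) are grouped per eigenvalue).

```python
import numpy as np

A = np.array([[1.0, 2.0, 2.0, 0.0],
              [2.0, -1.0, 0.0, 2.0],
              [2.0, 0.0, -1.0, -2.0],
              [0.0, 2.0, -2.0, 1.0]])
Q = np.array([[1.0, 0.0, 1.0, 1.0],
              [-1.0, 1.0, 1.0, 0.0],
              [-1.0, -1.0, 0.0, 1.0],
              [0.0, -1.0, 1.0, -1.0]]) / np.sqrt(3)

P_min3 = Q[:, :2] @ Q[:, :2].T    # projection onto E_{-3} = span{q1, q2}
P_plus3 = Q[:, 2:] @ Q[:, 2:].T   # projection onto E_{3}  = span{q3, q4}

print(np.allclose(-3 * P_min3 + 3 * P_plus3, A))   # spectral decomposition: True
print(np.allclose(P_min3 + P_plus3, np.eye(4)))    # projections add up to I: True
```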

8.1.5. Grasple Exercises#

Grasple Exercise 8.1.1

https://embed.grasple.com/exercises/f76823e6-8936-4edf-bd0b-fa3a2aa7246f?id=88040

To check whether a matrix \(A\) is symmetric.

Grasple Exercise 8.1.2

https://embed.grasple.com/exercises/9828a4b4-98f7-46c3-8dab-74ac04fc1955?id=88032

To check whether a matrix \(A\) is orthogonal. And, if it is, to give its inverse.

Grasple Exercise 8.1.3

https://embed.grasple.com/exercises/8af926a0-80d8-459f-af55-c37a492a18c6?id=88045

To check whether a matrix \(A\) is orthogonal. And, if it is, to give its inverse.

Grasple Exercise 8.1.4

https://embed.grasple.com/exercises/03d75a31-7e1b-4dd2-be0a-5e9a93a0ef09?id=94940

To give an orthogonal diagonalization of a (2x2) matrix.

Grasple Exercise 8.1.5

https://embed.grasple.com/exercises/926933aa-a33e-40f5-8e70-84bb9ed63fc8?id=87465

To give an orthogonal diagonalization of a (2x2) matrix.

Grasple Exercise 8.1.6

https://embed.grasple.com/exercises/9aac9c37-aa3b-4d5a-bb92-f00c09e5f052?id=94943

To give an orthogonal diagonalization of a (3x3) matrix.

Grasple Exercise 8.1.7

https://embed.grasple.com/exercises/a6a95823-15e4-4354-b89d-559306a5a7fa?id=94941

To give an orthogonal diagonalization of a (3x3) matrix.

Grasple Exercise 8.1.8

https://embed.grasple.com/exercises/0403af25-edba-4bc6-b077-3de227253419?id=56931

To give an orthogonal diagonalization of a (3x3) matrix.

Grasple Exercise 8.1.9

https://embed.grasple.com/exercises/3a45e358-4898-4d1d-b6f4-ba9679dd13e0?id=87765

To give an orthogonal diagonalization of a (3x3) matrix.

Grasple Exercise 8.1.10

https://embed.grasple.com/exercises/eb8b0e2f-d909-47ce-8ef1-50ad67e2b0f6?id=87905

To give an orthogonal diagonalization of a (4x4) matrix.

Grasple Exercise 8.1.11

https://embed.grasple.com/exercises/5ce15529-61a7-43d0-9fd3-5ad5469618e8?id=89131

One step in an orthogonal diagonalization (as in the proof of the existence of an orthogonal diagonalization)

Grasple Exercise 8.1.12

https://embed.grasple.com/exercises/5511e064-f22d-4601-9156-f00545d59f80?id=88649

Sequel to previous question, now for a 4x4 matrix

Grasple Exercise 8.1.13

https://embed.grasple.com/exercises/c994fa76-f723-4700-922b-2f05ff0ef822?id=87760

To give an example of a symmetric 2x2 matrix with one eigenvalue and one eigenvector given.

Grasple Exercise 8.1.14

https://embed.grasple.com/exercises/4fd8d027-0e63-46ec-aaf5-f2d10d8707c9?id=87038

To give an example of a 3x3 symmetric matrix with given eigenvalues and eigenspace.

The following exercises have a more theoretical flavour.

Grasple Exercise 8.1.15

https://embed.grasple.com/exercises/6e0ebf73-fba2-46d0-aaa8-44e53ea07e53?id=88034

To think about symmetric versus orthogonally diagonalizable. (true/false questions)

Grasple Exercise 8.1.16

https://embed.grasple.com/exercises/73c272d7-dbb0-47c9-8bee-074b1f8cc154?id=82845

About the (non-)symmetry of \(A + A^T\) and \(A - A^T\).

Grasple Exercise 8.1.17

https://embed.grasple.com/exercises/7959665f-09d0-4362-a0e8-c0a3e613399f?id=82848

About the (non-)symmetry of products.

Grasple Exercise 8.1.18

https://embed.grasple.com/exercises/33f5be5a-1cfa-4056-ac91-c2282de234b1?id=87864

If \(A\) and \(B\) are symmetric, what about \(A^2\), \(A^{-1}\) and \(AB\)?

Grasple Exercise 8.1.19

https://embed.grasple.com/exercises/59c4c327-1603-4cc1-8b92-7415c691098b?id=87873

True or false. If \(A\) is symmetric, then \(A^2\) has nonnegative eigenvalues. (and what if \(A\) is not symmetric?)