9.3. The Power Method#

The eigenvalues of an \(n\times n\) matrix \(A\) to a large extent characterize the matrix. In theory they can be found as the zeros of the characteristic polynomial. Already for \(n = 3\) it is not easy to find the exact zeros, and for \(n\geq 5\) it is in general impossible, since no general formula for the zeros of a polynomial of degree five or higher exists. One way to proceed is to use a numerical method to solve this equation of degree \(n\). Alternatively, there are algorithms more in the vein of linear algebra to find approximations of one or more eigenvalues. The simplest of these is the power method. This method often provides at least the eigenvalue of largest absolute value (or modulus). Note that this is in fact the most important eigenvalue as far as the stability or instability of the linear dynamical system connected to \(A\) is concerned.

9.3.1. The Basics#

The idea behind the power method is really very simple. Let us for the moment consider the case where the matrix \(A\) is diagonalizable.

The power method is based on the dynamical system

\[ \vect{x}_0 = \vect{s}, \quad \vect{x}_{k+1} = A\vect{x}_k,\,\, k = 0,1,2,\ldots \]

We know from Proposition 9.1.1 in Section 9.1 that for a diagonalizable matrix \(A\) with eigenvalues \(\lambda_1, \ldots, \lambda_n\), the \(k\)-th vector in the process is given by

(9.3.1)#\[\vect{x}_k = A^k\vect{x}_0 = c_1\lambda_1^k\vect{v}_1 + c_2\lambda_2^k\vect{v}_2 + \ldots + c_n\lambda_n^k\vect{v}_n,\]

where \(\vect{v}_1, \ldots, \vect{v}_n\) are corresponding eigenvectors.
Now suppose the eigenvalues are ordered according to

\[ |\lambda_1| \geq |\lambda_2| \geq \,\,\ldots\,\,\geq |\lambda_n| \]

and that in fact we have \(|\lambda_1| > |\lambda_2|\), i.e. the first inequality is strict.
We can rewrite Equation (9.3.1) as

(9.3.2)#\[\vect{x}_k = \lambda_1^k\left(c_1\vect{v}_1 + c_2(\lambda_2/\lambda_1)^k\vect{v}_2 + \ldots + c_n(\lambda_n/\lambda_1)^k\vect{v}_n\right).\]

If the coefficient \(c_1\) is not equal to zero, then, since all factors \((\lambda_i/\lambda_1)^k\) for \(i = 2,\ldots,n\) go to zero as \(k \to \infty\), in the long run \(\vect{x}_k\) will behave as

\[ c_1(\lambda_1)^k\vect{v}_1. \]

That is, \(\vect{x}_k\) is ‘almost’ an eigenvector. A minor complication: if \(|\lambda_1| > 1\), then \(\vect{x}_k\) will ‘go to infinity’, and if \(|\lambda_1| < 1\), then \(\vect{x}_k\) will ‘go to zero’. This complication is overcome by a proper rescaling.

We put a name to the situation where a matrix \(A\) has a single eigenvalue of highest absolute value.

Definition 9.3.1

Suppose the \(n\times n\) matrix \(A\) has the eigenvalues \(\lambda_1, \ldots, \lambda_n\) ordered according to

\[ |\lambda_1| \geq |\lambda_2| \geq \,\,\ldots\,\,\geq |\lambda_n|. \]

If in fact \(|\lambda_1| > |\lambda_2|\), then \(\lambda_1\) is called the dominant eigenvalue of \(A\).
An eigenvector corresponding to \(\lambda_1\) is called a dominant eigenvector.

Consider the following algorithm.

Algorithm 9.3.1

Suppose \(A\) is an \(n\times n\) matrix.

Step 1   Choose an arbitrary nonzero vector \(\vect{x}\) in \(\R^n\).

Step 2   Repeat the following steps:
  — Compute \(\vect{y} = A\vect{x}\);
  — Find the entry \(\mu\) of \(\vect{y}\) of the highest absolute value;
  — Replace \(\vect{x}\) by \(\dfrac{1}{\mu}\vect{y}\).

The steps are repeated until the process more or less stabilizes, for instance, until the difference between the last two computed vectors is smaller than a pre-determined ‘error’ \(\varepsilon\).
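The algorithm translates almost line by line into code. Below is a minimal sketch in Python with NumPy; the function name `power_method`, the tolerance `tol` and the iteration cap `max_iter` are choices made for this sketch and not part of the algorithm itself.

```python
import numpy as np

def power_method(A, x0, tol=1e-4, max_iter=1000):
    """Power method of Algorithm 9.3.1: repeatedly multiply by A and rescale.

    The scaling factor mu is the entry of y = A x with the highest absolute
    value, so every iterate has an entry equal to 1 and no larger entries.
    The loop stops when two consecutive iterates differ by less than tol.
    Returns (mu, x, k): mu approximates the dominant eigenvalue, x the
    corresponding eigenvector, and k is the number of iterations used.
    """
    x = np.array(x0, dtype=float)
    for k in range(1, max_iter + 1):
        y = A @ x
        mu = y[np.argmax(np.abs(y))]   # entry of largest absolute value (with its sign)
        x_new = y / mu                 # rescale so that this entry becomes 1
        if np.linalg.norm(x_new - x) <= tol:
            return mu, x_new, k
        x = x_new
    return mu, x, max_iter
```

Note that \(\mu\) only becomes a good approximation of the dominant eigenvalue once \(\vect{x}\) is close to an eigenvector; this is made precise in Proposition 9.3.1 below.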

Remark 9.3.1

Instead of storing the whole sequence

\[\vect{x}_1=A\vect{x}_0,\,\,\vect{x}_2 = A\vect{x}_1,\,\,\vect{x}_3=A\vect{x}_2,\,\ldots,\]

the algorithm keeps only the most recent iterate and discards all earlier ones.

We are only interested in the last iterate anyway, since we expect this to be the best approximation of an eigenvector.

The scaling step is necessary to avoid ending up at the zero vector or ‘at infinity’, where all information is lost.

Proposition 9.3.1

Suppose \(A\) is a diagonalizable matrix with dominant eigenvalue \(\lambda_1\). Then, in general, the sequence constructed by the power method algorithm will converge to an eigenvector \(\vect{v}_1\) for \(\lambda_1\).
To be more specific, the sequence \(\vect{x}_k\) will converge to a dominant eigenvector \(\vect{v}_1\) if the initial vector \(\vect{x}_0\) does not lie in \(\text{Span}\{\vect{v}_2, \vect{v}_3, \ldots, \vect{v}_n\}\).

Moreover, suppose \(\vect{x}\) is the result after a (sufficiently) large number of iterations of the algorithm. Then the entry of the vector \(\vect{y} = A\vect{x}\) with the highest absolute value gives (an approximation of) the dominant eigenvalue.

Proof. With the scalings we are in fact considering the dynamical process

\[ \vect{x}_0 = \vect{s}, \quad \vect{x}_{k+1} = \frac{1}{\mu_k} A\vect{x}_k. \]

This process yields a sequence of vectors with the same directions as the vectors of the process

\[ \vect{x}_0 = \vect{s}, \quad \vect{x}_{k+1} = A\vect{x}_k. \]

And we know (Formula (9.3.1)) that under the assumption of a dominant eigenvalue \(\lambda_1\) and an initial vector

\[ \vect{x}_0 = c_1\vect{v}_1 + c_2\vect{v}_2 + \ldots + c_n\vect{v}_n, \quad c_1 \neq 0 \]

the vectors \(\vect{x}_k\) in the long run behave as \(c_1\lambda_1^k\vect{v}_1\).
The effect of the scalings is that all along the way we keep vectors with largest entry 1. Thus the scaled process will converge to the dominant eigenvector with largest entry 1.

Example 9.3.1

Let \(A\) be the matrix   \(\begin{bmatrix}5 & 2 \\ 2 & 8 \end{bmatrix}\).
The matrix is symmetric, hence diagonalizable. Its eigenvalues are \(\lambda_1 = 9\), \(\lambda_2 = 4\).

If we start the process of Algorithm 9.3.1 from the initial vector

\[\begin{split} \vect{x}_0 = \begin{bmatrix} 1 \\ 1 \end{bmatrix} \end{split}\]

up to four decimals we find the following results

\[\begin{split} \vect{y}_1 = \begin{bmatrix} 7 \\ 10 \end{bmatrix}, \quad \vect{x}_1 = \begin{bmatrix} 0.7 \\ 1 \end{bmatrix}, \quad \vect{x}_2 = \begin{bmatrix} 0.5851 \\ 1 \end{bmatrix}, \end{split}\]
\[\begin{split} \vect{x}_3 = \begin{bmatrix} 0.5371 \\ 1 \end{bmatrix}, \quad \vect{x}_4 = \begin{bmatrix} 0.5164 \\ 1 \end{bmatrix},\,\,\, \ldots ,\,\,\, \vect{x}_{10} = \begin{bmatrix} 0.5001 \\ 1 \end{bmatrix}. \end{split}\]

The last vector is almost equal to

\[\begin{split} \vect{v}_1 = \begin{bmatrix} 0.5 \\ 1 \end{bmatrix}, \end{split}\]

which is the dominant eigenvector with largest entry 1.
For this specific example the ‘convergence to an eigenvector’ goes very quickly!
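These iterates are easy to reproduce. The short NumPy sketch below carries out ten steps of Algorithm 9.3.1 for this matrix; the printed vectors should match the ones above up to rounding.

```python
import numpy as np

A = np.array([[5.0, 2.0],
              [2.0, 8.0]])
x = np.array([1.0, 1.0])           # the initial vector x_0

for k in range(10):
    y = A @ x
    mu = y[np.argmax(np.abs(y))]   # scaling factor: entry of largest absolute value
    x = y / mu
    print(k + 1, x)                # x should approach [0.5, 1], and mu approaches 9
```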

Two circumstances help this rapid convergence. First of all, and most importantly, the ratio \(|\lambda_2/\lambda_1|\) of the two eigenvalues is rather small. Second, the ‘arbitrarily’ chosen initial vector has a substantial component in the direction of the eigenvector for the largest eigenvalue.

For the given matrix \(A\) this can be analysed in detail. A complete set of eigenvalues and eigenvectors is given by

\[\begin{split} \lambda_1 = 9,\,\,\vect{v}_1 = \begin{bmatrix} 1 \\ 2 \end{bmatrix}, \quad \lambda_2 = 4,\,\,\vect{v}_2 = \begin{bmatrix} 2 \\ -1 \end{bmatrix}. \end{split}\]

At the start we have

\[\begin{split} \vect{x}_0 = \begin{bmatrix} 1 \\ 1 \end{bmatrix} = c_1\begin{bmatrix} 1 \\ 2 \end{bmatrix} + c_2\begin{bmatrix} 2 \\ -1 \end{bmatrix} = \frac35\begin{bmatrix} 1 \\ 2 \end{bmatrix} + \frac15\begin{bmatrix} 2 \\ -1 \end{bmatrix}. \end{split}\]

Then after \(k\) steps, if we did not rescale, we would get

\[\begin{split} \vect{x}_k = A^k\vect{x}_0 = c_19^k\begin{bmatrix} 1 \\ 2 \end{bmatrix} + c_24^k\begin{bmatrix} 2 \\ -1 \end{bmatrix} = 9^k \left( c_1\begin{bmatrix} 1 \\ 2 \end{bmatrix} + c_2\left(\frac49\right)^k\begin{bmatrix} 2 \\ -1 \end{bmatrix} \right). \end{split}\]

With rescaling we get a vector in the same direction with highest entry equal to 1.

That we have such a rapid convergence in the example is due to the two circumstances mentioned:
(1) the ratio \(|\lambda_2/\lambda_1| = 4/9\) is smaller than \(0.5\),
and
(2) the coefficient \(c_1 = \frac35\) is not too close to zero.
Thus the term with the eigenvector \(\mathbf{v}_2\) in the expression within the parentheses

\[\begin{split} c_1 \vect{v}_1 + c_2\left(\frac{\lambda_2}{\lambda_1}\right)^k\vect{v}_2 = \frac35\begin{bmatrix} 1 \\ 2 \end{bmatrix} + \frac15\left(\frac49\right)^k\begin{bmatrix} 2 \\ -1 \end{bmatrix} \end{split}\]

soon becomes negligibly small compared to the term with the eigenvector \(\vect{v}_1\).
In the exceptional situation where the ‘first guess’ was exactly a multiple of the second eigenvector \(\vect{v}_2\) all vectors \(\mathbf{x}_k\) would remain on the line generated by \(\vect{v}_2\). This would just be a very unfortunate first guess. In practice, even then, owing to minor round-off errors in the long long (no typo) run the process would probably end up converging to a multiple of \(\vect{v}_1\).

Example 9.3.2

Let us apply the power method to the following matrix and initial guess:

\[\begin{split} A = \begin{bmatrix} 7 & -1 & 5\\ -2 & 7 & 6\\ 2 & -2 & 6 \end{bmatrix}, \quad \vect{x}_0 = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}. \end{split}\]

We will keep updating the vector \(\vect{x}\) until two consecutive iterates satisfy

\[ \norm{\vect{x}_{k+1} - \vect{x}_{k}} \leq 10^{-4}. \]

This criterion is met after 23 iterations. The value of \(\vect{x}_{23}\), up to four decimals, is then

\[\begin{split} \vect{x}_{23} = \begin{bmatrix} 1.0000 \\ 0.3612 \\ 0.4456 \end{bmatrix}. \end{split}\]

And by computing \(A\vect{x}\), which yields

\[\begin{split} A\vect{x} = \begin{bmatrix} 8.8668 \\ 3.2024 \\ 3.9512 \end{bmatrix}, \end{split}\]

we read off that the dominant eigenvalue is (approximately) \(\lambda_1 = 8.8668\).
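This computation is easily redone with a few lines of NumPy, using the stopping criterion above; the iteration count and the value of \(\mu\) printed at the end should agree with the numbers reported here, up to rounding.

```python
import numpy as np

A = np.array([[ 7.0, -1.0, 5.0],
              [-2.0,  7.0, 6.0],
              [ 2.0, -2.0, 6.0]])
x = np.array([1.0, 1.0, 1.0])

k = 0
while True:
    y = A @ x
    mu = y[np.argmax(np.abs(y))]
    x_new = y / mu
    k += 1
    if np.linalg.norm(x_new - x) <= 1e-4:   # stop when consecutive iterates are close
        break
    x = x_new

print(k, mu)   # expected: around 23 iterations, mu close to 8.87
print(x_new)   # expected: close to [1, 0.3612, 0.4456]
```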

In this example the actual eigenvalues up to four decimals are given by

\[ \lambda_1 = 8.8675, \quad \lambda_{2,3} = 5.5663 \pm 1.8164i. \]

So the matrix is not real diagonalizable, but it is complex diagonalizable. Since the dominant eigenvalue is real, and the ratio

\[ \frac{|\lambda_{2,3}|}{|\lambda_{1}|} \approx 0.66 \]

we can still use Equation (9.3.2) to conclude that, except for very unfortunate initial vectors, the method will lead to a dominant eigenvector, as it indeed did.

Example 9.3.3

Consider the matrix \(A = \begin{bmatrix} 3 &4 &-1& 3 \\ 4 &-2& 3& 2\\ 2 & 1& 6& 3\\ 3 & 4& 2&-7 \end{bmatrix}\), and the initial vector \(\mathbf{x}_0 = \begin{bmatrix} 2 \\ 2 \\ 4 \\ 1 \end{bmatrix}\).

If we let a computer program run one hundred cycles of the power method algorithm, the last two iterates are

\[\begin{split} \vect{x}_{99} = \begin{bmatrix} 0.3345 \\ 0.5096 \\ 1.0000 \\ 0.6336 \end{bmatrix}, \quad \vect{x}_{100} = \begin{bmatrix} 0.4343 \\ 0.5051 \\ 1.0000 \\ 0.0668 \end{bmatrix}. \end{split}\]

The process is still far from its ‘steady state’: look at the (significant!) change in the fourth entry.

After one hundred extra cycles, the output is still pretty bad:

\[\begin{split} \vect{x}_{199} = \begin{bmatrix} 0.1374 \\ 0.3626 \\ 0.7028 \\1.0000 \end{bmatrix}, \quad \vect{x}_{200} = \begin{bmatrix} 0.5297 \\ 0.5007 \\ 1.0000 \\ -0.4751 \end{bmatrix}. \end{split}\]

Only after 700 to 800 iterations does the process start to stabilize:

\[\begin{split} \vect{x}_{799} = \begin{bmatrix} -0.2506 \\ -0.0763 \\ -0.1672 \\1.0000 \end{bmatrix} \,\, \approx \,\, \vect{x}_{800} = \begin{bmatrix} -0.2515 \\ -0.0773 \\ -0.1691 \\ 1.0000 \end{bmatrix} \,\, \approx\,\, \mathbf{v}_1 = \begin{bmatrix} -0.2510 \\ -0.0768 \\ -0.1682 \\ 1.0000 \end{bmatrix}. \end{split}\]

To find the corresponding eigenvalue we compute

\[\begin{split} \vect{y} = A\vect{x}_{800} = \begin{bmatrix} 2.1083 \\ 0.6458 \\ 1.4137 \\ -8.3955 \end{bmatrix} \,\,\approx \,\,-8.3955\vect{x}_{800} \,=\, \begin{bmatrix} 2.1067 \\ 0.6440 \\ 1.4100 \\ -8.3955 \end{bmatrix}. \end{split}\]

So the dominant eigenvalue is \(\lambda_1 \approx -8.3955\), and the above equation also shows that \(\vect{x}_{800}\) is approximately an eigenvector for it.

Now why are things going so slowly here?

Well, up to four decimals, the two eigenvalues of largest absolute value are given by \(\lambda_1 = -8.3967\) and \(\lambda_2 = 8.2974\). The ratio of the absolute values of these two is very close to 1, namely

\[ \frac{|\lambda_2|}{|\lambda_1|} = 0.9882. \]

So the factor \(\left(\dfrac{\lambda_2}{\lambda_1} \right)^k\) goes to \(0\) very slowly.

Moreover, to aggravate matters, we have chosen the initial vector very much in the direction of an eigenvector \(\vect{v}_2\) for \(\lambda_2\):

\[\begin{split} \vect{x}_0 = \begin{bmatrix} 2 \\ 2 \\ 4 \\ 1 \end{bmatrix}, \quad \vect{v}_2 = \begin{bmatrix} 1.5453 \\ 2.0291 \\ 4.0000 \\ 1.3566 \end{bmatrix}. \end{split}\]

If we write the initial vector as a linear combination of four unit eigenvectors

\[ c_1 \vect{u}_1 + c_2 \vect{u}_2 + c_3 \vect{u}_3 + c_4 \vect{u}_4 \]

the coefficient \(c_2\) will be large compared to the coefficient \(c_1\), so the contribution of the term

\[ c_2 \lambda_2^k \vect{u}_2 \]

for a long time will be noticeable compared to the contribution of the term

\[ c_1 \lambda_1^k \vect{u}_1. \]
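This explanation can be checked numerically. The sketch below uses NumPy's `eig` routine (which returns eigenvectors of unit length) to write \(\vect{x}_0\) in the eigenvector basis and to compare the sizes of the coefficients; the variable names are just choices made for this check.

```python
import numpy as np

A = np.array([[3.0,  4.0, -1.0,  3.0],
              [4.0, -2.0,  3.0,  2.0],
              [2.0,  1.0,  6.0,  3.0],
              [3.0,  4.0,  2.0, -7.0]])
x0 = np.array([2.0, 2.0, 4.0, 1.0])

lam, V = np.linalg.eig(A)          # columns of V are unit eigenvectors
order = np.argsort(-np.abs(lam))   # sort by decreasing absolute value
lam, V = lam[order], V[:, order]

c = np.linalg.solve(V, x0)         # coefficients of x0 in the eigenvector basis

print(np.abs(lam[1]) / np.abs(lam[0]))   # the ratio |lambda_2| / |lambda_1|, close to 1
print(np.abs(c[0]), np.abs(c[1]))        # |c_1| should be small compared to |c_2|
```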

9.3.2. Some Extensions#

In the previous subsection the power method was used to find the dominant (real) eigenvalue of a matrix \(A\).
In this subsection we will consider two extensions:

— To find the smallest eigenvalue of a matrix \(A\).

— To find further eigenvalues of \(A\) by using a shift of \(A\), possibly in combination with the inverse power method.

The first issue is covered by the next proposition.

Proposition 9.3.2 (Inverse Power Method)

Suppose \(A\) is an \(n\times n\) matrix with \(n\) eigenvalues ordered via

\[ |\lambda_1| \geq |\lambda_2| \geq \ldots \geq |\lambda_{n-1}| > |\lambda_{n}| > 0. \]

Here, again, \(|\lambda|\) denotes the modulus of \(\lambda\).
Note that the two strict inequalities at the end imply that \(A\) has a unique eigenvalue of smallest modulus, and that this eigenvalue is nonzero.
Since \(0\) is not an eigenvalue, we may conclude that \(A\) is invertible.

Then the power method applied to \(A^{-1}\) converges (apart from the usual exceptional cases) to an eigenvector \(\vect{v}_n\) for the smallest eigenvalue \(\lambda_n\).

Proof. We make use of the property in Exercise 6.1.1 in Section 6.1, which states that \(A^{-1}\) has the eigenvalues

\[ \frac{1}{\lambda_1}, \,\frac{1}{\lambda_2},\,\ldots\,,\,\frac{1}{\lambda_n} \]

where the order of the moduli is

\[ \frac{1}{|\lambda_1|}\leq \frac{1}{|\lambda_{2}|} \leq \,\ldots\, \leq \,\frac{1}{|\lambda_{n-1}|} \,< \, \frac{1}{|\lambda_n|}. \]

Thus, \(A^{-1}\) has the dominant eigenvalue \(\dfrac{1}{\lambda_n} = \lambda_n^{-1}\).

Moreover, in the same exercise it is stated that the eigenvectors of \(A^{-1}\) for eigenvalue \(\lambda_n^{-1}\) are exactly the eigenvectors of \(A\) for eigenvalue \(\lambda_n\).

Thus if we apply the algorithm of the power method to the matrix \(A^{-1}\), we will find an eigenvector \(\mathbf{v}_n\) of \(A^{-1}\) for the eigenvalue \(\lambda_n^{-1}\). This is then also an eigenvector of \(A\) for the smallest eigenvalue \(\lambda_n\).

Remark 9.3.2

From a computational point of view, in the case of a large matrix \(A\), it might be advantageous to compute \(\vect{y}_{k+1}\) from \(\vect{y}_{k}\) by solving the equation

\[ A\vect{y}_{k+1} = \vect{y}_k, \]

rather than computing

\[ \vect{y}_{k+1} = A^{-1}\vect{y}_k. \]

If the matrix \(A\) is sparse, i.e., has relatively few non-zero entries, its inverse may very well be a ‘full’ matrix.
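A minimal sketch of the inverse power method along these lines, where every iteration solves a linear system instead of multiplying by \(A^{-1}\) (the function name and the stopping rule are choices made for this sketch):

```python
import numpy as np

def inverse_power_method(A, x0, tol=1e-4, max_iter=1000):
    """Power method applied to A^{-1}, without forming the inverse.

    Returns (mu, x, k): mu approximates the dominant eigenvalue of A^{-1},
    so 1/mu approximates the eigenvalue of A of smallest absolute value,
    and x approximates a corresponding eigenvector.
    """
    x = np.array(x0, dtype=float)
    for k in range(1, max_iter + 1):
        y = np.linalg.solve(A, x)      # solve A y = x, i.e. y = A^{-1} x
        mu = y[np.argmax(np.abs(y))]
        x_new = y / mu
        if np.linalg.norm(x_new - x) <= tol:
            return mu, x_new, k
        x = x_new
    return mu, x, max_iter
```

For a truly large matrix one would factor \(A\) once, for instance with an LU decomposition, and reuse that factorization in every iteration; the `np.linalg.solve` call above refactors \(A\) in each step and is only meant to keep the sketch short.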

Example 9.3.4

Let \(A\) be the matrix   \(\begin{bmatrix}5 & 2 \\ 2 & 8 \end{bmatrix}\) from Example 9.3.1.
We saw that \(A\) has the eigenvalues \(\lambda_1 = 9\) and \(\lambda_2 = 4\).

If we apply the Inverse Power Method, starting from

\[\begin{split} \vect{x}_0 = \begin{bmatrix} 1 \\ 1 \end{bmatrix} \end{split}\]

we find

\[\begin{split} \vect{x}_{12} = \begin{bmatrix} 1.0000 \\ -0.4998 \end{bmatrix}. \end{split}\]

This is indeed very close to the eigenvector

\[\begin{split} \vect{v}_2 = \begin{bmatrix} 1.0 \\ -0.5 \end{bmatrix} \end{split}\]

for the eigenvalue \(4\) of the matrix \(A\).
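The same computation in a few self-contained lines of NumPy (a sketch; the twelve iterations correspond to the \(\vect{x}_{12}\) reported above):

```python
import numpy as np

A = np.array([[5.0, 2.0],
              [2.0, 8.0]])
x = np.array([1.0, 1.0])

for k in range(12):
    y = np.linalg.solve(A, x)      # y = A^{-1} x, computed without forming A^{-1}
    mu = y[np.argmax(np.abs(y))]
    x = y / mu

print(x)        # should be close to [1, -0.5]
print(1 / mu)   # mu approximates 1/4, so 1/mu should be close to 4
```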

By combining the power method (or the inverse power method) with a shift, we can often find one or more eigenvalues other than the largest and the smallest.

The starting point is that if a matrix \(A\) has the eigenvalues

\[ \lambda_1, \lambda_2, \ldots, \lambda_n \]

then the matrix \(A - \alpha I\), where \(\alpha\) is some constant, has the eigenvalues

\[ (\lambda_1-\alpha), (\lambda_2-\alpha), \ldots, (\lambda_n-\alpha). \]

We can use this in the following way.

If \(\alpha\) is an approximation of the largest eigenvalue \(\lambda_1\), then \((\lambda_1-\alpha)\) will be the eigenvalue of the shifted matrix \(B = A - \alpha I\) that is closest to zero, so it will certainly not be the dominant eigenvalue of \(B\).

If we then apply the power method to \(B\), and \(B\) has a dominant real eigenvalue, we will find an eigenvector of \(A\) different from \(\vect{v}_1\). It might be an eigenvector for the smallest eigenvalue, but with some luck it gives us a new eigenvalue and eigenvector.

Example 9.3.5

Let us illustrate matters using the matrix \(A = \begin{bmatrix} 2 & 1 & 0 & 0 & 0 & 0 \\ 1 & 2 & 1 & 0 & 0 & 0 \\ 0 & 1 & 2 & 1 & 0 & 0 \\ 0 & 0 & 1 & 2 & 1 & 0 \\ 0 & 0 & 0 & 1 & 2 & 1 \\ 0 & 0 & 0 & 0 & 1 & 2 \end{bmatrix}\).

If we apply the power method, repeatedly multiplying the vector \(\mathbf{x}_0 = \begin{bmatrix} 1 \\ 1 \\ 1 \\ 0 \\ 0 \\ 0 \end{bmatrix}\) by the matrix \(A\) (and rescaling), we find that

\[\begin{split} \mathbf{x}_{25} = \begin{bmatrix} 0.4566 \\ 0.8143 \\ 1.0000 \\ 0.9829 \\ 0.7759 \\ 0.4259 \end{bmatrix}, \quad \mathbf{x}_{26} = \begin{bmatrix} 0.4550 \\ 0.8125 \\ 1.0000 \\ 0.9854 \\ 0.7797 \\ 0.4286 \end{bmatrix}, \quad \text{and} \quad A\mathbf{x}_{26} = \begin{bmatrix} 1.7224 \\ 3.0799 \\ 3.7979 \\ 3.7505 \\ 2.9734 \\1.6369 \end{bmatrix} \end{split}\]

and we may conclude that we are close to an eigenvector for the largest eigenvalue \(\lambda_1\), which appears to lie close to \(3.7979\).

If we round this to \(\alpha = 4\) and apply the power method to the matrix \(B = A - 4I\), starting from the same vector \(\vect{x}_0\), the computations yield

\[\begin{split} \mathbf{x}_{50} = \begin{bmatrix} 0.4451 \\ -0.8020 \\ 1.0000 \\ -0.9999 \\ 0.8017 \\ -0.4449 \end{bmatrix}, \quad \mathbf{x}_{51} = \begin{bmatrix} 0.4451 \\ -0.8020 \\ 1.0000 \\ -0.9999 \\ 0.8018 \\ -0.4449 \end{bmatrix}, \quad \text{and} \quad B\mathbf{x}_{51} = \begin{bmatrix} -1.6922 \\ 3.0491 \\ -3.8019 \\ 3.8016 \\ -3.0484 \\ 1.6916 \end{bmatrix}. \end{split}\]

From this we may conclude that \(\mathbf{x}_{50}\) is close to an eigenvector for \(\mu = -3.8019\), which is the dominant eigenvalue of \(B = A- 4I\).
Thus \(\mathbf{x}_{50}\) is close to an eigenvector of \(A\) for the eigenvalue \(\lambda = \mu + 4 = 0.1981\).
In this case we have in fact found the smallest eigenvalue of \(A\), which we could also have found using the inverse power method.
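The whole two-stage procedure of this example fits in a few lines of NumPy. The sketch below simply uses fixed iteration counts that are large enough; the printed values should be close to the ones reported above.

```python
import numpy as np

def scaled_power_iterations(M, x, steps):
    """Run a fixed number of power method steps with rescaling."""
    for _ in range(steps):
        y = M @ x
        mu = y[np.argmax(np.abs(y))]
        x = y / mu
    return mu, x

# The tridiagonal matrix of this example and the initial vector
A = np.diag(np.full(6, 2.0)) + np.diag(np.ones(5), 1) + np.diag(np.ones(5), -1)
x0 = np.array([1.0, 1.0, 1.0, 0.0, 0.0, 0.0])

mu1, _ = scaled_power_iterations(A, x0, 30)     # stage 1: estimate lambda_1 (about 3.80)
alpha = round(mu1)                              # round the estimate, here to 4
mu2, v = scaled_power_iterations(A - alpha * np.eye(6), x0, 60)   # stage 2: shifted matrix

print(mu2 + alpha)   # dominant eigenvalue of B plus the shift: close to 0.1981
print(v)             # close (up to sign) to the eigenvector x_50 shown above
```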

Another possibility, which combines the above techniques, is the following.
If we have some idea of the location \(\alpha\) of an eigenvalue \(\lambda\), then \( \lambda - \alpha\) will be the eigenvalue of \(A - \alpha I\) that is closest to zero. So if we apply the inverse power method to the matrix \(A - \alpha I\) we will find an eigenvector for the eigenvalue \(\lambda\) that is closest to \(\alpha\).
For a not too large matrix \(A\), some idea of the location(s) of the real eigenvalues can, for instance, be found by considering the graph of its characteristic polynomial.

Example 9.3.6

Consider the matrix \(A = \begin{bmatrix} 3 & -3 & 1 & 3 \\ 0 & -2 & -3 & -3 \\ -2 & 4 & -2 & -3 \\ 0 & 4 & -1 & -2 \end{bmatrix}\).
It can be shown that the characteristic polynomial \(p_A(\lambda)\) of \(A\) changes sign between \(\lambda = -1\) and \(\lambda = 0\), and also between \(\lambda = 2\) and \(\lambda = 3\).

If we apply the inverse power method to the matrix \(B = A - 3I\), starting from the vector \(\vect{x}_0 = \begin{bmatrix} 1 \\ 0 \\ 0 \\ 1 \end{bmatrix}\) we find, up to four decimals

\[\begin{split} \mathbf{x}_{40} = \mathbf{x}_{41} = \begin{bmatrix} 1.0000 \\ 0.1407 \\ -0.4324 \\ 0.2086 \end{bmatrix}, \quad \text{and} \quad B\mathbf{x}_{40} = \begin{bmatrix} -0.2287 \\ -0.0322 \\ 0.0989 \\ -0.0477 \end{bmatrix}. \end{split}\]

From this we conclude that \(\vect{x}_{40}\) is close to an eigenvector of \(A\) for the eigenvalue \(\lambda \approx -0.2287 + 3 = 2.7713\).
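A sketch of this computation with NumPy, applying the inverse power method to \(B = A - 3I\) by solving a linear system in each step; forty iterations matches the \(\vect{x}_{40}\) above, and the printed values should agree up to rounding.

```python
import numpy as np

A = np.array([[ 3.0, -3.0,  1.0,  3.0],
              [ 0.0, -2.0, -3.0, -3.0],
              [-2.0,  4.0, -2.0, -3.0],
              [ 0.0,  4.0, -1.0, -2.0]])
alpha = 3.0
B = A - alpha * np.eye(4)

x = np.array([1.0, 0.0, 0.0, 1.0])
for k in range(40):
    y = np.linalg.solve(B, x)      # one inverse power step for the shifted matrix B
    mu = y[np.argmax(np.abs(y))]
    x = y / mu

z = B @ x
nu = z[np.argmax(np.abs(z))]       # approximates the eigenvalue of B closest to zero
print(x)                           # should be close to x_40 above
print(nu + alpha)                  # eigenvalue of A closest to alpha: about 2.7713
```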

In fact, using any computer algebra program we can find that the eigenvalues of \(A\) are

\[ \lambda_{1,2}=-2.6489 \pm 4.8444i, \quad \lambda_3 = 2.7713,\quad \lambda_4 = -0.4735. \]

Since \(|\lambda_{1,2}| > |\lambda_{3}| > |\lambda_{4}|\), the eigenvalue \(\lambda_3\) would not have been found using either the power method or the inverse power method applied directly to \(A\).
