3.6. The \(LU\) decomposition#

As we have seen, one way to solve a linear system \(A\vect{x}=\vect{b}\) is to row reduce it to echelon form and then use back substitution. See for instance Example 2.1.8.
In this section we will learn how to solve an \(m\times n\) linear system \(A\mathbf{x}=\mathbf{b}\) by decomposing (or factorising) the matrix \(A\) into a product of two ‘special’ matrices \(L\) and \(U\). This is called an \(LU\) decomposition of the matrix \(A\).

Example 3.6.1

Consider the system \(A\mathbf{x} = \mathbf{b}\) where

\[\begin{split} A = \begin{bmatrix} 1 & 1 & -1 \\ 2 & 4 & -3 \\ 1 & -1 & -3 \end{bmatrix}, \qquad \mathbf{b} = \begin{bmatrix} 2 \\ 1 \\8 \end{bmatrix}. \end{split}\]

To start with, the matrix \(A\) can be factorized as

\[\begin{split} A = \begin{bmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ 1 & -1 & 1 \end{bmatrix} \begin{bmatrix} 1 & 1 & -1 \\ 0 & 2 & -1 \\ 0 & 0 & -3 \end{bmatrix} = LU. \end{split}\]

We will see how this helps to solve the system in an efficient way.

With this factorisation we can solve the system in two steps, using forward and backward substitution. In fact, the matrix \(U\) is an echelon matrix equivalent to \(A\). To solve the equation \(LU\vect{x} = L(U\vect{x}) =\vect{b}\) we can introduce the auxiliary vector \(\vect{y}\) via

\[ \vect{y} = U\vect{x}. \]

We then first solve

(3.6.1)#\[ L\vect{y} = \vect{b}\]

and after that we find \(\mathbf{x}\) by solving

\[ U\vect{x} = \tilde{\vect{y}}, \]

where \( \tilde{\vect{y}}\) is the solution of system (3.6.1). For the solution \( \tilde{\vect{x}}\) of this second system we then have

\[ A \tilde{\vect{x}} = L\,U\, \tilde{\vect{x}} = L(U\tilde{\vect{x}})= L\tilde{\vect{y}} = \vect{b}. \]

We illustrate matters with the linear system of Example 3.6.1.

Example 3.6.2

The system \(A\vect{x} = \vect{b}\) of Example 3.6.1 can be written as

\[\begin{split} \begin{bmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ 1 & -1 & 1 \end{bmatrix} \begin{bmatrix} 1 & 1 & -1 \\ 0 & 2 & -1 \\ 0 & 0 & -3 \end{bmatrix} \begin{bmatrix} x_1\\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 2 \\ 1 \\8 \end{bmatrix}. \end{split}\]

By setting

\[\begin{split} \mathbf{y} = \begin{bmatrix} 1 & 1 & -1 \\ 0 & 2 & -1 \\ 0 & 0 & -3 \end{bmatrix} \begin{bmatrix} x_1\\ x_2 \\ x_3 \end{bmatrix}, \end{split}\]

we have

\[\begin{split} \begin{bmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ 1 & -1 & 1 \end{bmatrix} \begin{bmatrix} y_1\\y_2\\y_3 \end{bmatrix} = \begin{bmatrix} 2 \\ 1 \\ 8 \end{bmatrix}, \quad \text{i.e.,}\,\, \left\{\begin{array}{ccccccc} y_1 & & & & &=& 2 \\ 2y_1 &+& y_2 & & &=& 1 \\ y_1 &-& y_2 &+& y_3 &=& 8. \end{array} \right.\end{split}\]

With forward substitution we can find the solution:

\[\begin{split} \begin{array}{ll} y_1 = 2 & \Longrightarrow \quad y_2 = 1 - 2y_1 = -3 \\ & \Longrightarrow \quad y_3 = 8 - y_1 + y_2 = 3 \quad \Longrightarrow \quad \begin{bmatrix} y_1\\y_2\\y_3 \end{bmatrix} = \begin{bmatrix} 2 \\ -3 \\ 3 \end{bmatrix}. \end{array} \end{split}\]

Next we solve

\[\begin{split} \begin{bmatrix} 1 & 1 & -1 \\ 0 & 2 & -1 \\ 0 & 0 & -3 \end{bmatrix} \begin{bmatrix} x_1\\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 2 \\ -3 \\ 3 \end{bmatrix}, \end{split}\]

with back substitution. This gives the solution

\[\begin{split} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 3\\ -2 \\ -1 \end{bmatrix}, \end{split}\]

which is the solution to our original system. Since the factorisation was given, we did not have to solve the system from scratch by trying to find an echelon form.
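
For readers who like to verify such computations numerically, here is a minimal sketch in Python (assuming NumPy and SciPy are available) that performs the two substitution steps for this example with `scipy.linalg.solve_triangular`:

```python
import numpy as np
from scipy.linalg import solve_triangular

L = np.array([[1, 0, 0],
              [2, 1, 0],
              [1, -1, 1]], dtype=float)
U = np.array([[1, 1, -1],
              [0, 2, -1],
              [0, 0, -3]], dtype=float)
b = np.array([2, 1, 8], dtype=float)

y = solve_triangular(L, b, lower=True)   # forward substitution: [2, -3, 3]
x = solve_triangular(U, y, lower=False)  # back substitution: [3, -2, -1]
print(np.allclose(L @ U @ x, b))         # True: x solves A x = b
```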

Here is one to try for yourself.

Grasple Exercise 3.6.1

https://embed.grasple.com/exercises/b9a339dc-721f-4c85-aa77-3455269a3b7a?id=105809

To solve a system \(LU\vect{x} = \vect{b}\).

There are several methods for factorising matrices. The factorisations that we will see in this section are computed by direct methods, that is, methods that produce the exact factorisation in a finite number of steps.

In the next subsection we will address the questions of whether an \(LU\) decomposition always exists and, if so, how to construct it. With an extra condition on \(L\) the decomposition, if it exists, will be unique. In the case where an \(LU\) decomposition does not exist we can instead consider the slightly more general \(PLU\) decomposition. In the remainder of the section we will consider the generalization to non-square matrices and we will analyze to what extent the \((P)LU\) decomposition can give an efficiency boost.

3.6.1. \(LU\) decomposition of a square matrix#

Definition 3.6.1

Let \(A\) be an \(n\times n\) matrix. An \(LU\) decomposition of \(A\) is a factorization of the type

\[ A=LU \]

where \(L\) is an \(n\times n\) lower triangular matrix with \(1\)s on the diagonal, and \(U\) an \(n\times n\) upper triangular matrix. So,

\[\begin{split} L= \begin{bmatrix} 1\\ l_{21} & 1 \\ l_{31} & l_{32} & \ddots \\ \vdots & \vdots & \ddots & \ddots\\ l_{n1} & l_{n2} & \cdots & l_{n,n-1} & 1 \end{bmatrix} , \quad U= \begin{bmatrix} u_{11} & u_{12} & u_{13} & \cdots &u_{1n} \\ & u_{22} & u_{23} & \cdots & u_{2n} \\ & & u_{33} & \cdots & u_{3n} \\ & & & \ddots & \vdots \\ & & & & u_{nn} \end{bmatrix}. \end{split}\]

For convenience we have used blanks for zeros.

Note that the condition of \(1\)s on the diagonal implies that the matrix \(L\) is invertible.

The next algorithm captures the main interest of the \(LU\) decomposition, as was already illustrated in Example 3.6.2.

Algorithm 3.6.1

Suppose that \(A=LU\), so that the linear system of equations \(A\mathbf{x}=\mathbf{b}\) can be written as \(LU\mathbf{x}=\mathbf{b}\). Then, by setting \(\mathbf{y} = U\mathbf{x}\), we can solve the linear system in two steps.

  1. Solve the system \(L\mathbf{y}=\mathbf{b}\) and find \(\mathbf{y}\).
    Note that this system has a unique solution.

  2. Solve the system \(U\mathbf{x}=\mathbf{y}\) to find \(\mathbf{x}\).

The solution of the second system is then a solution of the system \(A\mathbf{x}=\mathbf{b}\).
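
As a sketch of what these two steps look like in code, the following self-contained Python functions implement forward and backward substitution (the function names are our own; \(L\) is assumed to have a unit diagonal and \(U\) a nonzero diagonal):

```python
import numpy as np

def forward_substitution(L, b):
    """Solve L y = b for a lower triangular L with 1s on the diagonal."""
    n = len(b)
    y = np.zeros(n)
    for i in range(n):
        # subtract the already-known terms; no division needed (diagonal is 1)
        y[i] = b[i] - L[i, :i] @ y[:i]
    return y

def backward_substitution(U, y):
    """Solve U x = y for an upper triangular U with nonzero diagonal."""
    n = len(y)
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        x[i] = (y[i] - U[i, i + 1:] @ x[i + 1:]) / U[i, i]
    return x

def solve_via_lu(L, U, b):
    """Step 1 followed by step 2 of Algorithm 3.6.1."""
    return backward_substitution(U, forward_substitution(L, b))
```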

The next proposition tells us which matrices do have an \(LU\) decomposition.

Proposition 3.6.1

A matrix \(A\) can be written as \(A = LU\), with \(L\) and \(U\) as described in Definition 3.6.1, if and only if it can be row reduced to an echelon matrix using only the operation of adding multiples of a row to rows below it. We will call this a top-down row reduction.

Instead of giving a formal proof, we will first illustrate matters with an example. In this example you will see how an \(LU\) decomposition of a matrix is found via a top-down row reduction that keeps track of the row operations involved.

Example 3.6.3

We consider the matrix

\[\begin{split} A= \begin{bmatrix} 3 & 1 & -2 \\ 2 & 4 & 1 \\ 1 & 2 & 1 \end{bmatrix} . \end{split}\]

To find an \(LU\) decomposition we will work towards an echelon form in the top-down direction. In addition, we will keep track of the actions we perform. For the first step,

\[\begin{split}\left[\begin{array}{rrr}3 & 1 & -2\\2 & 4 & 1\\1 & 2 & 1\end{array} \right] \begin{array}{l} [R_1] \\ {[R_2-2/3R_1]} \\ {[R_3-1/3R_1]} \\ \end{array} \quad\sim\quad \begin{bmatrix} 3 & 1 & -2 \\ 0 & {10}/{3} & {7}/{3} \\ 0 & {5}/{3} & {5}/{3} \end{bmatrix} = A_2. \end{split}\]

The numbers \(2/3\) and \(1/3\) are called multipliers.

A second row reduction step, involving the multiplier \(1/2\), leads to an echelon/upper triangular matrix.

\[\begin{split}\left[\begin{array}{rrr}3 & 1 & -2\\0 &{10}/{3}&{7}/{3}\\0 &{5}/{3}&{5}/{3}\end{array} \right] \begin{array}{l} [R_1] \\ {[R_2]} \\ {[R_3-1/2R_2]} \\ \end{array}\sim \begin{bmatrix} 3 & 1 & -2 \\ 0 & {10}/{3} & {7}/{3} \\ 0 & 0 & {1}/{2} \end{bmatrix} = A_3. \end{split}\]

We can effectuate these row operations by matrix multiplications.

\[\begin{split} A_2 = F_1A = \begin{bmatrix} 1 & 0 & 0 \\ -2/3 & 1 & 0 \\ -1/3 & 0 & 1 \end{bmatrix} \begin{bmatrix} 3 & 1 & -2 \\ 2 & 4 & 1 \\ 1 & 2 & 1 \end{bmatrix} = \begin{bmatrix} 3 & 1 & -2 \\ 0 & {10}/{3} & {7}/{3} \\ 0 & {5}/{3} & {5}/{3} \end{bmatrix}. \end{split}\]

In the same way the second row reduction step can be described by

\[\begin{split} A_3 = F_2A_2 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -1/2 & 1 \end{bmatrix} \begin{bmatrix} 3 & 1 & -2 \\ 0 & {10}/{3} & {7}/{3} \\ 0 & {5}/{3} & {5}/{3} \end{bmatrix} = \begin{bmatrix} 3 & 1 & -2 \\ 0 & {10}/{3} & {7}/{3} \\ 0 & 0 & {1}/{2} \end{bmatrix}.\end{split}\]

The echelon matrix \(A_3\) can act as our upper triangular matrix \(U\), and the above computations can be summarized as

\[\begin{split} U = A_3 = F_2F_1A = FA = \begin{bmatrix} 1 & 0 & 0 \\ -2/3 & 1 & 0 \\ 0 & -1/2 & 1 \end{bmatrix} A, \end{split}\]

where \(F = F_2F_1\).

We conclude that

\[\begin{split} A = F^{-1} U = \begin{bmatrix} 1 & 0 & 0 \\ 2/3 & 1 & 0 \\ 1/3 & 1/2 & 1 \end{bmatrix} \begin{bmatrix} 3 & 1 & -2 \\ 0 & {10}/{3} & {7}/{3} \\ 0 & 0 & {1}/{2} \end{bmatrix}\end{split}\]

which is indeed a product of a lower triangular matrix \(L\) (with 1’s on its diagonal) and an upper triangular matrix \(U\).

The above procedure shows how in principle an \(LU\) decomposition can be computed. However, writing down the explicit elementary-like matrices \(F_i\), computing their product \(F\), and finding the inverse of \(F\) is unnecessary. Look at the matrix \(L = F^{-1}\) we found! The entries below the diagonal are exactly the numbers \(2/3\), \(1/3\) and \(1/2\) that we called the multipliers. The following algorithm describes this ‘shortcut’ to find an \(LU\) decomposition.

Algorithm 3.6.2

Suppose the \(n\times n\) matrix \(A\) can be row reduced top-down to the echelon matrix \(U\). If the numbers \(m_{jk}\) denote the multiples of the \(k\)th row that are subtracted from the rows below it in the \(k\)th step (so \(1 \leq k < j \leq n\)),   then

\[\begin{split} A = LU, \quad \text{for} \,\, L = \begin{bmatrix} 1\\ m_{21} & 1 \\ m_{31} & m_{32} & \ddots \\ \vdots & \vdots & \ddots & \ddots\\ m_{n1} & m_{n2} & \cdots & m_{n,n-1} & 1 \end{bmatrix}. \end{split}\]
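
A direct Python translation of this algorithm might look as follows. It is a sketch without row swaps: it assumes (and checks) that every pivot it encounters is nonzero, which corresponds to the top-down row reduction succeeding.

```python
import numpy as np

def lu_no_pivot(A):
    """Top-down row reduction of a square matrix, storing each
    multiplier m_jk in L. A sketch: no row exchanges are performed."""
    U = np.array(A, dtype=float)
    n = U.shape[0]
    L = np.eye(n)
    for k in range(n - 1):
        if U[k, k] == 0.0:
            raise ValueError("zero pivot: no LU decomposition (try PLU)")
        for j in range(k + 1, n):
            m = U[j, k] / U[k, k]        # the multiplier m_jk
            L[j, k] = m                  # it lands below the diagonal of L
            U[j, k:] -= m * U[k, k:]     # R_j <- R_j - m_jk * R_k
    return L, U
```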

Another look at Example 3.6.3 explains why the detour via the matrices \(F_i\) and the inverse of their product can be skipped, as is expressed in the above algorithm.

Example 3.6.4

For the matrix \(A= \begin{bmatrix} 3 & 1 & -2 \\ 2 & 4 & 1 \\ 1 & 2 & 1 \end{bmatrix}\), the same as in Example 3.6.3, the ‘trick’ is not to work with the matrices \(F_1\) and \(F_2\), but with their inverses.

The first row reduction step

\[\begin{split} A_2 = F_1A = \begin{bmatrix} 1 & 0 & 0 \\ -2/3 & 1 & 0 \\ -1/3 & 0 & 1 \end{bmatrix} \begin{bmatrix} 3 & 1 & -2 \\ 2 & 4 & 1 \\ 1 & 2 & 1 \end{bmatrix} = \begin{bmatrix} 3 & 1 & -2 \\ 0 & {10}/{3} & {7}/{3} \\ 0 & {5}/{3} & {5}/{3} \end{bmatrix} \end{split}\]

can be rewritten as

\[\begin{split} A = F_1^{-1}A_2 = L_1A_2 = \begin{bmatrix} 1 & 0 & 0 \\ 2/3 & 1 & 0 \\ 1/3 & 0 & 1 \end{bmatrix} \begin{bmatrix} 3 & 1 & -2 \\ 0 & {10}/{3} & {7}/{3} \\ 0 & {5}/{3} & {5}/{3} \end{bmatrix} \end{split}\]

Likewise the second row reduction step

\[\begin{split} A_3 = F_2A_2 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -1/2 & 1 \end{bmatrix} \begin{bmatrix} 3 & 1 & -2 \\ 0 & {10}/{3} & {7}/{3} \\ 0 & {5}/{3} & {5}/{3} \end{bmatrix} = \begin{bmatrix} 3 & 1 & -2 \\ 0 & {10}/{3} & {7}/{3} \\ 0 & 0 & {1}/{2} \end{bmatrix} \end{split}\]

can be represented as

\[\begin{split} A_2 = F_2^{-1}A_3 = L_2U = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 1/2 & 1 \end{bmatrix} \begin{bmatrix} 3 & 1 & -2 \\ 0 & {10}/{3} & {7}/{3} \\ 0 & {5}/{3} & {5}/{3} \end{bmatrix}.\end{split}\]

Combining the two equations gives

\[\begin{split} A = L_1L_2U = LU = \begin{bmatrix} 1 & 0 & 0 \\ 2/3 & 1 & 0 \\ 1/3 & 1/2 & 1 \end{bmatrix} \begin{bmatrix} 3 & 1 & -2 \\ 0 & {10}/{3} & {7}/{3} \\ 0 & 0 & {1}/{2} \end{bmatrix}. \end{split}\]

We see the multipliers nicely fall into place!

Here is an example where we apply the algorithm without further ado to a \(4 \times 4\) matrix.

Example 3.6.5

We use the \(LU\)-algorithm for the matrix \(A = \begin{bmatrix} 2 & 1 & -2 & 3 \\ 2 & -3 & -4 & 7 \\ -4 & 0 & 2 & -5 \\ 6 & 1 & -8 & 8 \end{bmatrix}.\)

We row reduce the matrix \(A\) top-down, diligently registering the multipliers. Recall that the multiplier \(m_{jk}\) is defined as the multiple of row \(k\) that is subtracted from row \(j\) in the \(k\)-th step of the reduction process.

Here we go:

\[\begin{split} \left[\begin{array}{rrrr} 2 & 1 & -2 & 3 \\ 2 & -3 & -4 & 7 \\ -4 & 0 & 2 & -5 \\ 6 & 1 & -8 & 8\end{array} \right] \begin{array}{l} [R_1] \\ {[R_2-\class{red}{1}R_1]} \\ {[R_3-\class{red}{(-2)}R_1]} \\ {[R_4-\class{red}{3}R_1]} \\ \end{array} \sim \left[\begin{array}{rrrr} 2 & 1 & -2 & 3 \\ 0 & -4 & -2 & 4 \\ 0 & 2 & -2 & 1 \\ 0 & -2 & -2 &-1\end{array} \right] \begin{array}{l} [R_1] \\ [R_2] \\ {[R_3-\class{blue}{(-1/2)}R_2]} \\ {[R_4-\class{blue}{1/2}R_2]} \\ \end{array} \end{split}\]
\[\begin{split} \sim \left[\begin{array}{rrrr} 2 & 1 & -2 & 3 \\ 0 & -4 & -2 & 4 \\ 0 & 0 & -3 & 3 \\ 0 & 0 & -1 & -3 \end{array} \right] \begin{array}{l} [R_1] \\ [R_2] \\ {[R_3]} \\ {[R_4-\class{green}{1/3}R_3]} \\ \end{array} \,\,\sim\,\, \left[\begin{array}{rrrr} 2 & 1 & -2 & 3 \\ 0 & -4 & -2 & 4 \\ 0 & 0 & -3 & 3 \\ 0 & 0 & 0 & -4 \end{array} \right] \,\,=\,\, U. \end{split}\]

Putting every multiplier in its right place gives

\[\begin{split} L = \left[\begin{array}{rrrr} 1 & 0 & 0 & 0 \\ \class{red}{1} & 1 & 0 & 0 \\ \class{red}{-2} & \class{blue}{-1/2} & 1 & 0 \\ \class{red}{3} & \class{blue}{1/2} & \class{green}{1/3} & 1 \end{array} \right]. \end{split}\]
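
Assuming the `lu_no_pivot` sketch given after Algorithm 3.6.2 is in scope, one can check that it reproduces these factors:

```python
import numpy as np

A = np.array([[2, 1, -2, 3],
              [2, -3, -4, 7],
              [-4, 0, 2, -5],
              [6, 1, -8, 8]], dtype=float)
L, U = lu_no_pivot(A)
print(L)                      # the multipliers below the diagonal, as above
print(np.allclose(L @ U, A))  # True
```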

Example 3.6.6

For the matrix \(A= \begin{bmatrix} 1 & 2 & 1 \\ 4 & 8 & 6 \\ 2 & 5 & 7 \end{bmatrix}\) the procedure breaks down at the second step.

\[\begin{split} \begin{bmatrix} 1 & 2 & 1 \\ 4 & 8 & 6 \\ 2 & 5 & 7 \end{bmatrix} \begin{array}{l} [R_1] \\ {[R_2-4R_1]} \\ {[R_3-2R_1]} \end{array} \quad \sim \quad \begin{bmatrix} 1 & 2 & 1 \\ 0 & 0 & 2 \\ 0 & 1 & 5 \end{bmatrix}. \end{split}\]

The next logical step towards an echelon matrix would be a row swap. But then the lower triangular structure of \(L\) will be broken. In Subsection 3.6.2 we will study what we can do in such a situation.

Here’s one to try for yourself.

Grasple Exercise 3.6.2

https://embed.grasple.com/exercises/089f5e1e-3e96-4be3-9c9c-3422a847ec72?id=109255

To compute the \(LU\) decomposition of a 3x3 matrix \(A\).

We collect a few properties of triangular matrices that are of value in their own right and, moreover, will be used in the proofs of two properties of the \(LU\) decomposition.

Proposition 3.6.2

Suppose \(A\) and \(B\) are lower triangular \(n \times n\) matrices with \(1\)s on their diagonals. Then the following properties hold.

  1. \(AB\) is also a lower triangular matrix with \(1\)s on its diagonal.

  2. \(A^{-1}\) is also a lower triangular matrix with \(1\)s on its diagonal.

The same properties hold when each instance of ‘lower’ in the above is replaced by ‘upper’.
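
A quick numerical illustration (not a proof, of course):

```python
import numpy as np

A = np.array([[1, 0, 0], [2, 1, 0], [3, 4, 1]], dtype=float)
B = np.array([[1, 0, 0], [5, 1, 0], [6, 7, 1]], dtype=float)

print(A @ B)             # again unit lower triangular
print(np.linalg.inv(A))  # again unit lower triangular
```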

There are several ways to prove Proposition 3.6.2. The best would be to think of a proof yourself, but you can also have a peek at the exposition below.

Readers who are not satisfied with two examples to show that Algorithm 3.6.2 works can have a look at the following (informal) proof of Proposition 3.6.1.

The uniqueness issue is not of major importance, but for completeness’ sake we phrase it in a proposition, followed by a possible argument to prove it.

Proposition 3.6.3

If an invertible matrix \(A\) has an \(LU\) decomposition, then this decomposition is unique.

Lastly we give an example to show that the uniqueness may fail for a singular matrix.

Example 3.6.7

For the matrix \(A = \begin{bmatrix} 1 & 1 & 1 \\ 2 & 2 & 2 \\ 3 & 3 & 3 \end{bmatrix}\) it holds that

\[\begin{split} A = \begin{bmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ 3 & a & 1 \end{bmatrix}\begin{bmatrix} 1 & 1 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \end{split}\]

where the parameter \(a\) can be chosen freely.
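
A quick check in Python confirms that the product is indeed independent of \(a\):

```python
import numpy as np

A = np.array([[1, 1, 1], [2, 2, 2], [3, 3, 3]], dtype=float)
U = np.array([[1, 1, 1], [0, 0, 0], [0, 0, 0]], dtype=float)
for a in (0.0, 1.0, -3.5):
    L = np.array([[1, 0, 0], [2, 1, 0], [3, a, 1]])
    assert np.allclose(L @ U, A)  # holds for every a
```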

3.6.2. Generalization to non-square matrices and \(PLU\) decomposition#

In this subsection we describe what can be said regarding \(LU\) decompositions of non-square matrices, and we present a ‘workaround’ for matrices for which there is no top-down row reduction to echelon form.

We start with an example to show that not much changes if we apply Algorithm 3.6.2 to non-square matrices.

Example 3.6.8

The matrix \(A\) is given by

\[\begin{split} A = \begin{bmatrix} 1 & 3 & 1 & -1 \\ -1 & 1 & 1 & 2 \\ 2 & -2 & -1 & 3 \end{bmatrix}. \end{split}\]

We can row reduce \(A\) top-down to an echelon matrix:

\[\begin{split} \begin{array}{rcl} \begin{bmatrix} 1 & 3 & 1 & -1 \\ -1 & 1 & 1 & 2 \\ 2 & -2 & -1 & 3 \end{bmatrix} \begin{array}{l} [R_1] \\ {[R_2-\class{blue}{(-1)}R_1]} \\ {[R_3-\class{blue}{2}R_1]} \\ \end{array} & \sim & \begin{bmatrix} 1 & 3 & 1 & -1 \\ 0 & 4 & 2 & 1 \\ 0 & -8 & -3 & 5 \end{bmatrix} \begin{array}{l} [R_1] \\ {[R_2]} \\ {[R_3-\class{blue}{(-2)}R_2]} \\ \end{array}\\ &\sim& \begin{bmatrix} 1 & 3 & 1 & -1 \\ 0 & 4 & 2 & 1 \\ 0 & 0 & 1 & 7 \end{bmatrix}. \end{array} \end{split}\]

We then have

\[\begin{split} A = LU = \begin{bmatrix} 1 & 0 & 0 \\ \class{blue}{-1} & 1 & 0 \\ \class{blue}{2} & \class{blue}{-2} & 1 \end{bmatrix} \begin{bmatrix} 1 & 3 & 1 & -1 \\ 0 & 4 & 2 & 1 \\ 0 & 0 & 1 & 7 \end{bmatrix}. \end{split}\]

The two-step procedure as in Algorithm 3.6.1 to solve a linear system \(LU\vect{x} = \vect{b}\) is also feasible for non-square matrices, as the next example shows.

Example 3.6.9

We will solve the equation

\[ A\vect{x}= \vect{b}, \]

where

\[\begin{split} A = LU = \begin{bmatrix} 1 & 0 & 0 \\ -1 & 1 & 0 \\ 2 & -2 & 1 \end{bmatrix} \begin{bmatrix} 1 & 3 & 1 & -1 \\ 0 & 4 & 2 & 1 \\ 0 & 0 & 1 & 7 \end{bmatrix} \quad \text{and} \quad \vect{b} = \begin{bmatrix}2 \\ -2 \\ 5 \end{bmatrix}. \end{split}\]

First we solve the system

\[\begin{split} L\vect{y} = \begin{bmatrix} 1 & 0 & 0 \\ -1 & 1 & 0 \\ 2 & -2 & 1 \end{bmatrix}\vect{y} = \begin{bmatrix}2 \\ -2 \\ 5 \end{bmatrix}. \end{split}\]

Using forward substitution we find the solution

\[\begin{split} \tilde{\vect{y}} = \begin{bmatrix}2 \\ 0 \\ 1 \end{bmatrix} \end{split}\]

and then the system

\[\begin{split} U\vect{x} = \begin{bmatrix} 1 & 3 & 1 & -1 \\ 0 & 4 & 2 & 1 \\ 0 & 0 & 1 & 7\end{bmatrix}\vect{x} = \begin{bmatrix}2 \\ 0 \\ 1 \end{bmatrix} \end{split}\]

gives the solution(s)

\[\begin{split} \vect{x} = \begin{bmatrix} \tfrac52 -\tfrac{7}{4}x_4\\ -\tfrac12 +\tfrac{13}{4}x_4\\ 1 - 7x_4\\ x_4 \end{bmatrix}, \quad x_4 \,\, \text{free.} \end{split}\]

The generalization of Proposition 3.6.1 to non-square matrices is captured in the next proposition.

Proposition 3.6.4

An \(m\times n\) matrix \(A\) with \(m \leq n\) can be written as \(A = LU\), with

  • \(L = \begin{bmatrix} 1 & \\ \ell_{21} & 1 & \\ \ell_{31} & \ell_{32} & 1 \\ \vdots & \vdots & \vdots & \ddots & \\ \ell_{m1} & \ell_{m2} & \ell_{m3} & \cdots & 1 \end{bmatrix}\),

an \(m\times m\) lower triangular matrix with \(1\)s on its diagonal,

and

  • \(U =\begin{bmatrix} u_{11} & u_{12} & u_{13} & \cdots &u_{1m} & \cdots & u_{1n} \\ & u_{22} & u_{23} & \cdots & u_{2m} & \cdots & u_{2n} \\ & & u_{33} & \cdots & u_{3m} & \cdots & u_{3n} \\ & & & \ddots & \vdots \\ & & & & u_{mm} & \cdots & u_{mn} \end{bmatrix}\)   an echelon matrix

if and only if \(A\) can be row reduced top-down to the echelon matrix \(U\).

Moreover, if the rows of \(A\) are linearly independent the matrices \(L\) and \(U\) are unique.

A similar proposition holds for \(m \times n\) matrices with \(m > n\). However, in this case the systems \(A\vect{x} = \vect{b}\) are inconsistent for most vectors \(\vect{b}\), and then other techniques come into play (e.g., see Section 7.4.2).

In the remainder of this subsection we address the next best thing we can do in case a matrix does not have an \(LU\) decomposition.

Example 3.6.10

Let us return to Example 3.6.6 where the matrix \(A\) cannot be written as \(LU\). We copy from there

\[\begin{split} A \,=\, \begin{bmatrix} 1 & 2 & 1 \\ 4 & 8 & 6 \\ 2 & 5 & 7 \end{bmatrix} \begin{array}{l} [R_1] \\ {[R_2-4R_1]} \\ {[R_3-2R_1]} \end{array} \quad \sim \quad \begin{bmatrix} 1 & 2 & 1 \\ 0 & 0 & 2 \\ 0 & 1 & 5 \end{bmatrix}. \end{split}\]

We need a row swap to bring the last matrix to echelon form. The key idea is to perform all the row exchanges first and then add multiples of rows (top-down) to other rows to obtain an echelon form.
Here the one row exchange, swapping row 2 and row 3, is captured by the matrix

\[\begin{split} P = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}. \end{split}\]

We get

\[\begin{split} PA \,=\, \begin{bmatrix} 1 & 2 & 1 \\ 2 & 5 & 7 \\ 4 & 8 & 6 \end{bmatrix} \,\, = \,\, \begin{bmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ 4 & 0 & 1 \end{bmatrix}\, \begin{bmatrix} 1 & 2 & 1 \\ 0 & 1 & 5 \\ 0 & 0 & 2 \end{bmatrix}. \end{split}\]

This is what we will call a \(PLU\) decomposition (a permuted \(LU\) decomposition).
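
For comparison, SciPy computes such a decomposition with `scipy.linalg.lu`. Note that it returns factors with \(A = PLU\), so its permutation matrix is the transpose of the \(P\) in \(PA = LU\); moreover, for the numerical reasons discussed in Remark 3.6.1, it may choose different pivots than we did by hand:

```python
import numpy as np
from scipy.linalg import lu

A = np.array([[1, 2, 1],
              [4, 8, 6],
              [2, 5, 7]], dtype=float)
p, l, u = lu(A)                    # factors with A = p @ l @ u
print(np.allclose(p @ l @ u, A))   # True
print(p.T @ A - l @ u)             # zero matrix: (p.T) A = L U
```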

A series of row swaps may lead to any reordering of the rows of a matrix. In mathematics the word for reordering is permutation, and the action of reordering can be accomplished via a multiplication with a permutation matrix \(P\). Before we continue with the \(PLU\) decomposition we define this concept and derive some of its properties.

Definition 3.6.2

A permutation matrix is an \(n \times n\) matrix \(P\) with only entries \(0\) and \(1\) in such a way that each row and each column contain exactly one \(1\).

Example 3.6.11

Two \(4\times 4\) permutation matrices are

(3.6.4)#\[\begin{split} P_1 = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{bmatrix} \quad \text{and} \quad P_2 = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \end{bmatrix}\end{split}\]

Note that for an arbitrary \(4 \times 4\) matrix \(A\) we have

\[\begin{split} P_1 A = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ \class{blue}{0} & \class{blue}{0} & \class{blue}{0} & \class{blue}{1} \\ \class{red}{0} & \class{red}{0} & \class{red}{1} & \class{red}{0} \end{bmatrix} \begin{bmatrix} a_{11} & a_{12} & a_{13} & a_{14} \\ a_{21} & a_{22} & a_{23} & a_{24} \\ \class{red}{a_{31}} & \class{red}{a_{32}} & \class{red}{a_{33}} & \class{red}{a_{34}} \\ \class{blue}{a_{41}} & \class{blue}{a_{42}} & \class{blue}{a_{43}} & \class{blue}{a_{44}} \end{bmatrix} = \begin{bmatrix} a_{11} & a_{12} & a_{13} & a_{14} \\ a_{21} & a_{22} & a_{23} & a_{24} \\ \class{blue}{a_{41}} & \class{blue}{a_{42}} & \class{blue}{a_{43}} & \class{blue}{a_{44}} \\ a_{31} & a_{32} & a_{33} & a_{34} \end{bmatrix} \end{split}\]

and

\[\begin{split} P_2 A = \begin{bmatrix} \class{blue}{0} & \class{blue}{1} & \class{blue}{0} & \class{blue}{0} \\ \class{red}{0} & \class{red}{0} & \class{red}{1} & \class{red}{0} \\ 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} a_{11} & a_{12} & a_{13} & a_{14} \\ \class{blue}{a_{21}} & \class{blue}{a_{22}} & \class{blue}{a_{23}} & \class{blue}{a_{24}} \\ \class{red}{a_{31}} & \class{red}{a_{32}} & \class{red}{a_{33}} & \class{red}{a_{34}} \\ a_{41} & a_{42} & a_{43} & a_{44} \end{bmatrix} = \begin{bmatrix} \class{blue}{a_{21}} & \class{blue}{a_{22}} & \class{blue}{a_{23}} & \class{blue}{a_{24}} \\ \class{red}{a_{31}} & \class{red}{a_{32}} & \class{red}{a_{33}} & \class{red}{a_{34}} \\ a_{41} & a_{42} & a_{43} & a_{44} \\a_{11} & a_{12} & a_{13} & a_{14} \end{bmatrix} \end{split}\]

In general (pre-)multiplication of a matrix \(A\) with any permutation matrix reorders the rows of \(A\). Try to convince yourself why this is so.
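
One way to convince yourself: a permutation matrix consists of the rows of the identity matrix in some order, and multiplying \(A\) with it places the rows of \(A\) in that same order. A small check in Python, using the matrix \(P_2\) above:

```python
import numpy as np

order = [1, 2, 3, 0]                  # row i of P_2 A is row order[i] of A
P2 = np.eye(4)[order]                 # rows of the identity, reordered
A = np.arange(16.0).reshape(4, 4)
print(np.allclose(P2 @ A, A[order])) # True
```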

The following properties of permutation matrices come in handy.

Proposition 3.6.5

  1. The product of two \(n\times n\) permutation matrices is again a permutation matrix.

  2. The inverse of a permutation matrix is its transpose.

The following example provides some illustrations. And the interested reader may open the proof below it.

Example 3.6.12

For the matrices \(P_1\) and \(P_2\) from (3.6.4) we have

\[\begin{split} P_1P_2 = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \end{bmatrix} = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \end{split}\]

and

\[\begin{split} P_2^TP_2 = \begin{bmatrix} 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}. \end{split}\]

Note that from the second identity it follows that \(P_2^{-1} = P_2^T\).

Theorem 3.6.1 (Existence of a \(PLU\) Decomposition)

Suppose that \(A\) is an \(m\times n\) matrix with real coefficients, and let \(m \leq n\). Then there exist a permutation matrix \(P\), a lower triangular matrix \(L\) and an echelon matrix \(U\) such that

\[ PA = LU. \]

Remark 3.6.1

As was mentioned before, the key idea is to perform the row exchanges first. These can be put together in the permutation matrix \(P\). The algorithm to actually find the \(LU\) decomposition of \(PA\) without redoing the whole row reduction process for \(PA\) is rather intricate, and in our view belongs in a course on numerical linear algebra. There it will be explained that it may also be preferable to work towards a \(PLU\) decomposition instead of an \(LU\) decomposition in cases where theoretically it is not absolutely necessary: for numerical reasons, having to do with the finite accuracy with which real numbers are represented in computers, it may be better to choose the pivots in another order than just top-down.

In the following example the matrix does have a proper \(LU\) decomposition, but we will row reduce it to echelon form in an order other than top-down. It is a tiny example to illustrate that a \(PLU\) decomposition can be deduced from such a reduction when we keep a record of both the multipliers and the positions of the pivots.

Example 3.6.13

We will row reduce the matrix \( A= \begin{bmatrix}2&4&3 \\ 1&2&3\\1&3&2 \end{bmatrix}\) in an order other than top-down and extract a \(PLU\) decomposition from it.

\[\begin{split} \left[\begin{array}{rrr}2&4&3 \\ 1&2&3\\ \fbox{1}&3&2 \end{array} \right] \begin{array}{l} [R_1-\class{blue}{2}R_3] \\ {[R_2-\class{blue}{1}R_3]} \\ {[R_3]} \\ \end{array} \sim \left[\begin{array}{rrr}0&\fbox{$-2$}&-1 \\ 0 & -1 &1 \\ 1&3&2\end{array} \right] \begin{array}{l} [R_1] \\ {[R_2-\class{red}{\frac12}R_1]} \\ {[R_3]} \\ \end{array} \sim \left[\begin{array}{rrr}0&-2&-1 \\ 0 & 0 &\fbox{$\frac32$} \\ 1&3&2\end{array} \right], \end{split}\]

where we have put boxes at the pivot positions.

If we put together the matrices that describe the row operations we get

(3.6.5)#\[\begin{split} A = \begin{bmatrix}2&\fbox{4}&3 \\ 1&2&\fbox{3}\\ \fbox{1}&3&2\end{bmatrix} = \begin{bmatrix} 1 & 0 & \class{blue}{2} \\ \class{red}{\frac12} &1 & \class{blue}{1} \\ 0 & 0 & 1\end{bmatrix} \begin{bmatrix} 0 & -2 & -1 \\ 0 & 0 &\frac32 \\1 & 3 & 2\end{bmatrix} = \tilde{L}\tilde{U}.\end{split}\]

If we had started from

\[\begin{split} \begin{bmatrix} 1&3&2 \\2&4&3 \\ 1&2&3\end{bmatrix} = PA = \begin{bmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0\end{bmatrix} \begin{bmatrix} 2&4&3 \\ 1&2&3\\1&3&2 \end{bmatrix} \end{split}\]

and used the same pivots, we would have arrived at

\[\begin{split} \begin{bmatrix} 1&3&2 \\2&4&3 \\ 1&2&3\end{bmatrix} \sim \begin{bmatrix} 1&3&2 \\0&-2&-1 \\ 0&-1&1\end{bmatrix} \sim \begin{bmatrix} 1&3&2 \\0&-2&-1 \\ 0&0&\frac32\end{bmatrix}, \end{split}\]

so

\[\begin{split} PA = \begin{bmatrix} 1&3&2 \\2&4&3 \\ 1&2&3\end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\\class{blue}{2} & 1 & 0 \\ \class{blue}{1} & \class{red}{\frac12} & 1\end{bmatrix} \begin{bmatrix} 1&3&2 \\0&-2&-1 \\ 0&0&\frac32\end{bmatrix} = LU. \end{split}\]

The matrix \(P\) is the inverse of (so also the transpose of) the matrix that describes the positions of the pivots as in Equation (3.6.5). It takes some careful analysis to see how to reconstruct \(L\) from \(\tilde{L}\).

3.6.3. Grasple Exercises#

Grasple Exercise 3.6.3

https://embed.grasple.com/exercises/cca386f8-a6ff-44b6-9b83-fe96482a4763?id=108857

To identify triangular matrices.

Grasple Exercise 3.6.4

https://embed.grasple.com/exercises/4ef9b6fe-e204-44e7-9463-3e1c3537a10b?id=82913

To compute the \(LU\) decomposition of a 2x2 matrix \(A\).

Grasple Exercise 3.6.5

https://embed.grasple.com/exercises/b167deea-922f-4a80-9b3e-7cbdf16f023f?id=106332

To compute the \(LU\) decomposition of a 3x3 matrix \(A\).

Grasple Exercise 3.6.6

https://embed.grasple.com/exercises/9708bce4-5c01-4486-8f44-7ea3a5157950?id=82914

To compute the \(LU\) decomposition of a 3x3 matrix \(A\) and use it to solve \(A\vect{x} = \vect{b}\).

Grasple Exercise 3.6.7

https://embed.grasple.com/exercises/d7ec03b4-32c4-4c3e-8c1e-2714878ef558?id=82917

To compute the \(LU\) decomposition of a 3x3 matrix \(A\) and use it to solve \(A\vect{x} = \vect{b}\).

Grasple Exercise 3.6.8

https://embed.grasple.com/exercises/9cbcf004-bfbe-428f-927f-5c64ca802946?id=82919

To decide solving \(A\vect{x} = \vect{b}\) via (given) \(A=LU\) or (given) \(A^{-1}\).

Grasple Exercise 3.6.9

https://embed.grasple.com/exercises/8fe1ca5c-3f24-4148-9725-a96c44e3f43a?id=82920

To compute \(A^{-1}\) using \(A = LU\).

Grasple Exercise 3.6.10

https://embed.grasple.com/exercises/7d8c3553-18cd-4866-bfa1-9ff273ee18e8?id=82925

Explorative exercise about the \(LDU\) decomposition.

Grasple Exercise 3.6.11

https://embed.grasple.com/exercises/5bb6dcac-4575-4953-bb3a-c0e8d4594798?id=82928

To compute the \(LU\) decomposition of a 3x4 matrix \(A\) and use it to solve \(A\vect{x} = \vect{b}\).

Grasple Exercise 3.6.12

https://embed.grasple.com/exercises/9a2cb913-3462-4832-8e33-9f4b878f1da7?id=106804

To compute a \(PLU\) decomposition of a 3x3 matrix.

Grasple Exercise 3.6.13

https://embed.grasple.com/exercises/48ef4fa5-cdcf-4449-8c5a-0d5f5d448025?id=106870

To solve a system \(A\vect{x} = \vect{b}\) using \( PA = LU\).

3.6.4. Efficiency Issues#

One way to measure the performance of an algorithm is to count the number of arithmetic operations that are necessary for solving a problem. As arithmetic operations we count additions (and subtractions), multiplications, and divisions.

Let us first compute this number when we solve the (square) linear system \(A\mathbf{x}=\mathbf{b}\) by forming the augmented matrix \([ A | \vect{b}]\), finding an echelon form, and then using backward substitution. We suppose that the matrix \(A\) is invertible and possesses an \(LU\) decomposition.

In the worst-case scenario, for a \(3\times 3\) matrix \(A\) (so a \(3\times 4\) augmented matrix), we need the following numbers of arithmetic operations:

  • To convert the components \(a_{21}\) and \(a_{31}\) to zero, we need to
    compute two multipliers \(m_{21}\) and \(m_{31}\) (2 divisions), then we need to
    multiply each component of the first row, starting at \(a_{12}\), by each \(m_{i1}\) (this happens twice, so \(2\times 3=6\) products) and then we need to
    subtract the resulting values from the corresponding components in each row (\(2\times 3=6\) subtractions).
    Therefore, we need a total of \(14\) arithmetic operations (8 products/divisions and 6 additions/subtractions).

  • To convert the component \(a_{32}\) to zero, we need to compute the multiplier \(m_{32}\) (one division), then we need to multiply each component of the second row starting at \(a_{23}\) (this asks for 2 products), and then we need to subtract the resulting values from the corresponding components in the third row (2 subtractions). This gives 5 arithmetic operations.

So just to bring the augmented matrix to an echelon form requires 19 arithmetic operations (11 multiplications/divisions and 8 additions/subtractions).

Next, to solve the system we use backward substitution.

  • to find \(x_3\) requires one division,

  • to find \(x_2\) requires one multiplication, one subtraction and one division,

  • to find \(x_1\) requires two multiplications, two subtractions and one division.

So backward substitution takes \(9\) arithmetic operations, and in total we needed \(19+9=28\) arithmetic operations.

Suppose now that \(A=LU\), where \(L\) and \(U\) are given. Then we first solve \(L\mathbf{y}=\mathbf{b}\) and next \(U\mathbf{x}=\mathbf{y}\).

  • For \(L\mathbf{y}=\mathbf{b}\) we use forward substitution. Since the elements on the main diagonal are ones, we need no operations to determine \(y_1\), one multiplication and one subtraction for \(y_2\), and two multiplications and two subtractions for \(y_3\). This totals 6 arithmetic operations.

  • To solve \(U\mathbf{x}=\mathbf{y}\) we use backward substitution, and we have just seen that it requires 9 arithmetic operations.

So when an \(LU\) decomposition of the matrix \(A\) is already available, the number of operations required to solve the system drops from \(28\) to \(15\).

Similar computations for a non-singular \(n\times n\) matrix \(A\) lead to the following results.

Proposition 3.6.6

Suppose \(A\) is an invertible \(n\times n\) matrix and \(\vect{b}\) an arbitrary vector in \(\R^n\).

  1. Row reduction of \(A\) to echelon form requires \(\frac23n^3-\frac12n^2-\frac16n\) (arithmetic) operations.

  2. Row reduction of the augmented matrix \([A | \vect{b}]\) to echelon form requires \(\frac23n^3+\frac12n^2-\frac76n\) operations.

  3. Solving a linear system \(L\vect{y} = \vect{b}\) for an \(n \times n\) lower triangular matrix \(L\) with \(1\)s on the diagonal requires \(n(n-1)\) operations.

  4. Solving a linear system \(U\vect{y} = \vect{b}\) for an \(n \times n\) upper triangular matrix \(U\) requires \(n^2\) operations.

From ii and iv it follows that solving the system \(A\vect{x} = \vect{b}\) by row reduction + backward substitution requires

\[ \frac23n^3+\frac12n^2 -\frac76n + n^2 = \frac{4n^3+9n^2-7n}{6} \]

arithmetic operations.

From iii and iv it follows that solving the system \(LU\vect{x} = \vect{b}\) requires

\[ n(n-1) + n^2= 2n^2 - n \]

arithmetic operations.

In many applications in engineering, it is required to solve many, say \(m\), linear systems \(A\vect{x}=\vect{b}_i\) with the same coefficient matrix \(A\), where typically the vectors \(\vect{b}_i\) are not known beforehand. This is where the \(LU\) decomposition comes in handy. Instead of solving the systems one by one, we can first find an \(LU\) decomposition of \(A\) and then solve the systems \(LU\vect{x}=\vect{b}_i\). If we compare the operation counts for solving \(m\) linear systems with row reduction (\(RR\)) and with \(LU\) decomposition (\(LU\)), we get

\[ m\cdot\left(\frac{4n^3+9n^2-7n}{6}\right) \quad (RR) \]

versus

\[ \frac{4n^3-3n^2-n}{6} + m\cdot(2n^2-n) \quad (LU). \]

In Table 3.6.1 we can see the comparison in the numbers of operations needed to solve several linear systems when using row reduction and \(LU\) decomposition.

Table 3.6.1 Comparison between solving linear systems by row reduction or by \(LU\) decomposition.#

| \(n\) | \(RR\) (\(m=5\)) | \(LU\) (\(m=5\)) | \(RR\) (\(m=10\)) | \(LU\) (\(m=10\)) | \(RR\) (\(m=50\)) | \(LU\) (\(m=50\)) |
|-------|------------------|------------------|-------------------|-------------------|-------------------|-------------------|
| \(3\) | \(140\) | \(88\) | \(280\) | \(163\) | \(1400\) | \(763\) |
| \(5\) | \(575\) | \(295\) | \(1,150\) | \(520\) | \(5,750\) | \(2,320\) |
| \(10\) | \(4,025\) | \(1,565\) | \(8,050\) | \(2,515\) | \(40,250\) | \(10,115\) |
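
The entries of Table 3.6.1 can be reproduced with a few lines of Python based on the two operation counts above:

```python
def ops_rr(n, m):
    """m times: row reduce [A|b] and use backward substitution."""
    return m * (4 * n**3 + 9 * n**2 - 7 * n) // 6

def ops_lu(n, m):
    """Factorise A once, then m pairs of triangular solves."""
    return (4 * n**3 - 3 * n**2 - n) // 6 + m * (2 * n**2 - n)

for n in (3, 5, 10):
    print(n, [(ops_rr(n, m), ops_lu(n, m)) for m in (5, 10, 50)])
```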

We will mention one other advantage that the \(LU\) decomposition may have, namely when the coefficient matrix \(A\) is a band matrix. In that case it is much more efficient to work with the \(LU\) decomposition than with the inverse. Such systems \(A\vect{x} = \vect{b}\) appear, for instance, when (partial) differential equations are solved via discretisations. It falls outside the scope of this textbook to go into the details, but we consider the special case of tridiagonal matrices to illustrate once more the usefulness of the \(LU\) decomposition.

Definition 3.6.3

A band matrix of width \(d \geq 0\) is a matrix in which only the entries within a distance \(d\) from the diagonal can be nonzero. So, \(a_{ij} = 0\) if \(|i-j|>d\). A tridiagonal matrix is a band matrix of width 1.

Example 3.6.14

Three examples of band matrices are

\[\begin{split} A = \begin{bmatrix} 2 & 1 & 0 & 0 \\ 1 & 3 & 2 & 0\\ 0 & 2 & 5 & 4 \\ 0 & 0 & 6 & 5 \end{bmatrix}, \quad B = \begin{bmatrix} 2 & 1 & 1 & 0 & 0 \\ 1 & 3 & 2 & 2 & 0\\ 1 & 2 & 1 & 4 & 3\\ 0 & 4 & 0 & 2 & 1 \\ 0 & 0 & 3 & 3 & 5 \end{bmatrix} \quad \text{and} \quad U = \begin{bmatrix} 1 & 1 & 0 & 0 \\ 0 & 2 & 2 & 0 \\ 0 & 0 & 3 & 4 \\ 0 & 0 & 0 & 5 \end{bmatrix}. \end{split}\]

\(A\) is a tridiagonal matrix, matrix \(B\) is a band matrix of width 2, and the third matrix is both upper triangular and tridiagonal.

Proposition 3.6.7

Suppose \(A\) is a band matrix of width \(d\). Then the matrices \(L\) and \(U\) of an \(LU\) decomposition of \(A\) (insofar as it exists) are band matrices of width \(d\) as well.
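
Assuming again the `lu_no_pivot` sketch from earlier, a quick check on the tridiagonal matrix \(A\) of Example 3.6.14 illustrates this:

```python
import numpy as np

A = np.array([[2, 1, 0, 0],
              [1, 3, 2, 0],
              [0, 2, 5, 4],
              [0, 0, 6, 5]], dtype=float)
L, U = lu_no_pivot(A)
print(L)  # unit lower bidiagonal: band width 1
print(U)  # upper bidiagonal: band width 1
```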

Now suppose we need to solve many linear systems \(A\vect{x} =\vect{b}_i\) where \(A\) is an \(n\times n\) tridiagonal matrix. We can compute \(A^{-1}\) and solve the systems by computing the consecutive products \(A^{-1}\vect{b}_i\), or we can find an \(LU\) decomposition and use Algorithm 3.6.1 to solve \(A\vect{x} =\vect{b}_i\).

There are two reasons why it is advantageous to work with the \(LU\) decomposition here.

  1. To store \(A^{-1}\) we need to store \(n^2\) numbers, whereas to store \(L\) and \(U\) only \(4n-2\) memory places are needed. In fact, if we may assume that \(L\) has ones on its diagonal, only \(3n-2\) places are needed. For large \(n\) this is a significant gain.

  2. One product \(A^{-1}\vect{b}\) involves \(n^2\) products of numbers and \(n(n-1)\) additions, so the total number of arithmetic operations equals \(2n^2 - n\).

    To solve

    \[\begin{split} L\vect{y} = \begin{bmatrix} 1 \\ \ell_{21} & 1 \\ 0 &\ell_{32} & 1 \\ 0 & 0 & \ell_{43} & 1 \\[1ex] \vdots & \vdots & & \ddots & \\[1ex] 0 & 0 & 0 & \cdots & \ell_{n,n-1} & 1 \end{bmatrix} \begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \\[1ex] \vdots \\[1ex] y_n \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \\ b_3 \\ b_4 \\[1ex] \vdots \\[1ex] b_n \end{bmatrix} \end{split}\]

    by forward substitution we find one after another

    \[\begin{split} \begin{array}{lcl} y_1 &=& b_1 \\ y_2 &=& b_2 - \ell_{21}y_1 \\ y_3 &=& b_3 - \ell_{32}y_2 \\[1ex] \vdots & & \quad \vdots \\[1ex] y_n &=& b_n - \ell_{n,n-1}y_{n-1} \\ \end{array} \end{split}\]

    For all but the first row we need two operations per row, so in total \(2(n-1)\) operations.

    Likewise, for solving \(U\vect{x} = \vect{y}\) we need an extra division for each row, so the number of operations for backward substitution is \(2(n-1)+n\), and the total number for the two phases becomes

    \[ 2(n-1) + 2(n-1) + n = 5n-4. \]

    For large \(n\) this is again much smaller than the \(2n^2-n\) arithmetic operations in the computation of \(A^{-1}\mathbf{b}\).
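
To close, here is a sketch of the two \(O(n)\) substitution sweeps for the tridiagonal case, storing only the nonzero diagonals; the function name and the storage scheme are our own choices.

```python
import numpy as np

def tridiag_lu_solve(l_sub, u_diag, u_sup, b):
    """Solve L U x = b, where L is unit lower bidiagonal (subdiagonal
    l_sub, length n-1) and U is upper bidiagonal (diagonal u_diag,
    superdiagonal u_sup, length n-1). Costs 5n - 4 operations in total."""
    n = len(b)
    y = np.empty(n)
    y[0] = b[0]
    for i in range(1, n):               # forward sweep: 2(n-1) operations
        y[i] = b[i] - l_sub[i - 1] * y[i - 1]
    x = np.empty(n)
    x[n - 1] = y[n - 1] / u_diag[n - 1] # one division
    for i in range(n - 2, -1, -1):      # backward sweep: 3(n-1) operations
        x[i] = (y[i] - u_sup[i] * x[i + 1]) / u_diag[i]
    return x
```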