5.4. Miscellaneous Applications of Determinants#

In this section we will address the following matters:

  • The determinant as a uniform scale factor for an arbitrary linear transformation from \(\R^n\) to \(\R^n\) .

  • Cramer’s rule. Seemingly the ultimate solution to almost all systems of \(n\) linear equations in \(n\) unknowns.

  • The generalization of the formula

    \[\begin{split} \left[\begin{array}{cc} a & b \\ c & d\end{array} \right]^{-1} = \dfrac{1}{ad-bc} \left[\begin{array}{cc} d & -b \\ -c & a\end{array} \right] \end{split}\]

    to \(n\times n\) matrices.

  • A certain generalization of the cross product to \(n\) dimensions.

5.4.1. Volume and Orientation Revisited#

We have seen in Section 5.1 how determinants arise in the context of areas of parallelograms and volumes of parallelepipeds.

In Section 1.2 we used the dot product to define length, distance and orthogonality in \(\R^n\). Determinants make it possible to define the concepts of volume and orientation in \(n\) dimensions.

Definition 5.4.1 (Volume in \(\R^n\))

Let \(\{\vect{v_1}, \ldots, \vect{v}_n\}\) be a set of \(n\) vectors in \(\R^n\). The \(n\)-dimensional parallelepiped \(\mathcal{P}\) spanned by \(\vect{v_1}, \ldots, \vect{v}_n\) is the set

\[ \mathcal{P} = \{c_1\vect{v}_1+c_2\vect{v}_2 + \ldots + c_n\vect{v}_n \,|\, c_i \in \R, 0 \leq c_i \leq 1\}. \]

See Figure 5.4.1 for an illustration of such a set in \(\R^2\).

The volume of such a parallelepiped is defined by

\[ \text{Vol}(\mathcal{P}) = |\det{[\,\vect{v_1}\, \,\ldots \,\, \vect{v}_n\,]}|. \]

So, it is the absolute value of a determinant.

Note that if the vectors \(\vect{v}_1, \ldots, \vect{v}_n\) in Definition 5.4.1 are linearly dependent the volume automatically becomes 0.

../_images/Fig-DetExtras-ParPed.svg

Fig. 5.4.1 The parallelepiped generated by two vectors#
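As a quick numerical illustration of Definition 5.4.1 (ours, not part of the text), the formula can be evaluated with NumPy; the two spanning vectors below are chosen arbitrarily.

```python
import numpy as np

# Columns of V are the spanning vectors v_1, ..., v_n (here n = 2, chosen arbitrarily).
V = np.array([[1.0, 3.0],
              [2.0, 1.0]])

# Vol(P) = |det [v_1 ... v_n]|
print(abs(np.linalg.det(V)))   # 5.0, the area of the parallelogram spanned by (1,2) and (3,1)
```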

Definition 5.4.2 (Orientation in \(\R^n\))

Suppose the vectors \((\vect{v_1}, \ldots, \vect{v}_n)\) in \(\R^n\) are linearly independent.

Then we say that the ordered set \((\vect{v_1}, \ldots, \vect{v}_n)\) is positively oriented   if   \( \det{[\vect{v_1} \ldots \vect{v}_n]}>0\).

If this determinant is negative the set is called negatively oriented.

For vectors that are linearly dependent we do not define the orientation.

Proposition 5.4.1

Suppose \(T\) is a linear transformation from \(\R^2\) to \(\R^2\), with standard matrix \(A = [\,\vect{a}_1 \,\, \vect{a}_2\,]\). So we have

\[ T(\vect{x}) = A\vect{x}, \quad \text{for} \,\, \vect{x} \,\text{ in }\, \R^2. \]

Let \(R\) be any region in \(\R^2\) for which the area is well-defined, and let \(S\) be the image of \(R\) under \(T\).

Then for the area of \(S\) it holds that

\[ \text{area}(S) = |\det{A}|\cdot \text{area}(R). \]

Proof. If the matrix \(A\) is not invertible, the range of \(T\), which is given by \(\text{span}\{\vect{a}_1, \vect{a}_2\}\), is contained in a line. Each region \(R\) is then mapped onto a subset \(S\) that is contained in this line, so

\[ \text{area}(S) = 0 = 0\cdot \text{area}(R) =|\det{A}|\cdot \text{area}(R). \]

Next suppose that \(A\) is invertible. Then the unit grid is mapped onto a grid whose unit region is the parallelogram with sides \(\vect{a}_1 = A\vect{e}_1\) and \(\vect{a}_2 = A\vect{e}_2\). See Figure 5.4.2.

../_images/Fig-DetExtras-StandardGrid.svg

Fig. 5.4.2 The image of the standard grid#

First we show that the formula holds if \(R\) is the unit square, i.e., the parallelogram generated by \(\vect{e}_1\) and \(\vect{e}_2\). The unit square is mapped onto the parallelogram \(S\) generated by \(T(\vect{e}_1)=\vect{a}_1\) and \(T(\vect{e}_2)=\vect{a}_2\). It follows that

\[ \text{area}(S) = |\det{[\,\vect{a}_1\,\, \vect{a}_2\,]}| = |\det{A}|,\]

and since the area of \(R\) is equal to 1, we have

\[ |\det{A}| = |\det{A}| \cdot 1 = |\det{A}| \cdot \text{area}(R). \]

This then also holds for any square \(R\) with sides of length \(r\) that are parallel to the axes. Namely, such a square has area \(r^2\) and can be described as the square with vertices

\[ \vect{p},\quad \vect{p}+ r\vect{e}_1, \quad \vect{p}+ r\vect{e}_1+r\vect{e}_2 \quad\text{and}\quad \vect{p}+ r\vect{e}_2. \]

These are mapped to

\[ A\vect{p}, \quad A\vect{p}+ rA\vect{e}_1, \quad A\vect{p}+ rA\vect{e}_1+rA\vect{e}_2 \quad\text{and}\quad A\vect{p}+ rA\vect{e}_2. \]

This is a parallelogram with sides \(rA\vect{e}_1 = r\vect{a}_1\) and \(rA\vect{e}_2 =r \vect{a}_2\), which has area

\[ \text{area}(S) = |\det{[\,r\vect{a}_1\,\, r\vect{a}_2\,]}| = r^2 |\det{A}| = |\det{A}|\cdot \text{area}(R). \]

See Figure 5.4.3.

../_images/Fig-DetExtras-ImageOfSquare.svg

Fig. 5.4.3 The image of a square with ‘corner’ \(\vect{p}\) and side length \(r\)#

For a general (reasonable) region \(R\) we sketch the idea and omit the technical details.

The region \(R\) can be approximated arbitrarily closely by a collection of smaller and smaller squares \(R_i\) whose interiors do not overlap. See Figure 5.4.4. The limit of the area of these approximations as the grid gets finer and finer gives the area of \(R\).

../_images/Fig-DetExtras-Subdivision.svg

Fig. 5.4.4 Approximating a region by smaller and smaller squares.#

The formula holds for each of the \(R_i\). Since \(T\) is one-to-one, the images \(S_i = T(R_i)\) will not overlap either, and the images taken together will approximate the image \(S = T(R)\) as well. We deduce that

\[\begin{split} \begin{array}{rcl} \text{area}(S) \approx \sum \text{area}(S_i) &=& \sum |\det{A}|\cdot \text{area}(R_i) \\ &=& |\det{A}| \sum \text{area}(R_i) \approx |\det{A}|\cdot \text{area}(R). \end{array} \end{split}\]

By taking an appropriate limit one can show that in fact

\[ \text{area}(S) = |\det{A}|\cdot \text{area}(R). \]

Proposition 5.4.1 can be generalized to higher dimensions.
For \(n = 3\) area becomes volume, and for higher dimensions we use the definition of \(n\)-dimensional volume as in Definition 5.4.1.

Proposition 5.4.2

Suppose \(T\) is a linear transformation from \(\R^n\) to \(\R^n\), with standard matrix \(A\).

Then for any region \(R\) in \(\R^n\) for which the volume is well-defined, it holds that

\[ \text{Vol}(S) = |\det{A}|\cdot \text{Vol}(R), \]

where \(S\) is the image of \(R\) under \(T\).

Proof.

If \(R\) is the \(n\)-dimensional parallelepiped \(\mathcal{P}\) generated by \(\{\vect{v}_1, \ldots, \vect{v}_n\}\) we have that \(T(\mathcal{P})\) is generated by \(\{T(\vect{v}_1), \ldots, T(\vect{v}_n)\}\).

Then

\[\begin{split} \begin{array}{rcl} \text{Vol}(T(\mathcal{P})) &=& |\det{[\,T(\vect{v_1})\, \,\ldots \,\, T(\vect{v}_n)\,]}| \\ &=& |\det{[\,A(\vect{v_1})\, \,\ldots \,\, A(\vect{v}_n)\,]}| \\ &=& |\det{\left(A [\,\vect{v_1}\, \,\ldots \,\, \vect{v}_n\,]\right)}|\\ &=& |\det{A} \det{[\,\vect{v_1}\, \,\ldots \,\, \vect{v}_n\,]}|\\ &=& |\det{A}| \text{Vol}(\mathcal{P}). \end{array} \end{split}\]

For a more general region \(R\) we would again have to work with approximations/subdivisions as in the proof of Proposition 5.4.1. Then we would first have to extend the definition of \(n\)-dimensional volume. We will not pursue that track.
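A hedged numerical illustration of Proposition 5.4.2 (ours, not from the text): for randomly chosen vectors, the identity \(\text{Vol}(T(\mathcal{P})) = |\det{A}|\,\text{Vol}(\mathcal{P})\) can be checked with NumPy.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
A = rng.standard_normal((n, n))     # standard matrix of T (arbitrary choice)
V = rng.standard_normal((n, n))     # columns span an n-dimensional parallelepiped P

vol_P  = abs(np.linalg.det(V))
vol_TP = abs(np.linalg.det(A @ V))  # T maps P onto the parallelepiped spanned by the A v_i

print(np.isclose(vol_TP, abs(np.linalg.det(A)) * vol_P))   # True
```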

To conclude our interpretation of the determinant of \(A\) regarding the linear transformation \(T(\vect{x}) = A\vect{x}\) we look at the orientation.

Proposition 5.4.3

Suppose \(A = [\,\vect{a}_1\,\,\vect{a}_2\,\,\ldots\,\,\vect{a}_n\, ]\) is the standard matrix of the linear transformation \(T: \R^n \to \R^n\). So we have

\[ T(\vect{x}) = A\vect{x}. \]

Suppose \((\vect{v}_1,\,\vect{v}_2,\,\ldots\,,\,\vect{v}_n)\) is an ordered set of vectors in \(\R^n\).

Then the following holds.

If \(\det{A} > 0\) the set \(\big(T(\vect{v}_1),\,T(\vect{v}_2),\,\ldots\,,\,T(\vect{v}_n)\big)\) has the same orientation as the set \((\vect{v}_1,\,\vect{v}_2,\,\ldots\,,\,\vect{v}_n)\).

If \(\det{A} < 0\) the set \(\big(T(\vect{v}_1),\,T(\vect{v}_2),\,\ldots\,,\,T(\vect{v}_n)\big)\) has the opposite orientation as the set \((\vect{v}_1,\,\vect{v}_2,\,\ldots\,,\,\vect{v}_n)\).

In short: the transformation \(T(\vect{x}) = A\vect{x}\) preserves the orientation if
\(\det{A} > 0\) and reverses the orientation if \(\det{A} < 0\).

If the determinant is 0, then the set \(\{T(\vect{v}_1), \ldots,T(\vect{v}_n) \}\) will be linearly dependent, and for such a set the orientation is not defined.

Proof. This too follows immediately from the product rule of determinants.

\[\begin{split} \begin{array}{rcl} \det{\left[\,T(\vect{v}_1)\,\,T(\vect{v}_2)\,\,\ldots\,\,T(\vect{v}_n)\, \right]} &=& \det{\left[\,A\vect{v}_1\,\,A\vect{v}_2\,\,\ldots\,\,A\vect{v}_n\, \right]} \\ &=& \det{\big(A\left[\,\vect{v}_1\,\,\vect{v}_2\,\,\ldots\,\,\vect{v}_n\, \right]\big)} \\ &=& \det{A}\cdot\det{\left[\,\vect{v}_1\,\,\vect{v}_2\,\,\ldots\,\,\vect{v}_n\, \right]}. \end{array} \end{split}\]

A nice illustration of what this means in \(\R^2\) is given by the following example.

Example 5.4.1

Consider the two linear transformations from \(\R^2\) to \(\R^2\) with matrices

\[\begin{split} A = \begin{bmatrix} 1 & 3 \\ 3 & 1 \end{bmatrix}, \quad B = \begin{bmatrix} 3 & 1 \\ 1 & 3 \end{bmatrix}. \end{split}\]

Note that

\[ \det{A} = -8 < 0 \quad \text{and} \quad \det{B} = 8 > 0. \]

Figure 5.4.5 visualizes what is going on.

../_images/Fig-DetExtras-Orientation.svg

Fig. 5.4.5 Images under transformations with negative and positive determinant.#

The images of a unit vector that rotates counterclockwise move around clockwise under transformation \(A\), i.e., in the opposite orientation/direction, in agreement with \(\det{A} < 0\). Under transformation \(B\), which has positive determinant, the images go around the origin counterclockwise.
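The sign bookkeeping of Example 5.4.1 can be reproduced numerically; the sketch below (ours) checks the orientation of the images of the positively oriented pair \((\vect{e}_1, \vect{e}_2)\).

```python
import numpy as np

A = np.array([[1.0, 3.0], [3.0, 1.0]])   # det A = -8 < 0: reverses orientation
B = np.array([[3.0, 1.0], [1.0, 3.0]])   # det B =  8 > 0: preserves orientation

e1, e2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])   # positively oriented pair

for M in (A, B):
    # orientation of (M e1, M e2) is the sign of det [M e1  M e2]
    print(np.sign(np.linalg.det(np.column_stack((M @ e1, M @ e2)))))
    # prints -1.0 for A and 1.0 for B
```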

5.4.2. Cramer’s rule#

We first introduce a new notation that will help to simplify formulas later.

Definition 5.4.3

Let \(A\) be an \(n\times n\) matrix, and \(\vect{v}\) a vector in \(\R^n\). Then \(A^{(i)}(\vect{v})\) denotes the matrix that results when the \(i\)th column of \(A\) is replaced by the vector \(\vect{v}\).

Example 5.4.2

For the matrix \(A = \begin{bmatrix} 1 & 3 & 1 \\ 1 & 4 & 2 \\ 3 & 1 & 5 \end{bmatrix}\) and the vector \(\vect{v} = \begin{bmatrix} \color{blue}6 \\ \color{blue}7 \\ \color{blue}8 \end{bmatrix}\) we have that

\[\begin{split} A^{(2)}(\vect{v}) = \begin{bmatrix} 1 & \color{blue}6 & 1 \\ 1 & \color{blue}7 & 2 \\ 3 & \color{blue}8 & 5 \end{bmatrix}. \end{split}\]
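For experimenting with this notation it may help to have the column replacement available in code; the following NumPy helper (the name replace_column is ours, not from the text) reproduces Example 5.4.2.

```python
import numpy as np

def replace_column(A, i, v):
    """Return A^{(i)}(v): a copy of A with its i-th column (1-based) replaced by v."""
    B = np.array(A, dtype=float)
    B[:, i - 1] = v
    return B

A = np.array([[1, 3, 1], [1, 4, 2], [3, 1, 5]])
v = np.array([6, 7, 8])
print(replace_column(A, 2, v))   # the second column becomes (6, 7, 8), as in Example 5.4.2
```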

Suppose that \(A\) is an invertible \(n \times n\) matrix. Then we know that the linear system \(A\vect{x} = \vect{b}\) has a unique solution for each \(\vect{b}\) in \(\R^n\). And we also know that the determinant of \(A\) is not equal to zero.
The next proposition gives a ‘ready made’ formula for the solution.

Theorem 5.4.1 (Cramer’s Rule)

Suppose \(A\) is an invertible \(n \times n\) matrix, and \(\vect{b}\) a vector in \(\R^n\). The entries \(x_i\) of the unique solution \(\vect{x}\) of the linear system

\[ A\vect{x} = \vect{b} \]

are given by

(5.4.1)#\[x_i = \dfrac{\det{\left(A^{(i)}(\vect{b})\right)}}{\det{A}}.\]

Example 5.4.3

We use Cramer’s rule to solve the system

\[\begin{split} \left\lbrace \begin{array}{rcc} x_1 + 2x_2 + x_3 & = & 3 \\ x_1 - x_2 + 2x_3 & = & 4 \\ 3x_1 + x_2 -5x_3 & = & 1 \end{array} \right. \quad\text{i.e.} \quad \begin{bmatrix} 1 & 2 & 1 \\ 1 & -1 & 2 \\ 3 & 1 & -5 \end{bmatrix} \left[\begin{array}{c} x_1 \\ x_2 \\ x_3 \end{array} \right] = \begin{bmatrix}3 \\ 4 \\ 1 \end{bmatrix}. \end{split}\]

First of all, the determinant of \(A\) can be computed as follows (in the first step we use column reduction, with the boxed 1 as a pivot):

\[\begin{split} \left|\begin{array}{ccc} \fbox{1} & 2 & 1 \\ 1 & -1 & 2 \\ 3 & 1 & -5 \end{array} \right|= \left|\begin{array}{ccc} 1 & 0 & 0 \\ 1 & -3 & 1 \\ 3 & -5 & -8 \end{array} \right|= \left|\begin{array}{cc} -3 & 1 \\ -5 & -8 \end{array} \right|= 29 \neq 0, \end{split}\]

so the coefficient matrix is invertible and consequently the system has a unique solution.

According to Cramer’s rule we find the first entry of the solution as follows (again we use the boxed 1 as a pivot):

\[\begin{split} x_1 = \dfrac{\begin{vmatrix} 3 & 2 & 1 \\ 4 & -1 & 2 \\ 1 & \fbox{1} & -5 \end{vmatrix}}{29} = \dfrac{\begin{vmatrix} 1 & 0 & 11 \\ 5 & 0 & -3 \\ 1 & 1 & -5 \end{vmatrix}}{29} = \dfrac{-\begin{vmatrix} 1 & 11 \\ 5 & -3 \end{vmatrix}}{29} = \dfrac{58}{29} = 2. \end{split}\]

Likewise we can compute the other two entries of the solution.

\[\begin{split} x_2 = \dfrac{\begin{vmatrix} 1 & 3 & 1 \\ 1 & 4 & 2 \\ 3 & 1 & -5 \end{vmatrix}}{29} = 0 \quad \text{and} \quad x_3 = \dfrac{\begin{vmatrix} 1 & 2 & 3 \\ 1 & -1 & 4 \\ 3 & 1 & 1 \end{vmatrix}}{29} = 1. \end{split}\]
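Formula (5.4.1) translates directly into code; the sketch below (ours, for illustration only, since row reduction is far more efficient) reproduces the solution of Example 5.4.3.

```python
import numpy as np

def cramer(A, b):
    """Solve A x = b for an invertible A via Cramer's rule, formula (5.4.1)."""
    A = np.array(A, dtype=float)
    detA = np.linalg.det(A)
    x = np.empty(len(b))
    for i in range(len(b)):
        Ai = A.copy()
        Ai[:, i] = b                      # A^{(i)}(b): replace the i-th column by b
        x[i] = np.linalg.det(Ai) / detA
    return x

A = [[1, 2, 1], [1, -1, 2], [3, 1, -5]]
b = [3, 4, 1]
print(cramer(A, b))   # [2. 0. 1.], as found in Example 5.4.3
```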

The following proof of Cramer’s rule rests rather nicely on properties of the determinant function. But feel free to skip it.

Proof of Theorem 5.4.1

Suppose \(\vect{x} = \vect{c} = \left[\begin{array}{c} c_1 \\ \vdots\\ c_n\end{array} \right] \) is the unique solution of the linear system \(A\vect{x} = \vect{b}\), with the invertible matrix \(A = [ \vect{a}_1 \, \, \vect{a}_2 \, \ldots \,\vect{a}_n ]\).

We show that Formula (5.4.1) holds for \(c_1\). The argument can be copied for the other \(c_i\).

We first note that

\[\begin{split} \begin{array}{ccl} A\vect{c} = \vect{b} &\iff \quad & c_1\vect{a}_1+c_2\vect{a}_2 + \ldots + c_n\vect{a}_n =\vect{b} \\ &\iff \quad & c_1\vect{a}_1+c_2\vect{a}_2 + \ldots + c_n\vect{a}_n - \vect{b} = \vect{0}. \end{array} \end{split}\]

The smart next move is to replace the first column of \(A\) by the zero column disguised as

\[ c_1\vect{a}_1+c_2\vect{a}_2 + \ldots + c_n\vect{a}_n - \vect{b}. \]

So we have

\[ \det{[(c_1\vect{a}_1+ \,\ldots\, + c_n\vect{a}_n - \vect{b}) \,\,\,\vect{a}_2\,\, \ldots \,\, \vect{a}_n]} =\det{[\vect{0} \,\,\vect{a}_2 \,\, \ldots\,\,\vect{a}_n]} = 0. \]

By the linearity property (in all of the columns) of the determinant (Proposition 5.3.2) we may deduce

(5.4.2)#\[c_1\det{(A)} + c_2\det{(A^{(1)}(\vect{a}_2))} + \ldots + c_n\det{(A^{(1)}(\vect{a}_n))} - \det{(A^{(1)}(\vect{b}))} = 0.\]

Now we note that

\[ \det{A^{(1)}(\vect{a}_i)} = 0, \quad i = 2,3,\ldots, n, \]

since in the matrix \(A^{(1)}(\vect{a}_i)\) the first column and the \(i\)th column are identical. Hence all but the first and last determinant in Equation (5.4.2) drop out and we can conclude that indeed

\[ c_1\det{(A)} - \det{(A^{(1)}(\vect{b}))} = 0 \quad \iff \quad c_1 = \dfrac{\det{(A^{(1)}(\vect{b}))}}{\det{(A)}}. \]

Remark 5.4.1

Cramer’s formula may seem to be the solution to all your linear systems. However, it has its drawbacks.

Disclaimer 1   Cramer’s formula can only be used for a square linear system with an invertible matrix.

Disclaimer 2   For a system with two equations in two unknowns Cramer’s rule may come in handy, but for solving larger systems it is highly inefficient. For instance, to find the solution of a system of four equations in four unknowns using Cramer’s rule, one needs to compute five \(4 \times 4\) determinants. With the good old method one only has to row reduce the augmented matrix \([\,A\,|\,\vect{b}\,]\) once.

5.4.3. The inverse of a matrix in terms of determinants#

As an interesting corollary of Cramer’s rule we can give a ready-made formula for the inverse of an invertible matrix. The following proposition applies the notation of the previous subsection to a special case.

Proposition 5.4.4

Let \(A\) be an \(n\times n\) matrix, and \(\vect{e}_j\) the \(j\)th vector of the standard basis of \(\R^n\). Then

\[ \det{(A^{(i)}(\vect{e}_j))} = (-1)^{j+i} \det{A_{ji}} = C_{ji}, \]

where \(A_{ji}\) is the submatrix of \(A\) obtained by deleting row \(j\) and column \(i\), and \(C_{ji} = (-1)^{j+i} \det{\left(A_{ji}\right)}\) is the corresponding cofactor, as introduced in the definition of the \(n \times n\) determinant (Definition 5.2.2).

The following example serves as an illustration of what is going on here.

Example 5.4.4

Let \(A = \left[\begin{array}{rrrr} a_{11} &a_{12} &a_{13} &a_{14} \\ a_{21} &a_{22} &a_{23} &a_{24} \\ a_{31} &a_{32} &a_{33} &a_{34} \\ a_{41} &a_{42} &a_{43} &a_{44} \end{array} \right] \) be any \(4 \times 4\) matrix.

Then   \( A^{(4)}(\vect{e}_2) = \left[\begin{array}{rrrr} a_{11} &a_{12} &a_{13} &0 \\ a_{21} &a_{22} &a_{23} &1 \\ a_{31} &a_{32} &a_{33} &0 \\ a_{41} &a_{42} &a_{43} &0 \end{array} \right].\)

Expanding along the fourth column gives

\[\begin{split} \det{(A^{(4)}(\vect{e}_2))} = \left|\begin{array}{rrrr} a_{11} &a_{12} &a_{13} &0 \\ a_{21} &a_{22} &a_{23} &1 \\ a_{31} &a_{32} &a_{33} &0 \\ a_{41} &a_{42} &a_{43} &0 \end{array} \right|= (-1)^{(2+4)} \left|\begin{array}{rrr} a_{11} &a_{12} &a_{13} \\ a_{31} &a_{32} &a_{33} \\ a_{41} &a_{42} &a_{43} \end{array} \right|= C_{24}. \end{split}\]
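Proposition 5.4.4 can also be checked numerically; in the sketch below (ours) the indices \(i = 4\), \(j = 2\) of Example 5.4.4 are used with an arbitrary random matrix.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.integers(-5, 6, size=(4, 4)).astype(float)   # an arbitrary 4 x 4 matrix

i, j = 4, 2                                           # compare with A^{(4)}(e_2) and C_{24}
Ai = A.copy()
Ai[:, i - 1] = np.eye(4)[:, j - 1]                    # A^{(i)}(e_j)

A_ji = np.delete(np.delete(A, j - 1, axis=0), i - 1, axis=1)   # delete row j and column i
C_ji = (-1) ** (j + i) * np.linalg.det(A_ji)                   # the cofactor C_{ji}

print(np.isclose(np.linalg.det(Ai), C_ji))            # True
```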

Proposition 5.4.5

If \(A\) is an invertible \(n \times n\) matrix then the inverse \(B\) of \(A\) is given by

(5.4.3)#\[\begin{split}B = \dfrac{1}{\det{A}} \left[\begin{array}{ccccc} C_{11} &C_{21} &C_{31} & \ldots &C_{n1} \\ C_{12} &C_{22} &C_{32} & \ldots &C_{n2} \\ C_{13} &C_{23} &C_{33} & \ldots &C_{n3} \\ \vdots & \vdots &\vdots & \ddots & \vdots \\ C_{1n} &C_{2n} &C_{3n} & \ldots &C_{nn} \\ \end{array} \right].\end{split}\]

Proof.

The \(j\)th column \(\vect{b}_j\) of \(B = A^{-1}\) is the solution of the linear system \(A\vect{x} = \vect{e}_j\).

Cramer’s rule then gives that \(b_{ij}\), the \(i\)th entry of this column, is equal to

\[ b_{ij} = \dfrac{\det{\left(A^{(i)}(\vect{e}_j)\right)}}{\det{A}} = \dfrac{C_{ji}}{\det{A}}. \]

For the last step we used Proposition 5.4.4.

Definition 5.4.4

For an \(n \times n\) matrix \(A\) the matrix

\[\begin{split} \left[\begin{array}{ccccc} C_{11} &C_{12} &C_{13} & \ldots &C_{1n} \\ C_{21} &C_{22} &C_{23} & \ldots &C_{2n} \\ C_{31} &C_{32} &C_{33} & \ldots &C_{3n} \\ \vdots & \vdots &\vdots & \ddots & \vdots \\ C_{n1} &C_{n2} &C_{n3} & \ldots &C_{nn} \\ \end{array} \right] \end{split}\]

is called its cofactor matrix.

The adjugate matrix of \(A\) is defined as the transpose of the cofactor matrix. So

\[\begin{split} \text{Adj}(A) = \left[\begin{array}{ccccc} C_{11} &C_{21} &C_{31} & \ldots &C_{n1} \\ C_{12} &C_{22} &C_{32} & \ldots &C_{n2} \\ C_{13} &C_{23} &C_{33} & \ldots &C_{n3} \\ \vdots & \vdots &\vdots & \ddots & \vdots \\ C_{1n} &C_{2n} &C_{3n} & \ldots &C_{nn} \\ \end{array} \right] . \end{split}\]

Thus Proposition 5.4.5 states that

\[ A^{-1} = \dfrac{1}{\det{A}} \text{Adj}(A), \]

provided that \(A\) is invertible. In fact a slightly more general formula holds for any square matrix.

Proposition 5.4.6

For any square matrix \(A\) the following identity holds:

\[ A\cdot\text{Adj}(A) = \text{Adj}(A)\cdot A = (\det{A})\cdot I. \]

For clarity we used dots to indicate products. Note that the first two products are matrix products and the third product is a scalar times a matrix.

The proof, we think, is short and instructive.

Proof. For an invertible matrix the statement follows immediately from Proposition 5.4.5.

However, we can give an ‘elementary’ proof that includes the non-invertible case where \(\det{A}=0\). We will use two properties of determinants from earlier sections. First, Theorem 5.2.1, which states that the determinant of a matrix can be found by expansion along an arbitrary column:

\[ \det{A} = \sum_{i=1}^n (-1)^{i+j} a_{ij}\det{A_{ij}} = \sum_{i=1}^n a_{ij} C_{ij}. \]

And second Corollary 5.3.1: the determinant of a matrix with two equal rows (or columns) is equal to 0.

Let us consider the product \(\text{Adj}(A) \cdot A\) very carefully:

\[\begin{split} \left[\begin{array}{ccccc} C_{11} &C_{21} &C_{31} & \ldots &C_{n1} \\ C_{12} &C_{22} &C_{32} & \ldots &C_{n2} \\ C_{13} &C_{23} &C_{33} & \ldots &C_{n3} \\ \vdots & \vdots &\vdots & \ddots & \vdots \\ C_{1n} &C_{2n} &C_{3n} & \ldots &C_{nn} \\ \end{array} \right] \left[\begin{array}{ccccc} a_{11} &a_{12} &a_{13} & \ldots &a_{1n} \\ a_{21} &a_{22} &a_{23} & \ldots &a_{2n} \\ a_{31} &a_{32} &a_{33} & \ldots &a_{3n} \\ \vdots & \vdots &\vdots & \ddots & \vdots \\ a_{n1} &a_{n2} &a_{n3} & \ldots &a_{nn} \\ \end{array} \right]. \end{split}\]

On the diagonal we see that the \(j\)th entry is equal to

\[ C_{1j}a_{1j} + C_{2j}a_{2j} + \ldots + C_{nj}a_{nj} = \sum_{i=1}^n a_{ij} C_{ij} = \det{A}. \]

For an off-diagonal entry, the product of the \(j\)th row of \(\text{Adj}(A)\) with the \(k\)th column of \(A\) (where \(j \neq k\)) gives the sum

\[ C_{1j}a_{1k} + C_{2j}a_{2k} + \ldots + C_{nj}a_{nk} = \sum_{i=1}^n a_{ik} C_{ij}. \]

This expression can be interpreted as the expansion along the \(j\)th column of the determinant of the matrix \(A^{(j)}(\vect{a}_k)\) that results if the \(j\)th column of \(A\) is replaced by the \(k\)th column of \(A\). Since this matrix has two equal columns, its determinant must be zero!
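The identity of Proposition 5.4.6 lends itself to a direct numerical check; the sketch below (ours) builds the adjugate from cofactors and tests both an invertible and a singular matrix.

```python
import numpy as np

def adjugate(A):
    """Adjugate of a square matrix: the transpose of its cofactor matrix."""
    A = np.array(A, dtype=float)
    n = A.shape[0]
    C = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            minor = np.delete(np.delete(A, i, axis=0), j, axis=1)
            C[i, j] = (-1) ** (i + j) * np.linalg.det(minor)   # cofactor C_{i+1, j+1}
    return C.T

A = [[1, 3, 1], [1, 4, 2], [3, 1, 5]]        # invertible (det = 10)
S = [[1, 2, 3], [2, 4, 6], [0, 1, 1]]        # singular: second row is twice the first

for M in (A, S):
    M = np.array(M, dtype=float)
    print(np.allclose(adjugate(M) @ M, np.linalg.det(M) * np.eye(3)))   # True in both cases
```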

For \(n = 2\), Proposition 5.4.5 gives us back the old formula for the inverse, provided we define the determinant of a \(1 \times 1\) matrix \(A = [a]\) as the number \(a\).

For an arbitrary invertible \(3 \times 3\) matrix \(A=\left[\begin{array}{ccc} a_{11} &a_{12} &a_{13} \\ a_{21} &a_{22} &a_{23} \\ a_{31} &a_{32} &a_{33} \end{array} \right] \) the formula yields

\[\begin{split} A^{-1} = \dfrac{1}{\begin{vmatrix} a_{11} &a_{12} &a_{13} \\ a_{21} &a_{22} &a_{23} \\ a_{31} &a_{32} &a_{33} \end{vmatrix}} \left[\begin{array}{ccc} \begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix} & - \begin{vmatrix} a_{12} & a_{13} \\ a_{32} & a_{33} \end{vmatrix} & \begin{vmatrix} a_{12} & a_{13} \\ a_{22} & a_{23} \end{vmatrix} \\ - \begin{vmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{vmatrix} & \begin{vmatrix} a_{11} & a_{13} \\ a_{31} & a_{33} \end{vmatrix} & - \begin{vmatrix} a_{11} & a_{13} \\ a_{21} & a_{23} \end{vmatrix} \\ \begin{vmatrix} a_{21} & a_{22} \\ a_{31} & a_{32} \end{vmatrix} & - \begin{vmatrix} a_{11} & a_{12} \\ a_{31} & a_{32} \end{vmatrix} & \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} \end{array} \right]. \end{split}\]

Remark 5.4.2

Like Cramer’s rule, the formula for the inverse is highly inefficient. The comparison between the efforts required to compute the inverse via the adjugate matrix versus row reduction of the augmented matrix \([\,A\,|\,I\,]\) works out rather favorably for the latter.

5.4.4. Determinant and cross product#

In Section 1.3 the cross product of two vectors \(\mathbf{u}\) and \(\mathbf{v}\) in \(\R^3\) was defined. It is the unique vector \(\mathbf{w}\) that is (1) orthogonal to \(\mathbf{u}\) and \(\mathbf{v}\), (2) of length equal to the area of the parallelogram with sides \(\mathbf{u}\) and \(\mathbf{v}\), and (3) such that the triple \((\mathbf{u},\mathbf{v},\mathbf{w})\) is ‘right-handed’ (= positively oriented).

In Section 5.1 we defined the determinant of the ordered set \((\vect{a},\vect{b},\vect{c})\) in \(\R^3\) via

\[\begin{split} \begin{array}{rcl} \det{(\vect{a},\vect{b},\vect{c})} &=& (\vect{a}\times\vect{b})\ip\vect{c} = \left|\begin{array}{ccc} a_1 & b_1 &c_1 \\ a_2 & b_2 &c_2 \\ a_3 & b_3 & c_3 \end{array}\right|\\ &=& \left|\begin{array}{cc} a_2 & a_3 \\b_2 & b_3 \end{array}\right| c_1 - \left|\begin{array}{cc} a_1 & a_3 \\b_1 & b_3 \end{array}\right| c_2 + \left|\begin{array}{cc} a_1 & a_2 \\b_1 & b_2 \end{array}\right| c_3. \end{array} \end{split}\]

Conversely, we can write the cross product in terms containing determinants.

(5.4.4)#\[\begin{split}\begin{array}{rcl} \left[\begin{array}{c} a_1 \\ a_2 \\ a_3 \end{array}\right] \times \left[\begin{array}{c}b_1 \\ b_2 \\ b_3 \end{array}\right] &=& \left[\begin{array}{c}a_2b_3-a_3b_2 \\ a_3b_1 - a_1b_3 \\ a_1b_2-a_2b_1 \end{array}\right] \\ &=& \left|\begin{array}{cc} a_2 & b_2 \\a_3 & b_3 \end{array}\right|\vect{e}_1 - \left|\begin{array}{cc} a_1 & b_1 \\ a_3 & b_3 \end{array}\right|\vect{e}_2 + \left|\begin{array}{cc} a_1 & b_1 \\a_2 & b_2 \end{array}\right|\vect{e}_3. \end{array} \end{split}\]

The last expression can formally be written as

\[\begin{split} \left|\begin{array}{ccc} a_1 & b_1 &\vect{e}_1 \\ a_2 & b_2 &\vect{e}_2 \\ a_3 & b_3 & \vect{e}_3 \end{array}\right|. \end{split}\]
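As a sanity check (ours, not from the text), the cofactor expansion in (5.4.4) can be compared with NumPy’s built-in cross product.

```python
import numpy as np

def cross_via_determinants(a, b):
    """Cross product in R^3 via the three 2x2 determinants of formula (5.4.4)."""
    return np.array([
        np.linalg.det([[a[1], b[1]], [a[2], b[2]]]),    # coefficient of e_1
        -np.linalg.det([[a[0], b[0]], [a[2], b[2]]]),   # coefficient of e_2
        np.linalg.det([[a[0], b[0]], [a[1], b[1]]]),    # coefficient of e_3
    ])

a, b = np.array([1.0, 2.0, 3.0]), np.array([4.0, 5.0, 6.0])
print(cross_via_determinants(a, b))   # [-3.  6. -3.]
print(np.cross(a, b))                 # the same vector
```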

In exactly the same fashion, we can, for \(n-1\) vectors \(\vect{a}_1, \ldots, \vect{a}_{n-1}\) in \(\R^n\), say

\[\begin{split} \vect{a}_1 = \left[\begin{array}{c} a_{11} \\ a_{21} \\ \vdots \\ a_{n1} \end{array}\right], \quad \vect{a}_2 = \left[\begin{array}{c} a_{12} \\ a_{22} \\ \vdots \\ a_{n2} \end{array}\right], \quad \ldots \quad , \quad \vect{a}_{n-1} = \left[\begin{array}{c} a_{1,(n-1)} \\ a_{2,(n-1)} \\ \vdots \\ a_{n,(n-1)} \end{array}\right] \end{split}\]

define

(5.4.5)#\[\begin{split}\vect{a}^{\ast}_n = \vect{N}(\vect{a}_1, \ldots, \vect{a}_{n-1}) = \left|\begin{array}{ccccc} a_{11} & a_{12} & \ldots & a_{1,(n-1)} & \vect{e}_1 \\ a_{21} & a_{22} & \ldots & a_{2,(n-1)} & \vect{e}_2 \\ \vdots & \vdots & & \vdots & \vdots \\ a_{n1} & a_{n2} & \ldots & a_{n,(n-1)} & \vect{e}_n \end{array}\right|.\end{split}\]

Here \(\vect{e}_1, \ldots , \vect{e}_n\) denote the vectors of the standard basis for \(\R^n\).

With some effort it can be shown that the following properties hold.

Proposition 5.4.7

Suppose that \(\vect{a}_1, \ldots, \vect{a}_{n-1}\) are vectors in \(\R^n\) and \(\vect{a}^{\ast}_n\) is defined as in Equation (5.4.5). Then the following properties hold.

  1. \(\vect{a}^{\ast}_n \perp \vect{a}_i\), for \(i = 1,2,\ldots, n-1\) .

  2. \( \{\vect{a}_1, \, \ldots, \,\vect{a}_{n-1}\}\) is linearly dependent if and only if \(\det{\left[\,\vect{a}_1\,\, \ldots\,\, \vect{a}_{n-1}\,\, \vect{a}^{\ast}_n\,\right] } = 0\) .

  3. If \( \{\vect{a}_1, \ldots, \vect{a}_{n-1}\}\) is linearly independent, then \(\det{\left[\,\vect{a}_1, \ldots, \vect{a}_{n-1}, \vect{a}^{\ast}_n\,\right] } > 0\) .

  4. The norm of the vector \(\vect{a}^{\ast}_n\) is equal to the \((n-1)\)-dimensional volume of the \((n-1)\)-dimensional parallelepiped generated by \(\vect{a}_1, \ldots, \vect{a}_{n-1}\) .

For an independent set of vectors \(\{\vect{a}_1, \ldots, \vect{a}_{n-1}\}\) in \(\R^n\), the properties of Proposition 5.4.7 uniquely determine \(\vect{a}^{\ast}_n\) as the vector \(\vect{v}\) that is orthogonal to \( \vect{a}_1, \ldots, \vect{a}_{n-1}\), has a prescribed length, and makes the ordered set \((\vect{a}_1, \ldots, \vect{a}_{n-1}, \vect{v}) \) positively oriented.
For a linearly dependent set of vectors, property 4 implies that \(\vect{a}^{\ast}_n = \vect{0}\).

Example 5.4.5

For \(n = 2\) we get, for an arbitrary vector \(\vect{v} = \left[\begin{array}{c} a \\ b \end{array}\right] \neq \left[\begin{array}{c} 0\\0 \end{array}\right] \):

\[\begin{split} \vect{w} = \vect{N}\left(\vect{v}\right) = \left|\begin{array}{cc} a & \vect{e}_1\\ b & \vect{e}_2 \end{array}\right|= a\vect{e}_2 - b\vect{e}_1 = \left[\begin{array}{c} -b \\ a \end{array}\right]. \end{split}\]

This is indeed a vector orthogonal to \(\vect{v}\) with the same ‘one-dimensional volume’, i.e., length, as the vector \(\vect{v}\).

Moreover, \(\left(\vect{v}, \vect{w}\right) = \left(\left[\begin{array}{c} a \\ b \end{array}\right] , \left[\begin{array}{c} -b \\ a \end{array}\right] \right) \) is positively oriented, as can be seen by making a sketch.

This shows that the construction also works in \(\R^2\).

Example 5.4.6

We will find the vector \(\vect{n} = N(\vect{a}_1, \vect{a}_2, \vect{a}_3)\) for the columns of the matrix

\[\begin{split} A = \left[\begin{array}{ccc} 1 & 1 & 3 \\ 1 & -1 & 1 \\ 1 & 1 & -3 \\ -1 & 1 & 1 \end{array}\right]. \end{split}\]

The first entry \(n_1\) is computed as

\[\begin{split} n_1 = (-1)^{1+4}\left|\begin{array}{ccc} 1 & -1 & 1 \\ 1 & 1 & -3 \\ -1 & 1 & 1 \end{array}\right| = -\left|\begin{array}{ccc} 1 & -1 & 1 \\ 0 & 2 & -4 \\ 0 & 0 & 2 \end{array}\right| = -4. \end{split}\]

All in all we find

\[\begin{split} \vect{n} = \left[\begin{array}{c} -4 \\ 12 \\ 4 \\ 12 \end{array}\right] = (-4)\left[\begin{array}{c} 1 \\ -3 \\ -1 \\ -3 \end{array}\right]. \end{split}\]

By taking inner products, or by computing \(A^T\vect{n}\), one checks that indeed \(\vect{n} \perp \vect{a}_i\) for each column \(\vect{a}_i\). So property 1 of Proposition 5.4.7 is satisfied.

Since the three columns are orthogonal, the ‘rectangular box’ in \(\R^4\) they generate will have 3-volume

\[ \norm{\vect{a}_1} \cdot \norm{\vect{a}_2} \cdot \norm{\vect{a}_3} = \sqrt{4}\cdot \sqrt{4}\cdot \sqrt{20} = 8\sqrt{5}. \]

This is indeed equal to

\[ \norm{\vect{n}} = \sqrt{4^2+12^2+4^2+12^2} = \sqrt{320} = 8\sqrt{5}, \]

so property 4 is satisfied too.
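Expanding the formal determinant (5.4.5) along its last column also translates directly into code; the sketch below (ours, with the function name generalized_cross as an assumption) reproduces the vector of Example 5.4.6 and checks property 1.

```python
import numpy as np

def generalized_cross(vectors):
    """N(a_1, ..., a_{n-1}): expansion of the formal determinant (5.4.5) along its last column."""
    M = np.column_stack(vectors).astype(float)   # the n x (n-1) matrix [a_1 ... a_{n-1}]
    n = M.shape[0]
    N = np.empty(n)
    for k in range(n):                           # coefficient of e_{k+1}
        N[k] = (-1) ** (k + n + 1) * np.linalg.det(np.delete(M, k, axis=0))
    return N

a1, a2, a3 = np.array([1, 1, 1, -1]), np.array([1, -1, 1, 1]), np.array([3, 1, -3, 1])
n_vec = generalized_cross([a1, a2, a3])
print(n_vec)                                     # [-4. 12.  4. 12.], as in Example 5.4.6
print(np.column_stack([a1, a2, a3]).T @ n_vec)   # [0. 0. 0.]: orthogonal to each a_i
print(np.linalg.norm(n_vec))                     # 17.88... = sqrt(320) = 8*sqrt(5)
```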

We end the chapter with a proof of Proposition 5.4.7.

Proof of Proposition 5.4.7

The properties follow from the observation that for each vector \(\vect{v}\) in \(\R^n\)

(5.4.6)#\[\begin{split}\begin{array}{rcl} \vect{N}(\vect{a}_1, \ldots, \vect{a}_{n-1})\ip\vect{v} &=& \left|\begin{array}{ccccc} a_{11} & a_{12} & \ldots & a_{1,(n-1)} & v_1 \\ a_{21} & a_{22} & \ldots & a_{2,(n-1)} & v_2 \\ \vdots & \vdots & & \vdots & \vdots \\ a_{n1} & a_{n2} & \ldots & a_{n,(n-1)} & v_n \end{array}\right|\\ &=& \det{[\,\vect{a}_1\,\, \ldots\,\, \vect{a}_{n-1}\,\,\vect{v}\,]}. \end{array}\end{split}\]

This immediate generalization of the identity \((\vect{a}\times\vect{b})\ip\vect{c} = \det{[\,\vect{a}\,\,\vect{b}\,\,\vect{c}\,] }\) follows if we expand Equation (5.4.5) along its last column, as in Equation (5.4.4).

  1. Take any of the vectors \(\vect{a}_j\) . Then (by Equation (5.4.6))


    \[ \vect{N}(\vect{a}_1, \ldots, \vect{a}_{n-1}) \ip \vect{a}_j= \det{ \left[\,\vect{a}_1\,\, \ldots\,\, \vect{a}_{n-1}\,\,\vect{a}_j\, \right] } = 0, \]

    since the determinant has two equal columns. So indeed


    \[ \vect{N}(\vect{a}_1, \ldots, \vect{a}_{n-1}) \perp \vect{a}_j,\quad j = 1, \ldots, n-1. \]
  2. First suppose that the columns of the matrix

    \[\begin{split} \begin{bmatrix} a_{11} & a_{12} & \ldots & a_{1,(n-1)} \\ a_{21} & a_{22} & \ldots & a_{2,(n-1)} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \ldots & a_{n,(n-1)} \end{bmatrix} \end{split}\]

    are linearly dependent. Then for each vector \(\vect{v}\) in \(\R^n\)


    \[ \vect{N}(\vect{a}_1, \ldots, \vect{a}_{n-1}) \ip \vect{v} = \det{ \left[\,\vect{a}_1\,\, \ldots\,\, \vect{a}_{n-1}\,\,\vect{v}\, \right] } = 0. \]

    Namely, the first \(n-1\) columns in the determinant are already linearly dependent. This implies that \(\vect{N}(\vect{a}_1, \ldots, \vect{a}_{n-1}) \) must be the zero vector.


    Conversely, if the vectors \(\{ \vect{a}_1, \,\ldots\, \, , \vect{a}_{n-1} \}\) are linearly independent, then the \(n \times (n-1)\) matrix \(A = [ \,\vect{a}_1 \,\, \ldots \,\, \vect{a}_{n-1} \,] \) has rank \(n-1\) . The matrix \(A\) must have \(n-1\) linearly independent rows. Say, if we delete the \(k\)th row we have an \((n-1) \times (n-1)\) sub-matrix with independent rows. Then the coefficient of \(\vect{e}_k\) in the expansion of \( \vect{N} ( \vect{a}_1, \ldots, \vect{a}_{n-1} ) \) , which by the defining Equation (5.4.5) is precisely (plus or minus) the determinant of this submatrix, is nonzero.

  3. This is a consequence of the observation (again using (5.4.6))


    \[\begin{split} \begin{array}{rcl} \det{\left[\,\vect{a}_1\,\, \ldots\,\, \vect{a}_{n-1}\,\,\vect{N}(\vect{a}_1, \ldots, \vect{a}_{n-1})\, \right]} &=& \vect{N}(\vect{a}_1, \ldots, \vect{a}_{n-1}) \ip \vect{N}(\vect{a}_1, \ldots, \vect{a}_{n-1})\\ &=& \norm{\vect{N}(\vect{a}_1, \ldots, \vect{a}_{n-1})}^2 \geq 0, \end{array} \end{split}\]

    and the already established fact that \(\vect{N}(\vect{a}_1, \ldots, \vect{a}_{n-1}) \neq \vect{0}\)
    if \(\{\vect{a}_1,\, \ldots\,,\, \vect{a}_{n-1}\}\) is linearly independent.

  4. We sketch the idea, which we borrow from volume versus area considerations in \(\R^2\) and \(\R^3\) . We defined the volume of the \(n\)-dimensional parallelepiped \(\mathcal{P} \left(\vect{a}_1, \ldots, \vect{a}_{n} \right) \) generated by the \(n\) vectors \(\vect{a}_1, \ldots, \vect{a}_{n}\) as the absolute value of a determinant:


    \[ \text{Vol}_n\left(\mathcal{P}(\vect{a}_1, \ldots, \vect{a}_{n}) \right) = |\det{\left[\,\vect{a}_1\,\, \ldots\,\, \,\vect{a}_{n}\,\right] }|. \]

    The height times base principle in \(\R^n\) must be:

    if

    \[ \vect{a}_{n} \perp \mathcal{P}(\vect{a}_1, \ldots, \vect{a}_{n-1}) \]

    then


    \[ \text{Vol}_n\left(\mathcal{P}(\vect{a}_1, \ldots, \vect{a}_{n}) \right) = \text{Vol}_{n-1} \left(\mathcal{P}(\vect{a}_1, \ldots, \vect{a}_{n-1}) \right) \cdot \norm{\vect{a}_{n}}, \]

    where \(\text{Vol}_{n-1}\) denotes the \((n-1)\)-dimensional volume of an \((n-1)\)-dimensional subset of \(\R^n\).

    We apply this principle to the vector \(\vect{a}_{n} = \vect{a}^{\ast}_n = \vect{N}(\vect{a}_1, \ldots, \vect{a}_{n-1})\).

    We know that \(\vect{a}^{\ast}_n\) is orthogonal to all vectors \(\vect{a}_1, \ldots, \vect{a}_{n-1}\). So the ‘height’ of \(\mathcal{P}(\vect{a}_1, \ldots, \vect{a}_{n-1}, \vect{a}^{\ast}_n)\) is equal to \(\norm{\vect{a}^{\ast}_n}\).


    On the one hand we then have that


    \[ \text{Vol}_n\left(\mathcal{P}(\vect{a}_1, \ldots, \vect{a}_{n-1}, \vect{a}^{\ast}_n) \right) = \text{Vol}_{n-1}\left(\mathcal{P}(\vect{a}_1, \ldots, \vect{a}_{n-1}) \right) \cdot \norm{\vect{a}^{\ast}_n} \]

    and on the other hand


    \[\begin{split} \begin{array}{rcl} \text{Vol}_n\left(\mathcal{P}(\vect{a}_1, \ldots, \vect{a}_{n-1}, \vect{a}^{\ast}_n) \right) &=& |\det{ \left[\,\vect{a}_1\,\, \ldots\,\, \vect{a}_{n-1}\,\, \vect{a}^{\ast}_n\, \right] }| \\ &=& | \vect{a}^{\ast}_n\ip \vect{a}^{\ast}_n| = \norm{\vect{a}^{\ast}_n}^2. \end{array} \end{split}\]

    Equating the two expressions for \(\text{Vol}_n \left(\mathcal{P} (\vect{a}_1, \ldots, \vect{a}_{n-1}, \vect{a}^{\ast}_n) \right) \)
    we conclude that indeed


    \[ \norm{\vect{a}^{\ast}_n} = \text{Vol}_{n-1} \left(\mathcal{P} (\vect{a}_1, \ldots, \vect{a}_{n-1}) \right).\]

5.4.5. Grasple Exercises#

Grasple Exercise 5.4.1

https://embed.grasple.com/exercises/ddb8daf3-3773-44c9-8df0-fe3084a6e7c4?id=93170

To compute the area of a triangle with sides \(\vect{u}\) and \(\vect{v}\) in the plane.

Grasple Exercise 5.4.2

https://embed.grasple.com/exercises/8d7a0672-6283-4bb1-9b43-b41a03067e40?id=93171

To find a point \(C\) on a line, such that  the area of a triangle \(ABC\) has a given value.

Grasple Exercise 5.4.3

https://embed.grasple.com/exercises/62770bc4-da31-4212-a713-bb2843b0e580?id=93172

Which points lie on the same side of a plane?

Grasple Exercise 5.4.4

https://embed.grasple.com/exercises/f787e084-9a77-40b4-b755-97890b98cfb6?id=93176

To solve a 3x3 system using Cramer’s rule.

Grasple Exercise 5.4.5

https://embed.grasple.com/exercises/3add427a-88a3-4da0-8f0a-2bf8bb8781dd?id=93179

Finding two entries in the inverse of a 4x4 matrix (using the adjugate matrix).

Grasple Exercise 5.4.6

https://embed.grasple.com/exercises/8d4a98f7-50ac-4705-8b34-680b7b8395d9?id=93181

To find a vector orthogonal to \(\vect{v}_1,\vect{v}_2,\vect{v}_3\) in \(\mathbb{R}^4\), with good orientation.

Grasple Exercise 5.4.7

https://embed.grasple.com/exercises/bc3df113-95b3-470a-a730-3ad8faab08f5?id=93183

To compute the normal vector \(N(\vect{a}_1,\vect{a}_2,\vect{a}_3)\) as in Subsection 5.4.4.