1.2. Dot Product#

In this section we will consider other (geometric) properties of vectors, like the length of a vector and the angle between two vectors. When the angle between two vectors is equal to \(\frac12\pi\), the two vectors are called perpendicular, or orthogonal. These properties can all be expressed using a new operation: the inner product or dot product.

We will start by considering vectors in \(\mathbb{R}^2\) and \(\mathbb{R}^3\). The translation of the concepts to the general space \(\mathbb{R}^n\) will then become more or less immediate.

1.2.1. Length and perpendicularity in \(\mathbb{R}^2\) and \(\mathbb{R}^3\)#

The length of a vector

\[\begin{split} \mathbf{v}=\begin{bmatrix} a_{1}\\a_{2} \end{bmatrix} \end{split}\]

in the plane, which we denote by \(\norm{\mathbf{v}}\), can be computed using the Pythagorean theorem:

(1.2.1)#\[\norm{\mathbf{v}} = \sqrt{a_1^2+a_2^2}\]
../_images/Fig-InnerProduct-Length-2D.svg

Fig. 1.2.1 The length of a vector via Pythagoras’ Theorem#

../_images/Fig-InnerProduct-length-3D.svg

Fig. 1.2.2 The length of a vector via Pythagoras’ Theorem#

Using this theorem twice we find a similar formula for the length of a vector

\[\begin{split} \mathbf{v}=\begin{bmatrix} a_{1}\\a_{2}\\a_{3}\end{bmatrix} \end{split}\]

in \(\mathbb{R}^3\). Look at Figure 1.2.2. There are two right triangles: \(\Delta OPQ\) where \(\angle OPQ\) is right, and \(\Delta OQA\) where \(\angle OQA\) is right.

From

\[ OQ^2 = OP^2 + PQ^2 = a_1^2 + a_2^2, \]

where for two points \(A\) and \(B\), by \(AB\) we denote the length of the vector \(\overrightarrow{AB}\), and

\[ OA^2 = OQ^2+QA^2 = a_1^2 + a_2^2+a_3^2 \]

we find that

(1.2.2)#\[\norm{\mathbf{v}}= OA = \sqrt{a_1^2 + a_2^2+a_3^2}\]
../_images/Fig-InnerProduct-perp-non-perp.svg

Fig. 1.2.3 Perpendicular versus non-perpendicular#

Let us now turn our attention to another important geometric concept, namely that of perpendicularity. It is clear from Figure 1.2.3 that the vectors \(\begin{bmatrix}2\\3\end{bmatrix}\) and \(\begin{bmatrix}-3\\2\end{bmatrix}\) are perpendicular, whereas the vectors \(\begin{bmatrix}2\\3\end{bmatrix}\) and \(\begin{bmatrix}-1\\3\end{bmatrix}\) are not.
There is another way to look at this, which will be useful for the definition of perpendicularity in higher dimensions. To that end, consider Figure 1.2.4. Here you see two vectors \(\vect{v}\) and \(\vect{w}\) and the parallelogram they span. You also see the diagonals of this parallelogram, which are given by \(\vect{v}+\vect{w}\) and \(\vect{v}-\vect{w}\). Two vectors are perpendicular if and only if the parallelogram they span is a rectangle, and this is exactly the situation where the diagonals have the same length, i.e.,

(1.2.3)#\[\norm{\mathbf{v}+\mathbf{w}} = \norm{\mathbf{v}-\mathbf{w}}.\]
../_images/Fig-InnerProduct-diagonal-parallelogram.svg

Fig. 1.2.4 The parallelogram spanned by \(\vect{v}\) and \(\vect{w}\) and its diagonals. How should you choose \(\vect{v}\) and \(\vect{w}\) such that the diagonals have the same length?#

In the picture on the right the vectors are not perpendicular and

\[ \norm{\mathbf{v}+\mathbf{w}} \neq \norm{\mathbf{v}-\mathbf{w}}. \]

So far we have been talking about two (non-zero) vectors in the plane, i.e., in \(\mathbb{R}^2\). However, two vectors in \(\mathbb{R}^3\) form a parallelogram as well, which also becomes a rectangle if and only if the vectors are perpendicular. We introduce a notation for this: if \( \mathbf{v}\) and \(\mathbf{w}\) are perpendicular, we write this as

(1.2.4)#\[\mathbf{v} \perp \mathbf{w}\]

Taking squares in Equation (1.2.3), we see that the following holds both in \(\mathbb{R}^2\) and in \(\mathbb{R}^3\):

\[ \mathbf{v} \perp \mathbf{w} \iff \norm{\mathbf{v}+\mathbf{w}}^2 = \norm{\mathbf{v}-\mathbf{w}}^2. \]

If we write this out for two arbitrary vectors \(\mathbf{v}=\begin{bmatrix} a_{1}\\a_{2}\end{bmatrix},\mathbf{w}=\begin{bmatrix} b_{1}\\b_{2}\end{bmatrix}\) in \(\mathbb{R}^2\) we get the following:

\[\begin{split} \begin{array}{rcl} \mathbf{v} \perp \mathbf{w} &\iff &\norm{\mathbf{v}+\mathbf{w}}^2 = \norm{\mathbf{v}-\mathbf{w}}^2\\ &\iff &(a_1+b_1)^2 + (a_2+b_2)^2 = (a_1-b_1)^2 + (a_2-b_2)^2\\ &\iff &a_1^2+2a_1b_1 + b_1^2 + a_2^2+2a_2b_2 + b_2^2 = a_1^2 -2a_1b_1+b_1^2+ a_2^2 -2a_2b_2+b_2^2\\ &\iff &4(a_1b_1 +a_2b_2)=0 \\ &\iff &a_1b_1 +a_2b_2=0. \end{array} \end{split}\]

Likewise, for vectors \(\mathbf{v}=\begin{bmatrix} a_{1}\\a_{2}\\a_{3}\end{bmatrix},\,\mathbf{w}=\begin{bmatrix} b_{1}\\b_{2}\\b_{3}\end{bmatrix}\) in \(\mathbb{R}^3\):

(1.2.5)#\[\mathbf{v} \perp \mathbf{w} \iff a_1b_1 +a_2b_2+a_3b_3=0.\]

The derivation is completely analogous to the one above, only now we have one extra term. So to check ‘algebraically’ whether two vectors are perpendicular we just have to compute \(a_1b_1 +a_2b_2\, (\,+\,a_3b_3\,)\) and see whether this is equal to 0.

This expression is called the inner product (or dot product) of the vectors \(\mathbf{v}\) and \(\mathbf{w}\). We denote it by \(\mathbf{v}\ip\mathbf{w}\). Note that the dot product of a general vector \(\mathbf{v}=\begin{bmatrix} a_{1}\\a_{2}\\a_{3}\end{bmatrix}\) in \(\mathbb{R}^3\) with itself gives

\[ \mathbf{v}\ip\mathbf{v} = a_1^2+a_2^2+a_3^2 = \norm{\mathbf{v}}^2, \]

so the length of a vector can be expressed as follows using the dot product

(1.2.6)#\[\norm{\mathbf{v}} = \sqrt{\mathbf{v}\ip\mathbf{v}\,}.\]

Using the dot product the concepts length and perpendicular easily carry over to any \(\mathbb{R}^n\), \(n \geq 4\). Let’s do it one by one, starting by generalizing the dot product in the next subsection.
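The two tests for perpendicularity derived above, equal diagonals as in (1.2.3) and a vanishing dot product, can be compared numerically. The following Python sketch does this for the vectors of Figure 1.2.3. The helper names `dot`, `length` and `diagonals_equal` are illustrative choices and not notation from the text.

```python
import math

def dot(v, w):
    """Dot product a1*b1 + a2*b2 (+ a3*b3) of two equally long lists."""
    return sum(a * b for a, b in zip(v, w))

def length(v):
    """Length of v, computed as in formula (1.2.6)."""
    return math.sqrt(dot(v, v))

def diagonals_equal(v, w):
    """Test (1.2.3): do the diagonals v + w and v - w have the same length?"""
    plus = [a + b for a, b in zip(v, w)]
    minus = [a - b for a, b in zip(v, w)]
    return math.isclose(length(plus), length(minus))

v = [2, 3]
print(dot(v, [-3, 2]), diagonals_equal(v, [-3, 2]))  # 0 True
print(dot(v, [-1, 3]), diagonals_equal(v, [-1, 3]))  # 7 False
```

Both tests agree: the first pair is perpendicular, the second is not.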

1.2.2. Dot product in \(\mathbb{R}^n\)#

Definition 1.2.1

The dot product (or inner product) of two vectors \(\mathbf{v}=\begin{bmatrix}a_{1}\\a_{2}\\ \vdots\\a_{n}\end{bmatrix}\) and \(\mathbf{w}=\begin{bmatrix}b_{1}\\b_{2}\\ \vdots\\b_{n}\end{bmatrix}\) in \(\mathbb{R}^n\) is defined as

(1.2.7)#\[\mathbf{v}\ip\mathbf{w} = a_1b_1 +a_2b_2+ \ldots + a_nb_n.\]

Example 1.2.1

The dot product of the two vectors

\[\begin{split} \mathbf{v}_1=\begin{bmatrix} 5\\3\\4\\-2\end{bmatrix} \quad \text{and}\quad \mathbf{v}_2=\begin{bmatrix} 2\\3\\0\\1\end{bmatrix} \end{split}\]

is given by

\[ \mathbf{v}_1\ip\mathbf{v}_2 = 5\cdot2 + 3\cdot3 +4\cdot0 + (-2)\cdot1 = 17. \]

And the dot product of the two vectors

\[\begin{split} \mathbf{v}_1=\begin{bmatrix} 5\\3\\4\\-2\end{bmatrix} \quad \text{and}\quad \mathbf{v}_3=\begin{bmatrix} -4\\3\\2\end{bmatrix} \end{split}\]

is not defined. In fact, the dot product of a vector \(\mathbf{v}\) in \(\mathbb{R}^m\) and a vector \(\mathbf{w}\) in \(\mathbb{R}^n\) is only defined if \(m = n\).
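As a minimal sketch, Definition 1.2.1 and the size restriction of this example can be written in Python as follows. Vectors are represented as plain lists. The function name `dot` and the choice to raise a `ValueError` for vectors of different sizes are assumptions made for this illustration.

```python
def dot(v, w):
    """Dot product of two vectors given as lists, following Definition 1.2.1."""
    if len(v) != len(w):
        # A vector in R^m and a vector in R^n only have a dot product if m == n.
        raise ValueError("dot product is only defined for vectors of equal size")
    return sum(a * b for a, b in zip(v, w))

v1 = [5, 3, 4, -2]
v2 = [2, 3, 0, 1]
v3 = [-4, 3, 2]

print(dot(v1, v2))  # 17

try:
    dot(v1, v3)
except ValueError as err:
    print("undefined:", err)
```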

We state the characteristic rules of the dot product in \(\mathbb{R}^n\), which in the sequel we will use time and again, in the following proposition.

Proposition 1.2.1

The following properties hold for any vectors \(\mathbf{v},\mathbf{v}_1,\mathbf{v}_2,\mathbf{v}_3\) in \(\mathbb{R}^n\) and scalars \(c \in \mathbb{R}\):

i. \(\mathbf{v}_1\ip\mathbf{v}_2 = \mathbf{v}_2\ip\mathbf{v}_1\).

ii. \((c\mathbf{v}_1)\ip\mathbf{v}_2 = c(\mathbf{v}_1\ip\mathbf{v}_2) = \mathbf{v}_1\ip(c \mathbf{v}_2)\).

iii. \((\mathbf{v}_1+\mathbf{v}_2)\ip\mathbf{v}_3 = \mathbf{v}_1\ip\mathbf{v}_3+\mathbf{v}_2\ip\mathbf{v}_3\).

iv. \(\mathbf{v}\ip\mathbf{v} \geq 0\), and \(\mathbf{v}\ip\mathbf{v} = 0 \iff \mathbf{v} = \mathbf{0}\).

Proof. The first three properties follow from the corresponding properties of real numbers. For instance, for the first rule we simply use that \(xy = yx\) holds for the product of real numbers.

i. Let

\[\begin{split} \mathbf{v}_1=\begin{bmatrix} a_1\\a_2\\ \vdots\\ a_n\end{bmatrix} \quad \text{and}\quad \mathbf{v}_2=\begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{bmatrix} \end{split}\]

be two arbitrary vectors in \(\mathbb{R}^n\). Then

\[\begin{split} \begin{align*} \mathbf{v}_1 \ip \mathbf{v}_2 &= \begin{bmatrix}a_{1} \\ a_{2}\\ \vdots\\a_{n}\end{bmatrix} \ip \begin{bmatrix}b_{1} \\ b_{2}\\ \vdots \\ b_{n}\end{bmatrix} = a_1b_1 +a_2b_2+ \ldots + a_nb_n \\ &= b_1a_1 +b_2a_2+ \ldots + b_na_n = \begin{bmatrix}b_{1} \\ b_{2}\\ \vdots \\ b_{n}\end{bmatrix}\ip\begin{bmatrix}a_{1} \\ a_{2}\\ \vdots\\ a_{n}\end{bmatrix} = \mathbf{v}_2\ip\mathbf{v}_1. \end{align*} \end{split}\]

ii. For two vectors \(\vect{v}_1 = \begin{bmatrix}a_{1} \\ a_{2}\\ \vdots\\ a_{n}\end{bmatrix}\), \(\vect{v}_2 = \begin{bmatrix}b_{1} \\ b_{2}\\ \vdots\\ b_{n}\end{bmatrix}\), and constants \(c\) we see that

\[\begin{split} \begin{eqnarray*} (c\mathbf{v_1})\ip\mathbf{v_2} &=& \begin{bmatrix}ca_{1}\\ca_{2}\\ \vdots\\ca_{n}\end{bmatrix}\ip\begin{bmatrix}b_{1}\\b_{2}\\ \vdots\\b_{n}\end{bmatrix} = (ca_1)b_1 + (ca_2)b_2+ \ldots + (ca_n)b_n \\ &=& c\,(a_1b_1 +a_2b_2+ \ldots + a_nb_n) = c\, (\mathbf{v_1}\ip\mathbf{v_2}) \end{eqnarray*} \end{split}\]

iii. This is proved in the same way as ii. (see Exercise 1.2.1).

iv. This consists of two statements. For the first, we note that

\[ \mathbf{v}\ip\mathbf{v} = a_1a_1 +a_2a_2+ \ldots + a_na_n = a_1^2+a_2^2 + \ldots + a_n^2 \]

is the sum of squares of real numbers, so it is nonnegative. That is,

\[ \mathbf{v}\ip\mathbf{v} \geq 0. \]

To prove the second statement, we see that

\[ \mathbf{v}\ip\mathbf{v} = a_1^2+a_2^2 + \ldots + a_n^2 = 0 \]

if and only if all the squares are 0, which only happens if each entry \(a_i\) is equal to zero, that is, if \(\mathbf{v} = \mathbf{0}\).

Exercise 1.2.1

Prove property iii.

Solution to Exercise 1.2.1 (click to show)

Let

\[\begin{split} \mathbf{v}_1=\begin{bmatrix} a_1\\a_2\\ \vdots\\ a_n\end{bmatrix} \quad \text{and}\quad \mathbf{v}_2=\begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{bmatrix} \quad \text{and}\quad \mathbf{v}_3=\begin{bmatrix} c_1 \\ c_2 \\ \vdots \\ c_n \end{bmatrix}\end{split}\]

be three arbitrary vectors in \(\mathbb{R}^n\). Then

\[\begin{split} \begin{align*} \left(\mathbf{v}_1 + \mathbf{v}_2 \right) \ip \mathbf{v}_3 &= \left(\begin{bmatrix}a_{1} \\ a_{2}\\ \vdots\\a_{n}\end{bmatrix} + \begin{bmatrix}b_{1} \\ b_{2}\\ \vdots \\ b_{n}\end{bmatrix} \right) \ip \begin{bmatrix} c_1 \\ c_2 \\ \vdots \\ c_n \end{bmatrix} \\ &= \begin{bmatrix} a_1+b_1\\a_2+b_2\\ \vdots\\ a_n+b_n\end{bmatrix}\ip \begin{bmatrix} c_1 \\ c_2 \\ \vdots \\ c_n \end{bmatrix} \\ &= (a_1+b_1)c_1 +(a_2+b_2)c_2+ \ldots + (a_n+b_n)c_n \\ &= a_1c_1 +b_1c_1+a_2c_2+b_2c_2 + \ldots + a_nc_n+b_nc_n \\ &= a_1c_1 +a_2c_2+\ldots + a_nc_n +b_1c_1+b_2c_2 + \ldots +b_nc_n \\ &= \begin{bmatrix}a_{1} \\ a_{2}\\ \vdots\\a_{n}\end{bmatrix}\ip\begin{bmatrix} c_1 \\ c_2 \\ \vdots \\ c_n \end{bmatrix}+\begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{bmatrix}\ip\begin{bmatrix} c_1 \\ c_2 \\ \vdots \\ c_n \end{bmatrix} \\ &= \mathbf{v}_1\ip\mathbf{v}_3+\mathbf{v}_2\ip\mathbf{v}_3. \end{align*} \end{split}\]

Exercise 1.2.2

Prove the identity

\[ (\mathbf{v}_1+\mathbf{v}_2)\ip(\mathbf{v}_1-\mathbf{v}_2) = \mathbf{v}_1\ip\mathbf{v}_1-\mathbf{v}_2\ip\mathbf{v}_2. \]
Solution to Exercise 1.2.2 (click to show)

First of all, because of rule i. and rule iii. of Proposition 1.2.1 it holds that

\[ \mathbf{v}_1\ip(\mathbf{v}_2+\mathbf{v}_3) = \mathbf{v}_1\ip\mathbf{v}_2+\mathbf{v}_1\ip\mathbf{v}_3 \]

and it also follows from ii. and iii. that

\[ \mathbf{v}_1\ip(\mathbf{v}_2-\mathbf{v}_3) = \mathbf{v}_1\ip(\mathbf{v}_2+(-1)\mathbf{v}_3) =\mathbf{v}_1\ip\mathbf{v}_2+\mathbf{v}_1\ip(-1\mathbf{v}_3) = \mathbf{v}_1\ip\mathbf{v}_2-\mathbf{v}_1\ip\mathbf{v}_3 \]

Then the statement is proved by the following chain of identities

\[\begin{split} \begin{array}{rcl}(\mathbf{v}_1+\mathbf{v}_2)\ip(\mathbf{v}_1-\mathbf{v}_2) &=& \mathbf{v}_1\ip(\mathbf{v}_1-\mathbf{v}_2) + \mathbf{v}_2\ip(\mathbf{v}_1-\mathbf{v}_2) \\ &=& \mathbf{v}_1\ip\mathbf{v}_1-\mathbf{v}_1\ip\mathbf{v}_2 + \mathbf{v}_2\ip\mathbf{v}_1-\mathbf{v}_2\ip\mathbf{v}_2\\ &=& \mathbf{v}_1\ip\mathbf{v}_1-\mathbf{v}_2\ip\mathbf{v}_2. \end{array}\end{split}\]

Exercise 1.2.3

Prove the identity

\[ \norm{\mathbf{v}_1+\mathbf{v}_2}^2 + \norm{\mathbf{v}_1-\mathbf{v}_2}^2 = 2 (\norm{\mathbf{v}_1}^2 + \norm{\mathbf{v}_2}^2), \]

and explain why it is called the parallelogram rule.

Solution to Exercise 1.2.3 (click to show)

Again it’s a chain of identities using basic properties of the dot product.

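\[\begin{split} \begin{array}{rcl} \norm{\mathbf{v}_1+\mathbf{v}_2}^2 + \norm{\mathbf{v}_1-\mathbf{v}_2}^2&=& (\mathbf{v}_1+\mathbf{v}_2)\ip(\mathbf{v}_1+\mathbf{v}_2) + (\mathbf{v}_1-\mathbf{v}_2)\ip(\mathbf{v}_1-\mathbf{v}_2) \\ &=& \mathbf{v}_1\ip\mathbf{v}_1 +2\,\mathbf{v}_1\ip\mathbf{v}_2 + \mathbf{v}_2\ip\mathbf{v}_2 + \mathbf{v}_1\ip\mathbf{v}_1 -2\,\mathbf{v}_1\ip\mathbf{v}_2 + \mathbf{v}_2\ip\mathbf{v}_2 \\ &=& 2\,\mathbf{v}_1\ip\mathbf{v}_1 +2\,\mathbf{v}_2\ip\mathbf{v}_2 \\ &=& 2 (\norm{\mathbf{v}_1}^2 + \norm{\mathbf{v}_2}^2). \end{array}\end{split}\]

The vectors \(\mathbf{v}_1+\mathbf{v}_2\) and \(\mathbf{v}_1-\mathbf{v}_2\) are the diagonals of the parallelogram spanned by \(\mathbf{v}_1\) and \(\mathbf{v}_2\) (see Figure 1.2.4). The identity thus states that the sum of the squares of the lengths of the two diagonals equals the sum of the squares of the lengths of the four sides.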
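A numerical spot check is no substitute for the proofs, but it can catch a sign error quickly. The Python sketch below evaluates both sides of the identities of Exercise 1.2.2 and Exercise 1.2.3 for two vectors in \(\mathbb{R}^4\). The vectors and the helper names `dot`, `add` and `sub` are hypothetical and chosen only for illustration. Squared norms are computed as dot products, so the arithmetic stays exact for integer entries.

```python
def dot(v, w):
    """Dot product of two equally long lists."""
    return sum(a * b for a, b in zip(v, w))

def add(v, w):
    return [a + b for a, b in zip(v, w)]

def sub(v, w):
    return [a - b for a, b in zip(v, w)]

# Hypothetical test vectors, not taken from the text.
v1 = [1, -2, 3, -1]
v2 = [4, 0, -5, 2]

# Exercise 1.2.2: (v1 + v2).(v1 - v2) = v1.v1 - v2.v2
lhs = dot(add(v1, v2), sub(v1, v2))
rhs = dot(v1, v1) - dot(v2, v2)
print(lhs, rhs)  # -30 -30

# Exercise 1.2.3: ||v1 + v2||^2 + ||v1 - v2||^2 = 2(||v1||^2 + ||v2||^2)
diagonals = dot(add(v1, v2), add(v1, v2)) + dot(sub(v1, v2), sub(v1, v2))
sides = 2 * (dot(v1, v1) + dot(v2, v2))
print(diagonals, sides)  # 120 120
```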

1.2.3. Orthogonality#

In \(\mathbb{R}^2\) and \(\mathbb{R}^3\) the dot product gives an easy way to check whether two vectors are perpendicular:

\[ \mathbf{v}\perp\mathbf{w} \iff \mathbf{v}\ip\mathbf{w} = 0. \]

We use this identity to define the concept of perpendicularity in \(\mathbb{R}^n\). It seems a bit ‘academic’, but in this more general setting the term orthogonal is used.

Definition 1.2.2

Two vectors \(\mathbf{v}\) and \(\mathbf{w}\) in \(\mathbb{R}^n\) are called orthogonal if \(\mathbf{v}\ip\mathbf{w} = 0\). As before, we denote this by \(\mathbf{v}\perp\mathbf{w}\).

Example 1.2.2

Let \(\mathbf{u} = \begin{bmatrix} 1\\2\\-1\\-1\end{bmatrix}\), \(\mathbf{v} = \begin{bmatrix} 3\\-1\\2\\-1\end{bmatrix}\), \(\mathbf{w} = \begin{bmatrix} 2\\2\\-1\\2\end{bmatrix}\).

We compute

\[ \mathbf{u}\ip\mathbf{v} = 3-2-2+1 = 0, \]
\[ \mathbf{u}\ip\mathbf{w} = 2+4+1-2 = 5, \]
\[ \mathbf{v}\ip\mathbf{w} = 6 - 2 - 2 - 2 = 0, \]

and conclude that \(\mathbf{u}\) and \(\mathbf{v}\) are orthogonal, \(\mathbf{u}\) and \(\mathbf{w}\) are not orthogonal,
\(\mathbf{v}\) and \(\mathbf{w}\) are orthogonal.
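The three checks of this example can be reproduced with a short Python sketch. The dictionary `vectors` and the helper `dot` are names introduced here for illustration only.

```python
from itertools import combinations

def dot(v, w):
    """Dot product of two equally long lists."""
    return sum(a * b for a, b in zip(v, w))

vectors = {
    "u": [1, 2, -1, -1],
    "v": [3, -1, 2, -1],
    "w": [2, 2, -1, 2],
}

# Test every pair for orthogonality according to Definition 1.2.2.
for (name1, x), (name2, y) in combinations(vectors.items(), 2):
    d = dot(x, y)
    verdict = "orthogonal" if d == 0 else "not orthogonal"
    print(f"{name1}.{name2} = {d}: {verdict}")
# u.v = 0: orthogonal
# u.w = 5: not orthogonal
# v.w = 0: orthogonal
```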

Grasple Exercise 1.2.1

https://embed.grasple.com/exercises/59912254-6fc8-43c7-9c44-1ea7eab1c236?id=62409

To compute some dot products in \(\R^2, \R^3, \R^4\).

In \(\mathbb{R}^2\), two nonzero vectors that are orthogonal to the same nonzero vector \(\mathbf{v}\) are automatically multiples of each other (i.e. have either the same or the opposite direction). In \(\mathbb{R}^n\) with \(n \geq 3\) this no longer holds. In Example 1.2.2 both vectors \(\mathbf{u}\) and \(\mathbf{w}\) are orthogonal to the vector \(\mathbf{v}\), but \(\mathbf{u}\) is not a multiple of \(\mathbf{w}\): \(\mathbf{u} \neq c\mathbf{w}\) for every scalar \(c\).

By definition the zero vector is orthogonal to any vector, since \(\mathbf{0}\ip\mathbf{v} = 0\). Moreover, the zero vector is the only vector that is orthogonal to itself, which is the content of the next proposition.

Proposition 1.2.2

Suppose \(\mathbf{v} \in \mathbb{R}^n\).   Then \(\mathbf{v}\perp\mathbf{v} \iff \mathbf{v} = \mathbf{0}\).

Proof. By definition

\[ \mathbf{v}\perp\mathbf{v} \iff \mathbf{v}\ip\mathbf{v}=0 \]

In Proposition 1.2.1 iv. it was stated that the last equality only holds for \(\mathbf{v} = \mathbf{0}\).

The fact that the zero vector is orthogonal to any vector is an immediate consequence of the definition, but it may seem counterintuitive to you. The following example illustrates a situation where this orthogonality leads to a much nicer outcome.

Example 1.2.3

Let \(\mathbf{n}\) be any nonzero vector in the plane. The vectors that are orthogonal to \(\mathbf{n}\) all lie on a line through the origin. (See Figure 1.2.5.) If we agree that \(\mathbf{0}\perp\mathbf{n}\), the set of these vectors is the whole line. The vector \(\mathbf{n}\) is often said to be a normal vector to the line.

../_images/Fig-InnerProduct-PerpendicularLine.svg

Fig. 1.2.5 Vectors orthogonal to a non-zero vector \(\mathbf{n}\) in the plane#

We conclude this subsection with another concept that we will come across later in a much more general context. Informally, it is the (orthogonal) projection of a vector onto another vector. More precisely, it is the orthogonal projection of a vector \(\mathbf{w}\) onto the line \(\mathcal{L}\) generated by the nonzero vector \(\mathbf{v}\), by which we mean \(\mathcal{L}= \{ c\mathbf{v}: c \in \mathbb{R}\}\).

See Figure 1.2.6.

Definition 1.2.3

The orthogonal projection of a vector \(\mathbf{w}\) onto the nonzero vector \(\mathbf{v}\) is the vector \(\mathbf{\hat{w}} = c\mathbf{v} \) for which

\[ (\mathbf{w} - \mathbf{\hat{w}}) \perp \mathbf{v}. \]

Another notation for this vector is

\[ \mathbf{\hat{w}} = \text{proj}_{\mathbf{v}}(\mathbf{w}). \]
../_images/Fig-InnerProduct-ProjectionVectorLine.svg

Fig. 1.2.6 Projection of a vector \(\mathbf{w}\) onto a non-zero vector \(\mathbf{v}\)#

Proposition 1.2.3

In the definition above the vector \(\mathbf{\hat{w}}\) with these properties is unique and it is given by

\[ \text{proj}_{\mathbf{v}}(\mathbf{w}) = \mathbf{\hat{w}} = \frac{\mathbf{w}\ip\mathbf{v}}{\mathbf{v}\ip\mathbf{v}} \mathbf{v}. \]

Proof. With the rules of the dot product the vector \(\mathbf{\hat{w}}\) is easily constructed.
Starting from

\[ \mathbf{\hat{w}} = c\mathbf{v}, \text{ for some } c\in\mathbb{R} \]

and

\[ (\mathbf{w} - \mathbf{\hat{w}}) \perp \mathbf{v} \]

it follows that we must have

\[ (\mathbf{w} - c\mathbf{v}) \ip \mathbf{v} = \mathbf{w}\ip \mathbf{v} - c \,(\mathbf{v}\ip \mathbf{v}) = 0 \]

so that \(c\) is uniquely given by

\[ c = \frac{\mathbf{w}\ip \mathbf{v}}{\mathbf{v}\ip \mathbf{v}} \]

and indeed \(\mathbf{\hat{w}}\) must be as stated.

Example 1.2.4

We compute the orthogonal projection of the vector

\[\begin{split} \mathbf{w} = \begin{bmatrix} 2\\ -4 \\ -1 \\ -5\end{bmatrix} \end{split}\]

onto the vector

\[\begin{split} \mathbf{v} = \begin{bmatrix} 1 \\1\\1\\1\end{bmatrix}. \end{split}\]

We proceed as follows

\[\begin{split} \mathbf{\hat{w}} = \text{proj}_{\mathbf{v}}(\mathbf{w}) = \frac{\mathbf{w}\ip\mathbf{v}}{\mathbf{v}\ip\mathbf{v}} \mathbf{v} = \frac{-8}{4}\begin{bmatrix} 1 \\1\\1\\1\end{bmatrix} = \begin{bmatrix} -2\\-2\\-2\\-2\end{bmatrix}. \end{split}\]

We verify the orthogonality:

\[\begin{split} (\mathbf{w} - \mathbf{\hat{w}} )\ip \mathbf{v} = \begin{bmatrix} 4 \\-2\\1\\-3\end{bmatrix} \ip \begin{bmatrix} 1 \\1\\1\\1\end{bmatrix} = 4-2+1-3 = 0, \end{split}\]

so indeed

\[ (\mathbf{w} - \mathbf{\hat{w}} )\perp \mathbf{v}, \]

as required.
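The formula of Proposition 1.2.3 translates directly into code. The sketch below is one possible Python version, assuming vectors with integer entries stored as lists, so that `fractions.Fraction` keeps the coefficient \(c\) exact. The function name `project` is a hypothetical choice.

```python
from fractions import Fraction

def dot(v, w):
    """Dot product of two equally long lists."""
    return sum(a * b for a, b in zip(v, w))

def project(w, v):
    """Orthogonal projection of w onto the nonzero vector v (Proposition 1.2.3)."""
    if dot(v, v) == 0:
        raise ValueError("v must be a nonzero vector")
    c = Fraction(dot(w, v), dot(v, v))  # exact for integer entries
    return [c * a for a in v]

w = [2, -4, -1, -5]
v = [1, 1, 1, 1]

w_hat = project(w, v)
residual = [a - b for a, b in zip(w, w_hat)]

print([int(x) for x in w_hat])  # [-2, -2, -2, -2]
print(dot(residual, v))         # 0
```

The second printed value confirms that \(\mathbf{w} - \mathbf{\hat{w}}\) is orthogonal to \(\mathbf{v}\).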

Grasple Exercise 1.2.2

https://embed.grasple.com/exercises/88c460cd-36ee-49b0-8fb8-d29b55ad253a?id=84822

Computing the projection of a vector \(\vect{w}\) onto a vector \(\vect{v}\).

Exercise 1.2.4

Suppose \(\text{proj}_{\mathbf{v}}(\mathbf{w}_1) = \text{proj}_{\mathbf{v}}(\mathbf{w}_2) \), for three nonzero vectors \(\mathbf{v}, \,\mathbf{w}_1,\,\mathbf{w}_2\) in \(\mathbb{R}^n\). What does this say about the relative positions of the three vectors?

Verify your statement for the following three vectors

\[\begin{split} \mathbf{v} = \begin{bmatrix} 1\\ 1 \\ -2 \\ -3\end{bmatrix}, \quad \mathbf{w}_1 = \begin{bmatrix} 6\\ 4 \\ -7 \\ -7\end{bmatrix}, \quad \mathbf{w}_2 = \begin{bmatrix} 5\\ 6 \\ -2 \\ -10\end{bmatrix}. \end{split}\]
Solution to Exercise 1.2.4 (click to show)

Suppose \(\text{proj}_{\mathbf{v}}(\mathbf{w}_1) = \text{proj}_{\mathbf{v}}(\mathbf{w}_2) \). Thus \(\dfrac{\mathbf{w}_1\ip\mathbf{v}}{\mathbf{v}\ip\mathbf{v}} \mathbf{v} = \dfrac{\mathbf{w}_2\ip\mathbf{v}}{\mathbf{v}\ip\mathbf{v}} \mathbf{v}\).

Since \(\mathbf{v}\) is not the zero vector this implies that \(\mathbf{w}_1\ip\mathbf{v} = \mathbf{w}_2\ip\mathbf{v}\). In other words,

\[ \mathbf{w}_1\ip\mathbf{v} - \mathbf{w}_2\ip\mathbf{v} = (\mathbf{w}_1 - \mathbf{w}_2)\ip \mathbf{v} = 0, \]

which expresses that   \((\mathbf{w}_1 - \mathbf{w}_2)\perp \vect{v}\).

For the given vectors \(\mathbf{v}, \mathbf{w}_1, \mathbf{w}_2\) we find

\[ \dfrac{\mathbf{w}_1\ip\mathbf{v}}{\mathbf{v}\ip\mathbf{v}} \mathbf{v} = \frac{\mathbf{w}_2\ip\mathbf{v}}{\mathbf{v}\ip\mathbf{v}} \mathbf{v} = \dfrac{45}{15}\mathbf{v} \]

and

\[\begin{split} \mathbf{w}_1 - \mathbf{w}_2 = \begin{bmatrix} 6\\ 4 \\ -7 \\ -7\end{bmatrix} - \begin{bmatrix} 5\\ 6 \\ -2 \\ -10 \end{bmatrix} = \begin{bmatrix} 1\\ -2 \\ -5 \\ 3\end{bmatrix}. \end{split}\]

We see \((\mathbf{w}_1 - \mathbf{w}_2)\ip \mathbf{v} = 1 - 2 + 10 - 9 = 0\), so indeed \((\mathbf{w}_1 - \mathbf{w}_2)\) and \(\vect{v}\) are orthogonal.

Figure 1.2.7 shows what’s going on.

../_images/Fig-InnerProduct-SameProj.svg

Fig. 1.2.7 Two vectors \(\vect{w}_1\), \(\vect{w}_2 \) with the same projection onto \(\vect{v}\).#

1.2.4. Norm in \(\mathbb{R}^n\)#

The length of a vector in the plane can be computed using the dot product: for \(\mathbf{v}=\begin{bmatrix}a_{1}\\a_{2}\end{bmatrix}\) in \(\mathbb{R}^2\) we have seen that

\[ \norm{\mathbf{v}} = \sqrt{a_1^2 + a_2^2} = \sqrt{\mathbf{v}\ip\mathbf{v}}. \]

The identity \(\norm{\mathbf{v}} = \sqrt{\mathbf{v}\ip\mathbf{v}}\) also holds in \(\mathbb{R}^3\).

It seems natural to extend the concept to \(\mathbb{R}^n\). Again, for this more general space a new word is introduced:

Definition 1.2.4

The norm of a vector \(\mathbf{v}\) in \(\mathbb{R}^n\), denoted by \(\norm{\mathbf{v}}\), is defined by

\[ \norm{\mathbf{v}} = \sqrt{\mathbf{v}\ip\mathbf{v}\,}. \]

Expressed in the entries of \(\mathbf{v}\) this yields

\[ \norm{\mathbf{v}} = \sqrt{a_1^2+ a_2^2 + \ldots +a_n^2\,}\,, \]

so for vectors in \(\mathbb{R}^2\) and \(\mathbb{R}^3\) the norm of a vector is just the length of the vector.

As we might expect the norm has many properties in common with length:

Proposition 1.2.4

For any \(\mathbf{v}, \,\mathbf{w} \in \mathbb{R}^{n}\) and all \(c \in \mathbb{R}\) the following holds:

i. \(\norm{\mathbf{v}}\geq 0\), and \(\norm{\mathbf{v}} = 0\) only for \(\mathbf{v}=\mathbf{0}\);

ii. Scaling property:

(1.2.8)#\[\norm{c\mathbf{v}} = |c|\norm{\mathbf{v}}.\]

iii. Triangle Inequality:

(1.2.9)#\[\norm{\mathbf{v}+\mathbf{w}} \leq \norm{\mathbf{v}}+\norm{\mathbf{w}}.\]

The first two of these properties are very easy to prove. The proof of the triangle inequality we postpone until the end of the section. Figure 1.2.8 explains the name.

../_images/Fig-InnerProduct-TriangleInequality.svg

Fig. 1.2.8 The Triangle Inequality#

Example 1.2.5

We compute the norms of the vectors

\[\begin{split} \mathbf{v} = \begin{bmatrix} 1 \\ -2 \\ 3 \\ -1 \end{bmatrix} \quad \text{and} \quad -2\mathbf{v} = \begin{bmatrix} -2 \\ 4 \\ -6 \\ 2 \end{bmatrix}. \end{split}\]

We find

\[ \norm{\mathbf{v}} = \sqrt{1^2 + (-2)^2 + 3^2 + (-1)^2\,} = \sqrt{15}. \]

and

\[ \norm{-2\mathbf{v}} = \sqrt{(-2)^2 + 4^2 + (-6)^2 + 2^2\,} = \sqrt{60} = 2\sqrt{15}. \]

The last norm can also be found via

\[ \norm{-2\mathbf{v}} = |-2|\cdot\norm{\mathbf{v}} = 2 \sqrt{15}. \]
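The scaling property (1.2.8) can be checked for this example with a few lines of Python. This is a floating-point sketch, so the comparison uses `math.isclose` rather than exact equality. The helper names `norm` and `scale` are illustrative.

```python
import math

def norm(v):
    """Norm of v as in Definition 1.2.4."""
    return math.sqrt(sum(a * a for a in v))

def scale(c, v):
    """The scalar multiple c*v."""
    return [c * a for a in v]

v = [1, -2, 3, -1]
c = -2

direct = norm(scale(c, v))      # norm of -2v computed from its entries
via_rule = abs(c) * norm(v)     # |c| * norm(v), the scaling property

print(direct, via_rule)
print(math.isclose(direct, via_rule))
```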

Definition 1.2.5

The distance between two vectors in \(\R^n\) is defined by

\[ \text{dist}(\vect{u},\vect{v}) = \norm{\vect{v}-\vect{u}}. \]

Example 1.2.6

For the vectors \(\vect{u} = \begin{bmatrix}1 \\ 3 \\ 2 \\ 4 \end{bmatrix}\) and \(\vect{v} = \begin{bmatrix}5 \\ 1 \\ 3 \\ 4 \end{bmatrix}\) in \(\R^4\)

the distance is given by

\[\begin{split} \norm{\vect{v}-\vect{u}} = \norm{\begin{bmatrix}4 \\ -2 \\ 1 \\ 0 \end{bmatrix}} = \sqrt{4^2 + (-2)^2 + 1^2 + 0^2} = \sqrt{21}. \end{split}\]
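A minimal Python sketch of Definition 1.2.5, applied to the two vectors of this example, might look as follows. The function name `dist` mirrors the notation of the definition, and the list representation of vectors is an assumption of this illustration.

```python
import math

def dist(u, v):
    """Distance between u and v: the norm of v - u (Definition 1.2.5)."""
    return math.sqrt(sum((b - a) ** 2 for a, b in zip(u, v)))

u = [1, 3, 2, 4]
v = [5, 1, 3, 4]

print(dist(u, v))     # numerically equal to the square root of 21
print(math.sqrt(21))
print(dist(u, u))     # 0.0, a vector has distance zero to itself
```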

Grasple Exercise 1.2.3

https://embed.grasple.com/exercises/5bc4274c-56a0-461b-bd3d-9f8bdb8f44e0?id=69740

Computing the distance between two vectors in \(\R^3\).

From the rules of the norm the following rules of the distance function can be deduced.

Proposition 1.2.5

For any three vectors \(\mathbf{u}, \mathbf{v}\) and \(\mathbf{w}\) in \(\mathbb{R}^{n}\) the following statements hold.

i. \(\text{dist}(\vect{u},\vect{v}) = \text{dist}(\vect{v},\vect{u})\);

ii. \(\text{dist}(\vect{u},\vect{v}) = 0 \iff \vect{u}=\vect{v}\);

iii. \(\text{dist}(\vect{u},\vect{w}) \leq \text{dist}(\vect{u},\vect{v}) + \text{dist}(\vect{v},\vect{w})\).

Rule iii. is again called the Triangle Inequality.

Exercise 1.2.5

Check the three properties of the distance function as stated in Proposition 1.2.5.   For Rule iii., only show how it follows from the corresponding Rule iii. in Proposition 1.2.4.

../_images/Fig-InnerProduct-Distance.svg

Fig. 1.2.9 The distance between two vectors#

With the tools so far we can define a notion that comes in handy later.

Definition 1.2.6

A unit vector is a vector of norm 1.

Moreover, for any nonzero vector \(\mathbf{v}\), the vector

\[ \mathbf{u} = \frac{\mathbf{v}}{\norm{\mathbf{v}}} \]

is called the unit vector in the direction of \(\mathbf{v}\).

Proposition 1.2.6

For a nonzero vector \(\mathbf{v}\)

\[ \frac{\mathbf{v}}{\norm{\mathbf{v}}} \]

is the unique vector \(\mathbf{u}\) of norm 1 such that

\[ \mathbf{u} = k\mathbf{v}, \text{ for some } k > 0. \]

Proof. Assume that \(\mathbf{v} \neq \mathbf{0}\). For \(\mathbf{u} = k\mathbf{v}\), with \(\norm{\mathbf{u}} = 1\) and \(k > 0\) to hold, we must have

\[ \norm{\mathbf{u}} = \norm{k\mathbf{v}} = |k|\norm{\mathbf{v}} = k\norm{\mathbf{v}} = 1. \]

We see that

\[ k = \dfrac{1}{\norm{\mathbf{v}}} \]

and consequently

\[ \mathbf{u} = k\mathbf{v} = \dfrac{1}{\norm{\mathbf{v}}}\mathbf{v} = \frac{\mathbf{v}}{\norm{\mathbf{v}}}. \]

Example 1.2.7

We compute the unit vector \(\mathbf{u}\) in the direction of the vector \(\mathbf{v} = \begin{bmatrix}1 \\ 2 \\ 4 \\ -2 \end{bmatrix}\) in \(\mathbb{R}^4\).
First we compute the norm:

\[\norm{\mathbf{v}} = \sqrt{1^2+2^2+4^2+(-2)^2} = \sqrt{25} = 5, \]

so

\[\begin{split} \mathbf{u} = \dfrac{1}{5} \begin{bmatrix}1 \\ 2 \\ 4 \\ -2 \end{bmatrix} = \begin{bmatrix}1/5 \\ 2/5 \\ 4/5 \\ -2/5 \end{bmatrix}. \end{split}\]
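The normalization step of this example can be sketched in Python as follows. The function name `unit` is a hypothetical choice, and the zero vector is rejected because Definition 1.2.6 only covers nonzero vectors.

```python
import math

def norm(v):
    """Norm of v as in Definition 1.2.4."""
    return math.sqrt(sum(a * a for a in v))

def unit(v):
    """Unit vector in the direction of the nonzero vector v."""
    n = norm(v)
    if n == 0:
        raise ValueError("the zero vector has no direction")
    return [a / n for a in v]

v = [1, 2, 4, -2]
u = unit(v)

print(norm(v))  # 5.0
print(u)        # [0.2, 0.4, 0.8, -0.4]
print(math.isclose(norm(u), 1.0))
```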

Interestingly, Pythagoras’ theorem also holds in \(\mathbb{R}^n\).

Theorem 1.2.1

For any two vectors \(\mathbf{v}\) and \(\mathbf{w}\) in \(\mathbb{R}^n\) we have

\[ \norm{\mathbf{v}+\mathbf{w}}^2 = \norm{\mathbf{v}}^2 + \norm{\mathbf{w}}^2 \iff \mathbf{v} \perp \mathbf{w}. \]

Proof. This follows quite straightforwardly from the properties of the dot product.

Let us start from the identity on the left and work our way to the conclusion on the right, making sure that each step is reversible. Note that from the definition of the norm it follows immediately that \(\norm{\mathbf{v}}^2 = \mathbf{v}\ip\mathbf{v}\).

\[\begin{split} \begin{array}{cl} &\norm{\mathbf{v}+\mathbf{w}}^2 = \norm{\mathbf{v}}^2 + \norm{\mathbf{w}}^2 \\ \iff &(\mathbf{v}+\mathbf{w})\ip(\mathbf{v}+\mathbf{w}) = \mathbf{v}\ip\mathbf{v} + \mathbf{w}\ip\mathbf{w} \\ \iff&\mathbf{v}\ip\mathbf{v} + \mathbf{v}\ip\mathbf{w}+\mathbf{w}\ip\mathbf{v}+ \mathbf{w}\ip\mathbf{w} = \mathbf{v}\ip\mathbf{v} + \mathbf{w}\ip\mathbf{w}. \end{array} \end{split}\]

Next we subtract \(\mathbf{v}\ip\mathbf{v} + \mathbf{w}\ip\mathbf{w}\) from both sides. Thus the last identity is equivalent to

\[\begin{split}\begin{array}{rcl} \mathbf{v}\ip\mathbf{w}+\mathbf{w}\ip\mathbf{v} = 0 &\iff& 2\mathbf{v}\ip\mathbf{w} = 0\\ &\iff& \mathbf{v}\ip\mathbf{w}= 0\\ &\iff& \mathbf{v}\perp\mathbf{w}. \end{array} \end{split}\]

Example 1.2.8

We verify the equality for the vectors \(\mathbf{v} = \begin{bmatrix} 2 \\ -3\\ 3 \\ 1 \end{bmatrix}\) and \(\mathbf{w} = \begin{bmatrix} 2 \\ 4 \\ 1 \\ 5 \end{bmatrix}\) in \(\mathbb{R}^4\).

First of all

\[ \mathbf{v} \ip \mathbf{w} = 4 - 12 + 3 + 5 = 0, \]

so \(\mathbf{v}\perp \mathbf{w}\), and second

\[ \norm{\mathbf{v}} = \sqrt{2^2 + (-3)^2 + 3^2 + 1^2} = \sqrt{23}, \quad \norm{\mathbf{w}} = \sqrt{2^2 + 4^2 + 1^2 + 5^2} = \sqrt{46}. \]

Furthermore

\[\begin{split} \mathbf{v}+\mathbf{w} = \begin{bmatrix} 4 \\ 1 \\ 4 \\ 6 \end{bmatrix} \Longrightarrow \norm{\mathbf{v}+\mathbf{w}} = \sqrt{4^2+1^2+4^2+6^2} = \sqrt{69}\end{split}\]

and we see that indeed

\[ \norm{\mathbf{v}+\mathbf{w}}^2 = 69 = 23 + 46 = \norm{\mathbf{v}}^2+\norm{\mathbf{w}}^2. \]
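The same verification can be done in a few lines of Python. In this sketch the squared norms are computed as dot products, which keeps the arithmetic in exact integers. The helper `dot` is an illustrative name.

```python
def dot(v, w):
    """Dot product of two equally long lists."""
    return sum(a * b for a, b in zip(v, w))

v = [2, -3, 3, 1]
w = [2, 4, 1, 5]
total = [a + b for a, b in zip(v, w)]

print(dot(v, w))                   # 0, so v and w are orthogonal
print(dot(total, total))           # 69, the squared norm of v + w
print(dot(v, v) + dot(w, w))       # 69, the sum of the squared norms
```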

One of the most basic properties, also one with a wide range of applications, is the so-called Cauchy-Schwarz Inequality.

Theorem 1.2.2 (Cauchy-Schwarz Inequality)

For any two vectors \(\mathbf{v}\) and \(\mathbf{w}\) in \(\mathbb{R}^n\)

\[ |\mathbf{v}\ip\mathbf{w}| \leq \norm{\mathbf{v}}\,\norm{\mathbf{w}}. \]

Proof. There are many ways to prove the Cauchy-Schwarz inequality. There is even a whole book devoted to it: “The Cauchy-Schwarz Master Class” by J.M. Steele.

The following proof is based on orthogonal projection and Pythagoras’ Theorem.

If \(\mathbf{v} = \mathbf{0}\), the zero vector, then the inequality obviously holds; in fact it becomes an equality:

\[ \mathbf{v} = \mathbf{0} \Longrightarrow \norm{\mathbf{v}} = 0 \Longrightarrow \norm{\mathbf{v}}\cdot\norm{\mathbf{w}} = 0 \]

and also

\[ \mathbf{v} = \mathbf{0} \Longrightarrow \mathbf{v}\ip \mathbf{w} = 0 \Longrightarrow |\mathbf{v}\ip \mathbf{w}| = 0. \]

So now suppose \(\mathbf{v} \neq \mathbf{0}\).

Let

\[ \mathbf{\hat{w}} = \dfrac{\mathbf{w}\ip\mathbf{v}}{\mathbf{v}\ip\mathbf{v}}\,\mathbf{v} \]

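be the projection of \(\mathbf{w}\) onto \(\mathbf{v}\). Since \(\mathbf{\hat{w}}\) is a multiple of \(\mathbf{v}\) and \((\mathbf{w} - \mathbf{\hat{w}}) \perp \mathbf{v}\), we also have \((\mathbf{w} - \mathbf{\hat{w}}) \perp \mathbf{\hat{w}}\), so we can apply Pythagoras’ Theorem: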

\[ (\mathbf{w} - \mathbf{\hat{w}}) \perp \mathbf{\hat{w}} \Longrightarrow \norm{\mathbf{w} - \mathbf{\hat{w}}}^2 + \norm{ \mathbf{\hat{w}}}^2 = \norm{(\mathbf{w} - \mathbf{\hat{w}}) + \mathbf{\hat{w}}}^2 = \norm{\mathbf{w}}^2 \]

It follows that

\[ \norm{ \mathbf{\hat{w}}}^2 = \norm{\mathbf{w}}^2 - \norm{\mathbf{w} - \mathbf{\hat{w}}}^2 \leq \norm{\mathbf{w}}^2 \]

and substituting the expression for \(\mathbf{\hat{w}}\) we arrive at

\[ \left(\dfrac{\mathbf{w}\ip\mathbf{v}}{\mathbf{v}\ip\mathbf{v}}\right)^2 \norm{\mathbf{v}}^2 = \dfrac{(\mathbf{w}\ip\mathbf{v})^2}{(\mathbf{v}\ip\mathbf{v})^2} \norm{\mathbf{v}}^2 \leq \norm{\mathbf{w}}^2. \]

Using

\[ \mathbf{v}\ip\mathbf{v} = \norm{\mathbf{v}}^2 \]

we deduce that

\[ (\mathbf{w}\ip\mathbf{v})^2 \leq \norm{\mathbf{v}}^2\norm{\mathbf{w}}^2. \]

Taking square roots we may conclude that indeed

\[ |\mathbf{w}\ip\mathbf{v}| \, \leq \, \norm{\mathbf{v}}\,\norm{\mathbf{w}}. \]

Example 1.2.9

We verify that the inequality holds for the vectors \(\mathbf{v} = \begin{bmatrix} 1 \\ -2\\ 3 \\ -4 \end{bmatrix}\) and \(\mathbf{w} = \begin{bmatrix} -5 \\ 4 \\-3 \\ 0 \end{bmatrix}\) in \(\mathbb{R}^4\).

We compute

\[ \mathbf{v}\ip\mathbf{w} = -5-8-9 = -22, \quad \norm{\mathbf{v}} = \sqrt{30}, \quad \norm{\mathbf{w}} = \sqrt{50} \]

and we see that indeed

\[ |\mathbf{v}\ip\mathbf{w}| = 22 \leq \norm{\mathbf{v}} \norm{\mathbf{w}} = \sqrt{1500}. \]
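The check in this example can also be carried out without square roots, by comparing the squares of both sides as in the last steps of the proof. The Python sketch below does this. The function name `cauchy_schwarz_holds` is hypothetical, and the vector `multiple` is an extra illustration that is not part of the example.

```python
def dot(v, w):
    """Dot product of two equally long lists."""
    return sum(a * b for a, b in zip(v, w))

def cauchy_schwarz_holds(v, w):
    """Compare the squares: (v.w)^2 <= (v.v)(w.w)."""
    return dot(v, w) ** 2 <= dot(v, v) * dot(w, w)

v = [1, -2, 3, -4]
w = [-5, 4, -3, 0]

print(dot(v, w) ** 2, dot(v, v) * dot(w, w))  # 484 1500
print(cauchy_schwarz_holds(v, w))             # True

# Extra illustration: for a multiple of v both sides coincide.
multiple = [2 * a for a in v]
print(dot(v, multiple) ** 2, dot(v, v) * dot(multiple, multiple))  # 3600 3600
```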

With this inequality established, the Triangle Inequality (1.2.9) is easily proved. Let’s repeat it, and prove it.

Theorem 1.2.3

For any two vectors \(\mathbf{v}\) and \(\mathbf{w}\) in \(\mathbb{R}^n\):

\[ \norm{\mathbf{v}+\mathbf{w}} \leq \norm{\mathbf{v}}+\norm{\mathbf{w}}. \]

Proof. Since all terms involved are non-negative we may as well show that the inequality holds for the squares:

\[\begin{split} \begin{array}{l} \norm{\mathbf{v}+\mathbf{w}}^2 \leq (\norm{\mathbf{v}}+\norm{\mathbf{w}})^2 \\ \iff (\mathbf{v}+\mathbf{w})\ip(\mathbf{v}+\mathbf{w}) \leq \norm{\mathbf{v}}^2 + 2\norm{\mathbf{v}}\norm{\mathbf{w}} + \norm{\mathbf{w}}^2 \\ \iff \mathbf{v}\ip\mathbf{v} + 2\mathbf{v}\ip\mathbf{w}+\mathbf{w}\ip\mathbf{w} \leq \norm{\mathbf{v}}^2 + 2\norm{\mathbf{v}}\norm{\mathbf{w}} + \norm{\mathbf{w}}^2 \\ \iff 2\,\mathbf{v}\ip\mathbf{w} \leq 2\norm{\mathbf{v}}\norm{\mathbf{w}} \end{array} \end{split}\]

and this follows from the Cauchy-Schwarz Inequality, since \(\mathbf{v}\ip\mathbf{w} \leq |\mathbf{v}\ip\mathbf{w}| \leq \norm{\mathbf{v}}\norm{\mathbf{w}}\).

Example 1.2.10

We verify the inequality for the vectors \(\mathbf{v} = \begin{bmatrix} -1 \\ 2\\ 3 \end{bmatrix}\) and \(\mathbf{w} = \begin{bmatrix} 4 \\ -4\\ 3 \end{bmatrix}\):

\[ \norm{\mathbf{v} + \mathbf{w}} = \sqrt{3^2+(-2)^2+6^2} =\sqrt{49} = 7 \]

and indeed

\[ \norm{\mathbf{v}} + \norm{\mathbf{w}} = \sqrt{14} + \sqrt{41} \approx 10.1 > \norm{\mathbf{v} + \mathbf{w}}. \]
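The numbers in this example are easy to recompute. The following Python sketch evaluates both sides of the Triangle Inequality for the two vectors. It works in floating point, and the helper names `norm` and `triangle_inequality_holds` are illustrative choices.

```python
import math

def norm(v):
    """Norm of v as in Definition 1.2.4."""
    return math.sqrt(sum(a * a for a in v))

def triangle_inequality_holds(v, w):
    """Check norm(v + w) <= norm(v) + norm(w)."""
    total = [a + b for a, b in zip(v, w)]
    return norm(total) <= norm(v) + norm(w)

v = [-1, 2, 3]
w = [4, -4, 3]
total = [a + b for a, b in zip(v, w)]

print(norm(total))                      # 7.0
print(norm(v) + norm(w))                # the larger right-hand side
print(triangle_inequality_holds(v, w))  # True
```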

1.2.5. Angles in \(\mathbb{R}^n\)#

The first motivation to consider the dot product came from the question of perpendicularity. We have seen that the length of a vector can also be computed using a dot product.

Below we will show that the dot product can be used not only to detect an angle of \(\frac12\pi\) between two vectors (namely, when the vectors are perpendicular), but also to express the angle between any two (nonzero) vectors in terms of dot products.

../_images/Fig-InnerProduct-AngleAndProjection.svg

Fig. 1.2.10 Angle between two vectors#

First we will show a geometrical characterization of the dot product that holds in \(\mathbb{R}^2\) as well as in \(\mathbb{R}^3\).

Proposition 1.2.7

For two nonzero vectors \(\mathbf{v}\) and \(\mathbf{w}\) in either \(\mathbb{R}^2\) or \(\mathbb{R}^3\) the following identity holds:

(1.2.10)#\[\mathbf{v}\ip\mathbf{w} = \norm{\mathbf{v}}\norm{\mathbf{w}} \cos(\varphi)\]

where \(\varphi\) is the angle between \(\mathbf{v}\) and \(\mathbf{w}\).

Note that this is in line with the special case of two perpendicular vectors:

\[ \mathbf{v}\perp\mathbf{w} \iff \mathbf{v}\ip\mathbf{w}=0 \iff \cos(\varphi)=0. \]

Observation 1.2.1

The angle between two nonzero vectors \(\mathbf{v}\) and \(\mathbf{w}\) is thus determined by dot products in the following way

\[ \cos(\varphi) = \frac{\mathbf{w}\ip\mathbf{v}}{\norm{\mathbf{v}}\norm{\mathbf{w}}}. \]

The value of \(\varphi\) between \(0\) and \(\pi\) is then uniquely determined by

\[ \varphi = \arccos\left(\frac{\mathbf{w}\ip\mathbf{v}}{\norm{\mathbf{v}}\norm{\mathbf{w}}}\right)= \cos^{-1}\left(\frac{\mathbf{w}\ip\mathbf{v}}{\norm{\mathbf{v}}\norm{\mathbf{w}}}\right). \]

Proof. Now let’s derive formula (1.2.10). Assume that \(\mathbf{v}\) and \(\mathbf{w}\) are nonzero vectors. Recall the formula of the orthogonal projection

\[ \mathbf{\hat{w}} = \dfrac{\mathbf{w}\ip\mathbf{v}}{\mathbf{v}\ip\mathbf{v}}\mathbf{v}. \]

Let \(\varphi \in[0,\pi]\) denote the angle between two nonzero vectors \(\mathbf{v}\) and \(\mathbf{w}\).

From Figure 1.2.10 it is clear that the factor

\[ \dfrac{\mathbf{w}\ip\mathbf{v}}{\mathbf{v}\ip\mathbf{v}} \]

is positive if the angle is acute, zero if the angle is right, and negative if the angle is obtuse.

In the case of an acute angle, by considering the right triangle \(\Delta OAB\), where \(A\) is the end point of \(\mathbf{\hat{w}}\) and \(B\) is the end point of \(\mathbf{w}\), we see that on the one hand

\[ OA = \norm{\dfrac{\mathbf{w}\ip\mathbf{v}}{\mathbf{v}\ip\mathbf{v}}\mathbf{v}} = \dfrac{|\mathbf{w}\ip\mathbf{v}|}{\mathbf{v}\ip\mathbf{v}}\norm{\mathbf{v}} = \dfrac{\mathbf{w}\ip\mathbf{v}}{\norm{\mathbf{v}}^2} \norm{\mathbf{v}} = \dfrac{\mathbf{w}\ip\mathbf{v}}{\norm{\mathbf{v}}} \]

and on the other hand

\[ OA = OB\cos(\varphi) = \norm{\mathbf{w}}\cos(\varphi). \]

So we may conclude that

(1.2.11)#\[\mathbf{w}\ip\mathbf{v} = \norm{\mathbf{v}}\norm{\mathbf{w}}\cos(\varphi).\]

In the case of an obtuse angle, we use that the projection of \(\mathbf{w}\) onto \(\mathbf{v}\) is equal to the projection of \(\mathbf{w}\) onto \(-\mathbf{v}\), as it is in fact the projection onto the line consisting of all multiples of \(\mathbf{v}\). Now look at the picture on the right of Figure 1.2.10. There you see that \(\mathbf{w}\) and \(-\mathbf{v}\) make an acute angle \(\psi = \pi - \varphi\), so we can apply Equation (1.2.11) to \(\mathbf{w}\) and \(-\mathbf{v}\):

\[\begin{split} \begin{array}{rcl} \mathbf{w}\ip\mathbf{v} = - \mathbf{w}\ip(\mathbf{-v}) &=& -\norm{\mathbf{w}}\norm{\mathbf{-v}}\cos(\psi) \\ &=& -\norm{\mathbf{w}}\norm{\mathbf{v}}\cos(\pi-\varphi) \\ &=& \norm{\mathbf{w}}\norm{\mathbf{v}}\cos(\varphi). \end{array} \end{split}\]
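As a numerical sanity check of identity (1.2.10) in the plane, the following Python sketch measures the angle independently of the dot product, namely from the polar angles returned by `math.atan2`, and then compares both sides of the identity. The helper names and the test vector \((-3,-1)\) are illustrative choices, not part of the text.

```python
import math

def dot(x, y):
    """Dot product of two vectors of equal length."""
    return sum(xi * yi for xi, yi in zip(x, y))

def norm(x):
    """Euclidean norm, computed as the square root of x . x."""
    return math.sqrt(dot(x, x))

def angle_from_directions(v, w):
    """Angle in [0, pi] between two nonzero plane vectors, from their polar angles."""
    diff = abs(math.atan2(v[1], v[0]) - math.atan2(w[1], w[0]))
    return min(diff, 2 * math.pi - diff)

pairs = [
    ([2, 3], [-3, 2]),    # right angle
    ([2, 3], [-1, 3]),    # acute angle
    ([2, 3], [-3, -1]),   # obtuse angle
]

for v, w in pairs:
    phi = angle_from_directions(v, w)
    # Left-hand side and right-hand side of (1.2.10)
    print(dot(v, w), norm(v) * norm(w) * math.cos(phi))
```

Up to rounding, the two printed numbers agree in each of the three cases, including the sign in the obtuse case.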

Observation 1.2.2

Note that the absolute value of \(\norm{\mathbf{w}}\cos(\varphi)\), which equals \(\dfrac{|\mathbf{w}\ip\mathbf{v}|}{\norm{\mathbf{v}}}\), is the length of the orthogonal projection of \(\vect{w}\) onto \(\vect{v}\).
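The length of the projection can be checked numerically as well. In the proof above, \(OA = \norm{\mathbf{w}}\cos(\varphi)\) in the acute case. The sketch below computes \(\mathbf{\hat{w}}\) from the projection formula and compares its length with \(\norm{\mathbf{w}}\,|\cos(\varphi)|\) and with \(|\mathbf{w}\ip\mathbf{v}|/\norm{\mathbf{v}}\). The function names and the sample vectors are assumptions made for illustration.

```python
import math

def dot(x, y):
    """Dot product of two vectors of equal length."""
    return sum(xi * yi for xi, yi in zip(x, y))

def norm(x):
    """Euclidean norm, computed as the square root of x . x."""
    return math.sqrt(dot(x, x))

def project(w, v):
    """Orthogonal projection of w onto the line spanned by the nonzero vector v."""
    c = dot(w, v) / dot(v, v)
    return [c * vi for vi in v]

v = [2, 3]
for w in ([-1, 3], [-3, -1]):    # one acute and one obtuse case
    w_hat = project(w, v)
    cos_phi = dot(w, v) / (norm(w) * norm(v))
    # All three printed numbers should coincide up to rounding.
    print(norm(w_hat), norm(w) * abs(cos_phi), abs(dot(w, v)) / norm(v))
```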

Exercise 1.2.6

In a methane molecule \(\ce{CH_4}\) the four \(\ce{H}\)-atoms are positioned in a perfectly symmetrical way around the \(\ce{C}\)-atom. We can model this as follows: put the \(\ce{C}\)-atom at the origin of \(\mathbb{R}^3\), and the \(\ce{H}\)-atoms at the positions/vectors

\[\begin{split} \mathbf{v}_1 = \begin{bmatrix}1 \\ 1 \\ 1 \end{bmatrix}, \quad \mathbf{v}_2 = \begin{bmatrix}-1 \\ -1 \\ 1 \end{bmatrix}, \quad \mathbf{v}_3 = \begin{bmatrix}-1 \\ 1 \\ -1 \end{bmatrix} \quad \text{and} \quad \mathbf{v}_4 = \begin{bmatrix}1 \\ -1 \\ -1 \end{bmatrix}. \end{split}\]

Then all four points have the same distance \(\sqrt{3}\) to the origin, and all points have the same distance to each other, namely

\[ \norm{\vect{v}_i - \vect{v}_j} = \sqrt{2^2 + 2^2 + 0^2} = \sqrt{8}, \text{ for } i \neq j. \]

The angle between for instance \(\mathbf{v}_1\) and \(\mathbf{v}_3\) is determined by

\[ \cos(\varphi) = \dfrac{\mathbf{v}_1\ip\mathbf{v}_3}{\norm{\mathbf{v}_1}\norm{\mathbf{v}_3}} = \dfrac{-1}{\sqrt{3}\cdot\sqrt{3}} = -\frac13. \]

So

\[ \varphi = \arccos(-\tfrac13) \approx 1.9106 \text{ rad} \approx 109.47^{\circ}. \]
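The same computation can be repeated for all six pairs of \(\ce{H}\)-atoms. The following Python sketch does this with the four position vectors given above; the helper functions are illustrative.

```python
import math
from itertools import combinations

# Positions of the four H-atoms, as in the text
H = [[1, 1, 1], [-1, -1, 1], [-1, 1, -1], [1, -1, -1]]

def dot(x, y):
    """Dot product of two vectors of equal length."""
    return sum(xi * yi for xi, yi in zip(x, y))

def norm(x):
    """Euclidean norm, computed as the square root of x . x."""
    return math.sqrt(dot(x, x))

def angle(v, w):
    """Angle in radians between two nonzero vectors."""
    return math.acos(dot(v, w) / (norm(v) * norm(w)))

for v, w in combinations(H, 2):
    phi = angle(v, w)
    print(v, w, round(phi, 4), round(math.degrees(phi), 2))
```

Every pair has dot product \(-1\), so all six angles coincide, as the symmetry of the molecule suggests.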

Since we have defined the dot product and the norm in \(\mathbb{R}^n\), we can use the last formula to also define the angle between two vectors in \(\mathbb{R}^n\).

Definition 1.2.7

For two nonzero vectors \(\mathbf{v}\) and \(\mathbf{w}\) in \(\mathbb{R}^n\), the angle between the vectors is defined as

\[ \varphi = \angle(\mathbf{v},\mathbf{w}) = \arccos\left(\dfrac{\mathbf{v}\ip\mathbf{w}}{\norm{\mathbf{v}} \norm{\mathbf{w}}} \right). \]

This definition makes sense, since the Cauchy-Schwarz Inequality implies

\[ -1 \leq \dfrac{\mathbf{v}\ip\mathbf{w}}{\norm{\mathbf{v}}\,\norm{\mathbf{w}}} \leq 1. \]

Note that just as before in the plane and in three-dimensional space, for nonzero vectors \(\mathbf{v}\) and \(\mathbf{w}\) we have

\[ \mathbf{v}\perp\mathbf{w} \iff \mathbf{v}\ip\mathbf{w}=0 \iff \dfrac{\mathbf{v}\ip\mathbf{w}}{\norm{\mathbf{v}}\,\norm{\mathbf{w}}}=0 \iff \varphi = \angle(\mathbf{v},\mathbf{w}) = \tfrac12\pi. \]
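A minimal Python sketch of this definition for vectors of arbitrary length is given below. The function name `angle` and the error handling are assumptions. The quotient is clamped to \([-1,1]\) because floating-point rounding can push it marginally outside that interval, even though the Cauchy-Schwarz Inequality rules this out in exact arithmetic.

```python
import math

def dot(x, y):
    """Dot product of two vectors of equal length."""
    return sum(xi * yi for xi, yi in zip(x, y))

def norm(x):
    """Euclidean norm, computed as the square root of x . x."""
    return math.sqrt(dot(x, x))

def angle(v, w):
    """Angle in [0, pi] between two nonzero vectors in R^n."""
    if len(v) != len(w):
        raise ValueError("vectors must have the same number of entries")
    nv, nw = norm(v), norm(w)
    if nv == 0 or nw == 0:
        raise ValueError("the angle is not defined for the zero vector")
    c = dot(v, w) / (nv * nw)
    c = max(-1.0, min(1.0, c))   # guard against rounding errors only
    return math.acos(c)

print(angle([1, 0, 0, 0], [0, 1, 0, 0]))    # two perpendicular vectors in R^4
print(angle([1, 2, 3, 4], [1, 2, 3, 4]))    # a vector with itself
print(angle([1, 2, 3, 4], [-1, -2, -3, -4]))  # a vector with its opposite
```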

Example 1.2.11

Let \(\mathbf{e_1}\) be the vector in \(\mathbb{R}^n\) with first entry equal to 1 and all other entries equal to 0, and let \(\mathbf{v}\) be the vector with all entries equal to 1. We find the angle between \(\mathbf{e_1}\) and \(\mathbf{v}\) for every \(n = 1, 2, 3,\ldots\)

For each \(n\geq1\) we write \(\varphi_n = \angle(\mathbf{e_1},\mathbf{v})\). Then

\[ \cos(\varphi_n) = \dfrac{\mathbf{e_1}\ip\mathbf{v}}{\norm{\mathbf{e_1}}\norm{\mathbf{v}}} = \dfrac{1}{\sqrt{n}}. \]

So:

\[ \varphi_n = \arccos(\tfrac{1}{\sqrt{n}}), \, n = 1,2,3,\ldots \]

For \(n=1\) we find \(\cos(\varphi_1) = 1\), so \(\varphi_1 = 0\), which makes sense, and for \(n=2\), \(\cos(\varphi_2) = \frac{1}{\sqrt{2}}\), so \(\varphi_2 = \frac14\pi\), which you can check by a sketch in the plane.

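For most \(n\geq3\) the angle has no simple closed form (an exception is \(n=4\), where \(\cos(\varphi_4)=\frac12\) and \(\varphi_4 = \frac13\pi\)). However, since \(\frac{1}{\sqrt{n}} \downarrow 0\) as \(n\) grows, \(\varphi_n\) increases towards \(\frac12\pi\): for large \(n\) the two vectors in \(\mathbb{R}^n\) are ‘almost’ orthogonal.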
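To see how quickly the angle approaches \(\frac12\pi\), the following Python sketch builds \(\mathbf{e_1}\) and the all-ones vector for several values of \(n\) and computes the angle from the dot product. The function name `phi` and the chosen values of \(n\) are illustrative.

```python
import math

def dot(x, y):
    """Dot product of two vectors of equal length."""
    return sum(xi * yi for xi, yi in zip(x, y))

def norm(x):
    """Euclidean norm, computed as the square root of x . x."""
    return math.sqrt(dot(x, x))

def phi(n):
    """Angle between e_1 and the all-ones vector in R^n."""
    e1 = [1] + [0] * (n - 1)
    ones = [1] * n
    return math.acos(dot(e1, ones) / (norm(e1) * norm(ones)))

for n in [1, 2, 3, 4, 10, 100, 10_000]:
    # n, the angle in degrees, and the remaining gap to 90 degrees
    deg = math.degrees(phi(n))
    print(n, round(deg, 2), round(90 - deg, 2))
```

The last column shrinks as \(n\) grows, in line with \(\cos(\varphi_n) = \frac{1}{\sqrt{n}}\) tending to 0.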

1.2.6. Grasple Exercises#

Grasple Exercise 1.2.4

https://embed.grasple.com/exercises/59912254-6fc8-43c7-9c44-1ea7eab1c236?id=62409

To compute dot products in \(\R^2\), \(\R^3\) and \(\R^4\).

Grasple Exercise 1.2.5

https://embed.grasple.com/exercises/7b49e0f5-ae8b-4e92-8878-665dc080b7ee?id=65601

To find a vector orthogonal to a given vector in \(\R^2\).

Grasple Exercise 1.2.6

https://embed.grasple.com/exercises/c8b4eed4-179f-42ab-9ec9-07f66445c960?id=69482

To find a vector orthogonal to two given vectors in \(\R^2\).

Grasple Exercise 1.2.7

https://embed.grasple.com/exercises/b5a4e1c0-92ca-4307-9eb0-25a3a5807fc7?id=62415

To find a vector orthogonal to a given vector in \(\R^3\).

Grasple Exercise 1.2.8

https://embed.grasple.com/exercises/34bbb9e1-207e-4c06-8686-1c32b3f3d0aa?id=78751

To find a vector orthogonal to a given vector in \(\R^4\).

Grasple Exercise 1.2.9

https://embed.grasple.com/exercises/30a7abfe-9d40-4faa-a848-83bd67e024a0?id=62406

To compute the norm of vectors in \(\R^2\), \(\R^3\), \(\R^4\).

Grasple Exercise 1.2.10

https://embed.grasple.com/exercises/7dc339bb-fe79-4eb9-914c-ea1a7ca85a85?id=69737

To find the norm of the ‘all one’ vector in \(\mathbb{R}^n\).

Grasple Exercise 1.2.11

https://embed.grasple.com/exercises/8de90b0e-e89a-49a6-aa63-1b1e39f6e98e?id=79262

To find the distance between two vectors in \(\mathbb{R}^4\).

Grasple Exercise 1.2.12

https://embed.grasple.com/exercises/d4dd1154-a3ec-497e-bc73-1cd96529f0e7?id=69741

Find \(h\) such that the distance between two points has a given value \(d\).

Grasple Exercise 1.2.13

https://embed.grasple.com/exercises/c2242315-7e4f-463b-b3cf-09e9e15c8b2b?id=69739

To find a unit vector on a given line through \((0,0)\).

Grasple Exercise 1.2.14

https://embed.grasple.com/exercises/67334454-d109-45a2-b640-545041ff896d?id=62416

Find \(\text{proj}_{\mathbf{v}}(\mathbf{w})\) in \(\R^2\).

Grasple Exercise 1.2.15

https://embed.grasple.com/exercises/9705b078-6c91-42c6-9768-8a043115b881?id=62658

Find \(\text{proj}_{\mathbf{v}}(\mathbf{w})\) in \(\R^4\).

Grasple Exercise 1.2.16

https://embed.grasple.com/exercises/531d3be2-dd62-4c21-b023-70e0b63809be?id=78747

Regarding norm and orthogonality of \(\vect{u}\), \(\vect{v}\), \(\vect{u}-\vect{v}\) and \(\vect{u}+\vect{v}\).

Grasple Exercise 1.2.17

https://embed.grasple.com/exercises/161ecdf6-4cfb-41ba-bc16-685fe8532471?id=62414

To show that \((\vect{v}+\vect{w})\ip(\vect{v}-\vect{w}) = \norm{\vect{v}}^2 - \norm{\vect{w}}^2\).

Grasple Exercise 1.2.18

https://embed.grasple.com/exercises/c4d2743f-5f14-4812-9531-1a40c28c15cb?id=62413

To prove that \((\vect{v}+\vect{w})\ip\vect{x} = \vect{v}\ip\vect{x}+\vect{w}\ip\vect{x}\).

Grasple Exercise 1.2.19

https://embed.grasple.com/exercises/407cb45d-2baf-4b0d-a1eb-6e51186e19f3?id=69738

What to conclude from \(\norm{\vect{v}+\vect{w}} = \norm{\vect{v}}+\norm{\vect{w}}\)?

Grasple Exercise 1.2.20

https://embed.grasple.com/exercises/c4c1c609-b1dd-4588-865f-53d7e8221f88?id=62689

To prove that \(-1 \leq \dfrac{\vect{u}\ip\vect{v}}{\norm{\vect{u}}\,\norm{\vect{v}}} \leq 1\).

Grasple Exercise 1.2.21

https://embed.grasple.com/exercises/2a2423c3-0907-40b7-bd5f-7607baf7cc09?id=62668

What to conclude from \(\text{proj}_{\mathbf{v}}(\mathbf{w}_1) = \text{proj}_{\mathbf{v}}(\mathbf{w}_2)\)?