Linear transformations

3.1. Linear transformations#

3.1.1. Introduction#

Until now we have used matrices in the context of linear systems. The equation

\[ A\mathbf{x} = \mathbf{b}, \]

where \(A\) is an \(m \times n\)-matrix, is just a concise way to write down a system of \(m\) linear equations in \(n\) unknowns. A different way to look at this matrix equation is to consider it as an input-output system: the left-hand side \(A\mathbf{x}\) can be seen as a mapping that sends an “input” \(\mathbf{x}\) to an “output” \(\mathbf{y}= A\mathbf{x}\).

For instance, in computer graphics, typically points describing a 3D object have to be converted to points in 2D, to be able to visualise them on a screen. Or, in a dynamical system, a matrix \(A\) may describe how a system evolves from a “state” \(\mathbf{x}_{k}\) at time \(k\) to a state \(\mathbf{x}_{k+1}\) at time \(k+1\) via

\[ \mathbf{x}_{k+1} = A\mathbf{x}_{k}. \]

A “state” may be anything ranging from a set of particles at certain positions, a set of pixels describing a minion, concentrations of chemical substances in a reactor tank, to population sizes of different species. Thinking mathematically we would describe such an input-output interpretation as a transformation (or: function, map, mapping, operator, etc. ).

\[ T: \mathbb{R}^n \to \mathbb{R}^m. \]

We will see that these matrix transformations have two characteristic properties
which makes them the protagonists of the more general linear algebra concept of a linear transformation.

3.1.2. Matrix transformations#

Let \(A\) be an \(m\times n\)-matrix. We can in a natural way associate a transformation \(T_A:\mathbb{R}^n \to \mathbb{R}^m\) to the matrix \(A\).

matrix transformationrepresents

Definition 3.1.1

The transformation \(T_A\) corresponding to the \(m\times n\)-matrix \(A\) is the mapping defined by

\[ T_A(\mathbf{x}) = A\mathbf{x} \quad \text{or } \quad T_A:\mathbf{x} \mapsto A\mathbf{x}, \]

where \(\mathbf{x} \in \mathbb{R}^n\).

We call such a mapping a matrix transformation. Conversely we say that the matrix \(A\) represents the transformation \(T_A\).

As a first example consider the following.

Example 3.1.1

The transformation corresponding to the matrix \(A = \begin{pmatrix} 1 & 2 & 0\\ 1 & 2 & 1 \end{pmatrix}\) is defined by

\[\begin{split} T_A(\mathbf{x}) = \begin{pmatrix} 1 & 2 & 0\\ 1 & 2 & 1 \end{pmatrix}\mathbf{x}. \end{split}\]

We have, for instance

\[\begin{split} \begin{pmatrix} 1 & 2 & 0\\ 1 & 2 & 1 \end{pmatrix} \begin{pmatrix} 1\\1\\1 \end{pmatrix} = \begin{pmatrix} 3 \\ 4 \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} 1 & 2 & 0\\ 1 & 2 & 1 \end{pmatrix} \begin{pmatrix} 2\\-1\\0 \end{pmatrix} = \begin{pmatrix} 0\\ 0 \end{pmatrix}. \end{split}\]

According to the definition of the matrix-vector product we can also write

(3.1.1)#\[\begin{split}A\mathbf{x} = \begin{pmatrix} 1 & 2 & 0\\ 1 & 2 & 1 \end{pmatrix} \begin{pmatrix} x_1\\x_2\\x_3 \end{pmatrix} = x_1 \begin{pmatrix} 1\\ 1 \end{pmatrix}+ x_2 \begin{pmatrix} 2 \\ 2 \end{pmatrix}+ x_3 \begin{pmatrix} 0\\ 1 \end{pmatrix}.\end{split}\]

We recall that for a transformation \(T\) from a domain \(D\) to a codomain \(E\) the range \(R= R_T\) is defined as the set of all images of elements of \(D\) in \(E\):

\[ R_T = \{\text{ all images } T(x), \, \text{ for } x \text{ in }D\}. \]

Remark 3.1.1

From Equation (3.1.1) it is clear that the range of the matrix transformation in Example 3.1.1 consists of all linear combinations of the three columns of \(A\):

\[\begin{split} \operatorname{Range}(T_A) = \Span{ \begin{pmatrix} 1\\ 1 \end{pmatrix}, \begin{pmatrix} 2 \\ 2 \end{pmatrix}, \begin{pmatrix} 0\\ 1 \end{pmatrix}}. \end{split}\]

In a later section (Section 4.1) we will call this the column space of the matrix \(A\).

The first example leads to a first property of matrix transformations:

Proposition 3.1.1

Suppose

\[ A = \begin{pmatrix} \mathbf{a}_1 & \mathbf{a}_2 & \cdots & \mathbf{a}_n \end{pmatrix} \]

is an \(m\times n\)-matrix.

Then the range of the matrix transformation corresponding to \(A\) is the span of the columns of \(A\):

\[ \operatorname{Range}(T_A) = \Span{\mathbf{a}_1, \mathbf{a}_2,\ldots,\mathbf{a}_n }. \]

Example 3.1.2

The matrix

\[\begin{split} A = \begin{pmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{pmatrix} \end{split}\]

leads to the transformation

\[\begin{split} T: \mathbb{R}^2 \to \mathbb{R}^3, \quad T \left(\begin{pmatrix} x \\ y \end{pmatrix}\right)= \begin{pmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} x \\ y \\0 \end{pmatrix}. \end{split}\]

This transformation “embeds” the plane \(\mathbb{R}^2\) into the space \(\mathbb{R}^3\), as depicted in Figure 3.1.1.

../_images/Fig-LinTrafo-EmbedR2R3.svg — Fig. 3.1.1 \(T\) embeds \(\mathbb{R}^2\) into \(\mathbb{R}^3\).#

The range of this transformation is the span of the two vectors

\[\begin{split} \mathbf{e}_1 = \begin{pmatrix} 1\\ 0 \\ 0 \end{pmatrix} \quad \text{and} \quad \mathbf{e}_2 = \begin{pmatrix} 0\\ 1 \\ 0 \end{pmatrix}, \end{split}\]

which is the \(xy\)-plane in \(\mathbb{R}^3\).

For \(2\times2\) and \(3\times3\)-matrices the transformations often have a geometric interpretation, as the following example illustrates.

Example 3.1.3

The transformation corresponding to the matrix

\[\begin{split} A = \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix} \end{split}\]

is the mapping

\[\begin{split} T: \mathbb{R}^2 \to \mathbb{R}^2, \quad T\left(\begin{pmatrix} x \\ y \end{pmatrix}\right)= \begin{pmatrix} x +y \\ 0 \end{pmatrix}. \end{split}\]

First we observe that the range of this transformation consists of all multiples of the vector \( \begin{pmatrix} 1 \\ 0 \end{pmatrix} \), i.e. the \(x\)-axis in the plane.

Second, let us find the set of points/vectors that is mapped to an arbitrary point \(\begin{pmatrix} c \\ 0 \end{pmatrix}\) in the range. For this we solve

\[\begin{split} A\mathbf{x} = \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} c \\ 0 \end{pmatrix}, \quad \text{so} \quad \begin{pmatrix} x+y \\ 0 \end{pmatrix} = \begin{pmatrix} c \\ 0 \end{pmatrix}. \end{split}\]

The points whose coordinates satisfy this equation all lie on the line described by the equation

\[ x + y = c. \]

So what the mapping does is to send all points on a line \(\mathcal{L}:x + y = c\) to the point \((c,0)\), which is the intersecting of this line with the \(x\)-axis. An alternative way to describe it: it is the skew projection, in the direction \(\begin{pmatrix} 1 \\ -1 \end{pmatrix}\) onto the \(x\)-axis. See Figure 3.1.2.

../_images/Fig-LinTrafo-SkewProjection.svg — Fig. 3.1.2 The transformation of Example 3.1.3.#

Grasple exercise 3.1.1

https://embed.grasple.com/exercises/2db421fe-0649-4c7e-9624-9fdf0123565d?id=122271

Finding out which vectors are in the range of a linear transformation.

We close this subsection with an example of a matrix transformation representing a very elementary dynamical system.

Example 3.1.4

Consider a model with two cities between which over a fixed period of time migrations take place. Say in a period of ten years 90% of the inhabitants in city \(A\) stay in city \(A\) and 10% move to city \(B\). From city \(B\) 20% of the citizens move to \(A\), so 80% stay in city \(B\).

The following table contains the relevant statistics:

\[\begin{split} \begin{array}{c|cc|} & A & B \\ \hline A & 0.9 & 0.2 \\ B & 0.1 & 0.8 \\ \hline \end{array} \end{split}\]

For instance, if at time \(0\) the population in city \(A\) amounts to \(50\) (thousand) and in city \(B\) live \(100\) (thousand) people, then at the end of one period the population in city \(A\) amounts to

\[ 0.9 \times 50 + 0.2 \times 100 = 65. \]

Likewise for city \(B\).

If we denote the population sizes after \(k\) periods by a vector

\[\begin{split} \mathbf{x}_k = \begin{pmatrix} x_k \\ y_k \end{pmatrix} \end{split}\]

it follows that

\[\begin{split} \begin{pmatrix} x_{k+1} \\ y_{k+1} \end{pmatrix} = \begin{pmatrix} 0.9x_k + 0.2y_k \\0.1x_k + 0.8y_k \end{pmatrix}, \quad \text{i.e., } \mathbf{x}_{k+1} = \begin{pmatrix} 0.9 & 0.2 \\ 0.1 & 0.8 \end{pmatrix} \begin{pmatrix} x_k \\ y_k \end{pmatrix} = M \mathbf{x}_{k}. \end{split}\]

The \(M\) stands for migration matrix.

Obviously this model can be generalised to a “world” with any number of cities.

3.1.3. Linear transformations#

In the previous section we saw that the matrix transformation \(\mathbf{y}=A\mathbf{x}\) can also be seen as a mapping \(T(\mathbf{x}) = A\mathbf{x}\).
This mapping has two characteristic properties on which we will focus in this section.

linear transformation

Definition 3.1.2

A linear transformation is a function \(T\) from \(\mathbb{R}^n\) to \(\mathbb{R}^m\) that has the following properties

For all vectors \(\mathbf{v}_1,\,\mathbf{v}_2\) in \(\mathbb{R}^n\):

\[ T(\mathbf{v}_1+\mathbf{v}_2) = T(\mathbf{v}_1) + T(\mathbf{v}_2). \]
For all vectors \(\mathbf{v}\) in \(\mathbb{R}^n\) and all scalars \(c\) in \(\mathbb{R}\):

\[ T(c\mathbf{v}) = c\,T(\mathbf{v}). \]

Exercise 3.1.1

Show that a linear transformation from \(\mathbb{R}^n\) to \(\mathbb{R}^m\) always sends the zero vector \(\mathbf{0}_n\) in \(\R^n\) to the zero vector \(\mathbf{0}_m\) in \(\R^m\).

Thus, if \( T:\mathbb{R}^n \to\mathbb{R}^m\) is a linear transformation, then \(T(\mathbf{0}_n) = \mathbf{0}_m\).

Solution to Exercise 3.1.1

If \( T:\mathbb{R}^n \to\mathbb{R}^m\) is linear, and \(\vect{v}\) is any vector in \(\R^n\), then \(\mathbf{0}_n = 0\vect{v}\). From the second property in Definition 3.1.2 it follows that

\[ T(\mathbf{0}_n) = T(0\vect{v}) = 0\,T(\vect{v}) = \mathbf{0}_m. \]

Example 3.1.5

Consider the map \(T:\mathbb{R}^2\rightarrow\mathbb{R}^3\) that sends each vector \(\begin{pmatrix} x \\ y \end{pmatrix}\) in \(\mathbb{R}^2\) to the vector \(\begin{pmatrix} x \\ y \\ 0 \end{pmatrix}\) in \(\mathbb{R}^3\). Let us check that this a linear map.

For that, we need to check the two properties in the definition.

For property (i) we take two arbitrary vectors

\[\begin{split} \begin{pmatrix} x_1 \\ y_1 \end{pmatrix} \quad \text{ and }\quad \begin{pmatrix} x_2 \\ y_2 \end{pmatrix} \quad \text{in} \quad \mathbb{R}^2, \end{split}\]

and see:

\[\begin{split} T\left(\begin{pmatrix} x_1 \\ y_1 \end{pmatrix} + \begin{pmatrix} x_2 \\ y_2 \end{pmatrix} \right)= T \left(\begin{pmatrix} x_1+x_2 \\ y_1+y_2 \end{pmatrix}\right)= \begin{pmatrix} x_1 + x_2 \\ y_1 + y_2 \\ 0 \end{pmatrix} = \begin{pmatrix} x_1 \\ y_1 \\ 0 \end{pmatrix} + \begin{pmatrix} x_2 \\ y_2 \\ 0 \end{pmatrix}. \end{split}\]

This last vector indeed equals

\[\begin{split} T\left(\begin{pmatrix} x_1 \\ y_1 \end{pmatrix}\right)+ T\left(\begin{pmatrix} x_2 \\ y_2 \end{pmatrix}\right). \end{split}\]

Similarly, for the second property, given any scalar \(c\),

\[\begin{split} T\left(c \begin{pmatrix} x_1 \\ y_1 \end{pmatrix}\right)= T \left(\begin{pmatrix} c x_1 \\ cy_1 \end{pmatrix}\right)= \begin{pmatrix} c x_1 \\ c y_1 \\ 0 \end{pmatrix} = c \begin{pmatrix} x_1 \\ y_1 \\ 0 \end{pmatrix}= cT \left(\begin{pmatrix} x_1 \\ y_1 \end{pmatrix}\right). \end{split}\]

So indeed \(T\) has the two properties of a linear transformation.

Example 3.1.6

Consider the mapping \(T:\mathbb{R}^2\rightarrow\mathbb{R}^2\) that sends each vector \( \begin{pmatrix} x \\ y \end{pmatrix}\) in \(\mathbb{R}^2\) to the vector \(\begin{pmatrix} x+y \\ xy \end{pmatrix}\):

\[\begin{split} T: \begin{pmatrix} x \\ y \end{pmatrix} \mapsto \begin{pmatrix} x+y \\ xy \end{pmatrix} \end{split}\]

This mapping is not a linear transformation.

\[\begin{split} T \left(\begin{pmatrix} 1 \\ 1 \end{pmatrix} + \begin{pmatrix} 1 \\ 2 \end{pmatrix}\right)= T \left(\begin{pmatrix} 2 \\ 3 \end{pmatrix}\right) = \begin{pmatrix} 5 \\ 6 \end{pmatrix}, \end{split}\]

whereas

\[\begin{split} T \left(\begin{pmatrix} 1 \\ 1 \end{pmatrix}\right)+ T \left(\begin{pmatrix} 1 \\ 2 \end{pmatrix}\right)= \begin{pmatrix} 2 \\ 1 \end{pmatrix} + \begin{pmatrix} 3 \\ 2 \end{pmatrix} = \begin{pmatrix} 5 \\ 3 \end{pmatrix} \,\neq\, \begin{pmatrix} 5 \\ 6 \end{pmatrix} . \end{split}\]

The second requirement of a linear transformation is violated as well:

\[\begin{split} T\left(3 \begin{pmatrix} 1 \\ 1 \end{pmatrix}\right)= T \left(\begin{pmatrix} 3 \\ 3 \end{pmatrix}\right)= \begin{pmatrix} 6 \\ 9 \end{pmatrix} \,\,\neq\,\, 3\,T \left(\begin{pmatrix} 1 \\ 1 \end{pmatrix} \right)= 3 \begin{pmatrix} 2 \\ 1 \end{pmatrix} = \begin{pmatrix} 6 \\ 3 \end{pmatrix}. \end{split}\]

Exercise 3.1.2

Let \(\mathbf{p}\) be a non-zero vector in \(\mathbb{R}^2\). Is the translation

\[ T\!:\mathbb{R}^2 \to \mathbb{R}^2, \quad \mathbf{x} \mapsto \mathbf{x} + \mathbf{p} \]

a linear transformation?

Solution to Exercise 3.1.2

The transformation defined by \(T(\vect{x}) = \vect{x} + \vect{p}\), with \(\vect{p}\neq \vect{0}\) does not have any of the two properties of a linear transformation.

For instance, since \(\vect{p}+\vect{p} \neq \vect{p}\),

\[ T(\vect{x}+\vect{y}) = \vect{x}+\vect{y} + \vect{p} \neq T(\vect{x})+T(\vect{y}) = \vect{x}+ \vect{p} +\vect{y} + \vect{p}. \]

Note that Example 3.1.5 was in fact the first example of a matrix transformation in the Subsection Introduction:

\[\begin{split} \begin{pmatrix} x \\ y \end{pmatrix} \mapsto \begin{pmatrix} x \\ y \\ 0 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0&1 \\ 0&0 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} \end{split}\]

As we will see: any linear transformation from \(\mathbb{R}^n\) to \(\mathbb{R}^m\) is a matrix transformation. The converse is true as well. This is the content of the next proposition.

Proposition 3.1.2

Each matrix transformation is a linear transformation.

Proof of Proposition 3.1.2

This is a direct consequence of the two properties of the matrix-vector product (Proposition 2.4.2) that say

\[ A\,(\mathbf{x}+\mathbf{y} ) = A\mathbf{x} + A\mathbf{y} \quad \text{and} \quad A\,(c\mathbf{x}) = c\,A\mathbf{x}. \]

Proposition 3.1.3

Suppose \(T: \mathbb{R}^n\to\mathbb{R}^m\) and \(S:\mathbb{R}^m\to\mathbb{R}^p\) are linear transformations. Then the transformation \(S\circ T:\mathbb{R}^n\to\mathbb{R}^p \) defined by

\[ S\circ T(\mathbf{x}) = S(T(\mathbf{x})) \]

is a linear transformation from \(\mathbb{R}^n\) to \(\mathbb{R}^p\).

Remark 3.1.2

The transformation \(S\circ T\) is called the composition of the two transformations \(S\) and \(T\). It is best read as “\(S\) after \(T\)”.

Proof of Proposition 3.1.3

Suppose that

\[ T(\mathbf{x}+\mathbf{y}) = T(\mathbf{x})+T(\mathbf{y})\quad \text{and} \quad T(c\mathbf{x}) = cT(\mathbf{x}), \quad \text{for}\,\, \mathbf{x}, \mathbf{y} \quad \text{in } \mathbb{R}^n, \,\, c \text{ in } \mathbb{R} \]

and likewise for \(S\). Then

\[\begin{split} \begin{array}{rl} S\circ T(\mathbf{x}+\mathbf{y}) = S(T(\mathbf{x}+\mathbf{y})) = S( T(\mathbf{x})+T(\mathbf{y})) \!\!\!\!& = S(T(\mathbf{x})) + S(T(\mathbf{y})) \\ & = S\circ T(\mathbf{x}) + S\circ T(\mathbf{y}) \end{array} \end{split}\]

and

\[ S\circ T(c\mathbf{x}) = S(T(c\mathbf{x})) = S(cT(\mathbf{x})) = c\,S(T(\mathbf{x})) = c\,S\circ T(\mathbf{x}). \]

Hence \(S\circ T\) satisfies the two requirements of a linear transformation.

In words: the composition/concatenation of two linear transformations is itself a linear transformation.

Exercise 3.1.3

There are other ways to combine linear transformations.

The sum \(S = T_1 + T_2\) of two linear transformation \(T_1,T_2: \mathbb{R}^n \to \mathbb{R}^m\) is defined as follows

\[ S: \mathbb{R}^n \to \mathbb{R}^m, \quad S(\mathbf{x}) = T_1(\mathbf{x}) + T_2(\mathbf{x}). \]

And the (scalar) multiple \(T_3 = cT_1\) is the transformation

\[ T_3: \mathbb{R}^n \to \mathbb{R}^m, \quad T_3(\mathbf{x}) = cT_1(\mathbf{x}). \]

Show that \(S\) and \(T_3\) are again linear transformations.

Solution to Exercise 3.1.3

The properties of the linear transformation \(T_1\) and \(T_2\) carry over to \(S\) and \(T_3\) in the following way. We check the properties one by one.

For the sum \(S\) we have

For all vectors \(\mathbf{v}_1,\,\mathbf{v}_2\) in \(\R^n\)

\[\begin{split} \begin{array}{rcl} S(\mathbf{v}_1+\mathbf{v}_2) &=& T_1(\mathbf{v}_1+\mathbf{v}_2) + T_2(\mathbf{v}_1+\mathbf{v}_2)\\ &=& T_1(\mathbf{v}_1) + T_1(\mathbf{v}_2) + T_2(\mathbf{v}_1) + T_2(\mathbf{v}_2)\\ &=& T_1(\mathbf{v}_1) + T_2(\mathbf{v}_1) + T_1(\mathbf{v}_2) + T_2(\mathbf{v}_2)\\ &=& S(\mathbf{v}_1)+S(\mathbf{v}_2). \end{array} \end{split}\]
Likewise, for all vectors \(\mathbf{v}\) in \(\mathbb{R}^n\) and all scalars \(c\) in \(\mathbb{R}\):

\[\begin{split} \begin{array}{rcl} S(c\mathbf{v}) &=& T_1(c\mathbf{v})+T_2(c\mathbf{v}) \\ &=& cT_1(\mathbf{v})+cT_2(\mathbf{v}) \\ &=& c \big(T_1(\mathbf{v})+cT_2(\mathbf{v})\big)\\ &=& cS(\mathbf{v}). \end{array} \end{split}\]

The linearity of \(T_3\) is verified in a similar manner.

Exercise 3.1.4

This exercise sheds some light on the geometry behind linear transformations. We restrict ourselves to linear transformations in the plane, but the ideas can be generalised.

Suppose \(T\) is a linear transformation from \(\R^2\) to \(\R^2\). Show that the image of a line segment \(\mathcal{S}\) between two points \(P\) and \(Q\) is either a line segment or a single point.

Solution to Exercise 3.1.4

Suppose \(\vect{p} = \overrightarrow{OP}\) and \(\vect{q} = \overrightarrow{OQ}\) are the position vectors, and let \(\vect{r} = \overrightarrow{PQ} = \vect{q} - \vect{p}\). Then the line segment between \(\vect{p}\) and \(\vect{q}\) consist of all vectors \(\vect{v} = \vect{p} + t\vect{r}\), where \(t\) runs from \(0\) to \(1\).

By the linearity of \(T\) it follows that the image of \(\mathcal{S}\) consist of all vectors \(T(\vect{p} + t\vect{r}) = T(\vect{p}) + tT(\vect{r})\), with \(0 \leq t \leq 1\).

This describes the line segment between the points \(T(\vect{p})\) and \(T(\vect{p}) + T(\vect{r})\). Note that \(T(\vect{p}) + T(\vect{r}) = T(\vect{p} + \vect{r}) = T(\vect{q})\). If by any chance \(T(\vect{r}) = \mathbf{0}\), the segment ‘shrinks’ to a single point.

And now, let us return to matrix transformations.

3.1.4. Standard matrix for a linear transformation#

We have seen that every matrix transformation is a linear transformation. In this subsection we will show that conversely every linear transformation \(T:\mathbb{R}^n \to \mathbb{R}^m\) can be represented by a matrix transformation.

The key to construct a matrix that represents a given linear transformation lies in the following proposition.

Proposition 3.1.4

Suppose \(T:\mathbb{R}^n\rightarrow\mathbb{R}^m\) is a linear transformation. Then the following property holds: for each set of vectors \(\mathbf{x}_1, \ldots, \mathbf{x}_k\) in \(\mathbb{R}^n\) and each set of numbers \(c_1,\ldots,c_k\) in \(\mathbb{R}\):

(3.1.2)#\[T(c_1\mathbf{x}_1+c_2 \mathbf{x}_2+ \cdots +c_k \mathbf{x}_k) = c_1T(\mathbf{x}_1)+c_2T(\mathbf{x}_2)+ \cdots +c_kT( \mathbf{x}_k).\]

In words: for any linear transformation the image of a linear combination of vectors is equal to the linear combination of their images.

Proof of Proposition 3.1.4

Suppose \(T:\mathbb{R}^n\rightarrow\mathbb{R}^m\) is a linear transformation.

So we have

\[ \text{(i) } T(\mathbf{x}+\mathbf{y}) = T(\mathbf{x})+T(\mathbf{y}) \quad\text{and} \quad \text{(ii) } T(c\mathbf{x}) = c T(\mathbf{x}). \]

First apply rule (i) to split the term on the left in Equation (3.1.2) into \(k\) terms:

\[\begin{split} \begin{array}{ccl} T(c_1\mathbf{x}_1+c_2 \mathbf{x}_2+ \cdots +c_k \mathbf{x}_k) &=& T(c_1\mathbf{x}_1)+T(c_2 \mathbf{x}_2+ \cdots +c_k \mathbf{x}_k) \\ &\vdots& \\ &=& T(c_1\mathbf{x}_1)+T(c_2 \mathbf{x}_2)+ \cdots + T(c_k \mathbf{x}_k) \end{array} \end{split}\]

and then apply rule (ii) to each term.

Example 3.1.7

Suppose \(T: \mathbb{R}^3 \to \mathbb{R}^2\) is a linear transformation, and we know that for

\[\begin{split} \vect{a}_1 = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}, \quad \vect{a}_2 = \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix}, \quad \vect{a}_3 = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} \end{split}\]

the images under \(T\) are given by

\[\begin{split} T(\vect{a}_1) = \vect{b}_1 = \begin{pmatrix} 1 \\ 2 \end{pmatrix}, \quad T(\vect{a}_2) = \vect{b}_2 = \begin{pmatrix} 3 \\ -1 \end{pmatrix}, \quad \text{and} \quad T(\vect{a}_3) = \vect{b}_3 = \begin{pmatrix} 2 \\ -2 \end{pmatrix}. \end{split}\]

Then for the vector

\[\begin{split} \vect{v} = \begin{pmatrix} 4 \\ 1 \\ -1 \end{pmatrix} = 3 \vect{a}_1 + 2 \vect{a}_2 - 1 \vect{a}_3 \end{split}\]

it follows that

\[\begin{split} T(\vect{v}) = 3 \vect{b}_1 + 2 \vect{b}_2 + (-1) \vect{b}_3 = 3 \begin{pmatrix} 1 \\ 2 \end{pmatrix} + 2 \begin{pmatrix} 3 \\ -1 \end{pmatrix} + (-1) \begin{pmatrix} 2 \\ -2 \end{pmatrix}= \begin{pmatrix} 7 \\ 6 \end{pmatrix}. \end{split}\]

The central idea illustrated in Example 3.1.7, which is in fact a direct consequence of Proposition 3.1.4, is the following:

a linear transformation \(T\) from \(\mathbb{R}^n\) to \(\mathbb{R}^m\) is completely specified by the images \( T(\mathbf{a}_1), T(\mathbf{a}_2), \ldots , T(\mathbf{a}_n)\) of a set of vectors \(\{\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_n\}\) that spans \(\mathbb{R}^n\).

The simplest set of vectors that spans the whole space \(\mathbb{R}^n\) is the standard basis for \(\mathbb{R}^n\) which was introduced in the Section Linear combinations.

Recall that this is the set of vectors

(3.1.3)#\[\begin{split}\left(\vect{e}_1,\mathbf{e}_2, \ldots, \mathbf{e}_n\right)= \left(\begin{pmatrix} 1 \\ 0 \\ 0 \\ \vdots \\ 0 \end{pmatrix}, \begin{pmatrix} 0 \\ 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix}, \quad \cdots \quad , \begin{pmatrix} 0 \\ 0 \\ 0 \\ \vdots \\ 1 \end{pmatrix}\right).\end{split}\]

The next example gives an illustration of the above, and it also leads the way to the construction of a matrix for an arbitrary linear transformation.

Example 3.1.8

Suppose \(T\) is a linear transformation from \(\mathbb{R}^2\) to \(\mathbb{R}^2\) for which

\[\begin{split} T(\mathbf{e}_1) = \mathbf{a}_1 = \begin{pmatrix} 1 \\2 \end{pmatrix}, \quad T(\mathbf{e}_2) = \mathbf{a}_2 = \begin{pmatrix} 4 \\3 \end{pmatrix}. \end{split}\]

Then for an arbitrary vector

\[\begin{split} \mathbf{x} = \begin{pmatrix} x_1\\x_2 \end{pmatrix} = x_1 \begin{pmatrix} 1\\0 \end{pmatrix} + x_2 \begin{pmatrix} 0\\1 \end{pmatrix} = x_1\mathbf{e}_1 + x_2\mathbf{e}_2, \end{split}\]

it follows that

\[\begin{split} \begin{array}{rcl} T(\mathbf{x}) &=& x_1T(\mathbf{e}_1) + x_2T(\mathbf{e}_2) \\ &=& x_1 \begin{pmatrix} 1 \\2 \end{pmatrix} + x_2 \begin{pmatrix} 4 \\3 \end{pmatrix} \,\,=\,\,\, \begin{pmatrix} 1 &4 \\2 &3 \end{pmatrix}\mathbf{x}. \end{array} \end{split}\]

So we see that

\[ T(\mathbf{x}) = A \mathbf{x}, \quad\text{where} \quad A = \begin{pmatrix} T(\mathbf{e}_1) & T(\mathbf{e}_2) \end{pmatrix}. \]

Exercise 3.1.5

Show that the procedure of Example 3.1.8 applied to the linear transformation of Example 3.1.5 indeed yields the matrix

\[\begin{split} A = \begin{pmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{pmatrix}. \end{split}\]

Solution to Exercise 3.1.5

Consider the linear transformation \(T:\mathbb{R}^2\rightarrow\mathbb{R}^3\) that sends each vector \( \begin{pmatrix} x \\ y \end{pmatrix}\) in \(\mathbb{R}^2\) to the vector \(\begin{pmatrix} x \\ y \\ 0 \end{pmatrix}\). It holds that

\[\begin{split} T(\vect{e}_1) = T\left(\begin{pmatrix} 1\\ 0 \end{pmatrix}\right) = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}, \quad T(\vect{e}_2) = T\left(\begin{pmatrix} 0\\ 1 \end{pmatrix}\right) = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}. \end{split}\]

We find that for an arbitrary vector \(\begin{pmatrix} x\\ y \end{pmatrix} = x\begin{pmatrix} 1\\ 0 \end{pmatrix}+y\begin{pmatrix} 0\\ 1 \end{pmatrix}\) it holds that

\[\begin{split} T\left(\begin{pmatrix} x\\ y \end{pmatrix}\right) = xT(\vect{e}_1) + yT(\vect{e}_2) = x\begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}+ y\begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1\\ 0 & 0 \end{pmatrix}\begin{pmatrix} x\\ y \end{pmatrix}. \end{split}\]

The reasoning of Example 3.1.8 can be generalised. This is the content of the next theorem.

Theorem 3.1.1

Each linear transformation \(T\) from \(\mathbb{R}^n\) to \(\mathbb{R}^m\) is a matrix transformation.

More specific, if \(T: \mathbb{R}^n \to \mathbb{R}^m\) is linear, then for each \(\mathbf{x}\) in \(\mathbb{R}^n\)

(3.1.4)#\[T(\mathbf{x}) = A\mathbf{x}, \quad \text{where} \quad A = \begin{pmatrix} T(\mathbf{e}_1) & T(\mathbf{e}_2) & \cdots & T(\mathbf{e}_n) \end{pmatrix}.\]

Proof of Theorem 3.1.1

We can more or less copy the derivation in Example 3.1.8. First of all, any vector \(\mathbf{x}\) is a linear combination of the standard basis:

\[\begin{split} \mathbf{x} = \begin{pmatrix} x_1\\x_2\\ \vdots \\ x_n \end{pmatrix} = x_1 \begin{pmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix} + x_2 \begin{pmatrix} 0 \\ 1 \\ \vdots \\ 0 \end{pmatrix} + \cdots + x_n \begin{pmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{pmatrix}, \end{split}\]

i.e.,

\[ \mathbf{x} = x_1 \mathbf{e}_1 + x_2 \mathbf{e}_2 + \cdots + x_n \mathbf{e}_n. \]

From Proposition 3.1.4 it follows that

\[ T( \mathbf{x}) = x_1 T(\mathbf{e}_1) + x_2 T(\mathbf{e}_2) + \cdots + x_n T(\mathbf{e}_n). \]

The last expression is a linear combination of \(n\) vectors in \(\mathbb{R}^m\), and thus can be written as a matrix-vector product:

\[ x_1 T(\mathbf{e}_1) + x_2 T(\mathbf{e}_2) + \cdots + x_n T(\mathbf{e}_n) = \begin{pmatrix} T(\mathbf{e}_1) & T(\mathbf{e}_2) & \cdots & T(\mathbf{e}_n) \end{pmatrix} \mathbf{x}. \]

standard matrix

Definition 3.1.3

For a linear transformation \(T:\mathbb{R}^n \to \mathbb{R}^m\), the matrix

(3.1.5)#\[\begin{pmatrix} T(\mathbf{e}_1) & T(\mathbf{e}_2) & \cdots & T(\mathbf{e}_n) \end{pmatrix}\]

is called the standard matrix of \(T\).

In the Section Some important classes of linear transformations you will learn how to build standard matrices for rotations, reflections and other geometrical mappings. For now let us look at a more “algebraic” example.

Example 3.1.9

Consider the transformation

\[\begin{split} T: \begin{pmatrix} x \\ y \\ z \end{pmatrix} \mapsto \begin{pmatrix} x-y \\ 2y+3z \\ x+y-z \end{pmatrix}. \end{split}\]

It can be checked that the transformation has the two properties of a linear transformation according to the definition. Note that

\[\begin{split} T(\mathbf{e}_1) = \begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix}, \quad T(\mathbf{e}_2) = \begin{pmatrix} -1 \\ 2 \\ 1 \end{pmatrix}, \quad \text{and} \quad T(\mathbf{e}_3) = \begin{pmatrix} 0 \\ 3 \\ -1 \end{pmatrix}. \end{split}\]

So we find that the standard matrix \([T]\) of \(T\) is given by

\[\begin{split} [T] = \begin{pmatrix} 1 & -1 & 0 \\ 0 &2&3 \\ 1 & 1 & -1 \end{pmatrix}. \end{split}\]

Exercise 3.1.6

In the previous example we could have found the matrix just by inspection.

For the slightly different transformation \(T:\R^3 \to \R^3\) given by

\[\begin{split} T: \begin{pmatrix} x \\ y \\ z \end{pmatrix} \mapsto \begin{pmatrix} 3x-z \\ y+4z \\ x-y+2z \end{pmatrix}, \end{split}\]

can you fill in the blanks in the following equation?

\[\begin{split} \begin{pmatrix} 3x-z \\ y+4z \\ x-y+2z \end{pmatrix} = \begin{pmatrix} \ldots & \ldots & \ldots \\ \ldots & \ldots & \ldots \\ \ldots & \ldots & \ldots \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix}. \end{split}\]

If you can, you will have shown that \(T\) is a matrix transformation, and as a direct consequence \(T\) is a linear transformation.

Solution to Exercise 3.1.6

\[\begin{split} \begin{pmatrix} 3x-z \\ y+4z \\ x-y+2z \end{pmatrix} = \begin{pmatrix} 3 & 0 & -1\\ 0 & 1 & 4 \\ 1 & -1 & 2 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix}. \end{split}\]

To conclude we consider an example that refers back to Proposition 3.1.3, and which will to a large extent pave the road for the product of two matrices.

Example 3.1.10

Suppose \(T:\mathbb{R}^2 \to \mathbb{R}^3\) and \(S:\mathbb{R}^3 \to \mathbb{R}^3\) are the matrix transformations given by

\[\begin{split} T(\mathbf{x}) = A\mathbf{x} = \begin{pmatrix} 1&2 \\ 3&4 \\ 1&0 \end{pmatrix} \mathbf{x} \quad \text{and} \quad S(\mathbf{y}) = B\mathbf{y} = \begin{pmatrix} 1&0 &1 \\ 1 & -1 &2 \\ -1&1&-3 \end{pmatrix} \mathbf{y}. \end{split}\]

From Proposition 3.1.3 we know that the composition \(S\circ T: \mathbb{R}^2 \to \mathbb{R}^3\) is also a linear transformation. What is the (standard) matrix of \(S\circ T\)?

For this we need the images of the unit vectors \(\mathbf{e}_1\) and \(\mathbf{e}_2\) in \(\mathbb{R}^2\). For each vector we first apply \(T\) and then \(S\). For \(\mathbf{e}_1\) this gives

\[\begin{split} T(\mathbf{e}_1) = \begin{pmatrix} 1&2 \\ 3&4 \\ 1&0 \end{pmatrix} \begin{pmatrix} 1\\0 \end{pmatrix} = \begin{pmatrix} 1 \\ 3 \\ 1 \end{pmatrix}, \end{split}\]

and then

\[\begin{split} S (T(\mathbf{e}_1)) = \begin{pmatrix} 1&0 &1 \\ 1 & -1 &2 \\ -1&1&-3 \end{pmatrix} \begin{pmatrix} 1 \\ 3 \\ 1 \end{pmatrix}= \begin{pmatrix} 2 \\ 0 \\ -1 \end{pmatrix}. \end{split}\]

Likewise for \(\mathbf{e}_2\):

\[\begin{split} T(\mathbf{e}_2) = \begin{pmatrix} 1&2 \\ 3&4 \\ 1&0 \end{pmatrix} \begin{pmatrix} 0\\1 \end{pmatrix} = \begin{pmatrix} 2\\4\\0 \end{pmatrix} \,\,\Longrightarrow\,\, S (T(\mathbf{e}_2)) = \begin{pmatrix} 1&0 &1 \\ 1 & -1 &2 \\ -1&1&-3 \end{pmatrix} \begin{pmatrix} 2 \\ 4 \\ 0 \end{pmatrix}= \begin{pmatrix} 2 \\ -2 \\ 2 \end{pmatrix}. \end{split}\]

So the matrix of \(S\circ T\) becomes

\[\begin{split} [S\circ T] \,= \, \begin{pmatrix} S\circ T(\mathbf{e}_1)&S\circ T(\mathbf{e}_2) \end{pmatrix} \,\,=\,\, \begin{pmatrix} 2 &2 \\ 0&-2 \\ -1&2 \end{pmatrix}. \end{split}\]

In the Section Matrix operations we will define the product of two matrices precisely in such a way that

\[\begin{split} \begin{pmatrix} 1&0 &1 \\ 1 & -1 &2 \\ -1&1&-3 \end{pmatrix} \begin{pmatrix} 1&2 \\ 3&4 \\ 1&0 \end{pmatrix} = \begin{pmatrix} 2 &2 \\ 0&-2 \\ -1&2 \end{pmatrix}. \end{split}\]

3.1.5. Grasple exercises#

Grasple exercise 3.1.2

https://embed.grasple.com/exercises/3f14573a-1d4c-4a4b-ae48-ccb168005702?id=70373

To specify the domain and the codomain of a linear transformation.

Grasple exercise 3.1.3

https://embed.grasple.com/exercises/b80d9889-bd46-45c6-a9cb-d056aa315232?id=70374

To find the size of the matrix of a linear transformation.

Grasple exercise 3.1.4

https://embed.grasple.com/exercises/be6a768d-c60d-4ed6-81a7-5dea71b4a1a5?id=70375

To find the image of two vectors under \(T(\vect{x}) = A\vect{x}\).

Grasple exercise 3.1.5

https://embed.grasple.com/exercises/c8bb24f6-d357-4571-adb3-39ea0fa9e4ee?id=70395

For a linear map \(T\), find \(T(c\vect{u})\) and \(T(\vect{u}+\vect{v})\) if \(T(\vect{u})\) and \(T(\vect{v})\) are given.

Grasple exercise 3.1.6

https://embed.grasple.com/exercises/93048f7c-b755-4445-a532-949f34136096?id=70398

For a linear map \(T:\R^2 \to \R^2\), find \(T((x_1,x_2))\) if \(T(\vect{e}_1)\) and \(T(\vect{e}_2)\) are given.

Grasple exercise 3.1.7

https://embed.grasple.com/exercises/2af6559f-8871-494d-abce-d4263d530c69?id=70381

Find all vectors \(\vect{w}\) for which \(T(\vect{w}) = \vect{u}\).

Grasple exercise 3.1.8

https://embed.grasple.com/exercises/ce6e4a52-c985-43ee-92cb-2762a467ac5a?id=70383

Find vectors \(\vect{w}\) for which \(T(\vect{w}) = \vect{u}\).

Grasple exercise 3.1.9

https://embed.grasple.com/exercises/37b6bd46-8cfc-4c98-a5e8-53aa41c87dcf?id=70384

Find vectors \(\vect{w}\) for which \(T(\vect{w}) = \vect{u}\).

Grasple exercise 3.1.10

https://embed.grasple.com/exercises/c5b2a642-fd50-43f6-9346-c37a0ffe1a40?id=70386

Find vectors \(\vect{w}\) for which \(T(\vect{w}) = \vect{u}\).

Grasple exercise 3.1.11

https://embed.grasple.com/exercises/c3d009c0-62d6-4ae3-8ca1-04a5d2730455?id=70406

To show that a given transformation is non-linear.

Grasple exercise 3.1.12

https://embed.grasple.com/exercises/b9a4b128-f2c2-4612-a7f5-271c4e69aa70?id=70418

Finding an image and a pre-image of \(T:\R^2 \to \R^2\) using a picture.

Grasple exercise 3.1.13

https://embed.grasple.com/exercises/4058e54a-74f2-414e-9693-420abbc62677?id=70391

To give a geometric description of \(T: \vect{x} \mapsto A\vect{x}\).

Grasple exercise 3.1.14

https://embed.grasple.com/exercises/990bf561-629e-430f-b8d0-e757c63fe15c?id=70392

To give a geometric description of \(T: \vect{x} \mapsto A\vect{x}\).

Grasple exercise 3.1.15

https://embed.grasple.com/exercises/4e5d3f55-9257-4023-9739-5df0a1a9f277?id=70410

To find the matrix of the transformation that sends \((x,y)\) to \(x\vect{a}_1 + y\vect{a}_2\).

Grasple exercise 3.1.16

https://embed.grasple.com/exercises/9efa96e2-483d-4b2c-a58a-ba197bc09a81?id=70411

To find the matrix of the transformation that sends \((x,y)\) to \(x\vect{a}_1 + y\vect{a}_2\).

Grasple exercise 3.1.17

https://embed.grasple.com/exercises/729cba57-72d1-4d54-8cf9-c9946952bf9d?id=70412

To rewrite \(T:\R^3 \to \R^2\) to standard form.

Grasple exercise 3.1.18

https://embed.grasple.com/exercises/b4bb3730-f14c-4a60-a8b8-6b895cf93ac5?id=70413

To find the standard matrix for \(T:\R^4 \to \R\).

Grasple exercise 3.1.19

https://embed.grasple.com/exercises/34bb6386-7e7c-411b-83a1-09bbaf1106c5?id=70415

To find the standard matrix for \(T:\R^2 \to \R^2\) if \(T(\vect{v}_1)\) and \(T(\vect{v}_2)\) are given.

Grasple exercise 3.1.20

https://embed.grasple.com/exercises/ce8ba17c-0a17-4d5e-b4b7-5c277c7e8df8?id=70416

To find the standard matrix for \(T:\R^2 \to \R^3\) if \(T(\vect{v}_1)\) and \(T(\vect{v}_2)\) are given.

Grasple exercise 3.1.21

https://embed.grasple.com/exercises/2de4f8d1-ab3d-4d3a-94e4-5e414e2da3d9?id=70372

If \(T(\vect{0}) = \vect{0}\), is \(T\) (always) linear?

Grasple exercise 3.1.22

https://embed.grasple.com/exercises/94d618e0-de21-491c-ad44-8e29974e0303?id=71098

True or false? If \(\{\vect{v}_1,\vect{v}_2,\vect{v}_3\}\) is linearly dependent, then \(\{T(\vect{v}_1),T(\vect{v}_2),T(\vect{v}_3)\}\) is also linearly dependent.

Grasple exercise 3.1.23

https://embed.grasple.com/exercises/f983b627-10c2-4dd6-a273-2a33e99d0ded?id=71101

True or false? If \(\{\vect{v}_1,\vect{v}_2,\vect{v}_3\}\) is linearly independent, then \(\{T(\vect{v}_1),T(\vect{v}_2),T(\vect{v}_3)\}\) is also linearly independent.

Linear transformations

Contents

3.1. Linear transformations#

3.1.1. Introduction#

3.1.2. Matrix transformations#

3.1.3. Linear transformations#

3.1.4. Standard matrix for a linear transformation#

3.1.5. Grasple exercises#