3.1. Linear Transformations#

3.1.1. Introduction#

Until now we have used matrices in the context of linear systems. The equation

\[ A\mathbf{x} = \mathbf{b}, \]

where \(A\) is an \(m \times n\) matrix, is just a concise way to write down a system of \(m\) linear equations in \(n\) unknowns. A different way to look at this matrix equation is to consider it as an input-output system: the left-hand side \(A\mathbf{x}\) can be seen as a mapping that sends an “input” \(\mathbf{x}\) to an “output” \(\mathbf{y}= A\mathbf{x}\).

For instance, in computer graphics, points describing a 3D object typically have to be converted to points in 2D, to be able to visualize them on a screen. Or, in a dynamical system, a matrix \(A\) may describe how a system evolves from a “state” \(\mathbf{x}_{k}\) at time \(k\) to a state \(\mathbf{x}_{k+1}\) at time \(k+1\) via

\[ \mathbf{x}_{k+1} = A\mathbf{x}_{k}. \]

A “state” may be anything: a set of particles at certain positions, a set of pixels describing a minion, concentrations of chemical substances in a reactor tank, or population sizes of different species. Thinking mathematically, we would describe such an input-output interpretation as a transformation (or: function, map, mapping, operator)

\[ T: \mathbb{R}^n \to \mathbb{R}^m. \]
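To make this input-output viewpoint concrete, here is a minimal NumPy sketch; the matrix \(A\) and the input vector are made-up values, chosen only for illustration:

```python
import numpy as np

# A fixed m x n matrix A turns "inputs" in R^n into "outputs" in R^m.
A = np.array([[2, 0, 1],
              [-1, 3, 0]])      # a 2 x 3 matrix: inputs in R^3, outputs in R^2

def T(x):
    """The transformation x |-> A x associated with the matrix A."""
    return A @ x

print(T(np.array([1, 1, 1])))   # [3 2]
```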

We will see that these matrix transformations have two characteristic properties, which make them the protagonists of the more general linear algebra concept of a linear transformation.

3.1.2. Matrix Transformations#

Let \(A\) be an \(m\times n\) matrix. We can in a natural way associate a transformation \(T_A:\mathbb{R}^n \to \mathbb{R}^m\) to the matrix \(A\).

Definition 3.1.1

The transformation \(T_A\) corresponding to the \(m\times n\) matrix \(A\) is the mapping defined by

\[ T_A(\mathbf{x}) = A\mathbf{x} \quad \text{or } \quad T_A:\mathbf{x} \mapsto A\mathbf{x}, \]

where \(\mathbf{x} \in \mathbb{R}^n\).

We call such a mapping a matrix transformation. Conversely, we say that the matrix \(A\) represents the transformation \(T_A\).

As a first example consider the following.

Example 3.1.1

The transformation corresponding to the matrix \(A = \begin{bmatrix} 1 & 2 & 0\\ 1 & 2 & 1 \end{bmatrix}\) is defined by

\[\begin{split} T_A(\mathbf{x}) = \begin{bmatrix} 1 & 2 & 0\\ 1 & 2 & 1 \end{bmatrix}\mathbf{x}. \end{split}\]

We have, for instance,

\[\begin{split} \begin{bmatrix} 1 & 2 & 0\\ 1 & 2 & 1 \end{bmatrix} \begin{bmatrix} 1\\1\\1 \end{bmatrix} = \begin{bmatrix} 3 \\ 4 \end{bmatrix} \quad \text{and} \quad \begin{bmatrix} 1 & 2 & 0\\ 1 & 2 & 1 \end{bmatrix} \begin{bmatrix} 2\\-1\\0 \end{bmatrix} = \begin{bmatrix} 0\\ 0 \end{bmatrix}. \end{split}\]

According to the definition of the matrix-vector product we can also write

(3.1.1)#\[\begin{split}A\mathbf{x} = \begin{bmatrix} 1 & 2 & 0\\ 1 & 2 & 1 \end{bmatrix} \begin{bmatrix} x_1\\x_2\\x_3 \end{bmatrix} = x_1 \begin{bmatrix} 1\\ 1 \end{bmatrix}+ x_2 \begin{bmatrix} 2 \\ 2 \end{bmatrix}+ x_3 \begin{bmatrix} 0\\ 1 \end{bmatrix}.\end{split}\]
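Equation (3.1.1) is easy to verify numerically. A small NumPy check, with an arbitrarily chosen test vector:

```python
import numpy as np

A = np.array([[1, 2, 0],
              [1, 2, 1]])
x = np.array([2, -1, 3])   # an arbitrary test vector

# A x computed directly, and as the combination x1*a1 + x2*a2 + x3*a3
direct = A @ x
combination = x[0]*A[:, 0] + x[1]*A[:, 1] + x[2]*A[:, 2]
print(direct, combination)                  # the same vector, twice
assert np.array_equal(direct, combination)
```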

We recall that for a transformation \(T\) from a domain \(D\) to a codomain \(E\) the range \(R= R_T\) is defined as the set of all images of elements of \(D\) in \(E\):

\[ R_T = \{\text{ all images } T(\mathbf{x}), \, \text{ for } \mathbf{x} \text{ in } D\}. \]

Remark 3.1.1

From Equation (3.1.1) it is clear that the range of the matrix transformation in Example 3.1.1 consists of all linear combinations of the three columns of \(A\):

\[\begin{split} \text{Range}(T_A) = \Span{ \begin{bmatrix} 1\\ 1 \end{bmatrix}, \begin{bmatrix} 2 \\ 2 \end{bmatrix}, \begin{bmatrix} 0\\ 1 \end{bmatrix}}. \end{split}\]

In a later chapter (Section 4.1, Subspaces of \(\R^n\)) we will call this the column space of the matrix \(A\).

The first example leads to a first property of matrix transformations:

Proposition 3.1.1

Suppose

\[ A = \begin{bmatrix} \mathbf{a}_1 & \mathbf{a}_2 & \ldots & \mathbf{a}_n \end{bmatrix} \]

is an \(m\times n\) matrix.

Then the range of the matrix transformation corresponding to \(A\) is the span of the columns of \(A\):

\[ \text{Range}(T_A) = \Span{\mathbf{a}_1, \mathbf{a}_2,\ldots,\mathbf{a}_n }. \]

Example 3.1.2

The matrix

\[\begin{split} A = \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{bmatrix} \end{split}\]

leads to the transformation

\[\begin{split} T: \mathbb{R}^2 \to \mathbb{R}^3, \quad T \left(\begin{bmatrix} x \\ y \end{bmatrix}\right)= \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} x \\ y \\0 \end{bmatrix}. \end{split}\]

This transformation “embeds” the plane \(\mathbb{R}^2\) into the space \(\mathbb{R}^3\), as depicted in Figure 3.1.1.

../_images/Fig-LinTrafo-EmbedR2R3.svg

Fig. 3.1.1 \(T\): embedding \(\mathbb{R}^2\) into \(\mathbb{R}^3\).#

The range of this transformation is the span of the two vectors

\[\begin{split} \mathbf{e}_1 = \begin{bmatrix} 1\\ 0 \\ 0 \end{bmatrix} \quad \text{and} \quad \mathbf{e}_2 = \begin{bmatrix} 0\\ 1 \\ 0 \end{bmatrix}, \end{split}\]

which is the \(xy\)-plane in \(\mathbb{R}^3\).

For \(2\times2\) and \(3\times3\) matrices the transformations often have a geometric interpretation, as the following example illustrates.

Example 3.1.3

The transformation corresponding to the matrix

\[\begin{split} A = \begin{bmatrix} 1 & 1 \\ 0 & 0 \end{bmatrix} \end{split}\]

is the mapping

\[\begin{split} T: \mathbb{R}^2 \to \mathbb{R}^2, \quad T\left(\begin{bmatrix} x \\ y \end{bmatrix}\right)= \begin{bmatrix} x +y \\ 0 \end{bmatrix}. \end{split}\]

First we observe that the range of this transformation consists of all multiples of the vector \( \begin{bmatrix} 1 \\ 0 \end{bmatrix} \), i.e. the \(x\)-axis in the plane.

Second, let us find the set of points/vectors that is mapped to an arbitrary point \(\begin{bmatrix} c \\ 0 \end{bmatrix}\) in the range. For this we solve

\[\begin{split} A\mathbf{x} = \begin{bmatrix} 1 & 1 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} c \\ 0 \end{bmatrix} \quad \iff \quad \begin{bmatrix} x+y \\ 0 \end{bmatrix} = \begin{bmatrix} c \\ 0 \end{bmatrix}. \end{split}\]

The points whose coordinates satisfy this equation all lie on the line described by the equation

\[ x + y = c. \]

So what the mapping does is send all points on a line \(\mathcal{L}:x + y = c\) to the point \((c,0)\), which is the intersection of this line with the \(x\)-axis.
An alternative way to describe it: it is the skew projection onto the \(x\)-axis in the direction \(\begin{bmatrix} 1 \\ -1 \end{bmatrix}\). See Figure 3.1.2.

../_images/Fig-LinTrafo-SkewProjection.svg

Fig. 3.1.2 The transformation of Example 3.1.3.#
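A quick numerical illustration of this skew projection; the points on the line \(x+y=5\) are chosen arbitrarily:

```python
import numpy as np

A = np.array([[1, 1],
              [0, 0]])

# Three arbitrary points on the line x + y = 5:
for p in [np.array([5, 0]), np.array([2, 3]), np.array([-1, 6])]:
    print(p, "->", A @ p)   # every point lands on [5 0]
```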

Exercise 3.1.1

Find out whether the vectors

\[\begin{split} \mathbf{y}_1 = \begin{bmatrix} 2 \\ 1 \\ 0 \end{bmatrix} \quad \text{and} \quad \mathbf{y}_2 = \begin{bmatrix} 2 \\ 0 \\ 1 \end{bmatrix} \end{split}\]

are in the range of the matrix transformation

\[\begin{split} T(\mathbf{x}) = A\mathbf{x} = \begin{bmatrix} 1 &1&1 \\ 1 &-1&3 \\ -1&2&-4 \end{bmatrix}\mathbf{x}. \end{split}\]
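If you want to check your answer numerically afterwards, one possible approach is a least-squares test: \(\mathbf{y}\) lies in the range of \(T\) exactly when \(A\mathbf{x} = \mathbf{y}\) has a solution. A sketch; it confirms whether a solution exists, but does not replace the row reduction you should do by hand:

```python
import numpy as np

A = np.array([[ 1,  1,  1],
              [ 1, -1,  3],
              [-1,  2, -4]], dtype=float)

def in_range(y):
    """True when A x = y has a solution, i.e. y is in the range of T."""
    x, *_ = np.linalg.lstsq(A, y, rcond=None)
    return np.allclose(A @ x, y)

print(in_range(np.array([2., 1., 0.])))   # y_1
print(in_range(np.array([2., 0., 1.])))   # y_2
```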

We close this subsection with an example of a matrix transformation representing a very elementary dynamical system.

Example 3.1.4

Consider a model with two cities between which over a fixed period of time migrations take place. Say in a period of ten years 90% of the inhabitants in city \(A\) stay in city \(A\) and 10% move to city \(B\). From city \(B\) 20% of the citizens move to \(A\), so 80% stay in city \(B\).

The following table contains the relevant statistics:

\[\begin{split} \begin{array}{c|cc} & A & B \\ \hline A & 0.9 & 0.2 \\ B & 0.1 & 0.8 \\ \hline \end{array} \end{split}\]

For instance, if at time 0 the population of city \(A\) amounts to 50 (thousand) and city \(B\) has 100 (thousand) inhabitants, then at the end of one period the population of city \(A\) amounts to

\[ 0.9 \times 50 + 0.2 \times 100 = 65. \]

Likewise for city \(B\).

If we denote the population sizes after \(k\) periods by a vector

\[\begin{split} \mathbf{x}_k = \begin{bmatrix} x_k \\ y_k \end{bmatrix} \end{split}\]

it follows that

\[\begin{split} \begin{bmatrix} x_{k+1} \\ y_{k+1} \end{bmatrix} = \begin{bmatrix} 0.9x_k + 0.2y_k \\0.1x_k + 0.8y_k \end{bmatrix}, \quad \text{i.e., } \mathbf{x}_{k+1} = \begin{bmatrix} 0.9 & 0.2 \\ 0.1 & 0.8 \end{bmatrix} \begin{bmatrix} x_k \\ y_k \end{bmatrix} = M \mathbf{x}_{k}. \end{split}\]

Here \(M\) stands for migration matrix.

Obviously this model can be generalized to a “world” with any number of cities.
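A short NumPy simulation of this model, starting from the population sizes used above (in thousands):

```python
import numpy as np

M = np.array([[0.9, 0.2],
              [0.1, 0.8]])     # the migration matrix from the table
x = np.array([50.0, 100.0])    # initial populations of cities A and B

for k in range(1, 4):
    x = M @ x                  # x_{k+1} = M x_k
    print(f"after {k} period(s): A = {x[0]:.2f}, B = {x[1]:.2f}")
```

The first line of output reproduces the computation above: city \(A\) grows to 65 (thousand) after one period.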

3.1.3. Linear Transformations#

In the previous subsection we saw that the matrix transformation \(\mathbf{y}=A\mathbf{x}\) can also be seen as a mapping \(T(\mathbf{x}) = A\mathbf{x}\).
This mapping has two characteristic properties on which we will focus in this section.

Definition 3.1.2

A linear transformation is a function \(T\) from \(\mathbb{R}^n\) to \(\mathbb{R}^m\) that has the following two properties:

  1. For all vectors \(\mathbf{v}_1,\,\mathbf{v}_2\) in \(\mathbb{R}^n\):


    \[ T(\mathbf{v}_1+\mathbf{v}_2) = T(\mathbf{v}_1) + T(\mathbf{v}_2). \]
  2. For all vectors \(\mathbf{v}\) in \(\mathbb{R}^n\) and all scalars \(c\) in \(\mathbb{R}\):


    \[ T(c\mathbf{v}) = c\,T(\mathbf{v}). \]

Exercise 3.1.2

Show that a linear transformation from \(\mathbb{R}^n\) to \(\mathbb{R}^m\) always sends the zero vector in \(\R^n\) to the zero vector in \(\R^m\).
Thus, if \( T:\mathbb{R}^n \to\mathbb{R}^m\) is a linear transformation, then \(T(\mathbf{0}_n) = \mathbf{0}_m\).

Example 3.1.5

Consider the map \(T:\mathbb{R}^2\rightarrow\mathbb{R}^3\) that sends each vector \(\begin{bmatrix} x \\ y \end{bmatrix}\) in \(\mathbb{R}^2\) to the vector \(\begin{bmatrix} x \\ y \\ 0 \end{bmatrix}\) in \(\mathbb{R}^3\). Let us check that this is a linear map.

For that, we need to check the two properties in the definition.
For property (i) we take two arbitrary vectors

\[\begin{split} \begin{bmatrix} x_1 \\ y_1 \end{bmatrix} \quad \text{ and }\quad \begin{bmatrix} x_2 \\ y_2 \end{bmatrix} \quad \text{in} \quad \mathbb{R}^2, \end{split}\]

and see:

\[\begin{split} T\left(\begin{bmatrix} x_1 \\ y_1 \end{bmatrix} + \begin{bmatrix} x_2 \\ y_2 \end{bmatrix} \right)= T \left(\begin{bmatrix} x_1+x_2 \\ y_1+y_2 \end{bmatrix}\right)= \begin{bmatrix} x_1 + x_2 \\ y_1 + y_2 \\ 0 \end{bmatrix} = \begin{bmatrix} x_1 \\ y_1 \\ 0 \end{bmatrix} + \begin{bmatrix} x_2 \\ y_2 \\ 0 \end{bmatrix}. \end{split}\]

This last vector indeed equals

\[\begin{split} T\left(\begin{bmatrix} x_1 \\ y_1 \end{bmatrix}\right)+ T\left(\begin{bmatrix} x_2 \\ y_2 \end{bmatrix}\right). \end{split}\]

Similarly, for the second property, given any scalar \(c\),

\[\begin{split} T\left(c \begin{bmatrix} x_1 \\ y_1 \end{bmatrix}\right)= T \left(\begin{bmatrix} c x_1 \\ cy_1 \end{bmatrix}\right)= \begin{bmatrix} c x_1 \\ c y_1 \\ 0 \end{bmatrix} = c \begin{bmatrix} x_1 \\ y_1 \\ 0 \end{bmatrix}= cT \left(\begin{bmatrix} x_1 \\ y_1 \end{bmatrix}\right). \end{split}\]

So indeed \(T\) has the two properties of a linear transformation.

Example 3.1.6

Consider the mapping \(T:\mathbb{R}^2\rightarrow\mathbb{R}^2\) that sends each vector \( \begin{bmatrix} x \\ y \end{bmatrix}\) in \(\mathbb{R}^2\) to the vector \(\begin{bmatrix} x+y \\ xy \end{bmatrix}\):

\[\begin{split} T: \begin{bmatrix} x \\ y \end{bmatrix} \mapsto \begin{bmatrix} x+y \\ xy \end{bmatrix} \end{split}\]

This mapping is not a linear transformation. For instance,

\[\begin{split} T \left(\begin{bmatrix} 1 \\ 1 \end{bmatrix} + \begin{bmatrix} 1 \\ 2 \end{bmatrix}\right)= T \left(\begin{bmatrix} 2 \\ 3 \end{bmatrix}\right) = \begin{bmatrix} 5 \\ 6 \end{bmatrix}, \end{split}\]

whereas

\[\begin{split} T \left(\begin{bmatrix} 1 \\ 1 \end{bmatrix}\right)+ T \left(\begin{bmatrix} 1 \\ 2 \end{bmatrix}\right)= \begin{bmatrix} 2 \\ 1 \end{bmatrix} + \begin{bmatrix} 3 \\ 2 \end{bmatrix} = \begin{bmatrix} 5 \\ 3 \end{bmatrix} \,\neq\, \begin{bmatrix} 5 \\ 6 \end{bmatrix} . \end{split}\]

The second requirement of a linear transformation is violated as well:

\[\begin{split} T\left(3 \begin{bmatrix} 1 \\ 1 \end{bmatrix}\right)= T \left(\begin{bmatrix} 3 \\ 3 \end{bmatrix}\right)= \begin{bmatrix} 6 \\ 9 \end{bmatrix} \,\,\neq\,\, 3\,T \left(\begin{bmatrix} 1 \\ 1 \end{bmatrix} \right)= 3 \begin{bmatrix} 2 \\ 1 \end{bmatrix} = \begin{bmatrix} 6 \\ 3 \end{bmatrix}. \end{split}\]
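These two counterexamples are easy to reproduce in NumPy:

```python
import numpy as np

def T(v):
    """The (nonlinear) map (x, y) |-> (x + y, x*y)."""
    x, y = v
    return np.array([x + y, x * y])

u, v = np.array([1, 1]), np.array([1, 2])
print(T(u + v), T(u) + T(v))   # [5 6] versus [5 3]: additivity fails
print(T(3 * u), 3 * T(u))      # [6 9] versus [6 3]: homogeneity fails
```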

Exercise 3.1.3

Let \(\mathbf{p}\) be a nonzero vector in \(\mathbb{R}^2\). Is the translation

\[ T\!:\mathbb{R}^2 \to \mathbb{R}^2, \quad \mathbf{x} \mapsto \mathbf{x} + \mathbf{p} \]

a linear transformation?

Note that Example 3.1.5 is in fact the matrix transformation of Example 3.1.2:

\[\begin{split} \begin{bmatrix} x \\ y \end{bmatrix} \mapsto \begin{bmatrix} x \\ y \\ 0 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0&1 \\ 0&0 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}. \end{split}\]

As we will see, any linear transformation from \(\mathbb{R}^n\) to \(\mathbb{R}^m\) is a matrix transformation. The converse is true as well, and it is the content of the next proposition.

Proposition 3.1.2

Each matrix transformation is a linear transformation.

Proof of Proposition 3.1.2

This is a direct consequence of the two properties of the matrix-vector product (Proposition 2.4.2) that say

\[ A\,(\mathbf{x}+\mathbf{y} ) = A\mathbf{x} + A\mathbf{y} \quad \text{and} \quad A\,(c\mathbf{x}) = c\,A\mathbf{x}. \]
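These two rules can be spot-checked numerically; the data below are random integers, so the comparisons are exact:

```python
import numpy as np

rng = np.random.default_rng(0)             # seeded for reproducibility
A = rng.integers(-5, 5, size=(3, 4))
x = rng.integers(-5, 5, size=4)
y = rng.integers(-5, 5, size=4)
c = 7

print(np.array_equal(A @ (x + y), A @ x + A @ y))   # True
print(np.array_equal(A @ (c * x), c * (A @ x)))     # True
```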

Proposition 3.1.3

Suppose \(T: \mathbb{R}^n\to\mathbb{R}^m\) and \(S:\mathbb{R}^m\to\mathbb{R}^p\) are linear transformations. Then the transformation \(S\circ T:\mathbb{R}^n\to\mathbb{R}^p \) defined by

\[ S\circ T(\mathbf{x}) = S(T(\mathbf{x})) \]

is a linear transformation from \(\mathbb{R}^n\) to \(\mathbb{R}^p\).

Remark 3.1.2

The transformation \(S\circ T\) is called the composition of the two transformations \(S\) and \(T\). It is best read as \(S\) after \(T\).

Proof of Proposition 3.1.3

Suppose that

\[ T(\mathbf{x}+\mathbf{y}) = T(\mathbf{x})+T(\mathbf{y})\quad \text{and} \quad T(c\mathbf{x}) = cT(\mathbf{x}), \quad \text{for all } \mathbf{x}, \mathbf{y} \text{ in } \mathbb{R}^n \text{ and all } c \text{ in } \mathbb{R}, \]

and likewise for \(S\). Then

\[\begin{split} \begin{array}{rl} S\circ T(\mathbf{x}+\mathbf{y}) = S(T(\mathbf{x}+\mathbf{y})) = S( T(\mathbf{x})+T(\mathbf{y})) \!\!\!\!& = S(T(\mathbf{x})) + S(T(\mathbf{y})) \\ & = S\circ T(\mathbf{x}) + S\circ T(\mathbf{y}) \end{array} \end{split}\]

and

\[ S\circ T(c\mathbf{x}) = S(T(c\mathbf{x})) = S(cT(\mathbf{x})) = c\,S(T(\mathbf{x})) = c\,S\circ T(\mathbf{x}). \]

Hence \(S\circ T\) satisfies the two requirements of a linear transformation.

In words: the composition/concatenation of two linear transformations is itself a linear transformation.

Exercise 3.1.4

There are other ways to combine linear transformations.

The sum \(S = T_1 + T_2\) of two linear transformations \(T_1,T_2: \mathbb{R}^n \to \mathbb{R}^m\) is defined as follows:

\[ S: \mathbb{R}^n \to \mathbb{R}^m, \quad S(\mathbf{x}) = T_1(\mathbf{x}) + T_2(\mathbf{x}). \]

And the (scalar) multiple \(T_3 = cT_1\) is the transformation

\[ T_3: \mathbb{R}^n \to \mathbb{R}^m, \quad T_3(\mathbf{x}) = cT_1(\mathbf{x}). \]

Show that \(S\) and \(T_3\) are again linear transformations.

And now, let us return to matrix transformations.

3.1.4. Standard Matrix for a Linear Transformation#

We have seen that every matrix transformation is a linear transformation. In this subsection we will show that conversely every linear transformation \(T:\mathbb{R}^n \to \mathbb{R}^m\) can be represented by a matrix transformation.

The key to constructing a matrix that represents a given linear transformation lies in the following proposition.

Proposition 3.1.4

Suppose \(T:\mathbb{R}^n\rightarrow\mathbb{R}^m\) is a linear transformation. Then the following property holds: for each set of vectors \(\mathbf{x}_1, \ldots, \mathbf{x}_k\) in \(\mathbb{R}^n\) and each set of numbers \(c_1,\ldots,c_k\) in \(\mathbb{R}\):

(3.1.2)#\[T(c_1\mathbf{x}_1+c_2 \mathbf{x}_2+\ldots +c_k \mathbf{x}_k) = c_1T(\mathbf{x}_1)+c_2T(\mathbf{x}_2)+\ldots +c_kT(\mathbf{x}_k).\]

In words: for any linear transformation the image of a linear combination of vectors is equal to the linear combination of their images.

Proof of Proposition 3.1.4

Suppose \(T:\mathbb{R}^n\rightarrow\mathbb{R}^m\) is a linear transformation.

So we have

\[ \text{(i) } T(\mathbf{x}+\mathbf{y}) = T(\mathbf{x})+T(\mathbf{y}) \quad\text{and} \quad \text{(ii) } T(c\mathbf{x}) = c T(\mathbf{x}). \]

First apply rule (i) to split the term on the left in (3.1.2) into \(k\) terms:

\[\begin{split} \begin{array}{ccl} T(c_1\mathbf{x}_1+c_2 \mathbf{x}_2+\ldots +c_k \mathbf{x}_k) &=& T(c_1\mathbf{x}_1)+T(c_2 \mathbf{x}_2+\ldots +c_k \mathbf{x}_k) \\ &=& \quad \ldots \\ &=& T(c_1\mathbf{x}_1)+T(c_2 \mathbf{x}_2)+\ldots + T(c_k \mathbf{x}_k) \end{array} \end{split}\]

and then apply rule (ii) to each term.

Example 3.1.7

Suppose \(T: \mathbb{R}^3 \to \mathbb{R}^2\) is a linear transformation, and we know that for

\[\begin{split} \vect{a}_1 = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \quad \vect{a}_2 = \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}, \quad \vect{a}_3 = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} \end{split}\]

the images under \(T\) are given by

\[\begin{split} T(\vect{a}_1) = \vect{b}_1 = \begin{bmatrix} 1 \\ 2 \end{bmatrix}, \quad T(\vect{a}_2) = \vect{b}_2 = \begin{bmatrix} 3 \\ -1 \end{bmatrix}, \quad \text{and} \quad T(\vect{a}_3) = \vect{b}_3 = \begin{bmatrix} 2 \\ -2 \end{bmatrix}. \end{split}\]

Then for the vector

\[\begin{split} \vect{v} = \begin{bmatrix} 4 \\ 1 \\ -1 \end{bmatrix} = 3 \vect{a}_1 + 2 \vect{a}_2 - 1 \vect{a}_3 \end{split}\]

it follows that

\[\begin{split} T(\vect{v}) = 3 \vect{b}_1 + 2 \vect{b}_2 + (-1) \vect{b}_3 = 3 \begin{bmatrix} 1 \\ 2 \end{bmatrix} + 2 \begin{bmatrix} 3 \\ -1 \end{bmatrix} + (-1) \begin{bmatrix} 2 \\ -2 \end{bmatrix}= \begin{bmatrix} 7 \\ 6 \end{bmatrix}. \end{split}\]
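In NumPy the computation of \(T(\vect{v})\) from the given images is a one-liner:

```python
import numpy as np

# The given images of a1, a2, a3 under T:
b1, b2, b3 = np.array([1, 2]), np.array([3, -1]), np.array([2, -2])

# v = 3 a1 + 2 a2 - a3, so by linearity T(v) = 3 b1 + 2 b2 - b3:
print(3*b1 + 2*b2 - b3)   # [7 6]
```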

The central idea illustrated in Example 3.1.7, which is in fact a direct consequence of Proposition 3.1.4, is the following:

a linear transformation \(T\) from \(\mathbb{R}^n\) to \(\mathbb{R}^m\) is completely specified by the images \(T(\mathbf{a}_1), T(\mathbf{a}_2), \ldots, T(\mathbf{a}_n)\) of a set of vectors \(\{\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_n\}\) that spans \(\mathbb{R}^n\).

The simplest set of vectors that spans the whole space \(\mathbb{R}^n\) is the standard basis for \(\mathbb{R}^n\), which was introduced in the section Linear Combinations.

Recall that this is the set of vectors

(3.1.3)#\[\begin{split}\left(\mathbf{e}_1,\mathbf{e}_2, \ldots, \mathbf{e}_n\right)= \left(\begin{bmatrix} 1 \\ 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix}, \quad \cdots \quad , \begin{bmatrix} 0 \\ 0 \\ 0 \\ \vdots \\ 1 \end{bmatrix}\right).\end{split}\]

The next example gives an illustration of the above, and it also leads the way to the construction of a matrix for an arbitrary linear transformation.

Example 3.1.8

Suppose \(T\) is a linear transformation from \(\mathbb{R}^2\) to \(\mathbb{R}^2\) for which

\[\begin{split} T(\mathbf{e}_1) = \mathbf{a}_1 = \begin{bmatrix} 1 \\2 \end{bmatrix}, \quad T(\mathbf{e}_2) = \mathbf{a}_2 = \begin{bmatrix} 4 \\3 \end{bmatrix}. \end{split}\]

Then for an arbitrary vector

\[\begin{split} \mathbf{x} = \begin{bmatrix} x_1\\x_2 \end{bmatrix} = x_1 \begin{bmatrix} 1\\0 \end{bmatrix} + x_2 \begin{bmatrix} 0\\1 \end{bmatrix} = x_1\mathbf{e}_1 + x_2\mathbf{e}_2, \end{split}\]

it follows that

\[\begin{split} \begin{array}{rcl} T(\mathbf{x}) &=& x_1T(\mathbf{e}_1) + x_2T(\mathbf{e}_2) \\ &=& x_1 \begin{bmatrix} 1 \\2 \end{bmatrix} + x_2 \begin{bmatrix} 4 \\3 \end{bmatrix} \,\,=\,\,\, \begin{bmatrix} 1 &4 \\2 &3 \end{bmatrix}\mathbf{x}. \end{array} \end{split}\]

So we see that

\[ T(\mathbf{x}) = A \mathbf{x}, \quad\text{where} \quad A = \begin{bmatrix} T(\mathbf{e}_1) & T(\mathbf{e}_2) \end{bmatrix}. \]
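A short NumPy check of this construction, with an arbitrary test vector:

```python
import numpy as np

# The columns of A are the given images of the standard basis vectors:
Te1, Te2 = np.array([1, 2]), np.array([4, 3])
A = np.column_stack([Te1, Te2])

x = np.array([2, -1])                 # an arbitrary test vector
print(A @ x, x[0]*Te1 + x[1]*Te2)     # the same vector, twice
```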

Exercise 3.1.5

Show that the procedure of Example 3.1.8 applied to the linear transformation of Example 3.1.5 indeed yields the matrix

\[\begin{split} A = \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{bmatrix}. \end{split}\]

The reasoning of Example 3.1.8 can be generalized. This is the content of the next theorem.

Theorem 3.1.1

Each linear transformation \(T\) from \(\mathbb{R}^n\) to \(\mathbb{R}^m\) is a matrix transformation.

More specifically, if \(T: \mathbb{R}^n \to \mathbb{R}^m\) is linear, then for each \(\mathbf{x}\) in \(\mathbb{R}^n\)

(3.1.4)#\[T(\mathbf{x}) = A\mathbf{x}, \quad \text{where} \quad A = \begin{bmatrix} T(\mathbf{e}_1) & T(\mathbf{e}_2) & \ldots & T(\mathbf{e}_n) \end{bmatrix}.\]

Proof of Theorem 3.1.1

We can more or less copy the derivation in Example 3.1.8. First of all, any vector \(\mathbf{x}\) is a linear combination of the standard basis vectors:

\[\begin{split} \mathbf{x} = \begin{bmatrix} x_1\\x_2\\ \vdots \\ x_n \end{bmatrix} = x_1 \begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix} + x_2 \begin{bmatrix} 0 \\ 1 \\ \vdots \\ 0 \end{bmatrix} + \ldots + x_n \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{bmatrix}, \end{split}\]

i.e.,

\[ \mathbf{x} = x_1 \mathbf{e}_1 + x_2 \mathbf{e}_2 + \ldots + x_n \mathbf{e}_n. \]

From Proposition 3.1.4 it follows that

\[ T( \mathbf{x}) = x_1 T(\mathbf{e}_1) + x_2 T(\mathbf{e}_2) + \ldots + x_n T(\mathbf{e}_n). \]

The last expression is a linear combination of \(n\) vectors in \(\mathbb{R}^m\), and this can be written as a matrix-vector product:

\[ x_1 T(\mathbf{e}_1) + x_2 T(\mathbf{e}_2) + \ldots + x_n T(\mathbf{e}_n) = \begin{bmatrix} T(\mathbf{e}_1) & T(\mathbf{e}_2) & \ldots & T(\mathbf{e}_n) \end{bmatrix} \mathbf{x}. \]

Definition 3.1.3

For a linear transformation \(T:\mathbb{R}^n \to \mathbb{R}^m\), the matrix

(3.1.5)#\[\begin{bmatrix} T(\mathbf{e}_1) & T(\mathbf{e}_2) & \ldots & T(\mathbf{e}_n) \end{bmatrix}\]

is called the standard matrix of \(T\).

In the section Some Important Classes of Linear Transformations you will learn how to build standard matrices for rotations, reflections and other geometrical mappings. For now let us look at a more “algebraic” example.

Example 3.1.9

Consider the transformation

\[\begin{split} T: \begin{bmatrix} x \\ y \\ z \end{bmatrix} \mapsto \begin{bmatrix} x-y \\ 2y+3z \\ x+y-z \end{bmatrix}. \end{split}\]

It can be checked that the transformation has the two properties of a linear transformation according to the definition. Note that

\[\begin{split} T(\mathbf{e}_1) = \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}, \quad T(\mathbf{e}_2) = \begin{bmatrix} -1 \\ 2 \\ 1 \end{bmatrix}, \quad \text{and} \quad T(\mathbf{e}_3) = \begin{bmatrix} 0 \\ 3 \\ -1 \end{bmatrix}. \end{split}\]

So we find that the standard matrix \([T]\) of \(T\) is given by

\[\begin{split} [T] = \begin{bmatrix} 1 & -1 & 0 \\ 0 &2&3 \\ 1 & 1 & -1 \end{bmatrix}. \end{split}\]
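The same matrix can be produced mechanically in NumPy by applying \(T\) to the standard basis vectors, exactly as Theorem 3.1.1 prescribes:

```python
import numpy as np

def T(v):
    """The transformation of Example 3.1.9."""
    x, y, z = v
    return np.array([x - y, 2*y + 3*z, x + y - z])

# The columns of the standard matrix are T(e1), T(e2), T(e3):
M = np.column_stack([T(e) for e in np.eye(3, dtype=int)])
print(M)
# [[ 1 -1  0]
#  [ 0  2  3]
#  [ 1  1 -1]]
```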

Exercise 3.1.6

In the previous example we could have found the matrix just by inspection.

For the slightly different transformation \(T:\R^3 \to \R^3\) given by

\[\begin{split} T: \begin{bmatrix} x \\ y \\ z \end{bmatrix} \mapsto \begin{bmatrix} 3x-z \\ y+4z \\ x-y+2z \end{bmatrix}, \end{split}\]

can you fill in the blanks in the following equation?

\[\begin{split} \begin{bmatrix} 3x-z \\ y+4z \\ x-y+2z \end{bmatrix} = \begin{bmatrix} .. & .. & .. \\ .. & .. & .. \\ .. & .. & .. \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix}. \end{split}\]

If you can, you will have shown that \(T\) is a matrix transformation, and as a direct consequence \(T\) is a linear transformation.

To conclude, we consider an example that refers back to Proposition 3.1.3 and to a large extent paves the way for the product of two matrices.

Example 3.1.10

Suppose \(T:\mathbb{R}^2 \to \mathbb{R}^3\) and \(S:\mathbb{R}^3 \to \mathbb{R}^3\) are the matrix transformations given by

\[\begin{split} T(\mathbf{x}) = A\mathbf{x} = \begin{bmatrix} 1&2 \\ 3&4 \\ 1&0 \end{bmatrix} \mathbf{x} \quad \text{and} \quad S(\mathbf{y}) = B\mathbf{y} = \begin{bmatrix} 1&0 &1 \\ 1 & -1 &2 \\ -1&1&-3 \end{bmatrix} \mathbf{y}. \end{split}\]

From Proposition 3.1.3 we know that the composition \(S\circ T: \mathbb{R}^2 \to \mathbb{R}^3\) is also a linear transformation. What is the (standard) matrix of \(S\circ T\)?

For this we need the images of the unit vectors \(\mathbf{e}_1\) and \(\mathbf{e}_2\) in \(\mathbb{R}^2\). For each vector we first apply \(T\) and then \(S\). For \(\mathbf{e}_1\) this gives

\[\begin{split} T(\mathbf{e}_1) = \begin{bmatrix} 1&2 \\ 3&4 \\ 1&0 \end{bmatrix} \begin{bmatrix} 1\\0 \end{bmatrix} = \begin{bmatrix} 1 \\ 3 \\ 1 \end{bmatrix}, \end{split}\]

and then

\[\begin{split} S (T(\mathbf{e}_1)) = \begin{bmatrix} 1&0 &1 \\ 1 & -1 &2 \\ -1&1&-3 \end{bmatrix} \begin{bmatrix} 1 \\ 3 \\ 1 \end{bmatrix}= \begin{bmatrix} 2 \\ 0 \\ -1 \end{bmatrix}. \end{split}\]

Likewise for \(\mathbf{e}_2\):

\[\begin{split} T(\mathbf{e}_2) = \begin{bmatrix} 1&2 \\ 3&4 \\ 1&0 \end{bmatrix} \begin{bmatrix} 0\\1 \end{bmatrix} = \begin{bmatrix} 2\\4\\0 \end{bmatrix} \,\,\Longrightarrow\,\, S (T(\mathbf{e}_2)) = \begin{bmatrix} 1&0 &1 \\ 1 & -1 &2 \\ -1&1&-3 \end{bmatrix} \begin{bmatrix} 2 \\ 4 \\ 0 \end{bmatrix}= \begin{bmatrix} 2 \\ -2 \\ 2 \end{bmatrix}. \end{split}\]

So the matrix of \(S\circ T\) becomes

\[\begin{split} [S\circ T] \,= \, \begin{bmatrix} S\circ T(\mathbf{e_1})&S\circ T(\mathbf{e_2}) \end{bmatrix} \,\,=\,\, \begin{bmatrix} 2 &2 \\ 0&-2 \\ -1&2 \end{bmatrix}. \end{split}\]

In the section Matrix Operations we will define the product of two matrices precisely in such a way that

\[\begin{split} \begin{bmatrix} 1&0 &1 \\ 1 & -1 &2 \\ -1&1&-3 \end{bmatrix} \begin{bmatrix} 1&2 \\ 3&4 \\ 1&0 \end{bmatrix} = \begin{bmatrix} 2 &2 \\ 0&-2 \\ -1&2 \end{bmatrix}. \end{split}\]
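A NumPy check that the column-by-column construction above agrees with the matrix product:

```python
import numpy as np

A = np.array([[ 1, 2],
              [ 3, 4],
              [ 1, 0]])
B = np.array([[ 1,  0,  1],
              [ 1, -1,  2],
              [-1,  1, -3]])

# Matrix of S o T: first apply T (multiply by A), then S (multiply by B).
SoT = np.column_stack([B @ (A @ e) for e in np.eye(2, dtype=int)])
print(SoT)                            # [[ 2  2] [ 0 -2] [-1  2]]
print(np.array_equal(SoT, B @ A))     # True: the matrix of S after T is B A
```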

3.1.5. Grasple Exercises#

Grasple Exercise 3.1.1

https://embed.grasple.com/exercises/3f14573a-1d4c-4a4b-ae48-ccb168005702?id=70373

To specify the domain and the codomain of a linear transformation

Grasple Exercise 3.1.2

https://embed.grasple.com/exercises/b80d9889-bd46-45c6-a9cb-d056aa315232?id=70374

To find the size of the matrix of a linear transformation

Grasple Exercise 3.1.3

https://embed.grasple.com/exercises/be6a768d-c60d-4ed6-81a7-5dea71b4a1a5?id=70375

To find the image of two vectors under \(T(\vect{x}) = A\vect{x}\).

Grasple Exercise 3.1.4

https://embed.grasple.com/exercises/c8bb24f6-d357-4571-adb3-39ea0fa9e4ee?id=70395

For a linear map \(T\), find \(T(c\vect{u})\) and \(T(\vect{u}+\vect{v})\) if \(T(\vect{u})\) and \(T(\vect{v})\) are given.

Grasple Exercise 3.1.5

https://embed.grasple.com/exercises/93048f7c-b755-4445-a532-949f34136096?id=70398

For a linear map \(T:\R^2 \to \R^2\), find \(T((x_1,x_2))\) if \(T(\vect{e}_1)\) and \(T(\vect{e}_2)\) are given.

Grasple Exercise 3.1.6

https://embed.grasple.com/exercises/2af6559f-8871-494d-abce-d4263d530c69?id=70381

Find all vectors \(\vect{w}\) for which \(T(\vect{w}) = \vect{u}\).

Grasple Exercise 3.1.7

https://embed.grasple.com/exercises/ce6e4a52-c985-43ee-92cb-2762a467ac5a?id=70383

Find vectors \(\vect{w}\) for which \(T(\vect{w}) = \vect{u}\).

Grasple Exercise 3.1.8

https://embed.grasple.com/exercises/37b6bd46-8cfc-4c98-a5e8-53aa41c87dcf?id=70384

Find vectors \(\vect{w}\) for which \(T(\vect{w}) = \vect{u}\).

Grasple Exercise 3.1.9

https://embed.grasple.com/exercises/c5b2a642-fd50-43f6-9346-c37a0ffe1a40?id=70386

Find vectors \(\vect{w}\) for which \(T(\vect{w}) = \vect{u}\).

Grasple Exercise 3.1.10

https://embed.grasple.com/exercises/c3d009c0-62d6-4ae3-8ca1-04a5d2730455?id=70406

To show that a given transformation is non-linear.

Grasple Exercise 3.1.11

https://embed.grasple.com/exercises/b9a4b128-f2c2-4612-a7f5-271c4e69aa70?id=70418

Finding an image and a pre-image of \(T:\R^2 \to \R^2\) using a picture.

Grasple Exercise 3.1.12

https://embed.grasple.com/exercises/4058e54a-74f2-414e-9693-420abbc62677?id=70391

To give a geometric description of \(T: \vect{x} \mapsto A\vect{x}\).

Grasple Exercise 3.1.13

https://embed.grasple.com/exercises/990bf561-629e-430f-b8d0-e757c63fe15c?id=70392

To give a geometric description of \(T: \vect{x} \mapsto A\vect{x}\).

Grasple Exercise 3.1.14

https://embed.grasple.com/exercises/4e5d3f55-9257-4023-9739-5df0a1a9f277?id=70410

To find the matrix of the transformation that sends \((x,y)\) to \(x\vect{a}_1 + y\vect{a}_2\).

Grasple Exercise 3.1.15

https://embed.grasple.com/exercises/9efa96e2-483d-4b2c-a58a-ba197bc09a81?id=70411

To find the matrix of the transformation that sends \((x,y)\) to \(x\vect{a}_1 + y\vect{a}_2\).

Grasple Exercise 3.1.16

https://embed.grasple.com/exercises/729cba57-72d1-4d54-8cf9-c9946952bf9d?id=70412

To rewrite \(T:\R^3 \to \R^2\) to standard form.

Grasple Exercise 3.1.17

https://embed.grasple.com/exercises/b4bb3730-f14c-4a60-a8b8-6b895cf93ac5?id=70413

To find the standard matrix for \(T:\R^4 \to \R\).

Grasple Exercise 3.1.18

https://embed.grasple.com/exercises/34bb6386-7e7c-411b-83a1-09bbaf1106c5?id=70415

To find the standard matrix for \(T:\R^2 \to \R^2\) if \(T(\vect{v}_1)\) and \(T(\vect{v}_2)\) are given.

Grasple Exercise 3.1.19

https://embed.grasple.com/exercises/ce8ba17c-0a17-4d5e-b4b7-5c277c7e8df8?id=70416

To find the standard matrix for \(T:\R^2 \to \R^3\) if \(T(\vect{v}_1)\) and \(T(\vect{v}_2)\) are given.

Grasple Exercise 3.1.20

https://embed.grasple.com/exercises/2de4f8d1-ab3d-4d3a-94e4-5e414e2da3d9?id=70372

If \(T(\vect{0}) = \vect{0}\), is \(T\) (always) linear?

Grasple Exercise 3.1.21

https://embed.grasple.com/exercises/3f992e7a-19e3-4b83-8d90-db86e323ea94?id=69296

To show that \(T(\vect{0}) = \vect{0}\) for a linear transformation.

Grasple Exercise 3.1.22

https://embed.grasple.com/exercises/94d618e0-de21-491c-ad44-8e29974e0303?id=71098

(T/F) If \(\{\vect{v}_1,\vect{v}_2,\vect{v}_3\}\) is linearly dependent, then \(\{T(\vect{v}_1),T(\vect{v}_2),T(\vect{v}_3)\}\) is also linearly dependent?

Grasple Exercise 3.1.23

https://embed.grasple.com/exercises/f983b627-10c2-4dd6-a273-2a33e99d0ded?id=71101

(T/F) If \(\{\vect{v}_1,\vect{v}_2,\vect{v}_3\}\) is linearly independent, then \(\{T(\vect{v}_1),T(\vect{v}_2),T(\vect{v}_3)\}\) is also linearly independent?