3.1. Linear transformations
3.1.1. Introduction
Until now we have used matrices in the context of linear systems. The equation
\[ A\mathbf{x} = \mathbf{b}, \]
where \(A\) is an \(m \times n\)-matrix, is just a concise way to write down a system of \(m\) linear equations in \(n\) unknowns. A different way to look at this matrix equation is to consider it as an input-output system: the left-hand side \(A\mathbf{x}\) can be seen as a mapping that sends an “input” \(\mathbf{x}\) to an “output” \(\mathbf{y}= A\mathbf{x}\).
For instance, in computer graphics the points describing a 3D object typically have to be converted to points in 2D, so that they can be visualised on a screen. Or, in a dynamical system, a matrix \(A\) may describe how a system evolves from a “state” \(\mathbf{x}_{k}\) at time \(k\) to a state \(\mathbf{x}_{k+1}\) at time \(k+1\) via
\[ \mathbf{x}_{k+1} = A\mathbf{x}_{k}. \]
A “state” may be anything ranging from a set of particles at certain positions, a set of pixels describing a minion, or concentrations of chemical substances in a reactor tank, to population sizes of different species. Thinking mathematically, we would describe such an input-output interpretation as a transformation (or: function, map, mapping, operator, etc.).
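We will see that these matrix transformations have two characteristic properties,
\[ A(\mathbf{x}+\mathbf{y}) = A\mathbf{x} + A\mathbf{y} \quad\text{and}\quad A(c\mathbf{x}) = c\,A\mathbf{x}, \]
which make them the protagonists of the more general linear algebra concept of a linear transformation.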
3.1.2. Matrix transformations
Let \(A\) be an \(m\times n\)-matrix. We can in a natural way associate a transformation \(T_A:\mathbb{R}^n \to \mathbb{R}^m\) to the matrix \(A\).
Definition 3.1.1
The transformation \(T_A\) corresponding to the \(m\times n\)-matrix \(A\) is the mapping defined by
\[ T_A(\mathbf{x}) = A\mathbf{x}, \]
where \(\mathbf{x} \in \mathbb{R}^n\).
We call such a mapping a matrix transformation. Conversely we say that the matrix \(A\) represents the transformation \(T_A\).
As a first example consider the following.
Example 3.1.1
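The transformation corresponding to the matrix \(A = \begin{pmatrix} 1 & 2 & 0\\ 1 & 2 & 1 \end{pmatrix}\) is defined by
\[ T_A\begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 1 & 2 & 0\\ 1 & 2 & 1 \end{pmatrix}\begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} x+2y \\ x+2y+z \end{pmatrix}. \]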
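We have, for instance,
\[ T_A\begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} = \begin{pmatrix} 3 \\ 4 \end{pmatrix} \quad\text{and}\quad T_A\begin{pmatrix} 1 \\ 0 \\ -1 \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \end{pmatrix}. \]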
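According to the definition of the matrix-vector product we can also write
\[ A\begin{pmatrix} x \\ y \\ z \end{pmatrix} = x\begin{pmatrix} 1 \\ 1 \end{pmatrix} + y\begin{pmatrix} 2 \\ 2 \end{pmatrix} + z\begin{pmatrix} 0 \\ 1 \end{pmatrix}. \tag{3.1.1} \]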
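The two ways of computing \(A\mathbf{x}\) in this example, row by row and as a linear combination of the columns, can be compared numerically. The following is a minimal sketch in plain Python; the function names `mat_vec` and `column_combination` and the sample input are chosen here for illustration and do not come from the text.

```python
def mat_vec(A, x):
    """Row-wise product: entry i is the dot product of row i of A with x."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]


def column_combination(A, x):
    """The same product, computed as x_1*a_1 + ... + x_n*a_n over the columns of A."""
    m, n = len(A), len(A[0])
    result = [0] * m
    for j in range(n):          # loop over the columns a_j
        for i in range(m):      # add x_j * a_j entry by entry
            result[i] += x[j] * A[i][j]
    return result


# The matrix of Example 3.1.1.
A = [[1, 2, 0],
     [1, 2, 1]]

x = [1, 1, 1]  # sample input, chosen for illustration
print(mat_vec(A, x))             # [3, 4]
print(column_combination(A, x))  # [3, 4]
```

Both functions return the same vector for every input, which is exactly what the column form of the matrix-vector product expresses.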
We recall that for a transformation \(T\) from a domain \(D\) to a codomain \(E\), the range \(R= R_T\) is defined as the set of all images of elements of \(D\) in \(E\):
\[ R_T = \{\, T(\mathbf{x}) \mid \mathbf{x} \in D \,\}. \]
Remark 3.1.1
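From Equation (3.1.1) it is clear that the range of the matrix transformation in Example 3.1.1 consists of all linear combinations of the three columns of \(A\):
\[ R_{T_A} = \left\{\, x\begin{pmatrix} 1 \\ 1 \end{pmatrix} + y\begin{pmatrix} 2 \\ 2 \end{pmatrix} + z\begin{pmatrix} 0 \\ 1 \end{pmatrix} \;\middle|\; x, y, z \in \mathbb{R} \,\right\} = \operatorname{Span}\left\{ \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \begin{pmatrix} 2 \\ 2 \end{pmatrix}, \begin{pmatrix} 0 \\ 1 \end{pmatrix} \right\}. \]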
In a later section (Section 4.1) we will call this the column space of the matrix \(A\).
The first example leads to a first property of matrix transformations:
Proposition 3.1.1
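Suppose
\[ A = \begin{pmatrix} \mathbf{a}_1 & \mathbf{a}_2 & \cdots & \mathbf{a}_n \end{pmatrix} \]
is an \(m\times n\)-matrix.
Then the range of the matrix transformation corresponding to \(A\) is the span of the columns of \(A\):
\[ R_{T_A} = \operatorname{Span}\{\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_n\}. \]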
Example 3.1.2
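The matrix
\[ A = \begin{pmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{pmatrix} \]
leads to the transformation
\[ T\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} x \\ y \\ 0 \end{pmatrix}. \]
This transformation “embeds” the plane \(\mathbb{R}^2\) into the space \(\mathbb{R}^3\), as depicted in Figure 3.1.1.
Fig. 3.1.1 \(T\) embeds \(\mathbb{R}^2\) into \(\mathbb{R}^3\).
The range of this transformation is the span of the two vectors
\[ \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} \quad\text{and}\quad \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}, \]
which is the \(xy\)-plane in \(\mathbb{R}^3\).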
For \(2\times2\) and \(3\times3\)-matrices the transformations often have a geometric interpretation, as the following example illustrates.
Example 3.1.3
The transformation corresponding to the matrix
\[ A = \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix} \]
is the mapping
\[ T\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} x+y \\ 0 \end{pmatrix}. \]
First we observe that the range of this transformation consists of all multiples of the vector \( \begin{pmatrix} 1 \\ 0 \end{pmatrix} \), i.e. the \(x\)-axis in the plane.
Second, let us find the set of points/vectors that is mapped to an arbitrary point \(\begin{pmatrix} c \\ 0 \end{pmatrix}\) in the range. For this we solve
\[ \begin{pmatrix} x+y \\ 0 \end{pmatrix} = \begin{pmatrix} c \\ 0 \end{pmatrix}. \]
The points whose coordinates satisfy this equation all lie on the line described by the equation
\[ x + y = c. \]
So what the mapping does is to send all points on a line \(\mathcal{L}:x + y = c\) to the point \((c,0)\), which is the intersection of this line with the \(x\)-axis. An alternative way to describe it: it is the skew projection, in the direction \(\begin{pmatrix} 1 \\ -1 \end{pmatrix}\), onto the \(x\)-axis. See Figure 3.1.2.
Fig. 3.1.2 The transformation of Example 3.1.3.
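The claim that every point on the line \(x + y = c\) is sent to the same point \((c,0)\) can be spot-checked with a few sample points. This is only a small sketch, assuming the mapping \((x,y) \mapsto (x+y,0)\) described in the example; the function name `skew_projection`, the value of `c` and the sample points are illustrative choices.

```python
def skew_projection(point):
    """Send (x, y) to (x + y, 0): projection onto the x-axis along direction (1, -1)."""
    x, y = point
    return (x + y, 0)


c = 3  # sample value, chosen for illustration

# A few points (t, c - t) on the line x + y = c.
points_on_line = [(t, c - t) for t in (-2, 0, 1, 3, 10)]

# Collect the distinct images; a set keeps each image only once.
images = {skew_projection(p) for p in points_on_line}
print(images)  # {(3, 0)}
```

A finite sample cannot replace the algebraic argument, but it illustrates that the whole line collapses to a single point on the \(x\)-axis.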
Grasple exercise 3.1.1
Finding out which vectors are in the range of a linear transformation.
We close this subsection with an example of a matrix transformation representing a very elementary dynamical system.
Example 3.1.4
Consider a model with two cities between which over a fixed period of time migrations take place. Say in a period of ten years 90% of the inhabitants in city \(A\) stay in city \(A\) and 10% move to city \(B\). From city \(B\) 20% of the citizens move to \(A\), so 80% stay in city \(B\).
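The following table contains the relevant statistics:
\[\begin{array}{l|cc} & \text{from city } A & \text{from city } B \\ \hline \text{to city } A & 90\% & 20\% \\ \text{to city } B & 10\% & 80\% \end{array}\]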
For instance, if at time \(0\) the population in city \(A\) amounts to \(50\) (thousand) and \(100\) (thousand) people live in city \(B\), then at the end of one period the population in city \(A\) amounts to
\[ 0.9\cdot 50 + 0.2\cdot 100 = 65 \text{ (thousand)}. \]
Likewise, for city \(B\) we find \(0.1\cdot 50 + 0.8\cdot 100 = 85\) (thousand).
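If we denote the population sizes after \(k\) periods by a vector
\[ \mathbf{x}_k = \begin{pmatrix} a_k \\ b_k \end{pmatrix}, \]
where \(a_k\) and \(b_k\) are the populations of city \(A\) and city \(B\) respectively, it follows that
\[ \mathbf{x}_{k+1} = \begin{pmatrix} 0.9\,a_k + 0.2\,b_k \\ 0.1\,a_k + 0.8\,b_k \end{pmatrix} = M\mathbf{x}_k, \quad\text{where}\quad M = \begin{pmatrix} 0.9 & 0.2 \\ 0.1 & 0.8 \end{pmatrix}. \]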
The \(M\) stands for migration matrix.
Obviously this model can be generalised to a “world” with any number of cities.
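To see the model at work, the migration step can be repeated a few times starting from the populations used above. The following is a rough sketch: the entries of `M` are read off from the percentages in the example (column 1 for people leaving or staying in city \(A\), column 2 for city \(B\)), while the helper name `step` and the number of periods are illustrative assumptions.

```python
# Migration matrix assembled from the percentages in the example:
# first row = share ending up in city A, second row = share ending up in city B.
M = [[0.9, 0.2],
     [0.1, 0.8]]


def step(M, x):
    """One period of migration: return the matrix-vector product M x."""
    return [sum(m_ij * x_j for m_ij, x_j in zip(row, x)) for row in M]


x = [50.0, 100.0]  # populations (in thousands) of city A and city B at time 0
for k in range(1, 4):
    x = step(M, x)
    # Rounded only for display; the total population stays 150 (thousand).
    print(k, [round(value, 2) for value in x])
```

After one period the sketch reproduces the value of 65 (thousand) for city \(A\) computed by hand above. Since each column of `M` sums to 1, no inhabitants are created or lost in a step.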
3.1.3. Linear transformations
In the previous section we saw that the matrix transformation \(\mathbf{y}=A\mathbf{x}\) can also be seen as a mapping \(T(\mathbf{x}) = A\mathbf{x}\).
This mapping has two characteristic properties on which we will focus in this section.
Definition 3.1.2
A linear transformation is a function \(T\) from \(\mathbb{R}^n\) to \(\mathbb{R}^m\) that has the following properties:
- (i) For all vectors \(\mathbf{v}_1,\,\mathbf{v}_2\) in \(\mathbb{R}^n\): \[ T(\mathbf{v}_1+\mathbf{v}_2) = T(\mathbf{v}_1) + T(\mathbf{v}_2). \]
- (ii) For all vectors \(\mathbf{v}\) in \(\mathbb{R}^n\) and all scalars \(c\) in \(\mathbb{R}\): \[ T(c\mathbf{v}) = c\,T(\mathbf{v}). \]
Exercise 3.1.1
Show that a linear transformation from \(\mathbb{R}^n\) to \(\mathbb{R}^m\) always sends the zero vector \(\mathbf{0}_n\) in \(\mathbb{R}^n\) to the zero vector \(\mathbf{0}_m\) in \(\mathbb{R}^m\).
Thus, if \( T:\mathbb{R}^n \to\mathbb{R}^m\) is a linear transformation, then \(T(\mathbf{0}_n) = \mathbf{0}_m\).
Solution to Exercise 3.1.1
If \( T:\mathbb{R}^n \to\mathbb{R}^m\) is linear, and \(\mathbf{v}\) is any vector in \(\mathbb{R}^n\), then \(\mathbf{0}_n = 0\mathbf{v}\). From the second property in Definition 3.1.2 it follows that
\[ T(\mathbf{0}_n) = T(0\mathbf{v}) = 0\,T(\mathbf{v}) = \mathbf{0}_m. \]
Example 3.1.5
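Consider the map \(T:\mathbb{R}^2\rightarrow\mathbb{R}^3\) that sends each vector \(\begin{pmatrix} x \\ y \end{pmatrix}\) in \(\mathbb{R}^2\) to the vector \(\begin{pmatrix} x \\ y \\ 0 \end{pmatrix}\) in \(\mathbb{R}^3\). Let us check that this is a linear map.
For that, we need to check the two properties in the definition.
For property (i) we take two arbitrary vectors
\[ \mathbf{v}_1 = \begin{pmatrix} x_1 \\ y_1 \end{pmatrix} \quad\text{and}\quad \mathbf{v}_2 = \begin{pmatrix} x_2 \\ y_2 \end{pmatrix} \]
and see:
\[ T(\mathbf{v}_1+\mathbf{v}_2) = T\begin{pmatrix} x_1+x_2 \\ y_1+y_2 \end{pmatrix} = \begin{pmatrix} x_1+x_2 \\ y_1+y_2 \\ 0 \end{pmatrix}. \]
This last vector indeed equals
\[ \begin{pmatrix} x_1 \\ y_1 \\ 0 \end{pmatrix} + \begin{pmatrix} x_2 \\ y_2 \\ 0 \end{pmatrix} = T(\mathbf{v}_1) + T(\mathbf{v}_2). \]
Similarly, for the second property, given any scalar \(c\),
\[ T(c\mathbf{v}_1) = T\begin{pmatrix} cx_1 \\ cy_1 \end{pmatrix} = \begin{pmatrix} cx_1 \\ cy_1 \\ 0 \end{pmatrix} = c\begin{pmatrix} x_1 \\ y_1 \\ 0 \end{pmatrix} = c\,T(\mathbf{v}_1). \]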
So indeed \(T\) has the two properties of a linear transformation.
Example 3.1.6
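Consider the mapping \(T:\mathbb{R}^2\rightarrow\mathbb{R}^2\) that sends each vector \( \begin{pmatrix} x \\ y \end{pmatrix}\) in \(\mathbb{R}^2\) to the vector \(\begin{pmatrix} x+y \\ xy \end{pmatrix}\):
\[ T\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} x+y \\ xy \end{pmatrix}. \]
This mapping is not a linear transformation. For instance,
\[ T\left(\begin{pmatrix} 1 \\ 0 \end{pmatrix} + \begin{pmatrix} 0 \\ 1 \end{pmatrix}\right) = T\begin{pmatrix} 1 \\ 1 \end{pmatrix} = \begin{pmatrix} 2 \\ 1 \end{pmatrix}, \]
whereas
\[ T\begin{pmatrix} 1 \\ 0 \end{pmatrix} + T\begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \end{pmatrix} + \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 2 \\ 0 \end{pmatrix}. \]
The second requirement of a linear transformation is violated as well:
\[ T\left(2\begin{pmatrix} 1 \\ 1 \end{pmatrix}\right) = T\begin{pmatrix} 2 \\ 2 \end{pmatrix} = \begin{pmatrix} 4 \\ 4 \end{pmatrix} \neq \begin{pmatrix} 4 \\ 2 \end{pmatrix} = 2\,T\begin{pmatrix} 1 \\ 1 \end{pmatrix}. \]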
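A quick numerical spot check can expose a non-linear map like this one. The sketch below tests the two defining properties on sample vectors; the helper names (`add`, `scale`, `is_additive_on`, `is_homogeneous_on`) and the sample vectors are chosen here for illustration. One failed check proves that a map is not linear, whereas passed checks on a few samples prove nothing.

```python
def add(u, v):
    """Componentwise sum of two vectors given as tuples."""
    return tuple(a + b for a, b in zip(u, v))


def scale(c, v):
    """Scalar multiple c*v of a vector given as a tuple."""
    return tuple(c * a for a in v)


def is_additive_on(T, u, v):
    """Property (i) for one pair: T(u + v) == T(u) + T(v)."""
    return T(add(u, v)) == add(T(u), T(v))


def is_homogeneous_on(T, c, v):
    """Property (ii) for one scalar and vector: T(c v) == c T(v)."""
    return T(scale(c, v)) == scale(c, T(v))


def T(v):
    """The map of Example 3.1.6: (x, y) -> (x + y, x*y)."""
    x, y = v
    return (x + y, x * y)


def embed(v):
    """The map of Example 3.1.5: (x, y) -> (x, y, 0)."""
    x, y = v
    return (x, y, 0)


u, v = (1, 2), (3, 1)  # sample vectors, chosen for illustration
print(is_additive_on(T, u, v), is_homogeneous_on(T, 3, u))          # False False
print(is_additive_on(embed, u, v), is_homogeneous_on(embed, 3, u))  # True True
```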
Exercise 3.1.2
Let \(\mathbf{p}\) be a non-zero vector in \(\mathbb{R}^2\). Is the translation
\[ T:\mathbb{R}^2 \to \mathbb{R}^2, \quad T(\mathbf{x}) = \mathbf{x} + \mathbf{p} \]
a linear transformation?
Solution to Exercise 3.1.2
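The transformation defined by \(T(\mathbf{x}) = \mathbf{x} + \mathbf{p}\), with \(\mathbf{p}\neq \mathbf{0}\), has neither of the two properties of a linear transformation.
For instance, since \(\mathbf{p}+\mathbf{p} \neq \mathbf{p}\),
\[ T(\mathbf{x}+\mathbf{y}) = \mathbf{x}+\mathbf{y}+\mathbf{p} \neq (\mathbf{x}+\mathbf{p}) + (\mathbf{y}+\mathbf{p}) = T(\mathbf{x}) + T(\mathbf{y}). \]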
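Note that Example 3.1.5 was in fact the first example of a matrix transformation in the Subsection Introduction:
\[ \begin{pmatrix} x \\ y \\ 0 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix}. \]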
As we will see: any linear transformation from \(\mathbb{R}^n\) to \(\mathbb{R}^m\) is a matrix transformation. The converse is true as well. This is the content of the next proposition.
Proposition 3.1.2
Each matrix transformation is a linear transformation.
Proof of Proposition 3.1.2
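This is a direct consequence of the two properties of the matrix-vector product (Proposition 2.4.2) that say
\[ A(\mathbf{x}+\mathbf{y}) = A\mathbf{x} + A\mathbf{y} \quad\text{and}\quad A(c\mathbf{x}) = c\,A\mathbf{x}. \]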
Proposition 3.1.3
Suppose \(T: \mathbb{R}^n\to\mathbb{R}^m\) and \(S:\mathbb{R}^m\to\mathbb{R}^p\) are linear transformations. Then the transformation \(S\circ T:\mathbb{R}^n\to\mathbb{R}^p \) defined by
\[ (S\circ T)(\mathbf{x}) = S\big(T(\mathbf{x})\big) \]
is a linear transformation from \(\mathbb{R}^n\) to \(\mathbb{R}^p\).
Remark 3.1.2
The transformation \(S\circ T\) is called the composition of the two transformations \(S\) and \(T\). It is best read as “\(S\) after \(T\)”.
Proof of Proposition 3.1.3
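Suppose that
\[ T(\mathbf{x}+\mathbf{y}) = T(\mathbf{x}) + T(\mathbf{y}) \quad\text{and}\quad T(c\mathbf{x}) = c\,T(\mathbf{x}) \quad \text{for all } \mathbf{x},\mathbf{y} \in \mathbb{R}^n,\ c \in \mathbb{R}, \]
and likewise for \(S\). Then
\[ (S\circ T)(\mathbf{x}+\mathbf{y}) = S\big(T(\mathbf{x}+\mathbf{y})\big) = S\big(T(\mathbf{x}) + T(\mathbf{y})\big) = S\big(T(\mathbf{x})\big) + S\big(T(\mathbf{y})\big) = (S\circ T)(\mathbf{x}) + (S\circ T)(\mathbf{y}) \]
and
\[ (S\circ T)(c\mathbf{x}) = S\big(T(c\mathbf{x})\big) = S\big(c\,T(\mathbf{x})\big) = c\,S\big(T(\mathbf{x})\big) = c\,(S\circ T)(\mathbf{x}). \]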
Hence \(S\circ T\) satisfies the two requirements of a linear transformation.
In words: the composition/concatenation of two linear transformations is itself a linear transformation.
Exercise 3.1.3
There are other ways to combine linear transformations.
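The sum \(S = T_1 + T_2\) of two linear transformations \(T_1,T_2: \mathbb{R}^n \to \mathbb{R}^m\) is defined as follows:
\[ S(\mathbf{x}) = T_1(\mathbf{x}) + T_2(\mathbf{x}). \]
And the (scalar) multiple \(T_3 = cT_1\) is the transformation
\[ T_3(\mathbf{x}) = c\,T_1(\mathbf{x}). \]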
Show that \(S\) and \(T_3\) are again linear transformations.
Solution to Exercise 3.1.3
The properties of the linear transformations \(T_1\) and \(T_2\) carry over to \(S\) and \(T_3\) in the following way. We check the properties one by one.
For the sum \(S\) we have
- For all vectors \(\mathbf{v}_1,\,\mathbf{v}_2\) in \(\mathbb{R}^n\) \[\begin{split} \begin{array}{rcl} S(\mathbf{v}_1+\mathbf{v}_2) &=& T_1(\mathbf{v}_1+\mathbf{v}_2) + T_2(\mathbf{v}_1+\mathbf{v}_2)\\ &=& T_1(\mathbf{v}_1) + T_1(\mathbf{v}_2) + T_2(\mathbf{v}_1) + T_2(\mathbf{v}_2)\\ &=& T_1(\mathbf{v}_1) + T_2(\mathbf{v}_1) + T_1(\mathbf{v}_2) + T_2(\mathbf{v}_2)\\ &=& S(\mathbf{v}_1)+S(\mathbf{v}_2). \end{array} \end{split}\]
- Likewise, for all vectors \(\mathbf{v}\) in \(\mathbb{R}^n\) and all scalars \(c\) in \(\mathbb{R}\): \[\begin{split} \begin{array}{rcl} S(c\mathbf{v}) &=& T_1(c\mathbf{v})+T_2(c\mathbf{v}) \\ &=& cT_1(\mathbf{v})+cT_2(\mathbf{v}) \\ &=& c \big(T_1(\mathbf{v})+T_2(\mathbf{v})\big)\\ &=& cS(\mathbf{v}). \end{array} \end{split}\]
The linearity of \(T_3\) is verified in a similar manner.
Exercise 3.1.4
This exercise sheds some light on the geometry behind linear transformations. We restrict ourselves to linear transformations in the plane, but the ideas can be generalised.
Suppose \(T\) is a linear transformation from \(\mathbb{R}^2\) to \(\mathbb{R}^2\). Show that the image of a line segment \(\mathcal{S}\) between two points \(P\) and \(Q\) is either a line segment or a single point.
Solution to Exercise 3.1.4
Suppose \(\mathbf{p} = \overrightarrow{OP}\) and \(\mathbf{q} = \overrightarrow{OQ}\) are the position vectors of \(P\) and \(Q\), and let \(\mathbf{r} = \overrightarrow{PQ} = \mathbf{q} - \mathbf{p}\). Then the line segment between \(\mathbf{p}\) and \(\mathbf{q}\) consists of all vectors \(\mathbf{v} = \mathbf{p} + t\mathbf{r}\), where \(t\) runs from \(0\) to \(1\).
By the linearity of \(T\) it follows that the image of \(\mathcal{S}\) consists of all vectors \(T(\mathbf{p} + t\mathbf{r}) = T(\mathbf{p}) + tT(\mathbf{r})\), with \(0 \leq t \leq 1\).
This describes the line segment between the points \(T(\mathbf{p})\) and \(T(\mathbf{p}) + T(\mathbf{r})\). Note that \(T(\mathbf{p}) + T(\mathbf{r}) = T(\mathbf{p} + \mathbf{r}) = T(\mathbf{q})\). If by any chance \(T(\mathbf{r}) = \mathbf{0}\), the segment ‘shrinks’ to a single point.
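The argument of this solution can be illustrated numerically: the image of the point at parameter \(t\) on the segment from \(P\) to \(Q\) should be the point at the same parameter \(t\) on the segment from \(T(P)\) to \(T(Q)\). Below is a minimal sketch under stated assumptions: `sample_map` is a hypothetical linear map invented for this illustration, `skew_projection` is the map \((x,y)\mapsto(x+y,0)\), and exact fractions are used for \(t\) to avoid rounding issues.

```python
from fractions import Fraction


def point_on_segment(p, q, t):
    """Return p + t*(q - p), the point at parameter t on the segment from p to q."""
    return tuple(p_i + t * (q_i - p_i) for p_i, q_i in zip(p, q))


def images_match(T, p, q, ts):
    """Check that T(p + t r) lies at parameter t between T(p) and T(q), for each t in ts."""
    return all(
        tuple(T(point_on_segment(p, q, t))) == point_on_segment(T(p), T(q), t)
        for t in ts
    )


def sample_map(v):
    """Hypothetical linear map, for illustration only: (x, y) -> (2x + y, x - y)."""
    x, y = v
    return (2 * x + y, x - y)


def skew_projection(v):
    """The map (x, y) -> (x + y, 0)."""
    x, y = v
    return (x + y, 0)


ts = [Fraction(k, 4) for k in range(5)]  # t = 0, 1/4, 1/2, 3/4, 1
p, q = (1, 2), (4, -1)                   # sample end points; q - p = (3, -3)

print(images_match(sample_map, p, q, ts))        # True
print(images_match(skew_projection, p, q, ts))   # True
print(skew_projection(p) == skew_projection(q))  # True: the segment shrinks to a point
```

For `skew_projection` the direction \(\mathbf{q}-\mathbf{p} = (3,-3)\) is mapped to the zero vector, so this pair of end points shows the degenerate case in which the image is a single point.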
And now, let us return to matrix transformations.
3.1.4. Standard matrix for a linear transformation
We have seen that every matrix transformation is a linear transformation. In this subsection we will show that conversely every linear transformation \(T:\mathbb{R}^n \to \mathbb{R}^m\) can be represented by a matrix transformation.
The key to construct a matrix that represents a given linear transformation lies in the following proposition.
Proposition 3.1.4
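Suppose \(T:\mathbb{R}^n\rightarrow\mathbb{R}^m\) is a linear transformation. Then the following property holds: for each set of vectors \(\mathbf{x}_1, \ldots, \mathbf{x}_k\) in \(\mathbb{R}^n\) and each set of numbers \(c_1,\ldots,c_k\) in \(\mathbb{R}\):
\[ T(c_1\mathbf{x}_1 + c_2\mathbf{x}_2 + \cdots + c_k\mathbf{x}_k) = c_1T(\mathbf{x}_1) + c_2T(\mathbf{x}_2) + \cdots + c_kT(\mathbf{x}_k). \tag{3.1.2} \]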
In words: for any linear transformation the image of a linear combination of vectors is equal to the linear combination of their images.
Proof of Proposition 3.1.4
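Suppose \(T:\mathbb{R}^n\rightarrow\mathbb{R}^m\) is a linear transformation.
So we have
\[ \text{(i)}\quad T(\mathbf{x}+\mathbf{y}) = T(\mathbf{x}) + T(\mathbf{y}) \qquad\text{and}\qquad \text{(ii)}\quad T(c\mathbf{x}) = c\,T(\mathbf{x}). \]
First apply rule (i) to split the term on the left in Equation (3.1.2) into \(k\) terms:
\[ T(c_1\mathbf{x}_1 + c_2\mathbf{x}_2 + \cdots + c_k\mathbf{x}_k) = T(c_1\mathbf{x}_1) + T(c_2\mathbf{x}_2) + \cdots + T(c_k\mathbf{x}_k), \]
and then apply rule (ii) to each term:
\[ T(c_1\mathbf{x}_1) + T(c_2\mathbf{x}_2) + \cdots + T(c_k\mathbf{x}_k) = c_1T(\mathbf{x}_1) + c_2T(\mathbf{x}_2) + \cdots + c_kT(\mathbf{x}_k). \]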
Example 3.1.7
Suppose \(T: \mathbb{R}^3 \to \mathbb{R}^2\) is a linear transformation, and we know that for
the images under \(T\) are given by
Then for the vector
it follows that
The central idea illustrated in Example 3.1.7, which is in fact a direct consequence of Proposition 3.1.4, is the following:
a linear transformation \(T\) from \(\mathbb{R}^n\) to \(\mathbb{R}^m\) is completely specified by the images \( T(\mathbf{a}_1), T(\mathbf{a}_2), \ldots , T(\mathbf{a}_n)\) of a set of vectors \(\{\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_n\}\) that spans \(\mathbb{R}^n\).
The simplest set of vectors that spans the whole space \(\mathbb{R}^n\) is the standard basis for \(\mathbb{R}^n\) which was introduced in the Section Linear combinations.
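Recall that this is the set of vectors
\[ \mathbf{e}_1 = \begin{pmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix}, \quad \mathbf{e}_2 = \begin{pmatrix} 0 \\ 1 \\ \vdots \\ 0 \end{pmatrix}, \quad \ldots, \quad \mathbf{e}_n = \begin{pmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{pmatrix}. \]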
The next example gives an illustration of the above, and it also leads the way to the construction of a matrix for an arbitrary linear transformation.
Example 3.1.8
Suppose \(T\) is a linear transformation from \(\mathbb{R}^2\) to \(\mathbb{R}^2\) for which
Then for an arbitrary vector
it follows that
So we see that
Exercise 3.1.5
Show that the procedure of Example 3.1.8 applied to the linear transformation of Example 3.1.5 indeed yields the matrix
\[ \begin{pmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{pmatrix}. \]
Solution to Exercise 3.1.5
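Consider the linear transformation \(T:\mathbb{R}^2\rightarrow\mathbb{R}^3\) that sends each vector \( \begin{pmatrix} x \\ y \end{pmatrix}\) in \(\mathbb{R}^2\) to the vector \(\begin{pmatrix} x \\ y \\ 0 \end{pmatrix}\). It holds that
\[ T\begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} \quad\text{and}\quad T\begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}. \]
We find that for an arbitrary vector \(\begin{pmatrix} x\\ y \end{pmatrix} = x\begin{pmatrix} 1\\ 0 \end{pmatrix}+y\begin{pmatrix} 0\\ 1 \end{pmatrix}\) it holds that
\[ T\begin{pmatrix} x \\ y \end{pmatrix} = x\,T\begin{pmatrix} 1 \\ 0 \end{pmatrix} + y\,T\begin{pmatrix} 0 \\ 1 \end{pmatrix} = x\begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} + y\begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix}. \]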
The reasoning of Example 3.1.8 can be generalised. This is the content of the next theorem.
Theorem 3.1.1
Each linear transformation \(T\) from \(\mathbb{R}^n\) to \(\mathbb{R}^m\) is a matrix transformation.
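More specifically, if \(T: \mathbb{R}^n \to \mathbb{R}^m\) is linear, then for each \(\mathbf{x}\) in \(\mathbb{R}^n\)
\[ T(\mathbf{x}) = \begin{pmatrix} T(\mathbf{e}_1) & T(\mathbf{e}_2) & \cdots & T(\mathbf{e}_n) \end{pmatrix}\mathbf{x}. \]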
Proof of Theorem 3.1.1
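We can more or less copy the derivation in Example 3.1.8. First of all, any vector \(\mathbf{x}\) is a linear combination of the standard basis:
\[ \mathbf{x} = x_1\mathbf{e}_1 + x_2\mathbf{e}_2 + \cdots + x_n\mathbf{e}_n, \]
i.e.,
\[ \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} = x_1\begin{pmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix} + x_2\begin{pmatrix} 0 \\ 1 \\ \vdots \\ 0 \end{pmatrix} + \cdots + x_n\begin{pmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{pmatrix}. \]
From Proposition 3.1.4 it follows that
\[ T(\mathbf{x}) = x_1T(\mathbf{e}_1) + x_2T(\mathbf{e}_2) + \cdots + x_nT(\mathbf{e}_n). \]
The last expression is a linear combination of \(n\) vectors in \(\mathbb{R}^m\), and thus can be written as a matrix-vector product:
\[ x_1T(\mathbf{e}_1) + x_2T(\mathbf{e}_2) + \cdots + x_nT(\mathbf{e}_n) = \begin{pmatrix} T(\mathbf{e}_1) & T(\mathbf{e}_2) & \cdots & T(\mathbf{e}_n) \end{pmatrix}\mathbf{x}. \]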
Definition 3.1.3
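For a linear transformation \(T:\mathbb{R}^n \to \mathbb{R}^m\), the matrix
\[ [T] = \begin{pmatrix} T(\mathbf{e}_1) & T(\mathbf{e}_2) & \cdots & T(\mathbf{e}_n) \end{pmatrix} \]
is called the standard matrix of \(T\).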
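The recipe behind the standard matrix, namely to apply \(T\) to each standard basis vector and use the images as columns, translates directly into a few lines of code. The following is a sketch only: the function names `standard_matrix`, `mat_vec` and `embed` are illustrative, matrices are stored as lists of rows, and the result represents \(T\) only if \(T\) is actually linear.

```python
def standard_matrix(T, n):
    """Return the matrix (as a list of rows) whose j-th column is T(e_j), for T defined on R^n."""
    columns = []
    for j in range(n):
        e_j = [1 if i == j else 0 for i in range(n)]  # j-th standard basis vector
        columns.append(list(T(e_j)))
    m = len(columns[0])
    # Turn the list of columns into a list of rows.
    return [[columns[j][i] for j in range(n)] for i in range(m)]


def mat_vec(A, x):
    """Matrix-vector product A x."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]


def embed(v):
    """The linear map of Example 3.1.5: (x, y) -> (x, y, 0)."""
    x, y = v
    return [x, y, 0]


A = standard_matrix(embed, 2)
print(A)                                     # [[1, 0], [0, 1], [0, 0]]
print(mat_vec(A, [3, -2]) == embed([3, -2])) # True
```

If a non-linear map is passed in, the function still returns a matrix, but that matrix will in general not reproduce the map.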
In the Section Some important classes of linear transformations you will learn how to build standard matrices for rotations, reflections and other geometrical mappings. For now let us look at a more “algebraic” example.
Example 3.1.9
Consider the transformation
It can be checked that the transformation has the two properties of a linear transformation according to the definition. Note that
So we find that the standard matrix \([T]\) of \(T\) is given by
Exercise 3.1.6
In the previous example we could have found the matrix just by inspection.
For the slightly different transformation \(T:\R^3 \to \R^3\) given by
can you fill in the blanks in the following equation?
If you can, you will have shown that \(T\) is a matrix transformation, and as a direct consequence \(T\) is a linear transformation.
Solution to Exercise 3.1.6
To conclude we consider an example that refers back to Proposition 3.1.3, and which will to a large extent pave the road for the product of two matrices.
Example 3.1.10
Suppose \(T:\mathbb{R}^2 \to \mathbb{R}^3\) and \(S:\mathbb{R}^3 \to \mathbb{R}^3\) are the matrix transformations given by
From Proposition 3.1.3 we know that the composition \(S\circ T: \mathbb{R}^2 \to \mathbb{R}^3\) is also a linear transformation. What is the (standard) matrix of \(S\circ T\)?
For this we need the images of the unit vectors \(\mathbf{e}_1\) and \(\mathbf{e}_2\) in \(\mathbb{R}^2\). For each vector we first apply \(T\) and then \(S\). For \(\mathbf{e}_1\) this gives
and then
Likewise for \(\mathbf{e}_2\):
So the matrix of \(S\circ T\) becomes
In the Section Matrix operations we will define the product of two matrices precisely in such a way that
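The procedure of this example, applying \(T\) and then \(S\) to each unit vector and collecting the results as columns, can be sketched in code. The matrices `A` and `B` below are hypothetical and chosen only for illustration; they are not the matrices of the example. The names `mat_vec`, `T`, `S` and `S_after_T` are likewise illustrative.

```python
def mat_vec(A, x):
    """Matrix-vector product A x, with A stored as a list of rows."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]


# Hypothetical matrices, for illustration only.
A = [[1, 2],
     [0, 1],
     [3, 0]]       # represents T: R^2 -> R^3
B = [[1, 0, 1],
     [2, 1, 0],
     [0, 1, 1]]    # represents S: R^3 -> R^3


def T(x):
    return mat_vec(A, x)


def S(y):
    return mat_vec(B, y)


def S_after_T(x):
    """The composition S o T: first apply T, then S."""
    return S(T(x))


# Columns of the matrix of S o T are the images of e_1 and e_2.
columns = [S_after_T([1, 0]), S_after_T([0, 1])]
C = [[column[i] for column in columns] for i in range(3)]
print(C)  # [[4, 2], [2, 5], [3, 1]]

x = [2, -1]  # sample vector
print(mat_vec(C, x) == S_after_T(x))  # True
```

The final check confirms, for one sample vector, that the single matrix `C` does the same job as applying `A` and then `B`.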
3.1.5. Grasple exercises
Grasple exercise 3.1.2
To specify the domain and the codomain of a linear transformation.
Grasple exercise 3.1.3
To find the size of the matrix of a linear transformation.
Grasple exercise 3.1.4
To find the image of two vectors under \(T(\vect{x}) = A\vect{x}\).
Grasple exercise 3.1.5
For a linear map \(T\), find \(T(c\vect{u})\) and \(T(\vect{u}+\vect{v})\) if \(T(\vect{u})\) and \(T(\vect{v})\) are given.
Grasple exercise 3.1.6
For a linear map \(T:\R^2 \to \R^2\), find \(T((x_1,x_2))\) if \(T(\vect{e}_1)\) and \(T(\vect{e}_2)\) are given.
Grasple exercise 3.1.7
Find all vectors \(\vect{w}\) for which \(T(\vect{w}) = \vect{u}\).
Grasple exercise 3.1.8
Find vectors \(\vect{w}\) for which \(T(\vect{w}) = \vect{u}\).
Grasple exercise 3.1.9
Find vectors \(\vect{w}\) for which \(T(\vect{w}) = \vect{u}\).
Grasple exercise 3.1.10
Find vectors \(\vect{w}\) for which \(T(\vect{w}) = \vect{u}\).
Grasple exercise 3.1.11
To show that a given transformation is non-linear.
Grasple exercise 3.1.12
Finding an image and a pre-image of \(T:\R^2 \to \R^2\) using a picture.
Grasple exercise 3.1.13
To give a geometric description of \(T: \vect{x} \mapsto A\vect{x}\).
Grasple exercise 3.1.14
To give a geometric description of \(T: \vect{x} \mapsto A\vect{x}\).
Grasple exercise 3.1.15
To find the matrix of the transformation that sends \((x,y)\) to \(x\vect{a}_1 + y\vect{a}_2\).
Grasple exercise 3.1.16
To find the matrix of the transformation that sends \((x,y)\) to \(x\vect{a}_1 + y\vect{a}_2\).
Grasple exercise 3.1.17
To rewrite \(T:\R^3 \to \R^2\) to standard form.
Grasple exercise 3.1.18
To find the standard matrix for \(T:\R^4 \to \R\).
Grasple exercise 3.1.19
To find the standard matrix for \(T:\R^2 \to \R^2\) if \(T(\vect{v}_1)\) and \(T(\vect{v}_2)\) are given.
Grasple exercise 3.1.20
To find the standard matrix for \(T:\R^2 \to \R^3\) if \(T(\vect{v}_1)\) and \(T(\vect{v}_2)\) are given.
Grasple exercise 3.1.21
If \(T(\vect{0}) = \vect{0}\), is \(T\) (always) linear?
Grasple exercise 3.1.22
True or false? If \(\{\vect{v}_1,\vect{v}_2,\vect{v}_3\}\) is linearly dependent, then \(\{T(\vect{v}_1),T(\vect{v}_2),T(\vect{v}_3)\}\) is also linearly dependent.
Grasple exercise 3.1.23
True or false? If \(\{\vect{v}_1,\vect{v}_2,\vect{v}_3\}\) is linearly independent, then \(\{T(\vect{v}_1),T(\vect{v}_2),T(\vect{v}_3)\}\) is also linearly independent.
