3.5. Injectivity, Surjectivity, and Bijectivity#
Since matrices correspond to linear transformations, we can translate questions about matrices into questions about linear transformations. For example: under what conditions on \(A\) is the system \(A\vect{x}=\vect{b}\) consistent for all \(\vect{b}\)? This question, which deals with a matrix \(A\), is equivalent to the following question about linear transformations: when is every element in the codomain of a linear transformation the image of some element in the domain?
Or, similarly: under what conditions on \(A\) is every consistent system \(A\vect{x}=\vect{b}\) uniquely solvable? In the language of linear transformations, this becomes: is every element in the range of a linear transformation the image of a unique element in the domain?
These two questions give rise to the concepts of injective and surjective linear transformations. The two concepts are very closely related and for most statements about injective linear transformations you can find an analogous statement about surjective linear transformations and vice versa.
3.5.1. Injectivity#
We call a linear transformation \(T:\mathbb{R}^{n}\to\mathbb{R}^{m}\) injective or one-to-one if \(T(\mathbf{v}_{1})=T(\mathbf{v}_{2})\) implies \(\mathbf{v}_{1}=\mathbf{v}_{2}\).
Intuitively, a transformation \(T\) is injective if it can be undone without ambiguity. Let us see what this means for some concrete examples.
-
Consider the linear transformation, illustrated in Figure 3.5.1,
\[\begin{split} T:\mathbb{R}^{2}\to\mathbb{R}^{2},\quad\mathbf{v}\mapsto A\mathbf{v}\quad\quad\text{where}\quad\quad A=\begin{bmatrix} 1&-1\\ -1&1 \end{bmatrix}. \end{split}\]This is not an injective transformation. Indeed, if we put for example
\[\begin{split} \mathbf{v}_{1}=\begin{bmatrix}1\\2\end{bmatrix},\quad\mathbf{v}_{2}=\begin{bmatrix}-3\\-2\end{bmatrix},\quad\text{and}\quad\mathbf{u}=\begin{bmatrix}-1\\1\end{bmatrix} \end{split}\]we find \(T(\mathbf{v}_{1})=\mathbf{u}=T(\mathbf{v}_{2})\). Consequently, if \(T(\vect{v})=\vect{u}\), we have no way of knowing whether \(\vect{v}=\mathbf{v}_{1}\) or \(\vect{v}=\mathbf{v}_{2}\), or perhaps some other vector entirely: there are infinitely many possible inputs that yield \(\mathbf{u}\) as output. To find them, it suffices to solve the system \(A\mathbf{x}=\mathbf{u}\). We find
\[\begin{split} \left[\begin{array}{cc|c} 1&-1&-1\\ -1&1&1 \end{array}\right]\sim\left[\begin{array}{cc|c} 1&-1&-1\\ 0&0&0 \end{array}\right] \end{split}\]so any vector \(\mathbf{v}=\begin{bmatrix}a_{1}\\a_{2}\end{bmatrix}\) with \(a_{1}=a_{2}-1\) is a solution.
-
Consider now the transformation
\[\begin{split} T:\mathbb{R}^{2}\to\mathbb{R}^{2},\quad\mathbf{v}\mapsto A\mathbf{v}\quad\quad\text{where}\quad\quad A=\begin{bmatrix} 1&-1\\ 1&1 \end{bmatrix} \end{split}\]which is illustrated in Figure 3.5.2.
This is an injective transformation. Indeed, if we take an arbitrary vector \(\mathbf{u}=\begin{bmatrix}a_{1}\\a_{2}\end{bmatrix}\) in \(\mathbb{R}^{2}\) and we try to solve \(A\mathbf{x}=\mathbf{u}\), we find
\[\begin{split} \left[\begin{array}{cc|c} 1&-1&a_{1}\\ 1&1&a_{2} \end{array}\right]\sim\left[\begin{array}{cc|c} 1&-1&a_{1}\\ 0&2&a_{2}-a_{1} \end{array}\right]. \end{split}\]There are no free variables, so if there is a solution to the system \(A\vect{x}=\vect{u}\) (which, in this case, is true for all \(\vect{u}\)) it will be unique. This means that there is only one \(\mathbf{x}\) with \(T(\mathbf{x})=A\mathbf{x}=\mathbf{u}\), hence \(T\) is injective.
-
Let us consider now a transformation for which the domain and the codomain are different. Take, for example,
\[\begin{split} S:\R^{2}\to\R^{3},\vect{v}\mapsto A\vect{v}\quad\text{where}\quad A=\begin{bmatrix} 1&0\\ 0&1\\ 0&0 \end{bmatrix}. \end{split}\]Then \(S\) is an injective transformation. Indeed, take an arbitrary vector
\[\begin{split} \vect{v}=\begin{bmatrix} a_{1}\\ a_{2}\\ a_{3} \end{bmatrix}\text{ in }\R^{3}.\quad\text{Then}\quad\left[\begin{array}{cc|c} 1&0&a_{1}\\ 0&1&a_{2}\\ 0&0&a_{3} \end{array}\right] \end{split}\]
has, when it is consistent, a unique solution as there are no free variables.
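The pivot criterion used in these three examples can also be checked numerically. Below is a small sketch using NumPy; it tests "a pivot in every column" through the equivalent condition that the rank of the matrix equals its number of columns.

```python
import numpy as np

# Standard matrices of the three example transformations.
A_i   = np.array([[1, -1], [-1, 1]])          # not injective
A_ii  = np.array([[1, -1], [1, 1]])           # injective
A_iii = np.array([[1, 0], [0, 1], [0, 0]])    # injective, R^2 -> R^3

# "A pivot in every column" is the same as rank(A) == number of columns.
for name, A in [("i", A_i), ("ii", A_ii), ("iii", A_iii)]:
    injective = np.linalg.matrix_rank(A) == A.shape[1]
    print(name, injective)   # i False, ii True, iii True

# In example i, infinitely many inputs hit u = (-1, 1): every vector
# with entries (a2 - 1, a2) works.
u = np.array([-1, 1])
for a2 in (-2, 0, 5):
    assert np.array_equal(A_i @ np.array([a2 - 1, a2]), u)
```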
If we perform two actions which can both be undone, then we should be able to undo the combination of those two actions. That this intuitive rule really does hold is essentially the content of Proposition 3.5.1.
If \(S:\mathbb{R}^{l}\to\mathbb{R}^{m}\) and \(T:\mathbb{R}^{m}\to\mathbb{R}^{n}\) are both injective, then so is \(T\circ S:\mathbb{R}^{l}\to\mathbb{R}^{n}\).
Proof of Proposition 3.5.1
Take \(\mathbf{v}_{1}\) and \(\mathbf{v}_{2}\) in \(\mathbb{R}^{l}\) such that \((T\circ S)(\mathbf{v}_{1})=(T\circ S)(\mathbf{v}_{2})\), i.e. \(T(S(\mathbf{v}_{1}))=T(S(\mathbf{v}_{2}))\). Since \(T\) is injective, we must have \(S(\mathbf{v}_{1})=S(\mathbf{v}_{2})\). As \(S\) is also injective, this in turn implies \(\mathbf{v}_{1}=\mathbf{v}_{2}\) which establishes injectivity of \(T\circ S\).
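A concrete instance of Proposition 3.5.1 can be verified with matrices; the two matrices below are chosen purely for illustration (any pair of injective maps would do).

```python
import numpy as np

# Two injective maps: S : R^2 -> R^3 and T : R^3 -> R^4.
# Injectivity is tested via full column rank (a pivot in every column).
A_S = np.array([[1, 0], [0, 1], [0, 0]])
A_T = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 1]])

def full_column_rank(A):
    return np.linalg.matrix_rank(A) == A.shape[1]

print(full_column_rank(A_S), full_column_rank(A_T))  # True True

# The standard matrix of T o S is the product A_T @ A_S, and it is
# again of full column rank, so T o S is injective as well.
print(full_column_rank(A_T @ A_S))  # True
```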
You may think that, similarly, if the combination of two actions can be undone, you must be able to undo both. This, however, is generally not true, as Example 3.5.2 shows.
Consider
You can verify yourself that \((T\circ S)(\mathbf{v})=\mathbf{v}\) for any vector \(\mathbf{v}\) in \(\mathbb{R}^{2}\), so \(T\circ S\) is injective. But \(T\) itself definitely isn’t! Indeed
Note that \(S\) is injective by Example 3.5.1 iii. This is not a coincidence, as Proposition 3.5.2 shows.
If \(S:\mathbb{R}^{l}\to\mathbb{R}^{m}\) and \(T:\mathbb{R}^{m}\to\mathbb{R}^{n}\) are such that \(T\circ S\) is injective, then \(S\) is injective.
Proof of Proposition 3.5.2
Suppose \(T\circ S\) is injective but \(S\) is not. Then there are \(\mathbf{v}_{1}\neq\mathbf{v}_{2}\) in \(\mathbb{R}^{l}\) such that \(S(\mathbf{v}_{1})=S(\mathbf{v}_{2})\). But then \(T(S(\mathbf{v}_{1}))=T(S(\mathbf{v}_{2}))\) which contradicts the assumption that \(T\circ S\) is injective.
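A concrete check of this proposition (the matrices below are one possible choice, made for illustration; they are not necessarily the ones from Example 3.5.2):

```python
import numpy as np

# S embeds R^2 into R^3, T projects back down; T o S is the identity.
A_S = np.array([[1, 0], [0, 1], [0, 0]])   # standard matrix of S
A_T = np.array([[1, 0, 0], [0, 1, 0]])     # standard matrix of T

A_TS = A_T @ A_S                           # standard matrix of T o S
print(np.array_equal(A_TS, np.eye(2, dtype=int)))  # True: T o S injective

def full_column_rank(A):
    return np.linalg.matrix_rank(A) == A.shape[1]

print(full_column_rank(A_S))   # True:  S must be injective (Prop 3.5.2)
print(full_column_rank(A_T))   # False: T need not be
```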
All the results we have obtained so far hold true for general transformations. But there are some extra characterizations we can prove for linear transformations.
Let \(T:\mathbb{R}^{m}\to\mathbb{R}^{n}\) be a linear transformation with standard matrix \(A\). Then the following are equivalent:
-
\(T\) is injective;
-
\(A\mathbf{x}=\mathbf{b}\) has, for every \(\mathbf{b}\) in \(\mathbb{R}^{n}\), at most one solution;
-
\(A\) has a pivot in every column.
Prove Proposition 3.5.3.
Solution to Exercise 3.5.1
Assume \(T\) is injective and suppose \(A\vect{x}=\vect{b}\) has solutions \(\vect{x}_{1}\) and \(\vect{x}_{2}\). Then \(T(\vect{x}_{1})=\vect{b}=T(\vect{x}_{2})\), so \(\vect{x}_{1}=\vect{x}_{2}\) by injectivity of \(T\). Conversely, assume that \(A\vect{x}=\vect{b}\) has at most one solution for every \(\vect{b}\), and suppose \(T(\vect{x}_{1})=T(\vect{x}_{2})\). Putting \(\vect{b}=T(\vect{x}_{1})\), both \(\vect{x}_{1}\) and \(\vect{x}_{2}\) are solutions of \(A\vect{x}=\vect{b}\), hence \(\vect{x}_{1}=\vect{x}_{2}\) and \(T\) is injective.
If \(A\) has a column without a pivot, then there is a free variable. Consequently, if \(A\vect{x}=\vect{b}\) has a solution, it has infinitely many, contradicting ii. Conversely, if \(A\vect{x}=\vect{b}\) always has at most one solution, then there cannot be a free variable, hence \(A\) has a pivot in every column.
If \(T:\mathbb{R}^{m}\to\mathbb{R}^{n}\) is injective, then \(m\leq n\).
Proof of Corollary 3.5.1
If \(T\) is injective, then its standard matrix \(A\) has a pivot in every column. Consequently, the number of columns of \(A\), which is \(m\), must be less than or equal to the number of rows of \(A\), which is \(n\).
A linear transformation \(T:\mathbb{R}^{m}\to\mathbb{R}^{n}\) is injective if and only if \(T(\mathbf{v})=\mathbf{0}\) implies \(\mathbf{v}=\mathbf{0}\).
Proof of Proposition 3.5.4
Since \(T(\mathbf{0})=\mathbf{0}\) for any linear transformation \(T\), injectivity of \(T\) implies that \(T(\mathbf{v})=\mathbf{0}\) only occurs when \(\mathbf{v}=\mathbf{0}\).
Assume now that \(T(\mathbf{v})=\mathbf{0}\) implies \(\mathbf{v}=\mathbf{0}\). If \(T(\mathbf{v}_{1})=T(\mathbf{v}_{2})\) for some \(\mathbf{v}_{1},\mathbf{v}_{2}\) in \(\mathbb{R}^{m}\), then \(T(\mathbf{v}_{1}-\mathbf{v}_{2})=T(\mathbf{v}_{1})-T(\mathbf{v}_{2})=\mathbf{0}\) since \(T\) is linear. By our assumption, \(\mathbf{v}_{1}-\mathbf{v}_{2}=\mathbf{0}\) follows. But then \(\mathbf{v}_{1}=\mathbf{v}_{2}\). We have shown that two vectors can only have the same image under \(T\) if they are the same to start with, i.e. we have shown \(T\) to be injective.
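The criterion of Proposition 3.5.4 can be tried on the example matrices from the beginning of the section; a quick numerical sketch:

```python
import numpy as np

# Injective example (ii): A x = 0 forces x = 0, so the only vector
# sent to the zero vector is the zero vector itself.
A = np.array([[1., -1.], [1., 1.]])
print(np.allclose(np.linalg.solve(A, np.zeros(2)), 0))  # True

# Non-injective example (i): a non-zero vector is sent to 0, so by
# Proposition 3.5.4 the transformation cannot be injective.
B = np.array([[1, -1], [-1, 1]])
v = np.array([1, 1])
print(B @ v)  # [0 0] although v != 0
```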
3.5.2. Surjectivity#
In this section, we will talk about surjectivity. It is a natural complement to the concept of injectivity – the yin to its yang, if you will. This duality becomes apparent when comparing Proposition 3.5.3 to Proposition 3.5.7.
A linear transformation \(T:\mathbb{R}^{m}\to\mathbb{R}^{n}\) is called surjective if for any vector \(\mathbf{u}\) in \(\mathbb{R}^{n}\) there is some vector \(\mathbf{v}\) in \(\mathbb{R}^{m}\) such that \(T(\mathbf{v})=\mathbf{u}\). In other words, \(T\) is surjective if and only if the range of \(T\) is the whole codomain.
Put differently, a transformation is surjective if for every potential output there is some input which really does give that output. We will now see what this concept means for the transformations from Example 3.5.1.
-
Let us start with the transformation, illustrated in Figure 3.5.5,
\[\begin{split} T:\mathbb{R}^{2}\to\mathbb{R}^{2},\quad\mathbf{v}\mapsto A\mathbf{v}\quad\quad\text{where}\quad\quad A=\begin{bmatrix} 1&-1\\ -1&1 \end{bmatrix}. \end{split}\]In order to see whether \(T\) is surjective, we have to check whether for every vector \(\mathbf{u}\) in the codomain – that is \(\mathbb{R}^{2}\) here – there is a vector \(\mathbf{v}\) in the domain with \(T(\mathbf{v})=\mathbf{u}\). So take \(\mathbf{u}=\begin{bmatrix}a_{1}\\a_{2}\end{bmatrix}\) in \(\mathbb{R}^{2}\). We find
\[\begin{split} \left[\begin{array}{cc|c} 1&-1&a_{1}\\ -1&1&a_{2} \end{array}\right]\sim\left[\begin{array}{cc|c} 1&-1&a_{1}\\ 0&0&a_{1}+a_{2} \end{array}\right], \end{split}\]so the system \(A\mathbf{x}=\mathbf{u}\) can only be solved if \(a_{1}+a_{2}=0\). This transformation is therefore not surjective.
-
We have already seen in Example 3.5.1 ii. that, for
\[\begin{split} A=\begin{bmatrix} 1&-1\\ 1&1 \end{bmatrix}, \end{split}\]the system \(A\mathbf{x}=\mathbf{u}\) is consistent for any \(\mathbf{u}\) in \(\mathbb{R}^{2}\). Consequently, the linear transformation
\[ T:\mathbb{R}^{2}\to\mathbb{R}^{2},\quad\mathbf{v}\mapsto A\mathbf{v} \]is surjective.
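As with injectivity, the row-pivot condition in these two examples can be checked numerically; the sketch below tests "a pivot in every row" through the equivalent condition of full row rank.

```python
import numpy as np

A_i  = np.array([[1, -1], [-1, 1]])   # not surjective
A_ii = np.array([[1, -1], [1, 1]])    # surjective

# "A pivot in every row" is the same as rank(A) == number of rows.
for name, A in [("i", A_i), ("ii", A_ii)]:
    print(name, np.linalg.matrix_rank(A) == A.shape[0])  # i False, ii True

# For A_i, u = (a1, a2) is only hit when a1 + a2 = 0: appending an
# incompatible u raises the rank, i.e. the system is inconsistent.
u = np.array([1, 1])                  # a1 + a2 != 0
print(np.linalg.matrix_rank(np.column_stack([A_i, u])))  # 2 > rank(A_i)
```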
If \(S:\mathbb{R}^{l}\to\mathbb{R}^{m}\) and \(T:\mathbb{R}^{m}\to\mathbb{R}^{n}\) are both surjective, then so is \(T\circ S:\mathbb{R}^{l}\to\mathbb{R}^{n}\).
Proof of Proposition 3.5.5
Take \(\mathbf{u}\) in \(\mathbb{R}^{n}\). As \(T\) is surjective, there is a \(\mathbf{v}\in\mathbb{R}^{m}\) with \(T(\mathbf{v})=\mathbf{u}\). Since \(S\) is surjective, there is a \(\mathbf{w}\) in \(\mathbb{R}^{l}\) with \(S(\mathbf{w})=\mathbf{v}\). We find \((T\circ S)(\mathbf{w})=T(\mathbf{v})=\mathbf{u}\), so \(T\circ S\) is surjective, as claimed.
Here, too, it turns out that the converse statement is not true in general, as Example 3.5.4 shows. Compare this with Example 3.5.2.
Put
As we have seen in Example 3.5.2, \((T\circ S)(\mathbf{v})=\mathbf{v}\) for any vector \(\mathbf{v}\) in \(\mathbb{R}^{2}\), so \(T\circ S\) is surjective. But \(S\) is not! Indeed,
so any vector \(\mathbf{u}\) in \(\mathbb{R}^{3}\) for which the third entry is non-zero cannot be \(S(\mathbf{v})\) for any \(\mathbf{v}\) in \(\mathbb{R}^{2}\).
The following two propositions are counterparts of Proposition 3.5.2 and Proposition 3.5.3, respectively.
If \(S:\mathbb{R}^{l}\to\mathbb{R}^{m}\) and \(T:\mathbb{R}^{m}\to\mathbb{R}^{n}\) are such that \(T\circ S\) is surjective, then \(T\) is surjective.
Proof of Proposition 3.5.6
Take \(\mathbf{u}\in\mathbb{R}^{n}\). Since \(T\circ S\) is surjective, there is a \(\mathbf{w}\) in \(\mathbb{R}^{l}\) with \(\mathbf{u}=(T\circ S)(\mathbf{w})=T(S(\mathbf{w}))\). Hence \(\mathbf{u}=T(\mathbf{v})\) for \(\mathbf{v}=S(\mathbf{w})\), which shows \(T\) to be surjective.
Let \(T:\mathbb{R}^{m}\to\mathbb{R}^{n}\) be a linear transformation with standard matrix \(A\). Then the following are equivalent:
-
\(T\) is surjective;
-
\(A\mathbf{x}=\mathbf{b}\) has, for every \(\mathbf{b}\) in \(\mathbb{R}^{n}\), at least one solution;
-
\(A\) has a pivot in every row.
Prove Proposition 3.5.7.
Solution to Exercise 3.5.2
Assume \(T\) is surjective and take an arbitrary \(\vect{b}\) in \(\R^{n}\). Then there is an \(\vect{x}\) in \(\mathbb{R}^{m}\) such that \(T(\vect{x})=\vect{b}\), i.e. \(\vect{x}\) is a solution of \(A\vect{x}=\vect{b}\). Similarly, if \(A\vect{x}=\vect{b}\) has a solution for any \(\vect{b}\) in \(\R^{n}\), then \(\vect{b}=T(\vect{x})\) which establishes surjectivity of \(T\).
If \(A\) has a row without a pivot, then, for a well-chosen \(\vect{b}\), a pivot appears in the last column of the augmented matrix \([A|\vect{b}]\). This means that \(A\vect{x}=\vect{b}\) has no solutions. Conversely, if \(A\vect{x}=\vect{b}\) always has a solution, then \([A|\vect{b}]\) can never have a pivot in the last column. Consequently, \(A\) must have a pivot in every row.
If \(T:\mathbb{R}^{m}\to\mathbb{R}^{n}\) is surjective, then \(m\geq n\).
Proof of Corollary 3.5.2
If \(T\) is surjective, then its standard matrix \(A\) has a pivot in every row. Consequently, the number of columns of \(A\), which is \(m\), must be greater than or equal to the number of rows of \(A\), which is \(n\).
3.5.3. Bijectivity#
In this section, we investigate what happens when injectivity and surjectivity meet. Let us start with giving that idea a name.
A linear transformation \(T:\mathbb{R}^{m}\to\mathbb{R}^{n}\) is called bijective if it is both injective and surjective.
Although we allow \(m\) and \(n\) to be different in the definition, it turns out that a bijective linear transformation forces them to be equal, as the following proposition shows.
If a linear transformation \(T:\mathbb{R}^{m}\to\mathbb{R}^{n}\) is bijective, then \(m=n\).
Proof of Proposition 3.5.8
Let \(A\) be the standard matrix of \(T\), which is an \(n\times m\) matrix. By Proposition 3.5.3, the number of pivots in \(A\) is \(m\). By Proposition 3.5.7, the number of pivots in \(A\) is \(n\). Hence the claim follows.
If the domain and codomain are both \(\mathbb{R}^{n}\) then something striking happens: injectivity, surjectivity, and bijectivity all become equivalent!
Let \(T:\mathbb{R}^{n}\to\mathbb{R}^{n}\) be a linear transformation. The following are equivalent:
-
\(T\) is injective;
-
\(T\) is surjective;
-
\(T\) is bijective.
Proof of Proposition 3.5.9
It suffices to show that \(T\) is injective if and only if it is surjective. Assume \(T\) is injective. Then the standard matrix \(A\) of \(T\) has a pivot in each column. But as the number of columns and the number of rows are the same, there must be a pivot in each row, showing surjectivity of \(T\).
Similarly, if \(T\) is surjective, then the standard matrix \(A\) of \(T\) has a pivot in each row. Again, the number of rows is the same as the number of columns, so there is also a pivot in every column and we conclude that \(T\) must be injective.
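Computationally, Proposition 3.5.9 means that for a square matrix a single rank computation settles injectivity, surjectivity, and bijectivity all at once. A sketch (the helper name below is ours, introduced only for this example):

```python
import numpy as np

def classify_square(A):
    """For an n x n matrix: rank == n is equivalent to injective,
    to surjective, and to bijective (Proposition 3.5.9)."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    assert A.shape == (n, n), "Proposition 3.5.9 is about square matrices"
    return np.linalg.matrix_rank(A) == n

print(classify_square([[1, -1], [1, 1]]))    # True: bijective
print(classify_square([[1, -1], [-1, 1]]))   # False: neither
```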
The content of Proposition 3.5.10 is, intuitively, that a transformation is bijective if and only if it can be undone by some linear transformation.
For a linear transformation \(T:\R^{n}\to\R^{n}\) with standard matrix \(A\) the following are equivalent:
-
\(T\) is bijective.
-
\(A\) is invertible.
-
There exists a transformation \(S:\R^{n}\to\R^{n}\) such that
\[ST(\vect{v})=\vect{v}=TS(\vect{v})\]for any vector \(\vect{v}\) in \(\R^{n}\).
Proof of Proposition 3.5.10
By Theorem 3.4.1, we know that \(A\) is invertible if and only if the system \(A\vect{x}=\vect{b}\) has a unique solution for each \(\vect{b}\) in \(\R^{n}\). The existence of a solution is equivalent to surjectivity of \(T\) and its uniqueness is equivalent to injectivity of \(T\). This establishes the equivalence of i. and ii.
We will now show that ii. and iii. are equivalent, too. The invertibility of \(A\) is equivalent to the existence of an \(n\times n\)-matrix \(B\) such that \(AB=I\). Define
\[ S:\R^{n}\to\R^{n},\quad\vect{v}\mapsto B\vect{v}. \]
Then \(TS(\vect{v})=AB\vect{v}=\vect{v}\) for any \(\vect{v}\) in \(\R^{n}\). The only thing left to show is \(ST(\vect{v})=\vect{v}\) or, in other words, \(BA\vect{v}=\vect{v}\) for any \(\vect{v}\) in \(\R^{n}\). For this, we argue as follows. Since \(AB=I\), \(ABA=A\) follows. But then \(A(BA-I)=0\), hence \(A((BA-I)\vect{v})=\vect{0}\) for all \(\vect{v}\) in \(\R^{n}\). Since \(A\) is invertible, \(A\vect{x}=\vect{0}\) only has the trivial solution \(\vect{x}=\vect{0}\). Therefore \((BA-I)\vect{v}=\vect{0}\) for all \(\vect{v}\), which implies \(BA\vect{v}=\vect{v}\) for all \(\vect{v}\) in \(\R^{n}\).
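The conclusion of the last paragraph can be checked numerically for the invertible matrix from Example 3.5.1 ii: with \(B = A^{-1}\) playing the role of the standard matrix of \(S\), both composites act as the identity.

```python
import numpy as np

A = np.array([[1., -1.], [1., 1.]])
B = np.linalg.inv(A)   # the standard matrix of the undoing transformation S

# Both composites are the identity, matching ST(v) = v = TS(v).
print(np.allclose(A @ B, np.eye(2)))   # True: TS = id
print(np.allclose(B @ A, np.eye(2)))   # True: ST = id, as the proof shows
```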
3.5.4. Grasple Exercises#
Is \(T\) injective, surjective, bijective, or neither?
Is \(T\) injective, surjective, bijective, or neither?
Is \(T\) injective, surjective, bijective, or neither?
What can we conclude if the composition of two transformations is bijective?