Linear Independence, Column Space, Null Space and Rank of a Matrix¶

Matrix as a linear transformation of a vector¶

Matrix-Vector Multiplication as a Linear Combination¶

The matrix-vector product $\mathbf{A} \mathbf{x}$ is fundamentally a linear combination of the columns of $\mathbf{A}$, where the components of $\mathbf{x}$ serve as the scalar weights. Consider a matrix $\mathbf{A} \in \mathbb{R}^{m \times n}$ with column vectors $\mathbf{a}_i$ and a vector $\mathbf{x} \in \mathbb{R}^n$:

$$\mathbf{A} \mathbf{x} = \begin{pmatrix} \mathbf{a}_1 & \mathbf{a}_2 & \cdots & \mathbf{a}_n \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} = x_1 \mathbf{a}_1 + x_2 \mathbf{a}_2 + \cdots + x_n \mathbf{a}_n$$

Example:

$$\mathbf{A} \mathbf{x} = \begin{pmatrix} 2 & 0 \\ 1 & 1 \\ 1 & 3 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = x_1 \begin{pmatrix} 2 \\ 1 \\ 1 \end{pmatrix} + x_2 \begin{pmatrix} 0 \\ 1 \\ 3 \end{pmatrix} = x_1 \mathbf{a}_1 + x_2 \mathbf{a}_2$$
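The identity above can be checked numerically. The following is a minimal sketch using NumPy with the $3 \times 2$ example matrix; the weights in `x` are an arbitrary choice made only for illustration.

```python
import numpy as np

# The 3x2 example matrix from the text.
A = np.array([[2.0, 0.0],
              [1.0, 1.0],
              [1.0, 3.0]])

# Illustrative weights x_1 = 3, x_2 = -1 (an assumption, not from the text).
x = np.array([3.0, -1.0])

# Matrix-vector product computed directly.
direct = A @ x

# The same product written as a weighted sum of the columns of A.
combo = x[0] * A[:, 0] + x[1] * A[:, 1]

print(direct)                      # [6. 2. 0.]
print(np.allclose(direct, combo))  # True
```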

Linear Independence¶

A set of vectors $\{\mathbf{a}_1, \mathbf{a}_2, \dots, \mathbf{a}_n\}$ is linearly independent if no vector in the set can be written as a linear combination of the others. Equivalently, the vectors are linearly independent when the equation

$$x_1 \mathbf{a}_1 + x_2 \mathbf{a}_2 + \dots + x_n \mathbf{a}_n = \mathbf{0}$$

has only the trivial solution $x_1 = x_2 = \dots = x_n = 0$.

If instead there exists a set of weights with at least one $x_i \neq 0$ that satisfies the equation, the vectors are linearly dependent.
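One way to test this numerically is to stack the vectors as columns and compare the rank of the resulting matrix with the number of vectors. The sketch below assumes NumPy; the helper name `is_linearly_independent` and the vector `a3` are hypothetical and chosen for illustration. Because `np.linalg.matrix_rank` works with a floating-point tolerance, the answer is a numerical judgment rather than an exact proof.

```python
import numpy as np

def is_linearly_independent(vectors):
    """Return True if the vectors are (numerically) linearly independent."""
    M = np.column_stack(vectors)          # one vector per column
    # Independent exactly when the rank equals the number of columns.
    return bool(np.linalg.matrix_rank(M) == M.shape[1])

a1 = np.array([2.0, 1.0, 1.0])
a2 = np.array([0.0, 1.0, 3.0])
a3 = a1 + 2.0 * a2                        # deliberately dependent on a1 and a2

print(is_linearly_independent([a1, a2]))      # True
print(is_linearly_independent([a1, a2, a3]))  # False
```

For the dependent set, the weights $x_1 = 1$, $x_2 = 2$, $x_3 = -1$ give a non-trivial solution of the equation above.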

The Span¶

The span of a set of vectors $\{\mathbf{a}_1, \mathbf{a}_2, \dots, \mathbf{a}_n\}$ is the set of all vectors that can be formed as linear combinations of that set. A vector $\mathbf{b}$ lies in the span exactly when it can be written as

$$\mathbf{b} = x_1 \mathbf{a}_1 + x_2 \mathbf{a}_2 + \dots + x_n \mathbf{a}_n$$

for some real scalars $x_i$.

Geometric Intuition

The span describes the "geometric footprint" created by the vectors:

One non-zero vector: Its span is a 1D line passing through the origin.
Two linearly independent vectors: Their span is a 2D plane passing through the origin.
Three linearly independent vectors: Their span is a three-dimensional subspace; for vectors in $\mathbb{R}^3$ this is the entire space $\mathbb{R}^3$.

The Role of Redundancy

Adding a redundant vector, one that is a linear combination of the vectors already in the set, does not expand the span; it only makes the set linearly dependent. For example, if $\mathbf{a}_3$ is a linear combination of $\mathbf{a}_1$ and $\mathbf{a}_2$, then the span of $\{\mathbf{a}_1, \mathbf{a}_2, \mathbf{a}_3\}$ is exactly the same as the span of $\{\mathbf{a}_1, \mathbf{a}_2\}$. In this case, $\mathbf{a}_3$ does not "contribute a new dimension."
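The effect of redundancy can be seen by tracking the dimension of the span as vectors are added. In this sketch (NumPy assumed), the dimension is measured with `np.linalg.matrix_rank`; the vectors `redundant` and `new_direction` are hypothetical examples.

```python
import numpy as np

a1 = np.array([2.0, 1.0, 1.0])
a2 = np.array([0.0, 1.0, 3.0])

redundant = 2.0 * a1 - a2                  # lies in the span of a1 and a2
new_direction = np.array([0.0, 0.0, 1.0])  # not a combination of a1 and a2

def span_dimension(vectors):
    """Dimension of the span, computed as the rank of the stacked columns."""
    return int(np.linalg.matrix_rank(np.column_stack(vectors)))

print(span_dimension([a1, a2]))                 # 2
print(span_dimension([a1, a2, redundant]))      # 2 (span unchanged)
print(span_dimension([a1, a2, new_direction]))  # 3 (a new dimension)
```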

The Null Space: $N(\mathbf{A})$¶

The null space of a matrix $\mathbf{A}$ is the set of all vectors $\mathbf{x}$ that satisfy the homogeneous equation:

$$\mathbf{A}\mathbf{x} = \mathbf{0}$$

Connection between Independence and the Null Space

This is where the concepts of independence and the null space converge:

Linearly Independent Columns: The only way to combine the columns to reach $\mathbf{0}$ is to multiply them all by zero. Therefore, $N(\mathbf{A})$ contains only the zero vector ($\mathbf{x} = \mathbf{0}$).
Linearly Dependent Columns: There exist non-zero weights that can combine the columns to reach $\mathbf{0}$. Therefore, $N(\mathbf{A})$ contains non-zero vectors, forming a line, plane, or higher-dimensional subspace.
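Both cases can be illustrated by computing a basis of the null space. The sketch below takes the basis from the singular value decomposition, which is one common numerical approach and not the only one; the helper `null_space_basis`, its tolerance `tol`, and the matrix `A_dep` are assumptions made for illustration.

```python
import numpy as np

def null_space_basis(A, tol=1e-10):
    """Orthonormal basis of N(A), returned as columns, computed from the SVD."""
    _, s, vt = np.linalg.svd(A)       # vt has shape (n, n)
    rank = int(np.sum(s > tol))       # number of non-negligible singular values
    return vt[rank:].T                # remaining right singular vectors span N(A)

# Independent columns: the 3x2 example from the text.
A = np.array([[2.0, 0.0],
              [1.0, 1.0],
              [1.0, 3.0]])

# Dependent columns: third column is a1 + 2*a2 (hypothetical example).
A_dep = np.column_stack([A, A[:, 0] + 2.0 * A[:, 1]])

N_indep = null_space_basis(A)
N_dep = null_space_basis(A_dep)

print(N_indep.shape)                     # (2, 0): only the zero vector
print(N_dep.shape)                       # (3, 1): a line through the origin
print(np.allclose(A_dep @ N_dep, 0.0))   # True
```

The single basis vector returned for `A_dep` is a unit-length multiple of $(1, 2, -1)^\top$, the weights that combine the three columns to $\mathbf{0}$.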

The Column Space: $C(\mathbf{A})$¶

While the null space describes the "redundancy" of the input, the column space describes the "reach" of the output. It is the set of all possible linear combinations of the columns of $\mathbf{A}$.

The column space (or range) of a matrix $\mathbf{A}$ is denoted as $C(\mathbf{A})$. Geometrically, the columns always span a subspace of $\mathbb{R}^m$ — this subspace is $C(\mathbf{A})$. If the columns are linearly independent, the dimension of this subspace equals the number of columns. In the example above, $C(\mathbf{A})$ is the plane in $\mathbb{R}^3$ spanned by $\mathbf{a}_1$ and $\mathbf{a}_2$.

For the linear system $\mathbf{A}\mathbf{x} = \mathbf{b}$ to be consistent (possess at least one solution), the vector $\mathbf{b}$ must reside within the column space:

Existence Condition: $\mathbf{A}\mathbf{x} = \mathbf{b}$ has a solution if and only if $\mathbf{b} \in C(\mathbf{A})$.

If $\mathbf{b}$ lies outside this subspace, no combination of the columns of $\mathbf{A}$ can reach it, rendering the system inconsistent.
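A simple numerical test of the existence condition compares the rank of $\mathbf{A}$ with the rank of the augmented matrix $(\mathbf{A} \mid \mathbf{b})$: appending $\mathbf{b}$ leaves the rank unchanged exactly when $\mathbf{b}$ is already in the span of the columns. The helper `in_column_space` and the two right-hand sides are hypothetical; this is a sketch assuming NumPy, and the rank is subject to floating-point tolerance.

```python
import numpy as np

def in_column_space(A, b):
    """True if b is (numerically) a linear combination of the columns of A."""
    augmented = np.column_stack([A, b])
    return bool(np.linalg.matrix_rank(augmented) == np.linalg.matrix_rank(A))

A = np.array([[2.0, 0.0],
              [1.0, 1.0],
              [1.0, 3.0]])

b_in = A @ np.array([1.0, 2.0])      # built from the columns, so reachable
b_out = np.array([1.0, 0.0, 0.0])    # does not lie in the plane C(A)

print(in_column_space(A, b_in))    # True  -> system is consistent
print(in_column_space(A, b_out))   # False -> system is inconsistent
```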

If there are non-zero vectors in the null space, then any solution to $\mathbf{A}\mathbf{x} = \mathbf{b}$ is not unique, as any vector $\mathbf{x}_{null} \in N(\mathbf{A})$ can be added to a particular solution $\mathbf{x}_p$ without changing the result:

$$\mathbf{A}(\mathbf{x}_p + \mathbf{x}_{null}) = \mathbf{A}\mathbf{x}_p + \mathbf{A}\mathbf{x}_{null} = \mathbf{b} + \mathbf{0} = \mathbf{b}$$
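The calculation above can be reproduced with a small example. The matrix, the particular solution `x_p`, and the null-space vector `x_null` below are hypothetical choices for illustration (NumPy assumed).

```python
import numpy as np

# Hypothetical 3x3 matrix whose third column equals a1 + 2*a2,
# so its null space contains non-zero vectors.
A = np.array([[2.0, 0.0, 2.0],
              [1.0, 1.0, 3.0],
              [1.0, 3.0, 7.0]])

x_null = np.array([1.0, 2.0, -1.0])   # a1 + 2*a2 - a3 = 0, so A @ x_null = 0
x_p = np.array([1.0, 1.0, 1.0])       # an arbitrary particular solution
b = A @ x_p                           # the right-hand side it produces

# Every x_p + t * x_null solves the same system.
for t in (0.0, 1.0, -3.5):
    x = x_p + t * x_null
    print(t, np.allclose(A @ x, b))   # True for each t
```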

Relation to Span

The Column Space $C(\mathbf{A})$ of a matrix is defined as the span of its column vectors. Therefore, when we ask if the equation $\mathbf{A}\mathbf{x} = \mathbf{b}$ has a solution, we are asking a geometric question: "Is the vector $\mathbf{b}$ located within the span of the columns of $\mathbf{A}$?"

Rank of a Matrix¶

The rank of a matrix, denoted as $\text{rank}(\mathbf{A})$, is a single number that counts how many independent directions the columns of the matrix provide.

The rank of a matrix is the dimension of its column space. Equivalently, it is the number of linearly independent columns. The column space $C(\mathbf{A})$ is spanned by the columns of $\mathbf{A}$, and its dimension counts how many of those columns are genuinely independent: a redundant column — one that is a linear combination of the others — adds no new direction and therefore does not raise the dimension, exactly as we saw for the span.

The rank has a second, equivalent characterization through the rows. It can be defined in two ways:

The dimension of the column space $C(\mathbf{A})$ (the number of linearly independent columns, the column rank).
The dimension of the row space $C(\mathbf{A}^\top)$ (the number of linearly independent rows, the row rank).

The number of independent columns equals the number of independent rows. This is surprising — for a non-square $m \times n$ matrix the columns live in $\mathbb{R}^m$ and the rows in $\mathbb{R}^n$, two entirely different spaces. Here is an intuitive way to see why it must be true anyway.

Suppose the column rank is $r$: among the $n$ columns of $\mathbf{A}$ we can pick $r$ that are linearly independent, and no larger independent set exists. Keep those $r$ columns as a small "kit" of building blocks, $\mathbf{k}_1, \dots, \mathbf{k}_r$. Since every other column is a combination of these independent ones, every column of $\mathbf{A}$ can be rebuilt from this kit. For a given column $\mathbf{a}_j$, the recipe is just a list of $r$ mixing weights $B_{1j}, \dots, B_{rj}$:

$$\mathbf{a}_j = B_{1j}\,\mathbf{k}_1 + B_{2j}\,\mathbf{k}_2 + \dots + B_{rj}\,\mathbf{k}_r .$$

Line the $r$ building-block columns up side by side into a matrix $\mathbf{K}$ (the kit matrix, of size $m \times r$), and collect all the recipes into a matrix $\mathbf{B}$ (size $r \times n$). Saying "every column of $\mathbf{A}$ is built from the kit according to its recipe" is exactly the statement

$$\mathbf{A} = \mathbf{K}\,\mathbf{B}.$$

Now look at this same equation one row at a time instead of one column at a time. In the product $\mathbf{K}\mathbf{B}$, row $i$ of $\mathbf{A}$ is produced by taking row $i$ of $\mathbf{K}$ — a short list of just $r$ numbers — and using it to mix the $r$ rows of $\mathbf{B}$:

$$\text{row}_i(\mathbf{A}) = K_{i1}\,\text{row}_1(\mathbf{B}) + \dots + K_{ir}\,\text{row}_r(\mathbf{B}).$$

So every row of $\mathbf{A}$, however many rows there are, is built from the same $r$ "row ingredients" (the rows of $\mathbf{B}$). And a collection of vectors that are all mixtures of $r$ fixed vectors can contain at most $r$ independent ones. Therefore

$$\text{row rank}(\mathbf{A}) \le r = \text{column rank}(\mathbf{A}).$$

The reasoning is completely symmetric: if we instead start from a kit of independent rows and rebuild every row from it, the same steps give the opposite inequality $\text{column rank}(\mathbf{A}) \le \text{row rank}(\mathbf{A})$. The only way both can hold at once is equality:

$$\text{row rank}(\mathbf{A}) = \text{column rank}(\mathbf{A}) = \text{rank}(\mathbf{A}).$$

The heart of it: $r$ independent columns are enough to rebuild every column, and the very recipe that does this uses only $r$ row ingredients to rebuild every row. Independence in one direction therefore caps independence in the other at the same number $r$.
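The factorization $\mathbf{A} = \mathbf{K}\mathbf{B}$ can be built explicitly for a small matrix. The sketch below uses a hypothetical $3 \times 3$ matrix of rank 2 whose first two columns happen to be independent, so they can serve as the kit; in general the independent columns would have to be selected first. The recipes in `B` are recovered with `np.linalg.lstsq`, which is just one convenient way to solve $\mathbf{K}\mathbf{B} = \mathbf{A}$.

```python
import numpy as np

# Hypothetical rank-2 matrix: third column = first + 2 * second.
A = np.array([[2.0, 0.0, 2.0],
              [1.0, 1.0, 3.0],
              [1.0, 3.0, 7.0]])

r = int(np.linalg.matrix_rank(A))           # column rank, here 2
K = A[:, :r]                                # kit: the first r columns (m x r)

# Recipes: solve K @ B = A for B (r x n) in the least-squares sense.
B = np.linalg.lstsq(K, A, rcond=None)[0]

print(np.allclose(K @ B, A))                # True: every column rebuilt from the kit

# Row view: row i of A is a mixture of the r rows of B, weighted by row i of K.
for i in range(A.shape[0]):
    mixed = sum(K[i, k] * B[k] for k in range(r))
    print(i, np.allclose(mixed, A[i]))      # True for every row

print(int(np.linalg.matrix_rank(A.T)) == r) # True: row rank equals column rank
```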

In our earlier example, because $\mathbf{a}_1$ and $\mathbf{a}_2$ are not multiples of one another, they are linearly independent. Therefore, $\text{rank}(\mathbf{A}) = 2$.

Invertibility and the Solution $\mathbf{x} = \mathbf{A}^{-1}\mathbf{b}$¶

Having now seen both existence (the column space) and uniqueness (the null space), we can tie them together in the notion of invertibility.

A square matrix $\mathbf{A} \in \mathbb{R}^{n \times n}$ is invertible if there exists a matrix $\mathbf{A}^{-1}$ with

$$\mathbf{A}^{-1}\mathbf{A} = \mathbf{A}\,\mathbf{A}^{-1} = \mathbf{I}.$$

When such an $\mathbf{A}^{-1}$ exists, the system $\mathbf{A}\mathbf{x} = \mathbf{b}$ has a unique solution for every right-hand side $\mathbf{b}$, obtained by multiplying both sides from the left by $\mathbf{A}^{-1}$:

$$\mathbf{A}\mathbf{x} = \mathbf{b} \quad\Longrightarrow\quad \mathbf{A}^{-1}\mathbf{A}\mathbf{x} = \mathbf{A}^{-1}\mathbf{b} \quad\Longrightarrow\quad \mathbf{x} = \mathbf{A}^{-1}\mathbf{b}.$$

This single formula encodes both properties discussed above:

Existence — a solution $\mathbf{x} = \mathbf{A}^{-1}\mathbf{b}$ can be written down for any $\mathbf{b}$, which is exactly the statement $C(\mathbf{A}) = \mathbb{R}^n$.
Uniqueness — the solution is the value $\mathbf{A}^{-1}\mathbf{b}$ and no other. If $\mathbf{x}_1$ and $\mathbf{x}_2$ both solved the system, then $\mathbf{A}(\mathbf{x}_1 - \mathbf{x}_2) = \mathbf{0}$, so $\mathbf{x}_1 - \mathbf{x}_2 \in N(\mathbf{A})$; invertibility forces $N(\mathbf{A}) = \{\mathbf{0}\}$, hence $\mathbf{x}_1 = \mathbf{x}_2$.

Conversely, if $\mathbf{A}$ is not invertible (a square matrix with $\text{rank}(\mathbf{A}) < n$), then $\mathbf{A}^{-1}$ does not exist and the formula $\mathbf{x} = \mathbf{A}^{-1}\mathbf{b}$ is unavailable. Depending on $\mathbf{b}$, the system then has either no solution (if $\mathbf{b} \notin C(\mathbf{A})$) or infinitely many (if $\mathbf{b} \in C(\mathbf{A})$, since any $\mathbf{x}_{null} \in N(\mathbf{A})$ can be added to a solution).
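The three possible outcomes can be told apart by comparing ranks. The function `classify_system` and the singular matrix `S` below are hypothetical, and the rank computations depend on NumPy's floating-point tolerance, so this is a sketch rather than a robust solver.

```python
import numpy as np

def classify_system(A, b):
    """Classify A x = b as having no, one, or infinitely many solutions."""
    n = A.shape[1]
    rank_A = np.linalg.matrix_rank(A)
    rank_aug = np.linalg.matrix_rank(np.column_stack([A, b]))
    if rank_aug > rank_A:
        return "no solution"                 # b is outside C(A)
    if rank_A == n:
        return "unique solution"             # N(A) contains only the zero vector
    return "infinitely many solutions"       # b in C(A) and N(A) is non-trivial

S = np.array([[1.0, 2.0],
              [2.0, 4.0]])                   # singular: rank 1 < n = 2

print(classify_system(S, np.array([1.0, 2.0])))          # infinitely many solutions
print(classify_system(S, np.array([1.0, 0.0])))          # no solution
print(classify_system(np.eye(2), np.array([1.0, 0.0])))  # unique solution
```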

Why a two-sided inverse needs a square, full-rank matrix. Reaching every $\mathbf{b}$ means $C(\mathbf{A}) = \mathbb{R}^m$, i.e. $\mathbf{A}$ is surjective (full row rank) — this is only the existence half. A true (two-sided) inverse also needs uniqueness, $N(\mathbf{A}) = \{\mathbf{0}\}$, and the two conditions together force $\mathbf{A}$ to be square. Looking at the non-square shapes shows why:

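Wide ($n > m$): even when every $\mathbf{b}$ can be reached (full row rank), the null space is non-trivial, because $n$ columns in $\mathbb{R}^m$ with $n > m$ cannot be linearly independent, so solutions are not unique. $\mathbf{A}$ then has at most a right inverse, not a two-sided one.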
Tall ($m > n$), such as the $3 \times 2$ example above: $\dim C(\mathbf{A}) \le n < m$, so most $\mathbf{b}$ are unreachable in the first place — existence already fails.
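The wide case can be made concrete with a small example. The $2 \times 3$ matrix `W` below is hypothetical and has full row rank; the right inverse is formed as $\mathbf{W}^\top(\mathbf{W}\mathbf{W}^\top)^{-1}$, which is one particular right inverse among many (NumPy assumed).

```python
import numpy as np

# Hypothetical wide matrix (m = 2, n = 3) with full row rank.
W = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])

# One right inverse: R = W^T (W W^T)^{-1}, valid because W W^T is invertible here.
R = W.T @ np.linalg.inv(W @ W.T)

print(np.allclose(W @ R, np.eye(2)))   # True:  R is a right inverse
print(np.allclose(R @ W, np.eye(3)))   # False: R is not a left inverse

# Non-uniqueness: a non-zero null-space vector can be added to any solution.
x_null = np.array([1.0, 1.0, -1.0])
print(np.allclose(W @ x_null, 0.0))    # True
```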

Note that $\mathbf{x} = \mathbf{A}^{-1}\mathbf{b}$ is a statement of existence and uniqueness, not a recipe for computation: in practice one solves $\mathbf{A}\mathbf{x} = \mathbf{b}$ by factorization (e.g. LU, QR) rather than by forming $\mathbf{A}^{-1}$ explicitly, which is more expensive and numerically less stable.
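In NumPy, this distinction corresponds to calling `np.linalg.solve`, which relies on an LU factorization with pivoting, instead of multiplying by `np.linalg.inv(A)`. The $2 \times 2$ system below is a hypothetical example; for such a small, well-conditioned matrix both routes agree, and the difference in cost and accuracy only shows up for large or ill-conditioned systems.

```python
import numpy as np

# Hypothetical invertible system, chosen for illustration.
A = np.array([[4.0, 1.0],
              [2.0, 3.0]])
b = np.array([1.0, 2.0])

# Preferred: solve the system directly via a factorization of A.
x_solve = np.linalg.solve(A, b)

# Formally equivalent, but forms the inverse explicitly.
x_inv = np.linalg.inv(A) @ b

print(np.allclose(x_solve, [0.1, 0.6]))   # True
print(np.allclose(x_solve, x_inv))        # True
print(np.allclose(A @ x_solve, b))        # True
```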
