Irreducible operators are very important in mathematics. For directed graphs, the adjacency matrix is irreducible if and only if the graph is strongly connected. Unfortunately, the canonical definition of irreducible matrices has bothered me for years. If you search for reducible on Wolfram, Wikipedia, or Planet Math you will get one of two definitions:
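To make the graph connection concrete, here is a minimal sketch (NumPy assumed; the helper name `is_irreducible` is mine) using the standard criterion that a nonnegative n x n matrix A is irreducible if and only if (I + A)^(n-1) is entrywise positive:

```python
import numpy as np

def is_irreducible(A):
    """Nonnegative A is irreducible iff (I + A)^(n-1) is entrywise positive."""
    n = A.shape[0]
    M = np.linalg.matrix_power(np.eye(n) + A, n - 1)
    return bool(np.all(M > 0))

# Directed 3-cycle 0 -> 1 -> 2 -> 0: strongly connected.
cycle = np.array([[0, 1, 0],
                  [0, 0, 1],
                  [1, 0, 0]], dtype=float)

# Drop the edge 2 -> 0: vertex 0 is no longer reachable from 2.
path = np.array([[0, 1, 0],
                 [0, 0, 1],
                 [0, 0, 0]], dtype=float)

print(is_irreducible(cycle))  # True
print(is_irreducible(path))   # False
```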
Definition 1: An n x n matrix A is said to be reducible if and only if, for some permutation matrix P, the matrix P^T A P is block upper triangular.
or
Definition 2: A square n x n matrix A = (a_ij) is called reducible if the indices 1, 2, ..., n can be divided into two disjoint nonempty sets i_1, i_2, ..., i_u and j_1, j_2, ..., j_v (with u + v = n) such that a_{i_α, j_β} = 0 for α = 1, 2, ..., u and β = 1, 2, ..., v.
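For small matrices, Definition 1 can be checked directly by brute force over all permutations; here is a sketch (NumPy assumed, function name and examples mine):

```python
import numpy as np
from itertools import permutations

def is_reducible_def1(A, tol=0.0):
    """Definition 1 by brute force: A is reducible iff some permutation
    P makes P^T A P block upper triangular, i.e. gives a zero lower-left
    (n-k) x k block for some 0 < k < n."""
    n = A.shape[0]
    for perm in permutations(range(n)):
        B = A[np.ix_(perm, perm)]  # this is P^T A P for that permutation
        for k in range(1, n):
            if np.all(np.abs(B[k:, :k]) <= tol):
                return True
    return False

A_red = np.array([[1., 2.],
                  [0., 3.]])   # already block upper triangular
A_irr = np.array([[0., 1.],
                  [1., 0.]])   # adjacency matrix of a 2-cycle

print(is_reducible_def1(A_red))  # True
print(is_reducible_def1(A_irr))  # False
```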
What does this even mean? Block upper triangular? Disjoint, non-empty sets of indices? What does this have to do with reducibility? These definitions are over-complicated and overly technical, and they miss the simple beauty of what it means for an operator to be reducible.
As a graduate student I was tasked with extending the Perron-Frobenius Theorem to a more generalized setting. I had to think about what reducibility meant, in general. Several times my thesis supervisor hinted at such a generalization, but it never came to me intuitively until a few years ago.
Here is what the definition of a reducible matrix should be:
Definition: An n x n matrix A is reducible if there exists a non-trivial subspace invariant under A, i.e. there exists a subspace Y ⊊ ℝ^n with Y ≠ {0} such that A(Y) ⊆ Y.
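A quick illustration of this definition (NumPy assumed, example matrix mine): for an upper-triangular A, the subspace Y of vectors whose last coordinate is zero is invariant, since A maps such vectors back into Y:

```python
import numpy as np

# An upper triangular matrix: Y = span(e1, e2), the vectors whose last
# coordinate is zero, is a non-trivial invariant subspace.
A = np.array([[2., 1., 5.],
              [0., 3., 7.],
              [0., 0., 4.]])

y = np.array([1., -2., 0.])  # an arbitrary element of Y
print(A @ y)                 # last coordinate is still 0, so A(Y) ⊆ Y
```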
Or, more generally for linear operators:
Definition: A linear operator T is reducible if there exists a non-trivial subspace invariant under T.
In the above more general definition, we are saying: if T : X → X is a linear operator, then T is reducible if there exists some non-trivial subspace Y ⊊ X such that T(Y) ⊆ Y. This means we can “reduce” the behaviour of T onto some well-defined subspace Y ⊊ X. Therefore, it may be possible to analyze T in a restricted (or reduced) subspace Y. Essentially, we’ve “reduced” the behaviour of T to a subspace of X.
Conversely, the above definition means that irreducible operators have no special behaviour on any proper subspace of X. That is, their behaviour cannot be further reduced; T requires the entire space X to express itself.
While I am using very loose language (what does it mean for an operator to express itself, and what is an operator’s “behaviour”?), it is hopefully intuitive that this definition carries more interpretive meaning.
For matrices, the invariant-subspace definition and the block upper-triangular / permutation-of-indices definitions of reducibility are equivalent. It’s a fairly straightforward proof: if a non-trivial invariant subspace exists, decompose your space into the direct sum of this subspace and its complement. Using a permutation matrix to transform your basis, it’s fairly easy to see that the matrix can be represented in block upper-triangular form (the proof comes down to showing that one of your blocks satisfies A_3 Y = 0, which, since Y is non-trivial, implies the block A_3 must be zero). Conversely, if your matrix is in block upper-triangular form, it’s easy to see there is an invariant subspace.
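This proof sketch can be seen numerically in the special case where the invariant subspace is spanned by coordinate vectors (a hypothetical example, NumPy assumed): permuting those coordinates to the front puts the matrix into block upper-triangular form.

```python
import numpy as np

# Y = span(e0, e2) is invariant under A: columns 0 and 2 have zeros in
# rows 1 and 3, so A maps Y back into Y.
A = np.array([[1., 2., 3., 4.],
              [0., 5., 0., 6.],
              [7., 8., 9., 1.],
              [0., 2., 0., 3.]])

# Permute the basis so the invariant indices {0, 2} come first.
perm = [0, 2, 1, 3]
P = np.eye(4)[:, perm]  # permutation matrix whose columns are e_perm
B = P.T @ A @ P         # change of basis; B[i, j] = A[perm[i], perm[j]]
print(B)                # lower-left 2 x 2 block is zero: block upper triangular
```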
The initial definitions are overly technical and miss the point of reducibility: that the behaviour of reducible operators can be simplified, and that irreducible operators require the entire space to express themselves. Further, the block upper-triangular and permutation-of-indices definitions do not translate well to more general settings. For example, what does upper-triangular form mean in L_p spaces? It turns out there is a way to define upper-triangularity in L_p spaces (see this book), but reducibility in the context of invariant subspaces is much more intuitive.
Please petition your local math authorities to adopt this much simpler definition. Thank you.