
Linear Algebra Done Right

preliminary version of fourth edition


6 November 2022
© 2022

Sheldon Axler
[email protected]

This document contains Chapter 5 of the future fourth edition of Linear Algebra
Done Right. Notice the nice results that follow from the increasing use of the
minimal polynomial. Suggestions for improvements are most welcome.

The fourth edition of Linear Algebra Done Right will be an Open Access book,
which means that the electronic version will be legally free to the world. The
print version will be published by Springer and will be reasonably priced.

If you will teach an appropriate class beginning in January 2023, please consider
using the fourth edition of Linear Algebra Done Right as your textbook. Instructors
can contact me for an electronic version of the book that can be freely distributed
to your students (the electronic version will not be publicly available until the
print version is published). Your class testing can help improve this book.

Because this is not necessarily the final version of Chapter 5, please do not post
this document elsewhere on the web, although it is fine to link to it.

𝐹𝑛 = (1/√5) [ ((1 + √5)/2)^𝑛 − ((1 − √5)/2)^𝑛 ]
Cover equation: Formula for 𝑛th Fibonacci number. Exercise 16 in Section 5D
derives this formula by diagonalizing an appropriate operator.
About the Author

Sheldon Axler was valedictorian of his high school in Miami, Florida. He received
his AB from Princeton University with highest honors, followed by a PhD in
Mathematics from the University of California at Berkeley.
As a postdoctoral Moore Instructor at MIT, Axler received a university-wide
teaching award. He was then an assistant professor, associate professor, and
professor at Michigan State University, where he received the first J. Sutherland
Frame Teaching Award and the Distinguished Faculty Award.
Axler received the Lester R. Ford Award for expository writing from the Math-
ematical Association of America in 1996, for a paper that eventually expanded into
this book. In addition to publishing numerous research papers, he is the author of
six mathematics textbooks, ranging from freshman to graduate level. Previous
editions of this book have been adopted as a textbook at over 350 universities and
colleges and have been translated into three languages.
Axler has served as Editor-in-Chief of the Mathematical Intelligencer and
Associate Editor of the American Mathematical Monthly. He has been a member
of the Council of the American Mathematical Society and a member of the Board
of Trustees of the Mathematical Sciences Research Institute. He has also served
on the editorial board of Springer’s series Undergraduate Texts in Mathemat-
ics, Graduate Texts in Mathematics, Universitext, and Springer Monographs in
Mathematics.
He is a Fellow of the American Mathematical Society and has been a recipient
of numerous grants from the National Science Foundation.
Axler joined San Francisco State University as Chair of the Mathematics
Department in 1997. He served as Dean of the College of Science & Engineering
from 2002 to 2015, when he returned to a regular faculty appointment as a
professor in the Mathematics Department.

Contents

About the Author v

Preface for Students xii

Preface for Instructors xiii

Acknowledgments xvii

Chapter 1
Vector Spaces 1
1A 𝐑𝑛 and 𝐂𝑛 2
Complex Numbers 2
Lists 4
𝐅𝑛 6
Digression on Fields 10
Exercises 1A 10
1B Definition of Vector Space 12
Exercises 1B 16
1C Subspaces 18
Sums of Subspaces 19
Direct Sums 21
Exercises 1C 24

Chapter 2
Finite-Dimensional Vector Spaces 27
2A Span and Linear Independence 28
Linear Combinations and Span 28
Linear Independence 31
Exercises 2A 37


2B Bases 39
Exercises 2B 42
2C Dimension 44
Exercises 2C 48

Chapter 3
Linear Maps 51
3A Vector Space of Linear Maps 52
Definition and Examples of Linear Maps 52
Algebraic Operations on ℒ(𝑉, 𝑊) 55
Exercises 3A 57
3B Null Spaces and Ranges 59
Null Space and Injectivity 59
Range and Surjectivity 61
Fundamental Theorem of Linear Maps 62
Exercises 3B 66
3C Matrices 69
Representing a Linear Map by a Matrix 69
Addition and Scalar Multiplication of Matrices 71
Matrix Multiplication 72
Exercises 3C 76
3D Invertibility and Isomorphic Vector Spaces 78
Invertible Linear Maps 78
Isomorphic Vector Spaces 82
Linear Maps Thought of as Matrix Multiplication 84
Exercises 3D 86
3E Products and Quotients of Vector Spaces 88
Products of Vector Spaces 88
Products and Direct Sums 89
Quotients of Vector Spaces 90
Exercises 3E 95
3F Duality 97
Dual Space and Dual Map 97
Null Space and Range of Dual of Linear Map 100
Matrix of Dual of Linear Map 105


Rank of Matrix 107


Exercises 3F 109

Chapter 4
Polynomials 113
Uniqueness of Coefficients for Polynomials 116
Division Algorithm for Polynomials 116
Zeros of Polynomials 118
Factorization of Polynomials over 𝐂 119
Factorization of Polynomials over 𝐑 123
Exercises 4 125

Chapter 5
Eigenvalues and Eigenvectors 128
5A Invariant Subspaces 129
Eigenvalues 129
Polynomials Applied to Operators 133
Exercises 5A 135
5B Eigenvalues and the Minimal Polynomial 139
Existence of Eigenvalues 139
Minimal Polynomial 140
Exercises 5B 145
5C Upper-Triangular Matrices 148
Exercises 5C 155
5D Diagonalizable Operators 156
Exercises 5D 163
5E Commuting Operators 166
Exercises 5E 170

Chapter 6
Inner Product Spaces 172
6A Inner Products and Norms 173
Inner Products 173
Norms 177
Exercises 6A 182


6B Orthonormal Bases 188


Orthonormal Lists and the Gram–Schmidt Procedure 188
Linear Functionals on Inner Product Spaces 195
Exercises 6B 198
6C Orthogonal Complements and Minimization Problems 202
Orthogonal Complements 202
Minimization Problems 208
Pseudoinverse 211
Exercises 6C 214

Chapter 7
Operators on Inner Product Spaces 217
7A Self-Adjoint and Normal Operators 218
Adjoints 218
Self-Adjoint Operators 223
Normal Operators 225
Exercises 7A 229
7B Spectral Theorem 232
Real Spectral Theorem 232
Complex Spectral Theorem 235
Exercises 7B 236
7C Positive Operators 239
Exercises 7C 243
7D Isometries, Unitary Operators, and 𝑄𝑅 Factorization 245
Isometries 245
Unitary Operators 247
𝑄𝑅 Factorization 250
Exercises 7D 253
7E Singular Value Decomposition 255
Singular Values 255
SVD for Linear Maps and for Matrices 258
Exercises 7E 263
7F Consequences of the Singular Value Decomposition 265
Norms of Linear Maps 265
Approximation by Linear Maps with Lower-Dimensional Range 268


Polar Decomposition 270


Volume Via Singular Values 272
Exercises 7F 272

Chapter 8
Operators on Complex Vector Spaces 275
8A Generalized Eigenvectors and Nilpotent Operators 276
Null Spaces of Powers of an Operator 276
Generalized Eigenvectors 278
Nilpotent Operators 283
Exercises 8A 285
8B Decomposition of an Operator 288
Description of Operators on Complex Vector Spaces 288
Multiplicity of an Eigenvalue 289
Block Diagonal Matrices 291
Square Roots of Operators 293
Exercises 8B 295
8C The Characteristic Polynomial 297
Cayley–Hamilton Theorem 297
Exercises 8C 298
8D Jordan Form 301
Exercises 8D 304

Chapter 9
Operators on Real Vector Spaces 306
9A Complexification 307
Complexification of a Vector Space 307
Complexification of an Operator 308
Minimal Polynomial of Complexification 310
Eigenvalues of Complexification 310
Characteristic Polynomial of Complexification 313
Exercises 9A 315
9B Operators on Real Inner Product Spaces 318
Normal Operators on Real Inner Product Spaces 318
Unitary Operators on Real Inner Product Spaces 322
Exercises 9B 324


Chapter 10
Trace and Determinant 326
10A Trace 327
Change of Basis 327
Trace: A Connection Between Operators and Matrices 330
Exercises 10A 335
10B Determinant 337
Determinant of an Operator 337
Determinant of a Matrix 339
Sign of Determinant 349
Volume Via Determinants 351
Exercises 10B 359

Chapter 11
Multilinear Algebra and Tensors 362
11A Multilinear Algebra 363
11B Tensors 364

Photo Credits 365

Symbol Index 366

Index 367

Colophon: Notes on Typesetting 371

Major changes for the fourth edition:

• Increasing use of the minimal polynomial to provide cleaner proofs of multiple
results, including necessary and sufficient conditions for an operator to have an
upper-triangular matrix with respect to some basis (see Section 5C), necessary
and sufficient conditions for diagonalizability (see Section 5D), and the real
spectral theorem (see Section 7B).
• New section on commuting operators (see Section 5E).

• New subsection on pseudoinverse (see Section 6C).

• New subsection on 𝑄𝑅 factorization (see Section 7D).

• Singular value decomposition now done for linear maps from an inner product
space to another (possibly different) inner product space, rather than only deal-
ing with linear operators from an inner product space to itself (see Section 7E).
• Polar decomposition now proved from singular value decomposition, rather
than in the opposite order; this has led to cleaner proofs of both the singu-
lar value decomposition (see Section 7E) and the polar decomposition (see
Section 7F).
• New subsection on norms of linear maps on finite-dimensional inner product
spaces, using the singular value decomposition to avoid even mentioning
supremum in the definition of the norm of a linear map (see Section 7F).
• New subsection on approximation by linear maps with lower-dimensional
range (see Section 7F).
Chapter 5
Eigenvalues and Eigenvectors

Linear maps from one vector space to another vector space were the objects of
study in Chapter 3. Now we begin our investigation of operators, which are linear
maps from a vector space to itself. Their study constitutes the most important
part of linear algebra.
To learn about an operator, we might try restricting it to a smaller subspace.
Asking for that restriction to be an operator will lead us to the notion of invariant
subspaces. Each one-dimensional invariant subspace arises from a vector that
the operator maps into a scalar multiple of the vector. This path will lead us to
eigenvectors and eigenvalues.
We will then prove one of the most important results in linear algebra: every
operator on a finite-dimensional, nonzero, complex vector space has an eigenvalue.
This result will allow us to show that for each operator on a finite-dimensional
complex vector space, there is a basis of the vector space with respect to which
the matrix of the operator has at least almost half its entries equal to 0.
Note that in this chapter we assume that 𝑉 is finite-dimensional.

standing assumptions for this chapter

• 𝐅 denotes 𝐑 or 𝐂 .
• 𝑉 and 𝑊 denote vector spaces over 𝐅 , and 𝑉 is finite-dimensional.
Hans-Peter Postel CC BY

Statue of Leonardo of Pisa (1170–1250, approximate dates), also known as Fibonacci.


Exercise 16 in Section 5D shows how linear algebra can be used to find
an explicit formula for the Fibonacci sequence.

5A Invariant Subspaces
Eigenvalues

5.1 definition: operator

A linear map from a vector space to itself is called an operator.

Suppose 𝑇 ∈ ℒ(𝑉). If 𝑚 ≥ 2 and

𝑉 = 𝑉1 ⊕ ⋯ ⊕ 𝑉𝑚 ,

where each 𝑉𝑘 is a nonzero subspace of 𝑉, then to understand the behavior of 𝑇 we
need only understand the behavior of each 𝑇|𝑉𝑘 ; here 𝑇|𝑉𝑘 denotes the restriction
of 𝑇 to the smaller domain 𝑉𝑘 . Dealing with 𝑇|𝑉𝑘 should be easier than dealing
with 𝑇 because 𝑉𝑘 is a smaller vector space than 𝑉. (Recall that we defined the
notation ℒ(𝑉) to mean ℒ(𝑉, 𝑉).)
However, if we intend to apply tools useful in the study of operators (such
as taking powers), then we have a problem: 𝑇|𝑉𝑘 may not map 𝑉𝑘 into itself; in
other words, 𝑇|𝑉𝑘 may not be an operator on 𝑉𝑘 . Thus we are led to consider only
decompositions of 𝑉 of the form above where 𝑇 maps each 𝑉𝑘 into itself. Hence
we now give a name to subspaces of 𝑉 that get mapped into themselves by 𝑇.

5.2 definition: invariant subspace

Suppose 𝑇 ∈ ℒ(𝑊). A subspace 𝑈 of 𝑊 is called invariant under 𝑇 if 𝑇𝑢 ∈ 𝑈


for every 𝑢 ∈ 𝑈.

In other words, 𝑈 is invariant under 𝑇 if 𝑇|𝑈 is an operator on 𝑈.

5.3 example: invariant subspace of differentiation operator


Suppose that 𝑇 ∈ ℒ(𝒫(𝐑)) is defined by 𝑇𝑝 = 𝑝′. Then 𝒫4 (𝐑), which is a
subspace of 𝒫(𝐑), is invariant under 𝑇 because if 𝑝 ∈ 𝒫(𝐑) has degree at most 4,
then 𝑝′ also has degree at most 4.

5.4 example: four invariant subspaces, not necessarily all different


If 𝑇 ∈ ℒ(𝑊), then the following subspaces are all invariant under 𝑇:
{0} The subspace {0} is invariant under 𝑇 because if 𝑢 ∈ {0}, then 𝑢 = 0
and hence 𝑇𝑢 = 0 ∈ {0}.
𝑊 The subspace 𝑊 is invariant under 𝑇 because if 𝑢 ∈ 𝑊, then 𝑇𝑢 ∈ 𝑊.
null 𝑇 The subspace null 𝑇 is invariant under 𝑇 because if 𝑢 ∈ null 𝑇, then
𝑇𝑢 = 0, and hence 𝑇𝑢 ∈ null 𝑇.
range 𝑇 The subspace range 𝑇 is invariant under 𝑇 because if 𝑢 ∈ range 𝑇,
then 𝑇𝑢 ∈ range 𝑇.


Must an operator 𝑇 ∈ ℒ(𝑉) have any invariant subspaces other than {0}
and 𝑉? Later we will see that this question has an affirmative answer if 𝑉 is
finite-dimensional and dim 𝑉 > 1 (for 𝐅 = 𝐂 ) or dim 𝑉 > 2 (for 𝐅 = 𝐑 ); see
5.21 and 9.7. (The most famous unsolved problem in functional analysis is called
the invariant subspace problem. This problem deals with invariant subspaces of
operators on infinite-dimensional vector spaces.)
Although null 𝑇 and range 𝑇 are invariant under 𝑇, they do not necessarily
provide easy answers to the question above about the existence of invariant sub-
spaces other than {0} and 𝑉, because null 𝑇 may equal {0} and range 𝑇 may equal
𝑉 (this happens when 𝑇 is invertible).
We will return later to a deeper study of invariant subspaces. Now we turn to
an investigation of the simplest possible nontrivial invariant subspaces—invariant
subspaces with dimension 1.
Take any 𝑣 ∈ 𝑊 with 𝑣 ≠ 0 and let 𝑈 equal the set of all scalar multiples of 𝑣:

𝑈 = {𝜆𝑣 ∶ 𝜆 ∈ 𝐅} = span(𝑣).

Then 𝑈 is a one-dimensional subspace of 𝑊 (and every one-dimensional subspace


of 𝑊 is of this form for an appropriate choice of 𝑣). If 𝑈 is invariant under an
operator 𝑇 ∈ ℒ(𝑊), then 𝑇𝑣 ∈ 𝑈, and hence there is a scalar 𝜆 ∈ 𝐅 such that

𝑇𝑣 = 𝜆𝑣.

Conversely, if 𝑇𝑣 = 𝜆𝑣 for some 𝜆 ∈ 𝐅 , then span(𝑣) is a one-dimensional


subspace of 𝑊 invariant under 𝑇.
The equation 𝑇𝑣 = 𝜆𝑣, which we have just seen is intimately connected with
one-dimensional invariant subspaces, is important enough that the vectors 𝑣 and
scalars 𝜆 satisfying it are given special names.

5.5 definition: eigenvalue

Suppose 𝑇 ∈ ℒ(𝑊). A number 𝜆 ∈ 𝐅 is called an eigenvalue of 𝑇 if there


exists 𝑣 ∈ 𝑊 such that 𝑣 ≠ 0 and 𝑇𝑣 = 𝜆𝑣.

In the definition above, we require that 𝑣 ≠ 0 because every scalar 𝜆 ∈ 𝐅
satisfies 𝑇0 = 𝜆0. (The word eigenvalue is half-German, half-English. The
German adjective eigen means “own” in the sense of characterizing an intrinsic
property.)
The comments above show that 𝑊 has a one-dimensional subspace invariant
under 𝑇 if and only if 𝑇 has an eigenvalue.

5.6 example: eigenvalue


Define 𝑇 ∈ ℒ(𝐑3 ) by 𝑇(𝑥, 𝑦, 𝑧) = (7𝑥 + 3𝑧, 3𝑥 + 6𝑦 + 9𝑧, −6𝑦). Then
𝑇(3, 1, −1) = (18, 6, −6) = 6(3, 1, −1). Thus 6 is an eigenvalue of 𝑇.
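
This calculation is easy to reproduce numerically. Here is a minimal sketch,
assuming Python with numpy (not part of the text):

    import numpy as np

    # Matrix of T(x, y, z) = (7x + 3z, 3x + 6y + 9z, -6y) with respect
    # to the standard basis of R^3.
    A = np.array([[7, 0, 3],
                  [3, 6, 9],
                  [0, -6, 0]], dtype=float)
    v = np.array([3.0, 1.0, -1.0])
    print(A @ v)  # [18. 6. -6.], which equals 6 * v, so 6 is an eigenvalue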


Recall our standing assumption for this chapter that 𝑉 is finite-dimensional,


which is needed for our next result.

5.7 equivalent conditions to be an eigenvalue

Suppose 𝑇 ∈ ℒ(𝑉) and 𝜆 ∈ 𝐅 . Then the following are equivalent.

(a) 𝜆 is an eigenvalue of 𝑇;
(b) 𝑇 − 𝜆𝐼 is not injective;
(c) 𝑇 − 𝜆𝐼 is not surjective;
(d) 𝑇 − 𝜆𝐼 is not invertible.

(Recall that 𝐼 ∈ ℒ(𝑉) is the identity operator defined by 𝐼𝑣 = 𝑣 for all 𝑣 ∈ 𝑉.)

Proof Conditions (a) and (b) are equivalent because the equation 𝑇𝑣 = 𝜆𝑣
is equivalent to the equation (𝑇 − 𝜆𝐼)𝑣 = 0. Conditions (b), (c), and (d) are
equivalent by 3.53.

5.8 definition: eigenvector

Suppose 𝑇 ∈ ℒ(𝑊) and 𝜆 ∈ 𝐅 is an eigenvalue of 𝑇. A vector 𝑣 ∈ 𝑊 is


called an eigenvector of 𝑇 corresponding to 𝜆 if 𝑣 ≠ 0 and 𝑇𝑣 = 𝜆𝑣.

Because 𝑇𝑣 = 𝜆𝑣 if and only if (𝑇 − 𝜆𝐼)𝑣 = 0, a vector 𝑣 ∈ 𝑊 with 𝑣 ≠ 0 is


an eigenvector of 𝑇 corresponding to 𝜆 if and only if 𝑣 ∈ null(𝑇 − 𝜆𝐼).

5.9 example: eigenvalues and eigenvectors


Suppose 𝑇 ∈ ℒ(𝐅2 ) is defined by 𝑇(𝑤, 𝑧) = (−𝑧, 𝑤).
(a) First consider the case where 𝐅 = 𝐑 . Then 𝑇 is a counterclockwise rotation
by 90∘ about the origin in 𝐑2. An operator has an eigenvalue if and only if
there exists a nonzero vector in its domain that gets sent by the operator to a
scalar multiple of itself. A 90∘ counterclockwise rotation of a nonzero vector
in 𝐑2 obviously never equals a scalar multiple of itself. Conclusion: if 𝐅 = 𝐑 ,
then 𝑇 has no eigenvalues (and thus has no eigenvectors).
(b) Now consider the case where 𝐅 = 𝐂 . To find eigenvalues of 𝑇, we must find
the scalars 𝜆 such that
𝑇(𝑤, 𝑧) = 𝜆(𝑤, 𝑧)
has some solution other than 𝑤 = 𝑧 = 0. The equation above is equivalent to
the simultaneous equations

5.10 − 𝑧 = 𝜆𝑤, 𝑤 = 𝜆𝑧.

Substituting the value for 𝑤 given by the second equation into the first equation
gives
−𝑧 = 𝜆2 𝑧.


Now 𝑧 cannot equal 0 [otherwise 5.10 implies that 𝑤 = 0; we are looking for
solutions to 5.10 where (𝑤, 𝑧) is not the 0 vector], so the equation above leads
to the equation −1 = 𝜆2. The solutions to this equation are 𝜆 = 𝑖 and 𝜆 = −𝑖.
You should be able to verify easily that 𝑖 and −𝑖 are eigenvalues of 𝑇. Indeed,
the eigenvectors corresponding to the eigenvalue 𝑖 are the vectors of the form
(𝑤, −𝑤𝑖), with 𝑤 ∈ 𝐂 and 𝑤 ≠ 0, and the eigenvectors corresponding to the
eigenvalue −𝑖 are the vectors of the form (𝑤, 𝑤𝑖), with 𝑤 ∈ 𝐂 and 𝑤 ≠ 0.
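
Both conclusions are easy to check numerically. A minimal sketch, assuming
Python with numpy (not part of the text):

    import numpy as np

    # Matrix of T(w, z) = (-z, w) with respect to the standard basis of F^2.
    A = np.array([[0.0, -1.0],
                  [1.0, 0.0]])
    eigenvalues, eigenvectors = np.linalg.eig(A)
    print(eigenvalues)  # approximately [i, -i]; neither is real
    for lam, v in zip(eigenvalues, eigenvectors.T):
        print(np.allclose(A @ v, lam * v))  # True for each eigenpair

The computed eigenvalues are nonreal, consistent with (a); over 𝐂 they are 𝑖
and −𝑖, consistent with (b).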

Now we show that each list of eigenvectors corresponding to distinct eigenval-


ues of an operator is linearly independent.

5.11 linearly independent eigenvectors

Let 𝑇 ∈ ℒ(𝑊). Suppose 𝜆1 , …, 𝜆𝑚 are distinct eigenvalues of 𝑇 and 𝑣1 , …, 𝑣𝑚


are corresponding eigenvectors. Then 𝑣1 , …, 𝑣𝑚 is linearly independent.

Proof Suppose 𝑣1 , …, 𝑣𝑚 is linearly dependent. Let 𝑘 be the smallest positive


integer such that
5.12 𝑣𝑘 ∈ span(𝑣1 , …, 𝑣𝑘 − 1 );
the existence of 𝑘 with this property follows from the linear dependence lemma
(2.19). Thus there exist 𝑎1 , …, 𝑎𝑘 − 1 ∈ 𝐅 such that
5.13 𝑣𝑘 = 𝑎1 𝑣1 + ⋯ + 𝑎𝑘 − 1 𝑣𝑘 − 1 .
Apply 𝑇 to both sides of this equation, getting
𝜆𝑘 𝑣𝑘 = 𝑎1 𝜆1 𝑣1 + ⋯ + 𝑎𝑘 − 1 𝜆𝑘 − 1 𝑣𝑘 − 1 .
Multiply both sides of 5.13 by 𝜆𝑘 and then subtract the equation above, getting
0 = 𝑎1 (𝜆𝑘 − 𝜆1 )𝑣1 + ⋯ + 𝑎𝑘 − 1 (𝜆𝑘 − 𝜆𝑘 − 1 )𝑣𝑘 − 1 .
Because we chose 𝑘 to be the smallest positive integer satisfying 5.12, 𝑣1 , …, 𝑣𝑘 − 1
is linearly independent. Thus the equation above implies that
𝑎1 (𝜆𝑘 − 𝜆1 ) = ⋯ = 𝑎𝑘 − 1 (𝜆𝑘 − 𝜆𝑘 − 1 ) = 0.
Because 𝜆1 , …, 𝜆𝑘 are distinct numbers, this implies that 𝑎1 = ⋯ = 𝑎𝑘 − 1 = 0.
Now 5.13 implies that 𝑣𝑘 = 0, contradicting our hypothesis that 𝑣𝑘 is an eigenvector.
Therefore our assumption that 𝑣1 , …, 𝑣𝑚 is linearly dependent was false.

5.14 operator cannot have more eigenvalues than dimension of vector space

Each operator on 𝑉 has at most dim 𝑉 distinct eigenvalues.

Proof Let 𝑇 ∈ ℒ(𝑉). Suppose 𝜆1 , …, 𝜆𝑚 are distinct eigenvalues of 𝑇. Let


𝑣1 , …, 𝑣𝑚 be corresponding eigenvectors. Then 5.11 implies that the list 𝑣1 , …, 𝑣𝑚
is linearly independent. Thus 𝑚 ≤ dim 𝑉 (see 2.22), as desired.


Polynomials Applied to Operators


The main reason that a richer theory exists for operators (which map a vector
space into itself) than for more general linear maps is that operators can be raised
to powers. In this subsection we define that notion and the key concept of applying
a polynomial to an operator. This concept will be the main tool that we use in the
next section when we prove that every operator on a nonzero finite-dimensional
complex vector space has an eigenvalue.
If 𝑇 is an operator, then 𝑇𝑇 makes sense and is also an operator on the same
vector space as 𝑇. We usually write 𝑇 2 instead of 𝑇𝑇. More generally, we have
the following definition of 𝑇 𝑚 .

5.15 notation: 𝑇 𝑚

Suppose 𝑇 ∈ ℒ(𝑊) and 𝑚 is a positive integer.


• 𝑇 𝑚 is defined by 𝑇 𝑚 = 𝑇⋯𝑇 (with 𝑇 appearing 𝑚 times).

• 𝑇 0 is defined to be the identity operator 𝐼 on 𝑊.

• If 𝑇 is invertible with inverse 𝑇 −1 , then 𝑇 −𝑚 is defined by 𝑇 −𝑚 = (𝑇 −1 )𝑚 .

You should verify that if 𝑇 is an operator, then


𝑇 𝑚 𝑇 𝑛 = 𝑇 𝑚 + 𝑛 and (𝑇 𝑚 )𝑛 = 𝑇 𝑚𝑛 ,
where 𝑚 and 𝑛 are arbitrary integers if 𝑇 is invertible and are nonnegative integers
if 𝑇 is not invertible.
Having defined powers of an operator, we can now define what it means to
apply a polynomial to an operator.

5.16 notation: 𝑝(𝑇)

Suppose 𝑇 ∈ ℒ(𝑊) and 𝑝 ∈ 𝒫(𝐅) is a polynomial given by

𝑝(𝑧) = 𝑎0 + 𝑎1 𝑧 + 𝑎2 𝑧2 + ⋯ + 𝑎𝑚 𝑧𝑚

for 𝑧 ∈ 𝐅 . Then 𝑝(𝑇) is the operator defined by

𝑝(𝑇) = 𝑎0 𝐼 + 𝑎1 𝑇 + 𝑎2 𝑇 2 + ⋯ + 𝑎𝑚 𝑇 𝑚.

This is a new use of the symbol 𝑝 because we are applying 𝑝 to operators, not
just elements of 𝐅 . The idea here is that to evaluate 𝑝(𝑇), we simply replace 𝑧 with
𝑇 in the expression defining 𝑝. Note that the constant term 𝑎0 in 𝑝(𝑧) becomes the
operator 𝑎0 𝐼 (which is a reasonable choice because 𝑎0 = 𝑎0 𝑧0 and thus we should
replace 𝑎0 with 𝑎0 𝑇 0 , which equals 𝑎0 𝐼).


5.17 example: a polynomial applied to the differentiation operator


Suppose 𝐷 ∈ ℒ(𝒫(𝐑)) is the differentiation operator defined by 𝐷𝑞 = 𝑞′ and
𝑝 is the polynomial defined by 𝑝(𝑥) = 7 − 3𝑥 + 5𝑥2. Then 𝑝(𝐷) = 7𝐼 − 3𝐷 + 5𝐷2.
Thus
(𝑝(𝐷))𝑞 = 7𝑞 − 3𝑞′ + 5𝑞″
for every 𝑞 ∈ 𝒫(𝐑).
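
For an operator given by a matrix, 𝑝(𝑇) can be computed directly from the
definition in 5.16. Here is a minimal sketch, assuming Python with numpy (the
function name apply_poly is ours, not the text's); it uses Horner's rule, so one
matrix multiplication is needed per coefficient:

    import numpy as np

    def apply_poly(coeffs, T):
        # Evaluate p(T) = a_0 I + a_1 T + ... + a_m T^m,
        # where coeffs = [a_0, a_1, ..., a_m], via Horner's rule.
        result = np.zeros_like(T)
        for a in reversed(coeffs):
            result = result @ T + a * np.eye(T.shape[0])
        return result

    A = np.array([[1.0, 2.0],
                  [0.0, 3.0]])  # a sample operator on F^2
    # p(x) = 7 - 3x + 5x^2, the polynomial from the example above
    assert np.allclose(apply_poly([7, -3, 5], A),
                       7*np.eye(2) - 3*A + 5*(A @ A))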

If we fix an operator 𝑇 ∈ ℒ(𝑊), then the function from 𝒫(𝐅) to ℒ(𝑊) given
by 𝑝 ↦ 𝑝(𝑇) is linear, as you should verify.

5.18 definition: product of polynomials

If 𝑝, 𝑞 ∈ 𝒫(𝐅), then 𝑝𝑞 ∈ 𝒫(𝐅) is the polynomial defined by

(𝑝𝑞)(𝑧) = 𝑝(𝑧)𝑞(𝑧)

for 𝑧 ∈ 𝐅 .

Part (b) of the next result states that the order does not matter when taking
products of polynomials of a single operator.

5.19 multiplicative properties

Suppose 𝑝, 𝑞 ∈ 𝒫(𝐅) and 𝑇 ∈ ℒ(𝑊). Then
(a) (𝑝𝑞)(𝑇) = 𝑝(𝑇)𝑞(𝑇);
(b) 𝑝(𝑇)𝑞(𝑇) = 𝑞(𝑇)𝑝(𝑇).

(Informal proof: When expanding a product of polynomials using the distributive
property, it does not matter whether the symbol is 𝑧 or 𝑇.)

Proof
(a) Suppose 𝑝(𝑧) = ∑_{𝑗=0}^{𝑚} 𝑎𝑗 𝑧^𝑗 and 𝑞(𝑧) = ∑_{𝑘=0}^{𝑛} 𝑏𝑘 𝑧^𝑘 for 𝑧 ∈ 𝐅 . Then

(𝑝𝑞)(𝑧) = ∑_{𝑗=0}^{𝑚} ∑_{𝑘=0}^{𝑛} 𝑎𝑗 𝑏𝑘 𝑧^{𝑗 + 𝑘}.

Thus

(𝑝𝑞)(𝑇) = ∑_{𝑗=0}^{𝑚} ∑_{𝑘=0}^{𝑛} 𝑎𝑗 𝑏𝑘 𝑇^{𝑗 + 𝑘} = ( ∑_{𝑗=0}^{𝑚} 𝑎𝑗 𝑇^𝑗 )( ∑_{𝑘=0}^{𝑛} 𝑏𝑘 𝑇^𝑘 ) = 𝑝(𝑇)𝑞(𝑇).

(b) Part (a) implies 𝑝(𝑇)𝑞(𝑇) = (𝑝𝑞)(𝑇) = (𝑞𝑝)(𝑇) = 𝑞(𝑇)𝑝(𝑇).


We observed earlier that if 𝑇 ∈ ℒ(𝑉), then null 𝑇 and range 𝑇 are invariant
under 𝑇 [see 5.4]. Now we show that the null space and the range of each
polynomial of 𝑇 are also invariant under 𝑇.

5.20 null space and range of 𝑝(𝑇) are invariant under 𝑇

Suppose 𝑇 ∈ ℒ(𝑉) and 𝑝 ∈ 𝒫(𝐅). Then null 𝑝(𝑇) and range 𝑝(𝑇) are
invariant under 𝑇.

Proof Suppose 𝑢 ∈ null 𝑝(𝑇). Then 𝑝(𝑇)𝑢 = 0. Thus

(𝑝(𝑇))(𝑇𝑢) = 𝑇(𝑝(𝑇)𝑢) = 𝑇(0) = 0.

Hence 𝑇𝑢 ∈ null 𝑝(𝑇). Thus null 𝑝(𝑇) is invariant under 𝑇, as desired.


Suppose 𝑢 ∈ range 𝑝(𝑇). Then there exists 𝑣 ∈ 𝑉 such that 𝑢 = 𝑝(𝑇)𝑣. Thus

𝑇𝑢 = 𝑇(𝑝(𝑇)𝑣) = 𝑝(𝑇)(𝑇𝑣).

Hence 𝑇𝑢 ∈ range 𝑝(𝑇). Thus range 𝑝(𝑇) is invariant under 𝑇, as desired.

Exercises 5A

1 Suppose 𝑇 ∈ ℒ(𝑊) and 𝑈 is a subspace of 𝑊.


(a) Prove that if 𝑈 ⊂ null 𝑇, then 𝑈 is invariant under 𝑇.
(b) Prove that if range 𝑇 ⊂ 𝑈, then 𝑈 is invariant under 𝑇.

2 Suppose 𝑆, 𝑇 ∈ ℒ(𝑊) are such that 𝑆𝑇 = 𝑇𝑆.


(a) Prove that null 𝑆 is invariant under 𝑇.
(b) Prove that range 𝑆 is invariant under 𝑇.

3 Suppose that 𝑇 ∈ ℒ(𝑊) and 𝑊1 , …, 𝑊𝑚 are subspaces of 𝑊 invariant


under 𝑇. Prove that 𝑊1 + ⋯ + 𝑊𝑚 is invariant under 𝑇.
4 Suppose 𝑇 ∈ ℒ(𝑊). Prove that the intersection of every collection of
subspaces of 𝑊 invariant under 𝑇 is invariant under 𝑇.
5 Prove or give a counterexample: If 𝑈 is a subspace of 𝑉 that is invariant under
every operator on 𝑉, then 𝑈 = {0} or 𝑈 = 𝑉.
6 Suppose 𝑇 ∈ ℒ(𝐑2 ) is defined by 𝑇(𝑥, 𝑦) = (−3𝑦, 𝑥). Find the eigenvalues
of 𝑇.
7 Define 𝑇 ∈ ℒ(𝐅2 ) by
𝑇(𝑤, 𝑧) = (𝑧, 𝑤).
Find all eigenvalues and eigenvectors of 𝑇.


8 Define 𝑇 ∈ ℒ(𝐅3 ) by

𝑇(𝑧1 , 𝑧2 , 𝑧3 ) = (2𝑧2 , 0, 5𝑧3 ).

Find all eigenvalues and eigenvectors of 𝑇.


9 Define 𝑇 ∶ 𝒫(𝐑) → 𝒫(𝐑) by 𝑇𝑝 = 𝑝′. Find all eigenvalues and eigenvectors
of 𝑇.
10 Define 𝑇 ∈ ℒ(𝒫4 (𝐑)) by

(𝑇𝑝)(𝑥) = 𝑥𝑝′(𝑥)

for all 𝑥 ∈ 𝐑 . Find all eigenvalues and eigenvectors of 𝑇.


11 Suppose 𝑇 ∈ ℒ(𝑉) and 𝜆 ∈ 𝐅 . Prove that there exists 𝛼 ∈ 𝐅 such that
|𝛼 − 𝜆| < 1/1000 and 𝑇 − 𝛼𝐼 is invertible.
12 Suppose 𝑊 = 𝑋 ⊕ 𝑌, where 𝑋 and 𝑌 are nonzero subspaces of 𝑊. Define
𝑃 ∈ ℒ(𝑊) by 𝑃(𝑥 + 𝑦) = 𝑥 for 𝑥 ∈ 𝑋 and 𝑦 ∈ 𝑌. Find all eigenvalues and
eigenvectors of 𝑃.
13 Suppose 𝑇 ∈ ℒ(𝑊). Suppose 𝑆 ∈ ℒ(𝑊) is invertible.
(a) Prove that 𝑇 and 𝑆−1 𝑇𝑆 have the same eigenvalues.
(b) What is the relationship between the eigenvectors of 𝑇 and the eigen-
vectors of 𝑆−1 𝑇𝑆?

14 Suppose 𝑉 is a complex vector space, 𝑇 ∈ ℒ(𝑉), and the matrix of 𝑇 with


respect to some basis of 𝑉 contains only real entries. Show that if 𝜆 is an
eigenvalue of 𝑇, then so is 𝜆.
15 Give an example of an operator on 𝐑4 that has no (real) eigenvalues.
16 Suppose 𝑣1 , …, 𝑣𝑛 is a basis of 𝑉 and 𝑇 ∈ ℒ(𝑉). Prove that if 𝜆 is an
eigenvalue of 𝑇, then

|𝜆| ≤ 𝑛 max{|ℳ(𝑇)𝑗, 𝑘 | ∶ 1 ≤ 𝑗, 𝑘 ≤ 𝑛},

where ℳ(𝑇)𝑗, 𝑘 denotes the entry in row 𝑗, column 𝑘 of the matrix of 𝑇 with
respect to the basis 𝑣1 , …, 𝑣𝑛 .
For a different bound on the absolute value of an eigenvalue of an operator 𝑇,
see Exercise 18 in Section 6A.

17 Show that the forward shift operator 𝑇 ∈ ℒ(𝐅∞ ) defined by

𝑇(𝑧1 , 𝑧2 , … ) = (0, 𝑧1 , 𝑧2 , … )

has no eigenvalues.


18 Define the backward shift operator 𝑆 ∈ ℒ(𝐅∞ ) by


𝑆(𝑧1 , 𝑧2 , 𝑧3 , … ) = (𝑧2 , 𝑧3 , … ).
(a) Show that every element of 𝐅 is an eigenvalue of 𝑆.
(b) Find all the eigenvectors of 𝑆.

19 Suppose 𝑇 ∈ ℒ(𝑊) is invertible.


(a) Suppose 𝜆 ∈ 𝐅 with 𝜆 ≠ 0. Prove that 𝜆 is an eigenvalue of 𝑇 if and
only if 1/𝜆 is an eigenvalue of 𝑇 −1 .
(b) Prove that 𝑇 and 𝑇 −1 have the same eigenvectors.

20 Suppose 𝑇 ∈ ℒ(𝑊) and there exist nonzero vectors 𝑣 and 𝑤 in 𝑊 such that
𝑇𝑣 = 3𝑤 and 𝑇𝑤 = 3𝑣.
Prove that 3 or −3 is an eigenvalue of 𝑇.
21 Suppose 𝑆, 𝑇 ∈ ℒ(𝑉). Prove that 𝑆𝑇 and 𝑇𝑆 have the same eigenvalues.
22 Suppose 𝑅, 𝑆, 𝑇 ∈ ℒ(𝑉). Prove or give a counterexample: 𝑅𝑆𝑇 and 𝑅𝑇𝑆
have the same eigenvalues.
23 Suppose 𝐴 is an 𝑛-by-𝑛 matrix with entries in 𝐅 . Define 𝑇 ∈ ℒ(𝐅𝑛 ) by
𝑇𝑥 = 𝐴𝑥, where elements of 𝐅𝑛 are thought of as 𝑛-by-1 column vectors.
(a) Suppose the sum of the entries in each row of 𝐴 equals 1. Prove that 1
is an eigenvalue of 𝑇.
(b) Suppose the sum of the entries in each column of 𝐴 equals 1. Prove that
1 is an eigenvalue of 𝑇.

24 Suppose 𝑇 ∈ ℒ(𝑊) and 𝑢, 𝑣 are eigenvectors of 𝑇 such that 𝑢 + 𝑣 is also an


eigenvector of 𝑇. Prove that 𝑢 and 𝑣 are eigenvectors of 𝑇 corresponding to
the same eigenvalue.
25 Suppose 𝑇 ∈ ℒ(𝑊) is such that every nonzero vector in 𝑊 is an eigenvector
of 𝑇. Prove that 𝑇 is a scalar multiple of the identity operator.
26 Suppose 𝑛 = dim 𝑉 and 𝑘 ∈ {1, …, 𝑛 − 1}. Suppose 𝑇 ∈ ℒ(𝑉) is such that
every subspace of 𝑉 with dimension 𝑘 is invariant under 𝑇. Prove that 𝑇 is a
scalar multiple of the identity operator.
27 Suppose 𝑇 ∈ ℒ(𝑊). Prove that 𝑇 has at most 1 + dim range 𝑇 distinct
eigenvalues.

28 Suppose 𝑇 ∈ ℒ(𝐑3 ) and −4, 5, and √7 are eigenvalues of 𝑇. Prove that


there exists 𝑥 ∈ 𝐑3 such that 𝑇𝑥 − 9𝑥 = (−4, 5, √7).
29 Suppose 𝑇 ∈ ℒ(𝑊) and (𝑇 − 2𝐼)(𝑇 − 3𝐼)(𝑇 − 4𝐼) = 0. Suppose 𝜆 is an
eigenvalue of 𝑇. Prove that 𝜆 = 2 or 𝜆 = 3 or 𝜆 = 4.


30 Suppose 𝑇 ∈ ℒ(𝑊) has no eigenvalues and 𝑇 4 = 𝐼. Prove that 𝑇 2 = −𝐼.


31 Suppose 𝑣1 , …, 𝑣𝑚 ∈ 𝑉. Prove that the list 𝑣1 , …, 𝑣𝑚 is linearly independent
if and only if there exists 𝑇 ∈ ℒ(𝑉) such that 𝑣1 , …, 𝑣𝑚 are eigenvectors of
𝑇 corresponding to distinct eigenvalues.

32 Suppose 𝜆1 , …, 𝜆𝑛 is a list of distinct real numbers. Prove that the list


𝑒 𝜆1𝑥 , …, 𝑒 𝜆𝑛𝑥 is linearly independent in the vector space of real-valued func-
tions on 𝐑 .
Hint: Let 𝑉 = span(𝑒 𝜆1𝑥 , …, 𝑒 𝜆𝑛𝑥 ), and define an operator 𝐷 ∈ ℒ(𝑉) by
𝐷 𝑓 = 𝑓 ′. Find eigenvalues and eigenvectors of 𝐷.

33 Suppose 𝑇 ∈ ℒ(𝑉). Define 𝒜 ∈ ℒ(ℒ(𝑉)) by


𝒜(𝑆) = 𝑇𝑆
for 𝑆 ∈ ℒ(𝑉). Prove that the set of eigenvalues of 𝑇 equals the set of
eigenvalues of 𝒜.
34 Suppose 𝑇 ∈ ℒ(𝑉) and 𝑈 is a subspace of 𝑉 invariant under 𝑇. The quotient
operator 𝑇/𝑈 ∈ ℒ(𝑉/𝑈) is defined by
(𝑇/𝑈)(𝑣 + 𝑈) = 𝑇𝑣 + 𝑈

for 𝑣 ∈ 𝑉.
(a) Show that the definition of 𝑇/𝑈 makes sense (which requires using the
condition that 𝑈 is invariant under 𝑇) and show that 𝑇/𝑈 is an operator
on 𝑉/𝑈.
(b) Show that 𝑇/(range 𝑇) = 0.
(c) Show that 𝑇/(null 𝑇) is injective if and only if null 𝑇 ∩ range 𝑇 = {0}.
(d) Show that each eigenvalue of 𝑇/𝑈 is an eigenvalue of 𝑇.

35 Suppose 𝑆, 𝑇 ∈ ℒ(𝑊) and 𝑆 is invertible. Suppose 𝑝 ∈ 𝒫(𝐅) is a polyno-


mial. Prove that
𝑝(𝑆𝑇𝑆−1 ) = 𝑆𝑝(𝑇)𝑆−1 .
36 Suppose 𝑇 ∈ ℒ(𝑊) and 𝑈 is a subspace of 𝑊 invariant under 𝑇. Prove that
𝑈 is invariant under 𝑝(𝑇) for every polynomial 𝑝 ∈ 𝒫(𝐅).

37 Define 𝑇 ∈ ℒ(𝐅𝑛 ) by
𝑇(𝑥1 , 𝑥2 , 𝑥3 , …, 𝑥𝑛 ) = (𝑥1 , 2𝑥2 , 3𝑥3 , …, 𝑛𝑥𝑛 ).
(a) Find all eigenvalues and eigenvectors of 𝑇.
(b) Find all invariant subspaces of 𝑇.

38 Give an example of 𝑇 ∈ ℒ(𝐑2 ) such that 𝑇 4 = −𝐼.


39 Suppose dim 𝑉 > 1 and 𝑇 ∈ ℒ(𝑉). Prove that {𝑝(𝑇) ∶ 𝑝 ∈ 𝒫(𝐅)} ≠ ℒ(𝑉).


5B Eigenvalues and the Minimal Polynomial


Existence of Eigenvalues
Now we come to one of the central results about operators on finite-dimensional
complex vector spaces.

5.21 operators on complex vector spaces have an eigenvalue

Every operator on a finite-dimensional, nonzero, complex vector space has an


eigenvalue.

Proof Suppose 𝑉 is a complex vector space with dimension 𝑛 > 0 and 𝑇 ∈ ℒ(𝑉).
Choose 𝑣 ∈ 𝑉 with 𝑣 ≠ 0. Then
𝑣, 𝑇𝑣, 𝑇 2 𝑣, …, 𝑇 𝑛 𝑣
is not linearly independent, because 𝑉 has dimension 𝑛 and this list has length
𝑛 + 1. Hence some linear combination (with not all the coefficients equal to 0) of
the vectors above equals 0. Thus there exists a nonconstant polynomial 𝑝 of
smallest degree such that
𝑝(𝑇)𝑣 = 0.
By the first version of the fundamental theorem of algebra (see 4.12), there
exists 𝜆 ∈ 𝐂 such that 𝑝(𝜆) = 0. By 4.10, there exists a polynomial 𝑞 ∈ 𝒫(𝐂)
such that
𝑝(𝑧) = (𝑧 − 𝜆)𝑞(𝑧)
for every 𝑧 ∈ 𝐂 . Thus 5.19 implies that
0 = 𝑝(𝑇)𝑣 = (𝑇 − 𝜆𝐼)(𝑞(𝑇)𝑣).
Because 𝑞 has smaller degree than 𝑝, we know that 𝑞(𝑇)𝑣 ≠ 0. Thus the equation
above implies that 𝜆 is an eigenvalue of 𝑇 with eigenvector 𝑞(𝑇)𝑣.

The proof above makes crucial use of the fundamental theorem of algebra.
The comment following Exercise 10 helps explain why the fundamental theorem
of algebra is so tightly connected to the result above.
The hypothesis in the result above that 𝐅 = 𝐂 cannot be replaced with the
hypothesis that 𝐅 = 𝐑 , as shown by Example 5.9. The next example shows that
the finite-dimensional hypothesis in the result above also cannot be deleted.

5.22 example: an operator on a complex vector space with no eigenvalues


Define 𝑇 ∈ ℒ(𝒫(𝐂)) by (𝑇𝑝)(𝑧) = 𝑧𝑝(𝑧). If 𝑝 ∈ 𝒫(𝐂) is a nonzero polyno-
mial, then the degree of 𝑇𝑝 is one more than the degree of 𝑝, and thus 𝑇𝑝 cannot
equal a scalar multiple of 𝑝. In other words, 𝑇 has no eigenvalues.
Because 𝒫(𝐂) is infinite-dimensional, this example does not contradict the
result above.


Minimal Polynomial
In this subsection we introduce an important polynomial associated with each
operator. We begin with the following definition.

5.23 definition: monic polynomial

A monic polynomial is a polynomial whose highest-degree coefficient equals 1.

For example, the polynomial 2 + 9𝑧2 + 𝑧7 is a monic polynomial of degree 7.

5.24 existence, uniqueness, and degree of minimal polynomial

Suppose 𝑇 ∈ ℒ(𝑉). Then there is a unique monic polynomial 𝑝 ∈ 𝒫(𝐅) of


smallest degree such that 𝑝(𝑇) = 0. Furthermore, deg 𝑝 ≤ dim 𝑉.

Proof If dim 𝑉 = 0, then take 𝑝 to be the constant polynomial 1.


Now use induction on dim 𝑉. Thus assume that dim 𝑉 > 0 and that the
desired result is true on all vector spaces of smaller dimension. Let 𝑣 ∈ 𝑉 be
such that 𝑣 ≠ 0. The list 𝑣, 𝑇𝑣, …, 𝑇 dim 𝑉 𝑣 has length 1 + dim 𝑉 and thus is linearly
dependent. By the linear dependence lemma (2.19), there is a smallest positive
integer 𝑚 ≤ dim 𝑉 such that 𝑇 𝑚 𝑣 is a linear combination of 𝑣, 𝑇𝑣, …, 𝑇 𝑚 − 1 𝑣.
Thus there exist scalars 𝑐0 , 𝑐1 , 𝑐2 , …, 𝑐𝑚 − 1 ∈ 𝐅 such that
5.25 𝑐0 𝑣 + 𝑐1 𝑇𝑣 + ⋯ + 𝑐𝑚 − 1 𝑇 𝑚 − 1 𝑣 + 𝑇 𝑚 𝑣 = 0.

Define a monic polynomial 𝑞 ∈ 𝒫𝑚 (𝐅) by


𝑞(𝑧) = 𝑐0 + 𝑐1 𝑧 + ⋯ + 𝑐𝑚 − 1 𝑧𝑚 − 1 + 𝑧𝑚.

Then 5.25 implies that 𝑞(𝑇)𝑣 = 0.


If 𝑘 is a nonnegative integer, then
𝑞(𝑇)(𝑇 𝑘 𝑣) = 𝑇 𝑘 (𝑞(𝑇)𝑣) = 𝑇 𝑘 (0) = 0.

Because 𝑣, 𝑇𝑣, …, 𝑇 𝑚 − 1 𝑣 is linearly independent, this implies dim null 𝑞(𝑇) ≥ 𝑚.


Thus
dim range 𝑞(𝑇) = dim 𝑉 − dim null 𝑞(𝑇) ≤ dim 𝑉 − 𝑚.
Because range 𝑞(𝑇) is invariant under 𝑇 (by 5.20), we can apply our induction
hypothesis to the operator 𝑇|range 𝑞(𝑇) on the vector space range 𝑞(𝑇). Thus there
is a monic polynomial 𝑠 ∈ 𝒫(𝐅) with
deg 𝑠 ≤ dim 𝑉 − 𝑚 and 𝑠(𝑇)|range 𝑞(𝑇) = 0.
Now
(𝑠𝑞)(𝑇)𝑣 = 𝑠(𝑇)(𝑞(𝑇)𝑣) = 0
for all 𝑣 ∈ 𝑉 because 𝑞(𝑇)𝑣 ∈ range 𝑞(𝑇). Hence we conclude that 𝑠𝑞 is a monic
polynomial with deg 𝑠𝑞 ≤ dim 𝑉 and (𝑠𝑞)(𝑇) = 0.


The paragraph above shows that there is a monic polynomial with degree at
most dim 𝑉 that when applied to 𝑇 gives the 0 operator. Thus there is a monic
polynomial with smallest degree with this property, completing the existence part
of this result.
Let 𝑝 ∈ 𝒫(𝐅) be a monic polynomial of smallest degree such that 𝑝(𝑇) = 0.
To prove the uniqueness part of the result, suppose 𝑟 ∈ 𝒫(𝐅) is a monic poly-
nomial with the same degree as 𝑝 and 𝑟(𝑇) = 0. Then (𝑝 − 𝑟)(𝑇) = 0 and also
deg(𝑝 − 𝑟) < deg 𝑝. If 𝑝 − 𝑟 were not equal to 0, then we could divide 𝑝 − 𝑟 by
the coefficient of the highest order term in 𝑝 − 𝑟 to get a monic polynomial (with
smaller degree than 𝑝) that when applied to 𝑇 gives the 0 operator. Thus we
conclude that 𝑝 − 𝑟 = 0, and hence 𝑟 = 𝑝, as desired.

The previous result justifies the following definition.

5.26 definition: minimal polynomial

Suppose 𝑇 ∈ ℒ(𝑉). Then the minimal polynomial of 𝑇 is the unique monic


polynomial 𝑝 ∈ 𝒫(𝐅) of smallest degree such that 𝑝(𝑇) = 0.

To compute the minimal polynomial of an operator 𝑇 ∈ ℒ(𝑉), we need to


find the smallest positive integer 𝑚 such that the equation
𝑐0 𝐼 + 𝑐1 𝑇 + … + 𝑐𝑚 − 1 𝑇 𝑚 − 1 = −𝑇 𝑚
has a solution 𝑐0 , 𝑐1 , …, 𝑐𝑚 − 1 ∈ 𝐅 . If we pick a basis of 𝑉 and replace 𝑇 in the
equation above with the matrix of 𝑇, then the equation above can be thought of
as a system of (dim 𝑉)2 linear equations in the 𝑚 unknowns 𝑐0 , 𝑐1 , …, 𝑐𝑚 − 1 ∈ 𝐅 .
Gaussian elimination or another fast method of solving systems of linear equations
can be used to see if solutions exist for successive values of 𝑚 = 1, …, dim 𝑉 − 1
until a unique solution exists. By 5.24, a unique solution exists for some smallest
𝑚 ≤ dim 𝑉. The minimal polynomial of 𝑇 is then 𝑐0 + 𝑐1 𝑧 + ⋯ + 𝑐𝑚 − 1 𝑧𝑚 − 1 + 𝑧𝑚 .
Even faster (usually), pick 𝑣 ∈ 𝑉 and consider the equation
5.27 𝑐0 𝑣 + 𝑐1 𝑇𝑣 + ⋯ + 𝑐dim 𝑉 − 1 𝑇 dim 𝑉 − 1 𝑣 = −𝑇 dim 𝑉 𝑣.
Use a basis of 𝑉 to convert the equation above to a system of dim 𝑉 linear equa-
tions in dim 𝑉 unknowns 𝑐0 , 𝑐1 , …, 𝑐dim 𝑉 − 1 . If this system of equations has a
unique solution 𝑐0 , 𝑐1 , …, 𝑐dim 𝑉 − 1 (as happens most of the time), then the scalars
𝑐0 , 𝑐1 , …, 𝑐dim 𝑉 − 1 , 1 are the coefficients of the minimal polynomial of 𝑇 (because
5.24 states that the degree of the minimal polynomial is at most dim 𝑉).
Consider operators on 𝐑4 (thought of as 4-by-4 matrices with respect to the
standard basis), and take 𝑣 = (1, 0, 0, 0) in the paragraph above. The faster
method described above works on over 99.8% of the 4-by-4 matrices with integer
entries in the interval [−10, 10] and on over 99.999% of the 4-by-4 matrices
with integer entries in [−100, 100]. (These estimates are based on testing
millions of random matrices.)
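
The procedure just described is easy to carry out by machine. Here is a minimal
sketch, assuming Python with numpy (the function name is ours, not the text's);
it solves 5.27 under the assumption, valid most of the time as noted above, that
𝑣, 𝑇𝑣, …, 𝑇 dim 𝑉 − 1 𝑣 is linearly independent, so that the system has a unique
solution:

    import numpy as np

    def minimal_polynomial_coeffs(A, v):
        # Solve 5.27: find c_0, ..., c_{n-1} with
        # c_0 v + c_1 Av + ... + c_{n-1} A^{n-1} v = -A^n v,
        # assuming v, Av, ..., A^{n-1} v is linearly independent.
        n = A.shape[0]
        cols = [v]
        for _ in range(n - 1):
            cols.append(A @ cols[-1])  # build v, Av, ..., A^{n-1} v
        K = np.column_stack(cols)
        c = np.linalg.solve(K, -(A @ cols[-1]))  # right side is -A^n v
        return np.append(c, 1.0)  # coefficients c_0, ..., c_{n-1}, 1

    # The 5-by-5 matrix of the next example (5.28), with v = e_1; the output
    # is approximately (3, -6, 0, 0, 0, 1), matching 3 - 6z + z^5.
    A = np.array([[0, 0, 0, 0, -3],
                  [1, 0, 0, 0, 6],
                  [0, 1, 0, 0, 0],
                  [0, 0, 1, 0, 0],
                  [0, 0, 0, 1, 0]], dtype=float)
    print(minimal_polynomial_coeffs(A, np.array([1.0, 0, 0, 0, 0])))

When 𝑣, 𝑇𝑣, …, 𝑇 dim 𝑉 − 1 𝑣 is linearly dependent, np.linalg.solve fails, and one
would instead search over successive values of 𝑚 as in the first method above.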
The next example illustrates the faster procedure discussed above.


5.28 example: minimal polynomial of an operator on 𝐅5


Suppose 𝑇 ∈ ℒ(𝐅5 ) and

ℳ(𝑇) = ⎛ 0 0 0 0 −3 ⎞
        ⎜ 1 0 0 0  6 ⎟
        ⎜ 0 1 0 0  0 ⎟
        ⎜ 0 0 1 0  0 ⎟
        ⎝ 0 0 0 1  0 ⎠

with respect to the standard basis 𝑒1 , 𝑒2 , 𝑒3 , 𝑒4 , 𝑒5 . Taking 𝑣 = 𝑒1 for 5.27, we have

𝑇𝑒1 = 𝑒2 ,
𝑇 2 𝑒1 = 𝑇(𝑇𝑒1 ) = 𝑇𝑒2 = 𝑒3 ,
𝑇 3 𝑒1 = 𝑇(𝑇 2 𝑒1 ) = 𝑇𝑒3 = 𝑒4 ,
𝑇 4 𝑒1 = 𝑇(𝑇 3 𝑒1 ) = 𝑇𝑒4 = 𝑒5 ,
𝑇 5 𝑒1 = 𝑇(𝑇 4 𝑒1 ) = 𝑇𝑒5 = −3𝑒1 + 6𝑒2 .

Thus 3𝑒1 − 6𝑇𝑒1 = −𝑇 5 𝑒1 . The list 𝑒1 , 𝑇𝑒1 , 𝑇 2 𝑒1 , 𝑇 3 𝑒1 , 𝑇 4 𝑒1 , which equals the list
𝑒1 , 𝑒2 , 𝑒3 , 𝑒4 , 𝑒5 , is linearly independent, so no other linear combination of this list
equals −𝑇 5 𝑒1 . Hence the minimal polynomial of 𝑇 is 3 − 6𝑧 + 𝑧5 .

Recall that by definition, eigenvalues of operators on 𝑉 and zeros of polyno-


mials in 𝒫(𝐅) must be elements of 𝐅 . In particular, if 𝐅 = 𝐑 , then eigenvalues
and zeros must be real numbers.

5.29 eigenvalues are the zeros of the minimal polynomial

Suppose 𝑇 ∈ ℒ(𝑉).
(a) The zeros of the minimal polynomial of 𝑇 are the eigenvalues of 𝑇.
(b) If 𝑉 is a complex vector space, then the minimal polynomial of 𝑇 has the
form
(𝑧 − 𝜆1 )⋯(𝑧 − 𝜆𝑚 ),
where 𝜆1 , …, 𝜆𝑚 is a list of all the eigenvalues of 𝑇, possibly with
repetitions.

Proof Let 𝑝 be the minimal polynomial of 𝑇.


(a) First suppose 𝜆 ∈ 𝐅 is a zero of 𝑝. Then 𝑝 can be written in the form

𝑝(𝑧) = (𝑧 − 𝜆)𝑞(𝑧),

where 𝑞 is a monic polynomial with coefficients in 𝐅 (see 4.10). Because


𝑝(𝑇) = 0, we have
0 = (𝑇 − 𝜆𝐼)(𝑞(𝑇)𝑣)
for all 𝑣 ∈ 𝑉. Because the degree of 𝑞 is less than the degree of the minimal
polynomial 𝑝, there exists at least one vector 𝑣 ∈ 𝑉 such that 𝑞(𝑇)𝑣 ≠ 0. The
equation above thus implies that 𝜆 is an eigenvalue of 𝑇, as desired.


To prove the other direction, now suppose 𝜆 ∈ 𝐅 is an eigenvalue of 𝑇. Thus


there exists 𝑣 ∈ 𝑉 with 𝑣 ≠ 0 such that 𝑇𝑣 = 𝜆𝑣. Repeated applications of
𝑇 to both sides of this equation show that 𝑇 𝑘 𝑣 = 𝜆𝑘 𝑣 for every nonnegative
integer 𝑘. Thus
𝑝(𝑇)𝑣 = 𝑝(𝜆)𝑣.
Because 𝑝 is the minimal polynomial of 𝑇, we have 𝑝(𝑇)𝑣 = 0. Hence the
equation above implies that 𝑝(𝜆) = 0. Thus 𝜆 is a zero of 𝑝, as desired.
(b) To get the desired result, use part (a) and the second version of the fundamental
theorem of algebra (see 4.13).

A nonzero polynomial has at most as many distinct zeros as its degree (see
4.11). Thus part (a) of the previous result, along with the result that the minimal
polynomial of an operator on 𝑉 has degree at most dim 𝑉, gives an alternative
proof of 5.14, which states that an operator on 𝑉 has at most dim 𝑉 distinct
eigenvalues.
Every monic polynomial is the minimal polynomial of some operator, as
shown by Exercise 10, which generalizes Example 5.28. Thus 5.29(a) shows that
finding exact expressions for the eigenvalues of an operator is equivalent to the
problem of finding exact expressions for the zeros of a polynomial (and thus is
not possible for some operators).

5.30 example: an operator whose eigenvalues cannot be found exactly


Let 𝑇 ∈ ℒ(𝐂5 ) be the operator defined by

𝑇(𝑧1 , 𝑧2 , 𝑧3 , 𝑧4 , 𝑧5 ) = (−3𝑧5 , 𝑧1 + 6𝑧5 , 𝑧2 , 𝑧3 , 𝑧4 ).

The matrix of 𝑇 with respect to the standard basis of 𝐂5 is the 5-by-5 matrix in
Example 5.28. As we showed in that example, the minimal polynomial of 𝑇 is the
polynomial
3 − 6𝑧 + 𝑧5 .
No zero of the polynomial above can be expressed using rational numbers,
roots of rational numbers, and the usual rules of arithmetic (a proof of this would
take us considerably beyond linear algebra). Because the zeros of the polynomial
above are the eigenvalues of 𝑇 [by 5.29(a)], we cannot find an exact expression
for any eigenvalue of 𝑇 in any familiar form.
Numeric techniques, which we will not discuss here, show that the zeros of the
polynomial above, and thus the eigenvalues of 𝑇, are approximately the following
five complex numbers:

−1.67, 0.51, 1.40, −0.12 + 1.59𝑖, −0.12 − 1.59𝑖.

Note that the two nonreal zeros of this polynomial are complex conjugates of
each other, as we expect for a polynomial with real coefficients (see 4.14).
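
The approximations above are the kind of output produced by standard numeric
software. For instance, here is a one-line check, assuming Python with numpy
(not part of the text):

    import numpy as np

    # Zeros of 3 - 6z + z^5; np.roots lists coefficients starting
    # with the highest power of z.
    print(np.roots([1, 0, 0, 0, -6, 3]))

Fittingly, numpy computes these zeros as the eigenvalues of a companion matrix
of the polynomial, which is the same connection described in the comment
following Exercise 10.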


The next result completely characterizes the polynomials that when applied to
an operator give the 0 operator.

5.31 𝑞(𝑇) = 0 ⟺ 𝑞 is a polynomial multiple of the minimal polynomial

Suppose 𝑇 ∈ ℒ(𝑉) and 𝑞 ∈ 𝒫(𝐅). Then 𝑞(𝑇) = 0 if and only if 𝑞 is a


polynomial multiple of the minimal polynomial of 𝑇.

Proof Let 𝑝 denote the minimal polynomial of 𝑇.


First we prove the easy direction. Suppose 𝑞 is a polynomial multiple of 𝑝.
Thus there exists a polynomial 𝑠 ∈ 𝒫(𝐅) such that 𝑞 = 𝑝𝑠. We have
𝑞(𝑇) = 𝑝(𝑇)𝑠(𝑇) = 0 𝑠(𝑇) = 0,

as desired.
To prove the other direction, now suppose 𝑞(𝑇) = 0. By the division algorithm
for polynomials (4.7), there exist polynomials 𝑠, 𝑟 ∈ 𝒫(𝐅) such that
5.32 𝑞 = 𝑝𝑠 + 𝑟

and deg 𝑟 < deg 𝑝. We have


0 = 𝑞(𝑇) = 𝑝(𝑇)𝑠(𝑇) + 𝑟(𝑇) = 𝑟(𝑇).

The equation above implies that 𝑟 = 0 (otherwise, dividing 𝑟 by its highest-degree


coefficient would produce a monic polynomial that when applied to 𝑇 gives 0;
this polynomial would have a smaller degree than the minimal polynomial, which
would be a contradiction). Thus 5.32 becomes the equation 𝑞 = 𝑝𝑠. Hence 𝑞 is a
polynomial multiple of 𝑝, as desired.

The next result shows that the constant term of the minimal polynomial of an
operator determines whether the operator is invertible.

5.33 𝑇 not invertible ⟺ constant term of minimal polynomial of 𝑇 is 0

An operator 𝑇 ∈ ℒ(𝑉) is not invertible if and only if the constant term of the
minimal polynomial of 𝑇 is 0.

Proof Suppose 𝑇 ∈ ℒ(𝑉) and 𝑝 is the minimal polynomial of 𝑇. Then


𝑇 is not invertible ⟺ 0 is an eigenvalue of 𝑇
⟺ 0 is a zero of 𝑝
⟺ 𝑝(0) = 0
⟺ the constant term of 𝑝 is 0,

where the first equivalence holds by 5.7, the second equivalence holds by 5.29(a),
and the last equivalence holds because the constant term of 𝑝 equals 𝑝(0).


Exercises 5B

1 Suppose 𝑛 is a positive integer and 𝑇 ∈ ℒ(𝐅𝑛 ) is defined by


𝑇(𝑥1 , …, 𝑥𝑛 ) = (𝑥1 + ⋯ + 𝑥𝑛 , …, 𝑥1 + ⋯ + 𝑥𝑛 ).

Thus 𝑇 is the operator on 𝐅𝑛 whose matrix (with respect to the standard


basis) consists of all 1’s.
(a) Find all eigenvalues and eigenvectors of 𝑇.
(b) Find the minimal polynomial of 𝑇.

2 Suppose 𝑇 ∈ ℒ(𝑊). Prove that 9 is an eigenvalue of 𝑇 2 if and only if 3 or


−3 is an eigenvalue of 𝑇.

3 Suppose 𝐅 = 𝐂 , 𝑇 ∈ ℒ(𝑊), 𝑝 ∈ 𝒫(𝐂) is a polynomial, and 𝛼 ∈ 𝐂 . Prove


that 𝛼 is an eigenvalue of 𝑝(𝑇) if and only if 𝛼 = 𝑝(𝜆) for some eigenvalue
𝜆 of 𝑇.

4 Give an example of an operator on 𝐑2 that shows that the result in the


previous exercise does not hold if 𝐂 is replaced with 𝐑 .
5 Suppose 𝑇 ∈ ℒ(𝐅2 ) is defined by 𝑇(𝑤, 𝑧) = (−𝑧, 𝑤). Find the minimal
polynomial of 𝑇.
6 Suppose 𝑇 ∈ ℒ(𝐑2 ) is the operator of counterclockwise rotation by 1∘ . Thus
𝑇 180 is the operator of counterclockwise rotation by 180∘ . Hence 𝑇 180 = −𝐼.
Find the minimal polynomial of 𝑇.
Because dim 𝐑2 = 2, the degree of the minimal polynomial of 𝑇 is at most 2.
Thus the minimal polynomial of 𝑇 is not the tempting polynomial 𝑥180 + 1.

7 Suppose dim 𝑉 = 2, 𝑇 ∈ ℒ(𝑉), and the matrix of 𝑇 with respect to some
basis of 𝑉 is
⎛ 𝑎 𝑐 ⎞
⎝ 𝑏 𝑑 ⎠ .
(a) Show that 𝑇 2 − (𝑎 + 𝑑)𝑇 + (𝑎𝑑 − 𝑏𝑐)𝐼 = 0.
(b) Show that the minimal polynomial of 𝑇 equals
𝑧 − 𝑎 if 𝑏 = 𝑐 = 0 and 𝑎 = 𝑑, and
𝑧2 − (𝑎 + 𝑑)𝑧 + (𝑎𝑑 − 𝑏𝑐) otherwise.

8 Suppose 𝑊 is a complex vector space and 𝑇 ∈ ℒ(𝑊) has no eigenvalues.


Prove that every subspace of 𝑊 invariant under 𝑇 is either {0} or infinite-
dimensional.
9 Define 𝑇 ∈ ℒ(𝐅𝑛 ) by
𝑇(𝑥1 , 𝑥2 , 𝑥3 , …, 𝑥𝑛 ) = (𝑥1 , 2𝑥2 , 3𝑥3 , …, 𝑛𝑥𝑛 ).

Find the minimal polynomial of 𝑇.


10 Suppose 𝑎0 , …, 𝑎𝑛 − 1 ∈ 𝐅 . Let 𝑇 be the operator on 𝐅𝑛 whose matrix (with


respect to the standard basis) is
⎛ 0                −𝑎0     ⎞
⎜ 1  0             −𝑎1     ⎟
⎜    1  ⋱          −𝑎2     ⎟
⎜       ⋱  ⋱       ⋮       ⎟
⎜          1  0    −𝑎𝑛 − 2 ⎟
⎝             1    −𝑎𝑛 − 1 ⎠ .
Here all entries of the matrix are 0 except for all 1’s on the line under the
diagonal and the entries in the last column (some of which might also be 0).
Show that the minimal polynomial of 𝑇 is the polynomial

𝑎0 + 𝑎1 𝑧 + … + 𝑎𝑛 − 1 𝑧𝑛 − 1 + 𝑧𝑛 .

The matrix above is called the companion matrix of the polynomial above.
This exercise shows that every monic polynomial is the minimal polynomial
of some operator. Hence a formula or an algorithm that could produce
exact eigenvalues for each operator on each 𝐅𝑛 could then produce exact
zeros for each polynomial [by 5.29(a)]. Thus there is no such formula or
algorithm. However, the efficient numeric methods for obtaining very good
approximations for the eigenvalues of an operator are often used to find
very good approximations for the zeros of a polynomial by considering the
companion matrix of the polynomial.

11 Suppose 𝑉 is a complex vector space with dim 𝑉 > 0 and 𝑇 ∈ ℒ(𝑉). Define
a function 𝑓 ∶ 𝐂 → 𝐑 by

𝑓 (𝜆) = dim range(𝑇 − 𝜆𝐼).

Prove that 𝑓 is not a continuous function.


12 Prove that if 𝑇 ∈ ℒ(𝑉) and ℰ is the subspace of ℒ(𝑉) defined by

ℰ = {𝑞(𝑇) ∶ 𝑞 ∈ 𝒫(𝐅)},

then dim ℰ equals the degree of the minimal polynomial of 𝑇.


13 Suppose 𝑇 ∈ ℒ(𝑉). Prove that the minimal polynomial of 𝑇 has degree at
most 1 + dim range 𝑇.
If dim range 𝑇 < dim 𝑉 − 1, then this exercise gives a better upper bound
than 5.24 for the degree of the minimal polynomial of 𝑇.

14 Suppose 𝑇 ∈ ℒ(𝑉) has minimal polynomial 4 + 5𝑧 − 6𝑧2 − 7𝑧3 + 2𝑧4 + 𝑧5.


Find the minimal polynomial of 𝑇 −1 .
15 Suppose 𝑇 ∈ ℒ(𝑉) is invertible. Prove that there exists a polynomial
𝑝 ∈ 𝒫(𝐅) such that deg 𝑝 ≤ dim 𝑉 − 1 and 𝑇 −1 = 𝑝(𝑇).


16 Suppose 𝑇 ∈ ℒ(𝑉). Let 𝑛 = dim 𝑉. Prove that if 𝑣 ∈ 𝑉, then

span(𝑣, 𝑇𝑣, …, 𝑇 𝑛 − 1 𝑣)

is invariant under 𝑇.
17 Suppose 𝑇 ∈ ℒ(𝑉) and 𝑈 is a subspace of 𝑉 that is invariant under 𝑇. Prove
that the minimal polynomial of 𝑇 is a polynomial multiple of the minimal
polynomial of 𝑇|𝑈 .
18 Suppose 𝑇 ∈ ℒ(𝐅4 ) is such that the eigenvalues of 𝑇 are 3, 5, 8. Prove that
(𝑇 − 3𝐼)2 (𝑇 − 5𝐼)2 (𝑇 − 8𝐼)2 = 0.

19 Suppose 𝑉 is a complex vector space. Suppose 𝑇 ∈ ℒ(𝑉) is such that 5


and 6 are eigenvalues of 𝑇 and that 𝑇 has no other eigenvalues. Prove that
(𝑇 − 5𝐼)dim 𝑉 − 1 (𝑇 − 6𝐼)dim 𝑉 − 1 = 0.


5C Upper-Triangular Matrices
In Chapter 3 we discussed the matrix of a linear map from one vector space to
another vector space. That matrix depended on a choice of a basis of each of the
two vector spaces. Now that we are studying operators, which map a vector space
to itself, the emphasis is on using only one basis.

5.34 definition: matrix of an operator, ℳ(𝑇)

Suppose 𝑇 ∈ ℒ(𝑉) and 𝑣1 , …, 𝑣𝑛 is a basis of 𝑉. The matrix of 𝑇 with respect


to this basis is the 𝑛-by-𝑛 matrix
ℳ(𝑇) = ⎛ 𝐴1, 1 … 𝐴1, 𝑛 ⎞
        ⎜   ⋮        ⋮   ⎟
        ⎝ 𝐴𝑛, 1 … 𝐴𝑛, 𝑛 ⎠
whose entries 𝐴𝑗, 𝑘 are defined by

𝑇𝑣𝑘 = 𝐴1, 𝑘 𝑣1 + ⋯ + 𝐴𝑛, 𝑘 𝑣𝑛 .

If the basis is not clear from the context, then the notation ℳ(𝑇, (𝑣1 , …, 𝑣𝑛 ))
is used.

Note that the matrices of operators are square arrays, rather than the more
general rectangular arrays that we considered earlier for linear maps.
If 𝑇 is an operator on 𝐅𝑛 and no basis is specified, assume that the basis in
question is the standard one (where the 𝑘 th basis vector is 1 in the 𝑘 th slot and 0
in all the other slots). You can then think of the 𝑘 th column of ℳ(𝑇) as 𝑇 applied
to the 𝑘 th basis vector, where we identify 𝑛-by-1 column vectors with elements
of 𝐅𝑛 . (The 𝑘 th column of the matrix ℳ(𝑇) is formed from the coefficients used
to write 𝑇𝑣𝑘 as a linear combination of the basis 𝑣1 , …, 𝑣𝑛 .)

5.35 example: matrix of an operator with respect to standard basis


Define 𝑇 ∈ ℒ(𝐅3 ) by 𝑇(𝑥, 𝑦, 𝑧) = (2𝑥 + 𝑦, 5𝑦 + 3𝑧, 8𝑧). Then the matrix of 𝑇
with respect to the standard basis of 𝐅3 is
ℳ(𝑇) = ⎛ 2 1 0 ⎞
        ⎜ 0 5 3 ⎟
        ⎝ 0 0 8 ⎠ ,
as you should verify.

A central goal of linear algebra is to show that given an operator 𝑇 ∈ ℒ(𝑉),


there exists a basis of 𝑉 with respect to which 𝑇 has a reasonably simple matrix.
To make this vague formulation a bit more precise, we might try to choose a basis
of 𝑉 such that ℳ(𝑇) has many 0’s.


If 𝑉 is a complex vector space, then we already know enough to show that


there is a basis of 𝑉 with respect to which the matrix of 𝑇 has 0’s everywhere in
the first column, except possibly the first entry. In other words, there is a basis of
𝑉 with respect to which the matrix of 𝑇 looks like

⎛ 𝜆  ∗ ⎞
⎜ 0    ⎟
⎜ ⋮    ⎟
⎝ 0    ⎠ ;
here the ∗ denotes the entries in all the columns other than the first column.
To prove this, let 𝜆 be an eigenvalue of 𝑇 (one exists by 5.21) and let 𝑣 be a
corresponding eigenvector. Extend 𝑣 to a basis of 𝑉. Then the matrix of 𝑇 with
respect to this basis has the form above.
Soon we will see that we can choose a basis of 𝑉 with respect to which the
matrix of 𝑇 has even more 0’s.

5.36 definition: diagonal of a matrix

The diagonal of a square matrix consists of the entries on the line from the
upper left corner to the bottom right corner.

For example, the diagonal of the matrix

ℳ(𝑇) = ⎛ 2 1 0 ⎞
        ⎜ 0 5 3 ⎟
        ⎝ 0 0 8 ⎠

from Example 5.35 consists of the entries 2, 5, 8.

5.37 definition: upper-triangular matrix

A square matrix is called upper triangular if all the entries below the diagonal
equal 0.

For example, the 3-by-3 matrix above is upper triangular.


Typically we represent an upper-triangular matrix in the form

⎛ 𝜆1     ∗  ⎞
⎜     ⋱     ⎟
⎝ 0      𝜆𝑛 ⎠ ;

the 0 in the matrix above indicates that all entries below the diagonal in this
𝑛-by-𝑛 matrix equal 0. Upper-triangular matrices can be considered reasonably
simple—for 𝑛 large, at least almost half the entries in an 𝑛-by-𝑛 upper-triangular
matrix are 0. (We often use ∗ to denote matrix entries that we do not know or
that are irrelevant to the questions being discussed.)


The next result provides a useful connection between upper-triangular matrices


and invariant subspaces.

5.38 conditions for upper-triangular matrix

Suppose 𝑇 ∈ ℒ(𝑉) and 𝑣1 , …, 𝑣𝑛 is a basis of 𝑉. Then the following are


equivalent.
(a) the matrix of 𝑇 with respect to 𝑣1 , …, 𝑣𝑛 is upper triangular
(b) span(𝑣1 , …, 𝑣𝑘 ) is invariant under 𝑇 for each 𝑘 = 1, …, 𝑛
(c) 𝑇𝑣𝑘 ∈ span(𝑣1 , …, 𝑣𝑘 ) for each 𝑘 = 1, …, 𝑛

Proof First suppose (a) holds. To prove that (b) holds, suppose 𝑘 ∈ {1, …, 𝑛}. If
𝑗 ∈ {1, …, 𝑛}, then
𝑇𝑣𝑗 ∈ span(𝑣1 , …, 𝑣𝑗 )
because the matrix of 𝑇 with respect to 𝑣1 , …, 𝑣𝑛 is upper triangular. Because
span(𝑣1 , …, 𝑣𝑗 ) ⊂ span(𝑣1 , …, 𝑣𝑘 ) if 𝑗 ≤ 𝑘, we see that
𝑇𝑣𝑗 ∈ span(𝑣1 , …, 𝑣𝑘 )
for each 𝑗 ∈ {1, …, 𝑘}. Thus span(𝑣1 , …, 𝑣𝑘 ) is invariant under 𝑇, completing the
proof that (a) implies (b).
Now suppose (b) holds, so span(𝑣1 , …, 𝑣𝑘 ) is invariant under 𝑇 for each
𝑘 = 1, …, 𝑛. In particular, 𝑇𝑣𝑘 ∈ span(𝑣1 , …, 𝑣𝑘 ) for each 𝑘 = 1, …, 𝑛. Thus
(b) implies (c).
Now suppose (c) holds, so 𝑇𝑣𝑘 ∈ span(𝑣1 , …, 𝑣𝑘 ) for each 𝑘 = 1, …, 𝑛. This
means that when writing each 𝑇𝑣𝑘 as a linear combination of the basis vectors
𝑣1 , …, 𝑣𝑛 , we need to use only the vectors 𝑣1 , …, 𝑣𝑘 . Hence all entries under the
diagonal of ℳ(𝑇) are 0. In other words, ℳ(𝑇) is an upper-triangular matrix,
completing the proof that (c) implies (a).
We have shown that (a) ⟹ (b) ⟹ (c) ⟹ (a), which shows that (a), (b),
and (c) are equivalent.

The next result tells us that if 𝑇 ∈ ℒ(𝑉) and with respect to some basis of 𝑉
we have

ℳ(𝑇) = ⎛ 𝜆1     ∗  ⎞
        ⎜     ⋱     ⎟
        ⎝ 0      𝜆𝑛 ⎠ ,

then 𝑇 satisfies a simple equation depending on 𝜆1 , …, 𝜆𝑛 .

5.39 equation satisfied by operator with upper-triangular matrix

Suppose 𝑇 ∈ ℒ(𝑉) and 𝑉 has a basis with respect to which 𝑇 has an upper-
triangular matrix with diagonal entries 𝜆1 , …, 𝜆𝑛 . Then

(𝑇 − 𝜆1 𝐼)⋯(𝑇 − 𝜆𝑛 𝐼) = 0.


Proof Let 𝑣1 , …, 𝑣𝑛 denote a basis of 𝑉 with respect to which 𝑇 has an upper-


triangular matrix with diagonal entries 𝜆1 , …, 𝜆𝑛 . Then 𝑇𝑣1 = 𝜆1 𝑣1 , which means
that (𝑇 − 𝜆1 𝐼)𝑣1 = 0, which implies that (𝑇 − 𝜆1 𝐼)⋯(𝑇 − 𝜆𝑛 𝐼)𝑣1 = 0 (using the
commutativity of 𝑇 − 𝜆𝑗 𝐼 with 𝑇 − 𝜆𝑘 𝐼).
If 𝑘 ∈ {2, …, 𝑛}, then (𝑇 − 𝜆𝑘 𝐼)𝑣𝑘 ∈ span(𝑣1 , …, 𝑣𝑘 − 1 ), which implies (by
induction on 𝑘) that
((𝑇 − 𝜆1 𝐼)⋯(𝑇 − 𝜆𝑘 − 1 𝐼))((𝑇 − 𝜆𝑘 𝐼)𝑣𝑘 ) = 0.

This implies that (𝑇 − 𝜆1 𝐼)⋯(𝑇 − 𝜆𝑛 𝐼)𝑣𝑘 = 0, again by using commutativity of


the factors. Because (𝑇 − 𝜆1 𝐼)⋯(𝑇 − 𝜆𝑛 𝐼) is 0 on each vector in a basis of 𝑉, we
conclude that (𝑇 − 𝜆1 𝐼)⋯(𝑇 − 𝜆𝑛 𝐼) = 0.

Unfortunately no method exists for exactly computing the eigenvalues of an


operator from its matrix. However, if we are fortunate enough to find a basis with
respect to which the matrix of the operator is upper triangular, then the problem
of computing the eigenvalues becomes trivial, as the next result shows.

5.40 determination of eigenvalues from upper-triangular matrix

Suppose 𝑇 ∈ ℒ(𝑉) has an upper-triangular matrix with respect to some basis


of 𝑉. Then the eigenvalues of 𝑇 are precisely the entries on the diagonal of
that upper-triangular matrix.

Proof Suppose 𝑣1 , …, 𝑣𝑛 is a basis of 𝑉 with respect to which 𝑇 has an upper-


triangular matrix

ℳ(𝑇) = ⎛ 𝜆1     ∗  ⎞
        ⎜     ⋱     ⎟
        ⎝ 0      𝜆𝑛 ⎠ .
Clearly 𝜆1 is an eigenvalue of 𝑇 because 𝑇𝑣1 = 𝜆1 𝑣1 .
Suppose 𝑘 ∈ {2, …, 𝑛}. Then (𝑇 − 𝜆𝑘 𝐼)𝑣𝑘 ∈ span(𝑣1 , …, 𝑣𝑘 − 1 ). Thus 𝑇 − 𝜆𝑘 𝐼
maps span(𝑣1 , …, 𝑣𝑘 ) into span(𝑣1 , …, 𝑣𝑘 − 1 ). Because

dim span(𝑣1 , …, 𝑣𝑘 ) = 𝑘 and dim span(𝑣1 , …, 𝑣𝑘 − 1 ) = 𝑘 − 1,

this implies that 𝑇 − 𝜆𝑘 𝐼 restricted to span(𝑣1 , …, 𝑣𝑘 ) is not injective (by 3.22).


Thus there exists 𝑣 ∈ span(𝑣1 , …, 𝑣𝑘 ) such that 𝑣 ≠ 0 and (𝑇 − 𝜆𝑘 𝐼)𝑣 = 0. Thus
𝜆𝑘 is an eigenvalue of 𝑇. Hence we have shown that every entry on the diagonal
of ℳ(𝑇) is an eigenvalue of 𝑇.
To prove 𝑇 has no other eigenvalues, let 𝑞 be the polynomial defined by
𝑞(𝑧) = (𝑧 − 𝜆1 )⋯(𝑧 − 𝜆𝑛 ). Then 𝑞(𝑇) = 0 (by 5.39). Hence 𝑞 is a polynomial
multiple of the minimal polynomial of 𝑇 (by 5.31). Thus every zero of the minimal
polynomial of 𝑇 is a zero of 𝑞. Because the zeros of the minimal polynomial of 𝑇
are the eigenvalues of 𝑇 (by 5.29), this implies that every eigenvalue of 𝑇 is a zero
of 𝑞. In other words, the eigenvalues of 𝑇 are all contained in the list 𝜆1 , …, 𝜆𝑛 .


5.41 example: eigenvalues via an upper-triangular matrix


Define 𝑇 ∈ ℒ(𝐅3 ) by 𝑇(𝑥, 𝑦, 𝑧) = (2𝑥 + 𝑦, 5𝑦 + 3𝑧, 8𝑧). The matrix of 𝑇 with
respect to the standard basis is
ℳ(𝑇) = ⎛ 2 1 0 ⎞
        ⎜ 0 5 3 ⎟
        ⎝ 0 0 8 ⎠ .
Now 5.40 implies that the eigenvalues of 𝑇 are 2, 5, and 8.
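As a quick numerical sanity check (assuming NumPy, which this book does not otherwise use), the eigenvalues of the matrix above can be computed directly:

    import numpy as np

    M = np.array([[2.0, 1.0, 0.0],
                  [0.0, 5.0, 3.0],
                  [0.0, 0.0, 8.0]])
    # For an upper-triangular matrix, the eigenvalues are the diagonal entries.
    print(sorted(np.linalg.eigvals(M).real))   # [2.0, 5.0, 8.0], as 5.40 predicts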

The next example illustrates the result stated after it (5.43): an operator has an upper-triangular matrix with respect to some basis if and only if the minimal polynomial of the operator is the product of degree 1 polynomials.

5.42 example: whether 𝑇 has an upper-triangular matrix can depend upon 𝐅


Define 𝑇 ∈ ℒ(𝐅4 ) by
𝑇(𝑧1 , 𝑧2 , 𝑧3 , 𝑧4 ) = (−𝑧2 , 𝑧1 , 2𝑧1 + 3𝑧3 , 𝑧3 + 3𝑧4 ).
Thus with respect to the standard basis of 𝐅4 , the matrix of 𝑇 is
⎛ 0 −1 0 0 ⎞
⎜ 1  0 0 0 ⎟
⎜ 2  0 3 0 ⎟
⎝ 0  0 1 3 ⎠.
You can ask a computer to verify that the minimal polynomial of 𝑇 is the polyno-
mial 𝑝 defined by
𝑝(𝑧) = 9 − 6𝑧 + 10𝑧2 − 6𝑧3 + 𝑧4 .
First consider the case where 𝐅 = 𝐑 . Then the polynomial 𝑝 factors as
𝑝(𝑧) = (𝑧2 + 1)(𝑧 − 3)(𝑧 − 3),
with no further factorization of 𝑧2 + 1 as the product of two polynomials with real
coefficients and degree 1. Thus the following result 5.43 states that there does not
exist a basis of 𝐑4 with respect to which 𝑇 has an upper-triangular matrix.
Now consider the case where 𝐅 = 𝐂 . Then the polynomial 𝑝 factors as
𝑝(𝑧) = (𝑧 − 𝑖)(𝑧 + 𝑖)(𝑧 − 3)(𝑧 − 3),
where all the factors above have the form 𝑧−𝜆𝑘 . Thus 5.43 states that there is a basis
of 𝐂4 with respect to which 𝑇 has an upper-triangular matrix. Indeed, you can ver-
ify that with respect to the basis (4−3𝑖, −3−4𝑖, −3 + 𝑖, 1), (4 + 3𝑖, −3 + 4𝑖, −3−𝑖, 1),
(0, 0, 0, 1), (0, 0, 1, 0) of 𝐂4 , the operator 𝑇 has the upper-triangular matrix
⎛ 𝑖  0 0 0 ⎞
⎜ 0 −𝑖 0 0 ⎟
⎜ 0  0 3 1 ⎟
⎝ 0  0 0 3 ⎠.

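One way to carry out the computer verification mentioned above is the small SymPy sketch below (SymPy is an assumption here, not something the text relies on); for this particular matrix the characteristic polynomial coincides with the minimal polynomial 𝑝, so charpoly suffices:

    from sympy import Matrix, symbols, factor, roots

    z = symbols('z')
    M = Matrix([[0, -1, 0, 0],
                [1,  0, 0, 0],
                [2,  0, 3, 0],
                [0,  0, 1, 3]])
    p = M.charpoly(z).as_expr()
    print(p)            # z**4 - 6*z**3 + 10*z**2 - 6*z + 9
    print(factor(p))    # (z - 3)**2*(z**2 + 1), the factorization over R
    print(roots(p, z))  # {3: 2, I: 1, -I: 1}, giving the factorization over C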
5.43 necessary and sufficient condition to have an upper-triangular matrix

Suppose 𝑇 ∈ ℒ(𝑉). Then 𝑇 has an upper-triangular matrix with respect to some basis of 𝑉 if and only if the minimal polynomial of 𝑇 equals (𝑧 − 𝜆1 )⋯(𝑧 − 𝜆𝑚 ) for some 𝜆1 , …, 𝜆𝑚 ∈ 𝐅 .

Proof First suppose 𝑇 has an upper-triangular matrix with respect to some basis
of 𝑉. Let 𝛼1 , …, 𝛼𝑛 denote the diagonal entries of that matrix. Define a polynomial
𝑞 ∈ 𝒫(𝐅) by
𝑞(𝑧) = (𝑧 − 𝛼1 )⋯(𝑧 − 𝛼𝑛 ).
Then 𝑞(𝑇) = 0, by 5.39. Hence 𝑞 is a polynomial multiple of the minimal polyno-
mial of 𝑇, by 5.31. Thus the minimal polynomial of 𝑇 equals (𝑧 − 𝜆1 )⋯(𝑧 − 𝜆𝑚 )
for some 𝜆1 , …, 𝜆𝑚 ∈ 𝐅 with {𝜆1 , …, 𝜆𝑚 } ⊂ {𝛼1 , …, 𝛼𝑛 }.
To prove the implication in the other direction, now suppose the minimal
polynomial of 𝑇 equals (𝑧 − 𝜆1 )⋯(𝑧 − 𝜆𝑚 ) for some 𝜆1 , …, 𝜆𝑚 ∈ 𝐅 . We will use
induction on 𝑚. To get started, if 𝑚 = 1 then 𝑧 − 𝜆1 is the minimal polynomial of
𝑇, which implies that 𝑇 = 𝜆1 𝐼, which implies that the matrix of 𝑇 (with respect
to any basis of 𝑉) is upper-triangular.
Now suppose 𝑚 > 1 and the desired result holds for 𝑚 − 1. Let

𝑈 = range(𝑇 − 𝜆𝑚 𝐼).

Then 𝑈 is invariant under 𝑇 [this is a special case of 5.20 with 𝑝(𝑧) = 𝑧 − 𝜆𝑚 ]. Thus 𝑇|𝑈 is an operator on 𝑈.
If 𝑢 ∈ 𝑈, then 𝑢 = (𝑇 − 𝜆𝑚 𝐼)𝑣 for some 𝑣 ∈ 𝑉 and

(𝑇 − 𝜆1 𝐼)⋯(𝑇 − 𝜆𝑚 − 1 𝐼)𝑢 = (𝑇 − 𝜆1 𝐼)⋯(𝑇 − 𝜆𝑚 𝐼)𝑣 = 0.

Hence (𝑧 − 𝜆1 )⋯(𝑧 − 𝜆𝑚 − 1 ) is a polynomial multiple of the minimal polynomial of 𝑇|𝑈 , by 5.31.
By our induction hypothesis, there is a basis 𝑢1 , …, 𝑢𝑀 of 𝑈 with respect to
which 𝑇|𝑈 has an upper-triangular matrix. Thus for each 𝑘 ∈ {1, …, 𝑀}, we have
(using 5.38)

5.44 𝑇𝑢𝑘 = (𝑇|𝑈 )(𝑢𝑘 ) ∈ span(𝑢1 , …, 𝑢𝑘 ).

Extend 𝑢1 , …, 𝑢𝑀 to a basis 𝑢1 , …, 𝑢𝑀 , 𝑣1 , …, 𝑣𝑁 of 𝑉. For each 𝑘 ∈ {1, …, 𝑁}, we have
𝑇𝑣𝑘 = (𝑇 − 𝜆𝑚 𝐼)𝑣𝑘 + 𝜆𝑚 𝑣𝑘 .
The definition of 𝑈 shows that (𝑇 − 𝜆𝑚 𝐼)𝑣𝑘 ∈ 𝑈 = span(𝑢1 , …, 𝑢𝑀 ). Thus the
equation above shows that

5.45 𝑇𝑣𝑘 ∈ span(𝑢1 , …, 𝑢𝑀 , 𝑣1 , …, 𝑣𝑘 ).

From 5.44 and 5.45, we conclude (using 5.38) that 𝑇 has an upper-triangular
matrix with respect to the basis 𝑢1 , …, 𝑢𝑀 , 𝑣1 , …, 𝑣𝑁 of 𝑉, as desired.

The set of numbers {𝜆1 , …, 𝜆𝑚 } from the previous result equals the set of eigenvalues of 𝑇 (because the set of zeros of the minimal polynomial of 𝑇 equals the set of eigenvalues of 𝑇, by 5.29), although the list 𝜆1 , …, 𝜆𝑚 in the previous result may contain repetitions.
In Chapter 8 we will improve even the wonderful result of part (a) below; see
8.36 and 8.54.

5.46 if 𝐅 = 𝐂 , then every operator on 𝑉 has an upper-triangular matrix

Suppose 𝑇 ∈ ℒ(𝑉).
(a) If 𝐅 = 𝐂 , then 𝑇 has an upper-triangular matrix with respect to some
basis of 𝑉.
(b) If 𝐅 = 𝐑 , then 𝑇 has an upper-triangular matrix with respect to some basis
of 𝑉 if and only if every zero of the minimal polynomial of 𝑇, thought of
as a polynomial with complex coefficients, is real.

Proof The desired result follows immediately from 5.43 and the second version
of the fundamental theorem of algebra (see 4.13).

For an extension of the result above to two operators 𝑆 and 𝑇 such that 𝑆𝑇 = 𝑇𝑆, see 5.74. Also, for an extension to more than two operators, see Exercise 8(b) in Section 5E.
See Exercise 18 in Section 9A for additional necessary and sufficient conditions for an operator on a real vector space to have an upper-triangular matrix with respect to some basis.
Caution: If an operator 𝑇 ∈ ℒ(𝑉) has an upper-triangular matrix with respect
to some basis 𝑣1 , …, 𝑣𝑛 of 𝑉, then the eigenvalues of 𝑇 are exactly the entries on
the diagonal of ℳ(𝑇), as shown by 5.40, and furthermore 𝑣1 is an eigenvector of
𝑇. However, 𝑣2 , …, 𝑣𝑛 need not be eigenvectors of 𝑇. Indeed, a basis vector 𝑣𝑘 is
an eigenvector of 𝑇 if and only if all the entries in the 𝑘 th column of the matrix of
𝑇 are 0, except possibly the 𝑘 th entry.
You may recall from a previous course that every matrix of numbers can be changed to a matrix in what is called row echelon form. If starting with a square matrix, the matrix in row echelon form will be an upper-triangular matrix. Do not confuse this upper-triangular matrix with the upper-triangular matrix of an operator with respect to some basis whose existence is proclaimed by 5.46(a) if 𝐅 = 𝐂 ; there is no connection between these upper-triangular matrices.
The row echelon form of the matrix of an operator provides no information about the eigenvalues of the operator. In contrast, an upper-triangular matrix with respect to some basis provides complete information about the eigenvalues of the operator. However, there is no method for computing exactly such an upper-triangular matrix, even though 5.46(a) guarantees its existence if 𝐅 = 𝐂 .

Exercises 5C
1 (a) Prove or give a counterexample: If 𝑇 ∈ ℒ(𝑉) and 𝑇 has an upper-triangular matrix with respect to some basis of 𝑉, then 𝑇 2 has an upper-triangular matrix with respect to some basis of 𝑉.
(b) Prove or give a counterexample: If 𝑇 ∈ ℒ(𝑉) and 𝑇 2 has an upper-triangular matrix with respect to some basis of 𝑉, then 𝑇 has an upper-triangular matrix with respect to some basis of 𝑉.
2 Suppose 𝐴 and 𝐵 are upper-triangular matrices of the same size. Show that
𝐴 + 𝐵 and 𝐴𝐵 are upper-triangular matrices.
The result in this exercise is used in the proof of 5.75.

3 Suppose 𝑇 ∈ ℒ(𝑉) and 𝑣1 , …, 𝑣𝑛 is a basis of 𝑉 with respect to which the matrix of 𝑇 is upper triangular.
(a) Show that for every positive integer 𝑚, the matrix of 𝑇 𝑚 is also upper
triangular with respect to the basis 𝑣1 , …, 𝑣𝑛 .
(b) Suppose 𝑇 is invertible. Show that the matrix of 𝑇 −1 is also upper
triangular with respect to the basis 𝑣1 , …, 𝑣𝑛 .

4 Give an example of an operator whose matrix with respect to some basis contains only 0’s on the diagonal, but the operator is invertible.
This exercise and the exercise below show that 5.40 fails without the hypoth-
esis that an upper-triangular matrix is under consideration.

5 Give an example of an operator whose matrix with respect to some basis contains only nonzero numbers on the diagonal, but the operator is not invertible.
6 Prove that if 𝑉 is a complex vector space and 𝑇 ∈ ℒ(𝑉), then 𝑇 has an
invariant subspace of dimension 𝑘 for each 𝑘 = 1, …, dim 𝑉.
7 Suppose 𝑇 ∈ ℒ(𝑉) and 𝑣 ∈ 𝑉.
(a) Prove that there exists a unique monic polynomial 𝑝𝑣 of smallest degree
such that 𝑝𝑣 (𝑇)𝑣 = 0.
(b) Prove that the minimal polynomial of 𝑇 is a polynomial multiple of 𝑝𝑣 .

8 Suppose that 𝑇 ∈ ℒ(𝑉) and there exists a nonzero vector 𝑣 ∈ 𝑉 such that
𝑇 2 𝑣 + 2𝑇𝑣 = −2𝑣.
(a) Prove that if 𝐅 = 𝐑 , then there does not exist a basis of 𝑉 with respect
to which 𝑇 has an upper-triangular matrix.
(b) Prove that if 𝐅 = 𝐂 and 𝐴 is an upper-triangular matrix that equals
the matrix of 𝑇 with respect to some basis of 𝑉, then −1 + 𝑖 or −1 − 𝑖
appears on the diagonal of 𝐴.

5D Diagonalizable Operators

5.47 definition: diagonal matrix

A diagonal matrix is a square matrix that is 0 everywhere except possibly on the diagonal.

5.48 example: diagonal matrix


⎛ 8 0 0 ⎞
⎜ 0 5 0 ⎟
⎝ 0 0 5 ⎠
is a diagonal matrix.

If an operator has a diagonal matrix with respect to some basis, then the entries on the diagonal are precisely the eigenvalues of the operator; this follows from 5.40 (or find an easier direct proof for diagonal matrices).
Every diagonal matrix is upper triangular. Diagonal matrices typically have many more 0’s than most upper-triangular matrices of the same size.

5.49 definition: diagonalizable

An operator on 𝑉 is called diagonalizable if the operator has a diagonal matrix with respect to some basis of 𝑉.

5.50 example: diagonalization may require a different basis


Define 𝑇 ∈ ℒ(𝐑2 ) by

𝑇(𝑥, 𝑦) = (41𝑥 + 7𝑦, −20𝑥 + 74𝑦).

The matrix of 𝑇 with respect to the standard basis of 𝐑2 is
⎛  41  7 ⎞
⎝ −20 74 ⎠,

which is not a diagonal matrix. However, 𝑇 is diagonalizable. Specifically, the matrix of 𝑇 with respect to the basis (1, 4), (7, 5) is
⎛ 69  0 ⎞
⎝  0 46 ⎠

because 𝑇(1, 4) = (69, 276) = 69(1, 4) and 𝑇(7, 5) = (322, 230) = 46(7, 5).

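The change of basis in this example is easy to check numerically; in the sketch below (assuming NumPy), the columns of 𝑃 are the new basis vectors, and conjugating 𝐴 by 𝑃 produces the diagonal matrix:

    import numpy as np

    A = np.array([[41.0, 7.0],
                  [-20.0, 74.0]])   # matrix of T in the standard basis
    P = np.array([[1.0, 7.0],
                  [4.0, 5.0]])      # columns are the basis vectors (1, 4) and (7, 5)
    D = np.linalg.inv(P) @ A @ P
    print(np.round(D))              # [[69, 0], [0, 46]]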
For 𝜆 ∈ 𝐅 , we will find it convenient to have a name and a notation for the set
of vectors that an operator 𝑇 maps to 𝜆 times the vector.

5.51 definition: eigenspace, 𝐸(𝜆, 𝑇)

Suppose 𝑇 ∈ ℒ(𝑊) and 𝜆 ∈ 𝐅 . The eigenspace of 𝑇 corresponding to 𝜆 is the subspace 𝐸(𝜆, 𝑇) of 𝑊 defined by

𝐸(𝜆, 𝑇) = null(𝑇 − 𝜆𝐼).

In other words, 𝐸(𝜆, 𝑇) is the set of all eigenvectors of 𝑇 corresponding to 𝜆, along with the 0 vector.

For 𝑇 ∈ ℒ(𝑊) and 𝜆 ∈ 𝐅 , the set 𝐸(𝜆, 𝑇) is a subspace of 𝑊 because the null space of each linear map on 𝑊 is a subspace of 𝑊. The definitions imply
that 𝜆 is an eigenvalue of 𝑇 if and only if 𝐸(𝜆, 𝑇) ≠ {0}.

5.52 example: eigenspaces of an operator


Suppose the matrix of an operator 𝑇 ∈ ℒ(𝑉) with respect to a basis 𝑣1 , 𝑣2 , 𝑣3
of 𝑉 is the matrix in Example 5.48 above. Then
𝐸(8, 𝑇) = span(𝑣1 ), 𝐸(5, 𝑇) = span(𝑣2 , 𝑣3 ).
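Eigenspaces can be computed directly from the definition 𝐸(𝜆, 𝑇) = null(𝑇 − 𝜆𝐼); here is a sketch for the matrix of Example 5.48, assuming SymPy:

    from sympy import Matrix, eye

    M = Matrix([[8, 0, 0],
                [0, 5, 0],
                [0, 0, 5]])
    # E(5, T) = null(T - 5I); SymPy returns a basis of the null space.
    print((M - 5 * eye(3)).nullspace())   # two vectors, spanning span(v2, v3)
    print((M - 8 * eye(3)).nullspace())   # one vector, spanning span(v1)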

If 𝜆 is an eigenvalue of an operator 𝑇 ∈ ℒ(𝑊), then 𝑇 restricted to 𝐸(𝜆, 𝑇) is just the operator of multiplication by 𝜆.

5.53 sum of eigenspaces is a direct sum

Suppose 𝑇 ∈ ℒ(𝑉) and 𝜆1 , …, 𝜆𝑚 are distinct eigenvalues of 𝑇. Then

𝐸(𝜆1 , 𝑇) + ⋯ + 𝐸(𝜆𝑚 , 𝑇)

is a direct sum. Furthermore,

dim 𝐸(𝜆1 , 𝑇) + ⋯ + dim 𝐸(𝜆𝑚 , 𝑇) ≤ dim 𝑉.

Proof To show that 𝐸(𝜆1 , 𝑇) + ⋯ + 𝐸(𝜆𝑚 , 𝑇) is a direct sum, suppose
𝑢1 + ⋯ + 𝑢𝑚 = 0,
where each 𝑢𝑘 is in 𝐸(𝜆𝑘 , 𝑇). Because eigenvectors corresponding to distinct
eigenvalues are linearly independent (see 5.11), this implies that each 𝑢𝑘 equals 0.
This implies (using 1.46) that 𝐸(𝜆1 , 𝑇) + ⋯ + 𝐸(𝜆𝑚 , 𝑇) is a direct sum, as desired.
Now
dim 𝐸(𝜆1 , 𝑇) + ⋯ + dim 𝐸(𝜆𝑚 , 𝑇) = dim(𝐸(𝜆1 , 𝑇) ⊕ ⋯ ⊕ 𝐸(𝜆𝑚 , 𝑇))
≤ dim 𝑉,
where the first line above follows from 3.73.

The following characterizations of diagonalizable operators will be useful.

5.54 conditions equivalent to diagonalizability

Suppose 𝑇 ∈ ℒ(𝑉). Let 𝜆1 , …, 𝜆𝑚 denote the distinct eigenvalues of 𝑇. Then the following are equivalent.
(a) 𝑇 is diagonalizable
(b) 𝑉 has a basis consisting of eigenvectors of 𝑇
(c) 𝑉 = 𝐸(𝜆1 , 𝑇) ⊕ ⋯ ⊕ 𝐸(𝜆𝑚 , 𝑇)
(d) dim 𝑉 = dim 𝐸(𝜆1 , 𝑇) + ⋯ + dim 𝐸(𝜆𝑚 , 𝑇)

Proof An operator 𝑇 ∈ ℒ(𝑉) has a diagonal matrix
⎛ 𝜆1     0  ⎞
⎜     ⋱     ⎟
⎝ 0      𝜆𝑛 ⎠
with respect to a basis 𝑣1 , …, 𝑣𝑛 of 𝑉 if and only if 𝑇𝑣𝑘 = 𝜆𝑘 𝑣𝑘 for each 𝑘. Thus
(a) and (b) are equivalent.
Suppose (b) holds; thus 𝑉 has a basis consisting of eigenvectors of 𝑇. Hence
every vector in 𝑉 is a linear combination of eigenvectors of 𝑇, which implies that
𝑉 = 𝐸(𝜆1 , 𝑇) + ⋯ + 𝐸(𝜆𝑚 , 𝑇).
Now 5.53 shows that (c) holds.
That (c) implies (d) follows immediately from 3.73.
Finally, suppose (d) holds; thus
5.55 dim 𝑉 = dim 𝐸(𝜆1 , 𝑇) + ⋯ + dim 𝐸(𝜆𝑚 , 𝑇).
Choose a basis of each 𝐸(𝜆𝑘 , 𝑇); put all these bases together to form a list 𝑣1 , …, 𝑣𝑛
of eigenvectors of 𝑇, where 𝑛 = dim 𝑉 (by 5.55). To show that this list is linearly
independent, suppose
𝑎1 𝑣1 + ⋯ + 𝑎𝑛 𝑣𝑛 = 0,
where 𝑎1 , …, 𝑎𝑛 ∈ 𝐅 . For each 𝑘 = 1, …, 𝑚, let 𝑢𝑘 denote the sum of all the terms
𝑎𝑗 𝑣𝑗 such that 𝑣𝑗 ∈ 𝐸(𝜆𝑘 , 𝑇). Thus each 𝑢𝑘 is in 𝐸(𝜆𝑘 , 𝑇), and
𝑢1 + ⋯ + 𝑢𝑚 = 0.
Because eigenvectors corresponding to distinct eigenvalues are linearly indepen-
dent (see 5.11), this implies that each 𝑢𝑘 equals 0. Because each 𝑢𝑘 is a sum of
terms 𝑎𝑗 𝑣𝑗 , where the 𝑣𝑗 ’s were chosen to be a basis of 𝐸(𝜆𝑘 , 𝑇), this implies that
all the 𝑎𝑗 ’s equal 0. Thus 𝑣1 , …, 𝑣𝑛 is linearly independent and hence is a basis of
𝑉 (by 2.38). Thus (d) implies (b), completing the proof.

For additional conditions equivalent to diagonalizability, see 5.61, Exercises 4 and 14 in this section, Exercise 20 in Section 7B, and Exercise 3 in Section 8B.

As we know, every operator on a finite-dimensional complex vector space has an eigenvalue. However, not every operator on a finite-dimensional complex vector space has enough eigenvectors to be diagonalizable, as shown by the next example.

5.56 example: an operator that is not diagonalizable


Define an operator 𝑇 ∈ ℒ(𝐅3 ) by 𝑇(𝑎, 𝑏, 𝑐) = (𝑏, 𝑐, 0). The matrix of 𝑇 with
respect to the standard basis of 𝐅3 is
⎛ 0 1 0 ⎞
⎜ 0 0 1 ⎟
⎝ 0 0 0 ⎠,
which is an upper-triangular matrix but is not a diagonal matrix.
As you should verify, 0 is the only eigenvalue of 𝑇 and furthermore

𝐸(0, 𝑇) = {(𝑎, 0, 0) ∈ 𝐅3 ∶ 𝑎 ∈ 𝐅}.

Thus conditions (b), (c), and (d) of 5.54 are easily seen to fail (of course,
because these conditions are equivalent, it is sufficient to check that only one of
them fails). Thus condition (a) of 5.54 also fails, and hence 𝑇 is not diagonalizable,
regardless of whether 𝐅 = 𝐑 or 𝐅 = 𝐂 .
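A sketch (assuming SymPy) confirming that condition (d) of 5.54 fails for this operator:

    from sympy import Matrix

    M = Matrix([[0, 1, 0],
                [0, 0, 1],
                [0, 0, 0]])
    print(M.eigenvals())       # {0: 3}, so 0 is the only eigenvalue
    print(len(M.nullspace()))  # dim E(0, T) = 1 < 3 = dim F^3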

The next result shows that if an operator has as many distinct eigenvalues as
the dimension of its domain, then the operator is diagonalizable.

5.57 enough eigenvalues implies diagonalizability

Suppose 𝑇 ∈ ℒ(𝑉) has dim 𝑉 distinct eigenvalues. Then 𝑇 is diagonalizable.

Proof Suppose 𝑇 has distinct eigenvalues 𝜆1 , …, 𝜆dim 𝑉 . For each 𝑘, let 𝑣𝑘 ∈ 𝑉 be an eigenvector corresponding to the eigenvalue 𝜆𝑘 . Because eigenvectors corre-
sponding to distinct eigenvalues are linearly independent (see 5.11), 𝑣1 , …, 𝑣dim 𝑉
is linearly independent.
A linearly independent list of dim 𝑉 vectors in 𝑉 is a basis of 𝑉 (see 2.38); thus
𝑣1 , …, 𝑣dim 𝑉 is a basis of 𝑉. With respect to this basis consisting of eigenvectors,
𝑇 has a diagonal matrix.

In later chapters we will find additional conditions that imply that certain
operators are diagonalizable. For example, see the real spectral theorem (7.29)
and the complex spectral theorem (7.31).
The result above gives a sufficient condition for an operator to be diagonalizable. However, this condition is not necessary. For example, the operator 𝑇
on 𝐅3 defined by 𝑇(𝑥, 𝑦, 𝑧) = (6𝑥, 6𝑦, 7𝑧) has only 2 eigenvalues (6 and 7) and
dim 𝐅3 = 3, but 𝑇 is diagonalizable (by the standard basis of 𝐅3 ).

The next example illustrates the importance of diagonalization, which can be used to compute high powers of an operator, taking advantage of the equation 𝑇 𝑘 𝑣 = 𝜆𝑘 𝑣 if 𝑣 is an eigenvector of 𝑇 with eigenvalue 𝜆.
For a spectacular application of these techniques, see Exercise 16, which shows how to use diagonalization to find an exact formula for the 𝑛th term of the Fibonacci sequence.

5.58 example: using diagonalization to compute 𝑇 100


Define 𝑇 ∈ ℒ(𝐅3 ) by 𝑇(𝑥, 𝑦, 𝑧) = (2𝑥 + 𝑦, 5𝑦 + 3𝑧, 8𝑧). With respect to the
standard basis, the matrix of 𝑇 is
⎛ 2 1 0 ⎞
⎜ 0 5 3 ⎟
⎝ 0 0 8 ⎠.
The matrix above is an upper-triangular matrix but it is not a diagonal matrix. By
5.40, the eigenvalues of 𝑇 are 2, 5, and 8. Because 𝑇 is an operator on a vector
space with dimension 3 and 𝑇 has three distinct eigenvalues, 5.57 assures us that
there exists a basis of 𝐅3 with respect to which 𝑇 has a diagonal matrix.
To find this basis, we only have to find an eigenvector for each eigenvalue. In
other words, we have to find a nonzero solution to the equation

𝑇(𝑥, 𝑦, 𝑧) = 𝜆(𝑥, 𝑦, 𝑧)

for 𝜆 = 2, then for 𝜆 = 5, and then for 𝜆 = 8. These simple equations are easy
to solve: for 𝜆 = 2 we have the eigenvector (1, 0, 0); for 𝜆 = 5 we have the
eigenvector (1, 3, 0); for 𝜆 = 8 we have the eigenvector (1, 6, 6).
Thus (1, 0, 0), (1, 3, 0), (1, 6, 6) is a basis of 𝐅3 consisting of eigenvectors of 𝑇,
and with respect to this basis the matrix of 𝑇 is the diagonal matrix
⎛ 2 0 0 ⎞
⎜ 0 5 0 ⎟
⎝ 0 0 8 ⎠.

To compute 𝑇 100 (0, 0, 1), for example, write (0, 0, 1) as a linear combination
of our basis of eigenvectors:

(0, 0, 1) = (1/6)(1, 0, 0) − (1/3)(1, 3, 0) + (1/6)(1, 6, 6).

Now apply 𝑇 100 to both sides of the equation above, getting

𝑇 100 (0, 0, 1) = (1/6)(𝑇 100 (1, 0, 0)) − (1/3)(𝑇 100 (1, 3, 0)) + (1/6)(𝑇 100 (1, 6, 6))
= (1/6)(2100 (1, 0, 0) − 2 ⋅ 5100 (1, 3, 0) + 8100 (1, 6, 6))
= (1/6)(2100 − 2 ⋅ 5100 + 8100 , 6 ⋅ 8100 − 6 ⋅ 5100 , 6 ⋅ 8100 ).
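Because the entries involved are exact integers, the computation above can be verified directly against the 100th matrix power; a sketch assuming SymPy:

    from sympy import Matrix

    M = Matrix([[2, 1, 0],
                [0, 5, 3],
                [0, 0, 8]])
    lhs = M**100 * Matrix([0, 0, 1])          # exact integer arithmetic
    rhs = Matrix([2**100 - 2 * 5**100 + 8**100,
                  6 * 8**100 - 6 * 5**100,
                  6 * 8**100]) / 6
    assert lhs == rhs                          # matches the formula just derived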

We saw earlier that an operator 𝑇 ∈ ℒ(𝑉) has an upper-triangular matrix with respect to some basis of 𝑉 if and only if the minimal polynomial of 𝑇 equals (𝑧 − 𝜆1 )⋯(𝑧 − 𝜆𝑚 ) for some 𝜆1 , …, 𝜆𝑚 ∈ 𝐅 (see 5.43). As we previously noted, this condition is always satisfied if 𝐅 = 𝐂 .
Our next result 5.61 states that an operator 𝑇 ∈ ℒ(𝑉) has a diagonal matrix
with respect to some basis of 𝑉 if and only if the minimal polynomial of 𝑇 equals
(𝑧 − 𝜆1 )⋯(𝑧 − 𝜆𝑚 ) for some distinct 𝜆1 , …, 𝜆𝑚 ∈ 𝐅 .

5.59 example: diagonalizable, but with no known exact eigenvalues


Define 𝑇 ∈ ℒ(𝐂5 ) by
𝑇(𝑧1 , 𝑧2 , 𝑧3 , 𝑧4 , 𝑧5 ) = (−3𝑧5 , 𝑧1 + 6𝑧5 , 𝑧2 , 𝑧3 , 𝑧4 ).
The matrix of 𝑇 is shown in Example 5.28, where we showed that the minimal
polynomial of 𝑇 is 3 − 6𝑧 + 𝑧5 .
As mentioned in Example 5.30, no exact expression is known for any of the
zeros of this polynomial, but numeric techniques show that the zeros of this
polynomial are approximately −1.67, 0.51, 1.40, −0.12 + 1.59𝑖, −0.12 − 1.59𝑖.
The software that produces these approximations is accurate to more than
three digits. Thus these approximations are good enough to show that the five
numbers above are distinct. The minimal polynomial of 𝑇 equals the degree 5
monic polynomial with these zeros.
Now 5.61 shows that 𝑇 is diagonalizable.
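The numeric approximations quoted above are easy to reproduce; a sketch assuming NumPy:

    import numpy as np

    # Coefficients of z^5 - 6z + 3, highest power first.
    print(np.sort_complex(np.roots([1, 0, 0, 0, -6, 3])))
    # approximately: -1.67, -0.12 - 1.59i, -0.12 + 1.59i, 0.51, 1.40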

5.60 example: showing that an operator is not diagonalizable


Define 𝑇 ∈ ℒ(𝐅3 ) by
𝑇(𝑧1 , 𝑧2 , 𝑧3 ) = (6𝑧1 + 3𝑧2 + 4𝑧3 , 6𝑧2 + 2𝑧3 , 7𝑧3 ).
The matrix of 𝑇 with respect to the standard basis of 𝐅3 is
⎛ 6 3 4 ⎞
⎜ 0 6 2 ⎟
⎝ 0 0 7 ⎠.
The matrix above is an upper-triangular matrix but is not a diagonal matrix. Might
𝑇 have a diagonal matrix with respect to some other basis of 𝐅3 ?
To answer this question, we will find the minimal polynomial of 𝑇. First note
that the eigenvalues of 𝑇 are the diagonal entries of the matrix above (by 5.40).
Thus the zeros of the minimal polynomial of 𝑇 are 6, 7 [by 5.29(a)]. The diagonal
of the matrix above tells us that (𝑇 − 6𝐼)2 (𝑇 − 7𝐼) = 0 (by 5.39). The minimal
polynomial of 𝑇 has degree at most 3 (by 5.24). Putting all this together, we see
that the minimal polynomial of 𝑇 is either (𝑧 − 6)(𝑧 − 7) or (𝑧 − 6)2 (𝑧 − 7).
A simple computation shows that (𝑇 − 6𝐼)(𝑇 − 7𝐼) ≠ 0. Thus the minimal
polynomial of 𝑇 is (𝑧 − 6)2 (𝑧 − 7).
Now 5.61 shows that 𝑇 is not diagonalizable (for 𝐅 = 𝐑 and for 𝐅 = 𝐂 ).
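The simple computation mentioned above can be done exactly; the following sketch (assuming SymPy) checks both candidate minimal polynomials:

    from sympy import Matrix, eye, zeros

    M = Matrix([[6, 3, 4],
                [0, 6, 2],
                [0, 0, 7]])
    I3 = eye(3)
    print((M - 6 * I3) * (M - 7 * I3) == zeros(3, 3))      # False
    print((M - 6 * I3)**2 * (M - 7 * I3) == zeros(3, 3))   # True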

5.61 necessary and sufficient condition for diagonalizability

Suppose 𝑇 ∈ ℒ(𝑉). Then 𝑇 is diagonalizable if and only if the minimal polynomial of 𝑇 equals (𝑧 − 𝜆1 )⋯(𝑧 − 𝜆𝑚 ) for some list of distinct numbers 𝜆1 , …, 𝜆𝑚 ∈ 𝐅 .

Proof First suppose 𝑇 is diagonalizable. Thus there is a basis 𝑣1 , …, 𝑣𝑛 of 𝑉 consisting of eigenvectors of 𝑇. Let 𝜆1 , …, 𝜆𝑚 be the distinct eigenvalues of 𝑇. Then for each 𝑣𝑗 , there exists 𝜆𝑘 with (𝑇 − 𝜆𝑘 𝐼)𝑣𝑗 = 0. Thus
(𝑇 − 𝜆1 𝐼)⋯(𝑇 − 𝜆𝑚 𝐼)𝑣𝑗 = 0,
because the factors of the operator above commute. Hence (𝑇 − 𝜆1 𝐼)⋯(𝑇 − 𝜆𝑚 𝐼) is 0 on each vector in a basis of 𝑉 and thus equals 0. Now 5.31 implies that (𝑧 − 𝜆1 )⋯(𝑧 − 𝜆𝑚 ) is a polynomial multiple of the minimal polynomial of 𝑇. Because each 𝜆𝑘 is an eigenvalue of 𝑇 and thus a zero of the minimal polynomial of 𝑇 (by 5.29), the minimal polynomial is a monic factor of (𝑧 − 𝜆1 )⋯(𝑧 − 𝜆𝑚 ) having all of the distinct zeros 𝜆1 , …, 𝜆𝑚 ; hence the minimal polynomial of 𝑇 equals (𝑧 − 𝜆1 )⋯(𝑧 − 𝜆𝑚 ).
To prove the implication in the other direction, now suppose the minimal
polynomial of 𝑇 equals (𝑧 − 𝜆1 )⋯(𝑧 − 𝜆𝑚 ) for some list of distinct numbers
𝜆1 , …, 𝜆𝑚 ∈ 𝐅 . Thus
5.62 (𝑇 − 𝜆1 𝐼)⋯(𝑇 − 𝜆𝑚 𝐼) = 0.
We will prove that 𝑇 is diagonalizable by induction on 𝑚. To get started,
suppose 𝑚 = 1. Then 𝑇 − 𝜆1 𝐼 = 0, which means that 𝑇 is a scalar multiple of the
identity operator, which implies that 𝑇 is diagonalizable.
Now suppose that 𝑚 > 1 and the desired result holds for all smaller values of
𝑚. The subspace range(𝑇 − 𝜆𝑚 𝐼) is invariant under 𝑇 [this is a special case of
5.20 with 𝑝(𝑧) = 𝑧 − 𝜆𝑚 ]. In other words, 𝑇 restricted to range(𝑇 − 𝜆𝑚 𝐼) is an
operator on range(𝑇 − 𝜆𝑚 𝐼).
If 𝑢 ∈ range(𝑇 − 𝜆𝑚 𝐼), then 𝑢 = (𝑇 − 𝜆𝑚 𝐼)𝑣 for some 𝑣 ∈ 𝑉, and 5.62
implies
5.63 (𝑇 − 𝜆1 𝐼)⋯(𝑇 − 𝜆𝑚 − 1 𝐼)𝑢 = (𝑇 − 𝜆1 𝐼)⋯(𝑇 − 𝜆𝑚 𝐼)𝑣 = 0.
Hence (𝑧 − 𝜆1 )⋯(𝑧 − 𝜆𝑚 − 1 ) is a polynomial multiple of the minimal polynomial
of 𝑇 restricted to range(𝑇 − 𝜆𝑚 𝐼) [by 5.31]. Thus by our induction hypothesis,
there is a basis of range(𝑇 − 𝜆𝑚 𝐼) consisting of eigenvectors of 𝑇.
Suppose that 𝑢 ∈ range(𝑇 − 𝜆𝑚 𝐼) ∩ null(𝑇 − 𝜆𝑚 𝐼). Then 𝑇𝑢 = 𝜆𝑚 𝑢. Now
5.63 implies that
0 = (𝑇 − 𝜆1 𝐼)⋯(𝑇 − 𝜆𝑚 − 1 𝐼)𝑢
= (𝜆𝑚 − 𝜆1 )⋯(𝜆𝑚 − 𝜆𝑚 − 1 )𝑢.
Because 𝜆1 , …, 𝜆𝑚 are distinct, the equation above implies that 𝑢 = 0. Hence
range(𝑇 − 𝜆𝑚 𝐼) ∩ null(𝑇 − 𝜆𝑚 𝐼) = {0}.
Thus range(𝑇− 𝜆𝑚 𝐼)+null(𝑇− 𝜆𝑚 𝐼) is a direct sum (by 1.47) whose dimension
is dim 𝑉 (by 3.73 and 3.21). Hence range(𝑇 − 𝜆𝑚 𝐼) ⊕ null(𝑇 − 𝜆𝑚 𝐼) = 𝑉. Every
vector in null(𝑇 − 𝜆𝑚 𝐼) is an eigenvector of 𝑇 with eigenvalue 𝜆𝑚 . Earlier in this proof we saw that there is a basis of range(𝑇 − 𝜆𝑚 𝐼) consisting of eigenvectors of 𝑇.
Adjoining to that basis a basis of null(𝑇 − 𝜆𝑚 𝐼) gives a basis of 𝑉 consisting of
eigenvectors of 𝑇. The matrix of 𝑇 with respect to this basis is a diagonal matrix,
as desired.

No formula using only the usual algebraic operations and extraction of roots exists for the zeros of polynomials of degree 5 or greater. However, the previous result can be used to determine whether or not an operator on a complex vector space is diagonalizable without even finding approximations of the zeros of the minimal polynomial; see Exercise 14.
The next result will be a key tool when we prove a result about the simul-
taneous diagonalization of two operators; see 5.70. Note how the use of the
characterization of diagonalizable operators in terms of the minimal polynomial
(see 5.61) leads to an easy proof of the next result.

5.64 restriction of diagonalizable to invariant subspace is diagonalizable

Suppose 𝑇 ∈ ℒ(𝑉) is diagonalizable and 𝑈 is a subspace of 𝑉 that is invariant under 𝑇. Then 𝑇|𝑈 is a diagonalizable operator on 𝑈.

Proof The minimal polynomial of 𝑇 is a polynomial 𝑝 of the form
𝑝(𝑧) = (𝑧 − 𝜆1 )⋯(𝑧 − 𝜆𝑚 )
for some list of distinct numbers 𝜆1 , …, 𝜆𝑚 ∈ 𝐅 (by 5.61). If 𝑢 ∈ 𝑈, then
(𝑝(𝑇|𝑈 ))(𝑢) = (𝑝(𝑇))(𝑢) = 0.
Hence 𝑝(𝑇|𝑈 ) = 0. Thus 𝑝 is a polynomial multiple of the minimal polynomial
of 𝑇|𝑈 (by 5.31). Hence the minimal polynomial of 𝑇|𝑈 has the form required by
5.61, which shows that 𝑇|𝑈 is diagonalizable.

Exercises 5D

1 Suppose 𝑇 ∈ ℒ(𝑉) has a diagonal matrix 𝐴 with respect to some basis of 𝑉 and that 𝜆 ∈ 𝐅 . Prove that 𝜆 appears on the diagonal of 𝐴 precisely dim 𝐸(𝜆, 𝑇) times.
2 Suppose 𝑇 ∈ ℒ(𝑉).
(a) Prove that if 𝑇 is diagonalizable, then 𝑉 = null 𝑇 ⊕ range 𝑇.
(b) Prove the converse of the statement in part (a) or give a counterexample
to the converse.

3 Suppose 𝑇 ∈ ℒ(𝑉). Prove that the following are equivalent.


(a) 𝑉 = null 𝑇 ⊕ range 𝑇
(b) 𝑉 = null 𝑇 + range 𝑇
(c) null 𝑇 ∩ range 𝑇 = {0}

4 Suppose 𝑉 is a complex vector space and 𝑇 ∈ ℒ(𝑉). Prove that 𝑇 is diagonalizable if and only if
𝑉 = null(𝑇 − 𝜆𝐼) ⊕ range(𝑇 − 𝜆𝐼)
for every 𝜆 ∈ 𝐂 .

5 Suppose 𝑇 ∈ ℒ(𝐅5 ) and dim 𝐸(8, 𝑇) = 4. Prove that 𝑇 − 2𝐼 or 𝑇 − 6𝐼 is invertible.
6 Suppose 𝑇 ∈ ℒ(𝑊) is invertible. Prove that
𝐸(𝜆, 𝑇) = 𝐸(1/𝜆, 𝑇 −1 )
for every 𝜆 ∈ 𝐅 with 𝜆 ≠ 0.
7 Suppose 𝑇 ∈ ℒ(𝑉). Let 𝜆1 , …, 𝜆𝑚 denote the distinct nonzero eigenvalues
of 𝑇. Prove that
dim 𝐸(𝜆1 , 𝑇) + ⋯ + dim 𝐸(𝜆𝑚 , 𝑇) ≤ dim range 𝑇.
8 Suppose 𝑅, 𝑇 ∈ ℒ(𝐅3 ) each have 2, 6, 7 as eigenvalues. Prove that there
exists an invertible operator 𝑆 ∈ ℒ(𝐅3 ) such that 𝑅 = 𝑆−1 𝑇𝑆.
9 Find 𝑅, 𝑇 ∈ ℒ(𝐅4 ) such that 𝑅 and 𝑇 each have 2, 6, 7 as eigenvalues, 𝑅 and
𝑇 have no other eigenvalues, and there does not exist an invertible operator
𝑆 ∈ ℒ(𝐅4 ) such that 𝑅 = 𝑆−1 𝑇𝑆.
10 Find 𝑇 ∈ ℒ(𝐂3 ) such that 6 and 7 are eigenvalues of 𝑇 and such that 𝑇 does
not have a diagonal matrix with respect to any basis of 𝐂3.
11 Suppose 𝑇 ∈ ℒ(𝐂3 ) is such that 6 and 7 are eigenvalues of 𝑇. Furthermore,
suppose 𝑇 does not have a diagonal matrix with respect to any basis of 𝐂3.
Prove that there exists (𝑧1 , 𝑧2 , 𝑧3 ) ∈ 𝐂3 such that
𝑇(𝑧1 , 𝑧2 , 𝑧3 ) = (6 + 8𝑧1 , 7 + 8𝑧2 , 13 + 8𝑧3 ).
12 Suppose 𝐴 is a diagonal matrix with distinct entries on the diagonal and 𝐵 is a
matrix with the same size as 𝐴. Show that 𝐴 and 𝐵 commute if and only if
𝐵 is a diagonal matrix.
13 (a) Give an example of a finite-dimensional complex vector space and an
operator 𝑇 on that vector space such that 𝑇 2 is diagonalizable but 𝑇 is
not diagonalizable.
(b) Suppose 𝐅 = 𝐂 , 𝑘 is a positive integer, and 𝑇 ∈ ℒ(𝑉) is invertible.
Prove that 𝑇 is diagonalizable if and only if 𝑇 𝑘 is diagonalizable.
14 Suppose 𝐅 = 𝐂 , 𝑇 ∈ ℒ(𝑉), and 𝑝 is the minimal polynomial of 𝑇. Prove
that the following are equivalent.
(a) 𝑇 is diagonalizable
(b) there does not exist 𝜆 ∈ 𝐂 such that 𝑝 is a polynomial multiple of (𝑧−𝜆)2
(c) 𝑝 and its derivative 𝑝′ have no zeros in common
(d) the greatest common divisor of 𝑝 and 𝑝′ is the constant polynomial 1
The greatest common divisor of 𝑝 and 𝑝′ is the monic polynomial 𝑞 of
largest degree such that 𝑝 and 𝑝′ are both polynomial multiples of 𝑞. The
Euclidean algorithm for polynomials (look it up) can quickly determine
the greatest common divisor of two polynomials, without requiring any
information about the zeros of the polynomials. Thus the equivalence of
parts (a) and (d) above shows that we can determine whether or not 𝑇 is
diagonalizable without knowing anything about the zeros of 𝑝.
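As a sketch of the test in part (d), here is SymPy's polynomial gcd (which uses the Euclidean algorithm mentioned above) applied to the minimal polynomials of Examples 5.59 and 5.60:

    from sympy import symbols, gcd, diff

    z = symbols('z')
    p1 = z**5 - 6*z + 3          # minimal polynomial from Example 5.59
    p2 = (z - 6)**2 * (z - 7)    # minimal polynomial from Example 5.60
    print(gcd(p1, diff(p1, z)))  # 1, so the operator in 5.59 is diagonalizable
    print(gcd(p2, diff(p2, z)))  # z - 6, so the operator in 5.60 is not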

15 Suppose that 𝑇 ∈ ℒ(𝑉) is diagonalizable. Let 𝜆1 , …, 𝜆𝑚 denote the distinct eigenvalues of 𝑇. Prove that a subspace 𝑈 of 𝑉 is invariant under 𝑇 if and only if there exist subspaces 𝑈1 , …, 𝑈𝑚 of 𝑉 such that 𝑈𝑘 ⊂ 𝐸(𝜆𝑘 , 𝑇) for each 𝑘 and 𝑈 = 𝑈1 ⊕ ⋯ ⊕ 𝑈𝑚 .
16 The Fibonacci sequence 𝐹0 , 𝐹1 , 𝐹2 , … is defined by

𝐹0 = 0, 𝐹1 = 1, and 𝐹𝑛 = 𝐹𝑛 − 2 + 𝐹𝑛 − 1 for 𝑛 ≥ 2.

Define 𝑇 ∈ ℒ(𝐑2 ) by 𝑇(𝑥, 𝑦) = (𝑦, 𝑥 + 𝑦).


(a) Show that 𝑇 𝑛 (0, 1) = (𝐹𝑛 , 𝐹𝑛 + 1 ) for each nonnegative integer 𝑛.
(b) Find the eigenvalues of 𝑇.
(c) Find a basis of 𝐑2 consisting of eigenvectors of 𝑇.
(d) Use the solution to part (c) to compute 𝑇 𝑛 (0, 1). Conclude that
𝐹𝑛 = (1/√5)[((1 + √5)/2)𝑛 − ((1 − √5)/2)𝑛]

for each nonnegative integer 𝑛.


(e) Use part (d) to conclude that for each nonnegative integer 𝑛, the Fibonacci number 𝐹𝑛 is the integer that is closest to
(1/√5)((1 + √5)/2)𝑛.

Each 𝐹𝑛 is a nonnegative integer, even though the right side of the formula
in part (d) does not look like an integer.
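Here is a plain Python sketch comparing the closed formula in part (d) with the recurrence for small 𝑛 (for these 𝑛 the floating-point error is far smaller than 1/2):

    import math

    def fib(n):
        # Fibonacci number via the recurrence F_n = F_{n-2} + F_{n-1}.
        a, b = 0, 1
        for _ in range(n):
            a, b = b, a + b
        return a

    sqrt5 = math.sqrt(5)
    for n in range(30):
        closed = (((1 + sqrt5) / 2)**n - ((1 - sqrt5) / 2)**n) / sqrt5
        assert round(closed) == fib(n)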

5E Commuting Operators

5.65 definition: commute

• Two operators 𝑆 and 𝑇 on the same vector space commute if 𝑆𝑇 = 𝑇𝑆.


• Two square matrices 𝐴 and 𝐵 of the same size commute if 𝐴𝐵 = 𝐵𝐴.

For example, if 𝑇 is an operator and 𝑝, 𝑞 ∈ 𝒫(𝐅), then 𝑝(𝑇) and 𝑞(𝑇) commute
[see 5.19(b)].
As another example, if 𝐼 is the identity operator on 𝑊, then 𝐼 commutes with
every operator on 𝑊.

5.66 example: partial differentiation operators commute


Suppose 𝑚 is a nonnegative integer. Let 𝒫𝑚2 denote the real vector space of
polynomials (with real coefficients) of two real variables with degree at most 𝑚,
with the usual operations of addition and scalar multiplication of real-valued
functions. Thus the elements of 𝒫𝑚2 are functions 𝑝 on 𝐑2 of the form

5.67 𝑝 = ∑ 𝑎𝑗, 𝑘 𝑥𝑗 𝑦𝑘 ,

where the indices 𝑗 and 𝑘 are allowed to take on only nonnegative integer values
such that 𝑗 + 𝑘 ≤ 𝑚, each 𝑎𝑗, 𝑘 ∈ 𝐑 , and 𝑥𝑗 𝑦𝑘 denotes the function on 𝐑2 defined
by (𝑥, 𝑦) ↦ 𝑥𝑗 𝑦𝑘.
Define operators 𝐷𝑥 , 𝐷𝑦 ∈ ℒ(𝒫𝑚2 ) by
𝐷𝑥 𝑝 = 𝜕𝑝/𝜕𝑥 = ∑ 𝑗𝑎𝑗, 𝑘 𝑥𝑗 − 1 𝑦𝑘 and 𝐷𝑦 𝑝 = 𝜕𝑝/𝜕𝑦 = ∑ 𝑘𝑎𝑗, 𝑘 𝑥𝑗 𝑦𝑘 − 1 ,
where 𝑝 is as in 5.67 and both sums are over all indices with 𝑗 + 𝑘 ≤ 𝑚. The operators 𝐷𝑥 and 𝐷𝑦 are called partial differentiation operators because each of these operators differentiates with respect to one of the variables while pretending that the other variable is constant.
Then 𝐷𝑥 and 𝐷𝑦 commute because if 𝑝 is as in 5.67, then

(𝐷𝑥 𝐷𝑦 )𝑝 = ∑ 𝑗𝑘𝑎𝑗, 𝑘 𝑥𝑗 − 1 𝑦𝑘 − 1 = (𝐷𝑦 𝐷𝑥 )𝑝, where the sum is over all 𝑗 + 𝑘 ≤ 𝑚.

The equation 𝐷𝑥 𝐷𝑦 = 𝐷𝑦 𝐷𝑥 on 𝒫𝑚2 illustrates a more general result: the order of partial differentiation does not matter for nice functions.
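A sketch (assuming SymPy) of this commuting computation for one concrete 𝑝:

    from sympy import symbols, diff

    x, y = symbols('x y')
    p = 3*x**2*y + x*y**2 - 5*x + 7   # a polynomial of the form 5.67, with m = 3
    # Differentiating first in y then in x gives the same result as the
    # other order, so D_x and D_y commute on this p.
    assert diff(diff(p, y), x) == diff(diff(p, x), y)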

Commuting matrices are unusual. For example, there are 214,358,881 pairs of 2-by-2 matrices all of whose entries are integers in the interval [−5, 5]. About 0.3% of these pairs of matrices commute.
All 214,358,881 (which equals 11⁸) pairs of the 2-by-2 matrices under consideration were checked by a computer to discover that only 674,609 of these pairs of matrices commute.
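Checking all 11⁸ pairs takes a while; a random-sampling sketch in plain Python gives a quick estimate of the same fraction:

    import random

    def rand_matrix():
        # 2-by-2 matrix with integer entries in [-5, 5]
        return [[random.randint(-5, 5) for _ in range(2)] for _ in range(2)]

    def mul(A, B):
        return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
                for i in range(2)]

    trials = 200_000
    hits = 0
    for _ in range(trials):
        A, B = rand_matrix(), rand_matrix()
        if mul(A, B) == mul(B, A):
            hits += 1
    print(hits / trials)   # roughly 0.003, matching the 0.3% above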

The next result shows that two operators commute if and only if their matrices
(with respect to the same basis) commute.

5.68 commuting operators correspond to commuting matrices

Suppose 𝑆, 𝑇 ∈ ℒ(𝑉) and 𝑣1 , …, 𝑣𝑛 is a basis of 𝑉. Then 𝑆 and 𝑇 commute if and only if ℳ(𝑆, (𝑣1 , …, 𝑣𝑛 )) and ℳ(𝑇, (𝑣1 , …, 𝑣𝑛 )) commute.

Proof We have
𝑆 and 𝑇 commute ⟺ 𝑆𝑇 = 𝑇𝑆
⟺ ℳ(𝑆𝑇) = ℳ(𝑇𝑆)
⟺ ℳ(𝑆)ℳ(𝑇) = ℳ(𝑇)ℳ(𝑆)
⟺ ℳ(𝑆) and ℳ(𝑇) commute,
as desired.

The next result shows that if two operators commute, then every eigenspace
for one operator is invariant under the other operator. This result, which we will
use several times, is one of the main reasons why a pair of commuting operators
behaves better than a pair of operators that does not commute.

5.69 eigenspace is invariant under commuting operator

Suppose 𝑆, 𝑇 ∈ ℒ(𝑊) commute and 𝜆 ∈ 𝐅 . Then 𝐸(𝜆, 𝑆) is invariant under 𝑇.

Proof Suppose 𝑣 ∈ 𝐸(𝜆, 𝑆). Then
𝑆(𝑇𝑣) = (𝑆𝑇)𝑣 = (𝑇𝑆)𝑣 = 𝑇(𝑆𝑣) = 𝑇(𝜆𝑣) = 𝜆𝑇𝑣.
The equation above shows that 𝑇𝑣 ∈ 𝐸(𝜆, 𝑆). Thus 𝐸(𝜆, 𝑆) is invariant under 𝑇.

Suppose we have two operators, each of which is diagonalizable. If we want to do computations involving both operators (for example, involving their sum),
then we want the two operators to be diagonalizable by the same basis, which
according to the next result is possible when the two operators commute.

5.70 simultaneous diagonalizability ⟺ commutativity

Two diagonalizable operators on the same vector space have diagonal matrices
with respect to the same basis if and only if the two operators commute.

Proof First suppose 𝑆, 𝑇 ∈ ℒ(𝑉) have diagonal matrices with respect to the
same basis. The product of two diagonal matrices of the same size is the diagonal
matrix obtained by multiplying the corresponding elements of the two diagonals.
Thus any two diagonal matrices of the same size commute. Thus 𝑆 and 𝑇 commute,
by 5.68.

To prove the implication in the other direction, now suppose that 𝑆, 𝑇 ∈ ℒ(𝑉) are diagonalizable operators that commute. Let 𝜆1 , …, 𝜆𝑚 denote the distinct
eigenvalues of 𝑆. Because 𝑆 is diagonalizable, part (c) of 5.54 shows that
5.71 𝑉 = 𝐸(𝜆1 , 𝑆) ⊕ ⋯ ⊕ 𝐸(𝜆𝑚 , 𝑆).
For each 𝑘 = 1, …, 𝑚, the subspace 𝐸(𝜆𝑘 , 𝑆) is invariant under 𝑇 (by 5.69).
Because 𝑇 is diagonalizable, 5.64 implies that 𝑇|𝐸( 𝜆𝑘, 𝑆) is diagonalizable for
each 𝑘. Hence for each 𝑘 = 1, …, 𝑚, there is a basis of 𝐸(𝜆𝑘 , 𝑆) consisting of
eigenvectors of 𝑇. Putting these bases together gives a basis of 𝑉 (because of
5.71), with each vector in this basis being an eigenvector of both 𝑆 and 𝑇. Thus 𝑆
and 𝑇 both have diagonal matrices with respect to this basis, as desired.

See Exercise 2 for an extension of the result above to more than two operators.
Suppose 𝑉 is a finite-dimensional, nonzero, complex vector space. Then every
operator on 𝑉 has an eigenvector (see 5.21). The next result shows that if two
operators on 𝑉 commute, then there is a vector in 𝑉 that is an eigenvector for both
operators (but the two commuting operators might not have a common eigenvalue).
For an extension of the next result to more than two operators, see Exercise 8(a).

5.72 common eigenvector for commuting operators

Every pair of commuting operators on a finite-dimensional, nonzero, complex vector space has a common eigenvector.

Proof Suppose 𝑉 is a finite-dimensional, nonzero, complex vector space and 𝑆, 𝑇 ∈ ℒ(𝑉) commute. Let 𝜆 be an eigenvalue of 𝑆 (5.21 tells us that 𝑆 does
indeed have an eigenvalue). Thus 𝐸(𝜆, 𝑆) ≠ {0}. Also, 𝐸(𝜆, 𝑆) is invariant
under 𝑇 (by 5.69).
Thus 𝑇|𝐸(𝜆, 𝑆) has an eigenvector (again using 5.21), which is an eigenvector for both 𝑆 and 𝑇, completing the proof.

5.73 example: common eigenvector for partial differentiation operators


Let 𝒫𝑚2 be as in Example 5.66 and let 𝐷𝑥 , 𝐷𝑦 ∈ ℒ(𝒫𝑚2 ) be the commuting
partial differentiation operators in that example. As you can verify, 0 is the only
eigenvalue of each of these operators. Also
𝐸(0, 𝐷𝑥 ) = {𝑎0 + 𝑎1 𝑦 + ⋯ + 𝑎𝑚 𝑦𝑚 ∶ 𝑎0 , …, 𝑎𝑚 ∈ 𝐑},
𝐸(0, 𝐷𝑦 ) = {𝑐0 + 𝑐1 𝑥 + ⋯ + 𝑐𝑚 𝑥𝑚 ∶ 𝑐0 , …, 𝑐𝑚 ∈ 𝐑}.

The intersection of these two eigenspaces is the set of common eigenvectors of the two operators. Because 𝐸(0, 𝐷𝑥 ) ∩ 𝐸(0, 𝐷𝑦 ) is the set of constant functions, we see that 𝐷𝑥 and 𝐷𝑦 indeed have a common eigenvector, as promised by 5.72.

The next result extends 5.46(a) [the existence of a basis that gives an upper-
triangular matrix] to two commuting operators.

5.74 commuting operators are simultaneously upper triangularizable

Suppose 𝑉 is a complex vector space and 𝑆, 𝑇 are commuting operators on 𝑉. Then there is a basis of 𝑉 with respect to which both 𝑆 and 𝑇 have upper-triangular matrices.

Proof Let 𝑛 = dim 𝑉. We will use induction on 𝑛. The desired result holds if
𝑛 = 1 because all 1-by-1 matrices are upper triangular. Now suppose 𝑛 > 1 and
the desired result holds for all complex vector spaces whose dimension is 𝑛 − 1.
Let 𝑣1 be any common eigenvector of 𝑆 and 𝑇 (using 5.72). Hence 𝑆𝑣1 ∈
span(𝑣1 ) and 𝑇𝑣1 ∈ span(𝑣1 ). Let 𝑊 be a subspace of 𝑉 such that
𝑉 = span(𝑣1 ) ⊕ 𝑊;
see 2.33 for the existence of 𝑊. Define a linear map 𝑃 ∶ 𝑉 → 𝑊 by
𝑃(𝑎𝑣1 + 𝑤) = 𝑤
for 𝑎 ∈ 𝐂 and 𝑤 ∈ 𝑊. Define 𝑆̂, 𝑇̂ ∈ ℒ(𝑊) by
𝑆̂𝑤 = 𝑃(𝑆𝑤) and 𝑇̂𝑤 = 𝑃(𝑇𝑤)

for 𝑤 ∈ 𝑊. To apply our induction hypothesis to 𝑆̂ and 𝑇̂, we must first show that these two operators on 𝑊 commute. To do this, suppose 𝑤 ∈ 𝑊. Then there exists 𝑎 ∈ 𝐂 such that
(𝑆̂𝑇̂)𝑤 = 𝑆̂(𝑃(𝑇𝑤)) = 𝑆̂(𝑇𝑤 − 𝑎𝑣1 ) = 𝑃(𝑆(𝑇𝑤 − 𝑎𝑣1 )) = 𝑃((𝑆𝑇)𝑤),

where the last equality holds because 𝑣1 is an eigenvector of 𝑆 and 𝑃𝑣1 = 0.


Similarly,
(𝑇̂𝑆̂)𝑤 = 𝑃((𝑇𝑆)𝑤).
Because the operators 𝑆 and 𝑇 commute, the last two displayed equations show
that (𝑆̂𝑇̂)𝑤 = (𝑇̂𝑆̂)𝑤. Hence 𝑆̂ and 𝑇̂ commute.
Thus we can use our induction hypothesis to state that there exists a basis
𝑣2 , …, 𝑣𝑛 of 𝑊 such that 𝑆̂ and 𝑇̂ both have upper-triangular matrices with respect
to this basis. The list 𝑣1 , …, 𝑣𝑛 is a basis of 𝑉.
If 𝑘 ∈ {2, …, 𝑛}, then there exist 𝑎𝑘 , 𝑏𝑘 ∈ 𝐂 such that
𝑆𝑣𝑘 = 𝑎𝑘 𝑣1 + 𝑆̂𝑣𝑘 and 𝑇𝑣𝑘 = 𝑏𝑘 𝑣1 + 𝑇̂𝑣𝑘 .
Because 𝑆̂ and 𝑇̂ have upper-triangular matrices with respect to 𝑣2 , …, 𝑣𝑛 , we know that 𝑆̂𝑣𝑘 ∈ span(𝑣2 , …, 𝑣𝑘 ) and 𝑇̂𝑣𝑘 ∈ span(𝑣2 , …, 𝑣𝑘 ). Hence the equations above imply that
𝑆𝑣𝑘 ∈ span(𝑣1 , …, 𝑣𝑘 ) and 𝑇𝑣𝑘 ∈ span(𝑣1 , …, 𝑣𝑘 ).
Thus 𝑆 and 𝑇 have upper-triangular matrices with respect to 𝑣1 , …, 𝑣𝑛 , as desired.

Exercise 8(b) extends the result above to more than two operators.

In general, it is not possible to determine the eigenvalues of the sum or product of two operators from the eigenvalues of the two operators. However, the next
result shows that something nice happens when the two operators commute.

5.75 eigenvalues of sum and product of commuting operators

Suppose 𝑉 is a complex vector space and 𝑆, 𝑇 are commuting operators on 𝑉. Then
• every eigenvalue of 𝑆 + 𝑇 is an eigenvalue of 𝑆 plus an eigenvalue of 𝑇,
• every eigenvalue of 𝑆𝑇 is an eigenvalue of 𝑆 times an eigenvalue of 𝑇.

Proof There is a basis of 𝑉 with respect to which both 𝑆 and 𝑇 have upper-
triangular matrices (by 5.74). With respect to that basis,
ℳ(𝑆 + 𝑇) = ℳ(𝑆) + ℳ(𝑇) and ℳ(𝑆𝑇) = ℳ(𝑆)ℳ(𝑇),
as stated in 3.33 and 3.40.
The definition of matrix addition shows that each entry on the diagonal of
ℳ(𝑆 + 𝑇) equals the sum of the corresponding entries on the diagonals of ℳ(𝑆)
and ℳ(𝑇). Similarly, because ℳ(𝑆) and ℳ(𝑇) are upper-triangular matrices,
the definition of matrix multiplication shows that each entry on the diagonal of
ℳ(𝑆𝑇) equals the product of the corresponding entries on the diagonals of ℳ(𝑆)
and ℳ(𝑇). Furthermore, ℳ(𝑆 + 𝑇) and ℳ(𝑆𝑇) are upper-triangular matrices (see Exercise 2 in Section 5C).
Every entry on the diagonal of ℳ(𝑆) is an eigenvalue of 𝑆, and every entry
on the diagonal of ℳ(𝑇) is an eigenvalue of 𝑇 (by 5.40). Every eigenvalue
of 𝑆 + 𝑇 is on the diagonal of ℳ(𝑆 + 𝑇), and every eigenvalue of 𝑆𝑇 is on the diagonal of ℳ(𝑆𝑇) (these assertions follow from 5.40). Putting all this
together, we conclude that every eigenvalue of 𝑆 + 𝑇 is an eigenvalue of 𝑆 plus
an eigenvalue of 𝑇, and every eigenvalue of 𝑆𝑇 is an eigenvalue of 𝑆 times an
eigenvalue of 𝑇.
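A sketch illustrating 5.75 (assuming NumPy): the matrices 𝑆 = 𝐴2 and 𝑇 = 𝐴 + 𝐼 below are polynomials in the same matrix 𝐴, hence commute, and the eigenvalues pair up as the result describes:

    import numpy as np

    A = np.array([[1.0, 2.0],
                  [3.0, 4.0]])
    S = A @ A               # S = A^2 and T = A + I are polynomials in A,
    T = A + np.eye(2)       # so S and T commute
    assert np.allclose(S @ T, T @ S)

    print(np.linalg.eigvals(S))       # mu^2 for each eigenvalue mu of A
    print(np.linalg.eigvals(T))       # mu + 1 for each eigenvalue mu of A
    print(np.linalg.eigvals(S + T))   # mu^2 + mu + 1: each eigenvalue of S + T
                                      # is an eigenvalue of S plus one of T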

Exercises 5E

1 Give an example of two commuting operators 𝑆, 𝑇 on 𝐅4 such that there is a subspace of 𝐅4 that is invariant under 𝑆 but not under 𝑇 and there is a subspace of 𝐅4 that is invariant under 𝑇 but not under 𝑆.
2 Suppose ℰ is a nonempty subset of ℒ(𝑉) and every element of ℰ is diagonal-
izable. Prove there exists a basis of 𝑉 with respect to which every element of
ℰ has a diagonal matrix if and only if every pair of elements of ℰ commutes.
This exercise extends 5.70, which considers the case where ℰ contains only
two elements. For this exercise, ℰ may contain any number of elements, and
ℰ may even be an infinite set.

3 Prove or give a counterexample: If 𝐴 is a diagonal matrix and 𝐵 is an upper-triangular matrix of the same size as 𝐴, then 𝐴 and 𝐵 commute.
4 Prove that a pair of operators on a finite-dimensional vector space commute
if and only if their dual operators commute.
See 3.96 for the definition of the dual of an operator.

5 Suppose 𝑉 is a complex vector space and 𝑆, 𝑇 ∈ ℒ(𝑉) commute. Prove that there exist 𝛼, 𝜆 ∈ 𝐂 such that

range(𝑆 − 𝛼𝐼) + range(𝑇 − 𝜆𝐼) ≠ 𝑉.

6 Suppose 𝑉 is a complex vector space, 𝑆 ∈ ℒ(𝑉) is diagonalizable, and 𝑇 ∈ ℒ(𝑉) commutes with 𝑆. Prove that there is a basis of 𝑉 such that 𝑆 has
a diagonal matrix with respect to this basis and 𝑇 has an upper-triangular
matrix with respect to this basis.
7 Suppose 𝑚 = 3 in Example 5.66 and 𝐷𝑥 , 𝐷𝑦 are the commuting partial differentiation operators on 𝒫32 from that example. Find a basis of 𝒫32 with respect to which both 𝐷𝑥 and 𝐷𝑦 have an upper-triangular matrix.
The existence of a basis of 𝒫32 that simultaneously gives upper-triangular matrices to the commuting operators 𝐷𝑥 and 𝐷𝑦 is guaranteed by 5.74.

8 Suppose 𝑉 is a nonzero complex vector space and ℰ ⊂ ℒ(𝑉) is such that 𝑆


and 𝑇 commute for all 𝑆, 𝑇 ∈ ℰ.
(a) Prove that there is a vector in 𝑉 that is an eigenvector for every element
of ℰ.
(b) Prove that there is a basis of 𝑉 with respect to which every element of
ℰ has an upper-triangular matrix.
This exercise extends 5.72 and 5.74, which consider the case where ℰ
contains only two elements. For this exercise, ℰ may contain any number of
elements, and ℰ may even be an infinite set.

9 Give an example of two commuting operators 𝑆, 𝑇 on a finite-dimensional real vector space such that 𝑆 + 𝑇 has an eigenvalue that does not equal an eigenvalue of 𝑆 plus an eigenvalue of 𝑇 and 𝑆𝑇 has an eigenvalue that does not equal an eigenvalue of 𝑆 times an eigenvalue of 𝑇.
This exercise shows that 5.75 does not hold on real vector spaces.
