Physics 221B

Spring 2012
Notes 44
Introduction to the Dirac Equation
1. Introduction
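In 1927 Pauli published a version of the Schrödinger equation for spin-$\tfrac{1}{2}$ particles that incorporates the interaction of the spin with the magnetic field. This equation is

$$ i\hbar\frac{\partial\psi}{\partial t} = \left[\frac{1}{2m}\left(\mathbf{p} - \frac{q}{c}\mathbf{A}\right)^2 - \boldsymbol{\mu}\cdot\mathbf{B} + q\Phi\right]\psi, \tag{1} $$

where the magnetic moment $\boldsymbol{\mu}$ is related to the spin $\mathbf{S}$ as described in Notes 14. See Prob. 17.1(b)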
and Eq. (17.44). Pauli understood that spin was in a sense a relativistic effect, and that his equation
amounted to a grafting of some relativistic corrections onto the nonrelativistic framework of the
Schrödinger theory. As such, he felt that his equation was neither deep nor elegant, and he hesitated
to publish it, hoping to find a proper and complete relativistic theory of the electron. In that effort,
however, he was anticipated by Dirac, who in 1928 published his work on his wave equation, which
we take up in these notes.
The Dirac equation is rightly regarded as one of the great monuments of modern physics. There is
considerable formalism involved in mastering it, but it is an essential part of relativistic quantum
mechanics, and we will take it one step at a time.
2. The Dirac Equation
Notes 43 describes the state of affairs when Dirac undertook his investigations of relativistic
wave equations. Recognizing that the negative probabilities of the Klein-Gordon equation were
related to the fact that the Klein-Gordon equation is second order in time, Dirac decided to find a
relativistic wave equation that was first order in time. Following Dirac, we write this wave equation as

$$ i\hbar\frac{\partial\psi}{\partial t} = H\psi, \tag{2} $$

that is, as a Schrödinger equation, but with some Hamiltonian $H$ that is to be determined. We
search for the correct relativistic H, taking at first the case of the free particle.
Dirac reasoned that since his equation was first order in the time derivative, and since relativity
is supposed to treat space and time on an equal footing, it should also be first order in the spatial
derivatives. That is, it should be linear in the derivatives $\partial/\partial x_i$, or, equivalently, in the momentum
operators $-i\hbar\,\partial/\partial x_i$. Therefore he guessed that $H$ would have the form

$$ H = c\,\boldsymbol{\alpha}\cdot\mathbf{p} + mc^2\beta, \tag{3} $$

where $\boldsymbol{\alpha}$ is a vector containing the coefficients multiplying the spatial derivatives and where $\mathbf{p} = -i\hbar\nabla$ is the usual momentum operator. A factor of $c$ has been split off so that these coefficients
will be dimensionless. Dirac also added a constant term involving a coefficient $\beta$, with $mc^2$ split off
so that $\beta$ would also be dimensionless. The coefficients $\boldsymbol{\alpha}$ and $\beta$ are to be determined. Thus, the
Dirac equation for a free particle can be written,

$$ i\hbar\frac{\partial\psi}{\partial t} = -i\hbar c\,\boldsymbol{\alpha}\cdot\nabla\psi + mc^2\beta\psi. \tag{4} $$
The coefficients $\alpha_k$ or $\boldsymbol{\alpha} = (\alpha_1, \alpha_2, \alpha_3)$ cannot be ordinary numbers, because if they were, $\boldsymbol{\alpha}$
would specify some privileged direction in space, which we don't expect for a free particle. Therefore
Dirac assumed that the wave function $\psi$ was some kind of multicomponent object or spinor,

$$ \psi = \begin{pmatrix} \psi_1 \\ \vdots \\ \psi_N \end{pmatrix}, \tag{5} $$

where the number of components $N$ is to be determined. Then $\alpha_k$, $k = 1, 2, 3$, and $\beta$ are $N \times N$
matrices. The matrices $\alpha_k$ form a vector of matrices $\boldsymbol{\alpha}$, much as the Pauli matrices $\boldsymbol{\sigma}$ do. You
must henceforth understand, when looking at the Dirac equation, that $\psi$ is a spinor and $\boldsymbol{\alpha}$ and $\beta$
are matrices, although usually we do not indicate this explicitly. Since the free-particle Hamiltonian
should be invariant under space and time displacements, the matrices $\alpha_k$ and $\beta$ must be constant,
that is, independent of $\mathbf{x}$ and $t$. Thus, they act on the spin degrees of freedom of the particle, much
as do the Pauli matrices in the nonrelativistic theory of a spin-$\tfrac{1}{2}$ particle, and commute with purely
spatial operators such as $\mathbf{x}$ and $\mathbf{p}$. Also, since the Dirac Hamiltonian should be Hermitian, the
matrices $\alpha_k$ and $\beta$ must also be Hermitian.
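To determine the matrices $\alpha_k$ and $\beta$, Dirac required that every solution of the Dirac equation
should also satisfy the Klein-Gordon equation. This is so that the classical energy-momentum
relation, $E^2 = m^2c^4 + c^2p^2$, would be satisfied. Applying $i\hbar\,\partial/\partial t$ to Eq. (4) we obtain,

$$ -\hbar^2\frac{\partial^2\psi}{\partial t^2} = -\hbar^2c^2\sum_{k\ell}\frac{1}{2}(\alpha_k\alpha_\ell + \alpha_\ell\alpha_k)\frac{\partial^2\psi}{\partial x_k\,\partial x_\ell} - i\hbar mc^3\sum_k(\alpha_k\beta + \beta\alpha_k)\frac{\partial\psi}{\partial x_k} + m^2c^4\beta^2\psi. \tag{6} $$
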
In order that this agree with the Klein-Gordon equation (43.7), we must have

$$ \frac{1}{2}(\alpha_k\alpha_\ell + \alpha_\ell\alpha_k) = \delta_{k\ell}, \qquad \alpha_k\beta + \beta\alpha_k = 0, \qquad \beta^2 = 1, \tag{7} $$

where we symmetrize the first equation in $k$ and $\ell$ since the derivatives $\partial^2/\partial x_k\,\partial x_\ell$ are symmetric.
The right-hand sides of Eqs. (7) are understood to be multiplied by the $N \times N$ identity matrix,
which we write simply as 1. Notice that if we set $k = \ell$ in the first of these equations we get $\alpha_k^2 = 1$,
$k = 1, 2, 3$, so all four Dirac matrices $\alpha_k$, $\beta$ square to unity.

The conditions (7) are conveniently expressed in terms of anticommutators. That is, we rewrite
them as

$$ \{\alpha_k, \alpha_\ell\} = 2\delta_{k\ell}, \qquad \{\alpha_k, \beta\} = 0, \qquad \{\beta, \beta\} = 2. \tag{8} $$

Equations (7) or (8) constitute the Dirac algebra, that is, the set of algebraic relations which the Dirac
matrices $\alpha_k$ and $\beta$ must satisfy. They amount to an encoding of the relativistic energy-momentum
relations in terms of matrices. The Dirac algebra is a special case of a kind of anticommuting algebra,
encoding the components of a metric tensor, that in mathematics is known as a Clifford algebra.
3. Finding the Dirac Matrices
To find Hermitian matrices $\alpha_k$, $\beta$ that satisfy the Dirac algebra we note first of all that $\alpha_k^2 = 1$
and $\beta^2 = 1$, so the eigenvalues of all four matrices can only be $\pm 1$. Next, multiplying the middle
equation in Eq. (7) from the left by $\beta$ and taking traces we find

$$ \operatorname{tr}(\beta\alpha_k\beta) + \operatorname{tr}(\alpha_k) = 0. \tag{9} $$

But by cycling the order of matrices, the first term can be written

$$ \operatorname{tr}(\beta\alpha_k\beta) = \operatorname{tr}(\beta^2\alpha_k) = \operatorname{tr}(\alpha_k), \tag{10} $$

which, when used in Eq. (9), gives $\operatorname{tr}\alpha_k = 0$. In a similar manner we show that $\operatorname{tr}\beta = 0$. The Dirac
matrices are Hermitian and traceless. Since the trace is the sum of the eigenvalues, which must be
$\pm 1$, the number of positive and negative eigenvalues must be equal. This means that the dimension
$N$ of the Dirac matrices must be even.
The simplest possibility is $N = 2$. All $2 \times 2$ Hermitian matrices can be expanded as a linear
combination of the Pauli matrices $\boldsymbol{\sigma}$ and the identity with real coefficients, which can be used to
search for the expansion coefficients which will cause the Dirac relations (8) to be satisfied. It turns
out after an analysis that there is no solution to the Dirac algebra in terms of $2 \times 2$ matrices.
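
The analysis can be sketched briefly. What follows is one standard argument, offered as an illustration rather than as the route these notes necessarily have in mind. Since all four matrices must be Hermitian and traceless, a $2 \times 2$ candidate has no identity component and can be written as $\mathbf{a}\cdot\boldsymbol{\sigma}$ with a real 3-vector $\mathbf{a}$ (the vectors $\mathbf{a}$, $\mathbf{b}$ are notation introduced here). Using $\{\sigma_i, \sigma_j\} = 2\delta_{ij}$:

```latex
\{\mathbf{a}\cdot\boldsymbol{\sigma},\ \mathbf{b}\cdot\boldsymbol{\sigma}\}
  = \sum_{ij} a_i b_j \{\sigma_i, \sigma_j\}
  = 2\,\mathbf{a}\cdot\mathbf{b}.
```

The Dirac algebra would therefore require four real 3-vectors, one for each of the three $\alpha_k$ and one for $\beta$, that are all of unit length and mutually orthogonal. Three-dimensional space holds at most three such vectors, so no $2 \times 2$ solution exists.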

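The next simplest case is $N = 4$. Here Dirac was able to find an explicit solution for matrices
$\alpha_k$ and $\beta$ that satisfies his algebra. The solution can be written,

$$ \boldsymbol{\alpha} = \begin{pmatrix} 0 & \boldsymbol{\sigma} \\ \boldsymbol{\sigma} & 0 \end{pmatrix}, \qquad \beta = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \qquad \text{(Dirac-Pauli)} \tag{11} $$

where the $4 \times 4$ matrices are partitioned into four $2 \times 2$ matrices, which are represented in terms
of Pauli matrices and the $2 \times 2$ identity matrix (indicated simply by 1). As indicated, Eq. (11)
constitutes the Dirac-Pauli representation of the Dirac algebra.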
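
As a sanity check, the anticommutation relations can be tested numerically. The sketch below assumes the standard block form of the Dirac-Pauli matrices, $\alpha_k$ with $\sigma_k$ in the off-diagonal blocks and $\beta = \operatorname{diag}(1, 1, -1, -1)$, and uses plain Python lists so that the arithmetic is exact. The helper names (`block`, `anticomm`, `satisfies_dirac_algebra`) are illustrative choices, not notation from these notes.

```python
# Pauli matrices and 2x2 building blocks, as plain nested lists.
I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
SIGMA = [
    [[0, 1], [1, 0]],      # sigma_1
    [[0, -1j], [1j, 0]],   # sigma_2
    [[1, 0], [0, -1]],     # sigma_3
]

def scale(s, m):
    """Multiply every entry of matrix m by the scalar s."""
    return [[s * x for x in row] for row in m]

def block(a, b, c, d):
    """Assemble a 4x4 matrix from 2x2 blocks laid out as [[a, b], [c, d]]."""
    return [ra + rb for ra, rb in zip(a, b)] + [rc + rd for rc, rd in zip(c, d)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def anticomm(a, b):
    """Return the anticommutator {a, b} = ab + ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] + ba[i][j] for j in range(n)] for i in range(n)]

IDENTITY4 = block(I2, Z2, Z2, I2)
ALPHA = [block(Z2, s, s, Z2) for s in SIGMA]   # assumed Dirac-Pauli alpha_k
BETA = block(I2, Z2, Z2, scale(-1, I2))        # assumed Dirac-Pauli beta

def satisfies_dirac_algebra(alpha, beta):
    """Check {a_k, a_l} = 2 delta_kl, {a_k, beta} = 0, {beta, beta} = 2."""
    for k in range(3):
        for l in range(3):
            expected = scale(2 if k == l else 0, IDENTITY4)
            if anticomm(alpha[k], alpha[l]) != expected:
                return False
        if anticomm(alpha[k], beta) != scale(0, IDENTITY4):
            return False
    return anticomm(beta, beta) == scale(2, IDENTITY4)

print(satisfies_dirac_algebra(ALPHA, BETA))
```

Replacing `BETA` by the identity makes the check fail, since the identity commutes rather than anticommutes with the $\alpha_k$.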
4. Equivalent and Inequivalent Representations
Whenever we have an abstract set of algebraic relations that a set of matrices or linear operators
must satisfy, such as the Dirac algebra (8), and we find an explicit set of matrices or operators that
satisfy those relations, then we say that we have found a representation of those relations. Thus,
Dirac found a representation of his algebra by means of $4 \times 4$ matrices. However, a representation,
once found, is not unique, because we can conjugate all the matrices by a fixed unitary matrix $U$,
without changing the algebraic relations. That is, if we make the replacements,

$$ \boldsymbol{\alpha} \to U\boldsymbol{\alpha}U^\dagger, \qquad \beta \to U\beta U^\dagger, \tag{12} $$

then the new matrices will still satisfy the Dirac algebra. We require $U$ to be unitary so that the
Hermiticity of the matrices will be preserved. The transformation (12) can be regarded as a change
of basis in spin space.
Any two representations (speaking of matrix representations) of the same dimensionality that
differ merely by a change of basis are called equivalent. The implication is that equivalent representations are trivially related. But if two matrix representations of the same dimensionality are not
related by a change of basis, then they are said to be inequivalent.
In the case of the Dirac algebra, it can be shown that all $4 \times 4$ representations are equivalent to
the representation (11) found by Dirac. Thus, we can say that to within a change of basis in spin
space, Dirac's representation (11), the Dirac-Pauli representation, is unique. It is the representation
of the Dirac algebra of smallest dimensionality, and thus provides the simplest solution to the problem
of finding matrices that satisfy the Dirac algebra.
We remark that Dirac's representation of his algebra is also irreducible. This means that there
is no change of basis that causes all the matrices to take on a block-diagonal structure (the same
block-diagonal structure for all matrices). If there were such a change of basis, then the blocks would
constitute a representation of the Dirac algebra by means of matrices smaller than $4 \times 4$, and we
have seen that there is no such representation.

In doing physical calculations in the Dirac theory it is never actually necessary to use explicit
representations of the Dirac matrices, such as the Dirac-Pauli representation (11). The same is
actually true of the Pauli matrices $\boldsymbol{\sigma}$; it is never actually necessary to use the explicit forms of
the three Pauli matrices that all students of quantum mechanics memorize. Nevertheless, some
calculations are simpler in one representation or another. The Dirac-Pauli representation (11) of
the Dirac algebra is most useful in studying the nonrelativistic limit of the Dirac equation, and it is
the one that we will use the most. But we mention another representation, the Weyl representation,
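that is useful for studying ultrarelativistic particles such as neutrinos. The Weyl representation is

$$ \boldsymbol{\alpha} = \begin{pmatrix} \boldsymbol{\sigma} & 0 \\ 0 & -\boldsymbol{\sigma} \end{pmatrix}, \qquad \beta = \begin{pmatrix} 0 & -1 \\ -1 & 0 \end{pmatrix}. \qquad \text{(Weyl)} \tag{13} $$

The explicit change of basis taking us from the Dirac-Pauli to the Weyl representation is

$$ (\alpha_k)_{\rm Weyl} = U(\alpha_k)_{\rm DP}U^\dagger, \qquad \beta_{\rm Weyl} = U\beta_{\rm DP}U^\dagger, \tag{14} $$

where

$$ U = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix}. \tag{15} $$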
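
The change of basis can be checked with exact arithmetic. The sketch below assumes one self-consistent sign convention: $\boldsymbol{\alpha}_{\rm Weyl} = \operatorname{diag}(\boldsymbol{\sigma}, -\boldsymbol{\sigma})$, $\beta_{\rm Weyl}$ with $-1$ in both off-diagonal blocks, and $U = 2^{-1/2}\begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix}$ in block form. Other sign conventions for the Weyl representation are in use, so these should be treated as assumptions. To avoid floating-point square roots the code works with $V = \sqrt{2}\,U$ (a name introduced here), for which $V M V^\dagger = 2\,U M U^\dagger$.

```python
I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
SIGMA = [
    [[0, 1], [1, 0]],      # sigma_1
    [[0, -1j], [1j, 0]],   # sigma_2
    [[1, 0], [0, -1]],     # sigma_3
]

def scale(s, m):
    return [[s * x for x in row] for row in m]

def block(a, b, c, d):
    """Assemble a 4x4 matrix from 2x2 blocks laid out as [[a, b], [c, d]]."""
    return [ra + rb for ra, rb in zip(a, b)] + [rc + rd for rc, rd in zip(c, d)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(m):
    """Hermitian conjugate (conjugate transpose)."""
    n = len(m)
    return [[m[j][i].conjugate() for j in range(n)] for i in range(n)]

# Dirac-Pauli matrices (assumed standard form).
ALPHA_DP = [block(Z2, s, s, Z2) for s in SIGMA]
BETA_DP = block(I2, Z2, Z2, scale(-1, I2))

# Weyl matrices in the sign convention assumed in the lead-in.
ALPHA_WEYL = [block(s, Z2, Z2, scale(-1, s)) for s in SIGMA]
BETA_WEYL = block(Z2, scale(-1, I2), scale(-1, I2), Z2)

# V = sqrt(2) * U, so that V M V^dagger = 2 * (U M U^dagger).
V = block(I2, I2, scale(-1, I2), I2)

def conjugate_by_v(m):
    return matmul(matmul(V, m), dagger(V))

def is_weyl_change_of_basis():
    alphas_ok = all(conjugate_by_v(a) == scale(2, w)
                    for a, w in zip(ALPHA_DP, ALPHA_WEYL))
    beta_ok = conjugate_by_v(BETA_DP) == scale(2, BETA_WEYL)
    return alphas_ok and beta_ok

print(is_weyl_change_of_basis())
```

The same conjugation applied to the identity gives twice the identity, which confirms that $U = V/\sqrt{2}$ is unitary.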



There is another representation of the Dirac algebra in common use, the Majorana representation, that makes the effects of time-reversal simple. We will not use the Majorana representation
and so will not quote it.
We remark that the fact that the Dirac matrices are 4-dimensional has nothing to do with
the fact that space-time is 4-dimensional. The Dirac matrices act on spin space, not on space-time.
Conversely, while there are 4-dimensional matrices with space-time indices, such as a Lorentz
transformation or the electromagnetic field tensor $F_{\mu\nu}$, these do not act on spin space.
5. Minimal Coupling
Equation (3) gives the Dirac Hamiltonian for a free particle, where now we take $\psi$ to be a
4-component spinor and we have explicit representations for the matrices $\boldsymbol{\alpha}$ and $\beta$. To incorporate
the interaction of the particle with an electromagnetic field we may use the minimal coupling prescription, which is described at a classical level in Sec. B.17. In the present context it means that
we make the replacements,

$$ \mathbf{p} \to \mathbf{p} - \frac{q}{c}\mathbf{A}, \qquad H \to H - q\Phi, \tag{16} $$

in Eq. (3), where $q$ is the charge of the particle and $\Phi$ and $\mathbf{A}$ are the scalar and vector potentials.
This gives the new Hamiltonian,

$$ H = c\,\boldsymbol{\alpha}\cdot\left[\mathbf{p} - \frac{q}{c}\mathbf{A}(\mathbf{x}, t)\right] + q\Phi(\mathbf{x}, t) + mc^2\beta. \tag{17} $$

It is a guess that this is the correct Hamiltonian for a Dirac particle interacting with the electromagnetic field, but it is the simplest guess consistent with Lorentz covariance.
Notice that in coming this far we have made the simplest choice possible at each stage where
there was more than one possibility, including the representation of the Dirac algebra and the
minimal coupling prescription.
6. The Probability Density and Current
Let us now search for a conserved probability density and current. Following the steps we
used in the nonrelativistic theory, we begin by writing down the Dirac equation and its Hermitian
conjugate. We use the Hamiltonian (17), which includes the interaction with an electromagnetic
field via the minimal coupling prescription. The Dirac equation itself is

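$$ i\hbar\frac{\partial\psi}{\partial t} = c\,\boldsymbol{\alpha}\cdot\left(-i\hbar\nabla - \frac{q}{c}\mathbf{A}\right)\psi + q\Phi\psi + mc^2\beta\psi. \tag{18} $$

When we take the Hermitian conjugate, the column spinor $\psi$ is mapped into a row spinor,

$$ \psi^\dagger = (\psi^*)^T = [\,\psi_1^*\ \ \psi_2^*\ \ \psi_3^*\ \ \psi_4^*\,]. \tag{19} $$

In the following you must remember that $\psi^\dagger$ is a row spinor, so that when it is multiplied onto a
Dirac matrix from the left it produces another row spinor. The Dirac matrices $\boldsymbol{\alpha}$ and $\beta$ are mapped
into themselves under Hermitian conjugation, since they are Hermitian, and the order of matrix and
spinor multiplication is reversed. Altogether, the Hermitian conjugate of the Dirac equation is

$$ -i\hbar\frac{\partial\psi^\dagger}{\partial t} = c\left[i\hbar(\nabla\psi^\dagger)\cdot\boldsymbol{\alpha} - \frac{q}{c}\psi^\dagger(\boldsymbol{\alpha}\cdot\mathbf{A})\right] + q\Phi\psi^\dagger + mc^2\psi^\dagger\beta. \tag{20} $$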

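Now we multiply Eq. (18) on the left by $\psi^\dagger$ and (20) on the right by $\psi$, where complete contraction
in spinor indices is understood, so the results are scalars insofar as the spin indices are concerned.
The terms involving $\mathbf{A}$, $\Phi$ and $\beta$ are the same, so when we subtract these cancel. What is left is

$$ i\hbar\left(\psi^\dagger\frac{\partial\psi}{\partial t} + \frac{\partial\psi^\dagger}{\partial t}\psi\right) = -i\hbar c\left[\psi^\dagger\boldsymbol{\alpha}\cdot\nabla\psi + (\nabla\psi^\dagger)\cdot\boldsymbol{\alpha}\psi\right], \tag{21} $$

or,

$$ \frac{\partial\rho}{\partial t} + \nabla\cdot\mathbf{J} = 0, \tag{22} $$

where

$$ \rho = \psi^\dagger\psi, \qquad \mathbf{J} = c\,\psi^\dagger\boldsymbol{\alpha}\psi. \tag{23} $$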


In this derivation multiplication of spin matrices times spinors is implied, as is the multiplication of
row spinors times column spinors. If this is not clear we can always write out the spin operations
explicitly, for example, Eqs. (23) can be written,

$$ \rho = \sum_{r=1}^{4}|\psi_r|^2, \qquad J_i = c\sum_{r,s=1}^{4}\psi_r^*(\alpha_i)_{rs}\psi_s. \tag{24} $$



We have not only found a conserved probability density and current, but the density is nonnegative definite, $\rho \ge 0$. Recall that the conserved density in the theory of the Klein-Gordon equation
could take on any sign, and so could not be interpreted as a probability density. In deriving Eqs. (23)
Dirac felt that he had overcome one of the main difficulties of the Klein-Gordon equation.
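
The component form of the density and current can be made concrete with a small numerical sketch. It assumes the Dirac-Pauli matrices in their standard block form, units with $c = 1$ by default, and an arbitrary sample spinor chosen only for illustration. The function names `density` and `current` are not from these notes.

```python
# Dirac-Pauli alpha matrices written out explicitly (assumed standard form).
ALPHA = [
    [[0, 0, 0, 1], [0, 0, 1, 0], [0, 1, 0, 0], [1, 0, 0, 0]],          # alpha_1
    [[0, 0, 0, -1j], [0, 0, 1j, 0], [0, -1j, 0, 0], [1j, 0, 0, 0]],    # alpha_2
    [[0, 0, 1, 0], [0, 0, 0, -1], [1, 0, 0, 0], [0, -1, 0, 0]],        # alpha_3
]

def density(psi):
    """rho = sum_r |psi_r|^2, the probability density."""
    return sum((x.conjugate() * x).real for x in psi)

def current(psi, c=1.0):
    """J_i = c * sum_{r,s} conj(psi_r) (alpha_i)_{rs} psi_s, for i = 1, 2, 3."""
    result = []
    for a in ALPHA:
        value = sum(psi[r].conjugate() * a[r][s] * psi[s]
                    for r in range(4) for s in range(4))
        result.append(c * value)
    return result

# Arbitrary sample spinor (illustrative values only, not normalized).
psi = [1, 1j, 2, 1 - 1j]

rho = density(psi)
J = current(psi)
print(rho, J)
```

Because the $\alpha_i$ are Hermitian, each $J_i$ comes out with zero imaginary part, and $\rho$ is a sum of squared moduli and so cannot be negative.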

Notice that in the nonrelativistic Pauli theory of a spin-$\tfrac{1}{2}$ particle the wave function is a 2-component spinor,

$$ \psi = \begin{pmatrix} \psi_+ \\ \psi_- \end{pmatrix}, \tag{25} $$

and the probability density is

$$ \rho = \psi^\dagger\psi = |\psi_+|^2 + |\psi_-|^2. \tag{26} $$

It is the probability density of finding the particle, if we don't care about the spin state. That is, it
is the probability density of finding a spin up particle, plus the probability density of finding a spin
down particle. We do not yet have a physical interpretation of the four components of the Dirac
spinor, but the similarity between Dirac's $\rho$ and the $\rho$ in the Pauli theory is notable.
If we set

$$ J^\mu = (c\rho, \mathbf{J}), \tag{27} $$

then the continuity equation (22) looks covariant,

$$ \partial_\mu J^\mu = \frac{\partial J^\mu}{\partial x^\mu} = 0. \tag{28} $$

However, as remarked earlier in connection with the Schrödinger theory (see Sec. 43.6), to show that
Eq. (28) is covariant we must show that $J^\mu$ transforms as a 4-vector under Lorentz transformations.
That is a fairly large project that we defer until we have a better physical understanding of the
Dirac equation.


7. Free Particle Solutions at Rest

To gain physical insight into the Dirac equation, we examine the simplest possible free particle
solutions, namely those for a particle at rest. We use the Dirac-Pauli representation for the Dirac
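matrices. The free particle Dirac equation can be written,

$$ i\hbar\frac{\partial\psi}{\partial t} = -i\hbar c\,\boldsymbol{\alpha}\cdot\nabla\psi + mc^2\beta\psi. \tag{29} $$

But if the particle is at rest then the momentum vanishes, and $\nabla\psi = 0$. Then the Dirac equation becomes

$$ i\hbar\frac{\partial\psi}{\partial t} = mc^2\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}\psi, \tag{30} $$

where the matrix shown is $\beta$ [see Eq. (11)]. The four components of the Dirac spinor decouple in
this case, and the solutions are easy to write down. They are

$$ \begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix}e^{-imc^2t/\hbar}, \qquad \begin{pmatrix} 0 \\ 1 \\ 0 \\ 0 \end{pmatrix}e^{-imc^2t/\hbar}, \qquad \begin{pmatrix} 0 \\ 0 \\ 1 \\ 0 \end{pmatrix}e^{+imc^2t/\hbar}, \qquad \begin{pmatrix} 0 \\ 0 \\ 0 \\ 1 \end{pmatrix}e^{+imc^2t/\hbar}. \tag{31} $$
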
The first pair of these solutions have energy $+mc^2$, just what we would expect for a particle at
rest, with the usual custom in relativity theory of including the rest mass-energy in the reckoning
of the energy of the particle. The second pair of solutions have energy $-mc^2$. We see that the Dirac
equation, like the Klein-Gordon equation, possesses solutions of negative energy. There is no ready
interpretation of these solutions, and we will have to do some work to find their proper physical
meaning. As for the positive energy solutions, we note that if we ask for the free particle solutions
at rest of the Pauli equation, they are

$$ \psi = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad \psi = \begin{pmatrix} 0 \\ 1 \end{pmatrix}, \tag{32} $$

with no dependence on space or time. These look very similar to the first two solutions (31) of
the Dirac equation, in fact they are the same if we ignore the third and fourth components of the
spinor and shift the origin of energy by $mc^2$ to conform with the usual custom of reckoning energy
in nonrelativistic problems.
Do not let this example lead you to think that the first two components of the Dirac spinor are
identical to the two components of the Pauli spinor. This is true for the free particle at rest, but
not for other problems, including the free particle in some state of motion. Nevertheless, it is true
that the solutions of the Pauli equation that are the nonrelativistic limit of the Dirac equation do
coincide approximately with the upper two components of the Dirac spinor. We will examine this
point in a little more detail in a moment, and in considerably more detail in a later set of notes.


8. Heisenberg Equations of Motion

To gain further insight into the physical interpretation of the Dirac equation, we examine the
Heisenberg equations of motion. See Sec. 5.5 for the basic theory of the Heisenberg picture. We will
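write the Heisenberg equations of motion as

$$ \dot{X} = -\frac{i}{\hbar}[X, H] + \frac{\partial X}{\partial t}, \tag{33} $$

where $H$ is the Hamiltonian and $X$ is any operator. This is the same as Eq. (5.21), but without
the $H$ subscripts. We will use the Hamiltonian (17), that is, the Hamiltonian of a Dirac particle
interacting with an electromagnetic field. We will write the kinetic momentum as

$$ \boldsymbol{\pi} = \mathbf{p} - \frac{q}{c}\mathbf{A}(\mathbf{x}, t), \tag{34} $$

to distinguish it from the canonical momentum $\mathbf{p}$. Then the Hamiltonian is

$$ H = c\,\boldsymbol{\alpha}\cdot\boldsymbol{\pi} + q\Phi + mc^2\beta. \tag{35} $$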


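Basic commutators are

$$ [x_i, x_j] = 0, \qquad [x_i, \pi_j] = i\hbar\,\delta_{ij}, \qquad [\pi_i, \pi_j] = \frac{i\hbar q}{c}\,\epsilon_{ijk}B_k, \tag{36} $$

where $B_k$ is the $k$-th component of the magnetic field. See also Eqs. (10.5) and (10.6).
First we let $X = x_i$ (a component of the position operator). The operator $x_i$ has no explicit
time dependence, and everything in $H$ except $\boldsymbol{\pi}$ commutes with $x_i$. Thus,

$$ [x_i, H] = c\,[x_i, \alpha_j\pi_j] = c\,\alpha_j[x_i, \pi_j] = i\hbar c\,\alpha_i. \tag{37} $$

We use the Heisenberg equations of motion to define the velocity operator $\mathbf{v}$, and we find

$$ \mathbf{v} = \dot{\mathbf{x}} = c\,\boldsymbol{\alpha}. \tag{38} $$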


The velocity operator in the Dirac theory turns out to be a purely spin operator, which is a
surprise. In the Schrödinger or Pauli theory the velocity operator is $\mathbf{v} = \boldsymbol{\pi}/m$, or, for a free particle,
simply $\mathbf{v} = \mathbf{p}/m$. It is a purely spatial operator in the nonrelativistic theory. In the relativistic
theory of the free particle we might have expected something like

$$ \mathbf{v} = \frac{c^2\mathbf{p}}{\sqrt{m^2c^4 + c^2p^2}}, \tag{39} $$


which is the expression in classical relativity giving the velocity as a function of the momentum. In
any case, we might have expected the velocity to be a function of the momentum, and, hence, a
purely spatial operator.
It will take us some time to understand the physical meaning of this result, which is discussed
further later in connection with the Gordon decomposition of the current. For now we simply remark
that the same interpretation of $c\boldsymbol{\alpha}$ (as a velocity operator) also appears in the probability current,
$\mathbf{J} = c\,\psi^\dagger\boldsymbol{\alpha}\psi$ [see Eq. (23)]. Recall that the probability current in the nonrelativistic theory is $\operatorname{Re}(\psi^\dagger\mathbf{v}\psi)$,
and, in any case, the probability current has to have dimensions of probability density ($\psi^\dagger\psi$) times
a velocity.
Further strange features of the velocity operator in the Dirac theory emerge when we consider
making a measurement of the velocity. If we measure one component of the velocity, say, vx = v1 , the
answers can only be the eigenvalues of the operator $c\alpha_1$. But we have seen that the eigenvalues of all
the Dirac matrices are $\pm 1$. Therefore, a measurement of $v_x$ yields only two results, $\pm c$. Furthermore,
we cannot measure more than one component of the velocity simultaneously, since the commutator
$[\alpha_i, \alpha_j]$ is nonzero. In the Dirac theory the components of the velocity do not commute, so it is
impossible to measure the direction of motion of the particle.
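
Both statements can be checked directly on explicit matrices. The sketch below assumes the Dirac-Pauli form of $\alpha_1$ and $\alpha_2$. It verifies that $\alpha_1^2 = 1$ with zero trace (so the eigenvalues are $+1$ and $-1$, two of each) and that $[\alpha_1, \alpha_2] = 2i\,\Sigma_3$, where $\Sigma_3 = \operatorname{diag}(\sigma_3, \sigma_3)$ is a symbol introduced here only for the check.

```python
# Dirac-Pauli alpha_1 and alpha_2 written out explicitly (assumed standard form).
ALPHA_1 = [[0, 0, 0, 1], [0, 0, 1, 0], [0, 1, 0, 0], [1, 0, 0, 0]]
ALPHA_2 = [[0, 0, 0, -1j], [0, 0, 1j, 0], [0, -1j, 0, 0], [1j, 0, 0, 0]]
IDENTITY4 = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
SIGMA3_BLOCKS = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, 1, 0], [0, 0, 0, -1]]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commutator(a, b):
    """Return [a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] - ba[i][j] for j in range(n)] for i in range(n)]

def trace(m):
    return sum(m[i][i] for i in range(len(m)))

def scale(s, m):
    return [[s * x for x in row] for row in m]

# alpha_1 squares to the identity, so its eigenvalues can only be +1 or -1;
# zero trace then forces two of each in four dimensions.
squares_to_one = matmul(ALPHA_1, ALPHA_1) == IDENTITY4
is_traceless = trace(ALPHA_1) == 0

# The commutator is nonzero, so v_1 = c*alpha_1 and v_2 = c*alpha_2
# are not simultaneously measurable.
comm = commutator(ALPHA_1, ALPHA_2)
matches_spin_form = comm == scale(2j, SIGMA3_BLOCKS)

print(squares_to_one, is_traceless, matches_spin_form)
```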

A proper appreciation of these results requires developments that have not been explained yet.
For now suffice it to say that it turns out that the position operator does not have the same meaning
in relativistic quantum mechanics as in the nonrelativistic theory, because if we try to localize the
particle in a region smaller than the Compton wavelength there will arise significant amplitudes for
the production of particle-antiparticle pairs. Thus, we are no longer dealing with a single particle.
Recall that the photon does not have a position operator.
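Now we compute the time derivative of the kinetic momentum $\boldsymbol{\pi}$, for which

$$ \dot{\boldsymbol{\pi}} = -\frac{i}{\hbar}[\boldsymbol{\pi}, H] - \frac{q}{c}\frac{\partial\mathbf{A}}{\partial t}. \tag{40} $$

Working out the commutators and expressing the derivatives of $\mathbf{A}$ and $\Phi$ in terms of the electric and
magnetic fields, we find

$$ \dot{\boldsymbol{\pi}} = q\left[\mathbf{E}(\mathbf{x}, t) + \frac{1}{c}\,\mathbf{v}\times\mathbf{B}(\mathbf{x}, t)\right]. \tag{41} $$

This is precisely the equation of motion for the kinetic momentum of a charged particle in classical
relativity theory, which of course here is reinterpreted as an operator equation. In the classical
theory $\boldsymbol{\pi} = \gamma m\mathbf{v}$, but as we have seen, in the quantum theory the velocity and momentum are not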
functions of one another.
The Heisenberg equations of motion are useful for finding the time-dependence of expectation
values of wave packets, a subject we will return to when we reconsider the strange features of the
velocity operator.

9. The Nonrelativistic Limit of the Dirac Equation

An obvious requirement for any successful relativistic theory such as the Dirac equation is that it
must reduce to the known, nonrelativistic theory when the velocity is small compared to the speed of
light. We will now examine solutions of the Dirac equation in the low velocity limit, using a simple
procedure that gives us the answers to order $(v/c)^2$. Later we will carry out a more systematic
expansion of the Dirac equation in powers of v/c.
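
To get a feeling for the size of the corrections, it may help to compare the exact relativistic kinetic energy of a free classical particle, $\sqrt{m^2c^4 + c^2p^2} - mc^2$, with the nonrelativistic value $p^2/2m$. This is only a scalar illustration in units where $m = c = 1$ by default, not the spinor expansion carried out in these notes, and the function names are illustrative.

```python
import math

def kinetic_energy_exact(p, m=1.0, c=1.0):
    """Relativistic energy minus the rest energy, for momentum p."""
    return math.sqrt((m * c**2) ** 2 + (c * p) ** 2) - m * c**2

def kinetic_energy_nonrel(p, m=1.0):
    """Nonrelativistic kinetic energy p^2 / 2m."""
    return p**2 / (2.0 * m)

def relative_error(p, m=1.0, c=1.0):
    """Fractional amount by which p^2/2m overestimates the exact value."""
    exact = kinetic_energy_exact(p, m, c)
    return (kinetic_energy_nonrel(p, m) - exact) / exact

# For small p/(mc) the fractional error is roughly (p/mc)^2 / 4,
# that is, of relative order (v/c)^2.
for p in (0.01, 0.1):
    print(p, relative_error(p), (p**2) / 4.0)
```

The leading correction comes from the $-p^4/8m^3c^2$ term in the expansion of the square root, which is why the fractional error scales as $(v/c)^2$.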


Instead of the condition $v \ll c$, we will write the energy as $mc^2$ plus a correction that is small
compared to $mc^2$. This gives us the nonrelativistic limit of the positive energy solutions of the Dirac
equation. Since the time dependence of the wave function is governed by the energy (energy
eigenstates have the time dependence $e^{-iEt/\hbar}$), let us split off a factor of $e^{-imc^2t/\hbar}$ from the Dirac
wave function $\psi$. Let us also break the 4-component spinor into two 2-component spinors $\phi$ and $\chi$,
using the Dirac-Pauli representation. That is, let us substitute

$$ \psi = e^{-imc^2t/\hbar}\begin{pmatrix} \phi \\ \chi \end{pmatrix} \tag{42} $$

into the Dirac equation, which we write in the Schrödinger form (2) with the Hamiltonian (35),
including the interaction with the electromagnetic field.
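The left-hand side becomes

$$ i\hbar\frac{\partial\psi}{\partial t} = e^{-imc^2t/\hbar}\left[mc^2\begin{pmatrix} \phi \\ \chi \end{pmatrix} + i\hbar\frac{\partial}{\partial t}\begin{pmatrix} \phi \\ \chi \end{pmatrix}\right], \tag{43} $$

while the right-hand side is

$$ (c\,\boldsymbol{\alpha}\cdot\boldsymbol{\pi} + q\Phi + mc^2\beta)\psi = e^{-imc^2t/\hbar}\left[c\begin{pmatrix} 0 & \boldsymbol{\sigma}\cdot\boldsymbol{\pi} \\ \boldsymbol{\sigma}\cdot\boldsymbol{\pi} & 0 \end{pmatrix} + q\Phi + mc^2\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}\right]\begin{pmatrix} \phi \\ \chi \end{pmatrix}, \tag{44} $$

where we have written out the matrices $\boldsymbol{\alpha}$ and $\beta$ explicitly. Cancelling the common phase and rearranging the terms slightly, the Dirac equation becomes a system of two coupled equations involving
the 2-component spinors $\phi$ and $\chi$:

$$ i\hbar\frac{\partial\phi}{\partial t} = c(\boldsymbol{\sigma}\cdot\boldsymbol{\pi})\chi + q\Phi\phi, \tag{45a} $$

$$ i\hbar\frac{\partial\chi}{\partial t} = c(\boldsymbol{\sigma}\cdot\boldsymbol{\pi})\phi + q\Phi\chi - 2mc^2\chi. \tag{45b} $$

This coupled system is exactly equivalent to the original Dirac equation.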

In Eq. (45b) there are three terms involving $\chi$. The term $q\Phi\chi$ is of the order of the nonrelativistic
kinetic energy (times $\chi$), assuming that the kinetic and potential energies trade back and forth as is
typical, for example, in the orbital motion of an electron in an atom. That is, $q\Phi \ll mc^2$, according
to our assumptions. The term $i\hbar\,\partial\chi/\partial t$ must also be of the order of the nonrelativistic kinetic energy
(times $\chi$), because we have already stripped off $mc^2$ from the time dependence that represents the
total (relativistic) energy. Therefore both these terms are small compared to $2mc^2\chi$, and can be
neglected. Doing this, we can solve Eq. (45b) for $\chi$ in terms of $\phi$,

$$ \chi = \frac{1}{2mc}(\boldsymbol{\sigma}\cdot\boldsymbol{\pi})\phi. \tag{46} $$

Now substituting this into Eq. (45a), we obtain

$$ i\hbar\frac{\partial\phi}{\partial t} = \frac{1}{2m}(\boldsymbol{\sigma}\cdot\boldsymbol{\pi})^2\phi + q\Phi\phi. \tag{47} $$


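This equation is actually the Pauli equation. To see this, note that $\boldsymbol{\sigma}$ commutes with $\boldsymbol{\pi}$, so

$$ (\boldsymbol{\sigma}\cdot\boldsymbol{\pi})^2 = \sigma_i\pi_i\,\sigma_j\pi_j = \sigma_i\sigma_j\,\pi_i\pi_j = (\delta_{ij} + i\epsilon_{ijk}\sigma_k)\pi_i\pi_j = \boldsymbol{\pi}^2 + i\epsilon_{ijk}\sigma_k\pi_i\pi_j, \tag{48} $$

where we use the properties of the Pauli matrices [see Eq. (1.145)]. Since the Levi-Civita symbol
$\epsilon_{ijk}$ is antisymmetric in $i$ and $j$, the second term in this equation can be written,

$$ \frac{i}{2}\epsilon_{ijk}\sigma_k(\pi_i\pi_j - \pi_j\pi_i) = \frac{i}{2}\epsilon_{ijk}\sigma_k[\pi_i, \pi_j] = \frac{i}{2}\,\frac{i\hbar q}{c}\,\epsilon_{ijk}\epsilon_{ij\ell}\,\sigma_k B_\ell = -\frac{q\hbar}{c}\,\boldsymbol{\sigma}\cdot\mathbf{B}, \tag{49} $$

where we use the commutator in Eq. (36) and Eq. (E.52). Altogether, we have

$$ (\boldsymbol{\sigma}\cdot\boldsymbol{\pi})^2 = \boldsymbol{\pi}^2 - \frac{q\hbar}{c}\,\boldsymbol{\sigma}\cdot\mathbf{B}, \tag{50} $$

so Eq. (47) becomes

$$ i\hbar\frac{\partial\phi}{\partial t} = \left[\frac{1}{2m}\left(\mathbf{p} - \frac{q}{c}\mathbf{A}\right)^2 - g\,\frac{q}{2mc}\,\mathbf{S}\cdot\mathbf{B} + q\Phi\right]\phi, \tag{51} $$

where we have set $\mathbf{S} = (\hbar/2)\boldsymbol{\sigma}$ for a spin-$\tfrac{1}{2}$ particle, and where $g = 2$. See Eq. (14.10).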
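
The Pauli-matrix property used in this derivation, $\sigma_i\sigma_j = \delta_{ij} + i\epsilon_{ijk}\sigma_k$, is equivalent to the identity $(\boldsymbol{\sigma}\cdot\mathbf{a})(\boldsymbol{\sigma}\cdot\mathbf{b}) = \mathbf{a}\cdot\mathbf{b} + i\boldsymbol{\sigma}\cdot(\mathbf{a}\times\mathbf{b})$ for ordinary commuting vectors. The sketch below checks that identity for two arbitrary integer vectors (illustrative values only). It does not model the operator case: for the kinetic momentum the components do not commute, and that is precisely what produces the extra magnetic term.

```python
SIGMA = [
    [[0, 1], [1, 0]],      # sigma_1
    [[0, -1j], [1j, 0]],   # sigma_2
    [[1, 0], [0, -1]],     # sigma_3
]

def sigma_dot(v):
    """Return the 2x2 matrix sigma . v for a 3-vector v of ordinary numbers."""
    return [[sum(v[k] * SIGMA[k][i][j] for k in range(3)) for j in range(2)]
            for i in range(2)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def pauli_identity_holds(a, b):
    """Compare (sigma.a)(sigma.b) with (a.b) 1 + i sigma.(a x b)."""
    lhs = matmul(sigma_dot(a), sigma_dot(b))
    s_cross = sigma_dot(cross(a, b))
    rhs = [[dot(a, b) * (1 if i == j else 0) + 1j * s_cross[i][j]
            for j in range(2)] for i in range(2)]
    return lhs == rhs

# Arbitrary commuting (c-number) vectors chosen for illustration.
a = [1, 2, 3]
b = [4, -5, 6]
print(pauli_identity_holds(a, b))
```

Setting $\mathbf{a} = \mathbf{b}$ makes the cross product vanish, and the identity reduces to $(\boldsymbol{\sigma}\cdot\mathbf{a})^2 = |\mathbf{a}|^2$.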


Equation (51) is the Pauli equation for a particle of spin $\tfrac{1}{2}$ (because $\phi$ has two components),
with the interaction of the spin with the magnetic field included and with a $g$-factor of 2. As we
have seen in Notes 14, the $g$-factor of the electron is very close to 2. In Dirac's day the small
corrections to the value 2 had not been seen experimentally and were not suspected theoretically, so
as far as anyone knew the Dirac equation gave the exact $g$-factor of the electron. It is evident that
Dirac's program for developing his wave equation, as described in these Notes, is giving us a theory
of the electron. Other particles of spin $\tfrac{1}{2}$, such as the proton and neutron, with their anomalous
magnetic moments, will require a modified development.
This is the first of a series of miraculous successes of the Dirac equation that we will see as
we proceed. In the first place notice that in the nonrelativistic theory, if we want to include the
interaction of the magnetic moment of the electron with a magnetic field, we must put the term
$-\boldsymbol{\mu}\cdot\mathbf{B}$ into the Hamiltonian by hand, based on our knowledge that this is the energy of interaction
of a classical magnetic moment with a magnetic field. (And this classical interaction involves some
subtleties that are completely skipped in most books. In general, the subject of magnetic energy is
treated poorly in many books, with some common mistakes.) In the Dirac equation, however, we
did not have to put this interaction in by hand, rather it followed automatically from the minimal
assumptions we have made (the simplest representation of the Dirac algebra, and the minimal
coupling prescription for the interaction with the electromagnetic field). Not only that, but we
get the correct g-factor for the electron to a high degree of accuracy, while in the nonrelativistic
theory the g-factor has to be taken as a measured parameter without theoretical support. For this
reason the value $g = 2$ is sometimes called the "Dirac value," while other values are considered
"anomalous."

Although there remain some interpretational difficulties with the Dirac equation, including the
negative energy solutions and some others we will discuss later, these successes are so striking that
we continue with the theory to see where it will lead. The next step is to see how the Dirac equation
transforms under Lorentz transformations.

1. The Dirac equation in two space dimensions. Suppose we lived in a world with two space dimensions ($x$ and $y$) and one time dimension. Let $x^\mu = (ct, x, y)$, and otherwise use the obvious restrictions
of ordinary relativity theory to two spatial dimensions (for example, $g_{\mu\nu} = \operatorname{diag}(+1, -1, -1)$). In
two spatial dimensions, the vector potential $\mathbf{A}$ has two components $(A_x, A_y)$, but the magnetic field
is a scalar,

$$ B = \frac{\partial A_y}{\partial x} - \frac{\partial A_x}{\partial y}. $$

More generally, true vectors in three dimensions become 2-vectors when restricted to two dimensions,
but pseudo-vectors in three dimensions become (pseudo) scalars in two dimensions. You will find

that in some respects the Dirac equation in two spatial dimensions is very similar to what was done
in lecture in three spatial dimensions, and in other respects it is different.
(a) Carry out Dirac's program of finding a relativistic wave equation that is first order in both
space and time. Show that the Dirac algebra of $\boldsymbol{\alpha} = (\alpha_1, \alpha_2)$ and $\beta$ matrices can be satisfied by
$2 \times 2$ Hermitian matrices. This is the simplest solution for the Dirac algebra in $2 + 1$ dimensional
space-time. Thus, the Dirac spinor has two components. The solution is not unique, because you can
change the basis (conjugate any solution with some unitary matrix to create another representation
of the algebra), so let your primary solution be one in which $\beta$ is diagonal. Call this the Dirac-Pauli
representation. Also compute the matrices $\gamma^\mu$ in the Dirac-Pauli representation. The Dirac-Pauli
representation is most convenient for studying the nonrelativistic limit.
Now find a representation in which the matrices $\gamma^\mu$ are purely imaginary. Call this the Majorana representation. The Majorana representation makes Lorentz transformations on the spinors
look most simple.
(b) Write out the Dirac equation including minimal coupling to the electromagnetic field, and show
that it possesses a positive definite probability density with an associated probability current that
taken together satisfy the continuity equation.
(c) Work out the Heisenberg equations of motion for $\mathbf{x}$ and $\boldsymbol{\pi} = \mathbf{p} - (q/c)\mathbf{A}$.
(d) Using the Dirac-Pauli representation, write the 2-component Dirac spinor as

$$ \psi = e^{-imc^2t/\hbar}\begin{pmatrix} \phi \\ \chi \end{pmatrix}, $$

and assume that the energy is $E = mc^2 + \text{small}$. Find an approximation giving $\chi$ in terms of $\phi$,
and use it to obtain an effective Schrödinger equation for the upper component $\phi$. Is the resulting
Schrödinger equation realistic for any problems in our real (3-dimensional) world?

