An Unexpected Union - Physics and Fisher Information


from SIAM News, Volume 33, Number 6



By James Case

Physics from Fisher Information: A Unification. By B. Roy Frieden, Cambridge University Press, New York, 1998, 328 pages, $74.95.

First printed in 1998, and twice reprinted in 1999, the book under review is certainly being talked about. One suspects that the ideas it contains, if not the book itself, will continue to be discussed for generations to come. The exposition begins with a quote from John Archibald Wheeler to the effect that observer participancy gives rise to information, and information gives rise to physics. The idea that observers are participants in the events they observe has been around at least since Heisenberg, but Frieden has exploited it more systematically than any predecessor. To do so, he formulates a single unifying principle, that of extreme physical information (EPI), from which he derives Maxwell's equations, the Einstein field equations, the Dirac and Klein–Gordon equations of quantum mechanics, various laws of statistical physics, and even a few previously undiscovered laws governing nearly incompressible turbulent fluid flows. Although EPI is obviously a unification, Frieden doubts that it will ever become the long-sought theory of everything.

Fisher information is named for its inventor, R.A. Fisher (1890–1962), a British biostatistician who was among the first to develop and employ, when and as the need arose during his work in genetics and eugenics, such methods as maximum likelihood estimation, the analysis of variance, and the design of experiments. He also pointed out that Gregor Mendel had probably falsified the data in his famous pea-plant experiments, which seem too clean to be the result of any natural process.

[Photo caption: Ronald A. Fisher, 1929, as sketched by B. Roy Frieden, author of the book under review. The sketch, which appears in Physics from Fisher Information: A Unification, was done from a photograph taken at the time of Fisher's election as a Fellow of the Royal Society.]

Fisher's idea was that attempts to measure physical quantities, such as the time required for the winner of a 100-yard dash to reach the finish line, are invariably frustrated by noise. That's why multiple stopwatches are ordinarily employed. Moreover, the quantity of information concerning the actual elapsed time contained in such a sample varies directly with the degree to which the sample measurements cluster about a common value. Thus, if yn denotes the figure (time) recorded on device (stopwatch) n ∈ {1, ..., N} and if yn = θ + xn, where θ is the true elapsed time and x1, ..., xN represent random errors, then the quantity I of information concerning θ contained in the vector y = (y1, ..., yN) of observations should be inversely related to a (scalar) measure of the dispersion of the elements of y about θ. Accordingly, if p(y|θ) is the N-dimensional probability density function describing the deviation of y from θ and θ̂(y) denotes an unbiased estimator of θ, the expected error


E[θ̂(y) − θ] = ∫ (θ̂(y) − θ) p(y|θ) dy    (1)

vanishes almost surely. Meanwhile, the mean square error

e² = ∫ (θ̂(y) − θ)² p(y|θ) dy    (2)

remains positive a.s. Fisher information I = I(θ) is defined to be

I = ∫ [∂ ln p(y|θ)/∂θ]² p(y|θ) dy.    (3)

I satisfies the consequence of the Cauchy–Schwarz inequality known as the Cramér–Rao inequality: Ie² ≥ 1. If N = 1 and p(y|θ) = p(y − θ), (3) reduces to

I = ∫ [p′(y)]²/p(y) dy = ∫ [p′(x)]²/p(x) dx = I.    (4)
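The Cramér–Rao inequality is easy to probe by simulation. The sketch below (my own illustration, not from the book) draws Gaussian samples yn = θ + xn, estimates θ by the sample mean, and checks that the product Ie² sits at the bound, since the sample mean is efficient for Gaussian noise.

```python
import numpy as np

# Monte Carlo look at the Cramér–Rao inequality I·e² ≥ 1 for Gaussian noise:
# the sample mean is unbiased with e² = σ²/N, while I(θ) = N/σ², so the
# bound should be met with equality.
rng = np.random.default_rng(42)
theta, sigma, N, trials = 3.0, 1.5, 10, 200000
y = theta + sigma * rng.standard_normal((trials, N))
theta_hat = y.mean(axis=1)          # unbiased estimator of theta

e2 = np.mean((theta_hat - theta)**2)  # mean square error, as in (2)
I = N / sigma**2                      # Fisher information of the sample
print(I * e2)  # ≈ 1
```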

Moreover, the substitution p(x) = [q(x)]² yields the even simpler form

I = 4 ∫ [q′(x)]² dx.    (5)
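Equations (4) and (5) can be checked numerically in a few lines. The sketch below (my own, not from the book) computes the Fisher information of a Gaussian noise law with standard deviation σ both from the density form (4) and from the amplitude form (5); both should reproduce the known closed-form value 1/σ².

```python
import numpy as np

# Fisher information of Gaussian noise, two ways; closed form is 1/sigma^2.
sigma = 1.5
x = np.linspace(-12.0, 12.0, 200001)
dx = x[1] - x[0]
p = np.exp(-x**2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))

# Equation (4): I = ∫ [p'(x)]² / p(x) dx
I_pdf = np.sum(np.gradient(p, dx)**2 / p) * dx

# Equation (5): with amplitude q(x) = sqrt(p(x)), I = 4 ∫ [q'(x)]² dx
q = np.sqrt(p)
I_amp = 4 * np.sum(np.gradient(q, dx)**2) * dx

print(I_pdf, I_amp, 1 / sigma**2)  # all three ≈ 0.4444
```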

The introduction of real probability amplitudes q(x) in place of probability density functions p(x) simplifies many calculations in modern physics. Moreover, since the integrability of p(x) is equivalent to the square integrability of q(x), the substitution allows the analysis to take place in Hilbert space, a change of venue that von Neumann welcomed. In higher dimensions, the quantity [q′(x)]² becomes ∇q·∇q or, after the introduction of complex probability amplitudes ψ, ∇ψ·∇ψ*, where * denotes complex conjugation.

Frieden presents an entire table of Lagrangian functions, gleaned from branches of physics as diverse as classical and fluid mechanics, electro- and thermodynamics, quantum theory, and general relativity, in which such forms appear. This suggests that an approach via Fisher information could furnish at least a partial answer to a long-unanswered question: Where do Lagrangians come from? Frieden quotes a standard text to the effect that Lagrangians are often plucked out of thin air, simply because the associated Euler–Lagrange equations are the expected equations of motion. There is, as yet, no general explanation for the remaining terms present in the Lagrangians of Frieden's table. The success of the EPI technique requires an ability to account for the additional terms that occur in particular branches of physics.

If an experiment is designed to estimate several different parameter vectors θn from measurements yn = θn + xn, the appropriate generalizations of (5) are

I = 4 Σn ∫ ∇qn·∇qn dx = 4 Σn ∫ ∇ψn·∇ψn* dx,    (6)
with either real or complex probability amplitudes. As long as N (the largest n) is even, complex amplitudes can be formed from real ones by writing i = √(−1) and ψn = q2n−1 + iq2n, n = 1, ..., N/2. Alternatively, the qn's can be recovered from the ψn's by separating the real and imaginary parts. The choice between real and complex amplitudes is strictly a matter of convenience. It is also convenient to define q(x) as (q1(x), ..., qN(x)), and ψ(x) as (ψ1(x), ..., ψN/2(x)).

It follows from (4) that if the random deviations x follow a normal or Bernoulli distribution about θ, then I is inversely proportional to their variance. If x is logistically distributed, so that p(x) = 1/(1 + exp(ax)), then I = a/2. Or, if the deviations are 2-vectors drawn from a bivariate normal distribution in which the components share a common mean θ, a common variance σ², and a correlation coefficient ρ, then I becomes 2/[σ²(1 + ρ)]. The information content of two variables is thus a decreasing function of their degree of correlation, yet another reason why experimenters should, and typically do, make the extra effort to observe independent random variables whenever possible.

The form of the foregoing result leads Frieden to speculate that every experiment may constitute a (zero-sum) game against nature, in which the experimenter strives to maximize the Fisher information obtained, while a hostile nature (information demon) strives to minimize it by introducing unexpected correlations and the like. It seems an interesting, if as yet undeveloped, idea.

Frieden compares Fisher information with the Shannon, Boltzmann, and Kullback–Leibler definitions of entropy, which likewise represent attempts to identify useful scalar measures of information. The comparison is most easily made with Shannon entropy H = −∫ [ln p(x)] p(x) dx, and the discrete approximation
H(Δx) = −Σn p(xn) ln p(xn) Δx,    (7)

which converges to H as Δx → 0. H differs from the discrete form


I(Δx) = (1/Δx) Σn [p(xn+1) − p(xn)]² / p(xn)    (8)

of (4) in that the probabilities p(·) can be reassigned among the carriers x1, ..., xN without altering the value of (7). Not so (8). It matters for Fisher information I, but not for Shannon information H, whether the likeliest carriers are clustered in space. It is this feature that uniquely qualifies I to illuminate Lagrangian physics.

I, like H, can be viewed as a measure of physical disorder. When H coincides with traditional Boltzmann entropy HB, as it often does, the second law of thermodynamics stipulates that dH(t)/dt is never negative. And in a variety of other circumstances, among them the situation in which p(x,t) obeys a Fokker–Planck equation of the form pt = −[D1(x,t)p]x + [D2(x,t)p]xx, it turns out that dI(t)/dt is never positive. It is also possible to define temperature in terms of Fisher information, and to deduce a version of the perfect

P.M. Morse and H. Feshbach, Methods of Theoretical Physics, Part I, McGraw-Hill, New York, 1953.

gas law incorporating that precise definition of temperature, which suggests that it may eventually prove possible to recover all of thermodynamics from the notion of Fisher information. That, of course, remains to be done.

According to (4) and (5), Fisher information I is a real-valued functional defined on the space of PDFs p(·), or probability amplitudes q(·). In order to derive the laws of physics from Fisher information, one must find a second such functional J that, when subtracted from I, furnishes the missing terms in the Lagrangians of Frieden's table. In what follows, I will always have the form (6), while J will have a form appropriate to the particular branch of physics under consideration. Moreover, I will always represent the information in the data obtained via experiment, while J will represent the information content of the physical situation being explored. It makes sense, therefore, to suppose that I can never exceed J, to define an efficiency κ = I/J, and to observe that κ has always turned out to be either 1 or 1/2 in every situation (branch of physics) that Frieden and his co-workers have explored to date.

An experiment, however sensitive, must always perturb the probability amplitudes q(x) of the system under observation by an amount δq. The consequent perturbations δI and δJ represent quantities of information transferred from nature (the system) to the experimenter. Frieden proposes axiom 1: δI = δJ. This is a conservation law concerning the transfers of information possible in certain more or less idealized experiments. Its validity, according to Frieden, is demonstrated many times over in his book and is the key to his EPI approach to the understanding of physical reality. The axiom can actually be confirmed whenever the measurement space has a conjugate space connected to it by a unitary transformation. The author augments the foregoing conceptual axiom with two additional axioms of a more technical nature.
To consider the effects on the functional K of a small perturbation δq(x) of its argument q(x), we write K = I − J. Then, by axiom 1, we can write

δK = δI − δJ = 0,    (9)

and ask for which probability amplitudes q(x) equation (9) holds. The answer is naturally expressed in terms of Euler–Lagrange equations from the calculus of variations, a process best understood by seeing it in action.

Consider a (small) particle of mass m moving on a line about its rest position θ. If the particle is subject to a conservative scalar potential field V(x), its total energy W is conserved. In terms of complex wave functions ψn(x), the expression (6) for I becomes
I = 4N Σn ∫ |dψn(x)/dx|² dx,    (10)

the summation extending from n = 1 to n = N/2. The particle in question will have not only a position x relative to its rest position θ, but a momentum μ relative to its rest momentum 0. Moreover, in the same way the observations xn = yn − θ are drawn from probability distributions pn(x) = [qn(x)]² corresponding to complex amplitudes ψn(x), observed values μn would be drawn from distributions pn(μ) corresponding to complex amplitudes φn(μ). Again, we write φ = (φ1, ..., φN/2) and ψ = (ψ1, ..., ψN/2). Frieden concludes on physical grounds, by analysis of an experimental apparatus that could in theory be used to perform the required measurements, that J[φ] = I[ψ]. Moreover, the φn and ψn are Fourier transforms of one another, so that

ψn(x) = (1/√(2πħ)) ∫ φn(μ) exp(iμx/ħ) dμ.    (11)

When the resulting expressions for ψn(x) are substituted in (10),

I = (4N/ħ²) Σn ∫ μ² |φn(μ)|² dμ = J.    (12)
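The step from (10) to (12) is just the unitarity of the Fourier transform, and it can be illustrated numerically: for any normalized wave function, 4∫|ψ′(x)|² dx equals (4/ħ²)∫μ²|φ(μ)|² dμ. A sketch of my own (not from the book), in units with ħ = 1:

```python
import numpy as np

# Coordinate-space information 4∫|ψ'|²dx versus its momentum-space twin
# (4/ħ²)∫μ²|φ|²dμ for a Gaussian wave packet, with ħ = 1.
N = 4096
x = np.linspace(-40.0, 40.0, N, endpoint=False)  # periodic grid for the FFT
dx = x[1] - x[0]
psi = np.exp(-x**2 / 4) * np.exp(2j * x)         # Gaussian packet, momentum ≈ 2
psi /= np.sqrt(np.sum(np.abs(psi)**2) * dx)      # normalize ∫|ψ|²dx = 1

I_x = 4 * np.sum(np.abs(np.gradient(psi, dx))**2) * dx

mu = 2 * np.pi * np.fft.fftfreq(N, dx)           # momentum grid (ħ = 1)
phi2_dmu = np.abs(np.fft.fft(psi))**2 * dx / N   # |φ(μ)|² dμ, unitary convention
I_mu = 4 * np.sum(mu**2 * phi2_dmu)

print(I_x, I_mu)  # agree to finite-difference accuracy
```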

He then identifies the summation in (12) with the probability distribution p(μ) mentioned earlier, so that

J = (4N/ħ²) E[μ²].    (13)

Finally, he makes the specifically nonrelativistic approximation that the kinetic energy E of the particle is μ²/2m, so that

J = (8Nm/ħ²) E[E] = (8Nm/ħ²) E[W − V(x)] = (8Nm/ħ²) ∫ [W − V(x)] p(x) dx = (8Nm/ħ²) Σn ∫ [W − V(x)] |ψn(x)|² dx.    (14)

Since I and J have now been expressed in terms of the quantities ψn(x), K = I − J can be combined into a single integral of the form K = ∫ L(ψn, ψn′, x) dx, and the associated Euler–Lagrange equations written:
ψn″(x) + (2m/ħ²)[W − V(x)] ψn(x) = 0,  n = 1, ..., N/2,    (15)
the time-independent Schrödinger equation. This is the (one-dimensional, nonrelativistic) approximation of the (covariant, relativistic) Klein–Gordon equation, which can be derived, without approximation, in similar fashion. The unitary nature of the Fourier transformation played an essential role in the foregoing derivation, particularly that of equation (12), where it justified the conclusion that κ = I/J = 1 in the present theory. That means that quantum mechanics is efficient in the sense that the underlying experiments are capable of extracting all available information. Many other physical theories, including electromagnetic theory and gravitational theory, yield only κ = 1/2.

Frieden derives a version of Heisenberg's uncertainty principle from the Cramér–Rao inequality, and contrasts the result with the standard version obtained via Fourier duality. He remarks that the EPI version is stronger than the standard one in the sense that it implies the other, whereas the converse is untrue. The standard result applies only to uncertainties that exist before any measurements are taken, while the EPI version also applies to uncertainties that remain afterward.

The material described here is contained in the book's first four chapters and in Appendix D. The next six chapters present further applications of the EPI method. Statistical mechanics, like quantum mechanics, is found to be efficient in the sense that κ = I/J = 1, while most other theories are only halfway efficient. Frieden finds this unsurprising: quantized versions of most physical theories, including gravitation and electromagnetism, have not been developed, although he expects that they will be found in due course. Indeed, he and others are currently applying the EPI method to a study of quantum gravity, in steps analogous to those that led to the Schrödinger equation presented here and, in higher dimensions, to the Klein–Gordon equation.
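As a sanity check on (15), a finite-difference discretization recovers the familiar spectrum of a harmonic potential. The following sketch is my own (not Frieden's), in units where ħ = m = ω = 1, for which the eigenvalues should be W ≈ 1/2, 3/2, 5/2, ...:

```python
import numpy as np

# Discretize ψ'' + 2(W - V(x))ψ = 0, i.e. (-½ d²/dx² + V)ψ = Wψ, for the
# harmonic potential V(x) = x²/2, with ħ = m = ω = 1.
n = 1000
x = np.linspace(-10.0, 10.0, n)
dx = x[1] - x[0]
V = 0.5 * x**2

# Three-point stencil for -½ψ'': diagonal 1/dx², off-diagonals -1/(2dx²).
H = (np.diag(1.0 / dx**2 + V)
     + np.diag(-0.5 / dx**2 * np.ones(n - 1), 1)
     + np.diag(-0.5 / dx**2 * np.ones(n - 1), -1))

W = np.linalg.eigvalsh(H)[:3]
print(W)  # ≈ [0.5, 1.5, 2.5]
```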
Feynman path integrals seem to emerge naturally in this ongoing investigation, which seems destined to culminate in a vector wave equation of gravity. Also under attack are various problems in turbulence. Probability laws of fluid density and velocity have been found for low-velocity flow in a nearly incompressible fluid. The new laws agree well with published data arising from detailed simulations of such turbulence.

Frieden also remarks that the EPI method applies most naturally to the laws of physics that are expressed in the form of (differential) field equations. He sees no reason, however, why the method could not be extended to cover laws concerning the sources of the fields in question. The exposition ends as it began, quoting John Archibald Wheeler's words on observer participation. It seems safe to conclude, all in all, that the unexpected union between physics and Fisher information will prove both lasting and fruitful.
James Case writes from Baltimore, Maryland.
