
Error estimates on averages of correlated data

H. Flyvbjerg and H. G. Petersen

Citation: The Journal of Chemical Physics 91, 461 (1989)
View online: https://doi.org/10.1063/1.457480
View Table of Contents: http://aip.scitation.org/toc/jcp/91/1
Published by the American Institute of Physics

Error estimates on averages of correlated data

H. Flyvbjerg
The Niels Bohr Institute, Blegdamsvej 17, DK-2100 Copenhagen Ø, Denmark

H. G. Petersen
Institute of Mathematics and Computer Science, Odense University, DK-5230 Odense M, Denmark

(Received 8 February 1989; accepted 14 March 1989)

We describe how the true statistical error on an average of correlated data can be obtained with ease and efficiency by a renormalization group method. The method is illustrated with numerical and analytical examples, having finite as well as infinite range correlations.

I. INTRODUCTION

Computer simulations of physical systems by Monte Carlo methods or molecular dynamics typically produce raw data in the form of finite time series of correlated data. In the wide class of cases where stationary states are investigated, the first step in the data analysis consists in computing time averages. Since such averages are over finite times, they are fluctuating quantities: another simulation of the same system will typically give a different value for the same quantity. So the next step in the data analysis consists in estimating the variance of finite time averages. A mixed practice has developed around this problem.

A popular estimator for the error on a time average of correlated data is based on the correlation function for these data. There is actually a whole family of such estimators, all being approximations to one of two original estimators. They are reviewed in Sec. III of this paper with some attention paid to the approximations, subjective choices, and computational effort involved. That should make the reader appreciate the alternative, the "blocking," or "bunching," method, described in Sec. IV. In our opinion this method combines maximum rigor with minimum computation and reflection. It involves no approximations or subjective choices, automatically gives the correct answer when it is available from the time series being analyzed, and warns the user when this is not the case. We also give some hopefully illustrative examples, analytical ones (Secs. V and VI) as well as numerical ones (Secs. VII and VIII). In Sec. IX we describe situations in which the blocking method cannot be used. The reader who wants just a recipe for the blocking method needs only read Secs. II, IV, and IX, and a few equations in Sec. III.

The "blocking" method was not invented by us. It is part of the verbal tradition in a part of the simulation community. It may have been invented by K. Wilson.[1] This seems plausible, since it is essentially a real space renormalization group technique applied in the one-dimensional, discrete space of simulation time. The method is briefly described by Whitmer[2] and Gottlieb et al.[3] Recently, we were made aware that the method is unknown in parts of the simulation community, and it is upon request that we describe it in some detail here.

II. NOTATION

Let x_1, x_2, ..., x_n be the result of n consecutive measurements of some fluctuating quantity. Typically the x_i's may be the result of a Monte Carlo simulation in "thermal" equilibrium, i being related to the Monte Carlo time. Or the x_i's may be the result of a molecular dynamics simulation of a system in "equilibrium," i being related to the physical time of the system. Let \langle \cdots \rangle denote the expectation value with respect to the exact, but unknown, probability distribution p(x) according to which x_i is distributed. p does not depend on i, since we are considering a system in equilibrium. Let an overbar denote averages over the set \{x_1, ..., x_n\}:

    \langle f \rangle = \int dx\, p(x)\, f(x),                                   (1)

    \bar{f} = \frac{1}{n} \sum_{i=1}^{n} f(x_i).                                 (2)

\langle \cdots \rangle is good for theoretical considerations; the overbar is something we can compute in practice. We assume ergodicity; hence the ensemble average \langle f \rangle is equal to the time average \lim_{n \to \infty} \bar{f}. In practice we compute finite time averages like Eq. (2), and use them as estimates for ensemble averages like Eq. (1). A finite time average is a fluctuating quantity, so we also need an estimate for the variance of this quantity to have a complete result. To be specific, we estimate the expectation value \mu \equiv \langle x \rangle by the average value

    m = \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i                                 (3)

and need an estimator for the variance of m,

    \sigma^2(m) = \langle m^2 \rangle - \langle m \rangle^2.                     (4)

Inserting Eq. (3) into Eq. (4) we find

    \sigma^2(m) = \frac{1}{n^2} \sum_{i,j=1}^{n} \gamma_{i,j}
                = \frac{1}{n} \Big[ \gamma_0 + 2 \sum_{t=1}^{n-1} \Big( 1 - \frac{t}{n} \Big) \gamma_t \Big],    (5)

where we have introduced the correlation function

    \gamma_{i,j} \equiv \langle x_i x_j \rangle - \langle x_i \rangle \langle x_j \rangle    (6)

and used its invariance under time translations to define

    \gamma_t \equiv \gamma_{i,i+t}.                                              (7)

III. ESTIMATORS BASED ON THE CORRELATION FUNCTION

\sigma^2(m) is often estimated using Eq. (5) with an estimate for \gamma_t in its place. Doing so requires some care and consideration, since the most obvious estimator for \gamma_t,

J. Chem. Phys. 91 (1), 1 July 1989  0021-9606/89/130461-06$02.10  © 1989 American Institute of Physics  461

    c_t = \frac{1}{n-t} \sum_{k=1}^{n-t} (x_k - \bar{x})(x_{k+t} - \bar{x}),     (8)

is a biased estimator; its expectation value is not \gamma_t, but

    \langle c_t \rangle = \gamma_t - \sigma^2(m) + \alpha_t,                     (9)

where

    \alpha_t = 2 \Big( \frac{1}{n} \sum_{i=1}^{n} - \frac{1}{n-t} \sum_{i=1}^{n-t} \Big) \frac{1}{n} \sum_{j=1}^{n} \gamma_{i,j}.    (10)

However, if the largest correlation time in \gamma_t is finite, call it \tau, then Eq. (5) reads

    \sigma^2(m) = \frac{1}{n} \Big[ \gamma_0 + 2 \sum_{t=1}^{T} \Big( 1 - \frac{t}{n} \Big) \gamma_t \Big] + O\Big( \frac{\tau}{n} \exp(-T/\tau) \Big),    (11)

where T is a cutoff parameter in the sum. For \exp(-T/\tau) \ll 1 the explicitly written terms in Eq. (11) clearly give a very good approximation to \sigma^2(m) = O(\tau/n). Furthermore, assuming n \gg \tau,

    \alpha_t = O(t\tau/n^2)  for t \ll \tau,                                     (12a)

growing to

    \alpha_t = O(\tau^2/n^2)  for t \gtrsim \tau.                                (12b)

So we may neglect \alpha_t in Eq. (9), since it is at least a factor \tau/n smaller than the term \sigma^2(m) = O(\tau/n). Doing that, and using Eq. (9) to eliminate \gamma_t from Eq. (5), we find

    \sigma^2(m) \approx \frac{1}{n} \Big[ \langle c_0 \rangle + 2 \sum_{t=1}^{T} \Big( 1 - \frac{t}{n} \Big) \langle c_t \rangle \Big] + \frac{\sigma^2(m)}{n} \Big( 1 + 2T - \frac{T(T+1)}{n} \Big).    (13)

Solving for \sigma^2(m) we find

    \sigma^2(m) \approx \frac{ \Big\langle c_0 + 2 \sum_{t=1}^{T} \big( 1 - \frac{t}{n} \big) c_t \Big\rangle }{ n - 2T - 1 + \frac{T(T+1)}{n} },    (14)

where the "\approx" is due to the truncation of the sum over t and the neglect of \alpha_t. The expression inside the angular brackets in Eq. (14), divided by the denominator, is an estimator for \sigma^2(m). This estimator is unbiased within the approximations done, as Eq. (14) shows.

Notice that with Eq. (14) we have not assumed anything about T's relation to n, except T < n. The truncated sum in Eq. (11) clearly approximates \sigma^2(m) better the larger T is, and is exact for T = n - 1. The same is not true of Eq. (14). For T = n - 1 the denominator vanishes, and the numerator is O(n\alpha_t) = O(\tau^2/n). For T close to n - 1, the right-hand side (RHS) of Eq. (14) is the ratio of small numbers, one of which fluctuates when \langle c_t \rangle is estimated with c_t. This makes the expression inside the angular brackets in Eq. (14) a very bad estimator for \sigma^2(m) for T close to n - 1. If instead one can choose T such that \tau \ll T \ll n, then the approximation in Eq. (11) is good, and so is that in Eq. (14). Thus T should at the same time be chosen much smaller than n and several times larger than the maximal correlation time in \gamma_t, assuming this quantity exists and can be determined at least approximately from c_t. See Sec. VII for an example for which this is the case.

Alternatively one may monitor the estimate for \sigma^2(m) based on Eq. (14) as a function of T, and hope to demonstrate that it is independent of T for T larger than a certain value. This makes evaluation of c_t necessary for all values of t between 0 and T_max, the maximal value for T considered. That requires O(n T_max) numerical operations, and easily becomes a time consuming affair. For this reason one would like to choose T_max small. On the other hand, T_max can in principle only be chosen appropriately after \sigma^2(m) has been inspected. So the choice of T_max cannot be automatized. This lack of automatization, and the computational effort involved, are serious disadvantages of the method just described. In the next section we explain how the blocking method yields the desired result with only O(2n) numerical operations, and in a way that can be fully automatized.

The estimator given by Binder [Ref. 4, Eq. (2.46), and Refs. 5 and 6] is essentially Eq. (14), except the denominator is approximated by n.

Daniell et al. [Ref. 7, Eq. (23)] use the approximation

    \sigma^2(m) \approx \frac{ c_0 + 2 \sum_{t=1}^{T} c_t }{ n - 2T - 1 }.       (15)

One also sees the approximation

    \sigma^2(m) \approx \frac{ c_0 + 2 \sum_{t=1}^{T} c_t }{ n }.                (16)

All these variants of Eq. (14) are equally good when T/n is sufficiently small. There is no reason not to use Eq. (14) itself, though, when any of the formulas are appropriate. It is as easy to compute as any of its approximations.

A variant of Eq. (8) in use is

    \bar{c}_t = \frac{1}{n-t} \sum_{k=1}^{n-t} \Big( x_k - \frac{1}{n-t} \sum_{k'=1}^{n-t} x_{k'} \Big) \Big( x_{k+t} - \frac{1}{n-t} \sum_{k'=1}^{n-t} x_{k'+t} \Big).    (17)

Like Eq. (8), Eq. (17) is a biased estimator for \gamma_t, since

    \langle \bar{c}_t \rangle = \gamma_t - \sigma^2(m) + \bar{\alpha}_t,         (18)

where \bar{\alpha}_t is a quantity analogous to \alpha_t, of order

    \bar{\alpha}_t = O(\tau t^2 / n^3).                                          (19)

Neglecting \bar{\alpha}_t relative to \sigma^2(m) in Eq. (18) leads again to Eqs. (13) and (14). Using Eq. (17) instead of Eq. (8) as estimator for \gamma_t in Eq. (14) is a better approximation when |\sum_{t=1}^{T} (n-t)\bar{\alpha}_t| < |\sum_{t=1}^{T} (n-t)\alpha_t|, i.e., roughly when T^2 < \tau n.



IV. THE "BLOCKING" METHOD

We now describe a way to estimate \sigma^2(m) which is computationally more economical than that of the previous section. It elegantly avoids the computation of c_t and the choice of T. In addition it gives information about the quality of the estimate of \sigma^2(m). The method involves repeated "blocking" of data, and computation of increasing lower bounds for \sigma^2(m), in the following way.

We transform the data set x_1, ..., x_n into half as large a data set x'_1, ..., x'_{n'}, where

    x'_i = \tfrac{1}{2}(x_{2i-1} + x_{2i}),                                      (20)

    n' = \tfrac{1}{2} n.                                                         (21)

We define m' as \bar{x}', the average of the n' "new" data, and have

    m' = m.                                                                      (22)

We also define \gamma'_{i,j} and \gamma'_t as in Eqs. (6) and (7), but from the primed variables x'_i. One easily shows that

    \gamma'_t = \begin{cases} \tfrac{1}{2}\gamma_0 + \tfrac{1}{2}\gamma_1 & \text{for } t = 0 \\ \tfrac{1}{4}\gamma_{2t-1} + \tfrac{1}{2}\gamma_{2t} + \tfrac{1}{4}\gamma_{2t+1} & \text{for } t > 0 \end{cases}    (23)

and that

    \sigma^2(m') = \frac{1}{n'^2} \sum_{i,j=1}^{n'} \gamma'_{i,j} = \sigma^2(m).    (24)

Equations (22) and (24) show that the two quantities we wish to know, m and \sigma^2(m), are invariant under the "blocking" transformation given in Eqs. (20) and (21). Thus, no information we desire is lost in this transformation of the data set to half as large a set. Not only is nothing lost, but something is gained: the value of \sigma^2(m) unravels gradually from \gamma_t by repeated application of the "blocking" transformation. From Eq. (5) we know

    \sigma^2(m) \ge \gamma_0 / n,                                                (25)

and from Eqs. (21) and (23) it is clear that \gamma_0/n increases every time the "blocking" transformation is applied, unless \gamma_1 = 0. In that case \gamma_0/n is invariant.

It is not difficult to show that (\gamma_t/n)_{t=0,1,2,...} \propto (\delta_{t,0})_{t=0,1,2,...} is a fixed point of the linear transformation [Eqs. (21)-(23)], and any vector (\gamma_t/n)_{t=0,1,2,...} for which \gamma_t decreases faster than 1/t is in the basin of attraction of this fixed point. At this fixed point \sigma^2(m) = \gamma_0/n, since \gamma_t = 0 for t > 0. \sigma^2(m) is estimated by using Eqs. (9) or (18) to eliminate \gamma_0 from Eq. (25), using \alpha_0 = \bar{\alpha}_0 = 0, and solving for \sigma^2(m):

    \sigma^2(m) \ge \langle c_0 \rangle / (n - 1),                               (26)

where c_0 is defined in Eq. (8), and the identity is satisfied at the fixed point. Knowing this, one proceeds as follows.

Starting with a data set x_1, ..., x_n, c_0/(n-1) is computed and used as estimate for \langle c_0 \rangle/(n-1). Then the "blocking" transformation Eqs. (20)-(21) is applied to the data set, and c_0/(n'-1) is computed from the blocked data as estimate for \langle c_0 \rangle/(n'-1). This process is repeated until n' = 2. The sequence of values obtained for c_0/(n-1) will increase until the fixed point is reached, whereupon it remains constant within fluctuations. The constant value is our estimate for \sigma^2(m).

At the fixed point the "blocked" variables (x'_i)_{i=1,...,n'} are independent Gaussian variables: Gaussian by the central limit theorem, and independent by virtue of the fixed point value of \gamma'_t. Consequently, we can easily estimate the standard deviation on our estimate c_0/(n'-1) for \sigma^2(m). It is \sqrt{2/(n'-1)}\, c_0/(n'-1):

    \sigma^2(m) \approx \frac{c_0}{n'-1} \pm \sqrt{\frac{2}{n'-1}}\, \frac{c_0}{n'-1},    (27)

    \sigma(m) \approx \sqrt{\frac{c_0}{n'-1}} \Big( 1 \pm \frac{1}{\sqrt{2(n'-1)}} \Big).    (28)

Knowing this error is a great help in determining whether the fixed point has been reached or not in actual calculations, as we shall see in Sec. VII.

If the fixed point is not reached before n' = 2, this is signalled by c_0/(n'-1) not becoming constant. The largest value obtained for c_0/(n'-1) is then a lower bound on \sigma^2(m).

Notice that at no stage in these calculations were (c_t)_{t=0,1,2,...} evaluated, and the estimate for \sigma^2(m) was obtained with O(2n) operations, while computation of (c_t)_{t=0,1,...,T} requires O(Tn) operations.

V. ANALYTICAL EXAMPLE WITH FINITE CORRELATION TIME

Assume x_1, ..., x_n are correlated with one finite correlation time \tau:

    \gamma_t = \begin{cases} \sigma^2(x) & \text{for } t = 0 \\ A e^{-t/\tau} & \text{for } t > 0 \end{cases}.    (29)

The "blocking" transformation maps the four parameters [n, \tau, \sigma^2(x), A] into new values [n', \tau', \sigma^2(x'), A'], where

    n' = \tfrac{1}{2} n,
    \tau' = \tfrac{1}{2} \tau,                                                   (30)
    \sigma^2(x') = \tfrac{1}{2} [ \sigma^2(x) + A e^{-1/\tau} ],
    A' = \tfrac{1}{2} (1 + \cosh \tau^{-1}) A.

From Eq. (5) we get, to leading order in 1/n (we do not have to make this approximation, but do it to keep formulas as simple as possible in this example),

    \sigma^2(m) = \frac{1}{n} \Big[ \sigma^2(x) + \frac{2A}{e^{1/\tau} - 1} \Big],    (31)

which is invariant under the "blocking" transformation (30). Using (1/n)\sigma^2(x) for \langle c_0 \rangle/(n-1) in Eq. (26) and comparing with Eq. (31), we see that Eq. (26) underestimates \sigma^2(m) by an amount \delta, which can be inspected in the present example:

    \delta = \frac{2A}{n (e^{1/\tau} - 1)}.                                      (32)
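The recipe just described is easy to automate. The following Python sketch (our illustration, not part of the original paper; the function name and output format are our own) applies Eqs. (20)-(21) repeatedly and tabulates the lower bound c_0/(n-1) of Eq. (26) together with its error bar from Eq. (27), leaving the user to spot the plateau:

```python
import numpy as np

def blocking(x):
    """Repeatedly apply the "blocking" transformation, Eqs. (20)-(21),
    and at each level record (n, estimate, error bar), where the
    estimate is the lower bound c0/(n - 1) on sigma^2(m) of Eq. (26)
    and the error bar follows from Eq. (27)."""
    x = np.asarray(x, dtype=float)
    levels = []
    while len(x) >= 2:
        n = len(x)
        c0 = np.mean((x - x.mean()) ** 2)      # c_0 of Eq. (8)
        est = c0 / (n - 1)                     # lower bound, Eq. (26)
        err = est * np.sqrt(2.0 / (n - 1))     # error bar, Eq. (27)
        levels.append((n, est, err))
        if n % 2:                              # if n is odd, drop one point,
            x = x[:-1]                         # as done in Sec. VIII
        x = 0.5 * (x[0::2] + x[1::2])          # Eq. (20); n' = n/2, Eq. (21)
    return levels
```

Printing the returned triples reproduces the kind of table shown in Table I of Sec. VIII; the plateau value of the estimates is the estimate of \sigma^2(m).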




From (30) it follows that "blocking" gives

    \delta' = \tfrac{1}{2} (1 + e^{-1/\tau}) \delta,                             (33)

from which we see that "blocking" makes \langle c_0 \rangle/(n-1) grow to \sigma^2(m) essentially in geometric progression. This is what one might have expected, knowing \tau' = \tau/2. However, this rate of convergence does not depend on \tau being finite, as the example in the next section shows. It does not express a property of the time series being analyzed, but is due to the efficiency of the "blocking" algorithm.

VI. ANALYTICAL EXAMPLE WITH INFINITE CORRELATION TIME

Assume x_1, ..., x_n are correlated with infinite correlation time, i.e., the correlation function decreases as a power of the difference in time:

    \gamma_t = \begin{cases} \sigma^2(x) & \text{for } t = 0 \\ A / t^a & \text{for } t > 0 \end{cases}.    (34)

Then Eq. (5) gives, to leading order in 1/n,

    \sigma^2(m) = \frac{1}{n} [ \sigma^2(x) + 2 A \zeta(a) ],                    (35)

where \zeta(a) is Riemann's zeta function:

    \zeta(a) = \sum_{t=1}^{\infty} \frac{1}{t^a}, \quad \mathrm{Re}\, a > 1.     (36)

We see that if a < 1 the data are too correlated to give a finite value for \sigma^2(m). The "blocking" transformation Eqs. (20)-(23) does not leave the form of the correlation function (34) invariant, since

    \gamma'_t = \begin{cases} \tfrac{1}{2} [ \sigma^2(x) + A ] & \text{for } t = 0 \\ \frac{A}{(2t)^a} \Big( 1 + \frac{a(1+a)}{16 t^2} + O(t^{-4}) \Big) & \text{for } t > 0 \end{cases}.    (37)

So this example is not entirely analytically tractable. However, in the most interesting situation, where a is not much larger than 1, \sigma^2(m) in Eq. (5) receives a dominating contribution from \gamma_t with t large, as Eqs. (35) and (36) show. For large t, \gamma_t is well approximated by the first term on the RHS of Eq. (37). This term is form invariant under the "blocking" transformation, giving

    n' = \tfrac{1}{2} n,
    a' = a,                                                                      (38)
    \sigma^2(x') = \tfrac{1}{2} [ \sigma^2(x) + A ]  (not reliable),
    A' = 2^{-a} A.

As one would expect, Eq. (38) shows that "blocking" leaves the infinite correlation time infinite, and the power law unchanged, while the amplitude A is scaled in accordance with the power law.

Equation (38) is based upon an approximation which improves with distance t, so its relation between \sigma^2(x') and \sigma^2(x) (quantities defined at zero distance) is not reliable. The relations for a and A are more reliable, and they are all we need. Comparing Eq. (35) with (25), remembering \gamma_0 = \sigma^2(x), we see that \langle c_0 \rangle/(n-1) underestimates \sigma^2(m) by an amount \Delta, for which we have the explicit, approximate result

    \Delta = \frac{2 A \zeta(a)}{n}.                                             (39)

From Eq. (38) it then follows that the "blocking" transformation gives

    \Delta' = 2^{1-a} \Delta,                                                    (40)

i.e., \Delta vanishes geometrically when we block transform, even in this case of correlations with infinite range.

VII. NUMERICAL EXAMPLE FROM MONTE CARLO SIMULATION

We have simulated the two-dimensional Ising model on a 20 x 20 lattice at inverse temperature \beta = 0.30, using the heat bath algorithm and checkerboard update. After a hot start and 1000 thermalization sweeps we made 131 072 (= 2^17) measurements of the instantaneous magnetization, with consecutive measurements separated by one sweep. With x_i denoting the instantaneous magnetization and n = 131 072, Eq. (3) gave m = -0.0011 for the magnetization.

We chose to measure the magnetization of this system to have a transparent example well known to most readers. The magnetization is easy to measure and even easier to discuss, since we know its exact value is zero. Consequently our numerical estimate for the magnetization should differ from zero only by an amount that is justified by its error. Furthermore, with \beta = 0.30 the correlation time is short, and any method discussed above will work, so different methods can be compared easily.

We computed c_t as defined in Eq. (8). Figure 1 is a semilog plot of c_t vs t. For t <= 30 we see a straight line, signaling a decrease of c_t with a single correlation time \tau that we read off the plot to be \tau = 5.1. We also read off c_0 = 0.022. It will be self-consistently correct to neglect \sigma^2(m) in Eq. (9), and therefore to use c_t as estimator for \gamma_t. When Eq. (31) is used with A = \sigma^2(x) = 0.022, we find that \sigma(m) = 0.0013.

FIG. 1. c_t vs t. c_t was computed from n = 2^17 measurements of the magnetization (x_i)_{i=1,...,n} using the defining Eq. (8). For 0 <= t <= 16, c_t = 0.022 exp(-t/5.1).
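The Ising data themselves are not reproduced here, but the behavior described in this section is easy to mimic with synthetic data. In the Python sketch below (our illustration, not from the paper) we generate a Gaussian AR(1) series with \gamma_t = \sigma^2(x) e^{-t/\tau} for \tau = 5, so that Eq. (31) gives the exact \sigma^2(m), and watch repeated blocking climb to it:

```python
import numpy as np

rng = np.random.default_rng(1)
n, tau = 2**17, 5.0
phi = np.exp(-1.0 / tau)          # AR(1) coefficient: gamma_t = e^(-t/tau)

# Generate a stationary Gaussian AR(1) series with sigma^2(x) = 1.
x = np.empty(n)
x[0] = rng.standard_normal()
eta = rng.standard_normal(n) * np.sqrt(1.0 - phi**2)
for i in range(1, n):
    x[i] = phi * x[i - 1] + eta[i]

# Exact variance of the mean from Eq. (31) with A = sigma^2(x) = 1;
# note that 1/(e^(1/tau) - 1) = phi/(1 - phi).
exact = (1.0 + 2.0 * phi / (1.0 - phi)) / n

# Repeated blocking, Eqs. (20)-(21), recording c0/(n'-1) of Eq. (26).
estimates = []
y = x
while len(y) >= 2:
    estimates.append(np.mean((y - y.mean()) ** 2) / (len(y) - 1))
    y = 0.5 * (y[0::2] + y[1::2])
```

The first entries sit far below `exact`, and the sequence rises and levels off near `exact` after a handful of transformations: the plateau seen in Fig. 3.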




Figure 2 shows estimates for \sigma(m) obtained from the same time series, using the expression inside the angular brackets in Eq. (14) with increasing cutoff T as abscissa. For 15 <= T <= 300 we see that \sigma(m) = 0.0012, independent of T. This constancy in T is a convincing signal that the value found for \sigma(m) really is its true value. For T > 300 the values for \sigma(m) become increasingly noisy. This is because c_t for t >> \tau is an essentially random number, and when T is increased more such numbers are included in Eq. (14), giving rise to increasing noise.

Figure 3 shows estimates for \sigma(m) obtained from the same time series and its block transformed series defined in Eqs. (20) and (21). Equation (28) has been used. After approximately 6 block transformations \sigma(m) reaches a value of 0.0012, where it remains (within error bars) during further block transformations. The distinct plateau seen in this plot of \sigma(m) vs the number of block transformations applied is a fully convincing signal that the fixed point for the block transformation has been reached, and \sigma(m) = 0.0012 is the true standard deviation on m. This value agrees with the previous estimate for it, and differs little from the estimate based on \tau read off Fig. 1. It also makes our estimate m = -0.0011 for the magnetization differ by less than one standard deviation from zero.

Now let us make a more detailed comparison of the results obtained with the two methods: Fig. 4 shows Figs. 2 and 3 plotted on top of each other. The abscissae of the points from Fig. 3 have been chosen such that T = \tfrac{1}{2}(2^{\#} - 1), where \# is the number of block transformations applied. At this T value the number of pair correlations x_t, x_{t'} taken into account by the estimate for \sigma(m) based on c_t is equal to the number of such pair correlations taken into account by the estimate for \sigma(m) obtained with the blocking method. In Fig. 4 it is very clear that the two methods give the same estimate for \sigma(m). The length of the plateau giving this estimate is also the same for both methods. And the noise on the sequence of estimates obtained with one method stays within the error bars on the estimates obtained with the other method. So as far as the quality of results is concerned, both methods are equal, as they should be, since they both extract all relevant information from the time series. There is a great difference in the computational efficiency with which this is done, however. The data plotted in Fig. 2 were obtained with O(n T_max) = O(10^10) arithmetical operations, whereas those in Fig. 3 required only O(n) = O(10^5) operations. T_max, which is the efficiency ratio between the two methods, is 10^5 here only because we wanted to show the reader the noise in Fig. 2. Just the same, we could not have chosen T_max much less than 10^2 if we want to see the plateau in the estimate for \sigma(m). Only if we know that \tau \approx 5 can we legitimately choose T_max as low as \approx 20 in Eq. (14). On the other hand, to obtain this knowledge we have to compute c_t at least for 0 <= t <= max(\tau, 3), which requires O(n max(\tau, 3)) operations. In conclusion, there is no way to obtain \sigma(m) which is more efficient than the blocking method; not even the crude method consisting in reading c_0 \approx \sigma^2(x) and \tau off Fig. 1 and using them in Eq. (31) with A = \sigma^2(x).

Another difference between Figs. 2 and 3 is the lack of error bars in Fig. 2. Only if we are willing to assume that (x_i)_{i=1,...,n} are Gaussian variables with correlation matrix \gamma_{i,j} [compare Eq. (6)] can these error bars be estimated. (Doing this is more than an exercise, we warn the reader.)

FIG. 2. Estimates for \sigma(m) using Eq. (14) vs cutoff T. For 15 <= T <= 400, \sigma(m) \approx 0.0012, independent of T.

FIG. 3. Estimate for \sigma(m) obtained with the blocking method. After about 6 block transformations the estimates remain constant within error bars at 0.0012.

FIG. 4. Figures 2 and 3 plotted on top of each other. For reasons given in the text, the abscissae of the points from Fig. 3 have been chosen such that T = \tfrac{1}{2}(2^{\#} - 1), where \# is the number of block transformations applied.
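The comparison of Fig. 4 can be mimicked with synthetic data. The Python sketch below (our illustration; the series is a Gaussian AR(1) process with \tau = 5, not the Ising data of the text) computes \sigma^2(m) once with the correlogram estimator of Eq. (14) at a cutoff T of several \tau, and once by blocking to a well-decorrelated level; the two agree within their error bars:

```python
import numpy as np

rng = np.random.default_rng(2)
n, tau = 2**15, 5.0
phi = np.exp(-1.0 / tau)

# Synthetic correlated series (Gaussian AR(1), sigma^2(x) = 1).
x = np.empty(n)
x[0] = rng.standard_normal()
eta = rng.standard_normal(n) * np.sqrt(1.0 - phi**2)
for i in range(1, n):
    x[i] = phi * x[i - 1] + eta[i]

# Correlogram estimate of sigma^2(m), Eq. (14), with cutoff T of a few
# correlation times; this costs O(nT) operations.
T = 30
xbar = x.mean()
c = [np.dot(x[:n - t] - xbar, x[t:] - xbar) / (n - t) for t in range(T + 1)]
num = c[0] + 2.0 * sum((1.0 - t / n) * c[t] for t in range(1, T + 1))
sigma2_corr = num / (n - 2 * T - 1 + T * (T + 1) / n)

# Blocking estimate at a well-decorrelated level; O(2n) operations total.
y = x
for _ in range(8):                     # 8 transformations: tau' ~ 0.02
    y = 0.5 * (y[0::2] + y[1::2])
sigma2_block = np.mean((y - y.mean()) ** 2) / (len(y) - 1)
```

As in Fig. 4, the two estimates coincide within their statistical scatter, but the blocking result costs a fraction of the operations.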




The error bars in Fig. 3, on the other hand, are rigorous and independent of assumptions.

VIII. NUMERICAL EXAMPLE FROM MOLECULAR DYNAMICS SIMULATION

The blocking method was tested on the results of a molecular dynamics run of 50 000 time steps of the Stockmayer fluid, using \mu^* = 1.5, T^* = 1.35, \rho^* = 0.8, N = 108, where the symbols have their usual meaning.[8] This is an example with very long time correlations. The Ewald summation was used to evaluate the dipolar interactions in periodic boundary conditions, using a value of \epsilon' = \infty for the dielectric constant of the surroundings.[8,9] The same state point has been studied by Pollock and Alder,[8] and our result for the Kirkwood factor g agrees with theirs. Their result for g was

    g = \frac{1}{N \mu^2} \Big\langle \Big( \sum_{i=1}^{N} \mu_i \Big)^2 \Big\rangle = 3.5 \pm 0.2,    (41)

although they give no details of how their error estimate was arrived at.

The block transformation Eqs. (20)-(22) was used to estimate the error in g. Where n was odd, one of the n measures of g was discarded before performing the next block transformation. The result of doing this is shown in Table I. \sigma[c_0/(n-1)] is calculated from Eq. (27). A plateau in c_0/(n-1) is observed between 9 and 12 block transformations, and from this we find \sigma(g) \approx \sqrt{0.024} \approx 0.15. In the last few rows of the table there are so few blocks that the error on c_0/(n-1) is close to the value of c_0/(n-1) itself. In Ref. 11 the blocking method was used to obtain error estimates for the Kirkwood factor in other simulations.

TABLE I. Results of repeated application of the block transformation to a time series from a molecular dynamics simulation of the Stockmayer fluid.

    Number of block
    transformations        n       c_0/(n-1)       \sigma[c_0/(n-1)]
    0                 50 000      1.4 x 10^-4
    1                 25 000      2.9 x 10^-4
    2                 12 500      5.8 x 10^-4
    3                  6 250      1.1 x 10^-3
    4                  3 125      2.2 x 10^-3
    5                  1 562      4.4 x 10^-3
    6                    781      8.1 x 10^-3
    7                    390      1.3 x 10^-2
    8                    195      1.8 x 10^-2      0.2 x 10^-2
    9                     97      2.1 x 10^-2      0.3 x 10^-2
    10                    48      2.4 x 10^-2      0.5 x 10^-2
    11                    24      2.0 x 10^-2      0.6 x 10^-2
    12                    12      2.0 x 10^-2      0.9 x 10^-2
    13                     6      1.6 x 10^-2      1.0 x 10^-2
    14                     3      1.2 x 10^-2      1.2 x 10^-2

IX. CONCLUSION

We hope the reader to whom the blocking method was new has learned enough from our presentation to apply it with confidence. The brevity of the literature on the method inspires the vain hope that even workers using the method might have learned a little. Our main message was that no other method is comparable with the blocking method in computational and intellectual economy.

We should add that there are situations in which the method does not apply in the form presented here. Suppose, for example, we compute a correlation length from a correlation function. This involves taking the logarithm of the correlation function. To do that, our estimate for the correlation function must be positive. To ensure the positivity, we first average the correlation function over simulation time, and then take the logarithm. In contrast, application of the blocking method to the correlation length requires first taking the logarithm to obtain a time series for the correlation length, and then doing the averaging and blocking. This procedure is clearly not possible when fluctuations are large enough to give the instantaneous correlation function negative values. In general, this problem may occur whenever one wants to determine a function of a fluctuating quantity, and the function is not defined for the entire range of values that the fluctuations may cover. A rather straightforward procedure based on the jackknife method[10] makes blocking possible also in such situations.[1,3]

ACKNOWLEDGMENTS

We thank Claus Jeppesen for a critical reading of the manuscript. H. G. P. thanks SARA at the University of Amsterdam for generous allocation of computer time on their Cyber 205, and Dr. S. W. de Leeuw and the Department of Physical Chemistry for their hospitality. The financial assistance of the Danish Research Academy is also gratefully acknowledged.

[1] K. Wilson, in Recent Developments in Gauge Theories, Cargese, 1979, edited by G. 't Hooft et al. (Plenum, New York, 1980).
[2] C. Whitmer, Phys. Rev. D 29, 306 (1984).
[3] S. Gottlieb, P. B. Mackenzie, H. B. Thacker, and D. Weingarten, Nucl. Phys. B 263, 704 (1986).
[4] K. Binder, in Phase Transitions and Critical Phenomena, edited by C. Domb and M. S. Green (Academic, New York, 1976), Vol. 5.
[5] K. Binder, in Monte Carlo Methods in Statistical Physics, edited by K. Binder, Topics in Current Physics, Vol. 7 (Springer, New York, 1979).
[6] K. Binder and D. Stauffer, in Applications of the Monte Carlo Method in Statistical Physics, edited by K. Binder, Topics in Current Physics, Vol. 36 (Springer, New York, 1984).
[7] G. J. Daniell, A. J. G. Hey, and J. E. Mandula, Phys. Rev. D 30, 2230 (1984).
[8] E. L. Pollock and B. J. Alder, Physica A 102, 1 (1980).
[9] S. W. de Leeuw, J. W. Perram, and E. R. Smith, Proc. R. Soc. London Ser. A 373, 27 (1980).
[10] B. Efron, SIAM Rev. 21, 460 (1979).
[11] H. G. Petersen, S. W. de Leeuw, and J. W. Perram, Mol. Phys. (in press).

