
3

Graphical and Statistical Methods

In this chapter a variety of graphical and statistical methods will be
described that can be used to aid in the visual analysis of single case
data. These include graphical methods for representing phase means,
trends, and variability. Combined graphical and statistical methodologies
for making inferences about change are also described.

GRAPHICAL AND STATISTICAL METHODS OF REPRESENTATION
There are a variety of graphical and statistical methods that can be used
to represent various characteristics of single case design phase data. The
ones considered here are phase mean and median lines, and methods for
representing phase trend, variability, and overlap.

Phase Mean and Median Lines


A simple way of graphically representing the mean level of the dependent
variable in a single case design phase is to draw a line across a phase that
marks the mean of the data in the phase. The phase mean will be
unaffected by autocorrelation (Ostrom, 1990). Phase means were shown
in various figures in earlier chapters. For example, in Figure 3.1, the means

78 Analyzing Single System Design Data

[Figure 3.1 graph: phases A and B; y-axis, percentage of stores willing to sell tobacco to minors (0 to 100); x-axis, time; Prineville data with a phase mean line in each phase.]

Figure 3.1 Prineville single case design data, from Biglan, Ary, and Wagenaar
(2000), with phase mean lines drawn in. Reproduced with kind permission from
Springer Science+Business Media. Biglan, A., Ary, D., & Wagenaar, A. (2000).
The value of interrupted time-series experiments for community intervention
research. Prevention Science, 1(1), 38, Figure 3.

of the data in the phases of the single case design study of Prineville, from
the Biglan, Ary, and Wagenaar (2000) study, are drawn with dashed lines.
Lines can also be drawn representing the median of the data in the single
case phases, as shown in Figure 3.2. The median line may be preferable to
the mean in cases in which there are extreme values of the dependent
variable relative to other values in the phase since the median is less
sensitive to extreme values (Pagano, 2006). In the case of the Prineville
data, there is little difference between the mean and median lines.
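
As a minimal sketch of how the heights of the two lines differ when a phase contains an extreme value, the following computes both for one phase. The function name `phase_levels` and the data are hypothetical and are not taken from the Prineville study.

```python
from statistics import mean, median

def phase_levels(phase):
    """Return (mean, median) of one phase; each is the height of a horizontal line."""
    return mean(phase), median(phase)

# Hypothetical baseline observations; the last value is an extreme score.
baseline = [60, 62, 58, 61, 59, 100]

phase_mean, phase_median = phase_levels(baseline)
print(f"mean line at {phase_mean:.1f}, median line at {phase_median:.1f}")
```

In this illustration the single extreme score pulls the mean line upward, while the median line stays close to the bulk of the observations.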

[Figure 3.2 graph: phases A and B; y-axis, percentage of stores willing to sell tobacco to minors (0 to 100); x-axis, time; Prineville data with a phase median line in each phase.]

Figure 3.2 Prineville single case design data, from Biglan, Ary, and Wagenaar
(2000), with phase median lines drawn in.

Phase Linear Trend


There are at least three approaches to representing linear trend across a
phase: the use of an ordinary least squares (OLS) regression line, a method
described by Nugent (2000) for hand-drawing a linear trend line, and a
technique called the celeration line technique, or sometimes the
split-middle technique (Bloom, Fischer, & Orme, 2008).

OLS Trend Lines. Least squares regression methods may be used to
compute a phase linear trend equation and then to represent this trend
graphically by drawing an OLS trend line, or arrow, through the phase.
This estimate will be unbiased even in the presence of autocorrelation
(Ostrom, 1990). An OLS trend line is shown in Figure 3.3. This line can
be placed simply by computing an OLS regression linear trend equation
for the phase data and then using this equation to plot the linear
trend line.
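
The computation can be sketched as follows. This is an illustrative implementation of the usual least squares formulas for a single predictor (time); the function name `ols_trend` and the data are hypothetical.

```python
def ols_trend(times, values):
    """Return (intercept, slope) of the least squares line value = intercept + slope * time."""
    n = len(values)
    t_bar = sum(times) / n
    y_bar = sum(values) / n
    sxy = sum((t - t_bar) * (y - y_bar) for t, y in zip(times, values))
    sxx = sum((t - t_bar) ** 2 for t in times)
    slope = sxy / sxx
    intercept = y_bar - slope * t_bar
    return intercept, slope

# Hypothetical phase data observed at times 1..5.
times = [1, 2, 3, 4, 5]
values = [2, 4, 5, 4, 5]
intercept, slope = ols_trend(times, values)

# Two points on the fitted line are enough to draw it across the phase.
line_start = intercept + slope * times[0]
line_end = intercept + slope * times[-1]
```

Plotting a straight segment from `line_start` at the first time point to `line_end` at the last gives the OLS trend line for the phase.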

Mean Trend Line. Nugent (2000) described a procedure for drawing in
the mean trend line for a single case design phase, and showed that the
trend line created by this method is the mean phase linear trend weighted
by length of time interval. In this procedure, the trend
across a longer interval of time is weighted more than the trends across
[Figure 3.3 graph: phases A and B; y-axis, percentage of stores willing to sell tobacco to minors (0 to 100); x-axis, time; Prineville data with an OLS trend line in each phase.]

Figure 3.3 Graphic representation of phase trends using ordinary least squares
(OLS) regression lines for Prineville data from Biglan, Ary, and Wagenaar
(2000).

shorter intervals. This method is simple, and involves nothing more than
starting the trend line at the first data point in the phase and then
drawing a line or arrow through the final phase data point. In a variation
on this method, the mean trend line so created is "slid" until it bisects the
phase data points such that the same numbers of points are above and
below the trend line. This method is analogous to the hand-drawn phase
median line using the method of Ma (2006), and may create a trend line
facilitating a better representation of phase variability than the Nugent
(2000) method, as discussed later. These methods are illustrated in
Figure 3.4 for data from Biglan et al. (2000).
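
One way to see why a line through the first and last data points gives a time-weighted mean trend is the short derivation below. The notation is an assumption made for this sketch: the phase observations are Y_1, ..., Y_n at times t_1 < ... < t_n, b_i is the slope between successive points, and w_i is the length of the corresponding time interval.

```latex
\[
b_i = \frac{Y_{i+1} - Y_i}{t_{i+1} - t_i}, \qquad w_i = t_{i+1} - t_i, \qquad i = 1, \dots, n-1
\]
\[
\bar{b}_w = \frac{\sum_{i=1}^{n-1} w_i b_i}{\sum_{i=1}^{n-1} w_i}
          = \frac{\sum_{i=1}^{n-1} (Y_{i+1} - Y_i)}{t_n - t_1}
          = \frac{Y_n - Y_1}{t_n - t_1}
\]
```

The sums telescope, so the weighted mean of the successive slopes equals the slope of the line joining the first and final data points.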

Celeration Line. This procedure will be illustrated by drawing in a
celeration line for the baseline data from the Prineville community
data. The first step is to divide the baseline data into two halves, as
shown in Figure 3.5. This midpoint line divides the baseline phase into
two equal parts. If there had been nine data points instead of eight, the
midpoint line would have gone through the fifth data point. In cases in
which the number of baseline observations is odd, the midpoint line will
go through the data point that has an equal number of data points on
either side of it. In cases in which the number of data points is even, such
as in Figure 3.5, the midpoint line will divide the phase into two sections
with equal numbers of data points.

[Figure 3.4 graph: phases A and B; y-axis, percentage of stores willing to sell tobacco to minors (0 to 100); x-axis, time; Prineville data with weighted mean trend lines.]

Figure 3.4 Weighted mean trend lines drawn in Prineville data from Biglan, Ary,
and Wagenaar (2000). The dashed arrow is the translated trend line bisecting
data points in phase.

[Figure 3.5 graph: phases A and B; y-axis, percentage of stores willing to sell tobacco to minors (0 to 100); x-axis, time; Prineville data with the baseline phase midpoint line marked.]

Figure 3.5 Baseline phase midpoint line drawn in to Prineville data from Biglan,
Ary, and Wagenaar (2000).

The next step is to draw in lines that further subdivide the phase
into quarters, as in Figure 3.6. There will now be lines at the first,
second, and third quarter points in the baseline phase, as can be seen in
Figure 3.6.
The next step is to compute the mean score for the first half of the
phase and mark that point on the first quarter line, and then to compute
the mean score for the second half of the phase and mark that point on the
[Figure 3.6 graph: phases A and B; y-axis, percentage of stores willing to sell tobacco to minors (0 to 100); x-axis, time; Prineville data with the first quarter line, phase midpoint line, and third quarter line marked in the baseline phase.]

Figure 3.6 First, second, and third quarter lines for baseline phase from the
Prineville community data from Biglan, Ary, and Wagenaar (2000).

[Figure 3.7 graph: phases A and B; y-axis, percentage of stores willing to sell tobacco to minors (0 to 100); x-axis, time; Prineville data with the first quarter line, phase midpoint, and third quarter line, the mean of the first half and the mean of the second half of the baseline phase, and the celeration line for baseline.]

Figure 3.7 Celeration line for baseline phase data from Prineville community from
Biglan, Ary, and Wagenaar (2000) study.

third quarter line. Finally, connect these points on the first and third
quarter lines with a line. This line is the celeration line and represents the
linear trend in the data across the phase. The celeration line for the
baseline phase in the Prineville data is shown in Figure 3.7. The trend
representations based on all four methods are shown in Figure 3.8 for the
Prineville data from Biglan et al. (2000).
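
The celeration line steps described above can be sketched in code. This is a minimal illustration that assumes equally spaced observations at times 1, 2, ..., n; the function name `celeration_line`, the treatment of the middle point when n is odd, and the data are assumptions of the sketch.

```python
from statistics import mean

def celeration_line(values):
    """Return (slope, intercept) of the celeration line for one phase.

    Observations are assumed to sit at times 1, 2, ..., n. With an odd n
    the middle point lies on the midpoint line and is left out of both
    halves (an assumption about how to handle that case).
    """
    n = len(values)
    half = n // 2
    first, second = values[:half], values[n - half:]
    # Quarter lines sit at the middle of each half on the time axis.
    t_first = (1 + half) / 2
    t_second = (n - half + 1 + n) / 2
    y_first, y_second = mean(first), mean(second)
    slope = (y_second - y_first) / (t_second - t_first)
    intercept = y_first - slope * t_first
    return slope, intercept

# Hypothetical baseline with eight observations.
baseline = [50, 60, 55, 65, 60, 70, 75, 65]
slope, intercept = celeration_line(baseline)
```

With eight points the quarter lines fall at times 2.5 and 6.5, and the line joins the half means marked on them.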

Representation of Variability
There are a number of ways to represent the variability of the dependent
variable across a single case design phase. One is to represent the range of
values of the dependent variable using lines marking the upper and lower
limits of the range of values across the phase, as shown in Figure 3.9. The
dashed horizontal lines mark the upper and lower limits of the range of
values of the dependent variable across the phase, while the difference
between these lines represents the overall range of values of the
dependent variable across the phase.
[Figure 3.8 graph: phases A and B; y-axis, percentage of stores willing to sell tobacco to minors (0 to 100); x-axis, time; Prineville data with four baseline trend lines.]

Figure 3.8 Four linear trend representations for Prineville baseline data from
Biglan, Ary, and Wagenaar (2000). Upper dashed arrow is based on Nugent
(2000) method. Solid line is ordinary least squares (OLS) regression trend line.
Solid arrow is celeration line, while lower dashed arrow is Nugent method mean
trend line "slid" so as to bisect phase data points.

[Figure 3.9 graph: phases A and B; y-axis, percentage of stores willing to sell tobacco to minors (0 to 120); x-axis, time; Prineville data with lines marking the upper limit and the lower limit of the range of values of the dependent variable across baseline, and the range between them.]

Figure 3.9 Illustration of use of lines marking upper and lower limits of range of
values of dependent variable during a baseline phase using Prineville data from
Biglan, Ary, and Wagenaar (2000) study.


Lines can also be used to represent such measures of dispersion
across a phase as the standard deviation and the interquartile range.
Methods for computing these statistics can be found in such sources as
Pagano (2006). Examples of standard deviation and interquartile
range lines are shown in Figures 3.10 and 3.11 for the
Prineville data from Biglan et al. (2000). Lines marking one standard
deviation above and below the phase mean are shown in Figure 3.10.
Lines marking two standard deviations above and below the phase mean
line have been advocated as a part of a procedure for making inferences
about change (Nourbakhsh & Ottenbacher, 1994).
Lines marking the first and third quartile scores for the phase are
shown in Figure 3.11, along with a line marking the median (second
quartile) score. The distance between the first and third quartile scores
marks the interquartile range for phase observations. One advantage of
using the median and interquartile range to represent characteristics of
single case design phase data is that these statistics are less sensitive to
extreme scores than are the mean and the standard deviation (Pagano,
2006). Hence, the median and interquartile range may be more robust
for representing the central tendency and variability, respectively, of the
data in a single case design phase.
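
A minimal sketch of how the heights of these lines could be computed is given below. The function name `dispersion_lines` and the data are hypothetical; the sample (n − 1) standard deviation and the "inclusive" quartile rule from Python's `statistics` module are assumptions, since several quartile conventions exist and they can give slightly different values.

```python
from statistics import mean, quantiles, stdev

def dispersion_lines(phase):
    """Return the heights of the SD and quartile lines for one phase."""
    m, s = mean(phase), stdev(phase)  # stdev uses the n - 1 denominator
    q1, q2, q3 = quantiles(phase, n=4, method="inclusive")
    return {
        "mean": m,
        "sd_lower": m - s,
        "sd_upper": m + s,
        "q1": q1,
        "median": q2,
        "q3": q3,
        "iqr": q3 - q1,
    }

# Hypothetical phase observations.
phase = [1, 2, 3, 4, 5, 6, 7, 8, 9]
lines = dispersion_lines(phase)
```

Each entry is the height of a horizontal line across the phase; `iqr` is the vertical distance between the first and third quartile lines.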

[Figure 3.10 graph: phases A and B; y-axis, percentage of stores willing to sell tobacco to minors (0 to 120); x-axis, time; Prineville data with the phase mean line and lines one standard deviation above and below the phase mean.]

Figure 3.10 Illustration of lines marking one standard deviation above and below
the phase mean lines in Prineville data from Biglan, Ary, and Wagenaar (2000).

[Figure 3.11 graph: phases A and B; y-axis, percentage of stores willing to sell tobacco to minors (0 to 120); x-axis, time; Prineville data with the first quartile line, median line, and third quartile line, and the interquartile range between the quartile lines.]

Figure 3.11 Illustration of graphic representation of interquartile range using
Prineville data from Biglan, Ary, and Wagenaar (2000).

Another method of representing variability is the average, or mean,
moving range, a method used in statistical process control (Orme & Cox,
2001; Pfadt & Wheeler, 1995; Wheeler, 1995). The moving range is
defined as the absolute value of the difference between two successive
data points in a single case design phase, $|Y_t - Y_{t+1}|$, where $Y_t$ is the
phase observation at time t and $Y_{t+1}$ is the observation at time t + 1. The
mean moving range, mR, is the mean of these moving range values. For
example, suppose that the values of the observations in a phase are 4, 6, 2,
7, and 1. The moving range for the first two data points is |4 − 6| = 2; for
the next two data points, 4; for the next two data points, 5; and for the
last two data points, 6. Hence, the mean moving range is,

\[ mR = \frac{2 + 4 + 5 + 6}{4} = 4.25. \]

Note that if there are n data points in a phase, there will be n–1 moving
range values for the phase.
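
The same calculation can be sketched in a few lines of code, using the five observations from the example above. The function names are hypothetical.

```python
def moving_ranges(values):
    """Absolute differences between successive observations (n - 1 values)."""
    return [abs(a - b) for a, b in zip(values, values[1:])]

def mean_moving_range(values):
    """Mean of the moving range values for one phase."""
    ranges = moving_ranges(values)
    return sum(ranges) / len(ranges)

phase = [4, 6, 2, 7, 1]  # observations from the example in the text
print(moving_ranges(phase))      # [2, 4, 5, 6]
print(mean_moving_range(phase))  # 4.25
```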
The mean moving range can be used to compute the sigma unit, an
index that can also be used to represent variability (Orme & Cox, 2001;
Pfadt & Wheeler, 1995; Wheeler, 1995). Following Wheeler and
Chambers (1992), three sigma units will be given by 2.66 × mR; two
sigma units by (2/3) × 2.66 × mR; and one sigma unit by (2.66/3) × mR.
One, two, and three sigma lines can be placed above and below the phase
mean line to represent variability. A method by which one, two, and
three sigma bands can be placed around a phase linear trend line is
illustrated later.
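
As a rough sketch, the sigma units and the corresponding lines around a phase mean can be computed as follows. The function names are hypothetical; the input values are the mean (4) and mean moving range (4.25) of the phase 4, 6, 2, 7, 1 used in the moving range example.

```python
def sigma_units(mean_mr):
    """Return (one, two, three) sigma units from the mean moving range."""
    three = 2.66 * mean_mr
    return three / 3, 2 * three / 3, three

def sigma_lines(phase_mean, mean_mr):
    """Heights of the lines k sigma units below and above the phase mean."""
    return {k: (phase_mean - s, phase_mean + s)
            for k, s in enumerate(sigma_units(mean_mr), start=1)}

# Phase 4, 6, 2, 7, 1: mean 4, mean moving range 4.25.
lines = sigma_lines(4.0, 4.25)
```

`lines[k]` holds the lower and upper heights of the k sigma lines, so `lines[3]` gives the outermost pair.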

Representing Background Variability


Background Variability Relative to Mean. Single case design methodologists
who have advocated the visual analysis of single case design data have
argued that treatment phase data should be contrasted against the
"background variability" of the baseline phase data (Barlow, Hayes, & Nelson,
1984; Kazdin, 1982; Parsonson & Baer, 1986). Nourbakhsh and
Ottenbacher (1994) operationalized this approach by placing lines marking
two standard deviations above and below the phase mean line, as illustrated
in their Figure 2 (p. 774) and outlined in Table 1 (p. 771) in their 1994
article. A similar method for representing baseline phase background
variability is adapted from statistical process control and involves placing one,
two, and three sigma bands above and below a baseline phase mean line
extended into the treatment phase (Orme & Cox, 2001; Pfadt & Wheeler,
1995). This methodology is described and illustrated below.
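
A minimal sketch of the band construction is shown below: it places limits two standard deviations above and below the baseline mean and lists the treatment phase points that fall outside them. The function names and data are hypothetical, the sample (n − 1) standard deviation is an assumption, and the decision rule applied to the points outside the band is not covered here.

```python
from statistics import mean, stdev

def two_sd_band(baseline):
    """Return (lower, upper) limits two SDs below and above the baseline mean."""
    m, s = mean(baseline), stdev(baseline)  # sample SD, n - 1 denominator
    return m - 2 * s, m + 2 * s

def points_outside(band, treatment):
    """Treatment phase observations lying outside the extended band."""
    lower, upper = band
    return [y for y in treatment if y < lower or y > upper]

# Hypothetical baseline and treatment observations.
baseline = [6, 7, 5, 6, 7, 5, 6]
treatment = [5, 3, 2, 2, 1]
band = two_sd_band(baseline)
outside = points_outside(band, treatment)
```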
Methods such as the two-standard deviation band procedure are
problematic in that they neglect linear trend. This issue was
illustrated by Nourbakhsh and Ottenbacher (1994) in the contrast
between parts (a) and (b) in their Figure 2 (p. 774). A similar
contrast is shown in Figure 3.12. The baseline mean line has been
extended into and across treatment phase, and the two standard
deviation bands for baseline data have been drawn in this figure.
Note also in this figure the OLS baseline and treatment phase trend
lines (dashed lines) and the OLS trend line for the entire time series
(solid line). While the treatment phase data pattern relative to the
baseline mean line and the two standard deviation bands is
suggestive of change (Nourbakhsh & Ottenbacher, 1994), the congruence
of the three trend lines implies that it is very plausible that the
treatment phase trend is nothing more than a continuation of the
baseline phase trend. Thus, the hints of change based on

[Figure 3.12 graph: y-axis, number of times Johnny yells at his mother (0 to 10); x-axis, day of the week; baseline phase mean line, two standard deviation bands, OLS phase trend lines, and OLS trend line for the entire time series.]

Figure 3.12 Hypothetical AB single case design from Figure 1.2 with baseline mean
line, and two standard deviation bands for baseline variability, drawn and extended
into treatment phase. Also shown are baseline and treatment phase ordinary least
squares (OLS) trend lines and the OLS trend line for entire time series.

comparisons of the treatment phase data with the baseline mean
and two standard deviation bands are contradicted by the apparent
continuation across treatment phase of a decreasing baseline trend.
Ignoring the trend information, and relying solely on the two
standard deviation bands representing baseline variability relative
to the baseline phase mean line, may therefore lead to erroneous
inferences about change between phases. This is an issue similar to
that of misspecified statistical models considered later.

Background Variability Relative to Trend. Nugent (2000) described a
method of graphically representing the background variability in a phase
relative to a trend line as opposed to the mean. This approach was built
upon suggestions from Bailey (1984) and Bloom and Fischer (1982,
pp. 468–471). A slightly modified version of this procedure, justified in
a later section, is carried out as follows and illustrated in Figure 3.13:

(1) Compute the OLS regression model for the baseline linear trend, and
then plot the OLS trend line across baseline phase and extend this
line into and across the treatment phase, as illustrated in Figure 3.13.

(2) Compute the mean moving range (Wheeler & Chambers, 1992)
for the residuals about this OLS linear trend line. This is done by (a)
computing the OLS residuals for the baseline linear trend; (b)
computing the moving range for each adjacent pair of residuals
(this will be the absolute value of the difference between the residual
for the data point for time t and that for time t+1); and (c)
computing the mean of these moving range values, mR.
(3) Draw in sigma bands for the baseline phase data relative to the OLS
linear trend line based on the mean moving range. The one sigma
band lines are drawn by placing lines above and below, and parallel
to, the linear trend line a distance 0.887 × mR from the trend line.
The two sigma band lines are drawn, just as the one sigma bands
are, but a distance 1.77 × mR from the trend line. The three sigma
band lines are drawn a distance 2.66 × mR from the OLS trend line
(Wheeler & Chambers, 1992). These lines are extended from
baseline phase into and across treatment phase.
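
Steps (1) to (3) can be sketched as below. This is an illustration under stated assumptions rather than the author's implementation: the function names are hypothetical, time is coded 1, 2, ..., 7, and the baseline values were back-calculated from the fitted line and residuals reported in the worked example in the text, so they are an inference and not data given by the author.

```python
def ols_trend(times, values):
    """Return (intercept, slope) of the least squares line through the phase."""
    n = len(values)
    t_bar, y_bar = sum(times) / n, sum(values) / n
    sxy = sum((t - t_bar) * (y - y_bar) for t, y in zip(times, values))
    sxx = sum((t - t_bar) ** 2 for t in times)
    slope = sxy / sxx
    return y_bar - slope * t_bar, slope

def trend_sigma_bands(times, values):
    """Return intercept, slope, and the (one, two, three) sigma distances."""
    intercept, slope = ols_trend(times, values)                  # step (1)
    residuals = [y - (intercept + slope * t) for t, y in zip(times, values)]
    ranges = [abs(a - b) for a, b in zip(residuals, residuals[1:])]
    mean_mr = sum(ranges) / len(ranges)                          # step (2)
    distances = tuple(k * 2.66 * mean_mr / 3 for k in (1, 2, 3)) # step (3)
    return intercept, slope, distances

def band_limits(intercept, slope, distance, t):
    """Lower and upper band heights at time t, inside or beyond the baseline."""
    centre = intercept + slope * t
    return centre - distance, centre + distance

# Inferred (not author-supplied) baseline values at times 1..7.
times = list(range(1, 8))
baseline = [4, 5, 7, 2, 3, 3, 4]
intercept, slope, (one, two, three) = trend_sigma_bands(times, baseline)
```

Calling `band_limits` with a time value in the treatment phase (for example, t = 10) gives the heights of the extended band lines at that point.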

One, two, and three sigma bands so created can be seen in Figure 3.13.
These bands mark the regions of baseline phase background variability
relative to the baseline OLS trend. The computation of these bands is
shown in the next paragraph.
[Figure 3.13 graph: phases A and B; y-axis, number of times Johnny yells at his mother (0 to 14); x-axis, day of the week; mean line, OLS trend line for baseline, and one, two, and three sigma bands.]

Figure 3.13 Illustration of creation of region of background variability for baseline
data from Figure 1.2 using variation of method described by Nugent (2000).

The OLS regression model for baseline data is y_t = 5.143 − 0.286t, and
the residuals from this OLS trend line are −0.85714; 0.42857; 2.71429; −2.0;
−0.71429; −0.42857; and 0.85714. The moving range values are 1.28571;
2.28572; 4.71429; 1.28571; 0.28572; and 1.28571. The mean of these values
is 1.857. Hence, the one sigma value is (2.66/3) × 1.857, or 1.6; the two
sigma value is 3.3; and the three sigma value is 4.9 (Wheeler & Chambers,
1992). Therefore, the one sigma bands will be above and below, parallel to,
and 1.6 units away from the extended OLS baseline trend line. The two
sigma bands will be similarly situated 3.3 units away from the baseline
linear trend line, while the three sigma bands will be 4.9 units from the
extended trend line. These are all shown in Figure 3.13.
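
Written out step by step, the arithmetic behind these values is as follows (intermediate results are shown to two or three decimal places; the text reports the sigma values to one decimal place).

```latex
\[
mR = \frac{1.28571 + 2.28572 + 4.71429 + 1.28571 + 0.28572 + 1.28571}{6}
   = \frac{11.14286}{6} \approx 1.857
\]
\[
\text{one sigma} = \frac{2.66}{3} \times 1.857 \approx 1.65, \qquad
\text{two sigma} = \frac{2}{3} \times 2.66 \times 1.857 \approx 3.29, \qquad
\text{three sigma} = 2.66 \times 1.857 \approx 4.94
\]
```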

COMBINED GRAPHICAL AND STATISTICAL METHODS

No-Trend Models

A Median-Based Method. Two analytic procedures have been proposed
that combine graphical and statistical methods, that are based on
the notion of overlap discussed earlier in this chapter, and that
represent variability relative to the mean or median. One,
described by Scruggs and Mastropieri (1998), is termed the percentage
of nonoverlapping data (PND) method. Ma (2006) critiqued this method
and suggested an alternative, the percentage of data points exceeding the
median (PEM), which addressed limitations Ma identified in the PND
method. The PEM method is implemented as follows:

(1) A median line is drawn across baseline phase and extended into and
across the adjacent treatment phase. This could also be done by
drawing a median line across a treatment phase and extending it
into and across either an adjacent baseline or treatment phase.
(2) Compute the percentage of the data points in the adjacent treatment
(or baseline) phase that lie above the extended median line if
higher values indicate improvement, or below the extended median
line if lower values indicate improvement.
(3) The null hypothesis in the PEM approach can be stated loosely as, if
the adjacent treatment (baseline) phase data pattern is a continuation
