
Software Based Statistical Methods for Scientific Research

Training Module

By Yosef Tekle-Giorgis (PhD)

School of Animal and Range Sciences
College of Agriculture
Hawassa University

2015


Table of contents
Topic Page

1 Introduction 6

1.1 Definition and application of statistics in research 6


1.2 Definition of Population, Samples and Variables 9
1.3 Types of variables 10
2 Measures of central tendencies 12
2.1 The Arithmetic mean 12
2.2 The Geometric mean 15
2.3 The Harmonic mean 18
2.4 The Median 18
2.5 The Mode 19
2.6 Summary on the usage of the three measures of central tendencies 19
3 Measures of dispersion 21
3.1 Range 22
3.2 Mean absolute deviation 23
3.3 Variance and standard deviation 24
3.4 The Coefficient of variation (CV) 26
3.5 Indices of diversity 26
4 The Overall mean and variance of sample means 29
4.1 The overall mean of repeatedly taken sample means 29
4.2 The variance of sample means 31
5 Distribution of continuous variable 36
5.1 The Normal distribution 36
5.1.1 Properties of a normal distribution 37
5.1.2 Standard normal deviate values (Z score) 40
5.1.3 Areas under a normal curve 41
5.1.4 The distribution of sample means 44
5.1.5 Computing probabilities based on sample mean distribution 45
5.2 Student t distribution 47
5.2.1 Characteristics of t distribution compared to Normal (Z distribution) 49
5.2.2 Areas under t curves (t table) 50
6 Estimation of population parameter from sample statistics 53
6.1 Formula for 95 % Confidence Interval estimation of population mean (μ) 54
6.2 The formula for other confidence levels 55
6.3 Confidence interval estimation when σ is unknown 56
7 Introduction to statistical hypothesis testing 58
7.1 One sample hypothesis concerning mean 60

7.1.1 Steps in hypothesis testing 62
7.1.2 One tailed Vs two tailed tests 67
8 Two sample hypothesis concerning mean 72
8.1 Independent sample hypothesis test concerning mean 72
8.2 Paired sample test 76
9 Multi-sample hypothesis concerning mean 82
9.1 Illustration of Single factor analysis of variance test 82
9.2 Simpler computational procedure for Single factor ANOVA test 90
9.3 Multiple comparison tests 93
10 Two Factor Analysis of Variance 99
10.1 Introduction to Multifactor Analysis of variance (Factorial ANOVA) 99
10.1.1 Aims (advantages) of a factorial design experiment 101
10.1.2 Illustration of main and interaction effects of factors in a factorial experiment 104
10.2 Computational procedure for two factor ANOVA design 108
11 Three factor Analysis of Variance (CRD) 115
11.1 Sources of variation 118
11.2 Hypothesis 118
12 Randomized Complete Block Design (RCBD) 125
12.1 The effect of using heterogeneous experimental subjects when dealing with Completely Randomized Designs 125
12.2 Randomized Complete Block Design (RCBD) 132
13 Latin Square ANOVA design 141
14 Lattice Design 145
14.1 Incomplete block designs 145
14.2 Balanced Lattice 146
14.2.1 Illustration of a balanced lattice design 146
14.2.2 Analysis of Balanced Lattice design 151
15 Split plot design 153
15.1 Blocking in a factorial experiment 153
15.2 Split plot design 155
16 Nested Analysis of Variance Design 165
16.1 Sampling procedure 166
16.2 Hypothesis 167
16.3 Sources of variation in a nested ANOVA design 167
16.4 Nested ANOVA design computational procedure 169
17 Repeated measure ANOVA design 172
17.1 Sources of variations in the repeated measure ANOVA design 175
17.2 SPSS procedure for repeated measure ANOVA design 177
18 Regression and correlation analysis 182
18.1 Types of regression and correlation relationships 185
18.2 Simple linear regression analysis 186

18.2.1 Determining simple linear regression equation 186
18.2.2 Expressing the strength of regression analysis 188
18.2.3 Testing the significance of a regression relationship 190
18.3 Simple linear correlation analysis 193
19 Multiple regression and correlation analysis 199
19.1 Multiple regression relationship 199
19.1.1 Multicollinearity test 199

19.1.2 Selecting important regressor variable to be included in the regression model 200
19.1.3 Illustration of establishing a multiple regression relationship with SPSS 201
19.2 Multiple correlation 206


20 Logistics regression analysis 210

20.1 Simple (Binary) logistic regression 211


20.1.1 Linearizing the logistic equation 213
20.1.2 Odds ratio 214
20.1.3 Fitting the logistic regression relationship 214
20.1.4 Logistic regression with SPSS 215
20.2 Multiple binary logistic regression relationship 218
20.2.1 SPSS procedure for multiple logistic regression 219
20.2.2 SPSS output interpretation 220
21. Analysis of nominal scale variable 225
21.1 One way classification X2 test 225
21.2 Subdividing X2 test 228
21.3 Two way classification frequency data analysis 230
21.3.1 A 2X4 contingency data test procedure 231
21.3.2 Sub dividing a contingency analysis 232
22 Appendices 236

Chapter 1 Introduction

1.1 Definition and application of statistics in research

Statistics is a branch of mathematics employed to analyze research data. Biostatistics is the application of statistical methods to the analysis of biological data. Statistics is a main tool for research.

 Application of statistics in research

o To design data collection methodology

o To summarize and describe data

o To infer population parameter from sample statistics

 To make generalization about population characteristics

o To make comparison among populations

 Hypothesis testing

These four areas of involvement of statistics in research are elaborated in the following sections

I. Application of statistics to design data collection methodology


What this involves depends on the type of research

i) Survey type (exploratory) research


This type of research involves studying the population of interest without manipulating
the subjects, and it entails extensive sampling work. The research is conducted on a
population and its outcome is a generalization about the population at large, BUT data
are collected from samples of subjects drawn from the population. In order to make valid
inferences about the population, the samples must be representative of the population

 A sample is said to be representative of the population if:

• The sample size is sufficient, and

• It is taken randomly

A sample that is sufficiently large and drawn randomly from the population is considered
to represent the population better, because the characteristics reflected in the
population at large are also exhibited by the sampled individuals.

 Hence, designing data collection methodology in survey type research implies estimating
the sample size required to collect sufficient information and setting out a sampling
regime that enables drawing random samples. Knowledge of statistics is required to
decide on the optimum sample size and to choose an appropriate method of random sampling
for a particular research

ii) Designing data collection methodology in experimental research


Experimental research is employed when there is a need to study the response of subjects by
exposing them to certain treatments, i.e., this type of research involves extensively
manipulating the subjects of the research. Accordingly, the layout of the experiment should be
set out, and this involves:

• Estimating the optimum number of experimental subjects required under the respective
treatments

• The manner of assigning subjects to the treatments, i.e., choosing an experimental design
that suits a particular experiment

Both of these require knowledge of statistics

II. Application of statistics to summarize and describe a data set

Data summarization means condensing the raw data and presenting it with a few (possibly one
or two) values. Data need to be summarized in order to easily grasp the message contained
in the data as well as to make comparisons.

Data summarization procedures

• Tables

• Figures (bar graphs, line graphs, scatter plots, pie charts etc.,)

• Summary statistical quantities

o Averages (measures of central tendencies)

Mean(Arithmetic, Geometric, Harmonic)

Median

Mode

o Measures of dispersions

Range

Variance (standard deviation, standard error)

Care should be taken when condensing raw data so that the summarized data transmit, as much
as possible, exactly the same message as the raw data. At times the message transferred to
the reader through the summarized data can differ from the message contained in the original
raw data because of an inappropriate data summarization procedure. Moreover, choosing and
using a particular data summarization procedure requires knowledge of statistics.

III. Application of statistics to infer population parameters from sample statistics

• Inference involves estimating a population parameter from a sample statistical quantity,

i.e., estimating the parametric mean (μ), variance (σ²), slope (β), correlation coefficient (ρ),

proportion (π), etc., from the corresponding sample mean (X̄), sample variance (S²),
sample slope (b), sample correlation coefficient (r), sample proportion (P), etc.,
respectively.

There are two ways of estimating population parametric quantities from sample values

• Point estimation

• Interval estimation procedure

The latter is more accepted than the former since it allows expressing the level of
confidence in the estimation. Interval estimation procedures for different parametric
quantities require knowledge of statistics.

IV. Hypothesis testing

Hypothesis testing involves comparing populations. Populations are compared in order to make
clear cut decisions. There are different types of hypothesis testing procedures depending on the
type of the response variable, the number of populations compared, the number of factors or
treatments involved, the nature of the experimental subjects, etc. Choosing and using the correct
testing procedure requires knowledge of statistics

• Designing data collection methodology and summarization and describing data fall in the
realm of descriptive statistics

• Making inferences about population parameters and hypothesis testing are categorized as
inferential statistics


1.2. Definition of Population, Samples and Variables


 A population is the entire set of subjects of research interest

 Samples are subsets from the population selected for the purpose of data collection

 Variables are characteristics under investigation

Types of populations

There are different ways of classifying populations based on:

 Size of populations as finite and infinite

 Finite populations are those that can be counted

 Infinite populations are those that are too large to count

 Nature of members as

 living or non living

 Real or imaginary

Real populations exist in time and space; imaginary populations are those created experimentally

Defining a population

• A population should be properly defined or described in the title in terms of

 Time

 Space (location) it occupies

 The specific nature it possesses, such as genotype (breed or variety)

 The environment it is exposed to such as management conditions, feeding etc.

The importance of properly defining populations is to avoid ambiguity during sampling as well
as to allow comparison of results of different studies conducted on the same population at different times

1.3 Types of variables

1.3.1) Quantitative variables

• Have numerical quantity

1.3.1.1) Measurement (continuous) variables

• Values determined using some kind of measuring scale

e.g. Length, weight, volume, area, density, time etc.

1.3.1.2) Countable (discrete) variables

• Values determined by counting

Family size, litter size, microbial count, egg production, tiller count etc.,

1.3.2) Qualitative variables

• Described by the qualities they possess

1.3.2.1) Ordinal (ranked) variables.

• Expressed by relative differences. e.g. High, medium, low


1.3.2.2) Nominal variables

• Phenotypic or genotypic categories



Exercise: Identify the population types and variables as well as comment on the manner in
which the populations are described in the following studies

• Yield of local corn variety (quintals/hectare) under two levels of DAP application in
Hawassa zuria woreda Ethiopia

• Comparative studies on the nitrogen contents of soils (mg/gm of soil) in some selected
areas of the rift valley region (Hawassa and Ziway) and adjoining highlands
(Wondogenet and Ada woreda)

• Egg production of Wolayita local VS Rhode Island Red (RIR) cross breed chicken under
three levels of bone meal supplementation

• Comparative study on herbaceous species composition, productivity and range condition


assessment of communally grazed and protected rangelands in Salamago woreda of south
Omo zone, SNNPRS, Ethiopia

• Microbial load of processed fish from Lake Hawassa under the existing traditional and
using hygienic processing method.

Chapter 2 Measures of central tendencies

Measures of central tendencies are single values calculated from the raw data and used to
describe or represent a given data set. They are commonly referred to as averages and they
fall into three categories:

• Mean (arithmetic, Geometric and harmonic mean)


• Median
• Mode
These three measures of central tendency tend to lie at the center of the data set, which
is the property that gave them the name measures of central tendency. This is also the
reason why they are used to represent a data set: for a value to be considered
representative of a given data set, it has to lie somewhere at the center and reflect the
nature of the whole data set. Otherwise, a value that tilts to one side of the data cannot
legitimately be considered representative of the data in question. As a matter of fact,
there are instances in which a measure of central tendency (mainly the mean value) loses
this nature and tends to lie towards one side of the distribution. In such instances,
using the value of the particular measure of central tendency to represent a given data
set would transmit a biased message to the reader about the data, and a proper measure of
central tendency should be chosen.

2.1 Arithmetic mean


The arithmetic mean is the most widely used measure of central tendency and is usually
referred to as the mean or average. It is the sum of all values of the data divided by the
number of data points.

For raw data it is calculated as:

X̄ = ∑xi / n

where,

X̄ = sample mean

∑xi = the sum of the values starting from i = 1 up to i = n

n = sample size

For data collected from the population, the mean value (μ) is computed as

μ = ∑xi / N, where N = the population size

For grouped data, each value of xi should be weighted by its respective frequency and the above
formula is modified as follows.

Sample mean X̄ = ∑fixi / n

Population mean μ = ∑fixi / N

Note that for group-frequency classified data, xi in the above formula refers to the midpoint
of the respective class interval.

Using an Excel command, the mean value can be computed by clicking on a cell and writing the
following command:

=AVERAGE(data range), then press Enter or click the ✓ in the formula bar

In the above command, data range is the range of the data for which the mean value is
computed.

e.g., =AVERAGE(A1:A10) gives the mean value for the data range A1 to A10

When there are many variables for which mean and other summary statistics are
computed, Pivot table command of excel is more efficient and can be executed as follows.

 Select the data range together with the variable names


 Click insert from the command bar
 Choose pivot table (this gives pivot table command prompt)
 Indicate where the output should be placed, if it is in the existing work sheet, click a
cell to indicate the location
 Drag the variables for which mean values are to be computed in the value box
 Click the black triangle to access the value field setting
 Scroll down and choose Average. NB: other summary statistical quantities are also
available in the display


Exercise

Compute the arithmetic mean for data shown by Tables 2.1, 2.2, and 2.3

Table 2.1 Percent Butterfat content of milk of 11 randomly selected cows from ARSc dairy
farm

Id of cows % butterfat of milk


1 3.4
2 3.5
3 3.6
4 3.6
5 3.7
6 3.7
7 3.7
8 3.8
9 3.8
10 3.9
11 4
Total 40.7

Table 2.2. Frequency distribution of foxes that had different litter size

Litter size Frequency

3 10

4 27

5 22

6 4

7 1


Table 2.3 Income of daily laborers in Hawassa city (Birr/day)

Income (Birr/day)
Class limit   Midpoint   Absolute frequency (f)   Relative frequency (%)

14.0-14.99 14.5 3 5.0

15.0-15.99 15.5 8 13.3

16.0-16.99 16.5 11 18.3

17.0-17.99 17.5 16 26.7

18.0-18.99 18.5 11 18.3

19.0-19.99 19.5 7 11.7

20.0-20.99 20.5 4 6.7

Total 60 100
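As a sketch, the means for Tables 2.1 and 2.2 can be reproduced in Python (the values are transcribed from the tables above; for grouped data each xi is weighted by its frequency fi):

```python
# Arithmetic mean of raw data (Table 2.1: % butterfat of 11 cows)
butterfat = [3.4, 3.5, 3.6, 3.6, 3.7, 3.7, 3.7, 3.8, 3.8, 3.9, 4.0]
mean_raw = sum(butterfat) / len(butterfat)
print(round(mean_raw, 2))  # 3.7

# Arithmetic mean of grouped data (Table 2.2: litter size weighted by frequency)
litter = [(3, 10), (4, 27), (5, 22), (6, 4), (7, 1)]  # (xi, fi) pairs
n = sum(f for _, f in litter)
mean_grouped = sum(x * f for x, f in litter) / n
print(round(mean_grouped, 2))  # 4.36
```

The grouped computation is the formula X̄ = ∑fixi / n applied directly, so the midpoint version for Table 2.3 follows the same pattern with class midpoints in place of xi.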

2.2 The Geometric mean

This is the nth root of the product of the n data values and is computed as:

XG = (X1 · X2 · X3 · ... · Xn)^(1/n)

The geometric mean can also be computed easily as the antilogarithm of the arithmetic
mean of the logarithms of the data, where the logarithm can be in any base:

XG = antilog[(log X1 + log X2 + log X3 + ... + log Xn)/n] = antilog[(∑ log xi)/n]
The geometric mean is appropriate to use only for ratio-scale data and only when all of the data
are positive (that is, greater than zero). If the data are all equal, then the geometric mean is equal
to the arithmetic mean (and also equal to the harmonic mean described below); if the data are not
all equal, then harmonic mean < geometric mean < arithmetic mean.
The geometric mean is sometimes used as a measure of location when the data are highly skewed to
the right (i.e., when a few very large values pull the arithmetic mean far above most of the
data). For example, consider the following microbial counts made for three replicates of the
same sample, i.e., X1 = 100, X2 = 1000 and X3 = 10000.
Arithmetic mean = (100 +1000+10000)/3 = 3700

Geometric mean = antilog[(log 100 + log 1000 + log 10000)/3] = 10³ = 1000

Here the geometric mean is a better representation of the average count than the arithmetic mean

Geometric mean is also useful when dealing with data that represent ratios of change. As an
illustration, consider the change in the price of Teff (Birr/quintal) between 2008 and 2013
in Hawassa market. The data are given in Table 2.4 below. Compute both the arithmetic and
geometric mean of the ratios of change. Which one is more appropriate as an average rate of
change in the price of Teff during the indicated years?


Table 2.4 The price of Teff (Birr/quintal) in Hawassa market between 2008 and 2013

Year              Price (Birr/Q)   Price change/year   Log ratio of change
2008              600
2009              750              1.2500              0.0969
2010              950              1.2667              0.1027
2011              1100             1.1579              0.0637
2012              1600             1.4545              0.1627
2013              1800             1.1250              0.0512
Arithmetic mean                    1.2508              0.0954
Geometric mean                     1.2457

The third column shows the price change per year in successive years. For example between
2008 and 2009, the price has risen by a factor of 750/600 = 1.25 times

Calculating the arithmetic mean change of price gives

(1.25 + 1.2667 + 1.1579 + 1.4545 + 1.125)/5 = 1.2508

This indicates that between the years 2008-2013, the price of Teff on average increased by a
factor of 1.2508 times per year, i.e., on average the price increased by 25.08% yearly. However,
this is an overestimation of the true change of the price during the indicated years. For instance,
considering a price of 600 Birr/Q in 2008, if this price is increased yearly by a factor of 1.2508,
in 2013 the price becomes 600 × 1.2508 × 1.2508 × 1.2508 × 1.2508 × 1.2508 = 1837.08 Birr/Q,
which disagrees with the observed price of Teff in 2013.

Instead, using the log values of the price changes in successive years (column 4) and computing
the geometric mean of the price change gives

antilog[(0.0969 + 0.1027 + 0.0637 + 0.1627 + 0.0512)/5] = 10^0.0954 = 1.2457

This indicates that the true rate of price change between the indicated years is on average
1.2457 per year, or 24.57%, rather than 25.08% as obtained from the arithmetic procedure.
Now, considering the price in 2008 = 600 Birr/Q and an annual mean rate of increase of 1.2457,
the price in 2013 can be computed as

600 × 1.2457 × 1.2457 × 1.2457 × 1.2457 × 1.2457 = 1800 Birr/Q, and this is in agreement with the
observed price in 2013 (Table 2.4).
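The Teff price example can be verified with a short Python sketch (prices transcribed from Table 2.4; natural logs via math.log and math.exp play the role of log and antilog, since any base works):

```python
import math

# Year-to-year price ratios from Table 2.4 (Birr/quintal, 2008-2013)
prices = [600, 750, 950, 1100, 1600, 1800]
ratios = [b / a for a, b in zip(prices, prices[1:])]

# Arithmetic mean of the ratios overstates the average rate of change
arith = sum(ratios) / len(ratios)
print(round(arith, 4))  # 1.2508

# Geometric mean: antilog of the mean log ratio
geom = math.exp(sum(math.log(r) for r in ratios) / len(ratios))
print(round(geom, 4))   # 1.2457

# The geometric mean reproduces the observed 2013 price
print(round(600 * geom ** 5))  # 1800
```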
2.3 The Harmonic mean

This is the reciprocal of the arithmetic mean of the reciprocals of the data:

XH = 1 / [(1/n) ∑(1/xi)] = n / ∑(1/xi)

The harmonic mean is an occasionally used measure of central tendency, and it is most appropriate
when averaging rates. For example, consider that a flock of aquatic birds flies early in the morning
from a roosting area at Wondogenet to a feeding area at Lake Hawassa, which is 20 km away,
flying at a speed of 40 km/hr (which takes 0.5 hr). The flock returns to the roosting area at
Wondogenet along the same route (20 km), flying at 20 km/hr (requiring 1 hr of flying time).
What would be the flight speed of the flock for the round trip?

Using the arithmetic mean and harmonic mean the average speed is computed as
Arithmetic mean = (40 km/hr + 20 km/hr )/2 = 30 km/hr.
Harmonic mean = 2/(1/40 +1/20) = 2/(0.025 + 0.05) = 2/0.075 = 26.67 km/hr.

In the above computation, the arithmetic mean is an overestimation of the average speed the
flock travelled during the round trip. This is because a total of 40 km was traveled in 1.5 hr,
indicating a speed of (40 km)/(1.5 hr) = 26.7 km/hr. Thus the harmonic mean is the correct
procedure for averaging rates.
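A quick check of the round-trip example in Python:

```python
# Round trip: 20 km at 40 km/hr out, 20 km at 20 km/hr back
speeds = [40, 20]

# Harmonic mean: n divided by the sum of reciprocals
harmonic = len(speeds) / sum(1 / s for s in speeds)
print(round(harmonic, 2))  # 26.67

# Cross-check: total distance over total time (40 km in 1.5 hr)
print(round(40 / 1.5, 2))  # 26.67
```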


2.4 The median


The median is defined as the middle value in an ordered data set, i.e., it is a value that divides
the data into two halves. For raw data, the median is determined by first arranging the data
into ascending (or descending) order, and the median value is computed as

M = X((n+1)/2) if the number of data points (n) is odd, and

M = [X(n/2) + X(n/2 + 1)]/2 if n is even

For example, if the sample data contains 10 observations, X(n/2) = X5( fifth observation) and
X(n/2)+1 = X5+1 = X6 (sixth observation). Therefore the median value would be half way
between the fifth and sixth observation. i.e., (X5+X6)/2

For grouped data median value is computed as follows:

Median = LLM + [(n/2 – CFM)/Fm] * (Class interval width)

Where

LLM = Lower limit of the median class. i.e., the class that contains the median value

n = no of data points

CFM = cumulative frequency up to the median class

Fm = the frequency in the median class

Class interval width = the size of the interval used in the grouping

Exercise: Using the data of Tables 2.1, 2.2, and 2.3, determine the median values.
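The three cases in the exercise can be sketched in Python (data transcribed from Tables 2.1-2.3; the grouped case applies the class-interval formula given above):

```python
import statistics

# Raw data (Table 2.1): n = 11 is odd, so the median is the 6th ordered value
butterfat = [3.4, 3.5, 3.6, 3.6, 3.7, 3.7, 3.7, 3.8, 3.8, 3.9, 4.0]
print(statistics.median(butterfat))  # 3.7

# Discrete data (Table 2.2): expand each litter size by its frequency
litter = [(3, 10), (4, 27), (5, 22), (6, 4), (7, 1)]
expanded = [x for x, f in litter for _ in range(f)]
print(statistics.median(expanded))   # 4.0

# Grouped data (Table 2.3): Median = LLM + (n/2 - CFM)/Fm * class width
lowers = [14.0, 15.0, 16.0, 17.0, 18.0, 19.0, 20.0]
freqs = [3, 8, 11, 16, 11, 7, 4]
n, cum = sum(freqs), 0
for lower, fm in zip(lowers, freqs):
    if cum + fm >= n / 2:            # first class whose cumulative frequency reaches n/2
        print(lower + (n / 2 - cum) / fm * 1.0)  # 17.5
        break
    cum += fm
```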

2.5 The Mode

The mode is the most frequently observed value in the data set. When mixtures of populations are
sampled, the data can have more than one modal value, i.e., a multimodal distribution. An
example of such data would be the length records of a random sample of fish containing
different age groups.
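A minimal sketch in Python using the standard statistics module (litter-size frequencies transcribed from Table 2.2):

```python
import statistics

# Table 2.2: litter size 4 has the highest frequency (27 of 64 foxes)
litter = [(3, 10), (4, 27), (5, 22), (6, 4), (7, 1)]
expanded = [x for x, f in litter for _ in range(f)]
print(statistics.mode(expanded))  # 4

# multimode returns every modal value, useful for multimodal data
print(statistics.multimode([1, 1, 2, 2, 3]))  # [1, 2]
```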


2.6 Summary on the usage of the three measures of central tendencies

Among the three averages, the mode can be used for both quantitative and qualitative variables.
The median can be used for quantitative (both continuous and discrete) as well as for ordinal
scale variables. On the other hand, for the mean to be used to describe data, the variable has
to be quantitative. Although the mean is sometimes misused to average ranked data, such data
should appropriately be summarized with the median or mode.

It appears that the usage of the mean is limited to quantitative variables. Nonetheless, the
mean is the most preferred of the three measures of central tendency whenever it can be used.
This is because it uses all the values in its computational procedure, whereas the other two
are rather location parameters. There is also a wide variety of statistical inference
procedures developed based on the mean value (referred to as parametric tests) that are more
powerful than tests concerning the other measures of central tendency, such as those concerning
median values and ranks (non-parametric procedures). Despite all this, care should be exercised
when using the mean value for data representation, as it is the most affected of the three when
the data are considerably skewed.

Chapter 3 Measures of dispersion

A measure of dispersion is an expression of the variability in a data set, i.e., of how spread
the values are around the center (the opposite of how clustered they are around the center).
A measure of dispersion has to be reported with average values because the latter alone do not
adequately describe a data set.

For example, consider the final scores in biostatistics (%) of two classes of students, the 1996
and 1997 classes, displayed in Table 3.1. The mean score for the 1996 and 1997 classes is 75.0%
and 75.4%, respectively. Thus, reporting the mean value alone misleads the reader into thinking
that the two classes of students are comparable in performance. On the other hand, if some
measure of dispersion like the range is reported together with the mean value, the reader would
understand that the two classes are widely different in performance, i.e., the 1996 class
students are highly varied in performance while the 1997 class students are relatively close to
each other. The reader cannot get this idea by inspecting the mean value alone.

Table 3.1 Biostatistics final score of 1996 and 1997 class students (x/100)

Score (X/100)

No of student 1996 class 1997 class


1 95 85
2 90 82
3 85 80
4 80 78
5 70 75
6 65 70
7 60 68
8 55 65
Total (∑ X) 600 603
Mean 75 75.4
Range 55-95 65-85
Variance (S²) 214.29 50.84
Standard deviation (S) 14.64 7.13


Dispersion measures

3.1 Range
This is the difference between the highest and lowest values in a data set. It can be expressed
as a difference, e.g., the range of the 1996 class scores is within 40%, or as an interval,
e.g., the said range is between 55 and 95%. However, when expressed as a difference, the mean
value should be reported together with it; otherwise the reader will not understand where the
actual values lie. For example, reporting the range as within 40% could also mean that the
scores of the students lie between 40 and 80% or any other range of values. On the other hand,
if the mean is reported together with the range, the reader would understand where the values
lie.

Range is a simple and easily grasped measure of dispersion. However, it is a crude measure of
dispersion, and it can easily overestimate the true dispersion if the data contain outlier
values. For instance, consider the scores of two classes of students (class A and B) displayed
by Table 4.2 below. The mean score of class A is 75.4% and that of class B is 75.3%. The ranges
of the scores are between 65-85% and 55-95%, respectively. Looking at this report, a reader
would understand that class B students are highly varied in their scores. However, the range in
this instance does not indicate the true dispersion of class B students. This is because 8 out
of 10 of the students scored within a 20% range, and only the lowest and highest scoring
students differed by 40%. Therefore, the range in this situation overemphasized the true
dispersion of the values.

Although the range is a crude and inefficient measure of dispersion, it is still useful for
describing the dispersion of some variables, for instance market prices or weather information.
Normally it is the range of temperature or rainfall or any other weather information that
should be reported, since it is the range rather than the mean that is actually felt or
experienced.
Also taxonomists are often concerned with having an estimate of what the highest and lowest
values in a population are expected to be.

The range overemphasizes the true dispersion in the presence of outlier values because it takes
into consideration only two values in the data set. Therefore, it became necessary to look for
a measure of dispersion that takes into consideration all values in the data set in its
computation. This idea brought into attention the other measures of dispersion discussed below.

3.2 Mean absolute deviation


This is an expression of dispersion as a deviation of each value from the mean. Since the sum
of all deviations from the mean is equal to zero, the absolute values of the deviations are
taken to avoid the sign effect, and the formula becomes

Mean absolute deviation = ∑|xi − X̄| / n = total absolute deviation / sample size

For example, for the 1996 class, the mean absolute deviation = 100/8 = 12.5%, as shown by Table
3.2 below. For the 1996 class, it indicates that on average the score of a student deviated
from the class average by 12.5%. Although this measure of dispersion is adequate to describe
the dispersion in a data set, its theoretical distribution is poorly known and hence it has
little use in higher statistics.

Table 3.2 The 1996 class score in Biostatistics, data used for illustration of mean absolute
deviation calculation

Score (Xi)    Xi − X̄


95 95 – 75 = 20
90 90 – 75 = 15
85 85 – 75 = 10
80 80 – 75 = 5
70 70 – 75 = -5

Yosef Tekle-Giorgis, HUCA, 2015 Page 23


Statistical Methods for Scientific 2015

Research
65 65 – 75 = -10
60 60 – 75 = -15
55 55 – 75 = -20
Sum of absolute deviations ∑|Xi − X̄| = 100
Mean absolute deviation 100/8 = 12.5 %
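The calculation in Table 3.2 can be reproduced with a few lines of Python (scores transcribed from the table):

```python
# 1996 class scores (Table 3.2)
scores = [95, 90, 85, 80, 70, 65, 60, 55]
mean = sum(scores) / len(scores)  # 75.0

# Mean absolute deviation: average of |xi - mean|
mad = sum(abs(x - mean) for x in scores) / len(scores)
print(mad)  # 12.5
```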
3.3 Variance and standard deviation
When expressing dispersion as a deviation from the mean, another method of eliminating the
sign effect is to square the deviations. Then dividing the sum of the squared deviations by the
total number of data points gives a measure of dispersion known as Variance.

Population variance σ² = ∑(xi − μ)² / N

Sample variance S² = ∑(xi − X̄)² / (n − 1)

Variance is also termed as Mean square since it is the mean of the total squared deviation of each
value from the mean. i.e.,

Sample variance = Mean square = total sum of squared deviation/degrees of freedom

When abbreviated, the expression can be written as S2 = MS = TSS/DF

The sample variance S² is an estimator of the population variance σ². The denominator for the
sample variance is n − 1, referred to as the degrees of freedom, rather than n (the sample
size), because if the denominator were the sample size, S² would underestimate the true
dispersion σ².

Variance expresses the dispersion of the data values in squared units. For example, using the
1996 class scores shown in Table 3.3 below, the sample variance is computed as

S² = ∑(Xi – X̄)²/(n – 1) = 1500/7 = 214.29 %²

This indicates that, on average, the squared deviation of a student's score from the class average
was 214.29 %², i.e., the dispersion from the center is expressed in percent squared. Since
variance expresses deviation in squared units, it is not easily grasped by common users.
Therefore the positive square root of the variance is used to express deviation in linear units,
and this gives a quantity called the standard deviation (S).

Sample standard deviation: S = √S² = √[∑(Xi – X̄)²/(n – 1)]

For the 1996 class scores, S = √214.29 = 14.64 %, i.e., the score of a given student deviated
from the class average by 14.64 % on average. This is much easier to understand since it is
expressed in the same units as the values in the data set.

The above definitional formula for variance (and hence standard deviation) is tedious to use. It
can be simplified by expanding the expression for the sum of squares as follows:

SS = ∑(Xi – X̄)² = ∑Xi² – (∑Xi)²/n

Hence the simplified calculator formula is S² = [∑Xi² – (∑Xi)²/n]/(n – 1)

For data recorded in a frequency table, S² = [∑fiXi² – (∑fiXi)²/n]/(n – 1)
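Both the definitional and the calculator formulas can be checked by software; a minimal Python sketch using the 1996 class scores (illustrative only):

```python
# Scores of the 1996 class (Table 3.3)
scores = [95, 90, 85, 80, 70, 65, 60, 55]
n = len(scores)

# Definitional sum of squares: sum of squared deviations from the mean
mean = sum(scores) / n
ss_def = sum((x - mean) ** 2 for x in scores)

# Calculator formula: SS = sum(Xi^2) - (sum(Xi))^2 / n
ss_calc = sum(x * x for x in scores) - sum(scores) ** 2 / n

s2 = ss_calc / (n - 1)   # sample variance (mean square = TSS/DF)
s = s2 ** 0.5            # sample standard deviation

print(ss_def, ss_calc, s2, s)
```

Both routes give SS = 1500, and hence S² = 1500/7 = 214.29 %² and S = 14.64 %, matching the worked example.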

Table 3.3 The 1996 class score in Biostatistics, data used for illustration of variance calculation

1996 class score (Xi)    Xi – X̄            (Xi – X̄)²
95                       95 – 75 = 20      20² = 400
90                       90 – 75 = 15      15² = 225
85                       85 – 75 = 10      10² = 100
80                       80 – 75 = 5       5² = 25
70                       70 – 75 = -5      (-5)² = 25
65                       65 – 75 = -10     (-10)² = 100
60                       60 – 75 = -15     (-15)² = 225
55                       55 – 75 = -20     (-20)² = 400
Sum                      ∑(Xi – X̄) = 0     ∑(Xi – X̄)² = 1500

S² = ∑(Xi – X̄)²/(n – 1) = 1500/7 = 214.29 %²

Variance is the most preferred measure of dispersion because it satisfies two important qualities.
Firstly, it takes all values in the data set into its computation, and therefore is not biased the way
the range is. Secondly, its probability distribution is well understood, which makes it
indispensable in higher statistics. In fact, it is the basis for the F test, i.e., the Analysis of
Variance hypothesis-testing procedure, which will be discussed in detail in later chapters.

In practice the range of a given data set is approximately μ ± 3σ, i.e., the data values are found
within 3σ units around the population mean. From this it can be seen that Range ≈ 6σ.
III.4 The Coefficient of variation (CV)
The coefficient of variation is defined as CV = (S/X̄) × 100. It describes variability in the data set
as a proportion of the mean. CV has no unit and is a relative measure of variability, so it can be
used to compare the relative dispersion of data sets measured in different units or of different
magnitudes. For instance, how homogeneous the weights of same-age individuals in a human
community are, compared to the variation in weight of same-age monkeys living in a given
community, is better assessed using CV values than using the variances of the two data sets.
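A CV calculation can be sketched in Python as follows (the helper name cv_percent is ours, for illustration):

```python
def cv_percent(values):
    """Coefficient of variation: sample SD as a percentage of the mean."""
    n = len(values)
    mean = sum(values) / n
    s2 = sum((x - mean) ** 2 for x in values) / (n - 1)
    return (s2 ** 0.5) / mean * 100

# 1996 class scores: SD 14.64 on a mean of 75 gives a CV of about 19.5 %
scores = [95, 90, 85, 80, 70, 65, 60, 55]
cv = cv_percent(scores)
print(round(cv, 1))
```

Because the result is unit-free, the same function can be applied to data sets in any units and the results compared directly.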

3.5 Indices of diversity

For a nominal scale variable, there is no mean value to express how diverse the different
categories are. Diversity here implies even distribution or richness. A community whose species
are represented by nearly equal proportions of observations is said to be diverse, or to have an
even distribution. By contrast, if some species are more represented than others, this indicates a
less diverse community, or dominance of some species over others. For example, consider the
composition of five species of woody plants in two areas, one with a degraded landscape and
one with an undisturbed landscape (Table 3.4). The data are numbers of plants per hectare.

Table 3.4 The frequency distribution of five species of woody plants randomly sampled from
two communities (Undisturbed and degraded community)

Community
Spp. of woody plants Undisturbed Degraded
Spp1 20 80
Spp2 20 5
Spp3 20 5
Spp4 20 5
Spp5 20 5
Sample size (n) 100 100
Here the undisturbed community has the more diverse, or richer, species composition, whereas
the degraded community has a less diverse composition dominated by a single species (Spp 1).
Statistically, the Shannon-Weiner diversity index is used to express the diversity or richness of a
community. It is computed as

H’ = -∑[Pi * Log(Pi)]

Where

Pi = the proportion of the different categories

Here the logarithm can be expressed in any base. For the above data the calculation is done as
follows

Table 3.5 illustration of the computation of Shannon diversity index using data of table 3.4

                         Undisturbed               Disturbed
Spp. of woody plants     n     Pi     Log(Pi)      n     Pi     Log(Pi)
Spp1 20 0.2 -0.699 80 0.8 -0.0969
Spp2 20 0.2 -0.699 5 0.05 -1.301
Spp3 20 0.2 -0.699 5 0.05 -1.301
Spp4 20 0.2 -0.699 5 0.05 -1.301
Spp5 20 0.2 -0.699 5 0.05 -1.301

Undisturbed: H’ = -[(0.2 * -0.69897) + (0.2 * -0.69897) + (0.2 * -0.69897) + (0.2 * -0.69897)
+ (0.2 * -0.69897)] = 0.699

Disturbed: H’ = -[(0.8 * -0.09691) + (0.05 * -1.30103) + (0.05 * -1.30103) + (0.05 * -1.30103)
+ (0.05 * -1.30103)] = 0.338

This indicates that the undisturbed community has a more diverse species composition (more
than twofold) than the disturbed community.

If a community has K species that are equally dominant, it is said that the community has
attained its maximum diversity (H’ max). For a variable that contains K categories, the
maximum possible diversity H’ max is computed as

H’max = Log (K)

For example if a community has two species that are equally represented then

H’max = Log (2) = 0.301

i.e., H’max = H’ = -[(0.5 * Log(0.5)) + (0.5 * Log(0.5))] = 0.301

In the above example, the H’ max for the disturbed and undisturbed communities is equal to

H’max = Log (5) = 0.699

Usually the relative diversity index (J’) is easier to explain. It is defined as the ratio of the
observed diversity index to the maximum possible diversity:

J’ = H’/H’max = observed diversity / maximum expected diversity

J’ is easier to interpret because it shows how diverse a community is compared to its maximum
possible diversity. For example,

J’ for the undisturbed community = 0.699/0.699 = 1

J’ for the disturbed community = 0.338/0.699 = 0.48

This indicates that the undisturbed community has already reached its maximum possible
diversity, whereas the disturbed community attained only 48 % (nearly half) of its expected
maximum richness. In this way the two communities can be easily compared.
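The H’ and J’ computations above can be reproduced with a short Python sketch (the function name shannon is ours; base-10 logs are used, as in the worked example):

```python
import math

def shannon(counts, base=10):
    """Shannon-Weiner index H' = -sum(Pi * log(Pi)); any log base may be used."""
    n = sum(counts)
    return -sum((c / n) * math.log(c / n, base) for c in counts if c > 0)

undisturbed = [20, 20, 20, 20, 20]   # Table 3.4, plants/hectare
degraded = [80, 5, 5, 5, 5]

h_und = shannon(undisturbed)         # about 0.699
h_deg = shannon(degraded)            # about 0.338
h_max = math.log(5, 10)              # H'max = Log(K) for K = 5 species

j_und = h_und / h_max                # relative diversity J' = 1.0
j_deg = h_deg / h_max                # about 0.48
```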

Chapter 4. The Overall mean and variance of sample means

4.1 The overall mean of repeatedly taken sample means

Consider taking random samples of equal size repeatedly from a given population. The sample
means derived from each sampling have an overall mean which is equal to the mean of the
population from which the samples were drawn. To illustrate this, consider the milk yield per
lactation of 100 Borana cows shown in Table 4.1 below. The mean (μ), variance (σ²) and
standard deviation (σ) of the measurements are 450 liter/lactation, 1602 liter²/lactation² and
40.025 liter/lactation, respectively.

Suppose 1160 random samples of 5 cows each were taken repeatedly from the 100 cows. The
resulting sample mean milk yields, grouped into classes of width 8.0 liter/lactation, are shown
in Table 4.2. The overall mean (grand mean) of the sample means is computed as:

X̄ = ∑(fi * X̄i)/a,    where

X̄ = the grand mean

X̄i = the mean of the respective samples

a = the number of repetitions = ∑fi

For this data, X̄ = 522809/1160 = 450.7 liter/lactation

As seen here, the value of the grand mean is a close approximation of the value of μ (450
liter/lactation) from which the samples were taken. The grand mean becomes exactly equal to
the population mean (μ) if sampling is repeated infinitely. This is to be expected, because each
sample mean is an estimator of the population mean μ.

Table 4.1 Milk yield per lactation (liter/lactation) of 100 Borana cows

Cow ID NO Milk yield Cow ID NO Milk yield Cow ID NO Milk yield


1 360 37 440 73 480
2 370 38 440 74 480
3 370 39 440 75 480
4 380 40 440 76 480
5 380 41 440 77 480
6 390 42 440 78 480
7 390 43 440 79 480
8 390 44 440 80 490
9 390 45 440 81 490
10 400 46 450 82 490
11 400 47 450 83 490
12 400 48 450 84 490
13 400 49 450 85 490
14 400 50 450 86 500
15 400 51 450 87 500
16 410 52 450 88 500
17 410 53 450 89 500
18 410 54 450 90 500
19 410 55 450 91 500
20 410 56 460 92 510
21 410 57 460 93 510
22 420 58 460 94 510

23 420 59 460 95 510
24 420 60 460 96 520
25 420 61 460 97 520
26 420 62 460 98 530
27 420 63 460 99 530
28 420 64 460 100 540
29 430 65 470
30 430 66 470
31 430 67 470
32 430 68 470
33 430 69 470
34 430 70 470
35 430 71 470
36 430 72 470

Table 4.2 Distribution of 1160 sample milk yield each calculated based on a sample size of n = 5
randomly taken data

Mean milk yield    Mean milk yield    Observed          Relative           Sample standard
interval           class mid          frequency (F)     frequency (f) %    deviation (s)
394 - 402 398 1 0.1 27
403 - 411 407 11 0.9 30
412 - 420 416 19 1.6 35
421- 428 424 64 5.5 45
429 - 437 433 127 10.9 48
438 - 446 442 230 19.8 42
448 - 455 451 260 22.4 49
456 - 463 459 226 19.5 50
464 - 472 468 124 10.7 46

473 - 481 477 60 5.2 45
482 - 490 486 23 2.0 32
491 - 498 494 14 1.2 30
499 - 507 503 1 0.1 28

Total=1160 100%

4.2 The variance of sample means

Consider repeatedly drawing random samples of size n from a given population. Obviously the
sample means obtained from such an exercise differ from each other, and this variability is
expressed in terms of a variance referred to as the variance of means (σ²X̄). It is computed as

σ²X̄ = ∑fi(X̄i – μ)²/a = [∑fiX̄i² – (∑fiX̄i)²/a]/a

The square root of this quantity is called the standard deviation of means and is computed as

σX̄ = √[∑fi(X̄i – μ)²/a]

For the 1160 sample mean milk yield data:

Grand mean = 450.7 liter/lactation
Variance of means (σ²X̄) = 259.7 liter²/lactation²
Standard deviation of means (σX̄) = 16.12 liter/lactation


The variance of means measures how deviant a sample mean is from the mean of the population
from which the sample is drawn. For instance, in the present example, σX̄ = 16.12 liter/lactation
indicates that if a sample of 5 cows is randomly taken from the 100 cows, the mean milk yield of
the sampled cows deviates from the population mean milk yield by 16.12 liter/lactation on
average.

The practical significance of the variance of means is that its magnitude indicates how good the
sample mean is as an estimator of the population mean. A large variance of means indicates that
the sample mean is highly deviant from the population mean and the sampling is biased;
conversely, a small value indicates that the sample mean is a close approximation of the
population mean.

In practice the variance and standard deviation of means are computed as

σ²X̄ = σ²/n

σX̄ = √(σ²/n) = σ/√n

where

σ² = the variance of the population from which the sample was taken, and

n = the sample size

In the present example, since σ² of the 100 milk yield data is 1602 liter²/lactation², if a sample
of 5 cows is randomly taken, their mean would deviate from the population mean by
σX̄ = √(1602/5) = 17.9 liter/lactation
This value is close to the standard deviation of means calculated previously from the raw data,
16.12 liter/lactation. Of course, the two would match exactly if sampling were conducted an
infinitely large number of times.

From the above discussion it can be seen that the magnitude by which a sample mean deviates
from the population mean depends on the variability of the individuals in the population (σ², i.e.,
how heterogeneous the population is) as well as on the sample size (n) used to compute the
mean. The greater the sample size, the smaller the standard deviation of means. In fact, as the
sample size increases to a very large number, the standard deviation of means becomes
vanishingly small. This makes good sense: very large samples, averaging many observations,
should yield estimates closer to the population mean and less variable than those based on a few
items.

As a matter of fact, the formula for the variance of means allows estimation of the sample size
required to estimate the population mean at a desired level of precision, given that the
population variance is known. For instance, if one wishes to estimate the population mean milk
yield of Borana cows with a precision of 10 liter/lactation, knowing that σ² = 1602
liter²/lactation², the required sample size can be estimated as:

σ²X̄ = σ²/n
10² = 1602/n
n = 1602/100
n ≈ 16

i.e., about 16 cows should be randomly sampled to estimate the population mean milk yield of
Borana cows within a precision of 10 liter/lactation.
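The sample-size arithmetic can be written out in Python (illustrative only; in practice one might round up rather than to the nearest integer to guarantee the stated precision):

```python
# Sample size for a desired precision of the mean, from sigma_xbar^2 = sigma^2 / n
sigma2 = 1602.0    # population variance of milk yield, liter^2/lactation^2
d = 10.0           # desired precision (standard deviation of means), liter/lactation

n_exact = sigma2 / d ** 2    # 1602/100 = 16.02
n = round(n_exact)           # about 16 cows, as in the text

print(n_exact, n)
```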


As seen from the above formula for the standard deviation of means, one must know σ², which
is a rarely known parametric value. As a result, when working with samples, the sample
variance (S²) is always calculated together with the mean and used to replace σ² in the above
formula. This gives the expression for the standard error of the mean (SE), symbolized as SX̄:

SX̄ = √(S²/n) = S/√n

The standard error of the mean is helpful in judging how deviant the sample mean is from the
unknown population mean. Other statistical quantities calculated from sample data likewise
have standard error values that make it possible to judge how deviant the sample statistic is
from the corresponding unknown population parameter. There are hypothesis-testing
procedures for deciding how much standard error can be tolerated before the sample estimate is
judged biased.

As a rule of thumb, the following can serve as a guide for this kind of decision. Calculate the
CV (coefficient of variation, %) as shown previously: if CV ≤ 10 %, the sample estimate is
considered a very good approximation of the parametric value. This is usually achieved in
experimental research.

If CV is between 10 and 20 %, this is also considered a good approximation of the parametric
value. This is in fact a very good precision level for samples obtained from exploratory
(survey-type) research, but for data generated from experimental research it should be taken as
the upper tolerable limit of deviation for a sample statistic.

If CV is between 20 and 30 %, this may be tolerated for exploratory research, but for designed
experiments it indicates too much noise or an unstandardized methodology, and

one should consider revising the data collection methodology. Apart from that, increasing the
sample size is obviously also a good way to improve the precision of the estimate, whenever
practicable.

Chapter 5 Distribution of continuous variable

5.1 The Normal distribution

Consider data on a continuous variable collected from a biological population. The frequency
distribution of such a continuous variable typically gives a bell-shaped curve called the normal
frequency distribution. To illustrate a normal distribution, consider the milk yield data of the
100 Borana cows shown earlier in Table 4.1. The data expressed as a frequency table are shown
in Table 5.1 below.

Table 5.1 Frequency distribution of the milk yield of 100 Borana cows.

Milk yield per      Observed           Relative            Standardized normal
lactation (x)       frequency (F)      frequency (%) (f)   deviate Z = (Xi – µ)/σ
360 1 1 -2.25
370 2 2 -2.00
380 2 2 -1.75
390 4 4 -1.50
400 6 6 -1.25
410 6 6 -1.00
420 7 7 -0.75
430 8 8 -0.50
440 9 9 -0.25
450 10 10 0.00
460 9 9 0.25
470 8 8 0.50
480 7 7 0.75
490 6 6 1.00
500 6 6 1.25
510 4 4 1.50
520 2 2 1.75
530 2 2 2.00
540 1 1 2.25
N = 100 ∑ f = 100 %

A plot of the milk yield data against their respective frequencies gives a bell-shaped curve, a
normal curve, as shown in Figure 5.1 below.



Figure 5.1 Frequency distribution of the milk yield of 100 Borana cows (data of table 5.1) shown
together with the fitted normal curve.

5.1.1 Properties of a normal distribution

Not all bell-shaped curves are normally distributed. Most appropriately, a normal distribution is
defined as one in which the height of the curve at Xi is expressed by the relation:

Yi = [1/(σ√(2π))] e^(-(Xi – μ)²/2σ²)

The height of the curve (Yi) is referred to as the normal density.
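The density formula can be evaluated numerically; a minimal Python sketch (the function name normal_density is ours, for illustration):

```python
import math

def normal_density(x, mu, sigma):
    """Height Yi of the normal curve at x."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

# Milk yield example: mu = 450, sigma = 40.025 liter/lactation
peak = normal_density(450, 450, 40.025)   # the curve is highest at x = mu
print(peak)
```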

From biological point of view a normal distribution is characterized by the following properties.

i) Symmetry
A normally distributed data set is symmetrical in which the frequency of observations below the
mean and above the mean are equal. In practice a data set can be slightly deviant from perfectly
symmetrical distribution, but such moderate departure from symmetry does not violet the
characteristics of a normal distribution.

As opposed to a symmetrical distribution, there are asymmetrical or skewed distributions that
contain outlying values in one tail. A negatively skewed distribution contains outlying values at
the low end (left tail), while a positively skewed distribution contains outlying values at the
high end (right tail). These two types of distributions are depicted in Figures 5.2a and 5.2b
below.


Figure 5.2 negatively skewed (a) and positively skewed (b) distributions

ii) Kurtosis
Kurtosis expresses how peaked a distribution is. A normally distributed data set is mesokurtic:
the majority of the observations lie at the center, and the frequency of observations decreases
proportionately toward the tails of the distribution. In contrast, leptokurtic distributions contain
more observations at the tails and the center, and fewer in the intermediate region, than a
normal distribution. Platykurtic distributions, conversely, contain fewer observations at the tails
and the center and more in the intermediate region. These distributions are displayed in Figures
5.3a and

3b, respectively. Note that both leptokurtic and platykurtic distributions are symmetrical like a
normal distribution.


Figure 5.3 Leptokurtic (a) and platykurtic (b) distributions compared to normal distribution.

5.1.2 Standard normal deviate values (Z score)

The Z score is a way of describing a data value as a deviation from the mean in terms of
standard deviation units, i.e.,

Zi = (Xi – μ)/σ

For example, the smallest milk yield in Table 5.1 is 360 liters/lactation. This value is
(360 – 450)/40.025 = -2.25, i.e., 2.25 standard deviation units below the mean milk yield of
Borana cows. Likewise, the milk yield of a cow that is 2 standard deviations above the mean
measures about 530 liter/lactation, and so on. Therefore, in a given data set each Xi value can
be represented by a Z value, as shown in column 4 of Table 5.1. Data points below the mean
are represented by negative Z values and those above the mean by positive Z values. As a
result, the mean of Z is zero and the variance (and hence the standard deviation) of Z is one. Z
is also a unitless statistical quantity. The distribution of Z is normal around its mean of zero, as
shown in Figure 5.4 below, which is the frequency plot of the Z values representing the milk
yield data of the 100 Borana cows (data of Table 5.1). Z scores are called standard normal
deviates because they represent deviations that are standardized (unitless) and normally
distributed.
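The conversion of a raw value to a Z score can be sketched in Python (illustrative; the helper name z_score is ours):

```python
mu, sigma = 450.0, 40.025    # milk yield parameters (Table 5.1)

def z_score(x):
    """Standard normal deviate: deviation from the mean in SD units."""
    return (x - mu) / sigma

z_low = z_score(360)     # smallest yield: about -2.25
z_high = z_score(530)    # a yield roughly 2 SD above the mean
print(round(z_low, 2), round(z_high, 2))
```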


Figure 5.4 Frequency distribution of Zi values representing milk yield data of the 100 Borana
cows (data from table 5.1.)

5.1.3 Areas under a normal curve

One major application of the normal distribution is estimating probabilities, i.e., the probability
of finding a value within a given range. This amounts to estimating the area under the normal
curve that lies between certain values. For instance, suppose a cow is randomly selected from
the 100 cows depicted in Table 5.1. One may ask: what is the probability that the milk yield of
the randomly selected cow lies between 410 and 490 liter/lactation? Since we know that the
distribution of these data is normal with μ = 450 liter/lactation and σ = 40.025 liter/lactation,
the probability in question can be estimated as the area contained between 410 and 490
liter/lactation.

In probability notation this is stated as P(410 ≤ Xi ≤ 490) = ? Graphically this represents the
area under the milk yield distribution that lies between the indicated limits, as shown by the
figure below.

[Figure: frequency distribution of milk yield (liters/lactation), with the area between 410 and 490 shaded]

The same question may be asked using the Z distribution, by converting the Xi values into the
corresponding Z values as follows:

Zi = (Xi – μ)/σ, hence Z1 = (410 – 450)/40.025 ≈ -1 and Z2 = (490 – 450)/40.025 ≈ 1

Hence the area between 410 and 490 liters/lactation can be represented as the area between Z
values of -1 and 1, i.e., P(410 ≤ Xi ≤ 490) = P(-1 ≤ Zi ≤ 1), as shown by the figure below.

[Figure: standard normal (Z) curve, with the area between Z = -1 and Z = 1 shaded]

Knowing that the distribution of the data is normal, the area between any two limits can be
estimated by integrating the normal density over the indicated interval. For instance, in the
present example the area under the milk yield distribution between 410 and 490 liters/lactation
can be estimated by integrating the normal density function displayed earlier between the limits
410 and 490 liters/lactation, i.e.,

P(410 ≤ Xi ≤ 490) = ∫ from 410 to 490 of [1/(σ√(2π))] e^(-(x – μ)²/2σ²) dx

Similarly the same procedure can be employed to integrate the area under the Z curve that lies
between -1 and 1.

P(-1 ≤ Zi ≤ 1) = ∫ from -1 to 1 of [1/√(2π)] e^(-z²/2) dz

Performing this kind of integration requires knowledge of calculus and is tedious. As a result,
areas under the normal curve have been calculated and presented in a table
called the Z table (or normal table), and such questions can be tackled by consulting the Z table
displayed in Appendix Table 1. There are different kinds of Z tables, formulated in different
ways but conveying similar information. Some Z tables give the area outside a given Z value.
Another kind gives the area from the negative extremity up to a given Z value (the cumulative
normal distribution table). The Z table displayed here in Appendix 1 gives the area under the Z
curve between 0 and a given positive Z value indicated in the first column. Thus the table gives
areas only for the positive half of the curve; if the negative side is needed, the area up to the
positive equivalent is taken by symmetry. In the present example, the area between 0 and Z = 1
is 0.3413. Likewise, the area between -1 and 0 is also 0.3413, and hence the area between Z
values of -1 and 1 is 0.3413 + 0.3413 = 0.6826, i.e.,

P(-1≤Zi≤1) = 0.6826 and hence

P(410≤Xi≤490) = 0.6826.

This implies that the probability of randomly sampling a cow that gives a milk yield between
410 and 490 liters/lactation is 0.6826. In other words, 68.26 % of Borana cows give milk yields
between 410 and 490 liters/lactation.

Z values are customarily referred to by the area they exclude. For example, the Z value that
excludes an area of 0.025 under the normal curve is referred to as Z0.025. To find the actual
value of Z0.025 from the table, the area up to Z0.025 is used as the entry; it is found by
deducting 0.025 from 0.5, i.e., P(0 ≤ Zi ≤ Z0.025) = 0.5 – 0.025 = 0.475. Using this as the entry
value in the body of the table and looking up the corresponding Z value gives Z0.025 = 1.96.
Note that the first row of the table gives the second decimal place of the Z value. In a similar
fashion it can be shown that the Z value that excludes an area of 0.025 in the left tail (-Z0.025)
is -1.96. Thus the Z values ±1.96 together bound 95 % of the area under the curve. In general it
can be shown that:

68.27 % of the observations in a normally distributed data set lie within the range μ ± σ

95 % lie within μ ± 1.96σ, i.e., P(μ – 1.96σ ≤ Xi ≤ μ + 1.96σ) = 0.95

Also 99% lie within μ±2.58σ. i.e., P(μ-2.58σ ≤ Xi ≤ μ+2.58σ) = 0.99
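The table lookup can also be done numerically. A minimal Python sketch of the worked example, using the standard normal cumulative area built from math.erf (the function name phi is ours):

```python
import math

def phi(z):
    """Cumulative area under the standard normal curve up to z (via math.erf)."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

mu, sigma = 450.0, 40.025
z1 = (410 - mu) / sigma      # about -1
z2 = (490 - mu) / sigma      # about  1

p = phi(z2) - phi(z1)        # P(410 <= Xi <= 490), close to 0.6826
print(round(p, 4))
```

The tiny difference from the tabled 0.6826 arises because 410 and 490 are not exactly 1σ from the mean (40/40.025 ≈ 0.9994 rather than 1).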

5.1.4 The distribution of sample means

If random samples of size n are drawn from a normal population, the means of these samples
will conform to a normal distribution. The distribution of means from a non-normal population
will not be normal, but will tend toward normality as the sample size increases; this is known as
the central limit theorem.

For example, consider the 1160 sample mean milk yield data displayed in Table 4.2. A plot of
the mean values against their frequencies indicates that the distribution is normal around the
population mean μ = 450 liters/lactation, with standard deviation of means σX̄ = σ/√n =
40.025/√5 = 17.9 liters/lactation (Figure 5.5).

As seen from Figure 5.5 below, the width of the sample mean distribution is a function of σX̄,
and hence the distribution of the sample means is narrower around the population mean than
that of the raw data. Note that the distribution of the milk yield of the 100 cows (the raw data)
is also shown together with the sample mean distribution for comparison. The distribution of
the sample means closes in around the population mean as the sample size increases. For
example, if samples of 25 cows were randomly taken from the 100 cows, σX̄ = σ/√n =
40.025/√25 = 8.005 liter/lactation. Obviously the sample means so obtained would be much
more closely distributed around the population mean μ = 450 liter/lactation. Note that with a
sample size of 5, the sample means ranged between 398 and 503 liters/lactation; with a sample
size of 25, the sample mean distribution would range between about 426 and 474
liter/lactation. Recall that the range of a distribution is approximately μ ± 3 standard deviation
units, which in this case is

μ ± 3σX̄ = 450 ± (3 × 8.005) = 426 – 474 liter/lactation

As with the raw data, sample means can also be represented by Z scores, computed from the
sample mean as

Z = (X̄i – μ)/σX̄


Figure 5.5 Frequency distributions of the 1160 mean milk yield data (solid curve) plotted
together with the raw data (dashed curve).

5.1.5 Computing probabilities based on sample mean distribution

As in the case of the raw data distribution, probabilities can be computed based on sample mean
distribution following the same procedure employed earlier. For instance consider that a sample

of 5 cows is randomly taken from the 100 cows mentioned earlier and their mean milk yield is
calculated. One may ask for the probability that the mean milk yield of the sampled cows lies
within a given interval, say between 414.9 and 485.1 liters/lactation. The same procedure as
before is employed to compute the desired probability, as follows.

i) First it is always good to make a plot of the distribution and indicate the desired area

[Figure: sample mean distribution of milk yield per lactation, with the area P(414.9 ≤ X̄i ≤ 485.1) shaded]

ii) Convert the interval values into Zi values:

Z1 = (414.9 – 450)/17.9 = -1.96
Z2 = (485.1 – 450)/17.9 = 1.96

iii) The area between the X̄ values 414.9 and 485.1 is the area between Z values of -1.96 and
1.96: P(414.9 ≤ X̄i ≤ 485.1) = P(-1.96 ≤ Zi ≤ 1.96)

iv) From the Z table, the area between 0 and 1.96 is P(0 ≤ Zi ≤ 1.96) = 0.475, and likewise
P(-1.96 ≤ Zi ≤ 0) = 0.475. Thus P(-1.96 ≤ Zi ≤ 1.96) = 0.475 + 0.475 = 0.95

The interpretation is that the mean milk yield of the sampled cows would be between 414.9 and
485.1 liters/lactation with 95 % probability.

From the Z table it can be shown that, if samples of size n are randomly taken from a
population, 95 % of the sample means lie between μ ± 1.96σX̄, and similarly 99 % of the
sample means lie between μ ± 2.58σX̄, i.e.,

P(μ – 1.96σ/√n ≤ X̄i ≤ μ + 1.96σ/√n) = 0.95

P(μ – 2.58σ/√n ≤ X̄i ≤ μ + 2.58σ/√n) = 0.99
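The four-step calculation above can be reproduced numerically; a Python sketch (illustrative, with the cumulative normal area built from math.erf):

```python
import math

def phi(z):
    """Cumulative standard normal area up to z."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

mu, sigma, n = 450.0, 40.025, 5
se = sigma / math.sqrt(n)            # standard deviation of means, about 17.9

# P(414.9 <= Xbar <= 485.1) for the mean of a sample of 5 cows
z1 = (414.9 - mu) / se               # about -1.96
z2 = (485.1 - mu) / se               # about  1.96
p = phi(z2) - phi(z1)                # about 0.95
print(round(p, 3))
```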

5.2 Student t distribution

Earlier we saw that the deviation of a sample mean from the population mean, divided by the
standard deviation of the mean, gives a Z value, which is normally distributed, i.e.,

Zi = (X̄i – μ)/σX̄

On the other hand, the deviation of a sample mean from the population mean divided by the
standard error of the mean (SX̄) gives a t value, whose distribution is leptokurtic rather than
mesokurtic like the Z distribution, particularly when the sample sizes are small (< 30), i.e.,

ti = (X̄i – μ)/SX̄

To illustrate this, consider the 1160 sample milk yield data presented earlier. Together with the
sample means there are corresponding sample standard deviations, and from these the standard
error of each mean can be computed, as shown in Table 5.2 below. Hence t values can be
computed corresponding to the deviation of each sample mean using the above formula. Note
that the table also shows the Z values pertaining to each sample mean for comparison. The plot
of the t and Z values against their relative frequencies is shown in Figure 5.6.

Table 5.2 Z and t values corresponding to the sampled 1160 mean milk yield data of Borana cows.

Mean milk yield  Absolute   Relative       Standard       Standard error  Standard deviation
(class mid)      frequency  frequency (%)  deviation (S)  of mean (Sx̄)    of mean (σx̄)         t      Z
398                  1          0.1            27             12.07           17.90          -4.31  -2.91
407                 11          0.9            30             13.42           17.90          -3.21  -2.40
416                 19          1.6            35             15.65           17.90          -2.17  -1.90
424                 64          5.5            45             20.12           17.90          -1.29  -1.45
433                127         10.9            48             21.47           17.90          -0.79  -0.95
442                230         19.8            42             18.78           17.90          -0.43  -0.45
451                260         22.4            49             21.91           17.90           0.05   0.06
459                226         19.5            50             22.36           17.90           0.40   0.50
468                124         10.7            46             20.57           17.90           0.87   1.01
477                 60          5.2            45             20.12           17.90           1.34   1.51
486                 23          2.0            32             14.31           17.90           2.52   2.01
494                 14          1.2            30             13.42           17.90           3.28   2.46
503                  1          0.1            28             12.52           17.90           4.23   2.96
Total             1160        100%


Figure 5.6 Z and t distributions derived from the 1160 sample milk yield data

5.2.1 Characteristics of t distribution compared to Normal (Z distribution)

The t distribution, like the normal (Z) distribution, is symmetrical around its mean of zero. However, the t distribution contains more values at the tails and at the center, but fewer in the intermediate (shoulder) region, compared to the Z distribution; i.e., t is leptokurtic whereas Z is mesokurtic.

The leptokurtic nature of the t distribution diminishes as the sample size increases. For small sample sizes t is strongly leptokurtic, and as the sample size increases it becomes less and less so; i.e., the t distribution approaches the normal (Z) distribution as the sample size increases, and for sample sizes ≥ 30 the t distribution resembles Z with only minor differences. Thus there are different t curves depending on the sample size used, unlike the Z distribution, of which there is only one. Since the t distribution is identical with the Z distribution at ∞ degrees of freedom, the Z curve is referred to as one of the family of t curves. Since the t curves for sample sizes ≥ 30 are more or less similar to the Z curve, in practice there are 29 different t curves with distinguishable differences among themselves.

5.2.2 Areas under t curves (t table)

For use in probability calculations, areas under the t curves have been integrated and summarized in tabular form, known as the t table. Since there are different t curves, there are also different t tables, i.e., separate tables giving the areas under the different t curves. Accordingly, one could compile 29 different t tables pertaining to the 29 distinguishably different t curves. However, not all areas under the t curves are useful for probability estimation and inference. Hence a conventional t table has been produced, condensing the 29 different t tables into one, and this table is shown as Appendix Table 2 in the appendix section.

The body of the conventional t table gives t values that exclude areas at the tail region of the different t curves. The tail areas are shown in the first row of the table, ranging from 0.1 down to 0.005. The first column gives the degrees of freedom, which identify the particular t curve. For example, consider the t value that excludes a 0.025 area at the right tail of the t curve with 4 degrees of freedom: t0.025 (4) = 2.776. Also consider the t value that excludes the same 0.025 area at the right tail of the t curve with 24 degrees of freedom: t0.025 (24) = 2.064. The latter is smaller, although both exclude the same area under their respective curves, because the t curve with 4 degrees of freedom is more leptokurtic, with more extended tails, than the t curve with 24 degrees of freedom. Also note that the t value that excludes a 0.025 area at ∞ degrees of freedom is t0.025 (∞) = 1.960, identical to the Z value that excludes a 0.025 area under the Z curve, Z0.025 = 1.96. In fact, the t values given in the last row, for ∞ degrees of freedom, are all identical to the corresponding Z values that exclude the same areas under the Z curve. This is expected, as the t distribution for large sample sizes resembles the Z distribution. Generally, the t value that excludes a given α area at n−1 degrees of freedom is designated tα(n−1).
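These tabulated critical values can be reproduced with SciPy's t and normal distributions (a sketch assuming SciPy is available; `ppf` is the inverse of the cumulative distribution function):

```python
from scipy.stats import norm, t

# t values excluding a 0.025 area at the right tail, for 4 and 24 degrees of freedom
for df in (4, 24):
    print(f"t0.025({df}) = {t.ppf(1 - 0.025, df):.3f}")

# As the degrees of freedom grow, the t curve approaches the Z curve
print(f"Z0.025 = {norm.ppf(1 - 0.025):.3f}")
```

This prints 2.776 and 2.064 for the two t curves and 1.960 for the Z curve, matching the table values quoted above.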

Exercise questions related to the normal and t distributions
1) Using the Z table, find the following probabilities:

a) P(0 ≤ Z ≤ 1.6) b) P(0 ≤ Z ≤ 0.9) c) P(Z > 1.83) d) P(Z < -1.42)

e) P(-1.55 ≤ Z ≤ 0.44) f) P(0.58 ≤ Z ≤ 1.74)

Answer: a) 0.4452 b) 0.3159 c) 0.5 – 0.4664 = 0.0336 d) 0.5 – 0.4222 = 0.0778

e) P(-1.55 ≤ Z ≤ 0) + P(0 ≤ Z ≤ 0.44) = 0.4394 + 0.1700 = 0.6094

f) P(0 ≤ Z ≤ 1.74) – P(0 ≤ Z ≤ 0.58) = 0.4591 – 0.2190 = 0.2401
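The same table look-ups can be checked with the standard normal CDF (a sketch assuming SciPy; `norm.cdf(z)` gives the area to the left of z):

```python
from scipy.stats import norm

print(norm.cdf(1.6) - norm.cdf(0))       # a) P(0 <= Z <= 1.6)
print(1 - norm.cdf(1.83))                # c) P(Z > 1.83)
print(norm.cdf(-1.42))                   # d) P(Z < -1.42)
print(norm.cdf(0.44) - norm.cdf(-1.55))  # e) P(-1.55 <= Z <= 0.44)
```

Small discrepancies in the last digit come from the four-decimal rounding of the printed Z table.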

2) A variable X is normally distributed with a mean µ = 10 and a standard deviation σ = 2. Find


the probabilities associated with the following

a) P(X > 13.5) b) P(X < 8.2) c) P(9.4 ≤ X ≤ 10.6) d) P(8 ≤ X ≤ 9)

Answer: a) P(Z > 1.75) = 0.5 – 0.4599 = 0.0401 b) P(Z < -0.9) = 0.5 – 0.3159 = 0.1841

c) P(-0.3 ≤ Z ≤ 0.3) = 0.1179 + 0.1179 = 0.2358

d) P(-1 ≤ Z ≤ -0.5) = 0.3413 – 0.1915 = 0.1498
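Question 2 can be checked by working directly with a normal distribution of mean 10 and standard deviation 2 (a sketch assuming SciPy):

```python
from scipy.stats import norm

X = norm(loc=10, scale=2)          # X ~ N(mu = 10, sigma = 2)
print(1 - X.cdf(13.5))             # a) P(X > 13.5)
print(X.cdf(8.2))                  # b) P(X < 8.2)
print(X.cdf(10.6) - X.cdf(9.4))    # c) P(9.4 <= X <= 10.6)
print(X.cdf(9) - X.cdf(8))         # d) P(8 <= X <= 9)
```

Internally this is the same standardization step, Z = (X − µ)/σ, that the hand calculation uses.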

3) A variable X is normally distributed with unknown mean µ and a standard deviation

σ = 2. The probability that a randomly selected value of X exceeds 7.5 is 0.8023. Find the
value of µ ? (Answer : µ = 9.2)

4) Assume that the body weight of a given goat population is normally distributed with unknown
mean µ and the value of σ is 3 kg. The weight of a randomly selected goat which is one σ unit
below the mean µ is 15 kg.

a) Find the value of µ? (Answer: 18 kg)

b) What percentage of the goats weigh below 12 kg? (Answer: 2.28%)

5) a) What is the difference between Standard deviation of means and standard error of means?

b) What is the purpose of calculating standard error value? What does a small value of standard
error imply compared to large value of standard error?

c) Give the mathematical definition of Z and t and interpret their meanings? What are the
differences between Z and t?

6) Long-term yield records indicate that the local variety of corn around Hawassa gives a
mean yield of µ = 35 quintals/hectare with a standard deviation of σ = 5 quintals.

a) If a random sample of 25 farmers is selected from the region, what is the probability that the
average yield of these farmers exceeds 38 quintals/hectare? (Answer: 0.0013)

b) Find the probability that the average yield of the sampled farmers (the sample mean) is within
±2 quintals of the mean corn yield in the area, µ,

i.e., P(µ - 2 quintals ≤ the mean yield of the 25 farmers ≤ µ + 2 quintals)? (Answer: 0.9544)
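Question 6 rests on the sampling distribution of the mean, whose standard deviation is σ/√n; a sketch assuming SciPy:

```python
from math import sqrt
from scipy.stats import norm

mu, sigma, n = 35.0, 5.0, 25
se = sigma / sqrt(n)   # standard deviation of the sample mean = 5/5 = 1

# a) P(sample mean > 38)
print(1 - norm.cdf(38, loc=mu, scale=se))

# b) P(mu - 2 <= sample mean <= mu + 2)
print(norm.cdf(mu + 2, mu, se) - norm.cdf(mu - 2, mu, se))
```

Part (a) corresponds to Z = 3 and part (b) to P(−2 ≤ Z ≤ 2).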

Chapter 6 Estimation of population parameter from sample statistics

Various population parametric values are estimated from sample statistics. There are two ways of estimating parametric values, namely the point estimation and the interval estimation procedures. In the first case, the value of the sample statistic is taken directly as the value of the corresponding population parameter. In the latter case, a range of values expected to contain the parametric value is computed from the value of the sample statistic. The latter is recommended because it allows expressing the level of confidence in the estimation. The following section elaborates the interval estimation procedure for the population mean based on the sample mean value.

Suppose that a random sample of 25 farmers was selected from Hawassa Zuria woreda and their corn yield was recorded. The mean yield of the sampled farmers was 36 quintals/hectare. Estimate the mean corn yield for the population of farmers in the woreda, given that, from long-term records, the standard deviation of the corn yield of the woreda farmers is 10 quintals/hectare.

i) Point estimation procedure.


The best available estimate at hand for the population mean is the sample mean and thus the
mean corn yield of the farmers in the woreda is taken as 36 Quintal/ha.

ii) Confidence interval estimation procedure


In statistics, the two most commonly used confidence interval estimation procedures are the 95 % and the 99 % confidence interval procedures. In the former case, the estimator is 95 % confident that the parametric value falls within the computed interval. In the latter case, the confidence level is 99 % for the interval to contain the parametric value. The procedure for deriving the 95 % confidence interval for the population mean is described below to show its derivation. A similar argument is employed for the estimation of other parametric values.

6.1 Formula for 95 % Confidence Interval estimation of population mean (μ)

The formula for the 95 % confidence interval estimation of μ is developed from the notion that, in repeated sampling, 95 % of the sample means lie between $\mu \pm 1.96\frac{\sigma}{\sqrt{n}}$. i.e.,

$$P\left(\mu - 1.96\frac{\sigma}{\sqrt{n}} \le \bar{X}_i \le \mu + 1.96\frac{\sigma}{\sqrt{n}}\right) = 0.95$$ (subtract μ from all sides of the inequality)

$$P\left(-1.96\frac{\sigma}{\sqrt{n}} \le \bar{X}_i - \mu \le 1.96\frac{\sigma}{\sqrt{n}}\right) = 0.95$$ (subtract $\bar{X}_i$ from all sides)

$$P\left(-\bar{X}_i - 1.96\frac{\sigma}{\sqrt{n}} \le -\mu \le -\bar{X}_i + 1.96\frac{\sigma}{\sqrt{n}}\right) = 0.95$$ (multiply all sides by −1, which reverses the inequalities and removes the − sign from μ)

$$P\left(\bar{X}_i + 1.96\frac{\sigma}{\sqrt{n}} \ge \mu \ge \bar{X}_i - 1.96\frac{\sigma}{\sqrt{n}}\right) = 0.95$$ (rearrange the left and right sides)

$$P\left(\bar{X}_i - 1.96\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X}_i + 1.96\frac{\sigma}{\sqrt{n}}\right) = 0.95$$ (final equation)

The above formula means that the interval $\bar{X}_i \pm 1.96\frac{\sigma}{\sqrt{n}}$ contains the population mean μ from which the sample was drawn with 95 % probability. This is the formula used to estimate μ when σ is known. In the present example on the corn yield of farmers,

n = 25, $\bar{X}_i$ = 36 Q/ha, σ = 10 Q/ha, and the 95 % CI for μ is computed as

$$\bar{X}_i \pm 1.96\frac{\sigma}{\sqrt{n}}$$
36 Q/ha ± 1.96 * 10/√25

36 ± 3.92 Q/ha

i.e., the mean corn yield of farmers in Hawassa zuria woreda ranges between 32.08 Q/ha and
39.92 Q/ha with 95 % probability.
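This computation can be scripted for any confidence level (a sketch assuming SciPy; `norm.ppf` returns the Z value that leaves α/2 in the upper tail):

```python
from math import sqrt
from scipy.stats import norm

def ci_mean_known_sigma(xbar, sigma, n, conf=0.95):
    """Confidence interval for the population mean when sigma is known."""
    z = norm.ppf(1 - (1 - conf) / 2)  # e.g. 1.96 for a 95 % interval
    half = z * sigma / sqrt(n)
    return xbar - half, xbar + half

lo, hi = ci_mean_known_sigma(36.0, 10.0, 25)
print(f"95% CI: {lo:.2f} to {hi:.2f} Q/ha")   # 32.08 to 39.92
```

Passing `conf=0.99` reproduces the 99 % interval derived in the next section.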

6.2 The formula for other confidence levels

In 95 % CI estimation, the estimator accepts a 5 % chance that the constructed interval fails to contain the true parametric value. This 5 % is termed the probability of uncertainty and is denoted by α. Hence the 95 %, which is the probability of confidence, is denoted by 1 − α.

The value 1.96 in the 95 % CI formula represents the Z value that excludes a 2.5 % area at one tail, i.e., Z0.025. Hence the 95 % CI for μ can be written as:

$$\bar{X}_i \pm Z_{0.025}\frac{\sigma}{\sqrt{n}}$$

In general, for any level α, the (1 − α) confidence interval for μ, knowing the σ value, is given as:

$$\bar{X}_i \pm Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$

Accordingly, the 99 % CI for μ is given as $\bar{X}_i \pm Z_{0.005}\frac{\sigma}{\sqrt{n}} = \bar{X}_i \pm 2.58\frac{\sigma}{\sqrt{n}}$

For the corn yield data, the 99 % CI for μ is:

$$\bar{X}_i \pm 2.58\frac{\sigma}{\sqrt{n}} = 36\ \text{Q/ha} \pm 2.58 \times 10/\sqrt{25}$$

36 ± 5.16 Q/ha

i.e., the mean corn yield of farmers in Hawassa Zuria woreda ranges between 30.84 Q/ha and
41.16 Q/ha with 99 % probability.

Note that the 99 % CI is wider than the 95 % CI. This is natural, because as the confidence level increases, the width must also increase to reduce the chance of missing the true value.

In general, the 99 % and 95 % CIs are the ones commonly employed in estimating parametric values. Confidence levels above 99 % are not recommended because the interval becomes too wide to be informative about the true value. Similarly, confidence levels below 95 %, although they give narrower and thus more desirable intervals, are not recommended because they increase the probability of uncertainty, or error, in the estimation.

6.3 Confidence interval estimation when σ is unknown

The (1 − α) CI formula based on the Z value requires the population standard deviation σ to be known. However, σ is in most instances unknown. In such cases σ is replaced by S (the sample standard deviation) and the formula is modified as follows:

$$\bar{X}_i \pm Z_{\alpha/2}\frac{S}{\sqrt{n}}$$ when σ is unknown and n ≥ 30 (large-sample-based studies)

The Z distribution is still employed with the standard error of the mean, because the deviation of the sample mean from the population mean divided by the SE value can be treated as a Z value when the sample size is large (i.e., n ≥ 30):

$$\frac{\bar{X} - \mu}{S/\sqrt{n}} \approx Z, \quad \text{if } n \ge 30$$

On the other hand, if n < 30 (i.e., a small-sample-based study), the formula is modified as:

$$\bar{X}_i \pm t_{\alpha/2\,(n-1)}\frac{S}{\sqrt{n}}$$
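The small-sample formula can be checked against exercise 2 below (a sketch assuming SciPy; `t.ppf` gives the critical t value at n − 1 degrees of freedom):

```python
from math import sqrt
from scipy.stats import t

# Exercise 2 data: 25 soil samples, mean 8.75 mg/gm, S^2 = 9.61 (mg/gm)^2
n, xbar, s = 25, 8.75, sqrt(9.61)
t_crit = t.ppf(1 - 0.05 / 2, n - 1)   # t0.025(24)
half = t_crit * s / sqrt(n)
print(f"95% CI: {xbar:.2f} ± {half:.2f} mg/gm")   # 8.75 ± 1.28
```

With only 24 degrees of freedom, the t multiplier (about 2.06) is larger than the Z value 1.96, so the interval is slightly wider than a Z-based one would be.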

Exercise

1) 50 soil samples were randomly collected from around Hawassa and their phosphorus content
   was analyzed. The mean for the 50 samples was 9.2 mg/gm of soil with a variance S2 = 6.25
   mg2/gm. Give the 95 % confidence interval for the mean (µ) phosphorus content of the soil
   around Hawassa. (Answer: 9.2 ± 0.69 mg/gm of soil, using S = √6.25 = 2.5 mg/gm)

2) Suppose 25 soil samples were collected and these gave a sample mean phosphorus content
of 8.75 mg/gm with S2 value of 9.61 mg2/gm soil. What would be the 95 % confidence
interval for the mean phosphorus content of the soil around Hawassa? (Answer: 8.75 ± 1.28
mg/gm)

3) Based on data collected from a random sample of 9 weaver birds, the 95 % confidence
   interval for the mean life expectancy (μ) of weaver birds was calculated to be between 30
   and 40 months, i.e., P(30 ≤ µ ≤ 40) = 0.95. Calculate the 99 % confidence interval for the
   mean life expectancy (µ) of weaver birds. There is enough information to work with; think
   about it. The sample size and the 95 % CI values are given: using these, first find the values
   of the sample mean and the standard error of the mean. Also find the t value from the table
   based on the sample size and the α value. Then you can convert the interval into a 99 % CI.
   (Answer: 27.7 – 42.3 months)
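Exercise 3 can be solved by backing the standard error out of the given 95 % interval and rescaling it with the 99 % critical t value (a sketch assuming SciPy):

```python
from scipy.stats import t

n = 9
lo95, hi95 = 30.0, 40.0
xbar = (lo95 + hi95) / 2                      # sample mean = 35 months
se = (hi95 - lo95) / 2 / t.ppf(0.975, n - 1)  # S/sqrt(n), from the 95 % half-width
half99 = t.ppf(0.995, n - 1) * se
print(f"99% CI: {xbar - half99:.1f} to {xbar + half99:.1f} months")   # 27.7 to 42.3
```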

Chapter 7 Introduction to statistical hypothesis testing

Hypothesis testing involves comparing populations, and the purpose is to make objective decisions regarding the observed differences. There are different hypothesis testing procedures, and choosing a particular procedure depends on the following.

i) The type of the response variable studied


Statistical tests employed to compare populations with respect to quantitative variables are different from those employed for qualitative variables. Quantitative variables are best described by mean values, and hence mean values of the variables are used to compare the populations studied. Such tests, which use mean values for comparison, are called parametric tests. By contrast, tests employed when comparing populations with respect to qualitative variables are called non-parametric tests. These tests use frequencies, ranks, median values, etc. for the comparison, and they are also referred to as distribution-free tests, as they do not use mean values as a basis for the comparison.

Non-parametric tests are not constrained by assumptions about the distribution of the data, such as normality and homogeneity of variance. Therefore such tests are at times recommended for quantitative variables when the data deviate significantly from normality and/or the variances are heterogeneous. Both types of tests are discussed in detail in later sections.

ii) The number of populations compared


Depending on the number of populations compared, hypothesis testing procedures are categorized
as follows

a) One sample hypothesis testing


This involves studying one population and comparing it with a known or previously
studied population

b) Two sample hypothesis testing


This involves comparing two simultaneously studied populations

c) Multi-sample hypothesis testing
Multi-sample tests concern comparing three or more populations
iii) The number of factors (treatments) considered
In this regard hypothesis testing procedures fall into two categories as

a) Single factor effect analysis

This is employed when there is only one source of variation (a single factor) in the
response variable.
b) Multifactor (factorial) effect analysis
This concerns analyzing the simultaneous effects of two or more factors (treatments) on
the response variable.

iv) The nature of the experimental subjects


In this regard, hypothesis testing procedures employed differ depending on whether the
experimental subjects are homogeneous or heterogeneous. Thus based on this, the testing
procedures are categorized as follows:

a) Completely randomized designs.


These types of tests are employed when subjects being studied are known to be
homogeneous in factors that affect the study variable. When subjects are considered to
be reasonably homogeneous, they are randomly assigned under the respective treatments
(categories), the reason why they are called randomized designs.

b) Partially randomized (non-randomized) designs

These types of experimental designs are employed when subjects vary, or are
heterogeneous, with respect to one or more factors that affect their response to the
variable under investigation. In these designs, additional categorization (a blocking
factor) is employed to account for the effects of the nuisance variable(s).

In the following chapters, hypothesis testing procedures involving quantitative and qualitative variables will be treated separately. The discussion begins with tests concerning quantitative variables; tests concerning qualitative variables are treated in the later chapters.

7.1 One sample hypothesis concerning mean

This procedure is employed when comparing the mean of a studied population with that of a previously studied or known population. The studied response variable is quantitative. The test is illustrated with the following example.

Consider that, from a long-term study, the total grass biomass production of communally grazed rangelands in pastoral areas of Ethiopia has been estimated at µ = 800 kg of dry matter (DM) per hectare, with a standard deviation of σ = 100 kg DM/hectare. A researcher assessed the biomass production of grass in communally grazed rangelands of Mursi-Bodi district (Jinka zone, SNNPRS). He randomly selected 50 sampling stations in communally grazed rangelands of Mursi-Bodi and estimated the mean grass biomass at 850 kg DM/hectare.

Test whether the DM grass productivity of the studied rangeland is significantly different from the national average grass productivity in communally grazed rangelands of pastoral areas. Use the 5 % level of significance.

In the above illustrative example, only one population, the communally grazed rangelands of Mursi-Bodi district, has been studied, and it is to be compared with a previously studied population, the communally grazed rangelands in pastoral areas of Ethiopia. This is a typical one-sample hypothesis, and it concerns a continuous response variable, grass productivity, which is measured on a continuous scale. Thus the comparison concerns the mean productivity of the studied rangeland versus that of the previously known rangelands.

The mean of the known (hypothesized) population is designated μO and that of the sampled population is designated μ1. Here μO = 800 kg/ha, but the value of μ1 is not known and is estimated from the sample mean $\bar{X} = 850$ kg/ha.

The question is to test whether μ1 − μO, which is approximated by $\bar{X} - \mu_O$, is significant or not.


There are five steps in any hypothesis testing procedure to be accomplished and these steps will
be illustrated with the present example

7.1.1 Steps in hypothesis testing

Step i) Stating the hypothesis

Statistical hypothesis testing begins with stating a hypothesis statement. This is why the
procedure is referred as hypothesis testing. A hypothesis is a concise statement about the
outcome of the comparison. Every comparison has two possible outcomes. i.e.,

 There is no difference among the populations compared


 There is a difference among the populations compared.
Therefore two hypotheses are stated for every comparison, one supporting the idea of no
difference and another one indicating the presence of a difference.

The hypothesis that supports the absence of a difference is called the null hypothesis (Ho), and the opposite one, which supports the presence of a difference, is called the alternative hypothesis (HA). In the present example, the null and alternative hypotheses are stated as follows:

Ho: The grass productivity in the communally grazed rangelands of Mursi-Bodi district is not
different from those of similar rangelands in pastoral areas of Ethiopia

HA: The grass productivity in the communally grazed rangelands of Mursi-Bodi district is
different from those of similar rangelands in pastoral areas of Ethiopia


Note that in a hypothesis statement, the variable of interest and the populations being compared
should be clearly indicated. Otherwise, if there is a mistake in the hypothesis statement, the
conclusion statement becomes wrong too.

In hypothesis testing it is the null hypothesis that is tested based on the available evidence, not the alternative hypothesis. i.e., if the observed difference between the population means being compared is convincingly large, the null hypothesis is rejected and the alternative is accepted. On the other hand, if the difference is not large enough, there is insufficient evidence to reject the null hypothesis and it is therefore accepted. For this reason the null hypothesis is called the testable hypothesis.

It is important to realize that a true null hypothesis will occasionally be rejected, which means an error has been committed in doing so. This type of error is called a type I, or α, error. On the other hand, if Ho is in fact false but is not rejected, another type of error is committed, known as a type II, or β, error.

The power of a statistical test is defined as 1 − β, i.e., power is the probability of rejecting the null hypothesis when it is in fact false and should be rejected. The two types of errors are inversely related: when one tries to minimize the probability of rejecting a true null hypothesis (type I, α, error), one maximizes the probability of accepting a false null hypothesis (type II, β, error). The only way to reduce both types of errors simultaneously is to increase the sample size (n); i.e., for a given α, larger samples result in statistical tests with greater power. The two types of errors are summarized in the following table.

Table 7.1 The two types of errors in hypothesis testing

Condition
Decision Ho is true Ho is false
If Ho is rejected Type I error (α) Correct decision
If Ho is not rejected Correct decision Type II error (β)

Step ii) Determining the test statistic value for the comparison
In every comparison, the observed difference is used to make the decision. However, the raw difference is not used directly for the comparison, because it has units and does not have a known probability distribution; hence it is not possible to determine a critical value for the comparison.

As a result, the raw difference is converted into a test statistic value, which is unitless and has a known probability distribution. In the case of a one-sample hypothesis, the difference $\bar{X} - \mu_O$ is converted into a unitless statistical quantity that has a known probability distribution as follows:

$$Z = \frac{\bar{X} - \mu_O}{\sigma_{\bar{X}}}$$ if σ is known, OR

$$Z = \frac{\bar{X} - \mu_O}{S_{\bar{X}}}$$ if σ is unknown and n ≥ 30 (large-sample-based test), OR

$$t = \frac{\bar{X} - \mu_O}{S_{\bar{X}}}$$ if σ is unknown and n < 30 (small-sample-based test)

As seen here, both Z and t are unitless and have known probability distributions. Therefore the test statistic for a one-sample hypothesis concerning a mean can be either a Z or a t value, and the test is referred to as a Z test or a t test depending on the conditions met.

In the present example, since σ is known (100 kg/ha), the test statistic appropriate for the test is Z, and it is calculated as follows.

Given: n = 50 samples, $\bar{X}$ = 850 kg/ha, μO = 800 kg/ha, σ = 100 kg/ha

$$Z_{cal} = \frac{\bar{X} - \mu_O}{\sigma_{\bar{X}}} = \frac{850 - 800}{100/\sqrt{50}} = \frac{50}{14.142} = 3.54$$

Here Zcal is the calculated value of Z.

In the above, the difference of 50 kg /ha in grass productivity between the two rangelands is
designated by a Z value of 3.54. Now asking whether a difference of 50 kg/ha is large or not
means asking if a Zcal (calculated Z) value of 3.54 is large or not to suggest the presence of
significant differences between the populations compared. To decide whether a Z value of 3.54
is large or not it is necessary to determine the critical Z value and this leads to the third step in
hypothesis testing.

Step iii) Determining the critical value of the test statistics

The critical value of a test statistics is a value considered to represent a small enough different
before deciding that the calculated test statistic value is large. Critical value is a criterion value
used to compare with the calculated value. Critical value of the test statistics is determined by
consulting the distribution of the test statistic, which in this case is the Z distribution.

Supposing that there is absolutely no difference between the sample mean and the hypothesized population mean, i.e., if $\bar{X} - \mu_O = 0$, the value of Zcal becomes zero. i.e., a Zcal value of

zero indicates perfect agreement, and the null hypothesis is accepted. In other words, a Zcal value that leads to rejection of the null hypothesis should be one that deviates significantly from zero, i.e., from perfect agreement. This means a Zcal value that lies at the extreme of either tail deviates significantly from the center (zero) and leads to rejection of the null hypothesis.

Statistically, a test statistic value that lies in the tail 5 % or 1 % area (depending on the choice) is considered highly deviant from perfect agreement and leads to rejection of the null hypothesis. Thus the tail 5 % or 1 % area of the test statistic distribution is considered the rejection region, since a calculated value that falls within this region leads to rejection of the null hypothesis. The rejection region is also called the rejection probability, critical probability, or level of significance. It is denoted by α, and it is the probability of committing a type I error that the researcher willingly accepts. The level of significance is chosen before data collection to avoid bias. In the present example, since the test asks whether the two rangelands differ or not in productivity, the rejection region is an α area split between both tails. Since α is given as 5 % (0.05), the rejection region is α/2 = 0.05/2 = 0.025 area at each tail, as shown in Figure 7.1 below. Accordingly, if the value of Zcal is either ≤ −Z0.025 or ≥ Z0.025, it is considered highly deviant from the center (perfect agreement) and the null hypothesis is rejected.

Here the critical value of the test statistic is the value that demarcates the acceptance and rejection regions. In the present example, the critical value of Z (Z critical) is −Z0.025 or Z0.025; from the Z table (Appendix Table 1), these values are −1.96 and 1.96, respectively. The next step is to compare the calculated test statistic value with the critical value to arrive at a decision.

[Plot of the Z distribution: acceptance region 1 − α = 0.95 between −Zα/2 = −1.96 and Zα/2 = 1.96, with rejection regions of α/2 = 0.025 at each tail; horizontal axis: standardized normal deviate (Z).]

Figure 7.1 Plot of Z distribution showing the rejection and acceptance regions for the test in the
above example.

Step iv) Decision making

At this step the calculated value of the test statistic is compared with the critical value to make a decision. For a one-sample hypothesis concerning a mean, reject the null hypothesis if:

Z test: Zcal ≥ Zα/2 when Zcal is positive (rejection region at the right tail), or

Zcal ≤ −Zα/2 when Zcal is negative (rejection region at the left tail)

t test: tcal ≥ tα/2(n−1) when tcal is positive (rejection region at the right tail), or

tcal ≤ −tα/2(n−1) when tcal is negative (rejection region at the left tail)

In the present example, Z cal = 3.54 is > Z critical = 1.96

Therefore the calculated Z value is in the rejection region at the right tail and hence the decision
is to reject Ho.

Step v) Conclusion

Finally a concluding statement is stated regarding the comparison outcome as per the decision
arrived at the 4th step. Hence the conclusion is:

The grass productivity in the communally grazed rangelands of Mursi-Bodi district is


different from those of similar rangelands in pastoral areas of Ethiopia

In the above test, if the test were done at the 1 % level of significance, the critical Z value would be Z0.005 = 2.58 from the Z table. The decision and the conclusion would remain the same. Note that the critical test statistic value at the 1 % α level is always larger than the critical value at the 5 % level.
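The five steps for this example can be condensed into a short script (a sketch assuming SciPy; the decision rule compares |Zcal| with the two-tailed critical value):

```python
from math import sqrt
from scipy.stats import norm

# One-sample Z test for the rangeland example (sigma known)
n, xbar, mu0, sigma, alpha = 50, 850.0, 800.0, 100.0, 0.05
z_cal = (xbar - mu0) / (sigma / sqrt(n))   # 50 / 14.142 = 3.54
z_crit = norm.ppf(1 - alpha / 2)           # 1.96 for a two-tailed 5 % test
print(f"Z_cal = {z_cal:.2f}, Z_crit = {z_crit:.2f}")
print("Reject Ho" if abs(z_cal) >= z_crit else "Do not reject Ho")
```

Setting `alpha = 0.01` gives the critical value 2.58 and, as noted above, leads to the same decision here.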

7.1.2 One tailed Vs two tailed tests

The example considered earlier is called a two-tailed test, in which the rejection region is the α area split between both tails. In this kind of test, the aim of the comparison is to find out whether or not the sampled population mean is different from the hypothesized population mean. For instance, in the earlier example, there is no clue or ground for asking whether the rangeland in Mursi-Bodi is greater or less in productivity than the rangelands in the pastoral areas of the rest of the country.

By contrast, sometimes there are grounds to ask if the sampled population mean is significantly
higher or lower than the hypothesized population mean. Such tests are directional tests and

involve a one-tailed test, in which the rejection region is the whole α area at one of the tails. These again fall into two categories: right-tailed tests and left-tailed tests. The aim of a right-tailed test is to find out whether the sampled population mean is significantly higher than the hypothesized population mean; thus the rejection region is the α area at the right tail. Conversely, the aim of a left-tailed test is to find out whether the sampled population mean is significantly lower than the hypothesized population mean; in such tests the rejection region is the α area at the left tail. Illustrations of both types of tests are given below.

7.1.2.1 Example of a right tailed test

The Ministry of Health specified that preserved food products containing 50 or more CFU of bacteria per gm of food are unsafe for consumption. During one of the regular checkups, a health officer randomly sampled 20 cans from the MelgeWondo meat factory and checked their microbial content. The average bacterial load in the sampled cans was 55 CFU/gm with a standard deviation of 8 CFU/gm. Test whether the food product produced by the factory is safe for consumption. Use α = 5 %.

In the above example, the interest of the investigation is to find out whether the food product produced by the factory indeed contains an unsafe level of microbes, i.e., more than the specified upper limit of microbial load, 50 CFU per gm of food. Thus the test is a right-tailed test, and the procedure is as follows.

i) Hypothesis
Ho: The bacteria content in the food product produced by Melgewondo meat factory does not
exceed the upper specified level, i.e., 50 CFU/gm of food.

HA: The bacteria content in the food product produced by Melgewondo meat factory exceeds the
upper specified level, i.e., 50 CFU/gm of food.

ii) Test statistic

Givens are: n = 20 cans, X̄ = 55 CFU/gm, μ0 = 50 CFU/gm, S = 8 CFU/gm

Since σ is unknown and the sample size is < 30, the test statistic is t:

tcal = (X̄ − μ0)/SX̄ = (55 − 50)/(8/√20) = 5/1.789 = 2.795

iii) t critical = tα(n-1) = t0.05(19) = 1.729

iv) Decision: Reject Ho if tcal ≥ tα (n-1)


Since 2.795> 1.729, reject Ho

v) Conclusion
The bacteria content in the food product produced by Melgewondo meat factory exceeds the
upper specified level, i.e., 50 CFU/gm of food. Therefore the food is unsafe for consumption.
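The arithmetic of this right tailed test can be checked with a few lines of code. The module does not prescribe a particular software package, so the following is only an illustrative Python sketch (the helper name `one_sample_t` is ours); it recomputes tcal from the summary values given in the example:

```python
import math

def one_sample_t(xbar, mu0, s, n):
    """t statistic for a one-sample test, computed from summary statistics."""
    se = s / math.sqrt(n)      # standard error of the mean
    return (xbar - mu0) / se

# Canned-meat example: n = 20, mean = 55, S = 8, hypothesized mean = 50 CFU/gm
t_cal = one_sample_t(xbar=55, mu0=50, s=8, n=20)
print(round(t_cal, 3))         # 2.795, which exceeds t0.05(19) = 1.729
```

Because 2.795 falls in the rejection region, the script reaches the same decision as the manual computation.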

7.1.2.2 Example of a left tailed test

The mean calcium concentration of a given marine (salt water) lake is 32 mmoles/liter of water.
Such a concentration is considered too high for the body fluids of the animals (fish,
invertebrates, etc.) that live in this kind of calcium-rich water. Therefore it is
believed that animals that live in such water should have a

mechanism of regulating their body calcium concentration so as to keep it below the
surrounding water.

To check this, a biologist took a random sample of 13 arthropods of a given species and
determined the calcium concentration of their body fluid. The following readings were
obtained: 28, 27, 29, 29, 30, 30, 31, 30, 33, 32, 30, 32 and 31 mmoles per liter of body fluid.

Do the arthropods regulate the calcium concentration of their body fluid? Test at 1% level of
significance

The interest in the above question is to find out if the arthropods regulate their body calcium
concentration. If they do, the body calcium concentration of the arthropods would be less than
the surrounding water concentration which is 32 mmoles/liter. The test is a left tailed test.

i) Hypothesis
Ho: The calcium concentration in the body of the arthropods is not lower than the
surrounding water concentration. i.e., 32 mmoles per liter.

HA: The calcium concentration in the body of the arthropods is lower than the surrounding
water concentration. i.e., 32 mmoles per liter.

ii) Test statistic

Givens are: n = 13 arthropods, X̄ = 30.15 mmoles/liter, S = 1.6756 mmoles/liter, μ0 = 32 mmoles/liter

tcal = (30.15 − 32)/(1.6756/√13) = −1.846/0.4647 = −3.973

iii) Critical value

−t0.01(n-1) = −t0.01(12) = −2.681

iv) Decision
Reject Ho if tcal ≤ −tα(n-1)

Since −3.973 < −2.681, reject Ho

v) Conclusion
The calcium concentration in the body of the arthropods is lower than the surrounding water
concentration, i.e., 32 mmoles per liter. Therefore the arthropods regulate their body calcium
concentration.
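The same left tailed test can be reproduced from the raw readings with Python's standard library (again only an illustrative sketch, not a prescribed tool; intermediate rounding may shift the last digits relative to hand computation):

```python
import math
import statistics

calcium = [28, 27, 29, 29, 30, 30, 31, 30, 33, 32, 30, 32, 31]  # mmoles/liter
mu0 = 32                          # surrounding water concentration

n = len(calcium)
xbar = statistics.mean(calcium)   # sample mean
s = statistics.stdev(calcium)     # sample standard deviation (n - 1 divisor)
t_cal = (xbar - mu0) / (s / math.sqrt(n))

print(round(t_cal, 3))            # about -3.97, beyond -t0.01(12) = -2.681
```

Since the computed t falls below the critical value, Ho is rejected, matching the manual conclusion.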

Exercise questions

1. From long-term data collection, the mean birth weight of male calves of Sidama cattle is
   known to be µ = 20 kg with σ = 5 kg. Wondogenet is one of the woredas in Sidama zone
   adjoining Arsi. Since the cattle in Wondogenet interbreed with Arsi cattle, a researcher
   suspected that the cattle of Wondogenet may differ in some characteristics from the rest of
   the Sidama cattle. Accordingly, he wanted to test whether the birth weight of calves in
   Wondogenet differs from that of Sidama cattle. For the comparison, the researcher randomly
   sampled 25 male calves, and the mean birth weight of the sampled calves was 22 kg. Does
   the available evidence support that the mean birth weight of male calves in Wondogenet
   differs from the rest of the Sidama cattle? Test at 5% level of significance.

2. The Akaki metal work factory produces pipes for domestic water supply that have a diameter
   of exactly 25 mm. Wider or narrower pipes don’t fit properly and cause leakage. Thus the
   factory regularly checks whether the width of the pipes is exactly 25 mm and adjusts the
   machine if the width significantly deviates from 25 mm. Accordingly, during one of the
   regular checkups, a random sample of 10 pipes was selected and the mean diameter of the
   pipes was found to be 25.02 mm with S = 0.024 mm. Does the machine need to be

readjusted? Test whether the mean width of the pipes differs from 25 mm. Use 5% α for
the test.

3. A bakery at Hawassa signed a contract agreement to deliver bread to the college cafeteria.
   The agreement was that the bakery delivers bread weighing 150 gm per loaf and the
   cafeteria pays 2 Birr per loaf. The student council regularly checks whether the average
   weight of bread delivered by the bakery is not less than 150 gm. Accordingly, during one of
   the regular checkups, a student assigned for this purpose randomly sampled 50 loaves of
   bread and measured their weights. The average weight of the sampled bread was 146.0 gm
   with S = 10 gm.

8 Two sample hypothesis concerning mean

A two sample hypothesis concerning means involves comparing the means of two simultaneously
sampled populations. These tests fall into two categories: independent sample and paired
sample hypothesis tests. In the first case, the two samples being compared are drawn
from two independent (unrelated) populations. In a paired sample hypothesis test, paired
data are generated from the same or related individuals. In both cases, the test statistic is t.
The procedures will be illustrated separately in the following sections.

8.1 Independent sample hypothesis test concerning mean

Consider the following example:

The rangelands of Mursi-Bodi district (Jinka zone, SNNPRS) are located in the altitude range of
500 – 1200 masl. The rangelands can be categorized into two altitudinal zones, namely
lower altitude areas (500 - 850 masl) and higher altitude zones (> 850 masl).

In a given study, a researcher assessed whether the range productivity in the two altitudinal
areas differs. Accordingly, the researcher randomly selected ten sampling sites in each of the
two altitude areas and estimated the total grass biomass in each of the 20 sampling locations.
The table below gives the total grass productivity in kg of DM weight/hectare estimated
from each sampling location. Does the grass productivity in the two altitudinal areas
significantly differ? Test at 5% α.


Table 8.1 Total DM weight of grass (kg/hectare) estimated from the 20 sampling locations.

Lower altitude (500-850 masl)          Higher altitude (>850 masl)
Sampling location Kg DM/hectare Sampling location Kg DM/hectare
1 450 1 590
2 500 2 630
3 525 3 670
4 550 4 720
5 590 5 770
6 630 6 810
7 660 7 840
8 700 8 870
9 720 9 940
10 750 10 980
∑X1 = 6075                             ∑X2 = 7820
X̄1 = 607.5                            X̄2 = 782.0
S²1 = 10173.61                         S²2 = 16951.11
S1 = 100.86                            S2 = 130.20
SX̄1 = 31.8947                         SX̄2 = 41.1729

As seen in the example, the 10 sampling locations considered from the lower altitude areas
are totally different from the 10 sampling locations considered from the higher altitude areas.
This is therefore an independent sample hypothesis test. It is also a two tailed test, as there is
no ground to assume that one of the two altitudinal zones is higher or lower in grass
productivity than the other. As usual, the testing procedure begins with stating the hypothesis
as follows.

i) Hypothesis
Ho: There is no difference in grass productivity between rangelands located at the lower and
higher altitude areas of Mursi-Bodi district.

HA: There is a difference in grass productivity between rangelands located at the lower and
higher altitude areas of Mursi-Bodi district.


ii) Test statistic

As mentioned earlier, the test statistic is t and it is calculated as follows:

tcal = (X̄2 − X̄1)/SE

The test is simply an extension of the one sample t test procedure. X̄1 and X̄2
are the means of the two groups of data, and the SE in the denominator is a standard error
built from the standard errors of the two group means. This standard error is called the
standard error for the difference between two means.

However, as SE values are derived from variances and are not directly additive, they are first
converted into variances of means and the latter are added.

SX̄1 = √(S²1/n1)  and  SX̄2 = √(S²2/n2)

Hence SE = √(SX̄1² + SX̄2²) = √(S²1/n1 + S²2/n2)

In the above formula, it is recommended to use the average of the two sample variances in place
of the respective sample variances. The average of the two sample variances is referred to as
the pooled variance (S²P) and it is calculated as follows:

S²P = (S²1 + S²2)/2,  if n1 = n2

Yosef Tekle-Giorgis, HUCA, 2015 Page 78


Statistical Methods for Scientific 2015

Research
If n1 and n2 are different, the two variances are first converted into their respective sum of
squares values for averaging, i.e., S²1 = SS1/(n1 − 1), hence SS1 = S²1(n1 − 1);

similarly S²2 = SS2/(n2 − 1), hence SS2 = S²2(n2 − 1).

Thus S²P = [S²1(n1 − 1) + S²2(n2 − 1)] / [(n1 − 1) + (n2 − 1)],  if n1 ≠ n2

Since in the present case n1 = n2, we average the two variances directly:

S²P = (10173.61 + 16951.11)/2 = 13562.36

Then the standard error used as the denominator is computed as

SE = √(S²P/n1 + S²P/n2) = √(13562.36/10 + 13562.36/10) = 52.0814

Accordingly, tcal = (X̄2 − X̄1)/SE = (782 − 607.5)/52.0814 = 3.351

iii) t critical = tα/2(n1 - 1 + n2 - 1)

α is given as 0.05. Also note that the degrees of freedom is the sum of the two within-group
degrees of freedom, i.e., (n1 − 1) and (n2 − 1).

t critical = t0.05/2(9 + 9) = t0.025(18) = 2.101

iv) Decision:

Reject Ho if tcal ≥ tα/2(n1 -1 +n2 -1) or

tcal ≤ −tα/2(n1 - 1 + n2 - 1)

Here 3.351 > 2.101, hence reject Ho

v) Conclusion:
There is a difference in grass productivity between rangelands located at the lower
and higher altitude areas of Mursi-Bodi district.
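The pooled-variance computation above lends itself to a short script. The following is a hedged Python sketch (the function name `pooled_two_sample_t` is ours, and the general unequal-n pooling formula is used, which reduces to the simple average when n1 = n2):

```python
import math

lower  = [450, 500, 525, 550, 590, 630, 660, 700, 720, 750]   # kg DM/hectare
higher = [590, 630, 670, 720, 770, 810, 840, 870, 940, 980]

def pooled_two_sample_t(x1, x2):
    """Independent two-sample t statistic using a pooled variance."""
    n1, n2 = len(x1), len(x2)
    m1, m2 = sum(x1) / n1, sum(x2) / n2
    ss1 = sum((x - m1) ** 2 for x in x1)        # within-group sums of squares
    ss2 = sum((x - m2) ** 2 for x in x2)
    sp2 = (ss1 + ss2) / ((n1 - 1) + (n2 - 1))   # pooled variance S2P
    se = math.sqrt(sp2 / n1 + sp2 / n2)         # SE for the mean difference
    return (m2 - m1) / se

t_cal = pooled_two_sample_t(lower, higher)
print(round(t_cal, 3))   # 3.351, which exceeds t0.025(18) = 2.101
```

The script reproduces tcal = 3.351 and hence the same decision to reject Ho.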

8.2 Paired sample test


In this case the two data sets are generated from the same or related individuals; thus the two
data sets are not independent. Consider the following example:
Urine iodine concentration was tested in 15 volunteer individuals to check whether iodine
concentration in the body significantly improved after iodine treatment. Accordingly, the
urine iodine concentration of each individual was measured (in mg iodine per liter of urine)
before and after the individual was treated with an iodine-containing capsule. The data are
shown in the following table. Test whether iodine concentration in the body of the tested
individuals significantly improved after iodine treatment. Use 1% level of significance for the
test.

Table 8.2 Urine iodine concentration before and after iodine treatment

Urine iodine concentration (mg/lit)

Subject   Before treatment   After treatment   di      di²
1         37.8               50.2              12.4    153.8
2         47.4               65.0              17.6    309.8
3         34.4               45.8              11.4    130.0
4         44.6               62.6              18.0    324.0
5         31.3               51.0              19.7    388.1
6         13.6               25.7              12.1    146.4
7         25.3               38.8              13.5    182.3
8         37.7               55.4              17.7    313.3
9         30.8               50.2              19.4    376.4
10        28.3               42.2              13.9    193.2
11        51.1               70.0              18.9    357.2
12        24.9               50.0              25.1    630.0
13        22.8               46.5              23.7    561.7
14        66.3               98.5              32.2    1037.0
15        64.7               100.6             35.9    1289.0
Sum                                            291.5   6392.0
d̄ = 19.4333
Sd̄ = 1.86041

In the above example, the urine iodine concentration records were taken from the same
individuals before and after iodine treatment. Thus the two data sets are paired (related)
data. The testing procedure is as follows:

i) Hypothesis
Ho: Urine iodine concentration did not improve after iodine treatment.

HA: Urine iodine concentration improved after iodine treatment.

ii) Test statistic

The test statistic is t and it is computed as follows:

tcal = d̄/Sd̄, where

d̄ = the mean paired difference = ∑di/n
di = the paired difference obtained by subtracting each before value from the corresponding
     after value, as shown in column 4 of Table 8.2
d̄ = 291.5/15 = 19.4333 mg/liter
Sd̄ = the standard error of the mean paired difference, calculated as

Sd̄ = √(S²d/n)

S²d = [∑di² − (∑di)²/n]/(n − 1) = [6392 − (291.5)²/15]/14 = 51.942

Sd̄ = √(51.942/15) = 1.861

tcal = 19.4333/1.861 = 10.442

iii) t critical
As seen from the hypothesis statement, this is a right tailed test and the rejection region is the
whole α area at the right tail. Since α is 0.01, the critical value is

t critical = t0.01(n-1) = t0.01(14) = 2.624

Note that the degrees of freedom is n-1, not (n1-1 + n2-1) as in the case of the independent
sample test, because only n subjects were considered for the study.

iv) Decision:

Reject Ho if tcal ≥ tα(n-1)

10.442 > 2.624, hence reject Ho

v) Conclusion
Urine iodine concentration of the tested individuals has improved after iodine treatment
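The paired-difference arithmetic can be verified with a small Python sketch (illustrative only; variable names are ours). It forms the differences, then the mean difference and its standard error exactly as in the formulas above:

```python
import math

before = [37.8, 47.4, 34.4, 44.6, 31.3, 13.6, 25.3, 37.7,
          30.8, 28.3, 51.1, 24.9, 22.8, 66.3, 64.7]
after  = [50.2, 65.0, 45.8, 62.6, 51.0, 25.7, 38.8, 55.4,
          50.2, 42.2, 70.0, 50.0, 46.5, 98.5, 100.6]

d = [a - b for a, b in zip(after, before)]    # paired differences di
n = len(d)
dbar = sum(d) / n                             # mean paired difference
sd2 = sum((x - dbar) ** 2 for x in d) / (n - 1)
se = math.sqrt(sd2 / n)                       # SE of the mean paired difference
t_cal = dbar / se

print(round(dbar, 4), round(t_cal, 2))        # 19.4333 and about 10.45
```

The tiny discrepancy from the hand value 10.442 comes only from rounding Sd̄ to 1.861 in the manual computation.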

Exercise (second assignment Q#2 &3)

1. A company produced a new drug for blood clotting and advertised that the new drug stops
   bleeding faster than the previously used drug. To test this claim, 20 similar rats (same
   genotype, sex, age, etc.) were used. The 20 rats were randomly divided into two groups of
   10 rats; one group was treated with the new drug and the other group of ten rats was
   treated with the old drug. For the test, each rat was bled by cutting the tip of the tail
   and was immediately injected with the drug. The experimenter then recorded the time
   elapsed (in minutes) between drug administration and complete stoppage of bleeding. The
   data were as follows. Test at 5% level of significance whether the blood clotting time of
   the new drug is shorter than that of the old drug.

Blood clotting time


Rat ID New drug Rat ID Old drug
1 5 1 6
2 7 2 9
3 6 3 6
4 6 4 5
5 5 5 7
6 4 6 5
7 7 7 8
8 6 8 7
9 3 9 5
10 4 10 6

2. The Ministry of Agriculture has a long experience of importing DAP fertilizer in bulk
   from Europe (Germany). However, since Kenya started to produce DAP, there is a
   growing interest in purchasing it from Kenya. On the other hand, the ministry wanted to
   check whether DAP produced by Kenya is as productive as the European DAP.
   Accordingly, on one occasion 20 plots of similar land were prepared and 10 plots were
   randomly assigned to each of the two fertilizer applications. Corn was sown on the
   twenty plots; 10 of the plots were treated with Kenyan DAP and the remaining 10 with
   European DAP. All 20 plots were managed in a similar manner, and the yield of corn
   harvested from each plot was separately recorded as shown in the table below.

Test whether the yield under the Kenyan DAP application was comparable with that obtained
under the European DAP application. Use 5% α.

Corn yield quintal/hectare


Plot no. European DAP Plot no. Kenyan DAP
1 35.0 1 32.0
2 36.2 2 34.2
3 39.4 3 37.4
4 37.6 4 35.6
5 38.5 5 33.5
6 34.3 6 36.3
7 36.1 7 34.1
8 39.8 8 37.8
9 37.2 9 34.2
10 35.5 10 32.5
∑X 369.6 ∑X 347.6
Mean1 36.96 Mean2 34.76
S21 3.4693 S22 3.8516
S1 1.8626 S2 1.9625

3. Concentrations of nitrogen oxide and hydrocarbons (μg/m³) were determined in a certain
   urban area. Test the hypothesis that both classes of air pollutants were present in the same
   concentration. Use 5% level of significance.

Day Nitrogen oxide Hydrocarbons
1 104 108
2 116 118
3 84 89
4 77 71
5 61 66
6 84 83
7 81 88
8 72 76
9 61 68
10 97 96
11 84 81

Chapter 9 Multi-sample hypothesis concerning mean
A multi-sample hypothesis concerning means involves comparing the means of three or more
populations. Obviously the response variable should be continuous. The simplest case involves
analyzing the effect of a single factor on the response variable. The test statistic for a
multi-sample hypothesis is F. Theoretically it seems feasible to attempt a multi-sample
hypothesis test by applying two sample tests to all possible pairs of samples. For example, one
might proceed to test the null hypothesis Ho: μ1 = μ2 = μ3 by testing each of the following
hypotheses with the two sample t test: Ho: μ1 = μ2, Ho: μ1 = μ3, and Ho: μ2 = μ3. But such a
procedure of employing a series of two-sample tests to address a multi-sample hypothesis is
invalid, because α, the probability of committing a type I error, increases unknowingly when
pairwise t tests are employed instead of comparing all the groups in a single run.

For instance, if the level of significance is set to α = 0.05 and three populations are tested
for significant differences, the type I error (i.e., the probability of wrongly rejecting a
true null hypothesis) increases to 0.14. If four populations are compared at α = 0.05, the type
I error increases to 0.26, since this involves six pairwise t tests when two means are taken at
a time. In general, if multiple t tests are employed to test for differences among more than
two means, the probability of committing a type I error becomes 1 − (1 − α)^a, where α is the
level of significance used for each comparison and ‘a’ is the number of pairwise tests
completed. For example, if three groups are compared two at a time, three pairwise comparisons
are completed with a t test. If the α level is set at 0.05 for each pairwise test, then the type
I error rate is 1 − (1 − 0.05)³ = 0.14. If four populations are compared at the same α level,
six pairs of means are tested, and the type I error rate becomes 1 − (1 − 0.05)⁶ = 0.26, and
so on.
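The inflation formula above is easy to evaluate directly. Here is a minimal Python sketch (the function name `familywise_error` is our own label for the quantity 1 − (1 − α)^a):

```python
def familywise_error(alpha, a):
    """Probability of at least one type I error across 'a' pairwise tests."""
    return 1 - (1 - alpha) ** a

# Three groups -> three pairwise t tests; four groups -> six pairwise t tests
print(round(familywise_error(0.05, 3), 2))   # 0.14
print(round(familywise_error(0.05, 6), 2))   # 0.26
```

The outputs match the 0.14 and 0.26 figures quoted in the text, showing how quickly the error rate grows with the number of comparisons.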

9.1 Illustration of Single factor analysis of variance test


A multi-sample hypothesis test will be demonstrated with the simplest type of example, which
involves testing the effect of a single treatment factor on the response variable while
comparing three populations.

Consider that the effect of feed on the milk yield of Borana cows was evaluated under three
types of feed. The three experimental feeds were the following:
Feed 1 Grazing alone
Feed 2 Grazing + hay supplementation
Feed 3 Grazing + concentrate supplementation

For the experiment, 15 Borana cows were considered. The cows were similar in all factors that
affect milk yield: similar age (2-3 years), similar health condition, etc. Accordingly, the
15 cows were randomly assigned to one of the three feeds, so each feeding group consisted of
five cows. The cows were then fed the assigned feeds from delivery until drying, and milk
yield data were collected to estimate the milk yield per lactation of each cow. The data are
shown in the table below. Test whether milk yield/lactation of the studied cows differed under
the three experimental feeds. Use 5% level of significance.

Table 9.1 Milk yield/lactation (liters/cow) data of the investigated Borana cows under the
three feeds
Feed 1 Feed 2 Feed 3
430 440 475
440 455 493
450 462 502
458 471 515
472 485 527
n1 = 5 n2 = 5 n3 = 5
Mean1 = 450 Mean2 = 462.6 Mean3 = 502.4
S²1 = 262 S²2 = 285.3 S²3 = 400.8
∑X1 = 2250 ∑X2 = 2313 ∑X3 = 2512
Total number of cows (N) = 15
Overall grand total sum of milk yield ∑∑Xi = 7075
Overall grand mean milk yield = 471.667

The hypothesis testing procedure for the above multisample hypothesis will be as follows

i) Hypothesis
Ho: The milk yield of Borana cows is not different under the three experimental feeds
HA: The milk yield of Borana cows is different under the three experimental feeds

ii) Test statistic

The test statistic for a multisample hypothesis test is F, so the procedure is called an F test.
As in the case of the t test entertained earlier, F is also computed based on group mean
differences, i.e., based on the differences of the three group means.

Recall that differences among the three group means are measured by the variance of means
(S²X̄), which is computed as follows:

S²X̄ = ∑(X̄i − X̄)²/(a − 1) = SSmeans/df among group (‘a’ is the number of groups tested)

= {(450 − 471.667)² + (462.6 − 471.667)² + (502.4 − 471.667)²}/(3 − 1) = 748.0933 liter²/lactation

The variance of means expresses differences among the three group means. Based on the
variance of means, a variance known as the variance among groups is computed that quantifies
the differences in milk yield among the three groups of cows, and it is this variance which is
converted into the F statistic.

The variance among groups is symbolized as S²among group and it is computed as

S²among group = n·S²X̄ = n∑(X̄i − X̄)²/(a − 1) = SSamong group/df among group

‘n’ in the above formula is the number of replications per group.

Accordingly, S²among group = 5 × 748.0933 = 3740.467 liter²/lactation

This implies that the three groups of cows, subjected to the three different types of feed,
differed by 3740.467 liter²/lactation when their difference is expressed as a variance. If this
variation is considered to be large, then the null hypothesis, which suggests no yield
difference, will be rejected. To judge whether the computed variance among groups is large or
not, it is necessary to analyze the sources of variation in yield among the three groups of cows.

The sources of variation in response among subjects exposed to different treatments can be
described as follows:

i) Treatment-caused variation (symbolized as S²treatment)

One of the reasons why the cows differed in yield is obviously the difference in feed
treatments, i.e., because the three groups were treated with three different types of feed.
This is the explained portion of the total variability.

ii) Variation in response within the respective treatment group (S²within group)

Not only did the three groups of cows differ in yield from one another, but the cows within
each treatment group also differed in yield. For example, as seen from Table 9.1, the variances
in yield among the five cows under the first, second and third types of feed are 262.0, 285.3,
and 400.8 liter²/lactation, respectively. This is referred to as within-treatment-group
variability, and the reason for this source of variation is subtle and not straightforward to
explain. Customarily this source of variation is referred to as the unexplained source of
variation, since its exact cause is not easily explained.

As explained earlier, the 15 cows used for the experiment were similar, and the cows in the
same feeding group were exposed to the same treatment. Then what caused them to differ from
one another? The exact reason for the within-treatment-group variation in response is not clear.

But one thing is evident: however similar experimental subjects are, they somehow differ in
response, and this difference is therefore the minimum expected, or unavoidable, source of
variation. This source of variation is also commonly referred to as error variance (S²error)
because it occurs as random error.

In summary, the two sources of variation that compose the variance among groups can be
expressed as follows:

S²among group = S²treatment + S²within group

Since the variance within groups (S²within group) is the minimum expected variation, it is
used to judge whether the variance among groups (S²among group), which contains the treatment
effect, is large or not. The comparison is done as a variance ratio:

F = S²among group / S²within group

This ratio gives the F statistic, which is why F is known as a variance ratio statistic.
Since S²among group contains both S²treatment and S²within group, dividing the variance among
groups by the variance within groups singles out the treatment effect, i.e.,

F = S²among group/S²within group = (S²treatment + S²within group)/S²within group

so any excess of F above 1 reflects the treatment variance.

Accordingly, the F test estimates the treatment effect reliably as long as the denominator
variance contains only the minimum expected variation, i.e., the random error variance.

To complete the computation of Fcal, the variance within groups should be quantified. In this
regard, the variance within groups is the average of the three within-group variances, i.e.,

S²within group = (S²1 + S²2 + S²3)/3, since n1 = n2 = n3

But the general formula when the replications are unequal is:

S²within group = [S²1(n1 − 1) + S²2(n2 − 1) + S²3(n3 − 1)] / [(n1 − 1) + (n2 − 1) + (n3 − 1)]

= (SS1 + SS2 + SS3)/(df1 + df2 + df3) = SSwithin group/dfwithin group

dfwithin group can be expressed as (n1 + n2 + n3) − 3.

Since n1 + n2 + n3 = N (total number of subjects) and 3 represents the number of groups (a),

dfwithin group = N − a, i.e., total number of subjects − number of groups compared.

In the present example, S²within group = (262.0 + 285.3 + 400.8)/3 = 316 liter²/lactation


Fcal = S²among group/S²within group = 3740.467/316 = 11.84

After computing Fcal, the next step is to find out the F critical value

iii) Determining F critical

To determine the F critical value for the test, the F distribution is consulted. F is a
unitless statistical quantity with a known distribution. As explained earlier, F is a variance
ratio statistic, and an F distribution can be created by repeatedly drawing pairs of samples
from a given population and dividing the variance of one sample by the variance of the other.
Plotting the F values against their respective frequencies gives a positively skewed
distribution that begins at zero and peaks around one (see Figure 9.1).


Figure 9.1 The F distribution constructed based on 2 and 12 degrees of freedom values.

Assume that the treatment effect is zero, i.e., that the feed differences did not have any
effect on yield. This means that the variance among groups would equal the variance within
groups, and the calculated F value would be 1. In other words, Fcal = 1 represents a total
absence of treatment effect.

In practice there is some treatment effect, and the variance among groups is expected to be
larger than the variance within groups; hence the calculated F value is expected to be larger
than one. But how large must a calculated F be to indicate a significant treatment effect?
Evidently a calculated F which is large and falls at the extreme right tail indicates a
significant treatment effect. Hence the F test is a right tailed test and the rejection region
is an α area at the right tail, in which α could be 0.05 or 0.01 (see Figure 9.1).
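The construction of the F distribution described above (repeated pairs of samples, variance ratios) can be imitated with a short simulation. This is only an illustrative Python sketch under our own choices of sample sizes: samples of 3 and 13 observations give numerator and denominator degrees of freedom of 2 and 12, matching Figure 9.1:

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is reproducible

def simulated_f(df_num, df_den, reps=10000):
    """Ratios of two sample variances drawn from one normal population."""
    ratios = []
    for _ in range(reps):
        s1 = [random.gauss(0, 1) for _ in range(df_num + 1)]
        s2 = [random.gauss(0, 1) for _ in range(df_den + 1)]
        ratios.append(statistics.variance(s1) / statistics.variance(s2))
    return ratios

f_vals = simulated_f(2, 12)
# Most ratios cluster near 1; roughly 5% should exceed F0.05(2, 12) = 3.89
tail = sum(v > 3.89 for v in f_vals) / len(f_vals)
print(round(tail, 3))
```

The simulated right-tail proportion beyond 3.89 comes out close to 0.05, which is exactly why 3.89 is tabulated as the 5% critical value for 2 and 12 degrees of freedom.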


The critical value of F is found in the F0.05 and F0.01 tables, which give values of F that
exclude a 0.05 and a 0.01 area at the right tail of different F curves (see Appendix Tables 3
and 4).

Since there are different shapes of F curves depending on the sample sizes, degrees of freedom
values are used to refer to a particular F curve. Two degrees of freedom are used, namely
dfamong group and dfwithin group. Accordingly, the critical value of F for a single factor
analysis is Fα (dfamong group, dfwithin group). In the present example, since α is 0.05,

F critical = F0.05 (a-1, N-a) = F0.05 (2, 12) = 3.89

iv) Decision

Reject Ho if Fcal ≥ Fα (a-1, N-a).


Since 11.84 > 3.89, Reject Ho

v) Conclusion
The milk yield of Borana cows is different under the three types of feeds.
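The whole definitional route (variance of means, variance among groups, average within-group variance, then F) can be retraced with a short Python sketch using only the standard library (illustrative only; variable names are ours):

```python
import statistics

feed1 = [430, 440, 450, 458, 472]
feed2 = [440, 455, 462, 471, 485]
feed3 = [475, 493, 502, 515, 527]
groups = [feed1, feed2, feed3]

a = len(groups)                      # number of groups
n = len(feed1)                       # replications per group (equal here)
means = [statistics.mean(g) for g in groups]
grand_mean = statistics.mean(means)  # valid because group sizes are equal

# Variance among groups = n * variance of the group means
s2_among = n * sum((m - grand_mean) ** 2 for m in means) / (a - 1)
# Variance within groups = average of the three within-group variances
s2_within = statistics.mean(statistics.variance(g) for g in groups)

F = s2_among / s2_within
print(round(s2_among, 3), round(s2_within, 1), round(F, 2))
# 3740.467, 316.0 and F = 11.84, which exceeds F0.05(2, 12) = 3.89
```

The script reproduces S²among group = 3740.467, S²within group = 316, and Fcal = 11.84 from the worked example.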

The F test as described above is also called an Analysis of Variance (ANOVA) test, because
variances are analyzed to test differences among group means. The test must involve variance
analysis because, when comparing more than two groups, differences among groups can only be
described by variances. This particular F test is called a single factor ANOVA test because the
effect of only one factor (feed type) on the response variable (milk yield) has been analyzed.
As will be discussed in later chapters, this procedure is extended to analyzing the effects of

more than one treatment factor on the response variable, which is multifactor Analysis of
Variance.

In the present example, experimental subjects (the cows) were assigned randomly to the
treatment groups. Accordingly, the experimental design is called a Completely Randomized
Design (CRD). This design is employed when experimental subjects are similar in the factors
that affect the response variable. Otherwise, if subjects are known to vary in one way or
another in factors that affect the response variable, they are not randomly assigned directly
to the different treatment categories. Rather, experimental designs that separately estimate
the unwanted source of variation are employed to remove the effects of the nuisance variable.
The consequences of employing randomized designs when dealing with heterogeneous subjects will
be discussed in detail in later chapters, in conjunction with non-randomized designs.

9.2 Simpler computational procedure for Single factor ANOVA test

The ANOVA computational procedure illustrated earlier is rather tedious, although it is a good
starting point for understanding the underlying theory of the F test. There are simpler
computational formulae for the variance among groups and the variance within groups, derived
from the original formulae described earlier. Although simpler, these do not show the roots of
the F test, so it is wise to understand the original formulae first and then adopt the
derivative formulae for the sake of simplicity. Accordingly, the standard ANOVA computational
procedure will be illustrated using the same example.

Step 1) Calculate the Total Sum of Squares

TSS = ∑Xi² − (∑Xi)²/N = (430² + 440² + 450² + …… + 527²) − (7075)²/15

= 3348315 − 3337041.67 = 11273.3333

Total degrees of freedom = N − 1 = 15 − 1 = 14

Step 2) Among group SS

Earlier we have seen that S²among group = n·S²X̄ = n∑(X̄i − X̄)²/(a − 1) = SSamong group/df among group

The above formula for calculating SSamong group is tedious. Instead, this quantity can be
calculated more conveniently as follows:

SSamong group = n∑(X̄i − X̄)² = ∑[(Total in each group)²/(replication in each group)] − (Grand total)²/N

= (2250²/5 + 2313²/5 + 2512²/5) − (7075)²/15 = 3344522.6 − 3337041.67 = 7480.9333

DF among group = a − 1 = 3 − 1 = 2

Step 3) SS within group

Earlier, S²within group was calculated as

S²within group = [S²1(n1 − 1) + S²2(n2 − 1) + S²3(n3 − 1)] / [(n1 − 1) + (n2 − 1) + (n3 − 1)]
= SSwithin group/dfwithin group

Here also the procedure of calculating the SS within group is tedious and it can be computed
from the following relationship.

Total SS = SSamong group + SSwithin group, Hence

SSwithin group = Total SS – SSamong group = 11273.3333 – 7480.9333 = 3792.4

Note that the purpose of computing total SS is to compute the within group SS by
difference.

DF within group = Total DF – DF among group = 14 – 2 = 12

This is also calculated as DF within group = N – a = 15 – 3 = 12

Also DF within group = a(n – 1) = 3(5 – 1) = 12

Now all the necessary SS and DF values required to compute S²among group and S²within
group are secured. Thus computing Fcal and the rest of the steps to complete the test can be
done fairly easily. From here on it is customary to summarize the computations in a table
known as an ANOVA table, as illustrated below.

Single factor ANOVA Summary table

Source of variation     SS           DF           MS                       Fcal                      Fcritical            Decision
Total                   11273.3333   N – 1 = 14   –
Among group             7480.9333    a – 1 = 2    7480.9333/2 = 3740.467   3740.467/316.03 = 11.84   F0.05(2,12) = 3.89   Reject Ho
Within group (error)    3792.4       N – a = 12   3792.4/12 = 316.03

Conclusion: there is a significant difference in the milk yield of Borana cows under the three feeds.

Note the following:

 MS in the above summary table is the mean square and, as described earlier, it is another
name for variance. In analysis of variance it is customary to refer to the variance as the
Mean Square.
 The total variance is not computed as it is not tested. As said earlier, the purpose of
computing the total SS is to obtain the error SS by difference.
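As a software based cross-check, the shortcut computations above can be reproduced with a short Python sketch. It uses only the summary quantities quoted in the text (the uncorrected sum of squares, the group totals, n and N); the variable names are illustrative.

```python
# Single factor ANOVA from the summary quantities of the milk yield example.
# Values below are taken from the text: ΣXi² over all observations,
# the three group totals, n replicates per group and N total observations.
uncorrected_ss = 3348315
group_totals = [2250, 2313, 2512]
n, N = 5, 15

correction = sum(group_totals) ** 2 / N          # (grand total)²/N
total_ss = uncorrected_ss - correction
among_ss = sum(t ** 2 / n for t in group_totals) - correction
within_ss = total_ss - among_ss                  # obtained by difference

ms_among = among_ss / (len(group_totals) - 1)    # DF among = a - 1 = 2
ms_within = within_ss / (N - len(group_totals))  # DF within = N - a = 12
f_cal = ms_among / ms_within                     # compare with F0.05(2,12) = 3.89
```

Running this gives the same SS, MS and Fcal values as the table, confirming the hand computation.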

9.3 Multiple comparison tests

As shown above, the F test declared the presence of significant differences in milk yield among
the three groups of Borana cows subjected to the three types of feed treatments. However this
does not necessarily mean that all the three groups are significantly different from each other.
Therefore after the F test declared significant differences, a multiple comparison test should be
employed to find out which group differed from which other. Multiple comparison tests are also
known by other names like group separation tests, Post Hoc test (meaning post ANOVA test)
etc.

There are different multiple comparison tests, and some are designed to perform better under
certain circumstances than others. For example, the Tukey (Honestly Significant Difference)
test, the SNK (Student-Newman-Keuls) test, Duncan's multiple range test and Dunnett's test
are some of the tests employed, just to mention a few. All of the tests follow an analogous
procedure: they take group means and compute pairwise differences. They basically differ from
one another in the standard error value they use, and hence in the kind of test statistic used.
The procedure of the Tukey test is illustrated below.

Tukey multiple comparison test procedure

i) Arrange the group means in ascending order as follows

Mean 1 Mean 2 Mean 3


450.0 462.6 502.4

ii) Make pairwise group comparisons starting with the largest difference, i.e., the first
comparison is Mean 3 vs Mean 1, then Mean 3 vs Mean 2, and then Mean 2 vs Mean 1.

iii) Compute the test statistic for the comparison. The test statistic for the Tukey test is q
and it is computed as:

q = (X̄B – X̄A)/SE, where X̄B and X̄A are any two means

SE for q is √(MS error/n), where MS error is the error MS from the ANOVA table
and n is the number of replications per group.

For the present test, SE for q = √(316.0/5) = 7.95

iv) Determine the q critical value from the q table (Appendix Table 5)


q critical = qα (a, df error), where ‘a’ is the number of groups compared and df error
is the within group degrees of freedom. For the present test q critical is:
q critical = q0.05 (3, 12) = 3.773

v) Decision
Conclude X̄B > X̄A if q cal ≥ q critical; otherwise the two means are not different.

These five steps are illustrated in the following summary table

Comparison   Difference             SE     q cal               q critical   Decision
X̄3 vs X̄1     502.4 – 450.0 = 52.4   7.95   52.4/7.95 = 6.591   3.773        X̄3 > X̄1
X̄3 vs X̄2     502.4 – 462.6 = 39.8   7.95   39.8/7.95 = 5.006   3.773        X̄3 > X̄2
X̄2 vs X̄1     462.6 – 450.0 = 12.6   7.95   12.6/7.95 = 1.585   3.773        X̄2 = X̄1

The above test result indicated that mean milk yield of cows under the third type of feed is
significantly higher than the yield under the first and second type of feeds. But the difference in
yield under the first and the second type of feed is not significant.
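The pairwise q computations above can be sketched in a few lines of Python. The MS error, n and critical q value are taken from the text; the dictionary keys are illustrative labels for the three feed groups.

```python
import math

# Tukey pairwise comparisons for the three feed group means (a sketch).
means = {"grazing": 450.0, "hay": 462.6, "concentrate": 502.4}
ms_error = 316.0333        # error MS from the ANOVA table
n = 5                      # replicates per group
q_crit = 3.773             # q0.05(3, 12) from the q table

se = math.sqrt(ms_error / n)                     # ≈ 7.95
significant = {}
names = list(means)
for i in range(len(names)):
    for j in range(i + 1, len(names)):
        a, b = names[i], names[j]
        q_cal = abs(means[b] - means[a]) / se    # q statistic for the pair
        significant[(a, b)] = q_cal >= q_crit    # True: the two means differ
```

Running this marks both comparisons involving the concentrate group as significant and the grazing vs hay comparison as not significant, matching the summary table.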

To convey this message, make a report table of the mean yield under the respective treatment
categories and assign different superscript letters to means that differ, starting with 'a' for
the smallest mean, 'b' for the next larger, etc. Note that the SE and the number of replications
per treatment group should be indicated in the report table. The report table may look like the
following:

Table 9.2 Mean milk yield of Borana cows under the three experimental feeds

Milk yield

Feed type                               n   Mean     SE
Grazing                                 5   450.0a   7.95
Grazing + hay supplementation           5   462.6a   7.95
Grazing + concentrate supplementation   5   502.4b   7.95

Means with different superscript letters are significantly different (α<0.05)

In the above table note the following

 The table should have concise but informative table caption (title) as above.
 There are no grid lines dividing the table parts column wise or row wise except at the
table heading and the last row.
 The SE values are the SE used in the q test. After an ANOVA test it is customary to give a
common standard error value to all the groups rather than separate standard errors.
 A footnote as above should be given below the table if in fact some group means differ.
Note that if none of the group means differ from each other, do not bother to give the
same superscript letter to all; superscript letters are assigned only if at least one
group mean is different from the others.

Exercise assignment three Q#3

1. The effects of three levels of DAP fertilizer application on the yield of a given corn variety
were evaluated.
The three levels of DAP tried were:
Level 1: 50 kg of DAP per hectare
Level 2: 100 kg of DAP per hectare
Level 3: 150 kg of DAP per hectare
For the experiment, 15 similar plots of land each measuring 4 m by 5 m were prepared. Since
the 15 plots were practically similar (e.g. in soil condition, etc.), they were randomly
assigned to the three fertilizer treatments. Accordingly, 5 plots were assigned to each
fertilizer treatment. Then the same variety of corn was sown in all 15 plots. All 15
plots were also managed in a similar manner except that they received different levels of
fertilizer treatment. Upon maturity the corn was harvested from all 15 plots in a single day
and the yield from each plot was separately recorded. The data are shown in the table below.
Test if corn yield (quintals/hectare) differed under the three levels of DAP application.
Use 5% level of significance. Also perform a Tukey multiple comparison test to find out which
DAP level treatments significantly differed in yield from which others.

Corn yield (Quintals/ hectare) data harvested from the respective plot.

Level of DAP application
50 kg/hectare       100 kg/hectare      150 kg/hectare
33.0                39.0                40.5
34.0                40.5                42.3
35.0                41.2                43.2
35.8                42.1                44.5
37.2                43.5                45.7
Mean 1 = 35.0       Mean 2 = 41.26      Mean 3 = 43.24
S²1 = 2.62          S²2 = 2.853         S²3 = 4.008
∑X1 = 175           ∑X2 = 206.3         ∑X3 = 216.2
Total sum = 597.5
Grand mean = 597.5/15 = 39.833
2. Among the many bacteria species that cause mastitis problems in cattle, Streptococcus sp. is
the most common type. Some time ago, one group of ARSc senior students did their
senior research on the effectiveness of three types of antibiotics (i.e., tetracycline,
streptomycin and ampicillin) against the growth of Streptococcus bacteria. Accordingly, 30
petri dishes containing similar growing media were prepared and inoculated with a pure
isolate of Streptococcus bacteria. The petri dishes were then incubated for 48 hrs to allow
bacterial growth. Afterwards the 30 petri dishes were randomly divided into 3 groups of ten,
and one group was treated with a solution of tetracycline. Likewise the remaining two groups
were treated with streptomycin and ampicillin solutions, respectively. The antibiotics were
applied by dropping the solution right at the center of each petri dish. The treated petri
dishes were then left for 24 hours and the effects of the antibiotic treatments were assessed.
When the antibiotic kills the bacteria, it creates a clear area at the center of the petri
dish that is devoid of bacteria cells. Accordingly, an antibiotic treatment that created a
wider clear area at the center of the petri dish was considered more effective than one that
created a narrower clear zone (or no clear zone). Hence the width of the clear area at the
center of each petri dish was measured as a means of expressing the effectiveness of the
antibiotic in killing the bacteria. The data below are measurements of clear area (in mm²)
recorded for each of the 30 petri dishes.
Streptomycin   Tetracycline   Ampicillin
28             25             21
20             23             19
25             21             17
23             28             24
25             29             25
33             21             17
34             17             13
30             23             19
32             19             15
35             16             12
Total = 285    Total = 222    Total = 182
Grand total = ΣXi = 689
Uncorrected total sum of squares = ΣΣX² = 16933
2.1 Do the three antibiotics differ in their effectiveness against the tested bacteria species
(Streptococcus spp.)? Perform ANOVA test at 5 % level of significance (14 points)
2.2 Which of the three antibiotic treatments is more effective against Streptococcus species?
Perform Tukey multiple comparison test at 5% level of significance (6 points).

3. Fluoride, if consumed in substantial quantities, deposits in bone tissue replacing calcium,
a condition that usually results in weak, easily fracturing bones. Fluoride is found in
substantial quantities in the soils of the rift valley regions, as can be seen from the
discoloration of the teeth of individuals brought up in the rift valley regions from
childhood. Some scholars have the impression that the soils of low altitude areas in the
rift valley contain higher quantities of fluoride than the soils of higher altitude areas. To
test this hypothesis, an interested investigator selected three areas within the rift valley
that differed in altitude. These were the Hawassa area (representing higher altitude), the
Ziway area (representing medium altitude) and the Metahara area (representing lower
altitude). Then from each of these three areas 10 soil samples were randomly collected for
analysis. The data below are fluoride contents (mg/gm of soil) of the 30 soil samples
collected from the three areas.

Hawassa region      Ziway region        Metahara region
(higher altitude)   (medium altitude)   (lower altitude)
1.1                 1.3                 1.8
1.2                 1.4                 2.0
1.0                 1.2                 1.9
1.3                 1.5                 2.1
1.2                 1.4                 1.7
1.3                 1.3                 2.1
1.1                 1.5                 2.0
1.2                 1.4                 2.2
1.1                 1.5                 2.1
1.3                 1.2                 2.2
Total = 11.8        Total = 13.7        Total = 20.1

Grand total ∑∑Xi = 45.6

Uncorrected sum of squares of all observations ∑∑Xi² = 73.56
3.1 Do the data support the hypothesized idea? Test at 5 % level of significance

3.2 Perform the appropriate post ANOVA analysis.

3.3 Estimate the 95% CI for the mean fluoride concentration of the areas that do not differ
from each other. (10 points)


Chapter 10 Two Factor Analysis of Variance

10.1 Introduction to Multifactor Analysis of variance (Factorial ANOVA)

So far you have been introduced to the procedure for evaluating the effects of one factor
(treatment) on a response variable; the procedure is called single factor ANOVA because only
one cause of variation in the response variable is considered. On the other hand, multifactor
ANOVA, or factorial ANOVA, deals with evaluation of the simultaneous effects of two or more
factors on the response variable being studied. The procedure will be illustrated with the
following two factor ANOVA experiment, which involves evaluating the effects of the three
feeds on the milk yield of the Borana and Arsi cattle breeds.

Consider an experiment designed to evaluate the milk yield of Borana and Arsi cattle
under the three types of feeds introduced earlier, i.e., grazing alone (feed 1), grazing + hay
supplementation (feed 2) and grazing + concentrate supplementation (feed 3). For the
experiment, 15 cows from each breed were considered, and all 30 cows were similar in all
aspects that cause variation in milk yield (same age, parity, health condition, etc.).
Accordingly, the 15 Arsi cows were randomly grouped into 3 groups of 5 cows each and each
group was assigned to one of the three feeds. Similarly, the 15 Borana cows were also
grouped into three groups of 5 cows and assigned randomly to one of the three feeds.

The layout of the experiment is as follows.

                   Feed type (Factor B)
Breed (Factor A)   Level 1 (grazing)   Level 2 (grazing + hay)   Level 3 (grazing + concentrate)   Breed summary
Borana             n1 (5 cows)         n2 (5 cows)               n3 (5 cows)                       15 cows
Arsi               n4 (5 cows)         n5 (5 cows)               n6 (5 cows)                       15 cows
Feed summary       10 cows             10 cows                   10 cows                           N = 30 cows

In due consideration, six barns were prepared to hold each group of 5 cows separately, and all
30 cows were managed in a similar manner (same type of housing, health care, etc.) during the
experimental period. Accordingly, milk yield data were collected from each cow throughout the
lactation period, and based on this the total milk per lactation of each cow was estimated.
Recall that the main aim of the experiment was to evaluate whether the milk yield of the two
breeds of cows differed under the three types of feeding conditions, i.e., which breed of cow
gave better milk yield under which of the three feeds?

• Before going to the details of the experiment note the following three points

i) As explained above, all the 30 cows were similar in all aspects that cause variation in
milk yield (response variable) except that they were from two breeds and that they were
given three different types of feeds.

Since the experimental subjects (the 30 cows) were similar, this condition enabled the
experimenter to assign the subjects to treatment groups randomly. Accordingly, the
experimental design is called a completely randomized design (CRD). Hence, factorial ANOVA
experiments are also designed as CRD (i.e., subjects are assigned randomly to treatments) if the
investigator is reasonably sure that the experimental subjects are similar in factors that possibly
vary their response.

ii) As mentioned above the purpose of the experiment is to investigate the effects of breed
and feed type difference on milk yield. Thus two factors were considered to cause
variation in the response variable. Accordingly this particular type of factorial ANOVA
design is called a two factor ANOVA design and it is the simplest case of a multifactor
ANOVA design experiment.
In factorial ANOVA experiment, each factor is denoted by capital letters (A, B, C etc.). Note
that in the present experiment, breed type is designated as factor A and feed type as factor

B. Note that the reference letters (A, B, C, etc.) are arbitrarily assigned to the factors.
Here feed could equally have been designated as factor A and breed as factor B.

iii) Also note that the levels of each factor are designated by the corresponding small letters
(as a, b, c, etc.).
Factor A (breed type) has two levels (categories), thus a = 2

Factor B (feed type) has three levels (categories) thus b = 3

Thus, this particular ANOVA design is represented as a 2 X 3 (a two by three) two factor
ANOVA design.
Consider a factorial ANOVA experiment designed as 3 X 4 X 4.
How many factors are considered in the study?
How many levels does each factor have?
Give values of a, b and c?

10.1.1 Aims (advantages) of a factorial design experiment

i) Saves time and often money

Factorial experiments enable evaluation of two or more factors in a single run. For
example, in the above mentioned experiment, the effects of both breed and feed could be
evaluated with a single experiment. But if the investigator wished to investigate the
effects of the two factors separately, he would have to run five separate single factor
experiments, i.e.,
 Evaluating the effects of breed difference on milk yield requires running three different
experiments, one for each of the three feeds
Comparing the milk of Arsi and Borana under grazing
Comparing the milk of Arsi and Borana under grazing + hay

Comparing the milk of Arsi and Borana under grazing + concentrate feed

 Again, evaluating the effects of the three feeds on milk yield requires running two different
experiments, one on each of the two breeds.
• Comparing the milk yield of Borana under the three feed

• Comparing the milk yield of Arsi under the three feeds.

Thus, as shown above, five separate single factor experiments would need to be conducted,
which is obviously time consuming and probably more expensive, whereas the required
information can be obtained with a single experiment designed as a two factor experiment.

ii) The other most important advantage of a factorial experiment is that it allows evaluation of
both the main and the interaction effects of the factors considered on the response variable.
For example, the interest in the present example is to find answers to the following questions:
a) Do Borana and Arsi differ in milk yield, i.e., breed effect on milk yield (Main effects of
factor A)

b) Does milk yield of the cows differ under the three feeds, i.e., feed effect on milk yield
(Main effects of factor B)

c) Does the milk yield difference between Borana and Arsi depend on the type of feed they
are given?

This can also be addressed as follows

Does the difference in milk yield under the three feeds depend on the type of breed considered?

Or which breed is better in milk yield under which feed

Here the third question implies evaluating the interaction effect of the two factors on the
response variable. The interaction effect of the factors refers to their combined effect, and
it reflects the dependence of one factor's effect on the other factor.

In a given study, it is often more important to investigate the interaction effect of factors
than the main effects. For example, it is more interesting to find out which of the two breeds
of cows performs better under which type of feed.

Conducting five separate single factor experiments provides answers only to the main
(separate) effects of the two factors. Hence the experiment should be designed as a two factor
experiment to allow evaluation of the interaction effects as well.

10.1.2 Illustration of main and interaction effects of factors in a factorial experiment


Consider the following 2 X 2 two factor ANOVA experiment designed to evaluate the growth of
two different fish species (a tropical and a temperate fish species) reared under two different
water temperatures (cold and warm water temperatures).

 Back ground of the experiment

Tropical fish species are warm water adapted and grow at an optimum rate at water
temperatures around 25 oC. When the water temperature falls below 20 oC their feeding and
metabolism slow down and they grow at a slow rate. At water temperatures around 15 oC they
stop feeding, and at temperatures below 10 oC they die when exposed for an extended time.

On the other hand, temperate fish species grow at an optimum rate around 15 oC and their growth
slows down when the temperature increases to 20 to 25 oC. Exposure to temperatures around
30 oC for an extended period of time can be fatal. This is because temperate fish are adapted
to live in cold water that is highly saturated with oxygen, and as the water temperature
increases the oxygen saturation falls to levels critical for the survival of these kinds of
fish. To authenticate the above stated biology, two different aquarium experiments were
conducted as outlined below.

Experiment 1: An experiment that shows interaction effects of two factors

Comparison of the growth of a tropical fish species (tilapia) and a temperate fish species
(Salmon) under cold and warm water temperatures (16 oC and 25 oC). Four glass aquaria were
prepared for the experiment and the water temperature of the two aquaria was adjusted at 16 oC,
while that of the other two aquaria was adjusted at 25 oC. Ten yearling tilapia (representing a
tropical fish) were randomly divided into two groups and one of the group was kept in aquarium
containing cold water (16 oC) and the other group in aquarium containing warm water (25 oC).

In a similar manner, 10 yearling Salmon (representing a temperate fish) were randomly divided
into two groups; one group was kept in an aquarium maintained at 16 oC and the other group in
an aquarium maintained at 25 oC.

Set up of the experiment

16 oC water temperature    25 oC water temperature
Aquarium 1: 5 tilapia      Aquarium 2: 5 tilapia
Aquarium 3: 5 Salmon       Aquarium 4: 5 Salmon

• Note the following

i) All the 20 fish were similar in factors that affect growth except that they were from two
different species and maintained under two different water temperatures

ii) The four aquaria were managed similarly

Data collection

The length of each fish was measured at the start and then at weekly intervals for a period of
one month. The growth of each fish was quantified as the average increase in length per month.
As will be explained later, this experiment shows an interaction effect of the two factors.

For comparison, another experiment that shows no interaction effects was designed, as outlined
below.

Experiment 2 An experiment that shows no interaction effects


• Evaluating the growth of two tropical fish species (Tilapia and catfish) under cold
and warm water temperatures (16 oC and 25 oC).
In this experiment the investigator also prepared 4 glass aquaria and kept two of them under cold
water temperature (16 oC ) and the other two under warm water temperature (25 oC ). He then

took ten yearling tilapia and randomly kept 5 of them in aquarium containing cold water and the
other 5 in aquarium containing warm water. In a similar manner he took 10 yearling catfish and
maintained 5 under cold water and the other 5 under warm water. Here also all the 20 fish were
similar in factors that affected growth and they were managed similarly. In due consideration,
data on length growth of each fish was collected as explained earlier.

Setup of the experiment

16 oC water temperature    25 oC water temperature
Aquarium 1: 5 tilapia      Aquarium 2: 5 tilapia
Aquarium 3: 5 Catfish      Aquarium 4: 5 Catfish
Table 10.1 Data of experiment 1: the effects of rearing water temperature on the growth of a
tropical fish (tilapia) and a temperate fish (Salmon). Data are length increase in mm per
month of each fish.

Fish species          Temperature (Factor B)
(Factor A)            16 oC                          25 oC                          Species summary
Tilapia               2, 3, 4, 3, 1                  7, 6, 7, 6, 8
                      n = 5, ∑X = 13, Mean = 2.6     n = 5, ∑X = 34, Mean = 6.8     n = 10, ∑X = 47, Mean = 4.7
Salmon                6, 4, 7, 5, 5                  3, 4, 1, 2, 1
                      n = 5, ∑X = 27, Mean = 5.4     n = 5, ∑X = 11, Mean = 2.2     n = 10, ∑X = 38, Mean = 3.8
Temperature summary   n = 10, ∑X = 40, Mean = 4.0    n = 10, ∑X = 45, Mean = 4.5    N = 20, ∑X = 85, Mean = 4.25

Interpretation of experiment 1 data

• The data illustrate that the effect of one factor depended on the other factor. For
example, tilapia growth was slightly faster overall than salmon growth, BUT this depended on
the rearing water temperature: tilapia grew better at the warmer water temperature, whereas
salmon growth was faster at the colder water temperature.

• This is an example of an experiment in which the interaction effect of the factors is more
important than the independent effects of the two factors

Table 10.2 Data of experiment 2: the effects of rearing water temperature on the growth of
two tropical fish species, tilapia and catfish. Data are length increase in mm per
month of each fish.

Fish species          Temperature (Factor B)
(Factor A)            16 oC                           25 oC                           Species summary
Tilapia               2, 3, 4, 3, 1                   7, 6, 7, 6, 8
                      n = 5, ∑X = 13, Mean = 2.6      n = 5, ∑X = 34, Mean = 6.8      n = 10, ∑X = 47, Mean = 4.7
Catfish               10, 13, 14, 11, 12              15, 18, 19, 16, 17
                      n = 5, ∑X = 60, Mean = 12.0     n = 5, ∑X = 85, Mean = 17.0     n = 10, ∑X = 145, Mean = 14.5
Temperature summary   n = 10, ∑X = 73, Mean = 7.3     n = 10, ∑X = 119, Mean = 11.9   N = 20, ∑X = 192, Mean = 9.6

Interpretation of experiment 2 data

• The data illustrate that the effect of one factor did not depend on the other factor. For
example, both species grew faster at the warmer water temperature than at the colder water
temperature. Similarly, catfish grew faster than tilapia, and this was true under both water
temperatures.

• This is an example of an experiment in which the main effects of the factors are more
important than the interaction effect of the two factors
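The contrast between the two experiments can also be seen numerically from the cell means: the factors interact when the temperature effect changes direction (or size) from one species to the other. A minimal Python sketch using the cell means from Tables 10.1 and 10.2 (the variable names are illustrative):

```python
# Warm-minus-cold growth difference within each species, from the cell means.
exp1 = {"tilapia": (2.6, 6.8), "salmon": (5.4, 2.2)}     # (16 oC, 25 oC) means
exp2 = {"tilapia": (2.6, 6.8), "catfish": (12.0, 17.0)}

def temp_effect(cells):
    # Positive value: the species grew faster in the warm water.
    return {spp: warm - cold for spp, (cold, warm) in cells.items()}

eff1 = temp_effect(exp1)   # signs differ across species -> interaction
eff2 = temp_effect(exp2)   # same sign, similar size -> no interaction
```

In experiment 1 the temperature effect is +4.2 mm/month for tilapia but -3.2 for salmon (the effect reverses, i.e., interaction); in experiment 2 it is +4.2 and +5.0 (roughly parallel, i.e., no interaction).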

10.2 Computational procedure for two factor ANOVA design


The computational procedure of a two factor ANOVA design is illustrated using the data of
experiment 1 shown in Table 10.1 above.

In a two factor ANOVA there are three sources of variation whose effects need to be tested:
the independent effects of the two factors and the interaction effect of the two factors.
Therefore three hypotheses should be stated, and three mean squares (one for testing each
effect) as well as the error mean square should be computed. The procedure is as follows.

1) Hypothesis
1.1) The effects of species difference on growth (Factor A)
The question of interest: Are the two fish species different in growth?
Ho: The two fish species are not different in growth

HA: The two fish species are different in growth

1.2) The effects of rearing water temperature on growth


The question of interest: Is fish growth different under the two rearing water
temperatures?
Ho: Fish growth is the same under the two rearing water temperatures
HA: Fish growth is different under the two rearing water temperatures

1.3) The interaction effects of species and rearing water temperature on growth
The question of interest: Is the difference in growth between the two fish species
dependent on the temperature in which they are reared?
Ho: The difference in growth between the two fish species does not depend on the rearing
water temperature in which the fish were reared.

HA: The difference in growth between the two fish species depends on the rearing water
temperature in which the fish were reared.

As stated above, the interaction hypothesis is set to evaluate the dependence of Factor A
on Factor B. Similarly, the hypothesis can be stated in a manner that allows evaluation of
the dependence of Factor B on Factor A such as:

The question of interest: Is the difference in growth under the two water temperatures
dependent on the species of fish?
Ho: The difference in growth under the two water temperatures does not depend on the
species of fish.

HA: The difference in growth under the two water temperatures depends on the species
of fish.

2) Computational procedure
i) Total SS = ∑∑Xi² – (∑∑Xi)²/N = (2² + 3² + 4² + .......... + 1²) – 85²/20 = 455 – 361.25 = 93.75
Total DF = N – 1 = 20 – 1 = 19

ii) Species SS = [∑(species total)²/bn] – (∑∑Xi)²/N = (47²/10 + 38²/10) – 85²/20 = 4.05

Note that the denominator for (species total)² is bn because the number of values added to
give each Factor A total is bn.

Spp DF = a – 1 = 2 – 1 = 1

iii) Temperature SS = [∑(Temp total)²/an] – (∑∑Xi)²/N = (40²/10 + 45²/10) – 85²/20 = 1.25

Note that the denominator for (Temperature total)² is 'an' because the number of values
added to give each Factor B total is 'an'.

Temperature DF = b – 1 = 2 – 1 = 1

At this point consider the following relationship

Total SS = SPP SS + Temperature SS + SPP X Temp SS + error SS

The first three terms on the right (SPP SS + Temperature SS + SPP X Temp SS) add up to a sum
of squares known as the cell SS (also called the treatment combination SS), which is fairly
simple to compute.

In the procedure completed so far, the Total SS, SPP SS and Temperature SS have already been
computed. Only the interaction and the error SS remain. These two are tedious to calculate
directly, so they are obtained by difference based on the relationship shown above, as
follows:

SPP X Temp SS = Cell SS – SPP SS – Temperature SS

error SS = Total SS – Cell SS

Accordingly the next step is to compute Cell SS as follows

iv) Cell SS = [∑(cell total)²/n] – (∑∑Xi)²/N = (13²/5 + 27²/5 + 34²/5 + 11²/5) – 85²/20 = 73.75

Cell DF = ab – 1 = (2 × 2) – 1 = 3

v) SPP X Temperature SS = Cell SS – SPP SS – Temperature SS = 73.75 – 4.05 – 1.25 = 68.45

SPP X Temperature DF = Cell DF – SPP DF – Temperature DF = 3 – 1 – 1 = 1
= (a – 1) × (b – 1) = 1 × 1 = 1

vi) error SS = Total SS – Cell SS = 93.75 – 73.75 = 20

error DF = Total DF – Cell DF = 19 – 3 = 16
= N – ab = 20 – (2 × 2) = 16
= ab(n – 1) = (2 × 2)(5 – 1) = 16
Two factor ANOVA summary table

Source              SS      DF   MS      F cal   F critical           Decision
Total               93.75   19
Cell                73.75   3
Spp                 4.05    1    4.05    3.24    F0.05(1,16) = 4.49   Accept Ho
Temperature         1.25    1    1.25    1.00    F0.05(1,16) = 4.49   Accept Ho
Spp X Temperature   68.45   1    68.45   54.76   F0.05(1,16) = 4.49   Reject Ho
Error               20.00   16   1.25

Conclusion

 The two fish species are not different in growth, and the growth of the fish is the same
under the two water temperatures. On the other hand, there is a significant SPP by
temperature interaction effect on growth, i.e., the difference in growth between the two
fish species depends on the temperature in which they are reared: the warm water fish
(tilapia) grew faster than the cold water fish (salmon) at the higher rearing water
temperature, while the reverse was true at the colder rearing water temperature. The
summary report table is as follows.

Table 10.3 The growth in length of the two fish species under the two experimental water
temperatures

                     Length growth (mm/month)
Category             n    Mean   SE
Species
Tilapia              10   4.7a   0.354
Salmon               10   3.8a   0.354
Temperature
16 oC                10   4.0a   0.354
25 oC                10   4.5a   0.354
SPP X Temperature
Tilapia, 16 oC       5    2.6a   0.5
Tilapia, 25 oC       5    6.8b   0.5
Salmon, 16 oC        5    5.4b   0.5
Salmon, 25 oC        5    2.2a   0.5

Mean values under the same category that bear different superscript letters are significantly
different (α < 0.05).

Note the following

 The standard errors in the above table are computed as the square root of the error mean square divided by the number of observations averaged in each mean: each species or temperature mean averages 10 fish, and each interaction cell mean averages 5 fish. The values were computed as follows:
SE for SPP categories = √(error MS/(bn)) = √(1.25/10) = 0.354
SE for Temp categories = √(error MS/(an)) = √(1.25/10) = 0.354
SE for interaction cell means = √(error MS/n) = √(1.25/5) = 0.50
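A quick numerical check of these standard errors, using the error MS = 1.25 and the sample sizes of this example (a = b = 2 levels, n = 5 replicates per cell):

```python
from math import sqrt

error_ms = 1.25   # error mean square from the ANOVA table
a = b = 2         # levels of species and of temperature
n = 5             # replicates per cell

se_spp  = sqrt(error_ms / (b * n))   # each species mean averages b*n = 10 fish
se_temp = sqrt(error_ms / (a * n))   # each temperature mean averages a*n = 10 fish
se_cell = sqrt(error_ms / n)         # each interaction cell mean averages n = 5 fish

print(round(se_spp, 3), round(se_temp, 3), round(se_cell, 3))
```

This reproduces 0.354, 0.354 and 0.5 as reported in the table.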

Exercise assignment 4 Q#3


1. Do the analysis for data of Experiment 2 (Table 10.2)

2. Consider an experiment designed to evaluate the yield of two varieties of corn under three levels of DAP treatment, i.e., 50 kg/hectare (level 1), 100 kg/hectare (level 2) and 150 kg of DAP/hectare (level 3). For the experiment, 30 similar plots were prepared (each measuring 4 m X 5 m) and they were randomly divided into two groups of 15 plots each. One group of 15 plots was randomly assigned to one of the two corn varieties and the other group of 15 plots to the other variety. The 15 plots sown with corn variety 1 were then randomly divided into three groups of 5 plots, and each group was assigned to one of the three levels of DAP treatment. Likewise, the 15 plots sown with corn variety 2 were randomly divided into three groups of 5 plots, and each group was randomly assigned to one of the three levels of DAP treatment. The data are shown in the following table.
Yield (Q/ha) of the two varieties of corn under the three levels of DAP treatment

Corn variety   50 kg/hectare   100 kg/hectare   150 kg/hectare
Variety 1      33              39               40.5
               34              40.5             42.3
               35              41.2             43.2
               35.8            42.1             44.5
               37.2            43.5             45.7
Variety 2      36              41               46.5
               37              42.5             48.3
               32              43.2             49.2
               32.8            44.1             50.5
               40              45.5             51.7
• Did the two varieties of corn differ in yield?

• Did the yield of corn differ under the three levels of DAP treatments?

• Did the difference in yield between the two varieties of corn depend on the level of DAP
treatment?

• Test at 5 % level of significance
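The computations for this exercise follow the same partitioning used in the worked example above. A minimal Python sketch (pure standard library; the labels `V1`, `DAP50`, etc. are hypothetical names chosen here for illustration) that computes the sums of squares and verifies the partitioning identity, leaving the F tests against the critical values and the conclusions to the reader:

```python
# Two-factor CRD ANOVA sums of squares for the corn-yield exercise.
# Keys are (variety, DAP level); each cell holds the 5 plot yields (Q/ha).
data = {
    ("V1", "DAP50"):  [33, 34, 35, 35.8, 37.2],
    ("V1", "DAP100"): [39, 40.5, 41.2, 42.1, 43.5],
    ("V1", "DAP150"): [40.5, 42.3, 43.2, 44.5, 45.7],
    ("V2", "DAP50"):  [36, 37, 32, 32.8, 40],
    ("V2", "DAP100"): [41, 42.5, 43.2, 44.1, 45.5],
    ("V2", "DAP150"): [46.5, 48.3, 49.2, 50.5, 51.7],
}
all_x = [x for cell in data.values() for x in cell]
N = len(all_x)                       # 30 plots
CF = sum(all_x) ** 2 / N             # correction factor (∑∑Xi)²/N

total_ss = sum(x * x for x in all_x) - CF
cell_ss = sum(sum(c) ** 2 / len(c) for c in data.values()) - CF

def margin_ss(index):
    # SS for one factor, computed from its marginal totals and counts.
    totals = {}
    for key, cell in data.items():
        t = totals.setdefault(key[index], [0.0, 0])
        t[0] += sum(cell)
        t[1] += len(cell)
    return sum(t * t / k for t, k in totals.values()) - CF

variety_ss = margin_ss(0)                        # df = 1
dap_ss = margin_ss(1)                            # df = 2
interaction_ss = cell_ss - variety_ss - dap_ss   # df = 2
error_ss = total_ss - cell_ss                    # df = 24

f_variety = (variety_ss / 1) / (error_ss / 24)
f_dap = (dap_ss / 2) / (error_ss / 24)
f_inter = (interaction_ss / 2) / (error_ss / 24)
print(f_variety, f_dap, f_inter)
```

The partitioning identity TSS = variety SS + DAP SS + interaction SS + error SS serves as a built-in check on the arithmetic.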

3. Consider that the Wolayita zone Ministry of Agriculture conducted a survey on
gastrointestinal parasite infestation in the two classes of small ruminants (sheep and
goats). The study was meant to compare the level of infestation of the two species in
the highland and the lowland areas of the zone. Consider that Areka was chosen
randomly as one study area representing the lowland and Boditi was chosen randomly
among the highland areas of the zone. Accordingly, 10 animals of each species were
randomly sampled from each of the two selected woredas and the number of parasite
eggs per gram of fecal sample was determined for each animal. The data are as
follows.
Species of livestock Studied areas
Lowland (Areka) Highland (Boditi)
  480 150
  310 100
  280 95
  510 200
  330 210
Sheep 370 160
  290 180
  360 120
  340 88
  250 99
  55 170
  79 180
  86 190
  92 220
Goat 40 240
  60 210
  62 230
  75 231
  47 227
  81 219
 Which of the two species is more infested?
 Does the infestation level differ in the two areas?
 Does the difference in infestation level between the two species depend on the area (i.e., is there a species by area interaction)?
Analyze the data and interpret. Use the 5 % level of significance for the test.


Chapter 11 Three factor Analysis of Variance (CRD)


A three factor ANOVA design will be illustrated with an experiment designed to investigate the effects of species difference, sex and rearing water temperature on the respiratory rate (metabolic activity) of three species of crabs reared in the laboratory. For the experiment, 4 replicates of male and female crabs of each of the three species were reared in glass aquaria at three different water temperatures (low, medium and high). Respiratory rate of each crab was measured in terms of oxygen consumption, i.e., ml of O2/liter of body fluid.

The three factors investigated were the following:

• Species: Factor A, a = 3 (Spp1, Spp2, Spp3)

• Water temperature: Factor B, b = 3 (low, medium, high)

• Sex: Factor C, c = 2 (male and female)

n = number of replicates = 4

N = total number of crabs used = a X b X c X n = 3 X 3 X 2 X 4 = 72

All 72 crabs were of the same age, health condition, etc. Thus a CRD design was used to assign the crabs to the different treatments.

Set up of the experiment was as follows

                       Low To    Medium To    High To
SPP 1 (24)   Male      4         4            4
             Female    4         4            4
SPP 2 (24)   Male      4         4            4
             Female    4         4            4
SPP 3 (24)   Male      4         4            4
             Female    4         4            4

A total of 18 glass aquaria were prepared, 6 for each temperature experiment.

Data of the three factor ANOVA experiment is shown by the following table

Table 11.1 Oxygen consumption (ml of O2/liter of body fluid) of males and females of the three SPP of crabs reared under the three experimental water temperatures.

                                          Crab species (Factor A)
Temp (Factor B)       Sex (Factor C)      SPP1    SPP2    SPP3
Low temperature       Male                1.9     2.1     1.1
                                          1.6     1.8     1.0
                                          1.8     2.0     1.2
                                          1.4     2.2     1.4
                      Female              1.8     2.3     1.4
                                          1.4     1.9     1.3
                                          1.7     2.0     1.0
                                          1.5     1.7     1.2
Medium temperature    Male                2.3     2.4     2.0
                                          2.0     2.7     1.9
                                          2.1     2.6     2.1
                                          2.6     2.3     2.2
                      Female              2.4     2.0     2.4
                                          2.7     2.3     2.6
                                          2.4     2.1     2.3
                                          2.6     2.4     2.2
High temperature      Male                2.9     3.6     2.9
                                          2.8     3.1     2.8
                                          3.4     3.4     3.0
                                          3.2     3.2     3.1
                      Female              3.0     3.1     3.2
                                          3.1     3.0     2.9
                                          3.0     2.8     2.8
                                          2.7     3.2     2.9


11.1 Sources of variation


I) Independent effects of the three factors
1. Factor A, species effect on O2 consumption
2. Factor B, temperature effect on O2 consumption
3. Factor C, sex effect on O2 consumption

II) Two way interaction effects on O2 consumption

4. SPP VS Temperature interaction effect on O2 consumption (A X B interaction)
5. SPP VS Sex interaction effect on O2 consumption (A X C interaction)
6. Temperature VS Sex interaction effect on O2 consumption (B X C interaction)

III) Three way interaction effect

7. SPP VS Temperature VS Sex (A X B X C) interaction effect

• Thus 7 different hypotheses need to be set, one for each of the 7 sources of variation. Also 7 mean squares (one for each source) plus the error mean square need to be computed.

11.2 Hypothesis

I) Hypothesis about the main effects of the three factors

1. Main effect of factor A (Spp effect): Do the three spp of crabs differ in O2 consumption?

Ho: The three spp of crabs do not differ in O2 consumption.

HA: The three spp of crabs differ in O2 consumption.

2. Main effect of factor B (Temperature effect): Does O2 consumption of the crabs differ under the three rearing water temperatures?

Ho: The O2 consumption of the crabs is the same under the three temperatures.

HA: The O2 consumption of the crabs is different under the three temperatures.

3. Main effects of factor C (Sex effect) Do male and female crabs differ in oxygen consumption?

Ho: Male and female crabs do not differ in O2 consumption.

HA : Male and female crabs differ in O2 consumption.

II) Hypothesis about two way interaction effects of the factors

4. A X B (Spp VS Temperature ) interaction effects

Question: Does the difference in O 2 consumption among the three species depend on the three
experimental temperatures?

HO: The difference in O2 consumption among the three spp does not depend on the temperature
in which they are reared.

HA: The difference in O2 consumption among the three spp depends on the temperature in which
they are reared.

5. A X C (Spp. VS sex) interaction effect

Question: Does the difference in O2 consumption among the three species depend on the sex of
the crabs?

HO: The difference in O2 consumption among the three spp does not depend on the sex of the
crabs.

HA: The difference in O2 consumption among the three spp depends on the sex of the crabs.

6. B X C (Temperature VS sex) interaction effect

Question: Does the difference in O2 consumption under the three experimental temperatures depend on the sex of the crabs?

HO: The difference in O2 consumption of the crabs under the three temperatures does not depend on the sex of the crabs.

HA: The difference in O2 consumption of the crabs under the three temperatures depends on the sex of the crabs.


III) Hypothesis about three way interaction effects of the factors (A X B X C interaction)

The three way interaction hypothesis can be set in reference to one of the following three
questions

Question: Does the difference in O2 consumption of the three spp of crabs depend on the
temperature and sex of the crabs? OR

Question: Does the difference in O2 consumption of the crabs under the three temperatures
depend on the spp and sex of the crabs? OR

Question: Does the difference in O2 consumption between male and female crabs depend on the
spp and experimental temperatures?

The hypothesis statement set in reference to the first question is as follows

HO: The difference in O2 consumption among the three spp of crabs does not depend on the temperature and sex of the crabs.

HA: The difference in O2 consumption among the three spp of crabs depends on the temperature and sex of the crabs.

Table 11.2 General outline of the analysis of variance table

Source            SS        Df                              MS                 Fcal
Total             TSS       abcn - 1 (72 - 1 = 71)          -                  -
Factor A (SPP)    SPP SS    a - 1 (3 - 1 = 2)               SPP SS/(a-1)       SPP MS/Error MS
Factor B (Temp)   Temp SS   b - 1 (3 - 1 = 2)               Temp SS/(b-1)      Temp MS/Error MS
Factor C (Sex)    Sex SS    c - 1 (2 - 1 = 1)               Sex SS/(c-1)       Sex MS/Error MS
A X B             AB SS     (a-1)(b-1) (2 X 2 = 4)          AB SS/AB df        AB MS/Error MS
A X C             AC SS     (a-1)(c-1) (2 X 1 = 2)          AC SS/AC df        AC MS/Error MS
B X C             BC SS     (b-1)(c-1) (2 X 1 = 2)          BC SS/BC df        BC MS/Error MS
A X B X C         ABC SS    (a-1)(b-1)(c-1) (2 X 2 X 1 = 4) ABC SS/ABC df      ABC MS/Error MS
Error             Error SS  abc(n-1) (3 X 3 X 2 X 3 = 54)   Error SS/error df  -
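The degrees of freedom bookkeeping in the outline above can be sketched as follows; the component df must add back up to the total df, N - 1 = 71.

```python
# Degrees of freedom for the three-factor CRD ANOVA of the crab experiment.
a, b, c, n = 3, 3, 2, 4          # species, temperatures, sexes, replicates
N = a * b * c * n                # 72 crabs in total

df = {
    "A":     a - 1,
    "B":     b - 1,
    "C":     c - 1,
    "AB":    (a - 1) * (b - 1),
    "AC":    (a - 1) * (c - 1),
    "BC":    (b - 1) * (c - 1),
    "ABC":   (a - 1) * (b - 1) * (c - 1),
    "Error": a * b * c * (n - 1),
}
# Check: the component df must sum to the total df, N - 1.
assert sum(df.values()) == N - 1
print(df)
```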

The analysis has been done using SPSS software and the following results were obtained

Three factor ANOVA summary table

Source                          SS       Df                        MS       Fcal     Fcritical   Decision
Total                           30.355   abcn - 1 = 71             -        -        -           -
Factor A (SPP)                   1.818   a - 1 = 2                 0.9088   24.475   3.23        **
Factor B (Temp)                 24.656   b - 1 = 2                 12.328   332.02   3.23        ***
Factor C (Sex)                   0.009   c - 1 = 1                 0.0089   0.2394   4.08        NS
A X B (SPP X Temp)               1.102   (a-1)(b-1) = 4            0.2754   7.148    2.61        *
A X C (SPP X Sex)                0.370   (a-1)(c-1) = 2            0.1851   4.986    3.23        *
B X C (Temp X Sex)               0.175   (b-1)(c-1) = 2            0.0876   2.36     3.23        NS
A X B X C (SPP X Temp X Sex)     0.221   (a-1)(b-1)(c-1) = 4       0.0551   1.485    2.61        NS
Error                            2.005   abc(n-1) = 54             0.0371   -        -           -
Conclusion

 The three species of crabs were different in O2 consumption.

 O2 consumption of the crabs also differed under the three experimental water temperatures.

 There was also a SPP by temperature and a SPP by sex interaction effect on O2 consumption of the crabs.

Table 11.3 Mean oxygen consumption rate (ml/liter of body fluid) of male and female crabs of
the three species under the three experimental water temperatures.

Oxygen consumption rate (ml/lit)

Category N Mean SE

Sex

Male 36 2.34a 0.0321

Female 36 2.31a 0.0321

Species

Species 1 24 2.35b 0.03933

Species 2 24 2.51c 0.03933

Species 3 24 2.12a 0.03933

Temperature

Low 24 1.61a 0.03933

Medium 24 2.32b 0.03933

High 24 3.05c 0.03933
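The factor-level means in Table 11.3 can be recomputed directly from the raw data of Table 11.1; a quick check in Python (data transcribed from Table 11.1):

```python
# Raw data of Table 11.1: data[temperature][sex] holds the 4 replicate rows,
# each a (SPP1, SPP2, SPP3) triplet of oxygen-consumption readings.
data = {
    "low": {
        "male":   [(1.9, 2.1, 1.1), (1.6, 1.8, 1.0), (1.8, 2.0, 1.2), (1.4, 2.2, 1.4)],
        "female": [(1.8, 2.3, 1.4), (1.4, 1.9, 1.3), (1.7, 2.0, 1.0), (1.5, 1.7, 1.2)],
    },
    "medium": {
        "male":   [(2.3, 2.4, 2.0), (2.0, 2.7, 1.9), (2.1, 2.6, 2.1), (2.6, 2.3, 2.2)],
        "female": [(2.4, 2.0, 2.4), (2.7, 2.3, 2.6), (2.4, 2.1, 2.3), (2.6, 2.4, 2.2)],
    },
    "high": {
        "male":   [(2.9, 3.6, 2.9), (2.8, 3.1, 2.8), (3.4, 3.4, 3.0), (3.2, 3.2, 3.1)],
        "female": [(3.0, 3.1, 3.2), (3.1, 3.0, 2.9), (3.0, 2.8, 2.8), (2.7, 3.2, 2.9)],
    },
}

def spp_mean(i):
    # Mean of species i over all temperatures, sexes and replicates (24 crabs).
    vals = [row[i] for temp in data.values() for sex in temp.values() for row in sex]
    return sum(vals) / len(vals)

def temp_mean(t):
    # Mean at temperature t over all species, sexes and replicates (24 crabs).
    vals = [x for sex in data[t].values() for row in sex for x in row]
    return sum(vals) / len(vals)

print([round(spp_mean(i), 2) for i in range(3)])   # species means
print([round(temp_mean(t), 2) for t in data])      # temperature means
```

This reproduces the species means 2.35, 2.51, 2.12 and the temperature means 1.61, 2.32, 3.05 reported in Table 11.3.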

Interpretation

Species 3 is the lowest in O2 consumption, SPP 1 is intermediate, while SPP 2 is metabolically the most active. Oxygen consumption of the crabs significantly increased with increasing water temperature. On the other hand, male and female crabs of the tested species did not differ in oxygen consumption.


Table 11.4 Species by temperature interaction effect on oxygen consumption of the crabs

                                               95 % CI
Temperature   Species   n   Mean     SE      Lower Bound   Upper Bound
Low           Spp 1     8   1.638b   0.068   1.501         1.774
              Spp 2     8   2.000c   0.068   1.863         2.137
              Spp 3     8   1.200a   0.068   1.063         1.337
Medium        Spp 1     8   2.388a   0.068   2.251         2.524
              Spp 2     8   2.350a   0.068   2.213         2.487
              Spp 3     8   2.213a   0.068   2.076         2.349
High          Spp 1     8   3.013a   0.068   2.876         3.149
              Spp 2     8   3.175a   0.068   3.038         3.312
              Spp 3     8   2.950a   0.068   2.813         3.087

Interpretation

 Under the lowest water temperature, the three species differed significantly in metabolic activity: SPP 2 was the most active, SPP 1 intermediate, and SPP 3 the least active. As the water got warmer, however, the differences in metabolic activity among the species were not significant.

Note that the spp by temperature interaction effect happened to be significant because of the differences in metabolic activity among the species under the low temperature.

Table 11.5 Species by sex interaction effect on oxygen consumption of the tested crabs

95% Confidence Interval


Spp Sex Mean SE Lower Bound Upper Bound
Spp 1 Male 2.333a 0.056 2.222 2.445

Female 2.358a 0.056 2.247 2.470

Spp 2 Male 2.617b 0.056 2.505 2.728

Female 2.400a 0.056 2.288 2.512

Spp 3 Male 2.058a 0.056 1.947 2.170

Female 2.183a 0.056 2.072 2.295

Interpretation
In SPP 1 and SPP 3, male and female crabs are comparable in metabolic activity. However, in SPP 2, females are significantly less active than males.

Note that the SPP by sex interaction effect became significant because of the difference between males and females in SPP 2.


Chapter 12 Randomized Complete Block Design (RCBD)

12.1 The effect of using heterogeneous experimental subjects when dealing with Completely Randomized Designs

Consider the example used to illustrate single factor ANOVA (the effects of the three feeds on milk yield of Borana cows) as well as two factor ANOVA (the effects of species difference and rearing water temperature on the growth of fish). In both of these designs, experimental subjects were assigned randomly to treatment groups; thus these designs are called Completely Randomized Designs (CRD). This description holds true for factorial designs with three or more factors as well, as long as the experimental subjects are assigned to treatment groups randomly.

To use completely randomized designs, the experimental subjects being studied must be as similar as possible in all aspects that cause variation in their response to the treatments. For example, in the single factor ANOVA example (feed effect on milk yield), the 15 cows used for the experiment were similar in all aspects that caused variation in milk yield (such as genotype, age, etc.), and 5 cows were randomly assigned to each of the three feeds. Similarity of experimental subjects is an essential prerequisite for a completely randomized design because only when this condition is fulfilled does the within group variance (MSwithin) reflect the basic minimum variation that exists among individuals within the same treatment group.

Otherwise, if the subjects being studied are heterogeneous due to extraneous sources that cause variation in response to the treatments, the difference in the response of individuals that received the same treatment reflects not only the basic minimum variation expected to exist among individuals (σ2 error), but also the variation caused by the extraneous source (σ2 extraneous). That is, Mswithin would comprise both σ2 error and σ2 extraneous. Hence Mswithin becomes large and
when used as a denominator during the F test, it lowers the calculated F (Fcalculated). As a result the
F test fails to declare significant differences.

This fact is illustrated as follows

Fcalculated = Msgroup / Mswithin

= (σ2 treatment + σ2 error) / (σ2 error + σ2 extraneous)

The σ2 error component is common to the numerator and the denominator, so, loosely speaking, the calculated F behaves as

Fcalculated ≈ σ2 treatment / σ2 extraneous

Accordingly, the variance caused by the treatment effect is diluted by the variance due to the extraneous source, which is the situation that lowers the calculated F and makes the F test weak, i.e., unable to declare a significant treatment effect.

On the other hand, if the experimental subjects are homogeneous, Mswithin contains only σ2 error, and the calculated F reflects only the treatment effect, as shown below.

Fcalculated = Msgroup / Mswithin

= (σ2 treatment + σ2 error) / (σ2 error)

which exceeds 1 only to the extent that σ2 treatment is greater than zero.

The effect of using heterogeneous experimental subjects when dealing with a completely randomized design is illustrated by the following example.

Consider an experiment designed to evaluate the effects of three feeds on the weight gain performance of yearling Borana bulls. The researcher wanted to try each of the three feeds on 5 replicate bulls, i.e., he required a total of 15 bulls for the experiment. Yearling Borana bulls at Yavello market range in weight between 100 and 150 kg. Accordingly, the 15 bulls purchased for the experiment ranged in weight between 100 and 150 kg, as shown in the table below.

Table 11.1 ID No and initial weight of the 15 bulls chosen for the feed experiment

ID No of bulls Initial body weight (kg)


1 102
2 106
3 108
4 111
5 115
6 117
7 122
8 125
9 129
10 133
11 137
12 140
13 142
14 146
15 149

 Assigning the bulls to the three feeding groups


Although the bulls were highly different in their initial weight, the researcher used a completely randomized single factor ANOVA design and assigned the bulls at random to one of the three feeding groups. The results of this random assignment are shown in the table below.

Table 11.2 Feeding group of the 15 bulls resulting from random assignment

ID No of bulls Initial body weight (kg) Feeding group of the bulls


1 102 Feed 3
2 106 Feed 1
3 108 Feed 2
4 111 Feed 1
5 115 Feed 3
6 117 Feed 2
7 122 Feed 3
8 125 Feed 2
9 129 Feed 1

10 133 Feed 2
11 137 Feed 1
12 140 Feed 3
13 142 Feed 1
14 146 Feed 3
15 149 Feed 2

 Starting the feeding program and recording the weight gain performance of the bulls under the three feeding groups

After assigning the bulls to the feeding groups, bulls assigned to the same feed were kept in the same barn and feeding started. The bulls were fed for three months and the weight of each bull was recorded every week. The weight gain per week of each bull was then calculated, as shown in the table below.

Table 11.3 Weight gain/wk of the 15 bulls under the three feeds

Feed 1                        Feed 2                        Feed 3
ID No  Body wt  Wt gain       ID No  Body wt  Wt gain      ID No  Body wt  Wt gain
2      106      10.3          3      108      7.7          1      102      8.4
4      111      9.9           6      117      5.7          5      115      7.6
9      129      8.5           8      125      5.3          7      122      5.5
11     137      7.0           10     133      4.7          12     140      4.9
13     142      5.1           15     149      3.5          14     146      3.3
Total gain      40.8          Total gain      26.9         Total gain      29.7
Mean gain       8.16          Mean gain       5.38         Mean gain       5.94
S²1             4.608         S²2             2.372        S²3             4.263

Note that close inspection of the above weight gain data reveals the following facts

 The 5 bulls under each of the three feeds are highly different in their initial weight.

 Under each feeding group, the lighter bulls gained weight faster (almost at double the rate) than the heavier bulls; i.e., there is high variation in the gain performance of bulls under the same feed. As a result, the within group variances (S²1, S²2, and S²3) are quite large.

 There is also a wide difference in mean weight gain per week of the bulls under the three feeds. In particular, the mean weight gain/week under feed 1 (8.16 kg/wk) is markedly higher than the means under feed 2 (5.38 kg/wk) and feed 3 (5.94 kg/wk).

Does the F test declare the presence of significant differences in the gain performance of the bulls under the three feeds? Test at the 5 % level of significance.

 Analysis of the weight gain data


Table 11.4 Weight gain (kg/wk) data of the bulls under the three feeds

Feed 1           Feed 2           Feed 3
Wt gain/wk       Wt gain/wk       Wt gain/wk
10.3             7.7              8.4
9.9              5.7              7.6
8.5              5.3              5.5
7.0              4.7              4.9
5.1              3.5              3.3
Total = 40.8     Total = 26.9     Total = 29.7
Mean = 8.16      Mean = 5.38      Mean = 5.94
S²1 = 4.608      S²2 = 2.372      S²3 = 4.263

ANOVA test

Ho: The weight gain performance of yearling Borana bulls is the same under the three feeds

HA: The weight gain performance of yearling Borana bulls is different under the three feeds

Summary ANOVA table showing the result of the F test

Source   SS       DF   MS       Fcal    Fcritical            Decision
Total    66.589   14
Feed     21.617    2   10.809   2.884   F0.05(2,12) = 3.89   Accept HO
Error    44.972   12   3.748

As shown above, the F test declared no significant differences in weight gain among the bulls fed the three feeds. On the other hand, there is a notable mean weight gain difference under the three feeds; at the least, the mean weight gain under feed 1 is much higher than the means under feeds 2 and 3.
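The CRD analysis above can be reproduced from the weight gain data of Table 11.4 with a short script:

```python
# Single-factor CRD ANOVA (F test) for the weight gain data of Table 11.4.
feeds = {
    "feed1": [10.3, 9.9, 8.5, 7.0, 5.1],
    "feed2": [7.7, 5.7, 5.3, 4.7, 3.5],
    "feed3": [8.4, 7.6, 5.5, 4.9, 3.3],
}
all_x = [x for g in feeds.values() for x in g]
N, k = len(all_x), len(feeds)        # 15 bulls, 3 feeds
CF = sum(all_x) ** 2 / N             # correction factor

total_ss = sum(x * x for x in all_x) - CF
feed_ss = sum(sum(g) ** 2 / len(g) for g in feeds.values()) - CF
error_ss = total_ss - feed_ss

feed_ms = feed_ss / (k - 1)          # df = 2
error_ms = error_ss / (N - k)        # df = 12
f_cal = feed_ms / error_ms
print(round(total_ss, 3), round(feed_ss, 3), round(f_cal, 3))
```

This reproduces Total SS = 66.589, Feed SS = 21.617 and Fcal = 2.884, which falls short of the critical value F0.05(2,12) = 3.89.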

Why, then, did the F test fail to declare significance despite such observable differences?

• Could it be because the observed weight gain difference under the three feeds was in reality not large enough to be statistically significant?

OR

• Is it possible that the variation in initial weight of the bulls under the same feed incurred major weight gain variation under the respective feed, thus resulting in a large mean square error value, which in turn lowered the calculated F?

• Possible explanation

As a matter of fact, the initial body weight of animals (cattle, small ruminants, etc.) has an effect on body weight gain efficiency. Light animals usually gain more efficiently than heavy animals of the same age under the same feeding condition. This phenomenon was clearly shown by the weight gain data above, in which the lighter bulls gained almost at double the rate of the heavier bulls under the respective feed. As a result, the gain of bulls under the same feed was highly different, and this heterogeneous performance of the bulls under each feed may have resulted in a high mean square error (i.e., an inflated value of Mswithin group). As a consequence, Msamong group, when divided by the inflated Mswithin group, gave a lower Fcal value. i.e.,

• Fcal = Msamong group / Mswithin group becomes small as the denominator Mswithin group was inflated.

This can be explained by examining the sources of variations as follows:

1. Msamong group measures the variation in gain among bulls under the three feeds. The variance is
composed of two sources:

i) Variance in gain because of feed difference (σ2feed). i.e., treatment caused variation in
gain.

ii) Variance in gain among individual bulls within the same feed caused by individual
response difference (σ2error). i.e., minimum expected variation in gain among yearling
Borana bulls.
Msamong group = σ2feed + σ2error

2. Mswithin group measures the magnitude of variation in weight gain of bulls under the same feed.
It is composed of the following

i) Variance in gain among individual bulls within the same feed caused by individual
response difference (σ2error). i.e., minimum expected variation in gain among yearling
Borana bulls.

ii) Variance in weight gain due to differences in initial weight of the bulls (σ2initial wt). The bulls within the same feed markedly differed in initial weight, and a difference in initial weight as large as 50 kg (100 – 150 kg range) can result in differences in weight gain performance. This variance is referred to as σ2 extraneous and it is an unwanted source of variation.

Mswithin group = σ2error + σ2initial wt

Under this circumstance, when performing the F test, which is

Fcal = Msamong group / Mswithin group

= (σ2feed + σ2error) / (σ2error + σ2initial wt)

the σ2error component is common to the numerator and the denominator, so loosely

Fcal ≈ σ2feed / σ2initial wt

Accordingly, the variance in gain caused by the treatment effect (σ2feed) is underestimated because of the presence of the unwanted source of variation (σ2initial wt). This was the reason why the calculated F value became small and, when compared to the critical F, the test could not declare significance. Thus, when experimental subjects are known to vary in some way other than the treatment, CRD (random assignment of experimental subjects to treatment groups) should not be used. Instead, a design that separates the external (unwanted) source of variation (σ2extraneous) from σ2error should be employed. One such design is the Randomized Complete Block ANOVA Design.
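The effect of the inflated denominator can be seen numerically with the mean squares from this example: MSfeed = 10.809 in both analyses, the CRD analysis gives MSwithin = 3.748, while the block analysis presented in section 12.2 isolates a remainder MS of 0.309.

```python
# Same treatment mean square, two denominators: the extraneous initial-weight
# variation inflates MSwithin in the CRD analysis and masks the feed effect.
ms_feed = 10.809
ms_within_crd = 3.748        # contains sigma^2 error + sigma^2 initial wt
ms_remainder_rcbd = 0.309    # sigma^2 error only, after blocking

f_crd = ms_feed / ms_within_crd       # below F0.05(2,12) = 3.89 -> not significant
f_rcbd = ms_feed / ms_remainder_rcbd  # far above F0.05(2,8) = 4.46 -> significant
print(round(f_crd, 2), round(f_rcbd, 2))
```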

12.2. Randomized Complete Block Design (RCBD)

This design is employed when dealing with heterogeneous experimental subjects because it enables one to separately estimate the variation in response caused by the unwanted source (in the present example σ2initial wt) from the variance caused by individual response difference (σ2error). Thus Mswithin group comprises only σ2error and therefore can be appropriately used as the denominator variance for the F test. For instance, in the present example
Mswithin group contains σ2error+ σ2initial wt

Thus using a block design σ2initial wt can be separately estimated and Mswithin group becomes σ2error.

Blocking essentially means subdividing the heterogeneous experimental subjects into smaller subgroups of homogeneous experimental subjects. For instance, it is known that steers within a 10 kg body weight range do not markedly differ in weight gain efficiency. Knowing this, the experimenter would subdivide the 15 steers into weight categories in which each category consists of steers weighing within a 10 kg range. This gives 5 weight categories or subgroups, referred to as blocks.

Block 1 consists of 3 steers weighing between 100 – 109 kg

Block 2 consists of 3 steers weighing between 110 – 119 kg

Block 3 consists of 3 steers weighing between 120 – 129 kg

Block 4 consists of 3 steers weighing between 130 – 139 kg

Block 5 consists of 3 steers weighing between 140 – 149 kg
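One way to form the blocks is to rank the animals by initial weight and take successive groups of 3; this reproduces the grouping used below (note that bull 12, at 140 kg, is grouped with block 4 even though the quoted range for block 4 is 130 – 139 kg).

```python
# Rank the 15 bulls by initial weight and split them into 5 blocks of 3,
# lightest block first. Keys are bull ID numbers, values initial weights (kg).
weights = {1: 102, 2: 106, 3: 108, 4: 111, 5: 115, 6: 117, 7: 122, 8: 125,
           9: 129, 10: 133, 11: 137, 12: 140, 13: 142, 14: 146, 15: 149}

ranked = sorted(weights, key=weights.get)        # bull IDs, lightest first
blocks = [ranked[i:i + 3] for i in range(0, 15, 3)]
print(blocks)
```

Each inner list is one block of 3 animals of similar initial weight, matching the block membership shown in the table that follows.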

Table 11.6 below shows the ID No and initial body weight of the steers under each weight category.

Table 11.6 The feed experiment as designed in RCBD fashion

Block Wt category ID No Initial body Feeding Wt gain/wk


weight (kg) group
1 102 Feed 3 8.4
1 100 – 109 kg 2 106 Feed 1 10.3
3 108 Feed 2 7.7
4 111 Feed 1 9.9
2 110 – 119 kg 5 115 Feed 3 7.6
6 117 Feed 2 5.7

7 122 Feed 3 5.5
3 120 – 129 kg 8 125 Feed 2 5.3
9 129 Feed 1 8.5
10 133 Feed 2 4.7
4 130 – 139 kg 11 137 Feed 1 7.0
12 140 Feed 3 4.9
13 142 Feed 1 5.1
5 140 – 149 kg 14 146 Feed 3 3.3
15 149 Feed 2 3.5

As shown in table 11.6, each weight category (block) contains 3 steers that are similar in initial weight. Accordingly, one of the three steers under each weight category was randomly assigned to each of the three feeds. This random assignment of the three steers in each block to the three experimental feeds is shown by the feeding group column of table 11.6. Note that in a block design, experimental subjects are randomly assigned to treatment groups only within a block. This is because the experimental subjects within a block are similar.

The rationale in this design is that the treatments (the three feeds) can be tried on similar subjects within each block, and hence if the experimental subjects (steers) within each block differ in their response (in weight gain), the differences can be reasonably attributed to the effect of the treatments (the feeds). Note that the number of experimental subjects within each block is equal to the number of treatment groups. This is because there should be a recipient of each treatment in the block. Since the experimental subjects within each block are randomly assigned to treatments, the design is called a Randomized Complete Block Design (RCBD).

12.2.1. Conducting the feeding trial, data recording and the ANOVA testing procedure

After dividing the steers into the 5 weight categories (blocks) and randomly assigning them to the three feeds within each block, the feeding experiment started. The steers were fed for three consecutive months and the weight gain of each steer was recorded at weekly intervals. The weight gain column of table 11.6 shows the average weekly body weight gain of each steer over the three
months of feeding period. To perform the ANOVA test, the data shown in table 11.6 are first arranged in a treatment versus block two way table, as shown below by Table 12.7.

Table 12.7 Arrangement of the weight gain data as a treatment VS block two way table

Weight category (Block)     Feed 1   Feed 2   Feed 3   Block summary
Block 1 (100 – 109 kg)      10.3     7.7      8.4      26.4
Block 2 (110 – 119 kg)      9.9      5.7      7.6      23.2
Block 3 (120 – 129 kg)      8.5      5.3      5.5      19.3
Block 4 (130 – 139 kg)      7.0      4.7      4.9      16.6
Block 5 (140 – 149 kg)      5.1      3.5      3.3      11.9

12.2.2 ANOVA test of the RCBD experiment

Hypothesis

• Hypothesis about the main treatment factor

HO: The weight gain performance of the steers is the same under the three feeds

HA: The weight gain performance of the steers is different under the three feeds

• Hypothesis about the block factor (effect of initial body weight on the gain
performance of the steers)

HO: The weight gain performance of the steers under the different weight categories is the same

HA: The weight gain performance of the steers under the different weight categories is different

Note that the hypothesis on the block factor is of secondary interest and may be ignored.


Note that a block ANOVA experiment resembles a two factor ANOVA experiment designed as a completely randomized design, BUT there are basic differences between the two:

i) In a block ANOVA design, the main interest of the experiment is to test the effect of the treatment factor on the response variable (in this case, the effects of the three feeds on the weight gain performance of the steers). The test on the block factor (the effect of initial weight on the weight gain performance of the steers) is included not because it is the original interest of the experiment, but because the researcher had to categorize the steers into similar weight categories in order to avoid the effects of initial weight difference on the gain performance of the steers, i.e., in order to try the three feeds on steers of similar initial weight. In the case of a two factor ANOVA design, both factors are the original interest of the experimenter.

ii) A block ANOVA experiment usually lacks replications in each cell, whereas a two factor ANOVA experiment contains replications.

ANOVA computational procedure of RCBD experiment

i) Total SS = ∑∑Xi² – (∑∑Xi)²/N

= (10.3² + 9.9² + …. + 3.3²) – 97.4²/15 = 66.589
Total df = N – 1 = 15 – 1 = 14

ii) Feed SS = ∑(Feed totals)²/b – (∑∑Xi)²/N

= (40.8² + 26.9² + 29.7²)/5 – 97.4²/15 = 21.617
Feed df = a – 1 = 3 – 1 = 2

iii) Block SS = ∑(Block totals)²/a – (∑∑Xi)²/N

= (26.4² + 23.2² + 19.3² + 16.6² + 11.9²)/3 – 97.4²/15 = 42.503
Block df = b – 1 = 5 – 1 = 4


Note the following relationship

TSS = Feed SS + Block SS + Feed X Block SS + error SS

In a block ANOVA design there is theoretically some degree of interaction between the main
factor and the block factor, but the interaction is assumed to be negligible. In fact, there
is no way of separately estimating the interaction SS, as there are no replicate data per
treatment × block combination cell. Separately estimating the interaction effect is not
important anyway, because it does not affect the test on the treatment factor even if the
interaction between the treatment and the block factor is significant.

Since the interaction term cannot be separately estimated, it merges with the error term and
is therefore referred to as the remainder SS, i.e., the sum of squares remaining after the
treatment and block SS are deducted. The remainder SS is in fact the basic error variance
used as the denominator when testing the treatment (feed) effect. It is computed from the
above relationship as follows:

iv) Remainder SS = TSS – Feed SS – Block SS


= 66.589 – 21.617 – 42.503 = 2.469
Remainder df = Total df – Feed df – Block df
= 14 – 2 – 4 = 8
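As a numerical check, the sums of squares above can be reproduced with a short script (a sketch in Python; the 5 × 3 array is assumed to hold the gain data, with rows as the weight-category blocks and columns as the three feeds):

```python
# Sketch: verifying the RCBD sums of squares for the feed experiment.
# rows = weight-category blocks (100-109 kg ... 140-149 kg), columns = feeds.
gains = [
    [10.3, 7.7, 8.4],
    [9.9,  5.7, 7.6],
    [8.5,  5.3, 5.5],
    [7.0,  4.7, 4.9],
    [5.1,  3.5, 3.3],
]
b = len(gains)             # 5 blocks
a = len(gains[0])          # 3 feeds
N = a * b                  # 15 observations
grand = sum(x for row in gains for x in row)           # 97.4
ct = grand ** 2 / N                                    # correction term (grand total)²/N

total_ss = sum(x * x for row in gains for x in row) - ct
feed_totals = [sum(row[j] for row in gains) for j in range(a)]   # 40.8, 26.9, 29.7
feed_ss = sum(t * t for t in feed_totals) / b - ct
block_ss = sum(sum(row) ** 2 for row in gains) / a - ct
remainder_ss = total_ss - feed_ss - block_ss

print(round(total_ss, 3), round(feed_ss, 3), round(block_ss, 3), round(remainder_ss, 3))
```

The printed values agree with the hand computation (66.589, 21.617, 42.503 and 2.469), with 14, 2, 4 and 8 degrees of freedom, respectively.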

Table 12.8 F test result on data of the feed experiment run as a block design

Source             SS       DF   MS       Fcal    Fcritical           Decision
Total              66.589   14
Feed (treatment)   21.617   2    10.809   35.02   F0.05(2,8) = 4.46   Reject HO
Initial wt
(block factor)     42.503   4    10.626   34.42   F0.05(4,8) = 3.84   Reject HO
Remainder          2.469    8    0.309

Conclusion

 Weight gain of the steers is different under the three feeds


 Also, steers with different initial weights differ in weight gain performance

Table 12.9 F test result on data of the feed experiment run as a CRD design

Source   SS       DF   MS       Fcal    Fcritical            Decision
Total    66.589   14
Feed     21.617   2    10.809   2.884   F0.05(2,12) = 3.89   Accept HO
Error    44.972   12   3.748

As shown above, the test result of the feed experiment when run as a CRD is also displayed.
What are the differences between the two ANOVA test results?

• The feed effect when the experiment was designed in the RCBD fashion was declared
significant, whereas when it was designed as a CRD it was not. The reason is that the error
sum of squares in the ANOVA table of the CRD experiment (44.972) is much higher than the
remainder sum of squares in the RCBD test result (2.469).

• This is because when the experiment was designed as a CRD (single factor ANOVA), the
variation in weight gain performance of the steers within the same treatment (SS within =
44.972) contained both the variation caused by initial weight difference (SS initial weight =
42.503) and the minimum expected variation due to individual response difference (SS
error = 2.469). Thus SS within was inflated.

• Whereas when the experiment was run in the RCBD fashion, the SS due to initial weight
could be estimated separately from the SS error, and thus the remainder SS (2.469) could be
used as the denominator during the test.
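The point of this comparison can be sketched numerically: pooling the block SS into the error term deflates the F ratio below the critical value (the sums of squares are carried at full precision from the computations above):

```python
# Sketch: the same feed SS tested against two different error terms.
feed_ss, block_ss, rem_ss = 21.617333, 42.502667, 2.469333
feed_df, block_df, rem_df = 2, 4, 8

# RCBD: the remainder MS serves as the error term
f_rcbd = (feed_ss / feed_df) / (rem_ss / rem_df)

# CRD: the initial-weight variation stays pooled inside the within-group SS
within_ss = block_ss + rem_ss      # 44.972
within_df = block_df + rem_df      # 12
f_crd = (feed_ss / feed_df) / (within_ss / within_df)

print(round(f_rcbd, 2), round(f_crd, 2))
```

f_rcbd comfortably exceeds F0.05(2,8) = 4.46, while f_crd falls below F0.05(2,12) = 3.89, reproducing the two opposite decisions.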

Exercise

1. Three types of chemicals are used in tanneries for leather processing, and these
chemicals are suspected to have toxic effects on human skin, resulting in irritation and
discoloration. An experiment was conducted on rats to investigate which of the three
chemicals is most toxic to the skin of the handling personnel. Accordingly, 8 rats were
selected for the experiment and the hair on the back of each rat was shaved to expose the
skin. Three spots were then marked on the exposed skin of each rat and each chemical was
applied to one of the spots; thus all three chemicals were applied to each rat. Afterwards,
the degree of discoloration of the skin was ranked from 0 to 10, with 0 representing no
apparent discoloration and 10 denoting complete discoloration of the skin. The data are
as follows.

Analyze the data and interpret the results. Use the 5 % level of significance for the test.

Rat number            Chemical A   Chemical B   Chemical C   Totals by rat
1                     6            5            3            14
2                     9            9            4            22
3                     6            9            3            18
4                     5            8            6            19
5                     7            8            8            23
6                     5            7            5            17
7                     6            7            5            18
8                     6            7            6            19
Total of chemicals ∑X 50           60           40           150 (Grand total)
∑X²                   324          462          220

2. Milk is highly nutritious food but also highly perishable product due to microbial
contamination if it is not handled with proper hygiene. In rural areas, deterioration in quality
happens mainly due to unhygienic milking practice as well as during storage since farmers often
store their milk in clay pots that are not properly cleaned. Thus washing the hands and the
udder of cows before and after milking, as well as proper cleaning of milk storage equipment,
can greatly reduce the risk of microbial contamination and improve the quality of the milk.
As part of her MSc project, a graduate student in the ARSc department studied how much the
quality of milk can be improved by giving training to farmers on hygienic milking practices
(i.e., hand and udder washing before and after milking) and proper cleaning of milk storage
equipment.
Accordingly, from her study area she selected 10 volunteer farmers and from each farmer she
collected milk samples for quality assessment before they were given the training. She judged
the quality of the milk samples based on taste, aroma and flavor of the milk. She ranked quality
using a rank index ranging from 1 to 5 in which a value of 5 was given to best quality milk
(i.e., milk with excellent taste, aroma and flavor), a value of 4 given to very good quality, 3
given to good quality, 2 given for fair quality and a rank of 1 given to poor quality milk,
which is unacceptable for consumption.

She then gave training to the ten farmers about hygienic milking and storage practices and after
the farmers were very well acquainted with the practices, she visited the houses of each farmer
and collected milk samples for quality assessment after the training. In a similar manner she
judged the quality of each milk sample following the same ranking procedure described above.
In due regard, milk samples from each house were ranked twice, i.e., before the farmers were

given training (while they were using the traditional way of milking and storage practices)
as well as after they were trained in hygienic practices and were applying hygienic milking
and storage techniques. The data shown in the table below are the quality rank scores given
to each milk sample before the training and after provision of training.

ID No of the sampled farmer   Rank before training   Rank after training   Sum of ranks by farmer
Farmer 1                      4                      5                     9
Farmer 2                      4.5                    4.5                   9
Farmer 3                      2                      4                     6
Farmer 4                      5                      5                     10
Farmer 5                      2                      3                     5
Farmer 6                      3                      5                     8
Farmer 7                      2.5                    4                     6.5
Farmer 8                      3.5                    4                     7.5
Farmer 9                      3                      3                     6
Farmer 10                     2.5                    3                     5.5
Sum of ranks by training      32                     40.5                  Grand total = 72.5
Did the quality of the milk improve after provision of the training?

Chapter 13 Latin Square ANOVA design

The Latin Square design is employed to keep two unwanted sources of variation from inflating
the basic error variance. Unlike the RCBD design, where only one source of variation is
accounted for, the Latin Square design is efficient in controlling two sources of variation
simultaneously. One of the unwanted sources of variation is referred to as row blocking and
the other as column blocking. The two unwanted sources of variation can be controlled
simultaneously if each treatment category occurs once and only once in each row and column
block.

Some examples in which Latin Square design finds its use are the following:

 Field trials in which the experimental area has two fertility gradients running
perpendicular to each other.

 Greenhouse trials in which the experimental pots are arranged in straight line
perpendicular to the glass or screen walls such that the difference among the row pots and
the distance from the glass wall are expected to be the two major sources of variabilities
among the experimental pots.

 Laboratory trials with replication over time such that the difference among experimental
units conducted at the same time and those conducted over time constitute the two known
sources of variability.

 Feeding trials concerning animals like steers or small ruminants that differ in age and
initial body weights so that age and initial weight of the animals are considered as the two
unwanted sources of variations.
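The defining constraint, each treatment occurring once and only once in every row and column block, can be sketched as a small layout generator (an illustrative helper, not part of any statistical package; it builds a cyclic square and then shuffles rows and columns):

```python
import random

def latin_square(treatments, seed=None):
    """Sketch: a random t x t Latin Square layout for the given treatments."""
    rng = random.Random(seed)
    t = len(treatments)
    # cyclic base square: row i is the treatment list rotated by i positions,
    # which already satisfies the once-per-row, once-per-column constraint
    square = [[treatments[(i + j) % t] for j in range(t)] for i in range(t)]
    rng.shuffle(square)                      # permute the row blocks
    cols = list(range(t))
    rng.shuffle(cols)                        # permute the column blocks
    return [[row[c] for c in cols] for row in square]

for row in latin_square(["A", "B", "C", "D", "E"], seed=1):
    print(" ".join(row))
```

Shuffling rows and columns of a valid square preserves the constraint, so every generated layout remains a Latin Square.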

Although the Latin Square design takes care of two unwanted sources of variation, it has a
major restriction on its applicability. Since the treatments appear once and only once in
each row and column block, the number of treatments must equal the number of row and column
block categories. This in turn restricts the number of treatments that can be considered,
i.e., the design is not practical for experiments with a large number of treatments, such as
variety screening trials consisting of more than 10 varieties. Thus in practice the Latin
Square design is applicable to experiments consisting of four to eight treatments. The
analysis of a Latin Square design is illustrated with the following example.

Consider an experiment designed to evaluate the effects of five different sowing densities on the
yield of Teff. Since the experimenter suspected soil heterogeneity in two directions, he divided

the total experimental field into two series of 5 blocks running perpendicular to each other. i.e.,
column blocking consisting of 5 plots and row blocking again consisting of 5 plots of land,
respectively. Thus the experiment was conducted as a 5X5 Latin square design by applying each
of the five sowing densities once and only once to each column versus row combination plots
randomly. The data below are yield of Teff (quintals/hectare) obtained under the five sowing
densities in each row and column combinations. As indicated in the body of the table, the five
sowing densities are labeled with letters A to E and the corresponding numbers refer to the yield
of Teff recorded under the respective treatments.

Table 13.1 Yield of Teff (quintals/ha) under five different sowing densities

           Column 1   Column 2   Column 3   Column 4   Column 5   Row totals
Row 1      E  13.2    B  20.2    C  23.0    D  15.6    A  20.0    92
Row 2      C  18.1    D  18.1    B  21.6    A  28.4    E  13.8    100
Row 3      B  20.7    E  14.0    A  22.9    C  23.7    D  15.2    96.5
Row 4      A  22.3    C  15.4    D  13.7    E  10.3    B  20.9    82.6
Row 5      D  14.5    A  23.3    E  13.4    B  23.0    C  21.0    95.2
Column
totals     88.8       91.0       94.6       101.0      90.9       466.3

Total yield under each sowing density (treatment totals)

Sowing density (treatment)   A       B       C       D      E
Total yield (Q/hectare)      116.9   106.4   101.2   77.1   64.7

Grand total = 466.3

Uncorrected sum of squares of all observations ∑∑X² = 9182.79

Test the appropriate hypothesis at 5 % level of significance and interpret the results

1) Hypothesis
Ho: The yield of Teff is not different under the five levels of sowing density

HA: The yield of Teff is different under the five levels of sowing density

Note that the column and row effects are neither hypothesized nor tested.

2) Computational procedure of Latin Square ANOVA


Total SS = ∑∑Xi² – (∑∑Xi)²/N = (13.2² + 18.1² + 20.7² + ….. + 21²) – 466.3²/25
= 485.3624
Total DF = N – 1 = 25 – 1 = 24

i) Column SS = ∑[(Column totals)²/a] – (∑∑Xi)²/N

= [(88.8²/5) + (91²/5) + (94.6²/5) + (101²/5) + (90.9²/5)] – 466.3²/25 = 18.4544
Column DF = a – 1 = 5 – 1 = 4

ii) Row SS = ∑[(Row totals)²/b] – (∑∑Xi)²/N

= [(92²/5) + (100²/5) + (96.5²/5) + (82.6²/5) + (95.2²/5)] – 466.3²/25 = 34.9824
Row DF = b – 1 = 5 – 1 = 4

iii) Treatment SS = ∑[(Treatment totals)²/c] – (∑∑Xi)²/N

= [(116.9²/5) + (106.4²/5) + (101.2²/5) + (77.1²/5) + (64.7²/5)] – 466.3²/25 = 374.2744
Treatment DF = c – 1 = 5 – 1 = 4

iv) Remainder SS = Total SS – Treatment SS – Row SS – Column SS

= 485.3624 – 374.2744 – 34.9824 – 18.4544 = 57.6512

Remainder DF = Total DF – Treatment DF – Row DF – Column DF

= 24 – 4 – 4 – 4 = 12
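The whole computation can be verified with a short script (a sketch; each cell of Table 13.1 is assumed entered as a (treatment, yield) pair):

```python
# Sketch: Latin Square sums of squares for the Teff sowing-density data.
plots = [
    [("E", 13.2), ("B", 20.2), ("C", 23.0), ("D", 15.6), ("A", 20.0)],
    [("C", 18.1), ("D", 18.1), ("B", 21.6), ("A", 28.4), ("E", 13.8)],
    [("B", 20.7), ("E", 14.0), ("A", 22.9), ("C", 23.7), ("D", 15.2)],
    [("A", 22.3), ("C", 15.4), ("D", 13.7), ("E", 10.3), ("B", 20.9)],
    [("D", 14.5), ("A", 23.3), ("E", 13.4), ("B", 23.0), ("C", 21.0)],
]
t = len(plots)
N = t * t
values = [y for row in plots for _, y in row]
ct = sum(values) ** 2 / N                       # correction term

total_ss = sum(y * y for y in values) - ct
row_ss = sum(sum(y for _, y in row) ** 2 for row in plots) / t - ct
col_ss = sum(sum(plots[i][j][1] for i in range(t)) ** 2 for j in range(t)) / t - ct
trt_totals = {}                                  # accumulate totals per treatment letter
for row in plots:
    for label, y in row:
        trt_totals[label] = trt_totals.get(label, 0.0) + y
trt_ss = sum(v * v for v in trt_totals.values()) / t - ct
remainder_ss = total_ss - trt_ss - row_ss - col_ss

print(round(total_ss, 4), round(trt_ss, 4), round(row_ss, 4),
      round(col_ss, 4), round(remainder_ss, 4))
```

The printed values reproduce the hand computation: 485.3624, 374.2744, 34.9824, 18.4544 and 57.6512.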

Latin Square ANOVA summary table

Source           SS         DF   MS        Fcal      Fcritical           Decision
Sowing density   374.2744   4    93.5686   19.4761   F0.05(4,12) = 3.26  Reject Ho
Column           18.4544    4
Row              34.9824    4
Remainder        57.6512    12   4.8043

Conclusion: The yield of Teff differed significantly under the 5 sowing densities.

Table 13.2 Mean yield of Teff under the 5 sowing densities

                               95% Confidence Interval
Density   n   Mean      SE     Lower Bound   Upper Bound
A         5   23.38b    0.98   21.244        25.516
B         5   21.28b    0.98   19.144        23.416
C         5   20.24b    0.98   18.104        22.376
D         5   15.42a    0.98   13.284        17.556
E         5   12.94a    0.98   10.804        15.076

Interpretation: The yield of Teff was significantly lower under sowing densities D and E
than under the others, but no differences were observed among the remaining levels.
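One line of Table 13.2 can be reproduced from the ANOVA output (a sketch; 2.179 is the tabled two-sided 5 % t value for 12 remainder df, taken here as a table lookup rather than computed):

```python
import math

# Sketch: mean, SE and 95% CI for sowing density A from the ANOVA results.
remainder_ms = 57.6512 / 12        # error variance from the ANOVA table
n = 5                              # plots per sowing density
mean_a = 116.9 / n                 # treatment A total from Table 13.1
se = math.sqrt(remainder_ms / n)   # standard error of a treatment mean
t_crit = 2.179                     # t(0.05, 12 df), two sided, from the t table
lower = mean_a - t_crit * se
upper = mean_a + t_crit * se
print(round(mean_a, 2), round(se, 2), round(lower, 3), round(upper, 3))
```

The result matches the first row of the table: mean 23.38, SE 0.98, interval (21.244, 25.516).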

Chapter 14 Lattice Design

14.1 Incomplete block designs

Theoretically, the complete block designs, such as the Randomized Complete Block and the
Latin Square designs discussed earlier are applicable to experiments with any number of
treatments. However, complete block designs become less efficient as the number of

treatments increase, primarily because block size increases proportionally with the number of
treatments, and the homogeneity of experimental plots within a large block is difficult to
maintain. That is, the experimental error of a complete block design is generally expected to
increase with the number of treatments.

For example consider a trial involving 16 varieties of sorghum with 5 replications. Overall a
total of 80 plots are required for the experiment. If the experiment is to be run as a Randomized
Complete Block design, 5 blocks are required in which case each block containing 16 plots of
land, i.e., one plot to try the respective variety. The problem with this design is that the area of
land required per block (i.e., 16 plots) becomes too large to maintain homogeneous plots under
each block.

An alternative set of designs for single-factor experiments having a large number of treatments is
the incomplete block designs, one of which is the lattice design. As the name implies, each
block in an incomplete block design does not contain all treatments and a reasonably small block
size can be maintained even if the number of treatments is large. With smaller blocks, the
homogeneity of experimental units in the same block is easier to maintain and a higher degree of
precision can generally be expected.

The lattice design is the incomplete block design most commonly used in agricultural research.
There is sufficient flexibility in the design to make its application simpler than most other
incomplete block designs. This section is devoted primarily to the most commonly used lattice
design, the balanced lattice design.

14.2 Balanced Lattice

The balanced lattice design is characterized by the following basic features:

1. The number of treatments (t) must be a perfect square (i.e., t = b², such as 16, 25, 36,
49, 64, 81, 100, etc.). Although this requirement may seem stringent at first, it is usually
easy to satisfy in practice. As the number of treatments becomes large, adding a few more or
eliminating some less important treatments is usually easy to accomplish. For example, if a
plant breeder wishes to test the performance of 80 varieties in a balanced lattice design,
all he needs to do is add one more variety to reach a perfect square; or, if he has 82 or 83
varieties to start with, he can easily eliminate one or two.

2. The block size (b) is equal to the square root of the number of treatments (i.e., b = √t).

3. The number of replications (r) is one more than the block size [i.e., r = b + 1]. That is,
the number of replications required is 5 for 16 treatments, 6 for 25 treatments, 7 for 36
treatments, 8 for 49 treatments, and so on.
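These three bookkeeping rules are easy to encode (a sketch; `lattice_params` is a hypothetical helper name):

```python
import math

def lattice_params(t):
    """Return (block size b, replications r) for t treatments, or raise
    if t is not a perfect square and so cannot form a balanced lattice."""
    b = math.isqrt(t)
    if b * b != t:
        raise ValueError(f"{t} treatments is not a perfect square")
    return b, b + 1

print(lattice_params(16))   # 16 treatments: blocks of 4, 5 replications
print(lattice_params(25))   # 25 treatments: blocks of 5, 6 replications
```

Asking for 80 treatments raises an error, matching the plant-breeder example: 80 must be adjusted to 81 (or trimmed) before a balanced lattice is possible.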

14.2.1 Illustration of a balanced lattice design

• Consider a trial involving 16 rice varieties. This requires a 4X4 balanced lattice design.

Assume that planting 4 varieties per block of land ensures reasonably homogeneous plots
within each block. With this consideration, the block size is four plots, and hence the
number of replications (r) will be b + 1 = 5.

Accordingly the layout of the field would be as follows

1) Divide the total land into 5 equal-size main blocks, i.e., into r = 5 replications, as
shown below. Note that the main blocks need not be homogeneous in themselves.


2) Divide each main block (replication) into four equal sub blocks. Note that each sub block
of land is considered homogeneous.

Rep I Rep II Rep III Rep IV Rep V

B1 B5 B9 B13 B17

B2 B6 B10 B14 B18

B3 B7 B11 B15 B19

B4 B8 B12 B16 B20

3) Divide each sub block into 4 plots of land. Accordingly, 4 of the 16 varieties are to be
tried in each sub block. Note that under each replication (main block), the 4 plots that bear
the same color are within the same sub block and are homogeneous.

B1

B2

B3

B4

4) Randomly assign 4 varieties to the four plots within each sub block, as shown below.

B1

B2

B3

B4

In the above design note the following:

• The four varieties tried in each sub block are planted on homogeneous plots.

• Each sub block contains only part of the varieties, not the whole 16. Accordingly, the
sub blocks are called incomplete blocks, to distinguish them from the blocks in RCBD, which
are complete because they contain all the treatments.

The analysis for the 4 X 4 Lattice experiment is illustrated using data on tiller counts of 16
varieties of rice shown in the following table.

Table 14.1 Tiller counts of 16 varieties of rice resulting from the lattice design experiment


Data from K. A Gomez and A.A Gomez (1984)

14.2.2 Analysis of Balanced Lattice design

In a lattice design the sources of variations are

1) Treatment effects (Variety)

2) Replication (Main blocks)

3) Sub Blocks. Note that the sub blocks are nested within the main blocks (or replications)

4) Intra-block or basic error term used as a denominator for the F test

A significance test is done only for the treatment factor (varietal differences). The
replication (main block) and sub block effects are not tested, as they do not have any
practical significance. They are considered here only to account for extraneous sources of
variability that may otherwise inflate the basic error term.

Hypothesis to be tested

HO: There is no difference in tiller count among the tested rice varieties

HA: There is a difference in tiller count among the tested rice varieties

Table 14.2 Layout of summary ANOVA table

Source                   SS         DF                             MS                  Fcal
Factor A (Variety)       SS A       a – 1 = 16 – 1 = 15            SS A/DF A           MS A/MS error
Factor B (Replication)   SS B       r – 1 = 4                                          Not tested
Factor C (Blocks         SS C       r(c – 1) = 5(4 – 1) = 15                           Not tested
within reps)
Intra-block error        SS error   r(c – 1)(c – 1) = 5X3X3 = 45   SS error/DF error
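The degree-of-freedom column can be checked in a few lines (a sketch for this 4 × 4 lattice: t = 16 varieties, block size 4, r = 5 replications):

```python
# Sketch: degree-of-freedom bookkeeping for the 4x4 balanced lattice.
t, k = 16, 4                 # treatments and block size (k = sqrt(t))
r = k + 1                    # 5 replications
N = t * r                    # 80 plots in all

variety_df = t - 1                        # 15
replication_df = r - 1                    # 4
block_within_rep_df = r * (k - 1)         # 15
intrablock_error_df = r * (k - 1) ** 2    # 5 x 3 x 3 = 45

# the four components must exhaust the total df
assert variety_df + replication_df + block_within_rep_df + intrablock_error_df == N - 1
print(variety_df, replication_df, block_within_rep_df, intrablock_error_df)
```

The four components (15, 4, 15 and 45) sum to N – 1 = 79, confirming the partition in the table.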

The computation of Lattice ANOVA is tedious and only the summary ANOVA table
(output of SPSS) is displayed below.

Lattice design summary ANOVA table

Source               SS        df   MS       F            Fcritical
Treatment            18396.2   15   1226.4   3.8          1.89
Replication          5946.1    4    1486.5   Not tested
Block(replication)   11381.8   15   758.8    Not tested
Intrablock (error)   14533.3   45   323

Table 14.3 Mean tiller count of the 16 rice varieties.

95% Confidence Interval


Variety n Mean SE Lower Bound Upper Bound
1 5 168.7b 8.929 150.716 186.684
2 5 162.7b 8.929 144.716 180.684
3 5 185.6b 8.929 167.653 203.622
4 5 172.3b 8.929 154.341 190.309
5 5 162.6b 8.929 144.653 180.622
6 5 176.9b 8.929 158.966 194.934
7 5 165.2b 8.929 147.216 183.184
8 5 179.8b 8.929 161.841 197.809
9 5 165.08b 8.929 147.091 183.059
10 5 120.8a 8.929 102.778 138.747
11 5 187.5b 8.929 169.466 205.434
12 5 187.8b 8.929 169.841 205.809
13 5 166.3b 8.929 148.341 184.309
14 5 196.1b 8.929 178.091 214.059
15 5 187.9b 8.929 169.966 205.934
16 5 163.8b 8.929 145.778 181.747

Interpretation: Variety 10 had a significantly lower tiller count than the others, but the
rest did not differ from each other.


Chapter 15 Split plot design

15.1 Blocking in a factorial experiment

Blocking is also employed in factorial experiments, just as in single factor experiments,
when the experimental subjects are suspected to be heterogeneous. For example, consider
blocking in a two factor experiment. Suppose an experiment is to be conducted to evaluate
the yield of 4 rice varieties under six levels of nitrogen fertilization, and each treatment
combination is to be tried in three replications. The treatment combinations with the
replications are as follows.

Nitrogen level (Kg/ha)


Variety

0 60 90 120 150 180


(N1) (N2) (N3) (N4) (N5) (N6)
V1 N1V1 N2V1 N3V1 N4V1 N5V1 N6V1
N1V1 N2V1 N3V1 N4V1 N5V1 N6V1
N1V1 N2V1 N3V1 N4V1 N5V1 N6V1
V2 N1V2 N2V2 N3V2 N4V2 N5V2 N6V2
N1V2 N2V2 N3V2 N4V2 N5V2 N6V2
N1V2 N2V2 N3V2 N4V2 N5V2 N6V2
V3 N1V3 N2V3 N3V3 N4V3 N5V3 N6V3
N1V3 N2V3 N3V3 N4V3 N5V3 N6V3
N1V3 N2V3 N3V3 N4V3 N5V3 N6V3
V4 N1V4 N2V4 N3V4 N4V4 N5V4 N6V4
N1V4 N2V4 N3V4 N4V4 N5V4 N6V4
N1V4 N2V4 N3V4 N4V4 N5V4 N6V4

• There are 24 treatment combinations of the two factors

a X b = 6X4= 24

• Each treatment combination is to be tried in 3 replications, and thus a total of 72 plots
of land are required

N = a X b X r = 6X4X3 = 72

To run this factorial experiment as a block design, the steps are the following:

1) Divide the total experimental land into three blocks

2) Divide each block into 24 plots so that each of the 24 treatment combinations are to be
assigned to the respective plot in each block

3) Randomly assign the 24 treatment combinations to each plot within each block.

The results of this random assignment are shown as follows.

N1V2 N6V4 N1V4 N3V1 N5V4 N1V1


Block 1

N2V4 N5V1 N2V3 N1V3 N4V4 N2V1


N3V4 N4V1 N4V2 N5V3 N6V1 N6V3
N4V3 N3V2 N5V2 N6V2 N3V3 N2V2
N2V2 N6V3 N2V1 N1V1 N4V3 N3V4
Block 2

N3V3 N2V4 N6V4 N5V1 N3V2 N4V1


N6V1 N1V2 N1V4 N2V3 N6V2 N4V2
N4V4 N5V4 N5V2 N3V1 N1V3 N5V3
N1V1 N2V1 N6V3 N2V2 N3V4 N4V3
Block 3

N5V3 N1V3 N5V1 N6V4 N2V4 N3V3


N3V1 N5V4 N6V1 N1V2 N1V4 N2V3
N5V2 N4V4 N4V2 N6V2 N3V2 N4V1

After conducting the trial, the yield data are arranged in the form of a treatment vs block
three-way table for analysis as follows.

Nitrogen level (kg/ha)


Block Variety 0 60 90 120 150 180
V1
Block 1

V2          
V3            
V4            
V1            
Block 2

V2            
V3            
V4            

V1            
Block 3

V2            
V3            
V4            

Data analysis

The data are then analyzed as a tri-factorial ANOVA and the following hypotheses are tested:

1) Hypothesis about the fertilizer effect

Ho: The yield of rice is the same under the six levels of Nitrogen treatment

HA: The yield of rice is different under the six levels of Nitrogen treatment

2) Hypothesis about varietal difference

Ho: The yield of the four varieties of rice is the same.

HA: The yield of the four varieties of rice is different.

3) Hypothesis about the interaction effect

Ho: The difference in yield among the four varieties of rice does not depend on the level of
nitrogen application.

HA: The difference in yield among the four varieties of rice depends on the level of
nitrogen application.

NB: The block effect as well as its interactions with the two main effects are not tested.

15.2 Split plot design

In the above experiment, the factors nitrogen level and variety are given equal importance. Thus

the two factors are equally replicated. Suppose the experimenter wants to give more importance

to testing the effects of one factor with more precision than the other. For instance,
consider a plant breeder who wishes to evaluate the yield of the four varieties of rice
under the six levels of nitrogen application. In this case the main goal of the breeder is
to evaluate the yield differences among the varieties, and he would do the varietal
comparison with more precision than the comparison of the six fertilizer levels. That is, he
would give more importance to the varietal difference and hence would replicate the four
varieties more than the fertilizer levels, i.e., each of the four varieties would be tested
on a larger number of plots than the number of plots assigned to the nitrogen fertilizer
levels. The appropriate design to accomplish this is the split plot design. It is similar to
a two factor ANOVA designed in a block fashion, but the difference is that one of the
treatment factors is more important to the researcher, and its effects are measured with
greater precision by replicating it on a larger number of plots than the factor considered
relatively less important.

In the split plot design, the treatment factor considered less important (in this case
nitrogen level) is assigned as the main plot factor. The factor considered the prime
interest of the research is assigned as the sub plot factor. Assigning a factor as main plot
or sub plot depends on the objectives of the research. For instance, for an agronomist who
wants to evaluate the effects of the six levels of nitrogen by trying them on four improved
varieties of rice, the primary interest would be to evaluate the effects of the fertilizer
levels rather than the varietal comparison.

In the split plot design the main plot factor is replicated fewer times than the sub plot
factor. Here the main plot factor, considered as factor A (nitrogen level), has 6 levels.

The sub plot factor, variety (factor B), has 4 levels.

The number of replications (r) = 3.

Hence a total of 72 plots are required to run the trial.

15.2.1 Steps in designing the layout of split plot design

1) Divide the total area of the experimental land into 3 equal size blocks

 
Block 1

 
Block 2

 
Block 3

2) Divide each of the three blocks into a number of plots equal to the levels of the main
plot factor; in this case each block is divided into 6 plots.

3) Randomly assign the levels of the main plot factor (fertilizer) to the plots of each block.

N5 N4 N2 N1 N6 N3
Block 1

N2 N1 N6 N3 N5 N4
Block 2

N1 N2 N5 N6 N4 N3
Block 3

4) Subdivide each main plot into plots equal to the number of sub plot factor levels, i.e.,
each main plot serves as a block for the sub plot factor. For example, in block 1 the first
plot of land, assigned to N5, is divided into 4 parts, and this is done for all six plots
under block 1, block 2 and block 3.

5) Randomly assign the four varieties to the four plots within each main plot. The procedure
is thus repeated 18 times, as shown in the following sketch.

As seen in the layout below, the 4 varieties are replicated 3X6 = 18 times, whereas the less
important factor, nitrogen level, is replicated only 3 times in the three blocks. Thus the
varietal difference in yield is measured more precisely than the effects of the main plot
factor, nitrogen level.
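The randomization steps above can be sketched as a small generator (an illustrative helper; `split_plot_layout` is a hypothetical name, with the nitrogen levels as main plots and the varieties as sub plots):

```python
import random

def split_plot_layout(main_levels, sub_levels, blocks, seed=None):
    """Sketch: randomize main-plot levels within each block, then
    sub-plot levels within each main plot."""
    rng = random.Random(seed)
    layout = []
    for _ in range(blocks):
        mains = main_levels[:]
        rng.shuffle(mains)               # step 3: randomize main plots in the block
        block = []
        for m in mains:
            subs = sub_levels[:]
            rng.shuffle(subs)            # step 5: randomize sub plots in the main plot
            block.append((m, subs))
        layout.append(block)
    return layout

nitrogen = [f"N{i}" for i in range(1, 7)]   # 6 main-plot levels
variety = [f"V{i}" for i in range(1, 5)]    # 4 sub-plot levels
for block in split_plot_layout(nitrogen, variety, blocks=3, seed=7):
    print(block)
```

Each variety ends up on 3 × 6 = 18 plots while each nitrogen level appears on only 3 main plots, which is exactly the unequal precision the design intends.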


N5 N4 N2 N1 N6 N3
BLOCK 1 V2 V1 V2 V1 V4 V3
V1 V4 V3 V2 V3 V2
V3 V2 V1 V4 V2 V1
V4 V3 V4 V3 V1 V4

 
N2 N1 N6 N3 N5 N4
BLOCK 2

V1 V4 V3 V1 V3 V1
V3 V1 V4 V2 V4 V2
V2 V3 V1 V4 V2 V4
V4 V2 V2 V3 V1 V3
           
 

N1 N2 N5 N6 N4 N3
BLOCK 3

V4 V3 V2 V1 V2 V1
V3 V4 V3 V4 V3 V4
V2 V1 V1 V3 V4 V2
V1 V2 V4 V2 V1 V3

After conducting the trial, the yield data are arranged in the form of a treatment vs block
three-way table for analysis, as shown below.

Table 15.1 Yield of the four varieties of rice under the six levels of fertilizer treatment

Main plot         Variety      Block 1   Block 2   Block 3   Main plot total
(Fertilizer)      (Sub plot)
N1 (0 kg/ha)      V1           4430      4478      3850      48670
                  V2           3944      5314      3660
                  V3           3464      2944      3142
                  V4           4126      4482      4836
N2 (60 kg/ha)     V1           5418      5166      6432      65738
                  V2           6502      5858      5586
                  V3           4768      6004      5556
                  V4           5192      4604      4652
N3 (90 kg/ha)     V1           6076      6420      6704      70395
                  V2           6008      6127      6642
                  V3           6244      5724      6014
                  V4           4546      5744      4146
N4 (120 kg/ha)    V1           6462      7056      6680      70373
                  V2           7139      6982      6564
                  V3           5792      5880      6370
                  V4           2774      5036      3638
N5 (150 kg/ha)    V1           7290      7848      7552      69744
                  V2           7682      6594      6576
                  V3           7080      6662      6320
                  V4           1414      1960      2766
N6 (180 kg/ha)    V1           8452      8832      8818      69561
                  V2           6228      7387      6006
                  V3           5594      7122      5480
                  V4           2248      1380      2014
Block totals                   128873    135604    130004    394481 (Grand total)

15.2.2 Data summarization

Make summary tables of two way totals

Table 15.2 Nitrogen vs block totals

              Block 1   Block 2   Block 3   Nitrogen total
N1            15964     17218     15488     48670
N2            21880     21632     22226     65738
N3            22874     24015     23506     70395
N4            22167     24954     23252     70373
N5            23466     23064     23214     69744
N6            22522     24721     22318     69561
Block total   128873    135604    130004    394481

Table 15.3 Nitrogen vs variety totals

                V1       V2       V3       V4      Nitrogen total
N1              12758    12918    9550     13444   48670
N2              17016    17946    16328    14448   65738
N3              19200    18777    17982    14436   70395
N4              20198    20685    18042    11448   70373
N5              22690    20852    20062    6140    69744
N6              26102    19621    18196    5642    69561
Variety total   117964   110799   100160   65558   394481

Table 15.4 Variety vs block totals

              Block 1   Block 2   Block 3   Variety total
V1            38128     39800     40036     117964
V2            37503     38262     35034     110799
V3            32942     34336     32882     100160
V4            20300     23206     22052     65558
Block total   128873    135604    130004    394481

15.2.3 Data analysis


The data are analyzed as a tri-factorial ANOVA and the following hypotheses are tested:

1) Hypothesis about the fertilizer effect

Ho: The yield of rice is the same under the six levels of Nitrogen treatment

HA: The yield of rice is different under the six levels of Nitrogen treatment

2) Hypothesis about varietal difference

Ho: The yield of the four varieties of rice is the same.

HA: The yield of the four varieties of rice is different.

3) Hypothesis about the Interaction effect

Ho: The difference in yield among the four varieties of rice does not depend on the level of
nitrogen application.

HA: The difference in yield among the four verities of rice depends on the level of nitrogen
application.

NB: The block effect as well as its interactions with the two main effects are not tested,
because such tests do not have any practical significance. That is:

Factor C: yield difference among the three blocks – not tested

AC: fertilizer by block interaction effect – not tested

BC: variety by block interaction effect – not tested

ABC: fertilizer X variety X block interaction effect – not tested

Although not tested, their SS and MS values are computed, as these are variously used as
error terms in the main plot and sub plot factor tests.

Table 15. 5 Layout of summary ANOVA table

Main plot analysis        SS      DF             MS           Fcal

Factor A Fertilizer       SS A    a-1 = 5        SSA/(a-1)    MSA/MSAC
Factor C Block            SS C    c-1 = 2        SSC/(c-1)    Not tested
AXC Fertilizer X Block    SS AC   (a-1) X (c-1)  SSAC/df AC   Not tested
Sub plot analysis

Factor B Variety                  SS B     b-1 = 3                SSB/(b-1)     MSB/MSBC
AXB Fertilizer X Variety          SS AB    (a-1) X (b-1)          SSAB/df AB    MSAB/MSABC
BXC Variety X Block               SS BC    (b-1) X (c-1)          SSBC/df BC    Not tested
ABC Fertilizer X Variety X Block  SS ABC   (a-1) X (b-1) X (c-1)  SSABC/df ABC  Not tested

15.2.4 Split plot design ANOVA computation

1) Correction Term (CT)

CT = (Grand total)²/N = 394,481²/72 = 2161323047

2) TSS = ∑∑∑Xi² – CT

(4430² + 3394² + …. + 2014²) – CT = 204747916   Total DF = N-1 = 72 - 1 = 71

3) SS A Main plot (Nitrogen) SS = ∑[(Fertilizer totals)²/bc] – CT

(48670² + 65738² + …… + 69561²)/12 – CT = 30429200   DF A = a-1 = 6-1 = 5

4) SS B Sub plot (Variety) SS = ∑[(Variety totals)²/ac] – CT

(117964² + 110799² + … + 65558²)/18 – CT = 89888101.15   DF B = b-1 = 4-1 = 3

5) SS C Block SS = ∑[(Block totals)²/ab] – CT

(128873² + 135604² + 130004²)/24 – CT = 1082577   DF C = c-1 = 3-1 = 2

6) SS AB (Fertilizer X Variety SS) = SS Among all AB – SSA – SSB

SS Among all AB = ∑[(AB cell totals)²/c] – CT

(12758² + 17016² + ….. + 5642²)/3 – CT = 189660787.7

SS AB = 189660787.7 – 89888101 – 30429200 = 69343487   DF AB = df A X df B = 5X3 = 15

7) SS AC (Fertilizer X Block SS) = SS Among all AC – SSA – SSC

SS Among all AC = ∑[(AC cell totals)²/b] – CT

(15964² + 21880² + ….. + 22318²)/4 – CT = 32931455.07

SS AC = 32931455.07 – 30429200 – 1082577 = 1419678   DF AC = df A X df C = 5X2 = 10

8) SS BC (Variety X Block SS) = SS Among all BC – SSB – SSC

SS Among all BC = ∑[(BC cell totals)²/a] – CT

(38128² + 37503² + ….. + 22052²)/6 – CT = 92137588.82

SS BC = 92137588.82 – 89888101 – 1082577 = 1166911   DF BC = df B X df C = 3X2 = 6

9) SS ABC (Fertilizer X Variety X Block SS)

= TSS – SSA – SSB – SSC – SSAB – SSAC – SSBC = 11417962

DF ABC = df A X df B X df C = 5X3X2 = 30
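As a cross-check on steps 1), 3), 4) and 5), the correction term and the three main-effect SS can be recomputed from the treatment totals of Tables 15.2–15.4 alone. The following is a minimal Python sketch (the total SS is omitted because the full plot-level yield table is not reproduced here):

```python
# Cross-check of the split-plot correction term and main-effect SS,
# using only the totals from Tables 15.2-15.4.
# Design: a = 6 nitrogen levels, b = 4 varieties, c = 3 blocks, N = 72 plots.
grand_total = 394481
N = 72
CT = grand_total ** 2 / N  # correction term

nitrogen_totals = [48670, 65738, 70395, 70373, 69744, 69561]  # each over b*c = 12 plots
variety_totals = [117964, 110799, 100160, 65558]              # each over a*c = 18 plots
block_totals = [128873, 135604, 130004]                       # each over a*b = 24 plots

SS_A = sum(t ** 2 for t in nitrogen_totals) / 12 - CT  # fertilizer (main plot) SS
SS_B = sum(t ** 2 for t in variety_totals) / 18 - CT   # variety (sub plot) SS
SS_C = sum(t ** 2 for t in block_totals) / 24 - CT     # block SS

print(round(CT), round(SS_A), round(SS_B), round(SS_C))
```

Rounding the results reproduces the hand-computed values 2161323047, 30429200, 89888101 and 1082577.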

Split plot ANOVA Summary Table

Source                        SS         df   MS             F cal       F critical
Nitrogen                      30429200   5    6085840        42.868a     3.326
Variety                       89888101   3    29962700       154.062b    4.757
Block                         1082577    2    Not computed   Not tested
Nitrogen * Variety            69343487   15   4622899        12.146c     2.015
Nitrogen * Block              1419679    10   141968         Not tested
Variety * Block               1166911    6    194485         Not tested
Nitrogen * Variety * Block    11417962   30   380599         Not tested

a. Error term: MS(Nitrogen * Block) = 141968
b. Error term: MS(Variety * Block) = 194485
c. Error term: MS(Nitrogen * Variety * Block) = 380599
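Given the SS and df above, the mean squares and F ratios follow mechanically. The sketch below recomputes the three tested F values, dividing each effect by its interaction with block as the footnotes indicate:

```python
# Mean squares and F ratios for the split-plot summary table.
# Each tested effect is divided by its interaction with block (footnotes a-c).
SS = {"A": 30429200, "B": 89888101, "AB": 69343487,
      "AC": 1419679, "BC": 1166911, "ABC": 11417962}
df = {"A": 5, "B": 3, "AB": 15, "AC": 10, "BC": 6, "ABC": 30}
MS = {k: SS[k] / df[k] for k in SS}

F_nitrogen = MS["A"] / MS["AC"]       # MSA / MS(Nitrogen * Block)
F_variety = MS["B"] / MS["BC"]        # MSB / MS(Variety * Block)
F_interaction = MS["AB"] / MS["ABC"]  # MSAB / MS(Nitrogen * Variety * Block)

print(round(F_nitrogen, 3), round(F_variety, 3), round(F_interaction, 3))
```

This reproduces the tabled values 42.868, 154.062 and 12.146.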

Conclusion: Nitrogen level, variety, as well as their interaction have significant effects on yield.

Chapter 16 Nested Analysis of Variance Design

Employed when analyzing heterogeneous populations, i.e., when individuals in a population are
found to vary or become dissimilar. Most frequently used when sampling patchy distributions,
for example in sampling rangelands or forest land.

A nested ANOVA design is illustrated with the following example. A study was designed to
assess herbage biomass production in a communally grazed and a protected rangeland (park
reserve) in the Yavello area. The researcher noted variation in range condition among different
locations in each of the two types of rangelands, i.e., patchy herbaceous growth in different
locations. Accordingly, he conducted an initial survey of both types of rangelands and classified
each rangeland into three categories based on herbaceous growth condition as follows.

[Sketch map: the communally grazed rangeland and the protected rangeland (park reserve),
each classified into three locations. Key: 1 = low herbage growth, 2 = moderate herbage
growth, 3 = lush herbage growth]

16.1 Sampling procedure
In each type of rangeland, he randomly marked five sampling sites within each of the three
locations to collect herbage samples

 Low herbage growth area 5 sampling sites

 Moderate herbage growth area 5 sampling sites

 Lush herbage growth area 5 sampling sites

He then took herbage samples from a 1 m by 1 m quadrat following the standard herbage
sampling procedure. In doing so he estimated the biomass of herbage (ton of DM/ha) from
the 15 sampling locations for each type of rangeland, i.e., a total of 30 samplings. The data
are shown in the following table.

Table 16.1 Dm herbage (ton/ha) of samples collected from the 30 sampling locations

Range land type


Communally grazed Protected
Loc 1 Loc 2 Loc 3 Loc 1 Loc 2 Loc 3
1.1 1.2 1.2 1.3 1.4 1.4
1.2 1.0 1.1 1.2 1.5 1.6
0.9 1.1 1.3 1.3 1.6 1.7
1.0 1.2 1.2 1.4 1.5 1.6
1.0 1.1 1.3 1.1 1.3 1.5
Location totals 5.2 5.6 6.1 6.3 7.3 7.8
Rangeland totals 16.9 (communally grazed) 21.4 (protected)
Grand total 38.3

• Does herbage productivity of the two types of rangelands (communally grazed and
park reserve) differ?

• Does herbage productivity differ among the different locations within the same
rangeland? Test at the 5% level of significance.

16.2 Hypothesis

• Hypothesis about the effect of range land difference

HO: The two types of rangelands do not differ in range productivity

HA: The two types of rangelands differ in range productivity

• Hypothesis about the effect of location to location difference

HO: The range productivity among the different locations within the same rangeland type is
similar

HA: The range productivity among the different locations within the same rangeland type is
different

16.3 Sources of variation in a nested ANOVA design

As noted above, there are two sources of variations in range productivity.

i) Variations in range productivity between the two types of range lands. This variance is
expressed by MSRange type
ii) Variation in range productivity among locations within each range type. This variance is
expressed by MSLocation within range type

MSRange type comprises the following:

• σ²Range type: variation in range productivity due to differences in range type, i.e., because
one is communally grazed and the other is protected rangeland.

• σ²Location within range type: variation in range productivity because of location to location
difference within range type

• σ²error: variation in range productivity because of site to site difference, i.e., random error

MSRange type = σ²Range type + σ²Location within range type + σ²error

MSLocation within range type comprises the following:

• σ²Location within range type: variation in range productivity because of location to location
difference within range type

• σ²error: variation in range productivity because of site to site difference, i.e., random error

MSLocation within range type = σ²Location within range type + σ²error

Accordingly, during the F test, F calculated for the respective source of variation is computed as
follows:

Fcal (range type) = MSRange type / MSLocation within range type

= [σ²Range type + σ²Location within range type + σ²error] / [σ²Location within range type + σ²error]

Thus Fcal (range type) isolates σ²Range type

Fcal (location within range type) = MSLocation within range type / MSerror

= [σ²Location within range type + σ²error] / [σ²error]

Thus Fcal (location within range type) isolates σ²Location within range type

A nested ANOVA design resembles the two factor ANOVA discussed earlier, BUT there are
notable differences between the two:

• In a nested ANOVA design, only the main group factor is the original interest of the
research. In this case the research was originally designed to test the effect of range type
difference on productivity, whereas the test on the subordinate factor (effect of location
to location difference in range productivity within the rangeland) was included in order to
avoid interference of location to location difference in productivity. Also, the subordinate
factor is nested under the main factor, and there is no interaction between the two.

• In a two factor ANOVA design, the tests on both factors are main interests of the research.
Also, both factors are main factors, and there is interaction between the two.

16.4 Nested ANOVA design computational procedure

i) Total SS = ∑∑Xi² – (∑∑Xi)²/N

= (1.1² + 1.2² + ……. + 1.5²) – 38.3²/30 = 1.253667

Total df = N – 1 = 30 – 1 = 29

ii) Range type SS = ∑[(Range type total)²/bn] – (∑∑Xi)²/N

= (16.9² + 21.4²)/15 – 38.3²/30 = 0.675

Range type df = a – 1 = 2 – 1 = 1

Consider the following relationship

Total SS = Range type SS + Location(range) SS + error SS

Range type SS and Location(range) SS add up to a SS which is called SS among all locations,
and this SS is simpler to compute. i.e.

Range type SS + Location(range) SS = SS among all locations

Hence by computing this first, Location(range) SS and error SS can be easily computed as
follows:

Location(Range) SS = SS among all locations – Range type SS

Error SS = Total SS – SS among all locations

Accordingly, the next SS to compute is SS among all locations

iii) SS Among all locations = ∑[(Location total)²/n] – (∑∑Xi)²/N

= (5.2² + 5.6² + 6.1² + 6.3² + 7.3² + 7.8²)/5 – 38.3²/30 = 0.989667

Df among all locations = ab – 1 = (2X3) – 1 = 5

iv) SS Location(range) = SS Among all locations – SS range type

= 0.989667 – 0.675 = 0.314667

Df Location(Range) = Df among all locations – Range type df = 5 – 1 = 4

= a(b – 1) = 2(3 – 1) = 4

v) Error SS = Total SS – SS Among all locations

= 1.253667 – 0.989667 = 0.264

Error df = Total df – df Among all locations = 29 – 5 = 24

= ab(n – 1) = 2X3(5 – 1) = 24
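The whole computation of section 16.4 can be reproduced from the raw data of Table 16.1. A minimal Python sketch (a = 2 range types, b = 3 locations per type, n = 5 sites per location):

```python
# Nested ANOVA sums of squares and F ratios for the Table 16.1 herbage data.
communal = [[1.1, 1.2, 0.9, 1.0, 1.0],   # location 1
            [1.2, 1.0, 1.1, 1.2, 1.1],   # location 2
            [1.2, 1.1, 1.3, 1.2, 1.3]]   # location 3
protected = [[1.3, 1.2, 1.3, 1.4, 1.1],
             [1.4, 1.5, 1.6, 1.5, 1.3],
             [1.4, 1.6, 1.7, 1.6, 1.5]]
a, b, n = 2, 3, 5
N = a * b * n
all_obs = [x for rng in (communal, protected) for loc in rng for x in loc]
CT = sum(all_obs) ** 2 / N  # correction term, 38.3**2 / 30

total_SS = sum(x ** 2 for x in all_obs) - CT
range_SS = sum(sum(map(sum, rng)) ** 2 for rng in (communal, protected)) / (b * n) - CT
among_loc_SS = sum(sum(loc) ** 2 for rng in (communal, protected) for loc in rng) / n - CT
loc_in_range_SS = among_loc_SS - range_SS
error_SS = total_SS - among_loc_SS

F_range = (range_SS / (a - 1)) / (loc_in_range_SS / (a * (b - 1)))
F_loc = (loc_in_range_SS / (a * (b - 1))) / (error_SS / (a * b * (n - 1)))
print(round(total_SS, 6), round(range_SS, 3), round(F_range, 3), round(F_loc, 3))
```

The sketch reproduces the hand-computed SS values (1.253667, 0.675, 0.314667, 0.264) and the two F ratios of about 8.58 and 7.15.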


Nested ANOVA summary table

Source               SS         Df   MS         Fcal    Fcritical           Decision

Total                1.253667   29
Among all locations  0.989667   5
Range type           0.675      1    0.675      8.581   F0.05(1,4) = 7.71   Reject HO
Location(range)      0.314667   4    0.078667   7.151   F0.05(4,24) = 2.78  Reject HO
Error                0.264      24   0.011

Conclusion

 The two range lands are different in productivity


 Also locations within the respective range types are different in productivity


Chapter 17 Repeated measure ANOVA design

Pen 1 Pen 2 Pen 3 Pen 4 Pen 5


Group 1 Group 1 Group 1 Group 1 Group 1
Feed 1 Feed 6 Feed 3 Feed 5 Feed 4
Group 2 Group 2 Group 2 Group 2 Group 2
Feed 4 Feed 3 Feed 5 Feed 1 Feed 2
Group 3 Group 3 Group 3 Group 3 Group 3
Feed 5 Feed 2 Feed 4 Feed 4 Feed 3
Group 4 Group 4 Group 4 Group 4 Group 4
Feed 2 Feed 5 Feed 1 Feed 6 Feed 5
Group 5 Group 5 Group 5 Group 5 Group 5
Feed 3 Feed 4 Feed 6 Feed 2 Feed 1
Group 6 Group 6 Group 6 Group 6 Group 6
Feed 6 Feed 1 Feed 2 Feed 3 Feed 6
A repeated measures experimental design, also called a within-subjects or treatment by subject
design, is one in which multiple measurements on the same experimental subjects comprise the
replicate data. Here, since the same subjects are measured several times, the repeated
measurements should not be considered as true replications. The design is illustrated with the
following example. Consider an experiment designed to evaluate the performances of RIR
chicken under six different types of feeds. The feeds varied in the amount of fish meal they
contained. That is, Feed 1 contained no fish meal (control), whereas Feed 2, Feed 3, Feed 4,
Feed 5 and Feed 6 contained 10%, 20%, 30%, 40% and 50% by weight of fish meal. The
performances evaluated were body weight gain, feed intake, feed conversion efficiency (feed
consumption per gram body weight gain) as well as various other traits of economic importance
of the chicken under the different feeds. However, for the present illustration, only body weight
gain data are presented for analysis, and the discussion below refers to the manner of data
collection with respect to this parameter. For the experiment, 300 day-old RIR chicks were
randomly divided into 30 groups of 10 chicks each, and the 30 groups were again randomly
divided among 5 pens. Therefore each pen housed 6 groups of 10 chicks each. Accordingly, the
six feeds were randomly assigned to

one of the six groups of chicken in each pen. Therefore each feed was tried on a group of 10
chicken per pen, and as there are 5 pens, each feed was replicated 5 times. Then the body weight
of the ten chicken was recorded as a group every week, and based on this the weight gain per
week per 10 chicken was calculated as shown below.

Figure 17.1 Schematic illustration of the layout of the feeding experiment explained above

Table 17.1 Weight gain performance (gm/chick/day) under the six feeds during the six weeks

Among subject factor Subjects (Chicken) Within subject factor (age in Weeks)
Feed type replicate W1 W2 W3 W4 W5 W6
1 0.67 0.93 1.3 0.62 0.91 0.78
2 0.64 0.95 1.32 0.8 1.04 1.04
Feed 1 3 0.83 0.76 1.73 1.07 1.14 1.28
4 0.85 0.92 1.68 1.76 1.3 1.53
5 0.76 0.81 1.43 1.02 1.08 1.15
1 0.69 0.87 1.65 1.4 1.81 1.76
2 0.73 0.99 1.51 1.53 1.59 1.52
Feed 2 3 0.92 0.97 1.88 1.62 1.89 2.02
4 0.8 0.84 1.67 1.82 1.71 2.21
5 0.77 0.9 1.6 1.44 1.71 1.86
1 0.9 0.98 1.82 1.78 1.93 1.81
2 0.72 0.97 1.44 1.43 1.66 1.68
Feed 3
3 0.86 0.82 1.67 1.67 1.89 1.62
4 0.86 0.74 1.71 1.55 1.59 1.61

5 0.74 0.9 1.43 1.21 1.26 1.38
1 0.82 1.09 2.1 2 1.86 2.02
2 0.77 0.95 1.75 1.59 1.68 1.49
Feed 4 3 0.87 0.88 1.86 1.51 1.35 1.64
4 0.84 0.93 1.71 1.91 1.84 1.83
5 0.76 0.96 1.79 1.37 1.76 1.63
1 0.75 0.91 1.83 1.98 2.28 2.01
2 0.77 0.73 1.6 1.62 1.89 1.91
Feed 5 3 0.72 0.79 1.59 1.54 1.66 1.74
4 0.75 0.81 1.6 1.52 1.32 1.91
5 0.87 0.87 1.66 1.88 2.2 2.1
1 0.8 0.9 1.63 1.7 1.7 1.86
2 0.67 0.89 1.77 1.52 1.44 1.58
Feed 6 3 0.62 0.81 1.71 1.24 1.5 1.72
4 0.79 0.91 1.7 1.23 1.49 1.66
5 0.78 1.04 2.06 1.71 1.74 1.57

Did the mean growth of the chicken (average body weight gain per day per chicken) vary under
the six experimental feeds (i.e., under the 6 levels of fish meal supplementation)?
Was there a difference in the growth (mean weekly gain/day) of the chicken as their age
advanced?
Was the difference in the growth performance of the chicken under the six feeds comparable at
the different weeks of the experimental period (i.e., is there an interaction effect of feed type X
age of chicken)? Test at the 5% level of significance.

Hypothesis

1. Effect of feed (level of fish meal supplementation) on gain


HO: Mean body weight gain per day per chicken of the tested breed is the same under the six
levels of fish meal supplementation.

HA: Mean body weight gain per day per chicken of the tested breed is different under the six
levels of fish meal supplementation

2. Growth difference with age advancement


HO: Weekly mean gain/day (gm) of the tested breed of chicken is comparable during the course
of the feeding period. i.e., Mean gain is similar at the different age of the chicks

HA: Weekly mean gain/day (gm) of the tested breed of chicken is not comparable during the
course of the feeding period. i.e., Mean gain is not similar at the different age of the
chicks

3. Interaction effect of feed and age of chicken


HO: The difference in weekly gain/day (growth) of the tested breed of chicken at the different
age does not depend on the type of feed provided. i.e., the growth pattern or weight gain
efficiency of the chicken does not depend on the type of feed provided.

HA: The difference in weekly gain (growth) of the tested breed of chicken at the different ages
depends on the type of feed provided. i.e., the growth pattern or weight gain efficiency
of the chicken depends on the type of feed provided.

17.1 Sources of variations in the repeated measure ANOVA design

This design is typically a blocked and repeated measures ANOVA design. Here five replicates of
10 chicken each were fed each type of feed. Therefore the chicken are referred to as subjects or
blocks. Feed type is referred to as the among subject factor because the five groups of chicken
assigned under, say, feed 1 are different from the five groups of chicken assigned under feed 2,
etc. Experimental weeks (age of chicken) are referred to as the within subject factor because the
weight gain of the same chicken was measured repeatedly each week. Thus weeks are also
referred to as the repeated measure factor. Specifically, the design is referred to as a one among
subject and one within subject factor repeated measures design. More than one among and
within subject factor can be included in the model.


The following sources of variations are recognized in the present design.

The total variability can be divided into two sources

i) The variability among subjects (blocks)


ii) The variability within subjects

i) The variability among subjects (blocks) is partitioned into


a) the variability among levels of factor A (feed) and
b) the variability due to subjects within factor A (Feed). i.e. chickens fed the same feed.

ii) The variability within subjects is again divided into


a) the variability due to factor B (week to week difference in growth),
b) due to the interaction of AXB, i.e., feed X week interaction effect in growth and
c) Factor B(week) Vs subjects within A interaction
Levels of factor A (feed) = a = 6

Levels of factor B (experimental weeks) = b = 6

Number of subjects (groups of chicken) within each level of factor A = n = 5

The degrees of freedoms for each source are calculated as follows.

Total DF = anb-1 =N – 1 6 * 5*6 – 1= 180-1 = 179

Subjects (blocks) DF = an – 1 6 * 5 – 1 = 30 -1 = 29

Factor A (Feed) DF = a – 1 6 – 1= 5

Subjects within factor A (S/A) DF = a(n – 1) 6(5 – 1) = 24

= Subjects df – Factor A DF = 29 -5 = 24

Within subject DF = an (b-1) 6*5(6-1) = 150

= Total DF – subjects DF 179 -29 = 150

Factor B (Weeks) DF = b – 1 6 -1 = 5

B X A DF = (a – 1) (b – 1) 5*5 = 25

B X S/A DF = a(n-1)(b-1) 6*4*5 = 120

= within subject DF – Factor B DF – BXA DF 150 -5 -25 = 120
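The degrees-of-freedom bookkeeping above is easy to verify in code; the sketch below checks that the partitions add up for a = 6 feeds, b = 6 weeks and n = 5 subject groups per feed:

```python
# Degrees-of-freedom partition for a one among- and one within-subject
# factor repeated measures design.
a, b, n = 6, 6, 5  # feeds, weeks, subject groups per feed

total_df = a * b * n - 1
subjects_df = a * n - 1
factor_A_df = a - 1
S_in_A_df = a * (n - 1)                 # subjects within factor A
within_subj_df = a * n * (b - 1)
factor_B_df = b - 1
B_x_A_df = (a - 1) * (b - 1)
B_x_SinA_df = a * (n - 1) * (b - 1)

# The among- and within-subject partitions must add up to the totals.
assert subjects_df == factor_A_df + S_in_A_df
assert within_subj_df == factor_B_df + B_x_A_df + B_x_SinA_df
assert total_df == subjects_df + within_subj_df
print(total_df, subjects_df, within_subj_df, B_x_SinA_df)
```

The internal assertions confirm the relationships 29 = 5 + 24, 150 = 5 + 25 + 120 and 179 = 29 + 150 used above.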

Table 17.2 Layout of ANOVA Table for one among and one within subject factor repeated
measure ANOVA design.

Source                SS   DF   MS   Fcal

Total
Subjects
  Factor A (feed)                    MS A / MS(S/A)
  S/A                                Not tested
Within subject
  Factor B (weeks)                   MS B / MS(B X S/A)
  B X A                              MS(B X A) / MS(B X S/A)
  B X S/A

17.2 SPSS procedure for repeated measure ANOVA design

Since the computations are too tedious to perform manually, the SPSS command for the test are
as follows

i) Data entry
The data entry procedure for repeated measures ANOVA is different, and it is designed to
make data entry relatively simple. When entered, the data appear exactly as displayed in
Table 17.1. The procedure is as follows

>Activate SPSS

>In the data editor window, click variable view and enter variable names in the name column
as shown below

Name
Feed_type
Subjects
Week1
Week2
Week3
Week4
Week5
Week6
>Click value label column and appropriately label the categories of feed type and subjects

>Click the measures column and set the variables Feed_type and Subjects to nominal and Week1
to Week6 to scale

Click data view and enter data

ii) Analysis

.Click analyze >General Linear Model > Repeated measure

The repeated measure ANOVA command window appears

>Type weeks as the within subject factor name

>Type 6 as the within subject factor level

>Type growth as the measure name

> Click define

Move W1 to W6 into the within subject variable box

Move feed_type in the between subject factor box

>click posthoc and select Tukey and SNK

>Click options and move feed type in the display means for box also check homogeneity test

Click continue

Click ok

Output window appears as follows

SPSS gives separate summary ANOVA Table for the among subject (feed effect) and the within
subject effect analysis

Tests of Between-Subjects Effects

Source       Type III Sum of Squares   df   Mean Square   F          Sig.

Intercept    329.374                   1    329.374       3031.933   .000
Feed_type    3.190                     5    .638          5.873      .001
Error        2.607                     24   .109

The above output is the test on the feed type effect

It indicates that there is a significant difference in weight gain performance of the chicken
under the six feeds

Mauchly's Test of Sphericity


Measure: Growth
Within Subjects Effect   Mauchly's W   Approx. Chi-Square   df   Sig.   Greenhouse-Geisser   Huynh-Feldt   Lower-bound
Weeks                    .236          31.930               14   .004   .661                 .939          .200
(The last three columns are the epsilon values.)

• One of the most important assumptions for Repeated measure ANOVA test is the
assumption of sphericity. The assumption for the test is that the variance-covariance
matrix of the dependent variable should be circular, or spherical, in form.
• The test for it is called Mauchly’s W test and it is shown by the above table
• Mauchly's test verifies this by testing the null hypothesis that the error covariance matrix
of the orthonormalized transformed dependent variables is proportional to an identity
matrix.
• When the significance value is less than 0.05, as in this case, the assumption for the
univariate tests does not hold. Fortunately, the degrees of freedom of the univariate tests
can be adjusted to account for violation of the assumption.
• The adjustment value, called epsilon, is needed for multiplying the numerator and
denominator degrees of freedom in the F test. There are three possible values of epsilon,
based on three different criteria.
• The Greenhouse-Geisser epsilon can be conservative, especially for small sample sizes.
• The Huynh-Feldt epsilon is an alternative that is not as conservative as the Greenhouse-
Geisser epsilon; however, it may have a value greater than 1. When its calculated value is
greater than 1, the Huynh-Feldt epsilon used is 1.000.
• The lower-bound epsilon takes the reciprocal of the degrees of freedom for the within-
subjects factor. This is the most conservative approach.

Tests of Within-Subjects Effects


Measure: Growth

Source                                   Type III SS   df        Mean Square   F         Sig.
Weeks               Sphericity Assumed   24.882        5         4.976         258.344   .000
                    Greenhouse-Geisser   24.882        3.306     7.527         258.344   .000
                    Huynh-Feldt          24.882        4.696     5.299         258.344   .000
                    Lower-bound          24.882        1.000     24.882        258.344   .000
Weeks * Feed_type   Sphericity Assumed   2.383         25        .095          4.949     .000
                    Greenhouse-Geisser   2.383         16.529    .144          4.949     .000
                    Huynh-Feldt          2.383         23.479    .102          4.949     .000
                    Lower-bound          2.383         5.000     .477          4.949     .003
Error(Weeks)        Sphericity Assumed   2.312         120       .019
                    Greenhouse-Geisser   2.312         79.341    .029
                    Huynh-Feldt          2.312         112.700   .021
                    Lower-bound          2.312         24.000    .096

• This table displays univariate tests for the within-subjects factors and interaction terms.
• The test associated with Sphericity Assumed assumes that the assumption about the
covariance matrix is met. The F statistic is evaluated using the original degrees of freedom.
• When the covariance matrix assumption is not met, use the Greenhouse-Geisser, Huynh-
Feldt, or Lower-bound test.
• Since the value of the Greenhouse-Geisser epsilon is 0.661, the degrees of freedom for each
of the Greenhouse-Geisser tests are 0.661 times the degrees of freedom for the Sphericity
Assumed tests.
• Likewise, since the value of the Huynh-Feldt epsilon is 0.939, the degrees of freedom for
each of the Huynh-Feldt tests are 0.939 times the degrees of freedom for the Sphericity
Assumed tests.
• Lastly, since the value of the lower-bound epsilon is 0.20, the degrees of freedom for each
of the lower-bound tests are 0.20 times the degrees of freedom for the Sphericity Assumed
tests.
• In this case, all four tests agree that the within-subjects effects significantly contribute to the
model.
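The epsilon adjustment is just a multiplication of the sphericity-assumed degrees of freedom. A small sketch, using the rounded epsilons from the Mauchly table (SPSS multiplies by the unrounded epsilon internally, so e.g. 0.661 × 120 = 79.32 only approximates the 79.341 reported above):

```python
# Epsilon-adjusted degrees of freedom for the within-subjects tests.
# Epsilons are the rounded values from the Mauchly's test output; SPSS
# uses more decimal places internally, so these are approximations.
sphericity_df = {"Weeks": 5, "Weeks * Feed_type": 25, "Error(Weeks)": 120}
epsilons = {"Greenhouse-Geisser": 0.661, "Huynh-Feldt": 0.939, "Lower-bound": 0.200}

for name, eps in epsilons.items():
    adjusted = {src: round(eps * df, 3) for src, df in sphericity_df.items()}
    print(name, adjusted)
```

Note that the lower-bound epsilon is exactly the reciprocal of the within-subjects factor df (1/5 = 0.200), as stated above.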
The following is a summary report table of the weight gain performances of the
chicken.

Table 17.3 Mean weight gain (g/day) of the chicken under the six experimental feeds


95% Confidence Interval


Feed type n Mean SE Lower Bound Upper Bound
Feed 1 5 1.070a 0.06 0.946 1.194
Feed 2 5 1.423b 0.06 1.299 1.548
Feed 3 5 1.353b 0.06 1.229 1.477
Feed 4 5 1.452b 0.06 1.327 1.576
Feed 5 5 1.460b 0.06 1.336 1.584
Feed 6 5 1.358b 0.06 1.234 1.482

Interpretation

The weight gain performance of the chicken is significantly lower under the control feed (no
fish meal supplementation), but differences among the supplemented feeds are not significant.

Chapter 18 Regression and correlation analysis

The chapters discussed so far dealt with univariate analysis, an analysis of a single variable.
Regression and correlation analysis are about relationships among variables. Variables are said
to be related when a change in the magnitude of one variable is associated with a change in the

magnitude of the other variable. The change in magnitude for quantitative variables is an
increase or decrease in the value of the variable. If an increase in the value of one of the variables
is accompanied by an increase in the other variable, the relationship is referred to as a positive
relationship (Figure 18.1a). If, on the other hand, an increase in the value of one of the variables is
accompanied by a decrease in the other variable, the relationship is negative (Figure 18.1b). Two
or more variables are said to be unrelated if one of the variables is not responsive as the other
variable changes (Figure 18.1c). For nominal scale data, a positive relationship exists when
the change in the categories is along the same direction. If not, the relationship is negative. For
example, if the presence of a disease is positively associated with the presence of an insect vector, the
relationship is positive.


Figure 18.1 (Relationship between X and Y variables. a) Positive, b) negative, c) no relationship

Regression type relationship exists when there is a functional dependence of one of the variables
on the other. In such a case, the magnitude of one of the variables (the dependent variable) is
assumed to be determined by (or is a function of) the magnitude of the second variable (the
independent variable), whereas the reverse is not true. For example, in the relationship between
blood pressure and age, blood pressure is considered the dependent variable because age is
one of the factors for a change in blood pressure. Thus the blood pressure of an individual may be
estimated from age, but it cannot work the other way around, as age cannot be determined from
blood pressure. There are, however, instances where biologists employ inverse prediction
of the independent variable from the dependent variable, but this should be done with
caution. A correlation relationship exists when there is association between variables but not of
the dependent-independent type described above. In the following data, identify the type of
relationship exhibited by the data.

Table 18.1 Photosynthetic rate (μmol m⁻²s⁻¹) and light interception (mol m⁻²s⁻¹) observed on
leaves of a particular tree species.

X y
0.7619 7.58
0.7684 9.46
0.7961 10.76
0.8380 11.51
0.8381 11.68
0.8435 12.68
0.8599 12.76
0.9209 13.73
0.9993 13.89
1.0041 13.97
1.0089 14.05
1.0137 14.13
1.0184 14.20
1.0232 14.28
1.0280 14.36
Table 18.2 pH and organic carbon content measured from soil samples collected from 15 pits
taken in natural forests

Soil pH Organic carbon


(%)
Pit (x) (y)

1 5.7 2.1
2 6.1 2.17
3 5.2 1.97
4 5.7 1.39
5 5.6 2.26
6 5.1 1.29
7 5.8 1.17
8 5.5 1.14
9 5.4 2.09
10 5.9 1.01
11 5.3 0.89
12 5.4 1.6
13 5.1 0.9
14 5.1 1.01
15 5.2 1.21

Table 18.3 Daily energy amount (calories) required by adult rats at different temperatures

Energy
consumption
Rat Id Air Temp (OC) (Calorie/day)
1 10 31.7
2 12 31
3 14 29.8
4 16 29.1
5 18 28.2
6 20 27.8
7 22 27.4
8 24 25.5
9 26 24.9
10 28 23.7
11 30 23.1

18.1 Types of regression and correlation relationships

Regression and correlation relationships are categorized as follows:

1) Based on the number of variables related
i) Simple (bivariate): a relationship between two variables
ii) Multiple: a relationship among three or more variables

2) Based on the rate of change

i) Linear relationship
In this kind of relationship, the rate of change in one of the variables is constant across the
domain of the other variable. In this case the points in the scatter plot fall in a straight line.

ii) Curvilinear relationship

In this kind of relationship, the rate of change in one of the variables is not constant
across the domain of the other variable. Here the points in the scatter plot do not fall
in a straight line; rather they follow a certain pattern that indicates the underlying
principle in the relationship.

Accordingly, the following types of regression and correlation relationships are recognized.

 Simple linear regression or correlation relationships


 Simple curvilinear regression or correlation relationship
 Multiple linear regression or correlation relationships
 Multiple curvilinear regression or correlation relationships

18.2 Simple linear regression analysis

The purpose of establishing a regression relationship is to estimate the dependent variable
from the independent variable. The independent variable is also known as the predictor or
regressor variable, and the dependent variable is known as the response variable. The value
of the dependent variable can be estimated (or predicted) from the value of the independent
variable if the equation that describes the functional relationship between the two variables is
determined. The statistical procedure is known as regression analysis. The regression
equation that describes a simple linear type regression relationship in a population is
expressed as:

Yi = α + βXi + εi

Where α is the intercept (the value of Y when X = 0) and β is the slope of the relationship.

εi is referred to as the residual or error term, and it is the departure of an actual (measured) Y
from the Y estimated using the above regression equation (Ŷ).

The sample regression relationship is expressed as Yi = a + bXi + ei, where the terms are as
defined above.

18.2.1 Determining simple linear regression equation

To determine the specific equation for a given data set, the values of the intercept and the slope
should be estimated from the X-Y scatter data. The steps of establishing a regression
equation will be illustrated using the data of Table 18.1 as follows

i) Estimate the slope of the relationship


The formula to estimate the slope b is as follows:

b = ∑(Xi − X̄)(Yi − Ȳ) / ∑(Xi − X̄)² = [∑XiYi − (∑Xi ∑Yi)/n] / [∑Xi² − (∑Xi)²/n]

The numerator in the above formula is referred to as the sum of cross products, while the
denominator is the SS of the independent variable Xi.

To calculate the slope for the data of Table 17.1, the terms in the above equation should be
calculated as follows.

Table 18.4 photosynthetic rate (μ mol m -2s-1) and light interception (mol m-2s-1) observed on
leaves of a particular tree species, data arranged for computation of regression
coefficients.

Light interception Photosynthetic rate


X Y XY X2 Y2
0.7619 7.58 5.775202 0.580492 57.4564
0.7684 9.46 7.269064 0.590439 89.4916
0.7961 10.76 8.566036 0.633775 115.7776
0.838 11.51 9.64538 0.702244 132.4801
0.8381 11.68 9.789008 0.702412 136.4224
0.8435 12.68 10.69558 0.711492 160.7824
0.8599 12.76 10.97232 0.739428 162.8176
0.9209 13.73 12.64396 0.848057 188.5129
0.9993 13.89 13.88028 0.9986 192.9321
1.0041 13.97 14.02728 1.008217 195.1609
1.0089 14.05 14.17505 1.017879 197.4025
1.0137 14.13 14.32358 1.027588 199.6569
1.0184 14.2 14.46128 1.037139 201.64
1.0232 14.28 14.6113 1.046938 203.9184
1.028 14.36 14.76208 1.056784 206.2096
13.7224 189.04 175.5974 12.70148 2440.661
∑ Xi ∑Yi ∑XY ∑X2 ∑Y2

b = [∑XiYi − (∑Xi ∑Yi)/n] / [∑Xi² − (∑Xi)²/n]

= (175.5974 − (13.7224 × 189.04)/15) / (12.70148 − 13.7224²/15)

= (175.5974 − 172.9388)/(12.70148 − 12.55361745) = 2.658554/0.147865609

b = 17.98 μmol m⁻²s⁻¹ per mol m⁻²s⁻¹

Interpretation: Photosynthetic rate increased by 17.98 μ mol m-2s-1 for a one mol m-2s-1

increase in the amount of light interception.


ii) Calculating the intercept (a)


After calculating the slope, the intercept is calculated from the following relationship:
a = Ȳ − b × X̄
Ȳ = ∑Yi/n = 189.04/15 = 12.6027 μmol m⁻²s⁻¹
X̄ = ∑Xi/n = 13.7224/15 = 0.9148 mol m⁻²s⁻¹

a = 12.6027 − (17.98 × 0.9148) = −3.845 μmol m⁻²s⁻¹

Accordingly the equation that describes the relationship between rate of photosynthesis
and light interception for the studied tree species is

Yi = 17.98 X – 3.845

Thus the equation can be used to predict the rate of photosynthesis at different levels of
light interception.
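The hand computation above can be reproduced in a few lines of Python (a minimal sketch using the Table 18.4 data; all variable names are illustrative):

```python
# Least-squares slope and intercept for the Table 18.4 data,
# using the sums-of-products formulas from the text.
x = [0.7619, 0.7684, 0.7961, 0.8380, 0.8381, 0.8435, 0.8599, 0.9209,
     0.9993, 1.0041, 1.0089, 1.0137, 1.0184, 1.0232, 1.0280]  # light interception
y = [7.58, 9.46, 10.76, 11.51, 11.68, 12.68, 12.76, 13.73,
     13.89, 13.97, 14.05, 14.13, 14.20, 14.28, 14.36]         # photosynthetic rate
n = len(x)

sum_x, sum_y = sum(x), sum(y)
sum_xy = sum(xi * yi for xi, yi in zip(x, y))
sum_x2 = sum(xi * xi for xi in x)

# b = [sum XY - (sum X)(sum Y)/n] / [sum X^2 - (sum X)^2 / n]
b = (sum_xy - sum_x * sum_y / n) / (sum_x2 - sum_x ** 2 / n)
a = sum_y / n - b * sum_x / n   # a = Ybar - b * Xbar
# b ≈ 17.98 and a ≈ -3.85, matching the hand computation
```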

18.2.2 Expressing the strength of regression analysis

A valid regression equation is one that gives estimates of Y (Ŷ) that are close to the actual
(measured) values of Y at any value of X. If the difference between the two is large, then the
equation is not a good or dependable equation. If the dependence of Y on X is strong, the Y
estimated from the regression equation is close to the measured Y value. Hence
expressing how strong a relationship exists between Y and X is important to authenticate the
validity of the regression equation.

Statistically, the coefficient of determination (R²) is used to express the degree of
dependence of Y on X, in other words the strength of the relationship. R² is computed using the
following equation:

R² = (∑XY − (∑X ∑Y)/n)² / {[∑X² − (∑X)²/n] × [∑Y² − (∑Y)²/n]}

= (175.5974 − (13.7224 × 189.04)/15)² / [(12.70148 − 13.7224²/15) × (2440.661 − 189.04²/15)]

= 2.658554² / (0.147865609 × 58.2529)

= 7.0679/8.6136 = 0.82

Interpretation of the calculated value of R2

R² = 0.82 indicates that 82 % of the variation in photosynthesis rate is due to light
interception by the leaves. In other words, the photosynthesis rate predicted
using the equation is 82 % related to the actually measured photosynthesis rate.

R² values range between 0 and 1. A value of R² close to 1 indicates the presence of a strong
relationship between Y and X. Conversely, a value of R² close to 0 indicates a weak relationship.
Now the question is how large should an R² value be to indicate a strong relationship? As a rule of
thumb, the following can be considered a rough guide.

R2 ≥ 0.9 indicates the presence of a very strong relationship between Y and X

0.8 ≤ R2 <0.9 indicates a strong relationship

0.7 ≤ R2 < 0.8 indicates an acceptable relationship

As seen from the above, R² is a relative measure of the strength of a regression relationship.
Thus, rather than deciding on how low an R² value can be before questioning the validity of the
relationship, the significance of the regression relationship should be tested. In fact, if the
test proves that there is a significant relationship between Y and X, an equation with an R² value
as low as 0.6 or even 0.5 can be considered valid. Hence it is imperative to test the
significance of the regression relationship.
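As a cross-check, R² can be computed directly from the column totals of Table 18.4 (a minimal sketch; the totals are taken from the text):

```python
# R^2 from the column totals of Table 18.4
n = 15
sum_x, sum_y = 13.7224, 189.04
sum_xy, sum_x2, sum_y2 = 175.5974, 12.70148, 2440.661

num = (sum_xy - sum_x * sum_y / n) ** 2
den = (sum_x2 - sum_x ** 2 / n) * (sum_y2 - sum_y ** 2 / n)
r_squared = num / den   # ≈ 0.82
```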

18.2.3 Testing the significance of a regression relationship

Hypothesis

Ho: There is no regression relationship between photosynthesis rate and light interception

HA: There is a significant regression relationship between photosynthesis rate and light
interception

The test of significance can be done using F test as well as t test procedure and both of them are
illustrated below.

18.2.3.1 ANOVA procedure for testing the significance of a regression relationship

i) Compute the total sum of squares of Y

Total SS = ∑(Yi − Ȳ)² = ∑Y² − (∑Y)²/n = 2440.661 − 189.04²/15

= 2440.661 − 2382.408 = 58.2533,  Total DF = n − 1 = 15 − 1 = 14

ii) Compute regression SS

Regression SS = ∑(Ŷi − Ȳ)² = (∑XY − (∑X ∑Y)/n)² / [∑X² − (∑X)²/n]

= (175.5974 − (13.7224 × 189.04)/15)² / (12.70148 − 13.7224²/15)

= 2.658554²/0.147865609 = 47.7995,  Regression DF = 1 for linear regression

Note that regression SS measures the amount of variation in Y that occurred because of the
dependence of Y on X. i.e., it expresses the part of the variation in Y because of the regression
effect of X on Y

Consider the following relationship:

Total SS = Regression SS + Residual SS

Here Residual SS is defined as the discrepancy between the predicted Y (Ŷ) and the actual
(measured) Y:

Residual SS = ∑(Yi − Ŷi)²

It can also be computed as

Residual SS = Total SS – Regression SS

= 58.25329 - 47.7995 = 10.45379

Residual DF = Total DF − Regression DF = (n − 1) − 1 = n − 2 = 15 − 2 = 13

Summary ANOVA table

Source        SS       DF    MS       F cal     F critical    Decision
Total         58.253   14
Regression    47.800    1    47.800   59.442    4.667         Reject Ho
Residual      10.454   13    0.804

Conclusion: There is a significant regression relationship between photosynthesis rate and light
interception.

In the above summary ANOVA table, the residual mean square is also called the variance of the
estimated Y and is symbolized as S²Y·X. It is used to estimate the standard error of the slope
(Sb), which is computed as:

Sb = √[Residual mean square / (∑X² − (∑X)²/n)] = √[0.804 / (12.70148 − 13.7224²/15)]

= √(0.804/0.147865609) = √5.4374 = 2.3318 μmol m⁻²s⁻¹ per mol m⁻²s⁻¹

The standard error of the slope measures how much the sample slope (b) deviates from the
population slope (β). In the present example, the slope determined in the experiment
(b = 17.98 μmol m⁻²s⁻¹ per mol m⁻²s⁻¹) deviates from the true slope for the concerned tree
species (β) by 2.3318 μmol m⁻²s⁻¹ per mol m⁻²s⁻¹ on average.
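The whole ANOVA table above can be reproduced from the same column totals (a sketch; variable names are illustrative):

```python
# Regression ANOVA for the Table 18.4 data, built from the column totals
n = 15
sum_x, sum_y = 13.7224, 189.04
sum_xy, sum_x2, sum_y2 = 175.5974, 12.70148, 2440.661

ss_x = sum_x2 - sum_x ** 2 / n                      # SS of X
total_ss = sum_y2 - sum_y ** 2 / n                  # ≈ 58.253, DF = n - 1
reg_ss = (sum_xy - sum_x * sum_y / n) ** 2 / ss_x   # ≈ 47.80,  DF = 1
res_ss = total_ss - reg_ss                          # ≈ 10.45,  DF = n - 2
res_ms = res_ss / (n - 2)                           # residual mean square
f_cal = reg_ss / res_ms                             # ≈ 59.4
sb = (res_ms / ss_x) ** 0.5                         # SE of the slope, ≈ 2.33
```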

18.2.3.2 t test procedure for testing the significance of a regression relationship

The t test procedure is based on the argument that if there is no regression relationship between
Y and X, the slope (b) would become 0. Therefore in the t test procedure, the computed slope is
compared with Zero. Thus if the computed slope is significantly different from zero, then there
is sufficient evidence for the presence of significant regression relationship between Y and X.
Therefore the test would be done in a manner similar to a one sample t test as follows.

i) Hypothesis
Ho: There is no regression relationship between photosynthesis rate and light interception
(β = 0)

HA: There is a significant regression relationship between photosynthesis rate and light
interception (β ≠ 0)

ii) t cal = (b -0)/Sb = 17.98/ 2.3318 = 7.71

iii) t critical = tα/2 (n-2), for α = 0.05

= t0.025 (13) = 2.160

iv) Decision: reject Ho


v) Conclusion: there is a significant regression relationship between photosynthesis
rate and light interception in the studied species of tree.

1-α% CI for the regression coefficient

Confidence interval for the population slope (β) can be computed as

b ± tα/2 (n-2) * Sb

In the present example, the 95 % CI for the rate of photosynthesis of the population of the
investigated tree species is computed as:

17.98 ± t0.025(13) × 2.3318

17.98 ± 2.16 * 2.3318

17.98 ± 5.04 μ mol m-2s-1/ mol m-2s-1
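The t test and confidence interval above reduce to a few lines (a sketch using b, Sb, and the tabulated critical value quoted in the text):

```python
# One-sample t test of the slope against zero, plus the 95 % CI
b, sb = 17.98, 2.3318    # slope and its standard error (from the text)
t_crit = 2.160           # tabulated t_{0.025, 13}

t_cal = (b - 0) / sb     # ≈ 7.71
reject_h0 = abs(t_cal) > t_crit
ci_low, ci_high = b - t_crit * sb, b + t_crit * sb   # ≈ (12.94, 23.02)
```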


18.3 Simple linear correlation analysis

The purpose of correlation analysis is to examine the presence of association (relationship)


between variables as well as estimate the strength of the relationship that occurred between the
variables. As the nature of the relationship is not dependent-independent type, a functional
relationship cannot be established among the variables. Here the presence or absence of a
significant relationship between the variables considered explains some kind of biological
phenomena behind the relationship. For continuous variables, the Pearson correlation
coefficient (r) is used to describe the presence of a relationship between the variables
considered, and it is computed as follows:

r = √R² = (∑XY − (∑X ∑Y)/n) / √{[∑X² − (∑X)²/n] × [∑Y² − (∑Y)²/n]}

Using the data of Table 18.2, the correlation coefficient (r) can be computed as follows.

Table 18.5 pH and organic carbon content measured from soil samples. Data arranged for
correlation analysis.

Soil pit pH (x) Organic carbon (%) (y) XY X2 Y2


1 5.7 2.1 11.97 32.49 4.41
2 6.1 2.17 13.237 37.21 4.7089
3 5.2 1.5 7.8 27.04 2.25
4 5.7 1.39 7.923 32.49 1.9321
5 5.6 2.26 12.656 31.36 5.1076
6 5.1 1.29 6.579 26.01 1.6641
7 5.8 2 11.6 33.64 4
8 5.5 1.14 6.27 30.25 1.2996
9 5.4 1.6 8.64 29.16 2.56
10 5.9 2.5 14.75 34.81 6.25
11 5.3 0.89 4.717 28.09 0.7921

12 5.4 1.6 8.64 29.16 2.56
13 5.1 0.9 4.59 26.01 0.81
14 5.1 1.01 5.151 26.01 1.0201
15 5.2 1.21 6.292 27.04 1.4641
Totals 82.1 23.56 130.815 450.77 40.8286
∑X ∑Y ∑XY ∑X2 ∑Y2

r = (∑XY − (∑X ∑Y)/n) / √{[∑X² − (∑X)²/n] × [∑Y² − (∑Y)²/n]}

= (130.815 − (82.1 × 23.56)/15) / √[(450.77 − 82.1²/15) × (40.8286 − 23.56²/15)]

= 1.8633 / √(1.40933 × 3.8237)

= √0.6442 = 0.803

Interpretation of the computed r is as follows:

pH and organic carbon content of the tested soil are about 80 % related, and the relationship is positive.

The r value ranges between −1 and 1. Values close to −1 indicate a strong negative (inverse)
relationship, while values close to 1 indicate a strong positive relationship. An r value close to
zero indicates a poor relationship, which is not significant.

The standard error of r (Sr) is computed as Sr = √[(1 − r²)/(n − 2)].

In the present example, Sr = √[(1 − 0.803²)/13] = √0.027322 = 0.1653

Once Sr value is obtained, the 1-α% confidence interval as well as significance test for r can
be carried out as follows.

Testing the significance of a correlation analysis

i) Hypothesis
Ho: There is no correlation relationship between soil pH and organic carbon for the
investigated soil

HA: There is a significant correlation relationship between soil pH and organic carbon for the
investigated soil

ii) t cal = r/Sr = 0.803/0.1653 = 4.858

iii) t critical = tα/2 (n-2) for α = 0.05, t0.025 (13) = 2.160

iv) Decision: reject Ho if t cal ≥ t critical

4.858 > 2.16

Reject Ho

v) Conclusion: There is a significant correlation between soil pH and organic carbon


content for the investigated soil.
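The full correlation test can be sketched end-to-end from the raw soil data of Table 18.5 (variable names are illustrative):

```python
# Pearson r, its standard error, and the t statistic for the soil data
x = [5.7, 6.1, 5.2, 5.7, 5.6, 5.1, 5.8, 5.5,
     5.4, 5.9, 5.3, 5.4, 5.1, 5.1, 5.2]                # pH
y = [2.1, 2.17, 1.5, 1.39, 2.26, 1.29, 2.0, 1.14,
     1.6, 2.5, 0.89, 1.6, 0.9, 1.01, 1.21]             # organic carbon (%)
n = len(x)

sum_x, sum_y = sum(x), sum(y)
sum_xy = sum(xi * yi for xi, yi in zip(x, y))
sum_x2 = sum(xi * xi for xi in x)
sum_y2 = sum(yi * yi for yi in y)

r = (sum_xy - sum_x * sum_y / n) / (
    (sum_x2 - sum_x ** 2 / n) * (sum_y2 - sum_y ** 2 / n)) ** 0.5   # ≈ 0.803
sr = ((1 - r ** 2) / (n - 2)) ** 0.5                                # ≈ 0.165
t_cal = r / sr                                                      # ≈ 4.85
```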

Exercises

1. For data of Table 18.3 in the body, perform regression analysis


i) Establish the functional relationship between the variables considered.
ii) Interpret the meaning of the slope.
iii) Compute the value of R² and interpret its meaning.
iv) Test the significance of the regression relationship.
v) Compute the 95 % CI for the slope.

2. Data were collected on the total rainfall (in mm) during the three main rainy months
(June, July and August) as well as the total grass production during these three months
each year for a duration of ten years at a selected natural rangeland in Afar. The aim of
the research was to examine the influence of rainfall on grass productivity of the studied
area. The data are as follows.
Estimated grass production
Total rainfall (mm) (kg/hectare)
Year X Y
1 220.5 450
2 357.4 700
3 304.8 650
4 118.9 230
5 272.8 500
6 96.3 230
7 176.3 375
8 222 400

9 172.7 362
10 196.3 350

2.1 Establish a regression relationship between the amount of rainfall and grass
productivity.

2.2 Give an estimate of the R² value and interpret it.

2.3 Test the significance of the relationship and compute the 95 % CI for the slope.

3. The following data are measurements of serum cholesterol (mg/100 ml) and arterial
calcium deposition (mg/100g dry weight of tissue) recorded for twelve animals

Calcium deposition Cholesterol level

59 298

52 303

42 233

59 287

24 236

24 245

40 265

32 233

63 286

57 290

36 264

24 239

3.1 Calculate the correlation coefficient and interpret it.

3.2 Test the significance of the correlation relationship.

3.3 Compute the 95 % CI for the population correlation coefficient.

Chapter 19 Multiple regression and correlation analysis

In the previous chapter, the analysis of relationships between two variables was
discussed. In this chapter the consideration is expanded to interrelationships among three or
more variables, using the procedures of multiple regression and correlation. The distinction
between the two is whether or not a dependence relationship is present.

19.1 Multiple regression relationship

The equation for multiple linear regression relationship is expressed as

Yi = α +β1X1 +β2X2+β3X3 ……….+βkXk +εi, where

Yi = the dependent variable

X1 …. Xk = independent (explanatory) variables considered to have influence on the Y


variable

β1 ….βk = Partial regression slopes corresponding to the respective Xi

βi is defined as the rate of change in Y for a unit change in Xi, while the effects of the other
independent variables remain constant.

εi is the residual variance in Y after taking into consideration the effects of the Xi variables
included in the model

The parallel multiple regression equation based on sample data is given as

Yi = a + b1X1 + b2X2 + b3X3 + … + bkXk + εi

19.1.1 Multicollinearity test

One of the most important considerations in multiple regression analysis is that the regressor
variables should not be significantly correlated with each other, because if any two Xi variables
are correlated, the real magnitude of the relationship each has with the Y variable is either
inflated or depressed. That is, if any two X variables, say X1 and X2, are correlated,
the corresponding slopes b1 and b2 do not reflect the true dependence of Y on either Xi. For
instance, if X1 and X2 are positively correlated, the presence of one of the variables unduly
magnifies the apparent dependence of Y on the other, and vice versa. On the other hand, if X1
and X2 are negatively correlated, the presence of one depresses the effect of the other on the
Y variable. The existence of significant correlation between regressor variables is termed a
multicollinearity problem, and when it is detected one of the variables, usually the one with the
lower effect on the Y variable, should be dropped from the model. That is, when two regressor
variables are correlated with a significant r, one of them should be dropped from the regression
analysis.
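A multicollinearity screen can be as simple as computing all pairwise correlations among the regressors and flagging pairs above a cutoff. The sketch below uses made-up data and an illustrative cutoff of 0.8 (in practice the significance of each r, or VIF values as in the SPSS output later in this chapter, would be used):

```python
# Flag pairs of regressors whose pairwise correlation exceeds a cutoff
def pearson_r(u, v):
    n = len(u)
    su, sv = sum(u), sum(v)
    suv = sum(a * b for a, b in zip(u, v))
    su2 = sum(a * a for a in u)
    sv2 = sum(b * b for b in v)
    return (suv - su * sv / n) / (
        ((su2 - su ** 2 / n) * (sv2 - sv ** 2 / n)) ** 0.5)

x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.1, 3.9, 6.2, 8.1, 9.8]   # nearly proportional to x1 -> collinear with it
x3 = [5.0, 3.0, 6.0, 2.0, 7.0]   # essentially unrelated to x1 and x2
regressors = {"x1": x1, "x2": x2, "x3": x3}

names = list(regressors)
flagged = [(a, b) for i, a in enumerate(names) for b in names[i + 1:]
           if abs(pearson_r(regressors[a], regressors[b])) > 0.8]
# flagged -> [("x1", "x2")]
```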

19.1.2 Selecting important regressor variable to be included in the regression model

A multiple regression relationship gives a better prediction of Y than a simple linear


regression relationship that takes each Xi separately. However not all Xi variables

considered have a significant influence on Y, and it is important to select only those regressor
variables that significantly influence Y and include them in the regression model equation.

Two stepwise procedures are popularly employed to select the Xi variables for the
model: step-up (forward) selection and step-down (backward elimination). In the
backward elimination procedure, a regression is established with all
the variables at hand, and step by step those variables that are not significantly related with Y
are dropped. In the forward selection procedure, Xi variables are added step by step,
one at a time, by testing their significance. The steps of the backward elimination procedure
are as follows:

i) Establish a regression relationship between Y and all Xi variables. Test for the
significance of the relationship between Y and each Xi. This is done employing the t
test procedure discussed earlier and testing each partial slope if different from 0 or
not.

ii) Drop the Xi variable that is most non-significantly related with Y and establish a
regression relationship with the rest of Xi variables.

iii) Again drop the Xi variable that is most non-significantly related with Y and
re-establish the regression relationship with the rest. This procedure of dropping one Xi
variable at a time is continued until only Xi variables that are significantly related
with Y remain. Finally, those that are significantly related with Y are used to
build the predictive regression equation.
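The control flow of backward elimination can be sketched as below. The `fit_pvalues` function is a stand-in for an actual model fit (SPSS or a regression library would supply the P values); here it simply returns the significance probabilities reported later in Table 19.2 for whichever variables remain, so the loop reproduces the drop order of the steer example:

```python
# Significance probabilities from Table 19.2, keyed by the set of
# variables still in the model (BellyG stands for belly girth).
TABLE_19_2_P = {
    frozenset(["HG", "WH", "CH", "BellyG", "BL"]):
        {"HG": 0.094, "WH": 0.025, "CH": 0.777, "BellyG": 0.787, "BL": 0.909},
    frozenset(["HG", "WH", "CH", "BellyG"]):
        {"HG": 0.051, "WH": 0.020, "CH": 0.757, "BellyG": 0.779},
    frozenset(["HG", "WH", "CH"]):
        {"HG": 0.044, "WH": 0.010, "CH": 0.703},
    frozenset(["HG", "WH"]):
        {"HG": 0.033, "WH": 0.002},
}

def fit_pvalues(variables):
    # stand-in for refitting the model and reading off the P values
    return TABLE_19_2_P[frozenset(variables)]

def backward_eliminate(variables, alpha=0.05):
    current, dropped = list(variables), []
    while current:
        p = fit_pvalues(current)
        worst = max(current, key=lambda v: p[v])  # largest P value
        if p[worst] <= alpha:                     # everything left is significant
            break
        current.remove(worst)
        dropped.append(worst)
    return current, dropped

kept, dropped = backward_eliminate(["HG", "WH", "CH", "BellyG", "BL"])
# kept -> ["HG", "WH"]; dropped -> ["BL", "BellyG", "CH"]
```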

19.1.3 Illustration of establishing a multiple regression relationship with SPSS

The SPSS procedure for multiple regression will be illustrated with a data set consisting of
body weight and linear body measurements recorded for a random sample of 20 yearling Borana
steers purchased from Yavello market. Body weight of animals is recorded for a variety of
reasons, but measuring the weight of large animals is quite a difficult task since the weighing
scale is cumbersome to carry from place to place. Especially when there is a need to take weight
measurements of cattle at markets or from households, the problem is severe as the balance is
not easily portable. Besides, there is also a need to construct a chute to lead the animals to the
weighing scale.

On the other hand, body weight of animals (cattle, small ruminants, etc.) can be estimated with
reasonable precision from linear body measurements like heart girth, body length, crown height,
etc., if the equation that relates the body weight of the animal with the linear body
measurements is developed. Table 19.1 gives data on body weight and various linear body
measurements of 20 randomly taken yearling Borana steers. Establish a multiple regression
relationship between body weight and the linear measurements. Also select the linear
measurements that are significantly related with body weight to be included in the regression
model.

Table 19.1 Body weight (kg) and various linear body measurements recorded from 20 randomly
taken yearling Borana steers.

Body wt Heart girth Wither height Crown height Belly girth Body length
ID BWT HG WH CH Belly G BL
1 98 110 98.5 99 126.5 88.5
2 100 110 96.5 99.5 128.5 87
3 113 119 99.5 105 136 77.5
4 92 109 99.5 102 127 78.5
5 135 126 103 103.5 132 85

6 124 123 105.5 102.5 126 76
7 90 110 94.5 102 130 74
8 104 112 99 95.5 132 78
9 85.5 106 95 98.5 119.5 70
10 85.5 112 93 95 118.5 71
11 103 119 95 101 121 88
12 104 124 97 100 124 84
13 81 118 94 102 120 79
14 79 119 92 94 131 78
15 100 109 94 102 127 76
16 102 117 93 99 118 86
17 86 103 92 94 116 64
18 95 112 102 105 130 80
19 111 117 97.5 102.5 131 75.5
20 88 107 91.5 99 127.5 69.5

SPSS steps

i) Activate SPSS

ii) In the data editor click variable view and write the variable names in the names
column

iii) While in the variable view, click measures column and change all the X and Y
variables into scale

iv) Click data view and enter data

v) Click Analyze from the menu bar

 Scroll down and select regression

o Linear

 Move body weight (Bwt) into dependent variable box

 Move all linear measurements into independent variable box

 Click method and select backward

 Click statistics and select collinearity diagnostics

 Click continue

 Click ok

The output window showing the results of the regression analysis appears as follows

Table 19.2 Output of the step-down multiple regression analysis performed using data of
Table 19.1.

                     Unstandardized coefficients                Collinearity statistics
Model  Coefficient      B        SE       t        Sig.        Tolerance     VIF

(Constant) -217.491 71.012 -3.063 0.008    


1
  HG 0.768 0.427 1.798 0.094 0.637 1.569
 
  WH 1.914 0.765 2.503 0.025 0.526 1.9
 
  CH 0.242 0.837 0.289 0.777 0.602 1.661


Belly girth 0.126 0.458 0.276 0.787 0.729 1.372

BL 0.045 0.387 0.117 0.909 0.705 1.419

(Constant) -217.941 68.536 -3.18 0.006    

HG 0.79 0.373 2.117 0.051 0.781 1.281


2 WH 1.919 0.738 2.6 0.02 0.528 1.895
 
  CH 0.253 0.804 0.315 0.757 0.61 1.64
 
  Belly girth 0.126 0.442 0.285 0.779 0.729 1.372

(Constant) -212.882 64.276 -3.312 0.004    

HG 0.792 0.362 2.189 0.044 0.781 1.28


3
  WH 1.983 0.683 2.902 0.01 0.581 1.722
 
  CH 0.297 0.766 0.388 0.703 0.633 1.579

(Constant) -198.565 51.312 -3.87 0.001    


4 HG 0.811 0.35 2.319 0.033 0.795 1.257
 
  WH 2.12 0.569 3.726 0.002 0.795 1.257

The procedure performed four regression analyses, step by step dropping the three variables that
are not significantly related with body weight: BL, belly girth, and CH.

Interpretation of the multiple regression result


 The first column shows the regression analyses conducted. Accordingly, four regression
analyses were completed.
 The second column gives the names of the regression coefficients: the partial slopes and the
constant (intercept) for each regression. Note that the partial slopes are given the names
of the X variables. Constant refers to the intercept.
 The third column gives the values of the regression coefficients.
 The fourth column gives the SE value for the respective regression coefficient.
 The fifth column gives the calculated t value, which is obtained by dividing each
regression coefficient by its SE value.
 The sixth column gives the significance probability. Note that a P value above 0.05 indicates
that the variable does not significantly influence the dependent variable Y.

 In the first step all 5 Xi variables were included in the model. Note that except for wither
height (WH), none of the Xi variables included was significantly related with body weight.
Among these, body length (BL) was the most non-significant of all because its P value is the
largest of all. Accordingly, at this step BL was dropped.
 In the second step a relationship was established between body weight and the remaining 4
Xi variables. Belly girth was the most non-significantly related variable, and it was dropped.

 In the third step the three remaining Xi variables were considered; among these,
crown height happened to be the most non-significantly related, and it was dropped.

 In the fourth step, a regression was established between body weight and the remaining
two Xi variables, wither height (WH) and heart girth (HG). Both of them were
significantly related with body weight, and both of them were retained.
Accordingly the final regression model used to estimate the body weight of the studied steers
from linear body measurements is given as

Body weight (kg) = -198.565 + (0.811 * Heart girth (cm)) + (2.12 * Wither height (cm))
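Once fitted, the model is just an equation; a sketch of using it for prediction, taking the first steer of Table 19.1 as an example:

```python
# Predicted body weight from the final two-variable model
def predict_body_weight(heart_girth_cm, wither_height_cm):
    """Final model from the backward-elimination output (kg)."""
    return -198.565 + 0.811 * heart_girth_cm + 2.12 * wither_height_cm

w = predict_body_weight(110, 98.5)   # steer 1: HG = 110 cm, WH = 98.5 cm
# ≈ 99.5 kg, against a measured weight of 98 kg
```

Such an equation is only trustworthy within the range of the measurements used to fit it.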

19.2 Multiple correlation

It is an extension of simple linear correlation. Partial correlation coefficients express the
correlation between any two variables while holding constant the value of each of the other
variables. With three or more variables, it is possible to perform simple correlation analysis by
taking two variables at a time following the procedure described earlier. However, such

correlations will fail to take into account the interaction of any of the other variables on the two
in question.

Symbolically, a partial correlation coefficient for a situation considering three variables (referred

as first order partial correlation coefficient) would be expressed as rik·l and it refers to the
correlation between variables i and k, holding constant the effect of variable l. i.e., any effect of
the interaction of variable l on the relationship between variable i and k is eliminated. The

symbol for second order partial correlation coefficient is given as rik·lp and it involves the
correlation among four variables. It is defined as the correlation between variables i and k,
holding constant the effects of variables l and p. In general for any number of variables, the

partial correlation coefficient is symbolized as rik···, its meaning is the correlation between any
two variables i and k holding all other variables constant

Partial correlation analysis is illustrated using data that depict the relationship between
government funding for health care and disease rate. The data are shown in Table 19.3 below.
There is a claim that more government funding for health care has increased the disease rate. To
examine this, perform a partial correlation between health care funding and reported disease
rate, controlling for a confounding factor, which is the number of visits to a health care
provider.
Here the variables in the table are the following:
 Funding refers to the amount of money assigned to health care (per 100 persons)
 Disease refers to the disease rate reported by health care providers (× 10,000)
 Visits refers to the number of visits to health care providers (hospitals, clinics, etc.) (× 10,000)

Table 19.3 Data on government fund for health care provider and disease rate reported by health
care providers.
City Fundin City Fundin
code g Disease Visits   code g Disease Visits
6 155.32 158.34 152.13   16 166.85 186.80 166.93

9 2 1 8 7 1
177.34 157.22 167.83 174.63 179.41 171.45
27 1 8 9   25 7 7 2
165.09 162.91 162.21 159.49 148.45 157.30
15 4 8 3   10 6 8 3
130.59 146.69 153.18 133.63 150.41
4 154.28 1 3   3 3 6 2
185.55 202.80 152.10 171.17
36 8 8 186.93   19 169.46 7 7
186.95 221.42 188.16 169.89 183.13 168.87
37 5 5 4   20 8 5 6
198.62 189.22 195.38 161.57 163.32 151.86
49 6 5 7   11 3 2 4
172.14 166.41 175.49 183.98 176.51 185.03
21 3 7 6   34 7 7 4
198.25 203.06 197.03 187.98 178.60 184.52
48 5 6 8   39 1 3 5
193.90 198.56 190.16 195.87 207.62 197.06
44 2 9 1   46 8 1 7
157.02 161.78 156.47 154.76 143.88 153.73
7 6 6 3   5 4 1 6
158.06 159.20 177.60 159.73 184.18
8 7 168.82 6   28 4 5 3
182.09 185.64 163.02 172.06 155.93
32 1 180.41 8   13 3 2 1
173.20 178.52 176.50 190.34 188.55 197.91
24 5 3 2   41 2 3 2
151.83 157.77 156.96 179.26 202.31 187.90
2 8 4 9   29 6 3 4
182.89 212.44 181.62 190.16 194.07
33 6 4 2   42 192.2 4 2
197.09 198.89 205.50 181.23 201.53 186.67
47 2 9 5   31 1 8 9
163.86 167.09 159.41 153.99 150.63
14 3 174.25 5   9 2 2 8
192.30 212.07 192.70 167.46 153.98 164.10
43 5 2 6   18 6 6 6
172.02 149.44 165.01 151.37 122.39 148.32
22 7 8 7   1 7 1 3
166.97 190.34 162.06 195.56 184.35 193.66
17 2 7 5   45 4 3 2
187.85 198.37 186.16 162.68 154.43 162.35
38 8 6 1   12 9 6 9

184.22 172.98 188.99 173.16 179.69 172.69
35 9 8 5   23 6 8 9
176.52 167.49 172.50
          26 3 2 8
200.73 164.54 196.14
          50 1 1 3
190.53 184.11 191.10
          40 3 6 4
180.12 178.94
          30 1 4 174.83

SPSS procedure for partial correlation analysis

 Activate SPSS

 Open the saved SPSS file containing the health funding data

To obtain partial correlations:

► From the menus choose:

Analyze > Correlate > Partial

 Select Health care funding and Reported disease rate as the variables to be correlated

 Select Visits to health care providers as the control variable.

 Click Options.

 Click (check) Zero-order correlations and then click Continue.

 In the main Partial Correlations dialog, click OK to run the procedure.


This gives the following partial correlation table as an output.
Table 19.4 Partial correlation analysis output on the data of health care funding and disease rate

Control                                       Health care     Reported        Visits to health
variable                                      funding         diseases        care providers
                                              (amount/100)    (rate/10,000)   (rate/10,000)
-none-    Health care    Correlation          1.000           .737            .964
          funding        Significance         .               .000            .000
                         df                   0               48              48
          Reported       Correlation          .737            1.000           .762
          diseases       Significance         .000            .               .000
                         df                   48              0               48
          Visits to      Correlation          .964            .762            1.000
          health care    Significance         .000            .000            .
          providers      df                   48              48              0
Visits to Health care    Correlation          1.000           .013
health    funding        Significance         .               .928
care                     df                   0               47
providers Reported       Correlation          .013            1.000
          diseases       Significance         .928            .
                         df                   47              0
Interpretation of results in Table 19.4

 In this example, the Partial Correlations table shows both the zero-order correlations
(correlations without any control variables) of all three variables and the partial
correlation of the first two variables controlling for the effects of the third variable.

 The zero-order correlation between health care funding and disease rates is, indeed, both
fairly high (0.737) and statistically significant (p < 0.001).

 The partial correlation controlling for the rate of visits to health care providers, however,
is negligible (0.013) and not statistically significant (p = 0.928).

 One interpretation of this finding is that the observed positive "relationship" between
health care funding and disease rates is due to underlying relationships between each of
those variables and the rate of visits to health care providers:

 Disease rates only appear to increase as health care funding increases because more
people have access to health care providers when funding increases, and doctors and

hospitals consequently report more occurrences of diseases since more sick people come
to see them.

 Going back to the zero-order correlations, you can see that both health care funding rates
and reported disease rates are highly positively correlated with the control variable, rate
of visits to health care providers.

 Removing the effects of this variable reduces the correlation between the other two
variables to almost zero. It is even possible that controlling for the effects of some other
relevant variables might actually reveal an underlying negative relationship between
health care funding and disease rates.
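The partial correlation reported here can be reproduced from the zero-order correlations with the standard first-order partial correlation formula. The sketch below uses the rounded zero-order values from the table, so the result only approximates the 0.013 reported by SPSS:

```python
import math

# First-order partial correlation of X and Y controlling for Z:
# r_xy.z = (r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2) * (1 - r_yz^2))
r_xy = 0.737   # funding vs. disease rates (zero-order)
r_xz = 0.964   # funding vs. visits (the control variable)
r_yz = 0.762   # disease rates vs. visits

partial = (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz**2) * (1 - r_yz**2))
print(round(partial, 3))   # close to the 0.013 reported by SPSS
```

The near-zero result shows numerically how removing the shared association with visit rates collapses the apparent funding-disease relationship.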

Chapter 20 Logistic regression analysis

Ordinary Least Squares (OLS) regression models assume that both the Y and X variables are
measured on a continuous scale. For example, consider the relationship between
photosynthesis rate and light interception: both are recorded on a continuous scale.
Such ordinary regression relationships are expressed by a simple linear equation

Y = a + bX

or by a multiple linear equation as

Y = a + b1X1 + b2X2 + … + bkXk

On the other hand, there are situations, commonly involving clinical, epidemiological or
social data, where the dependent variable (Y) is measured on a nominal or ordinal scale, i.e.,
where the data fall into two or more categories. For example, consider a clinical study
conducted in a certain community on the relationship between maternal alcohol consumption
and the occurrence of congenital child malformation. Mothers who consume alcohol during
pregnancy are known to be more likely to deliver a child with some kind of malformation.
The study was thus designed to examine the influence of maternal alcohol consumption
during pregnancy on congenital child malformation.

As seen from the data, the dependent variable (Y = congenital child malformation) is recorded on a
nominal scale as yes or no. Such data are called binary or dichotomous: the outcome consists
of two mutually exclusive categories, such as presence or absence, or yes or no.
In the present example the X variable (alcohol consumption) is ordinal (ranked). However, for
logistic regression the independent variables can take any form: continuous, ordinal or nominal.
The dependent variable, in contrast, is always qualitative (nominal or ordinal).

Table 20.1 The relationship between maternal alcohol consumption and congenital child
malformation

Maternal alcohol consumption        Occurrence of congenital child malformation
No of drinks/month     Rank     Yes     No        Total     P         %
None                   0        48      17066     17114     0.0028    0.28
<1                     1        38      14464     14502     0.0026    0.26
1 to 2                 2        5       1000      1005      0.0050    0.50
3 to 5                 3        1       126       127       0.0079    0.79
5 to 10                4        1       35        36        0.0278    2.78
10 to 15               5        1       20        21        0.0476    4.76
15 to 20               6        1       10        11        0.0909    9.09
20 to 30               7        1       9         10        0.1000    10.00
every day              8        1       8         9         0.1111    11.11
Total                           96      32730     32826     0.0029    0.29

20.1 Simple (Binary) logistic regression

Consider the data on congenital child malformation and intensity of maternal alcohol
consumption (Table 20.1). Plotting the proportion (probability) of encountering congenital
child malformation against the intensity of alcohol consumption yields an S-shaped curve
(Figure 20.1), known as the logistic curve and defined by the following equation

P = 1/(1 + e^-(a+bX)), where

P = The proportion (probability) of encountering congenital malformation

X = Intensity of maternal alcohol consumption during pregnancy

[Figure: S-shaped logistic curve; Y axis: proportion of congenital malformation (0.00-0.12); X axis: alcohol consumption (rank 0-8)]

Figure 20.1 The proportion of congenital malformed child delivered by moms consuming
alcohol during pregnancy.

In this example the dependent variable has only two categories (binary) and there is a single
independent variable (X), i.e., a simple binary logistic relationship. A logistic relationship
can also involve a binary dependent variable with more than one explanatory variable (X), as
well as a dependent variable with more than two categories, i.e., a multinomial logistic
regression relationship with many explanatory variables.

20.1.1 Linearizing the logistic equation

The simple binary logistic regression equation can be transformed into a linear form as follows.

• P = 1/(1 + e^-(a+bX))

• 1/P = 1 + e^-(a+bX)

• (1/P) - 1 = e^-(a+bX)

• Ln[(1/P) - 1] = -(a + bX)

• Ln[(1 - P)/P] = -(a + bX)

• -Ln[(1 - P)/P] = a + bX

• Ln[P/(1 - P)] = a + bX

In the above equation the terms are

P = the probability of encountering an event, in this case observing a defective birth in the
population

1 - P = the probability of not encountering the event of interest, in this case the probability of
observing a normal birth.

The ratio P/(1 - P) is called the odds, defined as the probability (likelihood) of observing an
event relative to the probability of not observing it. Odds simply express the likelihood of
occurrence of an event.

If, for example, 60 % of the individuals in a given community are female and 40 % are male,
the chance of encountering a female individual is 1.5 times that of encountering a male:

P(f) = 0.6, 1 - P(f) = 0.4

Odds(f) = 0.6/0.4 = 1.5
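As a quick numerical check, the odds computation above can be written as a one-line helper (a minimal sketch; the function name is illustrative):

```python
def odds(p):
    """Odds of an event with probability p: P / (1 - P)."""
    return p / (1 - p)

print(odds(0.6))   # odds of encountering a female when P(f) = 0.6
```

For P(f) = 0.6 this returns 1.5 (up to floating-point rounding), matching the calculation above.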

20.1.2 Odds ratio

The odds ratio is defined as the odds for an event in one situation compared to the odds for
the same event in another situation. For example, the odds of encountering a defective birth
among moms who consume alcohol every day during pregnancy, compared to those who do not
drink at all, are:

Odds(no alcohol) = 0.002805/(1 - 0.002805) = 0.002813

Odds(daily alcohol) = 0.1111/(1 - 0.1111) = 0.1249

Odds ratio = Odds(daily alcohol)/Odds(no alcohol) = 0.1249/0.002813 = 44.43

This means the chance of having a defective birth is 44.43 times higher for women drinking
alcohol every day during pregnancy compared to those who don’t.
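The same arithmetic can be verified in a short script, taking the raw counts from Table 20.1 (variable names are illustrative):

```python
def odds(p):
    """Odds of an event with probability p: P / (1 - P)."""
    return p / (1 - p)

p_none = 48 / 17114    # proportion of malformed births, no alcohol
p_daily = 1 / 9        # proportion of malformed births, daily alcohol

odds_ratio = odds(p_daily) / odds(p_none)
print(round(odds_ratio, 2))   # about 44.4, matching the odds ratio above up to rounding
```

Working from the unrounded proportions gives essentially the same 44-fold increase in odds reported in the text.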

20.1.3 Fitting the logistic regression relationship

Ln[P/(1 - P)] = a + bX

Ln[P/(1 - P)] = Ln(Odds) is statistically known as the logit.

To fit the logistic regression relationship, the logit values Ln[P/(1 - P)] are regressed on X.

For the present data, the specific logistic regression equation that relates the probability of
finding a defective birth with alcohol consumption can be expressed as:

Ln(Odds) = 0.571X - 6.181

a = -6.181 is the intercept; it has no direct practical interpretation here.

b = 0.571 is the rate of increase in Ln(Odds), i.e., the logit, for a one-unit increase in X.

The value e^b is easier to interpret: it is an odds ratio. In the present example
e^b = e^0.571 = 1.77, meaning the likelihood of observing a defective birth increases by a
factor of 1.77 with each additional level of alcohol consumption. For example, the chance of
having a defective birth is 1.77 times higher for mothers who consume alcohol occasionally
compared to those who do not consume at all.

20.1.4 Logistic regression with SPSS

The SPSS binary logistic regression option performs a maximum likelihood (ML) fit to the observed data.

The procedure is as follows

• Activate SPSS

• In the data editor, click variable view and enter variable names in the name column as

>Alcohol

>Malformation

>Frequency

• Click value label column for alcohol and specify the categories of Alcohol consumption
from 0 to 7

Also click the value column for malformation and label as

0 = no and 1=yes

• Click measures column and for each variable specify the variable type as

Alcohol = ordinal

Malformation = Nominal

Frequency = scale

• Click data view and enter data

• Click data in the menu bar and choose weight cases

>activate weight cases by

>Move frequency in the weight variable box

This specifies that each category is weighed by its corresponding frequency.

Note that this step is not necessary when entering raw data

• To perform logistic regression

• Click analyze from the menu bar

>regression

>binary logistic

The binary logistic command window appears

>move Alcohol to covariate box

>Move malformation to dependent variable box

• Click categorical

>move alcohol to categorical covariate box

>Click contrast and change it to simple and select reference category first

Click change

Click continue and click ok in the main dialog window

Logistic regression output window will appear with the following table containing the
results of the regression analysis

Table 20.2 The result of the binary logistic analysis done on the data of Table 20.1.

Predictor B S.E. Wald df Sig. Exp(B)


Alcohol     38.03 7 0  
Alcohol(1) -0.068 0.217 0.098 1 0.754 0.934
Alcohol(2) 0.575 0.471 1.492 1 0.222 1.778
Alcohol(3) 1.037 1.014 1.046 1 0.306 2.822
Alcohol(4) 2.318 1.024 5.121 1 0.024 10.158
Alcohol(5) 2.878 1.035 7.734 1 0.005 17.777
Alcohol(6) 3.571 1.059 11.377 1 0.001 35.554
Alcohol(7) 3.676 1.064 11.94 1 0.001 39.505
Constant -5.874 0.145 1651.34 1 0  

Logistic regression output interpretation

• The table gives values of the regression parameters a and b with the standard error values

• The procedure selected as simple contrast compares each level of the predictor variable
with the reference category.

• In this analysis, the reference category is ‘No alcohol consumption’

• Accordingly the last column (EXP(B)) indicates the odds ratio for the occurrence of child
malformation at each level of alcohol consumption compared to the reference category,
which is no alcohol consumption

• In the present analysis, the odds for occurrence of child malformation is 39.505 times
higher in moms that consume alcohol every day compared to those who do not consume
at all.

20.2 Multiple binary logistic regression relationship

Logistic regression, like OLS regression, can have multiple explanatory variables. Some or all of
these predictors can be categorical (nominal), and some can be quantitative. The general
logistic equation with multiple explanatory variables is given as:

P = 1/(1 + e^-(a + b1X1 + b2X2 + … + bkXk)), where

b1 to bk correspond to the effects of the respective Xi on Ln(Odds), controlling for the other X
variables. The linearized form of this expression is given as

Ln[P/(1 - P)] = a + b1X1 + b2X2 + … + bkXk

Illustration of multiple binary logistic regression analysis

Consider that a bank wants to find out the characteristics that are indicative of people who are
likely to default on loans and use these characteristics to identify good and bad credit risks.

Information gathered from 700 cases include the following

Age of respondent (age) X1

Education (ed) X2

Years with current employer (Employee) X3

Years at current address (Address) X4

HH income (Income) X5

Debt to income ratio (Debtinc) X6

Credit card debt (Creddebt) X7

Other debt (Other debt) X8

Previously defaulted (default) Y variable

The data are found in the SPSS data file called Bank Loan. There are altogether 8 explanatory
variables considered to characterize a potential creditor. Among these, education is an
ordinal-scale variable; the remaining 7 are recorded on a continuous scale. The dependent
variable is whether a person who took a loan has previously defaulted, a binary outcome (yes
or no). Fit a binary logistic regression to the data and find out which of the explanatory
variables (characteristics) are significantly related to the dependent variable.

20.2.1 SPSS procedure for multiple logistic regression

• To create the logistic regression model, from the menus choose:

• Analyze > Regression > Binary Logistic

• Select Previously defaulted as the dependent variable.

• Select Age in years through other debt in thousands as covariates.

• Select Forward: LR as the method.

• Click Categorical in the Logistic Regression dialog box.

• Select Level of education as a categorical covariate.

• Click Continue.

• Click Save in the Logistic Regression dialog box.

• Select Probabilities in the Predicted Values group, Cook's in the Influence group, and
Studentized in the Residuals group.

• Click Continue.

• Click Options in the Logistic Regression dialog box.

• Select Classification plots and Hosmer-Lemeshow goodness-of-fit.

• Click Continue.

• Click OK in the Logistic Regression dialog box.

20.2.2 SPSS output interpretation

Tests of Model Fit

 Goodness-of-fit statistics help you to determine whether the model adequately describes
the data. The Hosmer-Lemeshow statistic indicates a poor fit if the significance value is
less than 0.05. Here, the model adequately fits the data

 Forward stepwise methods start with a model that doesn’t include any of the
predictors

 At each step, the predictor with the largest score statistic whose significance value is less
than a specified value (by default 0.05) is added to the model.

 The variables left out of the analysis at the last step all have significance values larger than
0.05, so no more are added.

The variables chosen by the forward stepwise method should all have significant changes in -2
log-likelihood.
• The change in -2 log-likelihood is generally more reliable than the Wald statistic. If the
two disagree as to whether a predictor is useful to the model, trust the change in -2 log-
likelihood. See the regression coefficients table for the Wald statistic.
• As a further check, you can build a model using backward stepwise methods. Backward
methods start with a model that includes all of the predictors. At each step, the predictor
that contributes the least is removed from the model, until all of the predictors in the
model are significant. If the two methods choose the same variables, you can be fairly
confident that it’s a good model.

In the linear regression model, the coefficient of determination, R², summarizes the proportion
of variance in the dependent variable associated with the predictor (independent) variables, with
larger R² values indicating that more of the variation is explained by the model, to a maximum
of 1. For regression models with a categorical dependent variable, it is not possible to compute a
single R² statistic that has all of the characteristics of R² in the linear regression model, so these
approximations are computed instead. The following methods are used to estimate the
coefficient of determination.

 Cox and Snell’s R² (Cox and Snell, 1989) is based on the log likelihood for the model
compared to the log likelihood for a baseline model. However, with categorical outcomes, it
has a theoretical maximum value of less than 1, even for a “perfect” model.

 Nagelkerke’s R² (Nagelkerke, 1991) is an adjusted version of the Cox and Snell R² that
rescales the statistic to cover the full range from 0 to 1.

 McFadden’s R² (McFadden, 1974) is another version, based on the log-likelihood kernels
for the intercept-only model and the full estimated model.

What constitutes a “good” R² value varies between different areas of application. While these
statistics can be suggestive on their own, they are most useful when comparing competing
models for the same data. The model with the largest R² statistic is “best” according to this
measure.
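These three approximations can be computed directly from the model log-likelihoods. The sketch below uses made-up log-likelihood values (LL0 for the intercept-only model, LL1 for the fitted model); the formulas are the standard ones:

```python
import math

def pseudo_r2(ll0, ll1, n):
    """Cox-Snell, Nagelkerke and McFadden pseudo-R^2 from log-likelihoods.

    ll0: log-likelihood of the intercept-only (baseline) model
    ll1: log-likelihood of the fitted model
    n:   sample size
    """
    cox_snell = 1 - math.exp(2 * (ll0 - ll1) / n)
    nagelkerke = cox_snell / (1 - math.exp(2 * ll0 / n))  # rescaled to max 1
    mcfadden = 1 - ll1 / ll0
    return cox_snell, nagelkerke, mcfadden

# Hypothetical log-likelihoods for a sample of n = 700 cases
cs, nk, mf = pseudo_r2(ll0=-400.0, ll1=-300.0, n=700)
print(round(cs, 3), round(nk, 3), round(mf, 3))
```

Note that Nagelkerke's value is always at least as large as Cox-Snell's, since it divides by the Cox-Snell theoretical maximum.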

Logistic Regression Coefficients

• As shown in the table, at the 4th step four variables have been selected as having a
significant effect on the dependent variable, namely EMPLOY, ADDRESS,
DEBTINC and CREDDEBT.
 The parameter estimates table summarizes the effect of each predictor.
 The ratio of the coefficient to its standard error, squared, equals the Wald statistic.
 If the significance level of the Wald statistic is small (less than 0.05) then the parameter
is useful to the model.
 The predictors and coefficient values shown in the last step are used by the procedure to
make predictions.
• The meaning of a logistic regression coefficient is not as straightforward as that of a
linear regression coefficient. While B is convenient for testing the usefulness of
predictors, Exp(B) is easier to interpret.
• Exp(B) represents the ratio-change in the odds of the event of interest for a one-unit
change in the predictor.
• For example, Exp(B) for employ is equal to 0.781, which means that the odds of default
for a person who has been employed at their current job for two years are 0.781 times the
odds of default for a person who has been employed at their current job for 1 year, all
other things being equal.
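Since Exp(B) is a multiplicative factor on the odds, its effect compounds over units of the predictor. A small sketch of this interpretation (the baseline odds value is hypothetical):

```python
import math

exp_b = 0.781            # Exp(B) for years with current employer (from the table)
b = math.log(exp_b)      # the raw coefficient B for this predictor (negative here)

baseline_odds = 0.30                              # hypothetical odds of default
after_1_more_year = baseline_odds * exp_b         # odds shrink by factor 0.781
after_4_more_years = baseline_odds * exp_b ** 4   # factors multiply: 0.781^4
print(round(after_1_more_year, 4), round(after_4_more_years, 4))
```

Each extra year of employment multiplies the odds of default by 0.781, so longer tenure corresponds to progressively lower predicted odds, all other things being equal.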

Classification table based on the final model

                                  Predicted by the model
Observed                      No       Yes      Total     % correct
Previously defaulted   No     352      23       375       93.7
                       Yes    67       57       124       54.03
Overall                                         499       82.0

The classification table shows the practical results of using the logistic regression model.
• For each case, the predicted response is Yes if that case's model-predicted probability is
greater than the cutoff value specified in the dialogs (in this case, the default of 0.5).
• Cells on the diagonal are correct predictions.
• Cells off the diagonal are incorrect predictions.
• Of the cases used to create the model, 57 of the 124 people who previously defaulted are
classified correctly (54.03%).
• 352 of the 375 non-defaulters are classified correctly (93.7%).
• Overall, 82.0% of the cases are classified correctly.
• From step to step, the improvement in classification indicates how well your model
performs. A better model should correctly identify a higher percentage of the cases.
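The mechanics of the classification table can be sketched as follows. The probabilities and outcomes below are invented for illustration; SPSS does the same bookkeeping over all cases:

```python
cutoff = 0.5
probs    = [0.10, 0.80, 0.40, 0.65, 0.20]   # model-predicted P(default), hypothetical
observed = [0,    1,    1,    1,    0]      # 1 = previously defaulted

# Predict "Yes" whenever the model-predicted probability exceeds the cutoff
predicted = [1 if p > cutoff else 0 for p in probs]

# Diagonal cells of the classification table are the correct predictions
correct = sum(int(p == o) for p, o in zip(predicted, observed))
accuracy = 100.0 * correct / len(observed)
print(predicted, accuracy)
```

Raising or lowering the cutoff trades off correct classification of defaulters against non-defaulters, which is exactly what the step-by-step classification tables let you monitor.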

Chapter 21 Analysis of nominal scale variables


As nominal data are counts of items or events in each of several classifications, methods for
their analysis are sometimes referred to as enumeration statistical methods. Nominal variables
combined with frequencies are amenable to statistical tests. A widely employed test for nominal
data is the X² (chi square) test. The use of this test statistic is illustrated in the following
sections.

21.1 One way classification X2 test

A given forest land was surveyed for the kinds of tree species it contained. Accordingly, an area
50 m by 50 m was randomly selected and the different tree species were counted separately.
The following result was obtained.

Table 21.1 Frequency distribution of the six tree species in a 50 m by 50 m randomly sampled area
in a given forest land

Tree species   Absolute frequency (count/2500 m²)   Relative frequency (%)
Spp 1          13                                    10.3a
Spp 2          26                                    20.6b
Spp 3          31                                    24.7b
Spp 4          14                                    11.1a
Spp 5          28                                    22.2b
Spp 6          14                                    11.1a
Total          126                                   100
Different superscript letters indicate relative frequencies that differ significantly (see section 21.2).

Test the hypothesis that the identified tree species are equally dominant in the investigated
forest area. Use a 5 % level of significance.

1) Hypothesis

Ho: the identified tree species are equally dominant

Yosef Tekle-Giorgis, HUCA, 2015 Page 238


Statistical Methods for Scientific 2015

Research
HA: the identified tree species are not equally dominant.
2) Test statistic
The test statistic for this kind of test is, as mentioned before, X², and it is computed as
follows.
i) Determine the expected frequency considering that the null hypothesis is true
If the null hypothesis is true and the trees are equally dominant, each tree species would occur
with the proportion of 1/6. Accordingly the expected proportion is computed as

P̂ = 1/k, where k is the number of categories

The expected frequency (f̂) is computed for each category as

f̂ = P̂ × N = N/k, where N is the total number of observations

In the present example, if the null hypothesis is true (i.e., if the trees are equally dominant in the
investigated forest), they are expected to occur with a frequency of 126/6 = 21 trees each.

ii) Calculate the test statistic X2


Chi square measures how much the observed frequencies deviate from the expected frequencies
calculated under the assumption of a true null hypothesis. Thus, if the null hypothesis is true (if
the trees are equally dominant), the observed frequency (fi) should not deviate much from the
expected frequency (f̂i). Accordingly, the chi square value is computed as:

X² = Σ (fi − f̂i)² / f̂i

The computation of the expected frequencies for each category and the Chi-square values are
shown in the following table.

Table 21.2 Data of Table 21.1 used to illustrate the calculation of the chi square value

Tree species   Observed freq. (fi)   Expected freq. (f̂)   X²
Spp 1          13                    21                    3.047619
Spp 2          26                    21                    1.190476
Spp 3          31                    21                    4.761905
Spp 4          14                    21                    2.333333
Spp 5          28                    21                    2.333333
Spp 6          14                    21                    2.333333
Total          126                   126                   16.00

As seen in the table, X2cal = 16.00

Is this chi square value large enough to show a significant deviation of the observed frequencies
from their expectation under the assumption of a true null hypothesis? To judge this, determine the
critical value of X².

iii) Critical value of X2


This is found from the X² table (see Appendix Table 5) for k - 1 degrees of freedom, where k is the
number of categories compared. Thus the critical value of chi square is:

X2 critical = X2α (k-1), if α = 0.05 then

X20.05 (6-1), = X20.05 (5) = 11.07

iv) Decision:
Reject Ho if X2cal ≥ X2α (k-1)

16.00 > 11.07, hence reject Ho

v) Conclusion:
The identified tree species are not equally dominant in the studied forest land.
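The calculation above can be verified with a few lines of Python (a sketch of the same arithmetic, not an SPSS procedure):

```python
observed = [13, 26, 31, 14, 28, 14]   # tree counts from Table 21.1
n, k = sum(observed), len(observed)
expected = n / k                      # 126/6 = 21 under the null hypothesis

# X^2 = sum of (f - f_hat)^2 / f_hat over the k categories
x2 = sum((f - expected) ** 2 / expected for f in observed)
print(round(x2, 2))   # compared against the critical value X2(0.05, 5) = 11.07
```

The computed statistic of 16.00 exceeds 11.07, reproducing the rejection of the null hypothesis above.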

21.2 Subdividing X2 test

In the above procedure, the test proved that the six tree species are not equally dominant. But
this does not mean that all of them are different from one another in dominance. Therefore to
find out which tree species differed from which others, subdividing the chi-square analysis is
essential.

In this procedure, categories that appear comparable in frequency are grouped together and
differences within each group are tested. This way, subgroups with similar frequencies can be
identified. In the present example, tree species 1, 4 and 6 appear comparable in dominance, and
tree species 2, 3 and 5 also appear similar in dominance. Thus the two groups are each tested
for differences within themselves. The procedure is as follows.

1) Test if tree species 1, 4 and 6 are similar in dominance


i) Hypothesis
Ho: tree species 1, 4, and 6 are similar in dominance

HA: tree species 1, 4, and 6 are different in dominance

ii) Compute X2 cal as follows


Summary table of the calculation of X²

Tree species   Observed freq. (fi)   Expected freq. (f̂i)   X²
Spp 1          13                    13.6667                0.03252
Spp 4          14                    13.6667                0.00813
Spp 6          14                    13.6667                0.00813
Total          41                    41                     0.05

X² cal = 0.05

iii) X2 critical = X2 0.05 (3-1) = 5.991
iv) Decision : Accept Ho since 0.05 < 5.991
v) Conclusion: the three tree species are equally dominant in the investigated forest land
2) Test if tree species 2, 3, and 5 are equally dominant or not
i) Hypothesis
Ho: tree species 2, 3 and 5 are equally dominant

HA: tree species 2, 3 and 5 are not equally dominant

ii) X2 cal
Summary table of the calculation of X²

Tree species   Observed freq. (fi)   Expected freq. (f̂i)   X²
Spp 2          26                    28.3333                0.192157
Spp 3          31                    28.3333                0.250980
Spp 5          28                    28.3333                0.003922
Total          85                    85                     0.45

X² cal = 0.45

iii) X2 critical = X2 0.05 (3-1) = 5.991


iv) Decision : Accept Ho since 0.45 < 5.991
v) Conclusion: the three tree species are equally dominant in the investigated forest
land

Accordingly, the summary report table is prepared, as shown earlier in Table 21.1, by assigning
the same superscript letter to categories that did not differ in relative frequency (%). Hence the
overall conclusion is that tree species 1, 4 and 6 did not differ in abundance, and the remaining
three (spp 2, 3 and 5) are also equally dominant. However, the latter group of trees was more
abundant than the former group.
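Both subgroup tests can be checked with the same helper (a sketch; the function name is illustrative):

```python
def one_way_x2(observed):
    """Chi-square statistic against equal expected frequencies across categories."""
    expected = sum(observed) / len(observed)
    return sum((f - expected) ** 2 / expected for f in observed)

x2_low = one_way_x2([13, 14, 14])    # spp 1, 4 and 6
x2_high = one_way_x2([26, 31, 28])   # spp 2, 3 and 5
print(round(x2_low, 2), round(x2_high, 2))  # both far below X2(0.05, 2) = 5.991
```

Both statistics (0.05 and 0.45) are far below the critical value of 5.991, so neither subgroup shows internal differences in dominance, as concluded above.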

21.3 Two way classification frequency data analysis

An X² test can also be done for enumeration data classified along two or more classifying
variables. For example, in a two way classification, one category is referred to as the row
variable and the other as the column variable, giving r × c cells in the table. A two way
classification X² test is illustrated in the following example.

Consider that a study was designed to investigate the distribution of four tree species in forest
lands of two geographical locations. Accordingly in each of the two study areas, an area
measuring 50 m by 50 m (2500 m 2) land was marked and the four tree species were counted.
The data are as follows.

Table 21.3 The distribution of four tree species in forest lands of two geographical locations

               Tree species
Area           Spp1    Spp2    Spp3    Spp4    Row total
Area 1         32      43      16      9       100
Area 2         55      65      64      16      200
Column totals  87      108     80      25      300 (grand total)

Test the hypothesis that the tree species distribution does not depend on the area in which the two
forest lands exist. Use a 5 % level of significance for the test.

Enumeration data classified along more than one categorical variable are also referred to as a
contingency table; this one is a 2 × 4 contingency table. In this example, if the distribution of the
four tree species is independent of the area in which they are sampled, they will be equally
distributed in the two areas. Thus the null hypothesis for the test is as follows.

21.3.1 A 2X4 contingency data test procedure

i) Hypothesis
Ho: The four tree species are equally distributed in the two forest lands

HA: The four tree species are not equally distributed in the two forest lands

ii) X2 cal
To determine the value of X², the expected frequencies under the assumption of a true null
hypothesis must first be computed. The arguments for calculating the expected frequency of
each cell are the following:

 100 of the 300 trees are from area 1; thus the expected proportion of a tree being from area 1
is 100/300.
 87 of the 300 trees are spp 1; thus the expected proportion of a tree being spp 1 is 87/300.

Accordingly, the expected proportion of a tree being spp 1 and from area 1 is

P̂11 = (100/300) × (87/300)

Hence the expected frequency of a tree being spp 1 and from area 1 is

f̂11 = (100/300) × (87/300) × 300 = (100 × 87)/300

From this it can be seen that the expected frequency for the cell in row i and column j is

f̂ij = (R × C)/N, where R = row total, C = column total, and N = grand total

Accordingly the calculations for the expected frequencies for each cell and the corresponding X 2
values will be illustrated in the following table

Table 21.4 Data of Table 21.3 used to illustrate a 2 × 4 contingency data analysis

Area   Tree spp   Observed freq. (f)   Expected freq. (f̂)   X²
1      1          32                   29                    0.310345
1      2          43                   36                    1.361111
1      3          16                   26.6667               4.2688
1      4          9                    8.3333                0.05389
2      1          55                   58                    0.155172
2      2          65                   72                    0.680556
2      3          64                   53.3333               2.1348
2      4          16                   16.67                 0.026929

X² cal = 8.991603

iii) X² critical = X²α [(c−1)(r−1)] = X²0.05(3 × 1) = X²0.05(3) = 7.815

iv) Decision Reject Ho: X2cal ≥ X2α [(c-1) (r-1)]

8.991 > 7.815 Reject Ho


v) Conclusion: The four trees are not equally distributed in the two forest lands
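The expected-frequency formula f̂ij = (R × C)/N and the resulting X² can be checked with a short script (the same arithmetic as the table above, not an SPSS procedure):

```python
counts = [[32, 43, 16, 9],     # area 1 (Table 21.3)
          [55, 65, 64, 16]]    # area 2

row_tot = [sum(r) for r in counts]
col_tot = [sum(c) for c in zip(*counts)]
n = sum(row_tot)

x2 = 0.0
for i, row in enumerate(counts):
    for j, f in enumerate(row):
        f_hat = row_tot[i] * col_tot[j] / n   # expected frequency (R * C)/N
        x2 += (f - f_hat) ** 2 / f_hat
print(round(x2, 2))   # compared against the critical value X2(0.05, 3) = 7.815
```

The statistic of about 8.99 exceeds 7.815, reproducing the rejection of the null hypothesis of equal distribution.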

21.3.2 Sub dividing a contingency analysis

After the X2 test declared significant differences among categories, one may wish to test which
of the categories differ in frequencies from which other categories. In the present example, the
fact that the X2 test declared significant differences does not necessarily mean that all the four
tree distributions differed in the two areas. For instance, the observed and expected frequencies
of tree species 1, 2 and 4 appear to be comparable in the two areas, whereas tree species 3 seems
to be more abundant in area 2 than in area 1: in area 1 the observed frequency of tree spp 3 is
much less than expected (16 vs 26.67), whereas in area 2 it is more than expected (64 vs 53.33).
Thus, to find out which tree distributions differed between the two areas, categories with
comparable observed and expected frequencies are grouped together, as demonstrated before,
and differences within the group are tested. In the present example, since tree species 1, 2 and 4
appear to be distributed comparably in the two areas, differences among them are tested first;
then tree spp 3 is tested for whether it occurs comparably in the two forest areas. The procedure
is as follows.

21.3.2.1 Testing for differences in the distribution of tree spp 1, 2, and 4 in the two
areas
i) Hypothesis
Ho: The distribution of tree spp1, 2, and 4 is not area dependent

HA: The distribution of tree spp1, 2, and 4 is area dependent

ii) X2 cal The calculation for X2 cal is as follows

Table 21.5 The distribution of the three tree species in the two areas

               Tree species
Area           Spp1    Spp2    Spp4    Row total
Area 1         32      43      9       84
Area 2         55      65      16      136
Column totals  87      108     25      220 (grand total)

Area   Tree spp   Observed freq. (f)   Expected freq. (f̂)   X²
1      1          32                   33.2182               0.044673
1      2          43                   41.2364               0.07542
1      4          9                    9.5455                0.031169
2      1          55                   53.7818               0.027592
2      2          65                   66.7636               0.046588
2      4          16                   16.6700               0.026929

X² cal = 0.25238

iii) X² critical = X²0.05(2 × 1) = X²0.05(2) = 5.991

iv) Decision: Accept Ho since 0.2524 < 5.991

v) Conclusion: The three tree species are equally distributed in the two forest
areas.

21.3.2.2 Testing for differences in the distribution of tree spp 3 in the two areas

i) Hypothesis
Ho: The distribution of tree spp 3 is comparable in the two areas

HA: The distribution of tree spp 3 is not comparable in the two areas

ii) X2 cal The calculation for X2 cal is as follows

Table 21.6 The distribution of tree spp 3 in the two areas

Area     Observed frequency   Expected frequency   X²
Area 1   16                   40                   14.4
Area 2   64                   40                   14.4
Total    80                   80                   28.8

X2 cal = 28.8

iii) X2 critical = X2 0.05 (1) = 3.841

iv) Decision: Reject Ho since 28.8 > 3.841

v) Conclusion: Tree species 3 is not equally distributed in the two areas.

Finally the summary report table is prepared as follows.

Table 21.7 Relative abundance (%) of the four tree species sampled from a 50 m × 50 m area in
each of the two forest lands (total number of trees counted = 300)

        Tree species
Area    Spp1     Spp2     Spp3     Spp4    Total
Area 1  32.0a    43.0a    16.0a    9.0a    100
Area 2  27.5a    32.5a    32.0b    8.0a    100

Different superscripts in the same column indicate that the corresponding % values are
significantly different.

Interpretation

The relative abundance of tree species 1, 2 and 4 is comparable in the two forest lands. However,
tree species 3 is significantly more abundant in area 2 than in area 1.

22. Appendices

Appendix Table 1 Table of areas under normal curve

Appendix Table 2 t table

Appendix Table 3a F0.05 table

Appendix Table 3b F 0.01 table

Appendix Table 4 q table

Appendix Table 5 X2 table
