Intro CH 4-1
Measures of Variation
The following table displays the price of a certain commodity in four cities.
Find the mean and median price for each city and interpret them.
City   Prices
A      30  30  30  30  30
B      28  29  30  31  32
C      10  15  30  45  50
D       0   5  30  55  60
All four data sets have a mean of 30 and a median of 30, but inspection
shows that they differ remarkably from one another.
Measures of central tendency alone therefore do not provide enough
information about the nature of the data.
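The claim about the four cities can be checked directly. The following is a minimal Python sketch using the standard-library `statistics` module; the names `prices` and `summary` are illustrative choices, not part of the original notes.

```python
import statistics

# Prices of the commodity in the four cities, copied from the table.
prices = {
    "A": [30, 30, 30, 30, 30],
    "B": [28, 29, 30, 31, 32],
    "C": [10, 15, 30, 45, 50],
    "D": [0, 5, 30, 55, 60],
}

# Mean and median for each city.
summary = {
    city: (statistics.mean(values), statistics.median(values))
    for city, values in prices.items()
}

for city, (mean, median) in summary.items():
    print(city, mean, median)
```

Every city reports a mean of 30 and a median of 30, even though the individual prices are spread out very differently.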
Thus, to get a clear picture of the data, one needs a measure of
dispersion, or variability, among the observations in the data set.
Variation or dispersion may be defined as the extent to which values are
scattered around a measure of central tendency.
Thus, a measure of dispersion tells us the extent to which the values of a
variable vary about the measure of central tendency.
Range
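The notes list the range without defining it. Assuming the usual definition (largest value minus smallest value), a short Python sketch for the four cities in the opening table might look like this; `data_range` is a hypothetical helper name.

```python
def data_range(values):
    """Range under the usual definition: largest value minus smallest value."""
    return max(values) - min(values)


# Prices of the commodity in the four cities, copied from the opening table.
prices = {
    "A": [30, 30, 30, 30, 30],
    "B": [28, 29, 30, 31, 32],
    "C": [10, 15, 30, 45, 50],
    "D": [0, 5, 30, 55, 60],
}

ranges = {city: data_range(values) for city, values in prices.items()}
print(ranges)  # {'A': 0, 'B': 4, 'C': 40, 'D': 60}
```

Unlike the mean and median, the range separates the four cities, although it uses only the two extreme values of each data set.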
Variance
Variance is the most widely used measure of dispersion; it measures the
average squared deviation of the observations from the mean.
The variance of a data set is the sum of the squared deviations of the
observations from the mean, divided by the total number of observations in
the data set.
For a population containing N elements, the population variance is denoted
by the square of the Greek letter σ (sigma), σ².
For raw data, $\sigma^2 = \dfrac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}$.
For grouped data, $\sigma^2 = \dfrac{\sum_{i=1}^{k} f_i (X_i - \mu)^2}{\sum_{i=1}^{k} f_i}$.
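The two population formulas translate almost line for line into code. The sketch below is one possible Python rendering, not a reference implementation; the function names `population_variance` and `grouped_population_variance` are hypothetical, and the small grouped data set is made up purely for illustration.

```python
import math


def population_variance(values):
    """Raw data: sum of squared deviations from the mean, divided by N."""
    n = len(values)
    mu = sum(values) / n
    return sum((x - mu) ** 2 for x in values) / n


def grouped_population_variance(midpoints, freqs):
    """Grouped data: frequency-weighted squared deviations over total frequency."""
    total = sum(freqs)
    mu = sum(f * x for x, f in zip(midpoints, freqs)) / total
    return sum(f * (x - mu) ** 2 for x, f in zip(midpoints, freqs)) / total


# The four cities from the opening table share a mean of 30.
variances = {
    "A": population_variance([30, 30, 30, 30, 30]),
    "B": population_variance([28, 29, 30, 31, 32]),
    "C": population_variance([10, 15, 30, 45, 50]),
    "D": population_variance([0, 5, 30, 55, 60]),
}
print(variances)  # {'A': 0.0, 'B': 2.0, 'C': 250.0, 'D': 610.0}

# Illustrative (made-up) grouped data: the grouped formula agrees with the
# raw formula applied to the same values written out one by one.
raw = [2, 2, 4, 4, 4, 6]
grouped = grouped_population_variance([2, 4, 6], [2, 3, 1])
print(math.isclose(grouped, population_variance(raw)))  # True
```

The four cities have identical means and medians, but their population variances differ widely.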
Example
Find the variance of: 20, 28, 40, 12, 30, 15 and 50.
a. Take the data as a population.
b. Consider it as a sample.
a. N = 7, µ = 27.86;
$\sigma^2 = \dfrac{\sum_{i=1}^{N} (X_i - \mu)^2}{N} = \dfrac{(20 - 27.86)^2 + \cdots + (50 - 27.86)^2}{7} = \dfrac{1120.86}{7} = 160.12$
b. n = 7, X̄ = 27.86;
$S^2 = \dfrac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1} = \dfrac{1120.86}{6} = 186.81$
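Both parts of this example can be cross-checked with Python's standard-library `statistics` module, where `pvariance` divides by N and `variance` divides by n − 1. This is only a verification sketch; the variable names are illustrative.

```python
import statistics

data = [20, 28, 40, 12, 30, 15, 50]

pop_var = statistics.pvariance(data)    # population variance, divides by N
sample_var = statistics.variance(data)  # sample variance, divides by n - 1

print(round(pop_var, 2), round(sample_var, 2))  # 160.12 186.81
```

The only difference between the two results is the divisor (7 versus 6); the sum of squared deviations is the same in both cases.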
Standard Deviation
The first main demerit of variance is that its unit is the square of the unit of
measurement of the variable values.
For example, the sample variance of 2 m, 6 m and 4 m is 4 m².
Interpreting this as "on average, each value differs from the mean by 4 m²"
is wrong, because the unit of measurement of the variance is not the same
as that of the data set.
The other disadvantage of variance is that the variation in the data is
exaggerated, because the deviation of each value from the mean is squared.
In the given example, the variation is exaggerated from two to four because
the deviations are squared.
Variance also gives more weight to extreme values than to those near the
mean.
Standard deviation is the positive square root of variance.
It is considered to be the best measure of dispersion because its unit of
measurement is the same as that of the data set, and the exaggeration
introduced by the variance is removed by taking the square root.
In simple words, it explains the average amount of variation on either side
of the mean.
If the standard deviation of the data is small, the values are concentrated
near the mean; if it is large, the values are scattered away from the mean.
For a population containing N elements, the population standard deviation is
denoted by the Greek letter σ (sigma).
For raw data, $\sigma = \sqrt{\sigma^2} = \sqrt{\dfrac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}}$.
For grouped data, $\sigma = \sqrt{\sigma^2} = \sqrt{\dfrac{\sum_{i=1}^{k} f_i (X_i - \mu)^2}{\sum_{i=1}^{k} f_i}}$.
Example 1
Find the standard deviation of: 20, 28, 40, 12, 30, 15 and 50.
a. Take the data as a population.
b. Consider it as a sample.
a. ⇒ $\sigma = \sqrt{160.12} = 12.65$
b. ⇒ $S = \sqrt{186.81} = 13.67$
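As a quick check of Example 1, the sketch below uses `statistics.pstdev` and `statistics.stdev` from the Python standard library and confirms that each is the positive square root of the corresponding variance. The variable names are illustrative.

```python
import math
import statistics

data = [20, 28, 40, 12, 30, 15, 50]

pop_sd = statistics.pstdev(data)    # population standard deviation
sample_sd = statistics.stdev(data)  # sample standard deviation

# Each standard deviation is the positive square root of its variance.
same_as_root_pop = math.isclose(pop_sd, math.sqrt(statistics.pvariance(data)))
same_as_root_sample = math.isclose(sample_sd, math.sqrt(statistics.variance(data)))

print(round(pop_sd, 2), round(sample_sd, 2))  # 12.65 13.67
```

Both results are in the same unit as the data, which is what makes them easier to interpret than the variance.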
Example 2
Find the variance and standard deviation of the students' score data.
The calculations needed for the variance are as follows, with X̄ = 25.64:
Class boundaries   Xi     fi   Xi − X̄    (Xi − X̄)²   fi (Xi − X̄)²
10.5-14.5          12.5    4   -13.14    172.6596     690.6384
14.5-18.5          16.5    7    -9.14     83.5396     584.7772
18.5-22.5          20.5    8    -5.14     26.4196     211.3568
22.5-26.5          24.5   10    -1.14      1.2996      12.9960
26.5-30.5          28.5   12     2.86      8.1796      98.1552
30.5-34.5          32.5    7     6.86     47.0596     329.4172
34.5-38.5          36.5    8    10.86    117.9396     943.5168
Total                     56                         2870.8576
$\sigma^2 = \dfrac{\sum f_i (X_i - \bar{X})^2}{\sum f_i} = \dfrac{2870.8576}{56} = 51.27$, hence $\sigma = \sqrt{51.27} = 7.16$
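The grouped calculation in Example 2 can be reproduced with a few lines of Python. This is a sketch that takes the class midpoints and frequencies from the table; the variable names are illustrative.

```python
import math

# Class midpoints (Xi) and frequencies (fi) from the table.
midpoints = [12.5, 16.5, 20.5, 24.5, 28.5, 32.5, 36.5]
freqs = [4, 7, 8, 10, 12, 7, 8]

total = sum(freqs)  # total frequency, 56
mean = sum(f * x for x, f in zip(midpoints, freqs)) / total

# Frequency-weighted sum of squared deviations from the mean.
sum_sq = sum(f * (x - mean) ** 2 for x, f in zip(midpoints, freqs))

variance = sum_sq / total
sd = math.sqrt(variance)

print(round(mean, 2), round(variance, 2), round(sd, 2))  # 25.64 51.27 7.16
```

The code keeps the unrounded mean, so its sum of squared deviations (about 2870.857) differs slightly from the table's 2870.8576, which uses the mean rounded to 25.64; the final results agree to two decimal places.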
Coefficient of Variation
The coefficient of variation (CV) is a relative measure of dispersion.
It is the ratio of the standard deviation to the mean, expressed as a percentage.
Hence, it is a unitless measure of variation that also takes into account the
size of the means of the distributions.
For population: $CV = \dfrac{\sigma}{\mu} \times 100\%$
For sample: $CV = \dfrac{S}{\bar{X}} \times 100\%$
The distribution with the smaller CV is said to be less variable, more
consistent, or more uniform. For field experiments, the CV is generally reported.
A small CV indicates greater reliability of the experimental findings.
Example
Compare the variability of the following two sample data sets using standard
deviation and coefficient of variation:
A. 2 Meters, 4 Meters, 6 Meters
B. 600 Liters, 400 Liters, 500 Liters
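One way to work through this comparison is sketched below in Python, treating both data sets as samples as the example states. The helper name `coefficient_of_variation` is hypothetical.

```python
import statistics


def coefficient_of_variation(sample):
    """Sample CV in percent: S / X-bar * 100."""
    return statistics.stdev(sample) / statistics.mean(sample) * 100


a = [2, 4, 6]        # meters
b = [600, 400, 500]  # liters

sd_a, sd_b = statistics.stdev(a), statistics.stdev(b)
cv_a, cv_b = coefficient_of_variation(a), coefficient_of_variation(b)

print(round(sd_a, 1), round(cv_a, 1))  # 2.0 50.0
print(round(sd_b, 1), round(cv_b, 1))  # 100.0 20.0
```

The standard deviations (2 m and 100 L) are in different units and cannot be compared directly. The unitless CV can be compared: 20% for B against 50% for A, so B is the less variable data set.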
Exercise: The average IQ of statistics students is 110 with standard
deviation 5 and the average IQ of mathematics students is 106 with standard
deviation 4. Which class is less variable in terms of IQ?
May 30, 2023
Chapter 4 Measures of Variation
Standard Score
The standard score (Z-score) tells us how many standard deviations a given
value is above or below the mean, depending on whether the Z-score is
positive or negative.
For population: $Z = \dfrac{X - \mu}{\sigma}$
For sample: $Z = \dfrac{X - \bar{X}}{S}$
Example
Suppose Yoseph scored 90 on a test in which the mean and standard
deviation of the class were 70 and 10, respectively. In another test, Helen
scored 600, where the mean and standard deviation of the class were 560
and 40, respectively. Who is better off relative to his/her class?
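The comparison can be sketched in Python by applying the Z-score formula to each student; `z_score` is a hypothetical helper name.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd


z_yoseph = z_score(90, 70, 10)   # (90 - 70) / 10
z_helen = z_score(600, 560, 40)  # (600 - 560) / 40

print(z_yoseph, z_helen)  # 2.0 1.0
```

Yoseph is two standard deviations above his class mean and Helen is one standard deviation above hers, so Yoseph did better relative to his class even though Helen's raw score is higher.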