Intro CH 4-1


Chapter 4

Chapter 4: Measures of Variation

May 30, 2023


Measures of Variation

In the previous chapter, we concentrated on central values (measures of
central tendency), which give an idea of the whole mass, that is, of a
complete set of values.
However, the information so obtained is neither exhaustive nor
comprehensive, as the mean does not lead us to know whether the
observations are close to each other or far apart.
Median is a positional average and has nothing to do with the variability of
the observations in a data set.
This leads us to conclude that a measure of central tendency is not enough
to give a clear idea about the data unless all observations are the same.
Moreover, two or more data sets may have the same mean and/or median
but they may be quite different.

The following table displays the price of a certain commodity in four cities.
Find the mean and median prices for each of the four cities and interpret them.

City   Prices
A      30  30  30  30  30
B      28  29  30  31  32
C      10  15  30  45  50
D       0   5  30  55  60

All four data sets have a mean of 30 and a median of 30. But by inspection
it is apparent that the four data sets differ remarkably from one another.
So measures of central tendency alone do not provide enough information
about the nature of the data.
Thus, to have a clear picture of the data, one needs to have a measure of
dispersion or variability among observations in the data set.
Variation or dispersion may be defined as the extent to which the values are
scattered around a measure of central tendency.
Thus, a measure of dispersion tells us the extent to which the values of a
variable vary about the measure of central tendency.
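The point made by the four-city table can be checked with a minimal Python sketch. The dictionary name `prices`, the `summary` structure and the `max_minus_min` key are illustrative choices, not part of the original slides; the price values are copied from the table.

```python
from statistics import mean, median

# Prices of the commodity in the four cities, copied from the table.
prices = {
    "A": [30, 30, 30, 30, 30],
    "B": [28, 29, 30, 31, 32],
    "C": [10, 15, 30, 45, 50],
    "D": [0, 5, 30, 55, 60],
}

# For each city, record the two central values and a crude spread
# (largest value minus smallest value).
summary = {}
for city, values in prices.items():
    summary[city] = {
        "mean": mean(values),
        "median": median(values),
        "max_minus_min": max(values) - min(values),
    }

for city, stats in summary.items():
    print(city, stats)
```

Every city reports the same mean and median, while the difference between the largest and smallest price grows from city A to city D, which is exactly the information the central values hide.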


Objectives of Measures of Variation


1 To have an idea about the reliability of the measures of central
  tendency.
  If the degree of scatter is large, an average is less reliable.
  If the value of the variation is small, it indicates that a central value is
  a good representative of all the values in the data set.
2 To compare two or more sets of data with regard to their variability.
  Two or more data sets can be compared by calculating the same measure of
  variation, expressed in the same units of measurement.
  A set with a smaller value possesses less variability, that is, it is more
  uniform (or more consistent).
3 To provide information about the structure of the data.
  The value of a measure of variation gives an idea about the spread of the
  observations. Further, one can form an idea of the limits within which the
  values in the data set extend.
4 To pave the way for the use of other statistical measures.
  Measures of variation, especially variance and standard deviation, lead to
  many statistical techniques such as correlation, regression and analysis of
  variance.


Types of Measures of Variation

Absolute Measures of Variation:


A measure of variation is said to be in absolute form when it shows the actual
amount of variation of the items from a measure of central tendency and is
expressed in the same concrete units as the data.
As a result, if two or more distributions differ in their units of measurement,
their variability cannot be compared by using any absolute measure of
variation.
The size of the absolute measures of dispersion depends upon the size of the
values in the data.
That is, if the size of the values is larger, the value of the absolute measures
will also be larger.
Therefore, an absolute measure of variation is not appropriate for comparing
two or more groups if the size of the values is not the same across the
groups.


Relative Measures of Variation:


A relative measure of variation is the quotient obtained by dividing the
absolute measure by the quantity with respect to which the absolute deviation
has been computed.
It is a unitless pure number and is used for making comparisons between
different distributions.

Absolute Measures                  Relative Measures
Range                              Coefficient of Range
Variance and Standard Deviation    Coefficient of Variation
                                   Standard Scores

Before giving the details of these measures of dispersion, it is worthwhile to
point out that a measure of dispersion (variation) is to be judged on the
basis of all the properties of a good measure of central tendency.


Range

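Range is the simplest and crudest measure of dispersion. It is defined as the
difference between the largest and the smallest values in the data.
For raw data: $R = L - S$, where $L$ is the largest and $S$ the smallest value.
For grouped data: $R = \mathrm{UCL}_{\text{last}} - \mathrm{LCL}_{\text{first}}$,
that is, the upper class limit of the last class minus the lower class limit
of the first class.
Range hardly satisfies any property of a good measure of dispersion, as it is
based on the two extreme values only, ignoring the others.
It is also not amenable to further algebraic treatment.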
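The two range formulas can be sketched in Python as follows. The function names `range_raw` and `range_grouped` and the class limits in the usage lines are hypothetical, chosen only for illustration; the raw values are the city C prices from the earlier table.

```python
def range_raw(values):
    """Range of raw data: largest value minus smallest value (R = L - S)."""
    return max(values) - min(values)


def range_grouped(class_limits):
    """Range of grouped data.

    class_limits is a list of (lower_limit, upper_limit) pairs in ascending
    order. The range is the upper limit of the last class minus the lower
    limit of the first class.
    """
    first_lower = class_limits[0][0]
    last_upper = class_limits[-1][1]
    return last_upper - first_lower


# City C prices from the table of four cities.
raw_range = range_raw([10, 15, 30, 45, 50])

# Hypothetical class limits, assumed only for this illustration.
grouped_range = range_grouped([(11, 14), (15, 18), (19, 22)])

print(raw_range, grouped_range)
```

Both functions look only at the extremes, which illustrates why the range ignores everything that happens between the smallest and largest values.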


Variance

Variance is a superior and widely used measure of dispersion; it measures the
average squared deviation of the observations from the mean.
The variance of a data set is the sum of the squared deviations of the
observations from the mean, divided by the total number of observations in
the data set.
For a population containing N elements, the population variance is denoted
by the square of the Greek letter σ (sigma), $\sigma^2$.
For raw data:
\[ \sigma^2 = \frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N} \]
For grouped data:
\[ \sigma^2 = \frac{\sum_{i=1}^{k} f_i (X_i - \mu)^2}{\sum_{i=1}^{k} f_i} \]


For a sample of n elements, the sample variance is denoted by $S^2$ and
calculated using the formula:
For raw data:
\[ S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1} \]
For grouped data:
\[ S^2 = \frac{\sum_{i=1}^{k} f_i (X_i - \bar{X})^2}{\sum_{i=1}^{k} f_i - 1} \]


Example
Find the variance of: 20, 28, 40, 12, 30, 15 and 50.
a. Take the data as a population.
b. Consider it as a sample.
a. N = 7, µ = 27.86:
\[ \sigma^2 = \frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}
            = \frac{(20 - 27.86)^2 + \cdots + (50 - 27.86)^2}{7}
            = \frac{1120.86}{7} = 160.12 \]
b. n = 7, X̄ = 27.86:
\[ S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}
       = \frac{(20 - 27.86)^2 + \cdots + (50 - 27.86)^2}{6}
       = \frac{1120.86}{6} = 186.81 \]
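The two results in this example can be reproduced with a short Python sketch. The helper names `sum_sq_dev`, `population_variance` and `sample_variance` are illustrative and not taken from the slides; the data are the seven values of the example.

```python
data = [20, 28, 40, 12, 30, 15, 50]


def sum_sq_dev(values):
    """Sum of squared deviations of the values from their mean."""
    m = sum(values) / len(values)
    return sum((x - m) ** 2 for x in values)


def population_variance(values):
    """Divide the sum of squared deviations by N."""
    return sum_sq_dev(values) / len(values)


def sample_variance(values):
    """Divide the sum of squared deviations by n - 1."""
    return sum_sq_dev(values) / (len(values) - 1)


print(round(sum_sq_dev(data), 2))           # numerator shared by both parts
print(round(population_variance(data), 2))  # part a
print(round(sample_variance(data), 2))      # part b
```

The only difference between the two parts is the divisor, N for the population and n − 1 for the sample, so the sample variance is always the larger of the two for the same data.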


Standard Deviation

The first main demerit of variance is that its unit is the square of the unit
of measurement of the variable values.
For example, the sample variance of 2 m, 6 m and 4 m is 4 m².
Interpreting this as "on average, each value differs from the mean by 4 m²" is
wrong, because the unit of measurement of the variance is not the same as
that of the data set.
The other disadvantage of variance is that the variation of the data is
exaggerated, because the deviation of each value from the mean is squared.
In the given example, the variation is exaggerated from two to four by
squaring the deviations.
Variance also gives more weight to the extreme values than to those that are
near the mean.

Standard deviation is the positive square root of variance.
It is considered to be the best measure of dispersion because its unit of
measurement is the same as that of the data set, and the exaggeration
introduced by the variance is removed by taking the square root.
In simple words, it describes the average amount of variation on either side
of the mean.
If the standard deviation of the data is small, the values are concentrated
near the mean; if it is large, the values are scattered away from the mean.
For a population containing N elements, the population standard deviation is
denoted by the Greek letter σ (sigma).
For raw data:
\[ \sigma = \sqrt{\sigma^2} = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}} \]
For grouped data:
\[ \sigma = \sqrt{\sigma^2} = \sqrt{\frac{\sum_{i=1}^{k} f_i (X_i - \mu)^2}{\sum_{i=1}^{k} f_i}} \]


For a sample of n elements, the sample standard deviation, denoted by $S$, can
be calculated using the formula:
For raw data:
\[ S = \sqrt{S^2} = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}} \]
For grouped data:
\[ S = \sqrt{S^2} = \sqrt{\frac{\sum_{i=1}^{k} f_i (X_i - \bar{X})^2}{\sum_{i=1}^{k} f_i - 1}} \]

Example 1
Find the standard deviation of: 20, 28, 40, 12, 30, 15 and 50.
a. Take the data as a population.
b. Consider it as a sample.

a. ⇒ σ = √160.12 = 12.65
b. ⇒ S = √186.81 = 13.67
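As a cross-check, the same two answers can be obtained from Python's standard `statistics` module, where `pstdev` uses the population divisor N and `stdev` uses the sample divisor n − 1. The variable names below are illustrative.

```python
from statistics import pstdev, stdev

data = [20, 28, 40, 12, 30, 15, 50]

# Population standard deviation: square root of the variance with divisor N.
sigma = pstdev(data)

# Sample standard deviation: square root of the variance with divisor n - 1.
s = stdev(data)

print(round(sigma, 2), round(s, 2))
```

Unlike the variance, both results are in the same unit as the original observations.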


Example 2
Find the variance and standard deviation of the students' score data.
The calculations needed for the variance are as follows, with X̄ = 25.64:

Class boundaries   Xi     fi    Xi − X̄    (Xi − X̄)²    fi(Xi − X̄)²
10.5-14.5          12.5    4    -13.14    172.6596      690.6384
14.5-18.5          16.5    7     -9.14     83.5396      584.7772
18.5-22.5          20.5    8     -5.14     26.4196      211.3568
22.5-26.5          24.5   10     -1.14      1.2996       12.9960
26.5-30.5          28.5   12      2.86      8.1796       98.1552
30.5-34.5          32.5    7      6.86     47.0596      329.4172
34.5-38.5          36.5    8     10.86    117.9396      943.5168
Total                     56                           2870.8576

\[ \sigma^2 = \frac{\sum f_i (X_i - \bar{X})^2}{\sum f_i} = \frac{2870.8576}{56} = 51.27,
   \quad \text{hence } \sigma = \sqrt{51.27} = 7.16 \]
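The grouped-data calculation can be sketched in Python using the class midpoints and frequencies from the table. The variable names are illustrative, and the sketch follows the example in dividing by the total frequency (the population form); for the sample form the divisor would be the total frequency minus one.

```python
import math

# Class midpoints (Xi) and frequencies (fi) from the table.
midpoints = [12.5, 16.5, 20.5, 24.5, 28.5, 32.5, 36.5]
freqs = [4, 7, 8, 10, 12, 7, 8]

n = sum(freqs)

# Weighted mean of the midpoints.
grouped_mean = sum(f * x for f, x in zip(freqs, midpoints)) / n

# Frequency-weighted sum of squared deviations, divided by the total
# frequency as in the worked example.
grouped_var = sum(f * (x - grouped_mean) ** 2 for f, x in zip(freqs, midpoints)) / n
grouped_sd = math.sqrt(grouped_var)

print(n, round(grouped_mean, 2), round(grouped_var, 2), round(grouped_sd, 2))
```

The code keeps the mean at full precision instead of rounding it to 25.64 first; to two decimal places the variance and standard deviation agree with the hand calculation.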


Coefficient of Variation
Coefficient of variation is a relative measure based on the standard deviation.
It is the ratio of the standard deviation to the mean, expressed as a
percentage.
Hence, it is a unitless measure of variation and also takes into account the
size of the means of the distributions.
For population:
\[ CV = \frac{\sigma}{\mu} \times 100\% \]
For sample:
\[ CV = \frac{S}{\bar{X}} \times 100\% \]
The distribution with the smaller CV is said to be less variable, more
consistent or more uniform. For field experiments, the CV is generally
reported.
If it is small, it indicates greater reliability of the experimental findings.
Example
Compare the variability of the following two sample data sets using standard
deviation and coefficient of variation:
A. 2 Meters, 4 Meters, 6 Meters
B. 600 Liters, 400 Liters, 500 Liters
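One way to work through this comparison is the following Python sketch, which treats both data sets as samples, as the example states. The function name `coefficient_of_variation` and the variable names are illustrative and not part of the slides.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Sample CV in percent: 100 * S / mean."""
    return 100 * stdev(values) / mean(values)


lengths_m = [2, 4, 6]            # data set A, in meters
volumes_l = [600, 400, 500]      # data set B, in liters

sd_a, sd_b = stdev(lengths_m), stdev(volumes_l)
cv_a = coefficient_of_variation(lengths_m)
cv_b = coefficient_of_variation(volumes_l)

print(sd_a, sd_b)   # standard deviations, in meters and liters respectively
print(cv_a, cv_b)   # coefficients of variation, in percent
```

The standard deviations are in different units (meters and liters) and so cannot be compared directly, whereas the coefficients of variation are unitless: data set B has the smaller CV and is therefore the less variable of the two.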
Exercise: The average IQ of statistics students is 110 with standard
deviation 5 and the average IQ of mathematics students is 106 with standard
deviation 4. Which class is less variable in terms of IQ?

Standard Score

The standard score (Z-score) tells us how many standard deviations a given
value is above or below the mean, according to whether the Z-score is
positive or negative.
For population:
\[ Z = \frac{X - \mu}{\sigma} \]
For sample:
\[ Z = \frac{X - \bar{X}}{S} \]
Example
Suppose Yoseph scored 90 on a test in which the mean and standard
deviation of the class were 70 and 10 respectively. In another test, Helen
scored 600 on which the mean and standard deviation of the class were 560
and 40 respectively. Who is better off relative to his/her class?
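The comparison can be sketched in Python by converting each score to a Z-score with the formula above. The function name `z_score` and the variable names are illustrative; the scores, means and standard deviations are those given in the example.

```python
def z_score(value, mean, sd):
    """Number of standard deviations that value lies above (+) or below (-) the mean."""
    return (value - mean) / sd


z_yoseph = z_score(90, 70, 10)     # first test
z_helen = z_score(600, 560, 40)    # second test

print(z_yoseph, z_helen)
```

Yoseph's score lies two standard deviations above his class mean and Helen's lies one standard deviation above hers, so relative to their classes Yoseph performed better even though Helen's raw score is higher.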
