
Module 8:

Normal Probability
Distributions
October
7: Normal Probability Distributions 1
21
Normal Probability Density Function

• Continuous random variables are described using probability density function (pdf) curves
• Normal pdfs are characterized by their typical bell shape



Area Under the Curve
• pdfs should be viewed almost like a histogram
• Top Figure: The darker bars of the histogram correspond to ages ≤ 9 (~40% of distribution)
• Bottom Figure: shaded area under the curve (AUC) corresponds to ages ≤ 9 (~40% of area)

$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}$
Parameters μ and σ
• Normal pdfs have two parameters
μ - expected value (mean “mu”)
σ - standard deviation (sigma)

μ controls location; σ controls spread


Mean and Standard Deviation of
Normal Density



Standard Deviation σ
• Points of inflection lie one σ below and one σ above μ


68-95-99.7 Rule for
Normal Distributions
• 68% of the AUC within ±1σ of μ
• 95% of the AUC within ±2σ of μ
• 99.7% of the AUC within ±3σ of μ



Example: 68-95-99.7 Rule
Examination Score in Statistics: Normally distributed with μ = 100 and σ = 15; X ~ N(100, 15)
• 68% of scores within μ ± σ = 100 ± 15 = 85 to 115
• 95% of scores within μ ± 2σ = 100 ± (2)(15) = 70 to 130
• 99.7% of scores within μ ± 3σ = 100 ± (3)(15) = 55 to 145
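The 68-95-99.7 figures can be checked numerically. The following is a minimal sketch using Python's standard-library `statistics.NormalDist` with the exam-score parameters from the example; the helper name `area_within` is made up for illustration.

```python
from statistics import NormalDist

# Exam scores from the example: X ~ N(mu=100, sigma=15)
scores = NormalDist(mu=100, sigma=15)

def area_within(dist, k):
    """Area under the curve within k standard deviations of the mean."""
    lo = dist.mean - k * dist.stdev
    hi = dist.mean + k * dist.stdev
    return dist.cdf(hi) - dist.cdf(lo)

for k in (1, 2, 3):
    lo, hi = 100 - 15 * k, 100 + 15 * k
    print(f"{lo} to {hi}: {area_within(scores, k):.4f}")
```

The exact areas are slightly different from the rounded 68%, 95%, and 99.7% values, which is why the rule is only a rule of thumb.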
Symmetry in the Tails
Because the Normal curve is symmetrical and the total AUC is exactly 1, we can easily determine the AUC in the tails. For example, if the central region holds 95% of the area, the two tails share the remaining 5%, or 2.5% each.
Example: Male Height
• Male height: Normal with μ = 70.0˝ and σ = 2.8˝
• 68% within μ ± σ = 70.0 ± 2.8 = 67.2 to 72.8
• 32% in tails (below 67.2˝ and above 72.8˝)
• 16% below 67.2˝ and 16% above 72.8˝ (symmetry)

Characteristics of the Standard
Normal Distribution

• Mean = 0
• Standard deviation = 1
• Total area under the curve = 1 (total probability)
• Z ~ N(0, 1)

1
MCC 202, SPUP, 1st Trimester 2021-2022
1
Table A.0

• Gives the area, or probability, that the data lie between the mean 0 and z standard deviations above it.
Table A.1. Used to determine the area under the curve for Z≤ z.
You are actually taking the cumulative probability up to the z-score

Determining Normal Probabilities

When values do not fall directly on σ landmarks:

1. State the problem
2. Standardize the value(s) (z score) using $z = \frac{x - \mu}{\sigma}$
3. Sketch, label, and shade the curve
4. Use Table
Step 1: State the Problem
• What percentage of gestations are less
than 40 weeks?
• Let X ≡ gestational length
• We know from prior research:
X ~ N(39, 2) weeks
• Pr(X ≤ 40) = ?
Step 2: Standardize
• Standard Normal
variable ≡ “Z” ≡ a
Normal random variable
with μ = 0 and σ = 1,
• Z ~ N(0,1)
• Use Table to look up
cumulative
probabilities for Z
Example: A Z
variable of 1.96 has
cumulative
probability 0.9750.
Step 2 (cont.)
Turn value into z score: $z = \frac{x - \mu}{\sigma}$

z-score = no. of σ-units above (positive z) or below (negative z) distribution mean μ

For example, the value 40 from X ~ N(39, 2) has

$z = \frac{40 - 39}{2} = 0.5$
Steps 3 & 4: Sketch & Table
3. Sketch
4. Use Table A.1 to look up Pr(Z ≤ 0.5) = 0.6915
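The four steps for the gestation example can be sketched in a few lines of Python. This is an illustrative sketch only: it uses the standard-library `NormalDist().cdf` in place of the printed table lookup, and the variable names are assumptions.

```python
from statistics import NormalDist

# Step 1: X ~ N(39, 2) weeks; we want Pr(X <= 40)
mu, sigma = 39, 2
x = 40

# Step 2: standardize
z = (x - mu) / sigma

# Step 4: cumulative probability Pr(Z <= z), replacing the table lookup
p = NormalDist().cdf(z)

print(f"z = {z}, Pr(X <= {x}) = {p:.4f}")
```

About 69% of gestations are therefore expected to be 40 weeks or shorter under this model.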
Probabilities Between Points
a represents a lower boundary
b represents an upper boundary
Pr(a ≤ Z ≤ b) = Pr(Z ≤ b) - Pr(Z ≤ a)

Between Two Points
Pr(-2 ≤ Z ≤ 0.5) = Pr(Z ≤ 0.5) − Pr(Z ≤ -2)
.6687 = .6915 − .0228

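The subtraction rule for an interval can be reproduced the same way. A minimal sketch, assuming the standard-library `NormalDist` stands in for the table:

```python
from statistics import NormalDist

Z = NormalDist()          # standard Normal: mean 0, standard deviation 1
a, b = -2.0, 0.5          # lower and upper boundaries

# Pr(a <= Z <= b) = Pr(Z <= b) - Pr(Z <= a)
p_between = Z.cdf(b) - Z.cdf(a)

print(f"{Z.cdf(b):.4f} - {Z.cdf(a):.4f} = {p_between:.4f}")
```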
Values Corresponding to Normal
Probabilities
1. State the problem
2. Find Z-score corresponding to percentile (Table B)
3. Sketch
4. Unstandardize:
$x = \mu + z_p \sigma$
z percentiles

▪ zp ≡ the Normal z variable with cumulative probability p


▪ Use Table A to look up the value of zp
▪ Look inside the table for the closest cumulative probability
entry
▪ Trace the z score to row and column

e.g., What is the 97.5th
percentile on the
Standard Normal curve?
z.975 = 1.96

Notation: Let zp represent the z score with cumulative probability p, e.g., z.975 = 1.96
Step 1: State Problem
Question: What gestational length is smaller than 97.5% of
gestations?
• Let X represent gestational length
• We know from prior research that X ~ N(39, 2)
• A value that is smaller than .975 of gestations has a cumulative probability of .025
Step 2 (z percentile)
Less than 97.5% (right tail) =
greater than 2.5% (left tail)
z lookup:
z.025 = −1.96
z      .00    .01    .02    .03    .04    .05    .06    .07    .08    .09
−1.9   .0287  .0281  .0274  .0268  .0262  .0256  .0250  .0244  .0239  .0233
Unstandardize and sketch
$x = \mu + z_p \sigma = 39 + (-1.96)(2) \approx 35$

The 2.5th percentile is 35 weeks
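Going from a probability back to a value can be sketched with the inverse cumulative function. This assumes the standard-library `NormalDist.inv_cdf` as a stand-in for reading the table backwards; the variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma = 39, 2      # gestational length, X ~ N(39, 2)
p = 0.025              # cumulative probability of the value we want

z_p = NormalDist().inv_cdf(p)   # z percentile, about -1.96
x = mu + z_p * sigma            # unstandardize

print(f"z_p = {z_p:.2f}, x = {x:.2f} weeks")
```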
Assessing Departures
from Normality
Figure panels: approximately Normal histogram; same distribution on a Normal “Q-Q” plot

Normal distributions adhere to the diagonal line on a Q-Q plot
Negative Skew

Negative skew shows an upward curve on the Q-Q plot
Positive Skew

Positive skew shows downward curve on Q-Q plot


Same data as prior slide with
logarithmic transformation

The log transform Normalizes the skew
Leptokurtic

A leptokurtic distribution shows an S-shape on the Q-Q plot
Measures of Skewness and
Kurtosis

Skewness refers to the distortion or asymmetry of a data set's distribution relative to the symmetrical bell curve, or normal distribution.
A normal distribution has 0 skewness.
+ skewness = skewed to the right
− skewness = skewed to the left
Skewness

Skewed Left: long tail points left
Symmetric (Normal): tails are balanced
Skewed Right: long tail points right

Figure 1. Sketches showing general position of mean, median, and mode in a population.
- Fisher-Pearson’s Coefficient of Skewness

$g_1 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^3 / n}{s^3}$
where: s has divisor n
Fisher-Pearson’s Coefficient of
Skewness

• For small samples
• $G_1 = \frac{\sqrt{n(n-1)}}{n-2} \cdot \frac{\sum_{i=1}^{n} (X_i - \bar{X})^3 / n}{s^3}$
Table 1. 90% range for sample skewness coefficient G1.
n Lower Limit Upper Limit n Lower Limit Upper Limit
25 -0.726 0.726 90 -0.411 0.411
30 -0.673 0.673 100 -0.391 0.391
40 -0.594 0.594 150 -0.322 0.322
50 -0.539 0.539 200 -0.281 0.281
60 -0.496 0.496 300 -0.230 0.230
70 -0.462 0.462 400 -0.200 0.200
80 -0.435 0.435 500 -0.179 0.179
Source: David P. Doane and Lori E. Seward (2011), Applied Statistics in Business and Economics,
3e, (McGraw-Hill), p. 155. The table is adapted from E. S. Pearson and H. O. Hartley, Biometrika
Tables for Statisticians, 3rd Edition, Cambridge University Press, 1970, page 207 using an
adjustment for sample size. Values outside this range would suggest a non-normal population. Table used with permission.
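The two Fisher-Pearson coefficients can be computed directly from their definitions. The sketch below assumes the usual small-sample adjustment factor √(n(n−1))/(n−2); the function names and the five data values are made up purely for illustration.

```python
import math

def skewness_g1(data):
    """Fisher-Pearson coefficient g1, with s computed using divisor n."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n   # s squared, divisor n
    m3 = sum((x - mean) ** 3 for x in data) / n
    return m3 / m2 ** 1.5

def skewness_G1(data):
    """Adjusted coefficient G1 for small samples (requires n > 2)."""
    n = len(data)
    return math.sqrt(n * (n - 1)) / (n - 2) * skewness_g1(data)

sample = [1, 2, 3, 4, 10]   # hypothetical right-skewed data
print(f"g1 = {skewness_g1(sample):.4f}, G1 = {skewness_G1(sample):.4f}")
```

A positive result indicates a long right tail. The value of G1 would then be compared against the 90% range in Table 1 for the matching n.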
Pearson 2 Skewness Coefficient

• $Sk_2 = \frac{3(\bar{X} - \text{Median})}{s}$

Table 4. 90% expected range for Pearson 2 skewness coefficient Sk2.


n Lower Limit Upper Limit n Lower Limit Upper Limit
10 –0.963 +0.963 60 –0.463 +0.463
20 –0.762 +0.762 70 –0.437 +0.437
30 –0.643 +0.643 80 –0.407 +0.407
40 –0.554 +0.554 90 –0.385 +0.385
50 –0.506 +0.506 100 –0.367 +0.367
Note: If your sample is from a normal population, the skewness coefficient Sk2 would fall within the
stated range 90 percent of the time. Values of Sk2 outside this range suggest non-normal skewness.

Figure 6. 90% expected range for Pearson 2 skewness coefficient Sk2.

Galton Skewness Formula

• Galton skewness $= \frac{Q_1 + Q_3 - 2Q_2}{Q_3 - Q_1}$
• where $Q_1$ is the lower quartile, $Q_3$ is the upper quartile, and $Q_2$ is the median.
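Both the Pearson 2 and Galton coefficients are easy to compute once the median and quartiles are known. The sketch below makes two assumptions that the slides do not state: s is taken as the sample standard deviation (divisor n − 1), and quartiles come from `statistics.quantiles` with `method="inclusive"` (other quartile conventions give slightly different values). The data are made up for illustration.

```python
import statistics

def pearson2_skewness(data):
    """Sk2 = 3 * (mean - median) / s, with s assumed to use divisor n - 1."""
    mean = statistics.mean(data)
    median = statistics.median(data)
    s = statistics.stdev(data)
    return 3 * (mean - median) / s

def galton_skewness(data):
    """(Q1 + Q3 - 2*Q2) / (Q3 - Q1), quartiles by the 'inclusive' method."""
    q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return (q1 + q3 - 2 * q2) / (q3 - q1)

sample = [2, 4, 4, 5, 7, 9, 18]   # hypothetical right-skewed data
print(f"Sk2 = {pearson2_skewness(sample):.4f}")
print(f"Galton = {galton_skewness(sample):.4f}")
```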
Measures of Kurtosis

• Kurtosis - a statistical measure that defines


how heavily the tails of a distribution differ
from the tails of a normal distribution.
• identifies whether the tails of a given
distribution contain extreme values
• Measure of Kurtosis of a normal distribution is
3
• Excess Kurtosis = Kurtosis - 3

High vs Low Kurtosis

• High kurtosis indicates heavy tails or outliers in the data.
• Low kurtosis indicates that the data have light tails or a lack of outliers.
Types of Kurtosis

Types of Kurtosis

• Mesokurtic (kurtosis = 3, or close to 3)
- kurtosis is similar to that of the normal distribution.
- the extreme values of the distribution are similar to those of a normal distribution.
Types of Kurtosis
Leptokurtic (Kurtosis > 3)
- distribution is longer, tails are fatter than the normal distribution.
- peak is higher and sharper than mesokurtic, which means that the data are heavy-tailed or have a profusion of outliers.

Platykurtic (Kurtosis < 3)
- distribution is shorter, tails are thinner than the normal distribution.
- peak is lower and broader than mesokurtic, which means that the data are light-tailed or lack outliers.
Formula for Kurtosis $= \frac{m_4}{m_2^2}$

Population Kurtosis Formula

Sample Kurtosis Formula

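The moment ratio m4 / m2² can be sketched as follows. This assumes m2 and m4 are the second and fourth central moments with divisor n (the population form); the function names and data are hypothetical.

```python
def central_moment(data, k):
    """k-th central moment with divisor n."""
    n = len(data)
    mean = sum(data) / n
    return sum((x - mean) ** k for x in data) / n

def kurtosis(data):
    """Kurtosis = m4 / m2^2; a normal distribution gives 3."""
    return central_moment(data, 4) / central_moment(data, 2) ** 2

sample = [1, 2, 3, 4, 5]          # hypothetical, evenly spread data
k = kurtosis(sample)
print(f"kurtosis = {k:.2f}, excess kurtosis = {k - 3:.2f}")
```

Evenly spread data like these have light tails, so the kurtosis falls below 3 (platykurtic).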
Modules 9-11

Parametric Test vs The
Non-Parametric Test

Properties | Parametric Test | Non-Parametric Test
Assumptions | Yes, assumptions are made | No, assumptions are not made
Value for central tendency | The mean value is the central tendency | The median value is the central tendency
Correlation | Pearson Correlation | Spearman Correlation
Probabilistic Distribution | Normal probabilistic distribution | Arbitrary probabilistic distribution
Population Knowledge | Population knowledge is required | Population knowledge is not required
Used for | Used for finding interval data | Used for finding nominal data
Application | Applicable to variables | Applicable to variables and attributes
Examples | T-test, z-test | Mann-Whitney, Kruskal-Wallis
However…

Parametric tests can also be used for skewed data sets provided:
• 1. the sample size is more than 20 for a t-test
• 2. the sample size is more than 15 for each group in a 2-sample t-test
• 3. the sample size is greater than 15 each for 2-9 groups in One-Way ANOVA
• 4. the sample size is greater than 20 each for 10-12 groups in One-Way ANOVA
Statistical Tests

Tests of Hypothesis

Hypothesis
• A statement or tentative theory which aims to explain facts about the real world
• An educated guess
• It is subject to testing. If it is found to be statistically true, it is accepted. Otherwise, it is rejected.
Kinds of Hypotheses
1. Null Hypothesis (Ho)
• It serves as the working hypothesis
• It is that which one hopes to accept or reject
•It must always express the idea of no
significant difference or relationship

2. Alternative Hypothesis (H1 or Ha)


•It generally represents the hypothetical
statement that the researcher wants to prove.
Types of Alternative Hypotheses
(Ha)
1. Directional hypothesis
➢ expresses direction
➢one – tailed
➢uses order relation of “greater than” or “less than”,

2. Non – directional hypothesis


➢ does not express direction
➢ two – tailed
➢ uses the “not equal to”

Type I and Type II Errors

➢When making a decision about a proposed


hypothesis based on the sample data, one runs the
risk of making an error. The following table on the
next slide summarizes the possibilities:

Level of Significance

➢The probability of making Type I error or alpha


error in a test is called the significance level of the
test. The significance level of a test is the maximum
value of the probability of rejecting the null
hypothesis (Ho) when in fact it is true.

Critical Region
The critical region (or rejection region) is the set of all values
of the test statistic that cause us to reject the null hypothesis.

(Figure labels: region of rejection, region of acceptance, P-value, critical value)
Critical Value

A critical value is any value that separates the critical region (where we reject the null hypothesis) from the values of the test statistic that do not lead to rejection of the null hypothesis. Critical values depend on the sampling distribution that applies and the significance level α.
P - Value
The P-value (probability value) is the probability of
getting a value of the test statistic that is at least as
extreme as the one representing the sample data,
assuming that the null hypothesis is true. The null
hypothesis is rejected if the P-value is very small,
such as 0.05 or less.

Two-tailed, Right-tailed and
Left-tailed Tests

• The tails in a distribution are the extreme


regions bounded by critical values.

Two-tailed Tests
Given:
H0: = ; H1: ≠

Right – tailed Tests
Given:
H0: = ; H1: >

Left – tailed Tests
Given:
H0: = ; H1: <

Steps in Hypothesis
Testing
1. Formulate the null hypothesis (Ho) that there is no
significant difference between the items compared. State
the alternative hypothesis (Ha) which is used in case Ho
is rejected.

2. Set the level of significance of the test, α.

3. Determine the test to be used.


❖ Z – TEST – used if the population standard deviation
is given
❖ T – TEST – used if the sample standard deviation is
given
Steps in Hypothesis Testing
4. Determine the tabular value of the test.
***For a Z – test, the table below summarizes the critical values at varying significance levels

Type of Test    Level of Significance
                0.10      0.05      0.025     0.01
One-Tailed      ±1.28     ±1.645    ±1.96     ±2.33
Two-Tailed      ±1.645    ±1.96     ±2.24     ±2.58
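Critical z values like those in the table can be regenerated from the inverse Normal cumulative function. A minimal sketch, assuming the standard-library `NormalDist.inv_cdf`; the helper `critical_z` is a made-up name.

```python
from statistics import NormalDist

Z = NormalDist()

def critical_z(alpha, tails):
    """Positive critical value: alpha is split evenly across the tails."""
    return Z.inv_cdf(1 - alpha / tails)

print("alpha   one-tailed  two-tailed")
for alpha in (0.10, 0.05, 0.025, 0.01):
    print(f"{alpha:<7} {critical_z(alpha, 1):<11.3f} {critical_z(alpha, 2):.3f}")
```

For a left-tailed test the critical value is the negative of the one-tailed value, and a two-tailed test uses both signs.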
Steps in Hypothesis Testing
4. Determine the tabular value of the test.

***For a T – test, one must first compute the degree/s of freedom (df), then look up the tabular value in the table of Student's t-distribution.

i. For a single sample
df = n – 1
ii. For two samples
df = n1 + n2 – 2
Steps in Hypothesis
Testing
5. Compute for z or t as needed. Vary your solutions using
the formulas:

➢ For z – test
i. Sample mean compared with a population mean
ii. Comparing two sample means
iii. Comparing two sample proportions

➢ For t – test
i. Sample mean compared with a population mean
ii. Comparing two sample means
Steps in Hypothesis Testing
6. Compare the computed value with its
corresponding tabular value, then state your
conclusions based on the following guidelines:

✓ Reject Ho if the absolute computed value is


equal to or greater than the absolute tabular value
✓ Accept Ho if the absolute computed value is less
than the absolute tabular value

Decision Criterion

Traditional Method:
***Reject H0 (Accept H1 ) if the test
statistic falls within the critical region.
***Fail to reject H0 (Accept Ho) if the
test statistic does not fall within the critical
region.

Decision Criterion
P - value method:

*** Reject Ho (Accept H1) if P-value ≤ α (where α is the significance level, such as 0.05)

***Fail to reject H0 (Accept Ho) if P-value > α
Decision Criterion

Another option:
Instead of using a significance level
such as 0.05, simply identify the P-value and
leave the decision to the reader.

Z - TEST
1. Sample Mean ($\bar{X}$) Compared with a Population Mean (μ)

$Z = \frac{(\bar{X} - \mu)\sqrt{n}}{\sigma}$
Where:
$\bar{X}$ – sample mean
μ – population mean
n – number of items in the sample
σ – population standard deviation
Z - TEST
2. Comparing Two Sample Means ($\bar{x}_1 - \bar{x}_2$)

$Z = \frac{\bar{x}_1 - \bar{x}_2}{\sigma\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$
Where:
$\bar{x}_1$ – mean of the first sample
$\bar{x}_2$ – mean of the second sample
n1 – number of items in the first sample
n2 – number of items in the second sample
σ – population standard deviation
Z- TEST
3. Comparing Two Sample Proportions (P1 & P2)

$Z = \frac{p_1 - p_2}{\sqrt{\frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2}}}$
Where:
p1 – proportion of the first sample
p2 – proportion of the second sample
n1 – number of items in the first sample
n2 – number of items in the second sample
q1 = 1 – p1
q2 = 1 – p2
T- TEST
4. Sample Mean ($\bar{x}$) Compared with a Population Mean (μ)

$t = \frac{(\bar{x} - \mu)\sqrt{n-1}}{s}$
Where:
$\bar{x}$ – sample mean
μ – population mean
n – number of items in the sample
s – sample standard deviation
T- TEST
5. Comparing Two Sample Means ($\bar{x}_1 - \bar{x}_2$)

$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\left(\frac{1}{n_1} + \frac{1}{n_2}\right)\left(\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}\right)}}$
Where:
$\bar{x}_1$ – mean of the first sample
$\bar{x}_2$ – mean of the second sample
$n_1$ – number of items in the first sample
$n_2$ – number of items in the second sample
s1 – standard deviation of the first sample
s2 – standard deviation of the second sample
6. Paired t-test for the Difference of Means

• $t = \frac{\bar{d}}{s / \sqrt{n}}$
• $s^2 = \frac{\sum (d - \bar{d})^2}{n - 1}$
• $d = V_{after} - V_{before}$
• n = no. of pairs
Example 1

Data from a school census show that the mean weight of college students is 45 kilos with a standard deviation of 3 kilos. A sample of 100 college students was found to have a mean weight of 47 kilos. Are the college students really heavier than the rest, using the 0.05 level of significance?
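One way to work Example 1 is sketched below. It assumes a right-tailed z-test (H1: mean weight is greater than 45), since the population standard deviation is given, and uses the standard-library `NormalDist` for the critical value and P-value. Variable names are illustrative.

```python
import math
from statistics import NormalDist

mu, sigma = 45, 3        # population mean and standard deviation (kilos)
n, xbar = 100, 47        # sample size and sample mean
alpha = 0.05             # right-tailed test: H1 says mean > 45

z = (xbar - mu) * math.sqrt(n) / sigma
z_crit = NormalDist().inv_cdf(1 - alpha)    # about 1.645
p_value = 1 - NormalDist().cdf(z)

reject_h0 = z >= z_crit
print(f"z = {z:.3f}, critical z = {z_crit:.3f}, reject Ho: {reject_h0}")
```

Both decision rules agree here: the computed z exceeds the critical value, and the P-value is far below 0.05.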
Example 2

A researcher wishes to find out whether or not there is a significant difference in the monthly allowance of morning and afternoon students in his school. By random sampling, he took a sample of 239 students in the morning session. The students were found to have a mean monthly allowance of P142.00. The researcher also took a sample of 209 students in the afternoon session. They were found to have a mean monthly allowance of P148.00. The population of students in that school has a standard deviation of P40.00. Is there a significant difference between the two samples at the 0.01 level?
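Example 2 can be approached with the two-sample z formula. The sketch below assumes a two-tailed test (the question asks only whether a difference exists) and uses `NormalDist` for the critical value; names are illustrative.

```python
import math
from statistics import NormalDist

x1, n1 = 142.0, 239      # morning session: mean allowance, sample size
x2, n2 = 148.0, 209      # afternoon session
sigma = 40.0             # population standard deviation
alpha = 0.01             # two-tailed test

z = (x1 - x2) / (sigma * math.sqrt(1 / n1 + 1 / n2))
z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # about 2.58

reject_h0 = abs(z) >= z_crit
print(f"z = {z:.3f}, critical z = ±{z_crit:.2f}, reject Ho: {reject_h0}")
```

Under these assumptions the absolute computed value is below the tabular value, so the difference is not significant at the 0.01 level.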
Example 3

A sample survey of television programs in Tuguegarao shows that 80 out of 200 men and 75 out of 250 women dislike the “Juan dela Cruz” program. We would like to know whether the difference between the two sample proportions, 80/200 = 0.40 and 75/250 = 0.30, is significant or not at the 0.05 level.
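Example 3 follows the two-proportion z formula. A minimal sketch, assuming a two-tailed test at the 0.05 level; variable names are illustrative.

```python
import math
from statistics import NormalDist

n1, n2 = 200, 250
p1, p2 = 80 / n1, 75 / n2        # 0.40 and 0.30
q1, q2 = 1 - p1, 1 - p2
alpha = 0.05                     # two-tailed test

z = (p1 - p2) / math.sqrt(p1 * q1 / n1 + p2 * q2 / n2)
z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96

reject_h0 = abs(z) >= z_crit
print(f"z = {z:.3f}, critical z = ±{z_crit:.2f}, reject Ho: {reject_h0}")
```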
Example 4
A researcher knows that the average height of
Filipino women is 1.525 meters. A random sample
of 26 women was taken and was found to have a
mean height of 1.56 meters, with a standard
deviation of 0.10 meters. Is there reason to believe
that the 26 women are significantly taller than the
rest using the 0.05 level of significance?
Example 5

Beta company is manufacturing steel wire


with an average tensile strength of 50 kilos. The
laboratory tests 16 pieces and finds that the mean is
47 kilos with a standard deviation of 15 kilos. Are
the results in accordance with the hypothesis that
the population mean is 50 kilos?

Example 6
It is known from the records of the city
schools that the standard deviation of math test
scores on ABC test is 5. A sample of 200 students
from the system was taken and it was found out that
the sample mean is 75. Previous tests showed the
population mean to be 70. Is it safe to conclude that
the sample is significantly different from the
population at 0.01 level?
Example 7
• Two types of rice varieties are being considered for
yield and a comparison is needed. Thirty hectares were
planted with the rice varieties exposed to fairly uniform
conditions. The results are tabulated below:
Variety A Variety B
Average yield 80 sack/hec 85 sack/hec
Sample Variance 5.90 12.10
Is there a significant difference in the yield of the two varieties at the 0.05 level of significance?
Example 8
A manufacturer of flashlight batteries claims
that the average life of his product will exceed 40
hours. A company is willing to buy a very large
shipment of batteries provided the claim is true. A
random sample of 36 batteries is tested, and it was
found out that the sample mean is 45 hours. If the
population of batteries has a standard deviation of 5
hours, is it likely that the batteries will be bought?
Example 9
A company is trying to decide which of two brands of tires to buy for their trucks. They would like to adopt Brand C unless there is some evidence that Brand D is better. An experiment was conducted where 16 tires from each brand were used. The tires were run under uniform conditions until they wore out. The results are:
Brand C: X1 = 40,000 km s1 = 5,400 km
Brand D: X2 = 38,000 km s2 = 3,200 km

What conclusion can be drawn?


Example 10
All freshmen in a particular school were found to have a variability in grades expressed as a standard deviation of 3. Two samples among these freshmen, made up of 20 and 50 students each, were found to have means of 88 and 85 respectively. Based on their grades, is the first group really brighter than the second group, using the 0.01 level of significance?
Example 11
• An IQ test was administered to 5 persons before and after they were trained. The results are given below:

Candidates         I     II    III   IV    V
Before Training    110   120   123   132   125
After Training     120   118   125   136   121

• Test whether there is any change in IQ after the training.
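The paired t statistic for Example 11 can be sketched as follows. Only the statistic is computed; the tabular t value (df = n − 1 = 4, two-tailed) still has to be read from a Student's t table, because the Python standard library has no t-distribution. Variable names are illustrative.

```python
import math
import statistics

before = [110, 120, 123, 132, 125]
after = [120, 118, 125, 136, 121]

# d = V_after - V_before for each candidate
d = [a - b for a, b in zip(after, before)]
n = len(d)                       # number of pairs

d_bar = statistics.mean(d)
s = statistics.stdev(d)          # divisor n - 1, as in the formula for s^2
t = d_bar / (s / math.sqrt(n))

print(f"d = {d}, mean d = {d_bar}, s = {s:.3f}, t = {t:.4f}, df = {n - 1}")
```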
Analysis of Variance (F - Test)

-A test that was developed by Ronald A. Fisher

-A technique in inferential statistics designed to test


whether or not more than two samples (or groups)
are significantly different from each other

One-Way Analysis of Variance
Steps:
1. Compute for the sum of squares

$TSS = \sum x^2 - \frac{(\sum x)^2}{N}$
$SSB = \frac{(\sum x_1)^2}{n_1} + \frac{(\sum x_2)^2}{n_2} + \dots + \frac{(\sum x_r)^2}{n_r} - \frac{(\sum x)^2}{N}$

SSW = TSS – SSB
One-Way Analysis of Variance

2. Compute degrees of freedom

$df_T = N - 1$
$df_B = v_1 = k - 1$
$df_W = v_2 = df_T - df_B = N - k$
One-Way Analysis of Variance
3. Compute for the mean sum of squares

$MSSB = \frac{SSB}{df_B}$
$MSSW = \frac{SSW}{df_W}$
4. Compute for the F – Ratio
$F = \frac{MSSB}{MSSW}$
Contingency Table for ANOVA
Sources of Variation | Sum of Squares | Degree of Freedom (df) | Mean Sum of Squares | F – Ratio
Between Column | SSB | $df_B = k - 1$ | MSSB |
Within Column | SSW | $df_W = N - k$ | MSSW |
Total | TSS | $df_T = N - 1$ | |
Exercise
1. The weights in kilograms of three groups of 5 members
each are shown in the table below. Is there unusual
variation among the groups? ( use 𝛼 = 0.05)

Group
Members A B C
1 50 60 53
2 48 40 55
3 55 50 40
4 50 60 40
5 46 52 47
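The four ANOVA steps can be sketched in Python using the weights from this exercise. This is one possible way to organize the computation, not a prescribed solution; the dictionary layout and variable names are assumptions. The resulting F still has to be compared with the tabular F value for (k − 1, N − k) degrees of freedom.

```python
groups = {
    "A": [50, 48, 55, 50, 46],
    "B": [60, 40, 50, 60, 52],
    "C": [53, 55, 40, 40, 47],
}

all_values = [x for values in groups.values() for x in values]
N = len(all_values)              # total number of observations
k = len(groups)                  # number of groups

# Step 1: sums of squares
correction = sum(all_values) ** 2 / N
tss = sum(x ** 2 for x in all_values) - correction
ssb = sum(sum(v) ** 2 / len(v) for v in groups.values()) - correction
ssw = tss - ssb

# Step 2: degrees of freedom
df_b, df_w = k - 1, N - k

# Steps 3 and 4: mean sums of squares and the F ratio
mssb, mssw = ssb / df_b, ssw / df_w
F = mssb / mssw

print(f"TSS = {tss:.4f}, SSB = {ssb:.4f}, SSW = {ssw:.4f}")
print(f"df = ({df_b}, {df_w}), F = {F:.4f}")
```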
Exercise
2. The following are the mileages obtained after several road tests were run using 5 different kinds of gasoline on a Toyota car.

Road Test    Type of Gasoline
             A     B     C     D     E
1ST          35    61    38    65    56
2ND          31    63    54    60    69
3RD          42    50    47    57    70
4TH          48    42    60    55    50
5TH          40    49    55    60    48

Is there a significant difference among the mileage yields at the 1% level?
Exercise
3. Below are the bowling scores of four groups of four members each. At the 5% significance level, find out if there is unusual variation among the groups.

Members    Group
           A      B      C      D
1          98     100    87     90
2          78     95     92     93
3          95     90     105    95
4          110    85     88     97
Chi – Square Test (𝜒2)
- Used to test significant difference or relationship
- Used if data are in frequencies (enumeration data)

USES:
1.to test the goodness of fit of a normal curve; that is to
find out whether or not a sample distribution conforms
with the hypothetical normal distribution
2. to find out whether or not an observed proportion is
equal to some given ideal or expected proportion
3.to test the independence of one variable from another
variable.
Formulas:
i. For a 2 x 2 table (with Yates' correction for continuity)

$\chi^2 = \sum \frac{(|OF - EF| - 0.5)^2}{EF}$

ii. For a non 2 x 2 table

$\chi^2 = \sum \frac{(OF - EF)^2}{EF}$

df = (r − 1)(c − 1)
OF = Observed Frequencies
EF = Expected Frequencies
Expected Frequency:

         C                          D                          Total
A        $\frac{T_A T_C}{Total}$    $\frac{T_A T_D}{Total}$    $T_A$
B        $\frac{T_B T_C}{Total}$    $\frac{T_B T_D}{Total}$    $T_B$
Total    $T_C$                      $T_D$                      Total
Exercise
1. Test the hypothesis that educational attainment does not
depend on socio – economic status for the following 100
persons in a particular community.

Socio-economic status    Educational Attainment
                         Finished College    Did Not Finish College
Poor                     18                  10
Middle Class             28                  25
Rich                     14                  5
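The expected frequencies and the chi-square statistic for a table like this one can be sketched as follows, using the non 2 x 2 formula (no continuity correction). The list-of-rows layout and variable names are assumptions for illustration. The result is then compared with the tabular chi-square value at df = (r − 1)(c − 1).

```python
# Rows: Poor, Middle Class, Rich; columns: Finished, Did Not Finish
observed = [
    [18, 10],
    [28, 25],
    [14, 5],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected frequency of each cell = (row total * column total) / grand total
expected = [[rt * ct / grand_total for ct in col_totals] for rt in row_totals]

chi2 = sum(
    (of - ef) ** 2 / ef
    for obs_row, exp_row in zip(observed, expected)
    for of, ef in zip(obs_row, exp_row)
)
df = (len(observed) - 1) * (len(observed[0]) - 1)

print(f"chi-square = {chi2:.3f}, df = {df}")
```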
Exercise
2. At 1% significance level, does college academic grade
depend on the high school NSAT results for the following
200 students?
Academic Grade    NSAT Rating
                  Low    Average    High
Above 85          13     25         21
75 – 85           18     31         38
Below 75          14     20         20
Exercise
3. At ABC Company, there are 28 males and 32 females. Out of the 28 males, 10 hold executive posts and the others do clerical work. Of the 32 females, only 5 hold executive positions and the others do clerical work. Prepare a contingency table, then test the hypothesis that position is independent of sex.
Exercise
• 4. To determine whether type of personality is related
to academic performance, a random sample of 180
high school students from a certain college were taken
and the data are as follows:
             Low Average    Average    High Average
Introvert    35             30         25
Extrovert    31             23         36

Is there a significant relationship between personality type and academic performance?
Correlation
and
Regression Analysis

Regression Analysis
- concerned with the problem of estimation and
forecasting

$y = a + bx$
Where:
y → predicted score
a → y – intercept
b → slope of the line
Regression Analysis

$b = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - (\sum x)^2}$
$a = \bar{y} - b\bar{x}$
Where:
$\bar{y}$ → mean of the y values
$\bar{x}$ → mean of the x values
Correlation Analysis
- Concerned with the relationship between changes in the variables
Formula: Pearson Product Moment Correlation (r)

$r = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{[n\sum x^2 - (\sum x)^2][n\sum y^2 - (\sum y)^2]}}$
Range of Values: r = [-1, 1]

(+) r – shows a direct positive relationship


(- ) r – shows a negative or inverse relationship

r = 0 → this indicates no relationship


r = 1→ perfect positive relationship
r = -1 → perfect negative relationship

Interpretation:
Pearson r Qualitative Description

±1 Perfect Correlation

± 0.91 – ± 0.99 Very High

± 0.71 – ± 0.90 High

± 0.41 – ± 0.70 Marked

± 0.21 – ± 0.40 Slight/Low

0 – ± 0.20 Negligible

Testing the Significance of r

$t = r\sqrt{\frac{n - 2}{1 - r^2}}$
Exercise
1. It is generally known that the number of road accidents is inversely proportional to road width. The following data show the result of a study indicating the number of accidents occurring per hundred thousand vehicles.

Road width (in feet) (x) 75 52 60 33 22

Number of accidents (y) 40 84 55 92 90

a. draw a scatter diagram


b. find the equation of the LSRL
c. predict accident frequency for a road whose width is 55 feet;
48 feet
d. find the degree of relationship between road width and
accident frequency.
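The least-squares regression line (LSRL), Pearson r, and the t statistic for r can all be computed from the same five sums. The sketch below applies the formulas to the road-width data from this exercise; the variable and function names are illustrative, and the scatter diagram in part (a) is left out.

```python
import math

x = [75, 52, 60, 33, 22]     # road width in feet
y = [40, 84, 55, 92, 90]     # accidents per hundred thousand vehicles
n = len(x)

sx, sy = sum(x), sum(y)
sxy = sum(xi * yi for xi, yi in zip(x, y))
sxx = sum(xi ** 2 for xi in x)
syy = sum(yi ** 2 for yi in y)

# Slope and intercept of the line y = a + bx
b = (n * sxy - sx * sy) / (n * sxx - sx ** 2)
a = sy / n - b * sx / n

# Pearson product moment correlation and its t statistic (df = n - 2)
r = (n * sxy - sx * sy) / math.sqrt((n * sxx - sx ** 2) * (n * syy - sy ** 2))
t = r * math.sqrt((n - 2) / (1 - r ** 2))

def predict(width):
    """Predicted accident frequency for a given road width."""
    return a + b * width

print(f"y = {a:.3f} + ({b:.4f})x")
print(f"r = {r:.4f}, t = {t:.3f}")
print(f"predicted at 55 ft: {predict(55):.2f}; at 48 ft: {predict(48):.2f}")
```

The negative slope and the strongly negative r are consistent with the stated inverse relationship between road width and accidents.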
Exercise
2. The following table shows the final grades of ten students
in Algebra and Statistics.

Algebra (x) 75 80 93 65 87 71
Statistics (y) 82 78 86 72 91 80

a. draw a scatter diagram


b. find the equation of the LSRL
c. predict the grade in Statistics if the grade in Algebra is 78; 82; 89; 95; 100
d. find the degree of relationship between grades in Algebra and Statistics
Exploring SPSS

The End!!!
