
BALIWAG POLYTECHNIC COLLEGE

Baliwag, Bulacan

ACT09 – Statistical Analysis and Software Application


Second Semester | A.Y. 2021 – 2022

MODULE 1

Prepared by:

Marvic V. Ablaza, MM
Assistant Professor I

www.btech.edu.ph STATISTICAL ANALYSIS AND SOFTWARE APPLICATION Page 1 of 13


STATISTICAL ANALYSIS AND SOFTWARE APPLICATION
Introduction to Statistical Concept

Objectives:
After successful completion of this module, you should be able to:

• Define statistics
• Enumerate the importance and limitations of statistics
• Explain the process of statistics
• Know the difference between descriptive and inferential statistics
• Distinguish between qualitative and quantitative variables
• Distinguish between discrete and continuous variables
• Determine the level of measurement of a variable

LESSON 1

Definition of Statistics

Statistics plays a major role in many aspects of our lives. It is used in sports, for
example, to help a general manager decide which player might be the best fit for a
team. It is used in politics to help candidates understand how the public feels about
various policies. And statistics is used in medicine to help determine the effectiveness of
new drugs. Used appropriately, statistics can enhance our understanding of the world
around us. Used inappropriately, it can lend support to inaccurate beliefs.
Understanding statistical methods will provide you with the ability to analyze and
critique studies and the opportunity to become an informed consumer of information.
Understanding statistical methods will also enable you to distinguish solid analysis from
bogus “facts.”
Many people say that statistics is numbers. After all, we are bombarded by
numbers that supposedly represent how we feel and who we are. Certainly, statistics
has a lot to do with numbers, but this definition is only partially correct. Statistics is also
about where the numbers come from (that is, how they were obtained) and how closely
the numbers reflect reality.
Statistics is the science of collecting, organizing, summarizing, and analyzing
information to draw conclusions or answer questions. In addition, statistics is about
providing a measure of confidence in any conclusions.
Let’s break this definition into four parts. The first part states that statistics
involves the collection of information. The second refers to the organization and
summarization of information. The third states that the information is analyzed
to draw conclusions or answer specific questions. The fourth part states that
results should be reported using some measure that represents how
convinced we are that our conclusions reflect reality.

• Statistics is important because it enables people to make decisions based on
empirical evidence.
• Statistics provides us with tools needed to convert massive data into pertinent
information that can be used in decision making.
• Statistics can provide us information that we can use to make sensible decisions.

What information is referred to in the definition?
The information referred to in the definition is data. According to the
Merriam-Webster dictionary, data are “factual information used as a basis for
reasoning, discussion, or calculation”.
Data can be numerical, as in height, or non-numerical, as in gender. In either
case, data describe characteristics of an individual.

Fields of Statistics
A. Mathematical Statistics - The study and development of statistical theory and
methods in the abstract.
B. Applied Statistics - The application of statistical methods to solve real
problems involving randomly generated data and the development of new
statistical methodology motivated by real problems. Example branches of Applied
Statistics: psychometrics, econometrics, and biostatistics.

Limitations of Statistics
1. Statistics is not suitable for the study of qualitative phenomena.
2. Statistics does not study individuals.
3. Statistical laws are not exact.
4. Statistical tables may be misused.
5. Statistics is only one of the methods of studying a problem.

Definitions:
• The universe is the set of all entities under study.

• A population is the total or entire group of individuals or observations from
which information is desired by a researcher. Apart from persons, a population
may consist of mosquitoes, villages, institutions, etc.

• An individual is a person or object that is a member of the population being
studied.

• A statistic is a numerical summary of a sample.

• A sample is a subset of the population.

• Descriptive statistics consist of organizing and summarizing data. Descriptive
statistics describe data through numerical summaries, tables, and graphs.

• Inferential statistics uses methods that take a result from a sample, extend it
to the population, and measure the reliability of the result.

• A parameter is a numerical summary of a population.

Example: Consider the following scenario.

You are walking down the street and notice that a person walking in front of you drops
PHP100. Nobody seems to notice the PHP100 except you. Since you could keep the
money without anyone knowing, would you keep the money or return it to the owner?

Suppose you wanted to use this scenario as a gauge of the morality of students at your
school by determining the percent of students who would return the money. How might
you do this? You could attempt to present the scenario to every student at the school,
but this would be difficult or impossible if the student body is large. A second possibility
is to present the scenario to 50 students and use the results to make a statement about
all the students at the school.

In the PHP100 study presented, the population is all the students at the school. Each
student is an individual. The sample is the 50 students selected to participate in the
study.

Suppose 39 of the 50 students stated that they would return the money to the owner.
We could present this result by saying that the percent of students in the survey who
would return the money to the owner is 78%. This is an example of a descriptive
statistic because it describes the results of the sample without making any general
conclusions about the population. So 78% is a statistic because it is a numerical
summary based on a sample. Descriptive statistics make it easier to get an overview of
what the data are telling us.

If we extend the results of our sample to the population, we are performing
inferential statistics. The generalization contains uncertainty because a sample
cannot tell us everything about a population. Therefore, inferential statistics includes a
level of confidence in the results. So rather than saying that 78% of all students would
return the money, we might say that we are 95% confident that between 74% and
82% of all students would return the money. Notice how this inferential statement
includes a level of confidence (measure of reliability) in our results. It also includes a
range of values to account for the variability in our results. One goal of inferential
statistics is to use statistics to estimate parameters.
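
The arithmetic behind the PHP100 example can be sketched in Python. The snippet first
computes the sample statistic (39 of 50 students) and then an approximate 95% interval
for the parameter. The interval uses the normal-approximation (Wald) formula, which is
an assumption here, since the module does not say how its range was obtained. The
function name proportion_ci and the critical value 1.96 are illustrative choices. With
only 50 students, this formula gives a noticeably wider interval than the illustrative
74% to 82% range quoted above.

```python
import math


def proportion_ci(successes, n, z=1.96):
    """Return (p_hat, lower, upper) for a Wald interval around a sample proportion.

    z = 1.96 corresponds to roughly 95% confidence under the normal approximation.
    """
    p_hat = successes / n                                # the statistic
    standard_error = math.sqrt(p_hat * (1 - p_hat) / n)  # variability of p_hat
    margin = z * standard_error
    return p_hat, p_hat - margin, p_hat + margin


# 39 of the 50 sampled students said they would return the money.
p_hat, lower, upper = proportion_ci(39, 50)

print(f"Descriptive: {p_hat:.0%} of the sampled students would return the money")
print(f"Inferential: approximately 95% confident the parameter lies "
      f"between {lower:.1%} and {upper:.1%}")
```

The first printed line is a descriptive statement about the sample only. The second is
an inferential statement: it pairs a range of values with a level of confidence in order
to estimate the population parameter.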

PROCESS OF STATISTICS

1. Identify the research objective.
A researcher must determine the question(s) he or she wants answered. The
question(s) must clearly identify the population that is to be studied.
2. Collect the information needed to answer the questions.
Conducting research on an entire population is often difficult and expensive, so we
typically look at a sample. This step is vital to the statistical process, because if the data
are not collected correctly, the conclusions drawn are meaningless. Do not overlook the
importance of appropriate data collection.

Example:
A research objective is presented. For each research objective, identify the population
and sample in the study.

1. The Philippine Mental Health Association contacts 1,028 teenagers who are 13
to 17 years of age and live in Antipolo City and asks whether or not they have
been prescribed medications for any mental disorders, such as depression or
anxiety.

Population : Teenagers 13 to 17 years of age who live in Antipolo City
Sample : The 1,028 teenagers 13 to 17 years of age who live in Antipolo City
and were contacted

2. A farmer wanted to learn about the weight of his soybean crop. He randomly
sampled 100 plants and weighed the soybeans on each plant.

Population : Entire soybean crop
Sample : The 100 selected soybean plants

3. Organize and summarize the information.
Descriptive statistics allow the researcher to obtain an overview of the data and can
help determine the type of statistical methods the researcher should use.

4. Draw conclusions from the information.
In this step the information collected from the sample is generalized to the population.
Inferential statistics uses methods that take results obtained from a sample, extend
them to the population, and measure the reliability of the result.
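
The four steps above can be sketched with a small simulation based on the soybean
example. Everything numeric here is hypothetical: the population of 5,000 plants, the
weights in grams, and the use of z = 1.96 for an approximate 95% interval are
assumptions made only so the steps can be run end to end.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is reproducible

# Step 1: research objective -> estimate the mean soybean weight per plant.
# Hypothetical population: 5,000 plants with made-up weights in grams.
population = [random.gauss(150, 20) for _ in range(5000)]

# Step 2: collect the data from a simple random sample of 100 plants.
sample = random.sample(population, 100)

# Step 3: organize and summarize the sample (descriptive statistics).
sample_mean = statistics.mean(sample)
sample_sd = statistics.stdev(sample)

# Step 4: draw a conclusion about the population (inferential statistics).
# Approximate 95% interval for the population mean, using z = 1.96.
margin = 1.96 * sample_sd / len(sample) ** 0.5
lower, upper = sample_mean - margin, sample_mean + margin

print(f"Sample mean: {sample_mean:.1f} g, sample SD: {sample_sd:.1f} g")
print(f"Approximate 95% interval for the mean: {lower:.1f} g to {upper:.1f} g")
```

Only step 4 says anything about plants that were not weighed, which is why it needs a
measure of reliability while step 3 does not.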

Take Note!
If the entire population is studied, then inferential statistics is not necessary, because
descriptive statistics will provide all the information that we need regarding the
population.

Example:
For each of the following statements, decide whether it belongs to the field of
descriptive statistics or inferential statistics.

1. A badminton player wants to know his average score for the past 10 games.
(Descriptive Statistics)
2. A car manufacturer wishes to estimate the average lifetime of batteries by testing a
sample of 50 batteries. (Inferential Statistics)
3. Janine wants to determine the variability of her six exam scores in Algebra.
(Descriptive Statistics)
4. A shipping company wishes to estimate the number of passengers traveling via their
ships next year using their data on the number of passengers in the past three years.
(Inferential Statistics)
5. A politician wants to determine the total number of votes his rival obtained in the
past election based on his copies of the tally sheet of electoral returns.
(Descriptive Statistics)
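
As a brief illustration of a descriptive case such as item 3, the sketch below
summarizes six exam scores. The scores are hypothetical, since the module does not list
Janine's actual results. Because all six scores are available, they are treated as the
entire group of interest, so the population standard deviation (statistics.pstdev) is
used and no inference is needed.

```python
import statistics

# Hypothetical exam scores; the module does not give the real values.
scores = [78, 85, 90, 72, 88, 94]

mean_score = statistics.mean(scores)     # center of the six scores
score_range = max(scores) - min(scores)  # simplest measure of variability
spread = statistics.pstdev(scores)       # population standard deviation

print(f"Mean: {mean_score}, range: {score_range}, standard deviation: {spread:.2f}")
```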

DISTINCTION BETWEEN QUALITATIVE AND QUANTITATIVE VARIABLES

Variables are the characteristics of the individuals within the population. For
example, recently my mother and I planted a tomato plant in our backyard. We
collected information about the tomatoes harvested from the plant.
The individuals we studied were the tomatoes. The variable that interested us
was the weight of a tomato. My mom noted that the tomatoes had different weights
even though they came from the same plant. She discovered that variables such as
weight may vary.
If variables did not vary, they would be constants, and statistical inference
would not be necessary. Think about it this way: If each tomato had the same
weight, then knowing the weight of one tomato would allow us to determine the
weights of all tomatoes. However, the weights of the tomatoes vary.
One goal of research is to learn the causes of the variability so that we can learn
to grow plants that yield the best tomatoes.
It is helpful to divide variables into different types, as different statistical
methods are applicable to each. The main division is into qualitative (or categorical)
and quantitative (or numerical) variables.

Variables can be classified into two groups:

1. Qualitative variables (Categorical) are variables that yield categorical
responses. Each value is a word or a code that represents a class or category.
2. Quantitative variables (Numeric) take on numerical values representing an
amount or quantity.

Example:
Determine whether the following variables are qualitative or quantitative.
1. Hair color (Qualitative)
2. Temperature (Quantitative)
3. Stages of breast cancer (Qualitative)
4. Number of hamburgers sold (Quantitative)
5. Number of children (Quantitative)
6. Zip code (Qualitative)
7. Place of birth (Qualitative)
8. Degree of pain (Qualitative)
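
The zip code item shows that a variable written with digits is not automatically
quantitative. A minimal sketch, using made-up records, of how the two kinds of variable
are typically handled: categories are counted, while amounts can be averaged. The field
names and values are hypothetical.

```python
import statistics
from collections import Counter

# Hypothetical records; all values are made up for illustration.
records = [
    {"hair_color": "black", "zip_code": "3006", "children": 2},
    {"hair_color": "brown", "zip_code": "3000", "children": 0},
    {"hair_color": "black", "zip_code": "3006", "children": 3},
]

# Qualitative: zip codes are labels, so they are stored as text and counted.
# Adding or averaging them would not describe any amount or quantity.
zip_counts = Counter(record["zip_code"] for record in records)

# Quantitative: the number of children is an amount, so arithmetic is meaningful.
mean_children = statistics.mean([record["children"] for record in records])

print(zip_counts.most_common(1))
print(f"Mean number of children: {mean_children:.2f}")
```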

DISTINCTION BETWEEN DISCRETE AND CONTINUOUS VARIABLES

Quantitative variables may be further classified into:

1. A discrete variable is a quantitative variable that has either a finite number of
possible values or a countable number of possible values. If you count to get the value
of a quantitative variable, it is discrete.
2. A continuous variable is a quantitative variable that has an infinite number of
possible values that are not countable. If you measure to get the value of a quantitative
variable, it is continuous.

Example:
Determine whether the following quantitative variables are discrete or continuous.

1. The number of heads obtained after flipping a coin five times. (Discrete)
2. The number of cars that arrive at a McDonald’s drive-through between 12:00 P.M.
and 1:00 P.M. (Discrete)
3. The distance a 2005 Toyota Prius can travel in city conditions with a full tank of
gas. (Continuous)
4. Number of words correctly spelled. (Discrete)
5. Time of a runner to finish one lap. (Continuous)
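
The count-versus-measure rule can be checked with a short sketch. For item 1, listing
every possible sequence of five coin flips shows that the number of heads can take only
six values. For item 5, the lap times (hypothetical values in seconds) illustrate that
between any two measured values another value is always possible.

```python
from itertools import product

# Discrete: enumerate all 2**5 sequences of five flips and count the heads in each.
outcomes = product("HT", repeat=5)
possible_head_counts = sorted({flips.count("H") for flips in outcomes})
print(possible_head_counts)  # a finite list of countable values

# Continuous: between two lap times there is always another possible time.
lap_a, lap_b = 62.4, 62.5          # hypothetical lap times in seconds
midpoint = (lap_a + lap_b) / 2
print(lap_a < midpoint < lap_b)
```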

LEVELS OF MEASUREMENT

It is important to know which type of scale is represented by your data since
different statistics are appropriate for different scales of measurement. A characteristic
may be measured using nominal, ordinal, interval, and ratio scales.

1. Nominal Level - Nominal scales are sometimes called categorical scales or
categorical data. Such a scale classifies persons or objects into two or more categories.
Whatever the basis for classification, a person can only be in one category, and
members of a given category have a common set of characteristics.

Example:
• Method of payment (cash, check, debit card, credit card)
• Type of school (public vs. private)
• Eye Color (Blue, Green, Brown)

2. Ordinal Level - This involves data that may be arranged in some order, but
differences between data values either cannot be determined or are meaningless. An
ordinal scale not only classifies subjects but also ranks them in terms of the
degree to which they possess a characteristic of interest. In other words, an
ordinal scale puts the subjects in order from highest to lowest, from most to
least. Although ordinal scales indicate that some subjects are higher or lower
than others, they do not indicate how much higher or how much better.

Example:
• Food Preferences
• Stage of Disease
• Socioeconomic Class (First, Middle, Lower)
• Severity of Pain

3. Interval Level - This measurement level not only classifies and orders the
measurements, but it also specifies that the distances between each interval on
the scale are equivalent along the scale from low interval to high interval. A
value of zero does not mean the absence of the quantity. Arithmetic operations
such as addition and subtraction can be performed on values of the variable.

Example:
• Temperature on Fahrenheit / Celsius Thermometer
• Trait anxiety (e.g., high anxious vs. low anxious)
• IQ (e.g., high IQ vs. average IQ vs. low IQ)

4. Ratio Level - A ratio scale represents the highest, most precise level of
measurement. It has the properties of the interval level of measurement and the
ratios of the values of the variable have meaning. A value of zero means the
absence of the quantity. Arithmetic operations such as multiplication and division
can be performed on the values of the variable.

Example:
• Height and weight
• Time
• Time until death

Operations that make sense for variables of different scales.

Both interval and ratio data involve measurement. Most data analysis techniques
that apply to ratio data also apply to interval data. Therefore, in most practical aspects,
these types of data (interval and ratio) are grouped under metric data. In some other
instances, these types of data are also known as numerical discrete and numerical
continuous.
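
One practical difference between the interval and ratio levels can be sketched with unit
conversions. On an interval scale such as temperature in Celsius, equal differences stay
equal after converting to Fahrenheit, but a ratio such as "twice as hot" does not
survive the conversion because zero is arbitrary. On a ratio scale such as weight, the
ratio is the same in any unit. The values and the helper function names below are
illustrative.

```python
def celsius_to_fahrenheit(celsius):
    return celsius * 9 / 5 + 32


def kg_to_lb(kilograms):
    return kilograms * 2.20462  # approximate conversion factor


# Interval scale: differences are meaningful in either unit.
diff_low_f = celsius_to_fahrenheit(20) - celsius_to_fahrenheit(10)
diff_high_f = celsius_to_fahrenheit(30) - celsius_to_fahrenheit(20)
print(diff_low_f == diff_high_f)  # equal 10-degree steps remain equal steps

# Interval scale: ratios are not meaningful, since they change with the unit.
ratio_c = 20 / 10
ratio_f = celsius_to_fahrenheit(20) / celsius_to_fahrenheit(10)
print(ratio_c, ratio_f)

# Ratio scale: zero means "none", so ratios agree in every unit.
ratio_kg = 80 / 40
ratio_lb = kg_to_lb(80) / kg_to_lb(40)
print(ratio_kg, round(ratio_lb, 6))
```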

Example:
Categorize each of the following as nominal, ordinal, interval or ratio measurement.

1. Ranking of college athletic teams. (Ordinal)
2. Employee number. (Nominal)
3. Number of vehicles registered. (Ratio)
4. Brands of soft drinks. (Nominal)
5. Number of cars passing along C5 on a given day. (Ratio)
6. Zip code (Nominal)
7. Degree of pain (Ordinal)

WHAT HAVE I LEARNED?
Read each item carefully then provide the needed correct answer.

I. A research objective is presented. For each, identify the (A) population
and (B) sample in the study.

1. A polling organization contacts 2141 male university graduates who have a
white-collar job and asks whether or not they had received a raise at work during
the past 4 months.
A. ______________________________
B. ______________________________

2. Every year the PSA releases the Current Population Report based on a survey of
50,000 households. The goal of this report is to learn the demographic
characteristics, such as income, of all households within the Philippines.
A. ______________________________
B. ______________________________

3. Researchers want to determine whether or not higher folate intake is associated
with a lower risk of hypertension (high blood pressure) in women (27 to 44 years
of age). To make this determination, they look at 7373 cases of hypertension in
these women and find that those who consume at least 1000 micrograms per
day of total folate had a decreased risk of hypertension compared with those
who consume less than 200 micrograms per day.
A. ______________________________
B. ______________________________

II. Indicate whether the following statements require the use of
descriptive or inferential statistics.

1. A teacher wants to know the attitudes of all students towards abortion.
2. A market analyst of a sales firm draws a chart showing the sales figures of a
given product for the period 2006-2007.
3. A forecaster predicts the results of an election using the number of votes cast
in 15 out of 25 barangays.
4. Men are better in math than women.
5. Forty percent of the employees of an organization were recorded tardy for at
least 15 working days.
6. There are very few gender-related occupations.
7. An accountant predicts the accuracy rate of a client’s financial resources.
8. A quality control manager wishes to check production output.
9. Records indicated that 75% of the faculty in the graduate school are doctoral
degree holders.
10. There is no relationship between the educational qualification of parents and
the academic achievement of their children.

ASSIGNMENT
Identify the qualitative and quantitative variables and indicate the highest
level of measurement required in each. If quantitative, classify whether
discrete or continuous.

1. Occupation
2. Number of government officials
3. Favorite color
4. Temperature in Celsius degrees
5. Type of school
6. Volume of mineral water sold daily
7. Employee number
8. Civil status
9. Equity accounts
10. Brands of soft drinks
11. Socioeconomic status
12. Employment status
13. Number of missing teeth
14. Number of vehicles registered
15. Jersey Number
16. Number of employees collecting retirement benefits from GSIS
17. Duration of a seizure
18. Cause of death
19. Dividends
20. Current assets list
21. Number of heart attacks
22. Accounts receivable
23. Clothing size
24. Blood type
25. Ethnic group
