Lesson 1 - Obtaining Data
Sample survey. A sample survey is a study that obtains data from a subset of a
population, in order to estimate population attributes.
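As a rough illustration of estimating a population attribute from a subset, the sketch below draws a simple random sample and compares its mean with the population mean. The population values, the function name estimate_mean_from_sample, and the sample size are all made up for the example; simple random sampling is only one of several possible sampling schemes.

```python
import random
import statistics

def estimate_mean_from_sample(population, sample_size, seed=0):
    """Draw a simple random sample (without replacement) and return its mean."""
    rng = random.Random(seed)
    sample = rng.sample(population, sample_size)
    return statistics.mean(sample)

# Hypothetical population: 10,000 made-up measurements.
rng = random.Random(42)
population = [rng.gauss(50, 10) for _ in range(10_000)]

population_mean = statistics.mean(population)
sample_mean = estimate_mean_from_sample(population, sample_size=200, seed=1)

print(f"population mean: {population_mean:.2f}")
print(f"sample estimate: {sample_mean:.2f}")
```

The sample estimate will usually be close to, but not exactly equal to, the population mean; a larger sample tends to narrow the gap.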
METHODS OF COLLECTING DATA
Experiment. An experiment is a controlled study in which the researcher attempts to
understand cause-and-effect relationships.
The study is "controlled" in the sense that the researcher controls
(1) how subjects are assigned to groups and
(2) which treatments each group receives. In the analysis phase, the researcher
compares group scores on some dependent variable. Based on the analysis, the
researcher draws a conclusion about whether the treatment (independent variable)
had a causal effect on the dependent variable.
Observational study. Like experiments, observational studies attempt to understand
cause-and-effect relationships.
Unlike experiments, however, the researcher is not able to control
(1) how subjects are assigned to groups and/or
(2) which treatments each group receives.
PLANNING AND CONDUCTING
SURVEYS
A survey is a series of unbiased, well-constructed questions that the subject must answer.
Some advantages of surveys
- an efficient way of collecting information from a large number of people
- easy to administer
- a wide variety of information can be collected, and surveys can be focused (researchers can
stick to just the questions that interest them)
Some disadvantages of surveys
- depend on the subjects’ motivation, honesty, memory and ability to respond.
- answer choices to survey questions could lead to vague data. For example, the choice
“moderately agree” may mean different things to different people or to whoever ends up
interpreting the data.
Various methods for administering a survey
- face-to-face interview
- phone interview where the researcher is questioning the subject
- self-administered survey where the subject can complete a survey on paper and
mail it back, or complete the survey online
The advantages of face-to-face interviews include
- fewer misunderstood questions
- fewer incomplete responses
- higher response rates
- greater control over the environment in which the survey is administered
- the researcher can collect additional information if any of the respondents’ answers
need clarifying.
The disadvantages of face-to-face interviews are
- expensive
- time-consuming
- require a large staff of trained interviewers.
- the response can be biased by the appearance or attitude of the interviewer.
The advantages of self-administered surveys are
- less expensive than interviews
- do not require a large staff of experienced interviewers
- can be administered in large numbers
- anonymity and privacy encourage more candid and honest responses, and there is less
pressure on respondents.
The disadvantages of self-administered surveys are
- respondents are more likely to stop participating midway through the survey, and
researchers cannot ask them to clarify their answers
- lower response rates than in personal interviews
- often the respondents who bother to return surveys represent extremes of the population –
those people who care about the issue strongly, whichever way their opinion leans.
Designing a Survey
When designing a survey, the following steps are useful:
1. Determine the goal of your survey: What question do you want to answer?
2. Identify the sample population: Whom will you interview?
3. Choose an interviewing method: face-to-face interview, phone interview, self-administered
paper survey, or internet survey.
4. Decide what questions you will ask in what order, and how to phrase them. (This is
important if there is more than one piece of information you are looking for.)
5. Conduct the interview and collect the information.
6. Analyze the results by making graphs and drawing conclusions.
PLANNING AND CONDUCTING
EXPERIMENTS
PURPOSE:
- DOES ASPIRIN REDUCE THE RISK OF HEART ATTACK?
Replicate
Apply each treatment to many units to reduce chance variation
Example: do the mouse study many times
Randomize
Use probability (chance) to assign experimental units to treatments
This may be the most important principle,
because it allows us to say that the different treatment groups start out similar
Completely Randomized Design
If all the experimental units (subjects of the experiment) are randomly assigned to either the
control group or to the treatment group, then the experiment has a completely randomized
design.
Randomize by assigning each subject a number and then using a random number generator to choose
which numbers go into each treatment group.
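A minimal sketch of a completely randomized assignment, assuming 20 numbered subjects and two equal-sized groups. The function name completely_randomized_assignment and the seed value are hypothetical; shuffling the numbered list is one common way to let chance decide the groups.

```python
import random

def completely_randomized_assignment(subjects, seed=None):
    """Randomly split subjects into a treatment group and a control group."""
    rng = random.Random(seed)
    shuffled = list(subjects)   # copy so the caller's list is not modified
    rng.shuffle(shuffled)       # chance alone decides the order
    half = len(shuffled) // 2
    return {"treatment": shuffled[:half], "control": shuffled[half:]}

# Hypothetical subjects, numbered 1 to 20.
subjects = list(range(1, 21))
groups = completely_randomized_assignment(subjects, seed=7)

print("treatment:", sorted(groups["treatment"]))
print("control:  ", sorted(groups["control"]))
```

Every subject ends up in exactly one group, and no characteristic of the subject influences which one.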
Block Randomization
Subjects are first placed into groups (blocks) of similar individuals. The random assignment into
treatment groups is then carried out separately within each block (think stratified random sample).
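The idea of randomizing separately within each block can be sketched as follows. The subject labels, the "younger"/"older" blocking variable, and the function name block_randomize are assumptions made for illustration only.

```python
import random
from collections import defaultdict

def block_randomize(subjects, block_of, seed=None):
    """Randomly assign treatment/control separately inside each block.

    subjects: list of subject ids
    block_of: dict mapping subject id -> block label
    """
    rng = random.Random(seed)
    blocks = defaultdict(list)
    for s in subjects:
        blocks[block_of[s]].append(s)

    assignment = {}
    for members in blocks.values():
        rng.shuffle(members)            # randomize within this block only
        half = len(members) // 2
        for s in members[:half]:
            assignment[s] = "treatment"
        for s in members[half:]:
            assignment[s] = "control"
    return assignment

# Hypothetical subjects, blocked by age group.
subjects = [f"S{i}" for i in range(1, 13)]
block_of = {s: ("younger" if i < 6 else "older") for i, s in enumerate(subjects)}
assignment = block_randomize(subjects, block_of, seed=3)

for s in subjects:
    print(s, block_of[s], assignment[s])
```

Because the split happens inside each block, both treatment groups contain the same number of younger and older subjects.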
Matched Pairs Design
Subjects are matched into pairs and get different treatments
Matched pairs are more similar than random unmatched subjects
Randomizing the rest of the experiment is still important!!!
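One place randomization still enters a matched pairs design is in deciding which member of each pair receives the treatment. The sketch below flips a fair coin per pair; the pair labels and the function name assign_within_pairs are hypothetical.

```python
import random

def assign_within_pairs(pairs, seed=None):
    """For each matched pair, randomly choose which member gets the treatment."""
    rng = random.Random(seed)
    assignment = []
    for first, second in pairs:
        if rng.random() < 0.5:          # fair coin flip for this pair
            assignment.append({"treatment": first, "control": second})
        else:
            assignment.append({"treatment": second, "control": first})
    return assignment

# Hypothetical pairs of subjects already matched on similar characteristics.
pairs = [("A1", "A2"), ("B1", "B2"), ("C1", "C2"), ("D1", "D2")]
pair_assignment = assign_within_pairs(pairs, seed=11)

for pair, result in zip(pairs, pair_assignment):
    print(pair, "->", result)
```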
Experimental Set Up
Treatment Imposed = Independent Variable = Factors
Experimental Units = Subjects
Response Variable Observed = Dependent Variable
Double-Blind Experiment
In a double-blind experiment, neither the subjects nor the researchers know to which group
(treatment or control) each subject has been assigned. If a researcher knows that a subject is in
the control group, they do not expect a treatment effect, and their measurement of a response
might be understated. If a researcher knows that a subject is in the treatment group, they
might overstate a response simply because they expect it.
An experiment might also be single-blind. In this case, only one of the parties, either the
subjects or the researchers, knows to which group the subjects have been assigned.
Blinding helps avoid unconscious bias.
Generalizability of Results
We want to determine whether our results are "statistically significant",
i.e., whether an observed effect is so large that it would rarely occur by chance.
If we designed and conducted our experiment well, we can generalize these results to the
population!
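One way to ask whether an observed effect "would rarely occur by chance" is a permutation test, sketched below. This is an illustration, not the only method: the response values are made up, the function name permutation_p_value is hypothetical, and the p-value is a Monte Carlo approximation based on random reshuffles of the group labels.

```python
import random

def mean(values):
    return sum(values) / len(values)

def permutation_p_value(treatment, control, n_permutations=5000, seed=0):
    """Approximate two-sided p-value for the difference in group means."""
    rng = random.Random(seed)
    observed = mean(treatment) - mean(control)
    pooled = list(treatment) + list(control)
    n_t = len(treatment)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # reassign group labels purely by chance
        diff = mean(pooled[:n_t]) - mean(pooled[n_t:])
        if abs(diff) >= abs(observed):
            at_least_as_extreme += 1
    return observed, at_least_as_extreme / n_permutations

# Made-up responses for two groups of eight subjects each.
treatment = [12.1, 11.8, 13.0, 12.6, 12.9, 13.4, 12.2, 12.7]
control = [10.9, 11.2, 10.5, 11.6, 10.8, 11.1, 11.4, 10.7]

observed, p_value = permutation_p_value(treatment, control)
print(f"observed difference in means: {observed:.4f}")
print(f"approximate p-value: {p_value:.4f}")
```

A small p-value means that chance reshuffling rarely produces a difference as large as the one observed, which is what "statistically significant" is meant to capture.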
The practical steps needed for planning and conducting an experiment include: recognizing the goal of
the experiment, choice of factors, choice of response, choice of the design, analysis and then drawing
conclusions. This pretty much covers the steps involved in the scientific method.
1. Recognition and statement of the problem
2. Choice of factors, levels, and ranges
3. Selection of the response variable(s)
4. Choice of design
5. Conducting the experiment
6. Statistical analysis
7. Drawing conclusions, and making recommendations
COLLECTING ENGINEERING DATA
Sometimes the data are all of the observations in the population. This results in a census.
However, in the engineering environment, the data are almost always a sample that has been selected
from the population. Three basic methods of collecting data are
• A retrospective study using historical data
• An observational study
• A designed experiment
Retrospective Study
Montgomery, Peck, and Vining (2012) describe an acetone-butyl alcohol distillation column for
which concentration of acetone in the distillate (the output product stream) is an important variable.
Factors that may affect the distillate are the reboil temperature, the condensate temperature, and the
reflux rate. Production personnel obtain and archive the following records:
• The concentration of acetone in an hourly test sample of output product
• The reboil temperature log, which is a record of the reboil temperature over time
• The condenser temperature controller log
• The nominal reflux rate each hour
The reflux rate should be held constant for this process. Consequently, production personnel
change this very infrequently.
A retrospective study would use either all or a sample of the historical process data archived
over some period of time and involve a significant amount of data, but those data may contain relatively
little useful information about the problem.
Some of the relevant data may be missing, there may be transcription or recording errors resulting in
outliers (or unusual values), or data on other important factors may not have been collected and archived.
In the distillation column, for example, the specific concentrations of butyl alcohol and acetone in the
input feed stream are very important factors, but they are not archived because the concentrations are too
hard to obtain on a routine basis.
As a result of these types of issues, statistical analysis of historical data sometimes identifies interesting
phenomena, but solid and reliable explanations of these phenomena are often difficult to obtain.
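Before analyzing archived data, it can help to screen for the issues mentioned above: missing records and unusual values. The sketch below is a simple illustration using the common 1.5 x IQR rule for flagging outliers. The concentration values are invented (they are not from the distillation column study), and the function name screen_records is hypothetical.

```python
import statistics

def screen_records(values):
    """Return (indices of missing records, indices of suspected outliers)."""
    missing = [i for i, v in enumerate(values) if v is None]
    present = [v for v in values if v is not None]

    # Quartiles of the recorded values; the 1.5 * IQR fences are a rule of thumb.
    q1, _, q3 = statistics.quantiles(present, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr

    outliers = [
        i for i, v in enumerate(values)
        if v is not None and not (low <= v <= high)
    ]
    return missing, outliers

# Invented hourly readings; None marks a missing record, and 9.2 mimics a
# transcription error (a misplaced decimal point).
readings = [0.91, 0.93, None, 0.92, 0.90, 9.2, 0.94, 0.92, None, 0.91]

missing, outliers = screen_records(readings)
print("missing record indices:", missing)
print("suspected outlier indices:", outliers)
```

Flagged values still need a judgment call: an outlier may be a recording error, or it may be a genuine and informative process upset.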
Observational study
In an observational study, the engineer observes the process or population, disturbing it as little as
possible, and records the quantities of interest. Because these studies are usually conducted for a
relatively short time period, sometimes variables that are not routinely measured can be included.
In the distillation column, the engineer would design a form to record the two temperatures and
the reflux rate when acetone concentration measurements are made. It may even be possible to
measure the input feed stream concentrations so that the impact of this factor could be studied.
Generally, an observational study tends to solve problems 1 and 2 and goes a long way
toward obtaining accurate and reliable data. However, observational studies may not help resolve
problems 3 and 4.
Designed Experiments
In a designed experiment, the engineer makes deliberate or purposeful changes in the controllable
variables of the system or process, observes the resulting system output data, and then makes an
inference or decision about which variables are responsible for the observed changes in output
performance.
Experiments designed with basic principles such as randomization are needed to establish cause-
and-effect relationships.
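As a sketch of what "deliberate changes in the controllable variables" could look like, the code below lists every combination of two levels for three factors (a full factorial layout) and puts the runs in random order. The choice of a two-level full factorial, the "low"/"high" level labels, and the function name randomized_full_factorial are assumptions for illustration; the factor names follow the distillation column example.

```python
import itertools
import random

def randomized_full_factorial(factors, seed=None):
    """Return every combination of factor levels, in randomized run order."""
    names = list(factors)
    level_lists = [factors[name] for name in names]
    runs = [dict(zip(names, combo)) for combo in itertools.product(*level_lists)]
    rng = random.Random(seed)
    rng.shuffle(runs)   # randomize the order in which runs are performed
    return runs

# Factor names from the distillation example; the levels are placeholders.
factors = {
    "reboil_temperature": ["low", "high"],
    "condensate_temperature": ["low", "high"],
    "reflux_rate": ["low", "high"],
}

runs = randomized_full_factorial(factors, seed=5)
for run_number, settings in enumerate(runs, start=1):
    print(run_number, settings)
```

Randomizing the run order helps keep unmeasured influences that drift over time from lining up with any one factor.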
THE ENGINEERING METHOD
The field of statistics deals with the collection, presentation, analysis, and use of
data to make decisions, solve problems, and design products and processes.
Figure: The engineering method.
The engineering, or scientific, method is the approach to formulating and solving these problems. The
steps in the engineering method are as follows:
1. Develop a clear and concise description of the problem.
2. Identify, at least tentatively, the important factors that affect this problem or that may play a
role in its solution.
3. Propose a model for the problem, using scientific or engineering knowledge of the phenomenon
being studied. State any limitations or assumptions of the model.
4. Conduct appropriate experiments and collect data to test or validate the tentative model or
conclusions made in steps 2 and 3.
5. Refine the model on the basis of the observed data.
6. Manipulate the model to assist in developing a solution to the problem.
7. Conduct an appropriate experiment to confirm that the proposed solution to the problem is
both effective and efficient.
8. Draw conclusions or make recommendations based on the problem solution.