Statistics is a science which helps us to collect, analyze and
present data systematically.

It is the process of collecting, processing, summarizing,
presenting, analysing and interpreting data in order
to study and describe a given problem.

Statistics is the art of learning from data.

Statistics may be regarded as (i) the study of populations,
(ii) the study of variation, and (iii) the study of methods of

Importance of Statistics:
• It simplifies masses of data (condensation);
• It helps to get concrete information about any problem;
• It helps in reliable and objective decision making;
• It presents facts in a precise and definite form;
• It facilitates comparison (measures of central tendency and measures of dispersion);
• It facilitates prediction (time series and regression analysis are the most commonly used methods for prediction);
• It helps in the formulation of suitable policies.
Limitations of Statistics:
• Statistics does not deal with individual items;
• Statistics deals only with quantitatively expressed items; it does not study qualitative
• Statistical results are not universally true;
• Statistics is liable to be
Application areas of Statistics:

Improving product design, testing product performance,
determining reliability and maintainability, working out
safer systems of flight control for airports, etc.

Estimating the volume of retail sales, designing optimum
inventory control systems, producing auditing and
accounting procedures, improving working conditions in
industrial plants, assessing the market for new products.
Quality Control:
through adequate sampling, in process control,
consumer survey and experimental design in product
development etc.
*Realizing its importance, large organizations
maintain their own Statistical Quality Control
Departments*.

Measuring indicators such as volume of trade, size of
labor force, and standard of living, analyzing consumer
behavior, computation of national income accounts,
formulation of economic laws, etc.
*In particular, regression analysis is used extensively in
the field of economics*.
Health and Medicine:
Developing and testing new drugs, delivering improved medical care,
preventing, diagnosing, and treating disease, etc. Specifically,
inferential statistics has tremendous application in the fields of
health and medicine.
Exploring the interactions of species with their environment,
creating theoretical models of the nervous system, studying
genetic evolution, etc.
Measuring learning ability, intelligence, and
personality characteristics, creating psychological
scales and abnormal behavior, etc.

Testing theories about social systems, designing and conducting
sample surveys to study social attitudes, exploring cross-cultural
differences, studying the growth of human population, etc.
The steps in the engineering
method are as follows:
1. Develop a clear and concise description of
the problem.
2. Identify the important factors that affect
this problem or that may play a role in its
solution.
3. Propose a model for the problem, using
scientific or engineering knowledge of the
phenomenon being studied. State any
limitations or assumptions of the model.
4. Conduct appropriate experiments and collect
data to test or validate the tentative model
or conclusions made in steps 2 and 3.
5. Refine the model on the basis of the
observed data.
6. Manipulate the model to assist in developing a
solution to the problem.
7. Conduct an appropriate experiment to
confirm that the proposed solution to the
problem is both effective and efficient.
8. Draw conclusions or make recommendations
based on the problem solution.
1-1 The Engineering Method and
Statistical Thinking

Figure 1-1 The engineering method

The Engineering or Scientific Method

– Figure 1-1 describes the scientific or engineering method.

– Several steps rely on statistical methods
– Conduct experiments – how are efficient experiments designed?
– Identify the important factors – how do we account for variability when
we measure these factors?
– Confirm the solution – how do we accept or reject a
solution/hypothesis based on measurements?
– Variability complicates the task.
– Statistical methods help us understand and deal with variability.

• Statistical techniques are useful for describing and
understanding variability.
• By variability, we mean that successive observations of a
system or phenomenon do not produce exactly the same
result.
• Statistics gives us a framework for describing this
variability and for learning about potential sources of
variability.
Why is variability important to us?

– We want to predict results and control results with

accuracy. Variability makes predictions and control
more difficult and less accurate.
– If a particular part was required to be 1” ± 0.010” and
the actual standard deviation was 0.010”, almost one-third
of the parts would be out of tolerance (if the dimension is
normally distributed), even if their mean was exactly 1.000”!
– Would you rather work in a room that had a constant
temperature of 70° or one where the temperature
alternated between 50° and 90° every 30 minutes?
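
The "almost one-third out of tolerance" figure above can be checked with a short calculation. The sketch below assumes the part dimension is normally distributed; the function names are illustrative and not from the slides.

```python
import math

def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def fraction_out_of_tolerance(mean: float, sd: float, lower: float, upper: float) -> float:
    """Fraction of parts outside [lower, upper], assuming a normal distribution."""
    inside = normal_cdf((upper - mean) / sd) - normal_cdf((lower - mean) / sd)
    return 1.0 - inside

# Tolerance 1.000 +/- 0.010 inches, process centred on target, sd = 0.010 inches
frac = fraction_out_of_tolerance(mean=1.000, sd=0.010, lower=0.990, upper=1.010)
print(f"fraction out of tolerance: {frac:.3f}")
```

With the tolerance limits sitting exactly one standard deviation from the mean, roughly 32% of parts fall outside them, which is the "almost one-third" quoted above.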
Why Do We Study Probability & Statistics?
– Statistics – deals with the collection, presentation, analysis, and use
of data to make decisions, solve problems, and design products &
processes.
– We use statistics to draw inferences. Examples: quality, performance, or
durability of a product, weather forecasts, utilization or loading of a system.
– Probability – allows us to use information & data to make intelligent
statements & forecasts about future events.
– Probability helps quantify the risks associated with statistical
inference.
– Probability & statistics are foundations for other coursework, e.g. reliability and
quality courses, robust design, simulation, design of experiments,
decision analysis, forecasting, time-series analysis, and operations
research.
What do we want to know about our data?

A measure of central tendency: average or mean

$$\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{1}{n}\sum_{i=1}^{n} x_i$$

A measure of variability: sample variance

$$s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2$$

Sample standard deviation:

$$s = \sqrt{s^2}$$

We build models to explain this variability.
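An Example

Sample 1   Sample 2
   17         23
   21         16
   23         17
   20         21
   18         25
   22         18
   17         15
   19         23
   22         18
   21         24

$\bar{x}_1 = 20$, $\bar{x}_2 = 20$
$s_1 = 2.16$, $s_2 = 3.62$

[Figure: both samples plotted about the common mean $\bar{x} = 20$, with limits $\bar{x} \pm 3s_1$ and $\bar{x} \pm 3s_2$.]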
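
The summary values for the two samples can be reproduced with Python's standard `statistics` module. This is a minimal sketch: the data are the ten observations per sample listed above, and `stdev` uses the n − 1 divisor of the sample variance formula.

```python
import statistics

sample_1 = [17, 21, 23, 20, 18, 22, 17, 19, 22, 21]
sample_2 = [23, 16, 17, 21, 25, 18, 15, 23, 18, 24]

for name, data in (("Sample 1", sample_1), ("Sample 2", sample_2)):
    mean = statistics.mean(data)
    s = statistics.stdev(data)  # sample standard deviation (divides by n - 1)
    lower, upper = mean - 3 * s, mean + 3 * s
    print(f"{name}: mean = {mean}, s = {s:.2f}, 3s limits = ({lower:.2f}, {upper:.2f})")
```

Both samples share the same mean, but the second has the larger standard deviation, so its 3s limits are wider.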
Sample vs. Population Measures – Statistical Inference


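– The sample mean ($\bar{x}$) estimates the population mean ($\mu$)

– The sample variance ($s^2$) estimates the population variance ($\sigma^2$)

MEAN:
$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i \qquad \mu = \frac{1}{N}\sum_{i=1}^{N} x_i$$

VARIANCE:
$$s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2 \qquad \sigma^2 = \frac{1}{N}\sum_{i=1}^{N} (x_i - \mu)^2$$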
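
One way to see why the sample variance divides by n − 1 while the population variance divides by N is to average each estimator over every possible sample. The sketch below uses a small hypothetical population (the values 2, 4, 6, 8 are made up for illustration) and enumerates all ordered samples of size 2 drawn with replacement.

```python
import itertools
import statistics

population = [2, 4, 6, 8]                  # hypothetical population, N = 4
sigma2 = statistics.pvariance(population)  # population variance, divides by N

# Every ordered sample of size n = 2 drawn with replacement (16 samples)
samples = list(itertools.product(population, repeat=2))

# Average of the n - 1 estimator and of the divide-by-n estimator over all samples
mean_s2 = statistics.mean(statistics.variance(s) for s in samples)
mean_biased = statistics.mean(statistics.pvariance(s) for s in samples)

print(f"population variance        : {sigma2}")
print(f"average of s^2 (n - 1)     : {mean_s2}")
print(f"average of divide-by-n form: {mean_biased}")
```

Under these assumptions the n − 1 form averages to the population variance, while the divide-by-n form underestimates it.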

The population can sometimes be conceptual

and essentially have infinite size.
Sample vs. Population
We use sample measures ($\bar{x}$, $s^2$) to draw conclusions about the
population measures ($\mu$, $\sigma^2$).
– The sample will be a (random) subset of the population
– The population may not yet exist, so the sample may be from a
small set of prototypes (analytic)
– There is an issue of stability – do the prototypes accurately reflect the
prospective population?

Sample Data – May be obtained from:

– Observational Study – sample is drawn randomly from current

process or system
– Designed experiment – deliberate changes are made to the
controllable variables of a process or system. The system output
is observed & inferences made about the effects of controlling
the input.
– Retrospective Study – Historical observations. Were you
fortunate enough that the needed variables were actually
collected accurately!?

Concept of Models
– Common engineering/physical models:
– F = ma
– I = E/R
– d = vt
Let the data do the talking, right?
– Mechanistic models: used when we understand the physical
mechanism relating these variables.
– Empirical models: use our engineering & scientific knowledge of
the phenomena, but are not built on first-principle
understanding of the underlying mechanism. They are data

Mechanistic and Empirical Models

A mechanistic model is built from our underlying
knowledge of the basic physical mechanism that relates
several variables.
Example: Ohm’s Law
Current = voltage/resistance
I = E/R
I = E/R + ε
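
The role of the error term in the Ohm's law model can be illustrated with a small simulation. This is only a sketch: the voltage, resistance, and noise level below are assumed values, and the error is assumed to be normally distributed with mean zero.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is reproducible

E = 12.0         # volts (assumed value)
R = 4.0          # ohms (assumed value)
NOISE_SD = 0.05  # amperes, assumed standard deviation of the error term

def measure_current() -> float:
    """One observed current: the mechanistic value E/R plus a random error."""
    return E / R + random.gauss(0.0, NOISE_SD)

readings = [measure_current() for _ in range(1000)]
mean_reading = statistics.mean(readings)
sd_reading = statistics.stdev(readings)

print(f"model value  : {E / R:.3f} A")
print(f"mean reading : {mean_reading:.3f} A")
print(f"std deviation: {sd_reading:.3f} A")
```

Individual readings differ from E/R, but their average sits close to the mechanistic value, and their spread reflects the size of the error term.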


An empirical model is built from our engineering and

scientific knowledge of the phenomenon, but is not
directly developed from our theoretical or first-
principles understanding of the underlying mechanism.

Suppose we are interested in the average molecular weight (Mn)
of a polymer. Now we know that Mn is related to the viscosity of
the material (V), and it also depends on the amount of catalyst
(C) and the temperature (T ) in the polymerization reactor when
the material is manufactured. The relationship between Mn and
these variables is

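Mn = f(V, C, T)
say, where the form of the function f is unknown.

One possible working model is a first-order (linear) approximation such as
Mn = β0 + β1V + β2C + β3T
where the β’s are unknown parameters.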


In general, this type of empirical model is called a

regression model.
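
As a minimal sketch of fitting such a model, the code below estimates a straight line y = b0 + b1·x by ordinary least squares. The single-predictor form and the data values are assumptions chosen for illustration, and `fit_line` is a hypothetical helper, not something from the slides.

```python
import statistics

def fit_line(x, y):
    """Ordinary least-squares estimates (b0, b1) for the model y = b0 + b1*x."""
    x_bar, y_bar = statistics.mean(x), statistics.mean(y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = s_xy / s_xx          # slope
    b0 = y_bar - b1 * x_bar   # intercept
    return b0, b1

# Made-up observations that roughly follow y = 2x
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = fit_line(x, y)
print(f"estimated line: y = {b0:.2f} + {b1:.2f} x")
```

The same idea extends to several predictors, where the unknown parameters are estimated from the data rather than derived from first principles.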

The estimated regression line is given by


Figure 1-15 Three-dimensional plot of the wire and pull strength


Figure 1-16 Plot of the predicted values of pull strength from the
empirical model.

Designing Engineering Experiments

– Experiments are often used to confirm theory or to evaluate
various design options
– Often, several factors may be important
– Each factor may have more than one level of concern
– Full factorial design – considers every combination of the levels of all factors
– For K factors, each having two levels, a total of N = 2^K experiments are required
– For K = 4, N = 16
– For K = 8, N = 256
– Fractional factorial design – only a subset of factor
combinations are actually tested
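
The run counts above can be checked by enumerating the designs. In this sketch the two levels of each factor are coded as −1 and +1, and the half fraction keeps the runs whose coded levels multiply to +1. That is one common way to choose a subset and is an assumption here, not something the slides specify.

```python
import itertools
import math

def full_factorial(k: int):
    """All 2**k runs of a two-level design, levels coded -1 (low) and +1 (high)."""
    return list(itertools.product((-1, 1), repeat=k))

def half_fraction(k: int):
    """Runs whose coded levels multiply to +1 (one possible half fraction)."""
    return [run for run in full_factorial(k) if math.prod(run) == 1]

for k in (4, 8):
    print(f"K = {k}: full factorial = {len(full_factorial(k))} runs, "
          f"half fraction = {len(half_fraction(k))} runs")
```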

Design of Experiments (DOE)

– Assume you want to investigate the impact of three factors on
the pull-off force of a connector:
– Wall thickness (3/32” and 1/8”)
– Cure times (1 hour and 24 hours)
– Cure temperature (70°F and 100°F)
– We can now conduct an experiment to assess the impact of
each of these variables (separately & interacting), each variable
being assessed at two different levels
– Since other sources of variability may be present, we would
do multiple experiments (replicate) at each design point.
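
The connector experiment described above can be laid out as a list of runs. This is a sketch only: the dictionary keys are illustrative names, and the number of replicates (two per design point) is an assumed value.

```python
import itertools

# Factors and their two levels, as given for the connector example
factors = {
    "wall_thickness_in": ("3/32", "1/8"),
    "cure_time_h": (1, 24),
    "cure_temp_F": (70, 100),
}
REPLICATES = 2  # assumed number of repeated runs at each design point

# One design point per combination of factor levels (2 x 2 x 2 = 8)
design_points = [
    dict(zip(factors, levels))
    for levels in itertools.product(*factors.values())
]

# Repeat each design point to allow for other sources of variability
runs = [point for point in design_points for _ in range(REPLICATES)]

for point in design_points:
    print(point)
print(f"{len(design_points)} design points, {len(runs)} runs in total")
```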
Full Factorial Design

Figure S1-1 The factorial experiment for the connector wall thickness problem.
Importance of Factor Interactions

Figure S1-2 The two-factor interaction between cure time and cure temperature.

The Key Distinction

– The key difference between observational studies and

experimental designs is this:

– In a proper experiment you can eliminate confounding factors and

isolate effects of interest.
– In an observational study you take existing data. This may make it
impossible to distinguish the effects of two factors that appear to
explain observations equally well.

Time Series
– The correct analysis and interpretation of data collected over
time is very important in assessing & controlling the performance
of a system or process.

– When is performance normal & when is it out of control?

– What factors are driving a system out of control?
– What corrections should be applied to regain control?
– When has a change occurred – a fundamental shift in the process
Observing Processes Over Time

Figure 1-11 Adjustments applied to random disturbances overcontrol
the process and increase the deviations from the target.

Figure 1-12 Process mean shift is detected at observation number 57, and one
adjustment (a decrease of two units) reduces the deviations from target.

Figure 1-13 A control chart for the chemical process concentration data.
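
The idea behind a control chart can be sketched as a center line with limits three standard deviations either side, against which new observations are checked. The baseline and new observations below are hypothetical, and using the sample standard deviation of the baseline is a simplifying assumption. Control charts for individual measurements often estimate the spread from moving ranges instead.

```python
import statistics

# Hypothetical in-control baseline data used to set the limits
baseline = [17, 21, 23, 20, 18, 22, 17, 19, 22, 21]

center = statistics.mean(baseline)
sigma = statistics.stdev(baseline)
ucl = center + 3 * sigma   # upper control limit
lcl = center - 3 * sigma   # lower control limit

# Hypothetical new observations to monitor
new_obs = [19, 22, 20, 27, 18, 12]
out_of_control = [
    (i, x) for i, x in enumerate(new_obs, start=1) if not lcl <= x <= ucl
]

print(f"center = {center}, LCL = {lcl:.2f}, UCL = {ucl:.2f}")
print("out-of-control points (observation number, value):", out_of_control)
```

Points inside the limits are treated as ordinary variability and left alone. Points outside them signal that something may have changed in the process.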
Probability and Probability Models

• Probability models help quantify the
risks involved in statistical inference, that
is, risks involved in decisions made every
day.
• Probability provides the framework for
the study and application of statistics.

