Medical Statistics New


Prof. Dr. Mona Mostafa

1- Evaluation of Methods
2- Quality Control
3- Interpretation of Laboratory Results
4- Research
DATA

Qualitative data (count):
- Ratio ⇨ A/B
- Proportion ⇨ A/(A+B)
- Percentage ⇨ A/(A+B) × 100%
- Illustrate as ⇨ pie, bar, or proportional bar charts, or pictogram

Quantitative data (measure):
- Intervals: should not overlap, and are better equal in width
- Illustrate by histogram, bar chart, or frequency curve

- Mode: the most commonly occurring value
- Mean: arithmetic mean (∑x/n)
- Median: the 50th percentile, i.e. the value at rank (n+1)/2 when results are arranged in ascending order; 50% of results occur below it
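As a rough illustration of the three definitions above, the sketch below uses Python's standard `statistics` module. The result values are hypothetical and made up for the example.

```python
import statistics

# Hypothetical glucose results (mg/dL), made up for illustration only.
results = [88, 92, 95, 95, 101, 110, 140]

mode = statistics.mode(results)      # most commonly occurring value
mean = statistics.mean(results)      # sum(x) / n
median = statistics.median(results)  # middle value after sorting

# For an odd n, the median sits at rank (n + 1) / 2 (1-based) of the sorted data.
median_rank = (len(results) + 1) // 2
value_at_median_rank = sorted(results)[median_rank - 1]

print(mode, mean, median, value_at_median_rank)
```

The single high value (140) pulls the mean above the median, which is why the median is often preferred for skewed data.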
1- Range: lowest to highest value (the worst expression of scatter)
2- Interquartile range: 25th to 75th percentile (best used for skewed data)
3- Standard deviation (best used for normal distribution)

Reference range = X ± 2SD (2.5th to 97.5th percentile)
1- Variance (V) = ∑(x − X)² / (n − 1), where X is the mean
2- Standard deviation (SD) = √V
3- Coefficient of variation (CV) = (SD / X) × 100%
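A minimal sketch of the three formulas above, plus the X ± 2SD reference range, using plain Python. The function name `describe_scatter` and the control results are hypothetical.

```python
import math

def describe_scatter(results):
    """Return mean, variance (n - 1 denominator), SD and CV% of a list of results."""
    n = len(results)
    mean = sum(results) / n
    variance = sum((x - mean) ** 2 for x in results) / (n - 1)
    sd = math.sqrt(variance)
    cv_percent = sd / mean * 100
    return mean, variance, sd, cv_percent

# Hypothetical control results, made up for illustration only.
results = [4.8, 5.0, 5.1, 5.2, 5.4]
mean, variance, sd, cv_percent = describe_scatter(results)

# Reference range as mean +/- 2 SD (assumes a normal distribution).
low, high = mean - 2 * sd, mean + 2 * sd
print(round(mean, 3), round(variance, 3), round(sd, 3), round(cv_percent, 2))
print(round(low, 3), round(high, 3))
```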
X ± 1SD = 68% of all results
X ± 2SD = 95% of all results

Probability of occurrence of a value > X + 1SD is < 16% of all results
Probability of occurrence of a value > X + 2SD is < 2.5% of all results
Probability of occurrence of a value < X − 1SD is < 16% of all results
Probability of occurrence of a value < X − 2SD is < 2.5% of all results
Positively (+) skewed distribution:
- Most of the cases < mean
- Tailing of results to the right side
- mean > median > mode

Negatively (−) skewed distribution:
- Most of the cases > mean
- Tailing of results to the left side
- mean < median < mode
1- Population mean ⇨ μ (mu)
   Sample mean ⇨ x̄
2- Population standard deviation ⇨ σ (sigma)
   Sample standard deviation ⇨ SD or S
3- SE (standard error of the mean) = S/√n
   Sometimes used to correct for deviation of the sample mean from μ (it is considered a scatter parameter) in samples with a low number, and in this case it is better used instead of SD
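The SE formula can be sketched as follows. The function name `standard_error` and the sample values are hypothetical; `statistics.stdev` gives the sample SD with the n − 1 denominator.

```python
import math
import statistics

def standard_error(sample):
    """SE of the mean = S / sqrt(n), where S is the sample SD."""
    return statistics.stdev(sample) / math.sqrt(len(sample))

# Hypothetical sample, made up for illustration only.
sample = [4.8, 5.0, 5.1, 5.2, 5.4]
se = standard_error(sample)
print(round(se, 4))
```

Because n is in the denominator, SE shrinks as the sample grows, while SD does not.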
- Blind ⇨ the patient does not know the nature of the study
- Double blind ⇨ neither the doctor nor the patient knows
- Triple blind ⇨ the doctor, the patient, and the data analyzer do not know

Choose a sample to represent the population. It should be:
1- Well defined, e.g. adults with type 2 diabetes
2- Accurate and precise


Accuracy: nearness to the truth
How to improve:
A- Don't choose a group with very specific characteristics ⇨ bias (inaccuracy); better to choose from random tables
B- Use a control group: similar to the studied group in everything except the studied variable
C- Do blind and double-blind studies

Precision: reproducibility of results and their nearness to each other
How to improve:
A- Choose groups with relevant characteristics, e.g. males or females, children or adults
B- Increase sample size
Precision = √n/SD
C- A sample with low SD is more precise
Null hypothesis (H0): the occurrence of a result is due to chance (biological variation). When p > 0.05 the result is still within the confidence interval, so accept the null hypothesis (H0).

Reject the null hypothesis and accept the alternative (H1) when the result is not due to chance but due to a significant difference and does not belong to the confidence interval (p < 0.05).
F = S1² / S2²
Where:
S1 is the bigger SD
S2 is the smaller SD
Look for F in Fisher's test table

If significant:
1- Try to normalize your data
2- If not, use tests for non-parametric data
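A small sketch of the variance ratio described above, with the larger variance placed in the numerator. The function name `variance_ratio` and the data are hypothetical, and the significance lookup in the F table is left out.

```python
import statistics

def variance_ratio(group_a, group_b):
    """Return F = S1^2 / S2^2, where S1 is the bigger SD and S2 the smaller."""
    var_a = statistics.variance(group_a)  # variance = SD squared
    var_b = statistics.variance(group_b)
    return max(var_a, var_b) / min(var_a, var_b)

# Hypothetical groups, made up for illustration only.
group_a = [1, 2, 3, 4, 5]
group_b = [2, 4, 6, 8, 10]
f_value = variance_ratio(group_a, group_b)
print(f_value)
```

By construction F is never below 1, and swapping the two groups gives the same value.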
Test of normality:
- It is an aid to the graphical distribution to decide whether the data are normally distributed or not
- Null hypothesis: the data are normally distributed (significant result = non-parametric data)
- Many tests are available, but the most appropriate for SPSS are:
  1- Kolmogorov-Smirnov
  2- Shapiro-Wilk
Tests for parametric data:
1- Student's t test
2- Paired t test
3- ANOVA test

Student's t test:
Used to compare two groups

t = (x̄1 − x̄2) / √(S1²/n1 + S2²/n2)

Degrees of freedom = n1 + n2 − 2
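The t formula above can be sketched in a few lines. The function name `student_t` and the two groups are hypothetical; looking up the t value in a table (or computing a p value) is not shown.

```python
import math
import statistics

def student_t(group1, group2):
    """t = (mean1 - mean2) / sqrt(S1^2/n1 + S2^2/n2), df = n1 + n2 - 2."""
    n1, n2 = len(group1), len(group2)
    mean1, mean2 = statistics.mean(group1), statistics.mean(group2)
    var1, var2 = statistics.variance(group1), statistics.variance(group2)
    t = (mean1 - mean2) / math.sqrt(var1 / n1 + var2 / n2)
    return t, n1 + n2 - 2

# Hypothetical groups, made up for illustration only.
t_value, df = student_t([5, 6, 7, 8, 9], [1, 2, 3, 4, 5])
print(t_value, df)
```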
Paired t test
Uses:
1- Evaluation of methods
2- Drug monitoring
3- Matching pairs, i.e. repeated sampling after a certain variable changes (e.g. glucose assay before and after treatment)

t = x̄ / (SD/√n), df = n − 1
NB: x̄ and SD are those of the differences between pairs
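A minimal sketch of the paired formula, where the mean and SD are taken over the per-pair differences. The function name `paired_t` and the before/after values are hypothetical.

```python
import math
import statistics

def paired_t(before, after):
    """t = mean(d) / (SD(d) / sqrt(n)) with d = before - after, df = n - 1."""
    differences = [b - a for b, a in zip(before, after)]
    n = len(differences)
    mean_diff = statistics.mean(differences)
    sd_diff = statistics.stdev(differences)
    return mean_diff / (sd_diff / math.sqrt(n)), n - 1

# Hypothetical glucose results before and after treatment (same patients, same order).
before = [100, 110, 120, 130]
after = [96, 106, 116, 122]
t_value, df = paired_t(before, after)
print(t_value, df)
```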
ANOVA test (one-way):
* It is an extension of the t test
* Better used for more than two groups
* One-way = only one variable is tested at a time

F = difference between group means / difference within each group
F = difference between group means / difference due to chance
F = t² (when only two groups are compared)
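The F ratio above can be sketched as the between-group mean square divided by the within-group mean square. This is the standard one-way ANOVA computation, which the slides do not spell out, so treat it as an assumption. The function name `one_way_anova_f` and the data are hypothetical.

```python
import statistics

def one_way_anova_f(*groups):
    """F = (between-group mean square) / (within-group mean square)."""
    all_values = [x for group in groups for x in group]
    grand_mean = statistics.mean(all_values)
    k, n_total = len(groups), len(all_values)
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum(sum((x - statistics.mean(g)) ** 2 for x in g) for g in groups)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))

# Hypothetical groups, made up for illustration only.
f_two_groups = one_way_anova_f([5, 6, 7, 8, 9], [1, 2, 3, 4, 5])
f_three_groups = one_way_anova_f([1, 2, 3], [2, 3, 4], [3, 4, 5])
print(f_two_groups, f_three_groups)
```

For the two equal-sized groups used here, a t test gives t = 4, so F = 16 matches the F = t² relation.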
Transform your data into:
1- Log 10
2- Square
3- Square root

If the distribution becomes normal, use tests for parametric data.
1- Test for qualitative data (Chi-square test)
2- Tests for skewed quantitative data:
   * Wilcoxon's ranking (rank sum) test (in place of the t test)
   ** Wilcoxon's signed rank test (in place of the paired t test)
   *** Kruskal-Wallis test (in place of the ANOVA test)

NB: For data with a significant Fisher (F) test, use the same tests as for non-parametric data.
Chi-square test:
- A test for qualitative data
- Groups are arranged in columns
- Different outcomes or variations are arranged in rows
- The total for each column and row is calculated
- The expected value (E) for each cell is calculated
- (O − E)²/E is calculated for all cells:
  χ² = ∑ (O − E)²/E
- Null hypothesis (H0): the variable between groups has no effect, and the outcome will be according to the number in each cell in relation to the total
- df = (number of columns − 1)(number of rows − 1)
- Look for significance in a special table

NB:
Any number of groups and variables can be used at the same time.
If any cell contains an expected value of less than 5, shift to Fisher's exact test.
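The steps above can be sketched as follows. The expected value for each cell is taken as row total × column total / grand total, which is the usual reading of "according to the number in each cell in relation to total". The function name `chi_square` and the counts are hypothetical.

```python
def chi_square(observed):
    """observed: list of rows (outcomes), each a list of counts per column (group).

    Returns chi-square, degrees of freedom and the smallest expected value.
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    min_expected = float("inf")
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / grand_total  # expected count
            min_expected = min(min_expected, e)
            chi2 += (o - e) ** 2 / e
    df = (len(col_totals) - 1) * (len(row_totals) - 1)
    return chi2, df, min_expected

# Hypothetical 2 x 2 table: rows = outcome (improved / not), columns = groups.
chi2, df, min_expected = chi_square([[30, 10], [20, 40]])
print(round(chi2, 3), df, min_expected)
```

Returning the smallest expected value makes it easy to check the "less than 5" rule before trusting the result.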
Wilcoxon's rank sum test:
* Used for quantitative non-parametric data
* Used to compare two groups only
* Rank all your results as one group
* Give the average rank to 2 results with the same value
* Give the middle rank to 3 results with the same value
* Calculate T1 (total of ranks of the smaller group)
* Calculate the Z value and look it up in a special table for significance
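The ranking rules above (average rank for two tied results, middle rank for three) are both cases of giving tied values the mean of the ranks they occupy. A minimal sketch, with hypothetical function names `average_ranks` and `rank_sum_t1`; the Z value and table lookup are not shown.

```python
def average_ranks(values):
    """Rank values from 1 upward; tied values share the mean of their ranks."""
    ordered = sorted(values)
    rank_of = {}
    i = 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and ordered[j] == ordered[i]:
            j += 1
        # Sorted positions i..j-1 hold ranks i+1..j; their mean is (i + 1 + j) / 2.
        rank_of[ordered[i]] = (i + 1 + j) / 2
        i = j
    return [rank_of[v] for v in values]

def rank_sum_t1(smaller_group, larger_group):
    """Rank both groups together and total the ranks of the smaller group."""
    ranks = average_ranks(list(smaller_group) + list(larger_group))
    return sum(ranks[:len(smaller_group)])

print(average_ranks([10, 20, 20, 30]))
print(average_ranks([5, 7, 7, 7, 9]))
print(rank_sum_t1([1, 3], [2, 4, 5]))
```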
Wilcoxon's signed rank test:
* Used in place of the paired t test
* Get the difference for each pair
* Rank the differences, ignoring the sign
* Give the average rank if two values are the same and the middle rank if 3 are the same
* Restore the sign to make 2 groups (+ve and -ve)
* Look for the significance of T (total of the smaller group) in a special table
Kruskal-Wallis test:
* It is used to compare more than 2 non-parametric quantitative groups
* Look for the significance of H in chi-square tables
* df = K − 1 (K is the number of groups)
* The null hypothesis: there is no difference between groups regarding one variable, and the results could have been collected as one group
Correlation is a study to assess the degree of association between 2 variables.
* The original equation of linear regression is:
  y = a + bx
  Where:
  y = the result of the dependent variable
  x = the result of the independent variable
  a = the intercept on the y axis (constant error)
  b = the slope (proportional error)
* The null hypothesis (H0): there is no association between variables
* Get (r) for your study from a special equation and look for its significance in correlation tables
* df = n − 2
* If your (r) is > the critical r, your correlation (+ve or -ve) is significant, i.e. reject the null hypothesis
* If both variables are normally distributed, use Pearson's correlation (r)
* If one of them is skewed, either normalize it or use Spearman's rank correlation coefficient (rs)

NB:
Regression analysis is also used for method and instrument evaluation and for testing reagent or method linearity.
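The slides refer to "a special equation" for r without giving it. The sketch below assumes the standard Pearson product-moment formula and an ordinary least-squares fit of y = a + bx. The function name `correlation_and_regression` and the data are hypothetical.

```python
import math

def correlation_and_regression(xs, ys):
    """Return Pearson r, intercept a, slope b and df = n - 2 for paired data."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    r = sxy / math.sqrt(sxx * syy)
    b = sxy / sxx            # slope
    a = mean_y - b * mean_x  # intercept on the y axis
    return r, a, b, n - 2

# Hypothetical method comparison: the new method reads 1 + 2 * reference.
reference = [1, 2, 3, 4, 5]
new_method = [3, 5, 7, 9, 11]
r, a, b, df = correlation_and_regression(reference, new_method)
print(r, a, b, df)
```

In a method comparison, an intercept far from 0 suggests a constant error and a slope far from 1 suggests a proportional error, even when r itself is high.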
Multiple regression:
It is a special type of regression analysis to study the effect of multiple independent variables on one dependent variable, where each variable has its own r or squared r (r²).
* You get an F value for all the independent variables and search for its significance (p value)
* This test is used to get, for example, the best panel for diagnosis or prognosis of a disease
* Main items: TP, TN, FP and FN
* Sensitivity is positivity in disease = TP/(TP+FN)
* Specificity is negativity in health = TN/(TN+FP)
* Positive predictive value is true positive results to all positive results = TP/(TP+FP)
* Negative predictive value is true negative results to all negative results = TN/(TN+FN)
* Efficiency is true results to all results = (TP+TN)/(TP+TN+FP+FN)
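The five ratios above translate directly into code. The function name `diagnostic_performance` and the counts are hypothetical.

```python
def diagnostic_performance(tp, tn, fp, fn):
    """Compute the diagnostic indices from true/false positive/negative counts."""
    return {
        "sensitivity": tp / (tp + fn),  # positivity in disease
        "specificity": tn / (tn + fp),  # negativity in health
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
        "efficiency": (tp + tn) / (tp + tn + fp + fn),
    }

# Hypothetical study: 100 diseased and 100 healthy subjects.
performance = diagnostic_performance(tp=90, tn=80, fp=20, fn=10)
for name, value in performance.items():
    print(name, round(value, 3))
```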
ROC curve:
* Multiple cut-off values are chosen
* Calculate sensitivity and specificity for each
* Sensitivity is represented on the y axis
* 1 − specificity is represented on the x axis
* The best cut-off point is the one nearest to the upper left corner
* It is used to illustrate diagnostic performance
* AUC can be calculated and compared to the AUC of another parameter in research work
NB: Both parameters can be studied together and illustrated on a multi-ROC curve.
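One possible way to apply the "nearest to the upper left corner" rule is to measure each point's distance to (x = 0, y = 1) and to estimate the AUC with the trapezoid rule. Both choices are assumptions, since the slides do not state a method. The function names and the cut-off data are hypothetical.

```python
import math

def best_cutoff(points):
    """points: list of (cutoff, sensitivity, specificity).

    Returns the cut-off whose ROC point is closest to the upper left corner.
    """
    return min(points, key=lambda p: math.hypot(1 - p[2], 1 - p[1]))[0]

def auc_trapezoid(points):
    """Trapezoid-rule area under the ROC curve through the given points."""
    # x = 1 - specificity, y = sensitivity; add the (0, 0) and (1, 1) corners.
    xy = sorted([(1 - spec, sens) for _, sens, spec in points]
                + [(0.0, 0.0), (1.0, 1.0)])
    return sum((x2 - x1) * (y1 + y2) / 2 for (x1, y1), (x2, y2) in zip(xy, xy[1:]))

# Hypothetical cut-offs with their sensitivity and specificity.
points = [(5, 0.95, 0.60), (7, 0.85, 0.85), (9, 0.60, 0.95)]
cutoff = best_cutoff(points)
auc = auc_trapezoid(points)
print(cutoff, round(auc, 4))
```

With only three cut-offs the trapezoid estimate is coarse; more cut-offs give a smoother curve and a more reliable AUC.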
Incidence is the number of newly discovered cases in a place relative to the number of its population per unit of time, e.g. the number of new hepatitis cases in Giza is 2000:80000/year.

Prevalence is the ratio of already present cases (old + new cases = 4000) to the total number of population in this place, e.g. hepatitis prevalence in Giza is 5%.
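The arithmetic behind the Giza example can be checked in a few lines. The figures are the ones given above; the variable names are illustrative.

```python
population = 80_000
new_cases_per_year = 2_000  # newly discovered cases in one year
present_cases = 4_000       # old + new cases

incidence_per_year = new_cases_per_year / population  # fraction of population per year
prevalence = present_cases / population               # fraction of population

print(f"Incidence: {incidence_per_year:.1%} per year")
print(f"Prevalence: {prevalence:.1%}")
```

So 2000:80000 per year corresponds to an incidence of 2.5% per year, and 4000 present cases give the 5% prevalence quoted in the example.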
Incidence, prevalence, and how dangerous the disease is all affect the decision to choose a parameter with high sensitivity, high specificity, or both. For example, for cancer choose a parameter with high sensitivity in spite of low specificity, while for chronic hepatitis choose a parameter with high specificity.
