Binary Logistic Regression with SPSS
Karl L. Wuensch
Dept of Psychology
East Carolina University
Predictor Variables
Gender
Ethical Idealism (9-point Likert)
Ethical Relativism (9-point Likert)
Purpose of the Research: Cosmetic, Theory Testing, Meat Production, Veterinary, Medical
$\ln(\text{ODDS}) = \ln\left(\dfrac{\hat{Y}}{1-\hat{Y}}\right) = a + bX$
SPSS
Bring the data into SPSS
http://core.ecu.edu/psyc/wuenschk/SPSS/Logistic.sav
Analyze, Regression, Binary Logistic
Decision → Dependent
Gender → Covariate(s), OK
Unweighted Cases                           N      Percent
Selected Cases     Included in Analysis    315    100.0
                   Missing Cases           0      .0
                   Total                   315    100.0
Unselected Cases                           0      .0
Total                                      315    100.0
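Variables in the Equation
            B       S.E.    Wald      df    Sig.    Exp(B)
Constant    -.379   .115    10.919    1     .001    .684

$\ln(\text{ODDS}) = \ln\left(\dfrac{\hat{Y}}{1-\hat{Y}}\right) = -.379$
$\text{Exp}(B) = e^{-.379} = .684 = 128/187$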
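As a quick check on the intercept-only model, the constant can be recovered from the observed votes alone (128 continue, 187 stop). A minimal Python sketch; the variable names are illustrative and are not SPSS output.

```python
import math

# Observed votes in the sample (see the Step 0 classification table).
n_continue = 128
n_stop = 187

# With no predictors, the predicted odds equal the observed odds.
odds = n_continue / n_stop           # Exp(B) for the constant
log_odds = math.log(odds)            # B for the constant
prob_continue = odds / (1 + odds)    # same as 128 / 315

print(f"odds = {odds:.3f}, ln(odds) = {log_odds:.3f}, "
      f"P(continue) = {prob_continue:.3f}")
```

Because the predicted probability of continuing is below .5 for everyone, the intercept-only model predicts "stop" for every case.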
Probabilities
.7(1) = .70
Classification Table, Step 0
                        Predicted decision
Observed decision       stop    continue    Percentage Correct
stop                    187     0           100.0
continue                128     0           .0
Overall Percentage                          59.4
Block 1 Model
Gender has now been added to the
model.
Model Summary: -2 Log Likelihood = how
poorly model fits the data.
Model Summary
Step    -2 Log likelihood    Cox & Snell R Square    Nagelkerke R Square
1       399.913a             .078                    .106
Block 1 Model
For intercept only, -2LL = 425.566.
Add gender and -2LL = 399.913.
Omnibus Tests: Drop in -2LL = 25.653 = Model χ².
df = 1, p < .001.
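With a single dichotomous predictor, the fitted probabilities equal the observed proportions in each group, so both -2LL values can be rebuilt from cell counts. The sketch below assumes the counts from the gender * decision crosstabulation (women: 140 stop, 60 continue; men: 47 stop, 68 continue); the helper names are made up for illustration.

```python
import math

def neg2ll(groups):
    """-2 log likelihood when each group's predicted probability equals its
    observed proportion. groups is a list of (n_stop, n_continue) pairs."""
    total = 0.0
    for n_stop, n_continue in groups:
        n = n_stop + n_continue
        for count in (n_stop, n_continue):
            if count > 0:
                total += count * math.log(count / n)
    return -2 * total

def chi2_sf_df1(x):
    """Upper-tail p for chi-square on 1 df: P(Z^2 > x) = erfc(sqrt(x / 2))."""
    return math.erfc(math.sqrt(x / 2))

neg2ll_intercept = neg2ll([(187, 128)])         # everyone in one group
neg2ll_gender = neg2ll([(140, 60), (47, 68)])   # women, men
drop = neg2ll_intercept - neg2ll_gender         # model chi-square, df = 1
p_value = chi2_sf_df1(drop)

print(f"-2LL with gender = {neg2ll_gender:.3f}")
print(f"drop in -2LL = {drop:.3f}, p = {p_value:.2g}")
```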
Omnibus Tests of Model Coefficients
                   Chi-square    df    Sig.
Step 1    Step     25.653        1     .000
          Block    25.653        1     .000
          Model    25.653        1     .000
$\text{ODDS} = e^{a + b\,\text{Gender}}$

Variables in the Equation
            B        S.E.    Wald      df    Sig.    Exp(B)
gender      1.217    .245    24.757    1     .000    3.376
Constant    -.847    .154    30.152    1     .000    .429
Odds, Women
$\text{ODDS} = e^{-.847 + 1.217(0)} = e^{-.847} = 0.429$
Odds, Men
$\text{ODDS} = e^{-.847 + 1.217(1)} = e^{.37} = 1.448$
Odds Ratio
$\dfrac{\text{male odds}}{\text{female odds}} = \dfrac{1.448}{.429} = 3.376 = e^{1.217}$
1.217 was the B (slope) for Gender, 3.376 is the
Exp(B), that is, the exponentiated slope, the
odds ratio.
Men's odds of voting to continue the research are 3.376 times
those of women.
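Women: $\hat{Y} = \dfrac{\text{ODDS}}{1 + \text{ODDS}} = \dfrac{0.429}{1.429} = 0.30$
Men: $\hat{Y} = \dfrac{\text{ODDS}}{1 + \text{ODDS}} = \dfrac{1.448}{2.448} = 0.59$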
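The conversion from coefficients to odds, odds ratio, and predicted probabilities can be sketched in a few lines of Python. The coefficients are the B values reported for the gender model; the function names are illustrative only, and gender is coded 0 = female, 1 = male as in the calculations above.

```python
import math

a = -0.847   # constant (B)
b = 1.217    # slope (B) for gender, coded 0 = female, 1 = male

def odds(gender):
    """Predicted odds of voting to continue."""
    return math.exp(a + b * gender)

def prob(gender):
    """Predicted probability of voting to continue: odds / (1 + odds)."""
    o = odds(gender)
    return o / (1 + o)

odds_ratio = odds(1) / odds(0)   # algebraically equal to exp(b)

for label, g in (("women", 0), ("men", 1)):
    print(f"{label}: odds = {odds(g):.3f}, P(continue) = {prob(g):.2f}")
print(f"odds ratio = {odds_ratio:.3f}")
```

Small differences in the third decimal place relative to the SPSS table are expected, since the B values used here are already rounded.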
Classification
Decision Rule: If Prob(event) ≥ Cutoff,
then predict event will take place.
By default, SPSS uses .5 as Cutoff.
For every man, Prob(continue) = .59,
predict he will vote to continue.
For every woman Prob(continue) = .30,
predict she will vote to stop it.
Classification Table, Step 1
                        Predicted decision
Observed decision       stop    continue    Percentage Correct
stop                    140     47          74.9
continue                60      68          53.1
Overall Percentage                          66.0
a. The cut value is .500

Overall: $\dfrac{140 + 68}{315} = \dfrac{208}{315} = 66\%$
Sensitivity
P (correct prediction | event did occur)
P (predict Continue | subject voted to Continue)
Of all those who voted to continue the research,
for how many did we correctly predict that.
$\dfrac{68}{68 + 60} = \dfrac{68}{128} = 53\%$
Specificity
P (correct prediction | event did not occur)
P (predict Stop | subject voted to Stop)
Of all those who voted to stop the research, for
how many did we correctly predict that.
$\dfrac{140}{140 + 47} = \dfrac{140}{187} = 75\%$
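False Positive Rate
Of all those predicted to vote to continue, the proportion who actually voted to stop.
$\dfrac{47}{47 + 68} = \dfrac{47}{115} = 41\%$
False Negative Rate
Of all those predicted to vote to stop, the proportion who actually voted to continue.
$\dfrac{60}{140 + 60} = \dfrac{60}{200} = 30\%$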
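All four rates come from the same 2 x 2 classification table. A minimal Python sketch using the Step 1 counts; the variable names are illustrative. Note that the false positive and false negative rates here condition on the prediction, as in these slides, which differs from the "1 - specificity" usage common elsewhere.

```python
# Step 1 classification table; "continue" is the event.
tn = 140   # voted stop, predicted stop
fp = 47    # voted stop, predicted continue
fn = 60    # voted continue, predicted stop
tp = 68    # voted continue, predicted continue

sensitivity = tp / (tp + fn)            # P(predict continue | voted continue)
specificity = tn / (tn + fp)            # P(predict stop | voted stop)
false_positive_rate = fp / (fp + tp)    # P(voted stop | predicted continue)
false_negative_rate = fn / (fn + tn)    # P(voted continue | predicted stop)
overall = (tp + tn) / (tp + tn + fp + fn)

for name, value in [("sensitivity", sensitivity),
                    ("specificity", specificity),
                    ("false positive rate", false_positive_rate),
                    ("false negative rate", false_negative_rate),
                    ("overall correct", overall)]:
    print(f"{name}: {value:.1%}")
```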
Pearson
Crosstabs Statistics
Statistics, Chi-Square, Continue
Crosstabs Cells
Cells, Observed Counts, Row Percentages
Crosstabs Output
Continue, OK
59% & 30% match logistic's predictions.
gender * decision Crosstabulation
                               decision
gender                         stop      continue    Total
Female    Count                140       60          200
          % within gender      70.0%     30.0%       100.0%
Male      Count                47        68          115
          % within gender      40.9%     59.1%       100.0%
Total     Count                187       128         315
          % within gender      59.4%     40.6%       100.0%
Crosstabs Output
Likelihood Ratio χ² = 25.653, as with
logistic.
Chi-Square Tests
                      Value      df    Asymp. Sig. (2-sided)
Pearson Chi-Square    25.685b    1     .000
Likelihood Ratio      25.653     1     .000
N of Valid Cases      315
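Both chi-square statistics can be reproduced from the observed counts and the expected counts under independence. A short Python sketch, assuming the four cell counts from the gender * decision crosstabulation; the variable names are illustrative.

```python
import math

observed = [[140, 60],   # female: stop, continue
            [47, 68]]    # male:   stop, continue

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

pearson = 0.0
likelihood_ratio = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / n    # expected count under independence
        pearson += (o - e) ** 2 / e              # sum of (O - E)^2 / E
        likelihood_ratio += 2 * o * math.log(o / e)   # G^2 = 2 * sum of O ln(O / E)

print(f"Pearson chi-square = {pearson:.3f}")
print(f"Likelihood ratio chi-square = {likelihood_ratio:.3f}")
```

The likelihood ratio statistic is the one that matches the drop in -2LL from the logistic regression.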
Model 2: Decision =
Idealism, Relativism, Gender
Analyze, Regression, Binary Logistic
Decision → Dependent
Gender, Idealism, Relatvsm → Covariate(s)
Continue, OK.
Model Summary
Step    -2 Log likelihood    Cox & Snell R Square    Nagelkerke R Square
1       346.503a             .222                    .300
Obtain p
Transform, Compute
Target Variable = p
Numeric Expression = 1 - CDF.CHISQ(53.41,2)
p=?
OK
Data Editor, Variable View
Set Decimal Points to 5 for p
Data Editor, Data View
p = .00000
p < .0001
Adding the ethical ideology variables
significantly improved the model.
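The value 53.41 is the drop in -2LL from the gender-only model (399.913) to the model with gender, idealism, and relativism (346.503), tested on 2 df because two predictors were added. Outside SPSS the same p can be sketched in Python; this uses the fact that the chi-square upper tail on 2 df has the closed form exp(-x/2), so no statistics library is assumed.

```python
import math

neg2ll_gender_only = 399.913     # Model 1
neg2ll_with_ideology = 346.503   # Model 2: gender + idealism + relativism

drop = neg2ll_gender_only - neg2ll_with_ideology   # chi-square for the added block
df = 2                                             # two predictors added

# Equivalent of 1 - CDF.CHISQ(drop, 2): for 2 df the upper tail is exp(-x / 2).
p = math.exp(-drop / 2)

print(f"chi-square = {drop:.2f}, df = {df}, p = {p:.2e}")
```

The exact p is far smaller than five decimal places can show, which is why the Data View displays .00000.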
Hosmer-Lemeshow
H0: predictions made by the model fit
perfectly with observed group
memberships
Cases are arranged in order by their
predicted probability on the criterion.
Then divided into ten bins with
approximately equal n.
This gives ten rows in the table.
Contingency Table for Hosmer and Lemeshow Test
            decision = stop         decision = continue
Step 1      Observed   Expected     Observed   Expected     Total
1           29         29.331       3          2.669        32
2           30         27.673       2          4.327        32
3           28         25.669       4          6.331        32
4           20         23.265       12         8.735        32
5           22         20.693       10         11.307       32
6           15         18.058       17         13.942       32
7           15         15.830       17         16.170       32
8           10         12.920       22         19.080       32
9           12         9.319        20         22.681       32
10          6          4.241        21         22.759       27

Hosmer and Lemeshow Test
Step    Chi-square    df    Sig.
1       8.810         8     .359
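The Hosmer-Lemeshow chi-square is the usual sum of (observed - expected)² / expected over all 20 cells of the ten-bin table, on (number of bins - 2) df. A Python sketch using the tabled values; the closed-form tail probability below holds only for even df and is used here to avoid assuming any statistics library.

```python
import math

observed_stop = [29, 30, 28, 20, 22, 15, 15, 10, 12, 6]
expected_stop = [29.331, 27.673, 25.669, 23.265, 20.693,
                 18.058, 15.830, 12.920, 9.319, 4.241]
observed_cont = [3, 2, 4, 12, 10, 17, 17, 22, 20, 21]
expected_cont = [2.669, 4.327, 6.331, 8.735, 11.307,
                 13.942, 16.170, 19.080, 22.681, 22.759]

def chi2_sf_even_df(x, df):
    """Upper-tail p for chi-square with even df:
    exp(-x/2) * sum over k < df/2 of (x/2)^k / k!"""
    half = x / 2
    return math.exp(-half) * sum(half ** k / math.factorial(k)
                                 for k in range(df // 2))

hl_chi_square = sum((o - e) ** 2 / e
                    for o, e in zip(observed_stop + observed_cont,
                                    expected_stop + expected_cont))
df = len(observed_stop) - 2    # ten bins give 8 df
p_value = chi2_sf_even_df(hl_chi_square, df)

print(f"Hosmer-Lemeshow chi-square = {hl_chi_square:.3f}, "
      f"df = {df}, p = {p_value:.3f}")
```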
Hosmer-Lemeshow
There are problems with this procedure.
Not even Hosmer and Lemeshow
recommend it these days.
Even with good fit the test may be
significant if sample sizes are large.
Even with poor fit the test may not be
significant if sample sizes are small.
Box-Tidwell
If the interaction between a predictor and its natural log is
significant, there is a problem.
For the troublesome predictor, try
including the square of that predictor.
That is, add a polynomial component to
the model.
See
T-Test versus Binary Logistic Regression
Variables in the Equation
                                     B         S.E.     Wald      df    Sig.    Exp(B)
Step 1a   gender                     1.147     .269     18.129    1     .000    3.148
          idealism                   1.130     1.921    .346      1     .556    3.097
          relatvsm                   1.656     2.637    .394      1     .530    5.240
          idealism by idealism_LN    -.652     .690     .893      1     .345    .521
          relatvsm by relatvsm_LN    -.479     .949     .254      1     .614    .620
          Constant                   -5.015    5.877    .728      1     .393    .007
a. Variable(s) entered on step 1: gender, idealism, relatvsm, idealism * idealism_LN, relatvsm * relatvsm_LN.
No Problem Here.
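Each Box-Tidwell term is a continuous predictor multiplied by its own natural log (for example, idealism * idealism_LN). A minimal Python sketch of building such a term; the scores below are hypothetical and are not taken from the data file, and the function name is made up for illustration.

```python
import math

def box_tidwell_term(x):
    """Return x * ln(x), the predictor-by-log interaction term.
    The natural log requires a strictly positive score."""
    if x <= 0:
        raise ValueError("Box-Tidwell terms need strictly positive scores")
    return x * math.log(x)

# Hypothetical idealism scores on the 9-point scale (illustration only).
idealism = [3.5, 6.2, 7.0, 8.4]
idealism_ln = [math.log(x) for x in idealism]
idealism_by_ln = [box_tidwell_term(x) for x in idealism]

for x, term in zip(idealism, idealism_by_ln):
    print(f"idealism = {x:.1f}, idealism * ln(idealism) = {term:.3f}")
```

The terms are then entered as extra covariates alongside the original predictors; a significant coefficient for a term suggests the logit is not linear in that predictor.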
Model 3: Decision =
Idealism, Relativism, Gender, Purpose
Need 4 dummy variables to code the five
purposes.
Consider the Medical group a reference
group.
Dummy variables are: Cosmetic, Theory,
Meat, Veterin.
0 = not in this group, 1 = in this group.
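The dummy coding can be sketched as follows. The four dummy names follow the slide; the function name and the lowercase purpose labels passed to it are assumptions for illustration.

```python
# Four dummy variables code five purposes; Medical is the reference group.
DUMMIES = ["cosmetic", "theory", "meat", "veterin"]
REFERENCE = "medical"

def dummy_code(purpose):
    """Return a dict of 0/1 dummy variables for one case."""
    purpose = purpose.lower()
    if purpose != REFERENCE and purpose not in DUMMIES:
        raise ValueError(f"unknown purpose: {purpose}")
    return {name: int(purpose == name) for name in DUMMIES}

for purpose in DUMMIES + [REFERENCE]:
    print(purpose, dummy_code(purpose))
```

A case in the Medical group scores 0 on all four dummies, so each dummy's coefficient compares that purpose with Medical.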
Block 0
Look at Variables not in the Equation.
Score is how much -2LL would drop if a
single variable were added to the model
with intercept only.
Variables not in the Equation
                                  Score     df    Sig.
Step 0    Variables    gender     25.685    1     .000
                       idealism   47.679    1     .000
                       relatvsm   7.239     1     .007
                       cosmetic   .003      1     .955
                       theory     2.933     1     .087
                       meat       .556      1     .456
                       veterin    .013      1     .909
          Overall Statistics      77.665    7     .000
Model Summary
Step    -2 Log likelihood    Cox & Snell R Square    Nagelkerke R Square
1       338.060a             .243                    .327
Classification Table
YOU calculate the sensitivity, specificity,
false positive rate, and false negative rate.
Classification Table(a), Step 1
                        Predicted decision
Observed decision       stop    continue    Percentage Correct
stop                    152     35          81.3
continue                54      74          57.8
Overall Percentage                          71.7
Answer Key
Wald Chi-Square
A conservative test of the unique
contribution of each predictor.
Presented in Variables in the Equation.
Alternative: drop one predictor from the
model, observe the increase in -2LL, test
via χ².
Variables in the Equation
                       B         Wald      df    Sig.    Exp(B)
Step 1a   gender       1.255     20.586    1     .000    3.508
          idealism     -.701     37.891    1     .000    .496
          relatvsm     .326      6.634     1     .010    1.386
          cosmetic     -.709     2.850     1     .091    .492
          theory       -1.160    7.346     1     .007    .314
          meat         -.866     4.164     1     .041    .421
          veterin      -.542     1.751     1     .186    .581
          Constant     2.279     4.867     1     .027    9.766
a. Variable(s) entered on step 1: gender, idealism, relatvsm, cosmetic, theory, meat, veterin.
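Each Exp(B) is simply e raised to the corresponding B, and the full set of coefficients gives a predicted probability for any case. A Python sketch using the tabled B values; the respondent at the end is hypothetical, and the function name is made up for illustration.

```python
import math

constant = 2.279
coefficients = {
    "gender": 1.255, "idealism": -0.701, "relatvsm": 0.326,
    "cosmetic": -0.709, "theory": -1.160, "meat": -0.866, "veterin": -0.542,
}

# Exponentiated slopes: the multiplicative change in the odds of voting to
# continue per one-unit increase in the predictor, holding the others fixed.
odds_ratios = {name: math.exp(b) for name, b in coefficients.items()}

def predicted_probability(case):
    """Predicted P(continue); predictors missing from `case` count as 0."""
    logit = constant + sum(b * case.get(name, 0)
                           for name, b in coefficients.items())
    return 1 / (1 + math.exp(-logit))

for name, ratio in odds_ratios.items():
    print(f"{name}: Exp(B) = {ratio:.3f}")

# Hypothetical respondent: male, idealism 7, relativism 6, theory-testing scenario.
case = {"gender": 1, "idealism": 7.0, "relatvsm": 6.0, "theory": 1}
print(f"P(continue) = {predicted_probability(case):.3f}")
```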
Answer Key
SAS Rules
See, on page 16 of the handout, how easy
SAS makes it to see the effect of changing
the cutoff.
SAS classification tables remove bias
(using a jackknifed classification
procedure); SPSS does not have this
feature.
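The effect of moving the cutoff can be illustrated with the gender-only model, where every woman has predicted probability .30 and every man .59. This Python sketch does not reproduce the SAS output; the three cutoff values and the function name are arbitrary choices for illustration.

```python
# (voted stop, voted continue, predicted P(continue)) for each gender.
groups = {
    "women": (140, 60, 0.30),
    "men": (47, 68, 0.59),
}

def classify(cutoff):
    """Return (sensitivity, specificity) when predicting 'continue'
    for every case whose predicted probability is >= cutoff."""
    tp = fp = tn = fn = 0
    for n_stop, n_continue, p in groups.values():
        if p >= cutoff:
            tp += n_continue
            fp += n_stop
        else:
            fn += n_continue
            tn += n_stop
    return tp / (tp + fn), tn / (tn + fp)

for cutoff in (0.25, 0.50, 0.75):
    sensitivity, specificity = classify(cutoff)
    print(f"cutoff {cutoff:.2f}: sensitivity = {sensitivity:.1%}, "
          f"specificity = {specificity:.1%}")
```

Lowering the cutoff raises sensitivity at the expense of specificity, and raising it does the reverse.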
Interaction Terms
May want to standardize continuous
predictor variables.
Compute the interaction terms or
Let Logistic compute them.
Dependent = Guilty.
Covariates = Delib, Plain.
In left pane highlight Delib and Plain.
Under Options, ask for the Hosmer-Lemeshow test and confidence intervals
on the odds ratios.
Significant Interaction
The interaction is large and significant
(odds ratio of .030), so we shall ignore the
main effects.
Variables in the Equation
                            Wald     df    Sig.    Exp(B)
Step 1a   Delib             3.697    1     .054    .338
          Plain             4.204    1     .040    3.134
          Delib by Plain    8.075    1     .004    .030
          Constant          .037     1     .847    1.077
Analyze, Crosstabs.
Rows = Plain, Columns = Guilty.
Statistics, Chi-square, Continue.
Cells, Observed Counts and Column
Percentages.
Continue, OK.
                                 Guilty
Plain                            No        Yes       Total
Attractive    Count              22        8         30
              % within Plain     73.3%     26.7%     100.0%
Plain         Count              29        1         30
              % within Plain     96.7%     3.3%      100.0%
Total         Count              51        9         60
              % within Plain     85.0%     15.0%     100.0%
a. Delib = Yes
                                 Guilty
Plain                            No        Yes       Total
Attractive    Count              13        14        27
              % within Plain     48.1%     51.9%     100.0%
Plain         Count              8         27        35
              % within Plain     22.9%     77.1%     100.0%
Total         Count              21        41        62
              % within Plain     33.9%     66.1%     100.0%
a. Delib = No
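The odds ratios in the logistic output can be recovered from these two crosstabs: the Plain odds ratio is the effect of plainness when Delib = No, and the interaction odds ratio is the ratio of the Plain effect when Delib = Yes to the Plain effect when Delib = No. A Python sketch from the cell counts; the dictionary layout and function names are illustrative.

```python
# (not guilty, guilty) counts for each defendant type, split by Delib.
tables = {
    "No":  {"attractive": (13, 14), "plain": (8, 27)},
    "Yes": {"attractive": (22, 8),  "plain": (29, 1)},
}

def odds_guilty(cell):
    """Odds of a guilty verdict for one row of a crosstab."""
    not_guilty, guilty = cell
    return guilty / not_guilty

def plain_odds_ratio(table):
    """Odds of guilty for plain defendants relative to attractive ones."""
    return odds_guilty(table["plain"]) / odds_guilty(table["attractive"])

or_delib_no = plain_odds_ratio(tables["No"])     # Exp(B) for Plain
or_delib_yes = plain_odds_ratio(tables["Yes"])
interaction = or_delib_yes / or_delib_no         # Exp(B) for Delib by Plain
constant = odds_guilty(tables["No"]["attractive"])   # Exp(B) for Constant

print(f"Plain odds ratio, Delib = No:  {or_delib_no:.3f}")
print(f"Plain odds ratio, Delib = Yes: {or_delib_yes:.3f}")
print(f"Interaction odds ratio: {interaction:.3f}")
print(f"Baseline odds (attractive, Delib = No): {constant:.3f}")
```

Plain defendants had higher odds of a guilty verdict than attractive ones when jurors did not deliberate, and lower odds when they did, which is why the main effects are not interpreted on their own.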
Standardizing Predictors
Most helpful with continuous predictors.
Especially when you want to compare the
relative contributions of predictors in the
model.
Also useful when the predictor is
measured in units that are not intrinsically
meaningful.
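Standardizing means converting each score to a z score (subtract the mean, divide by the standard deviation) before entering the predictor, so that Exp(B) describes the change in odds per one standard deviation. A minimal Python sketch with hypothetical scores; using the sample standard deviation (n - 1) is a choice made here for illustration.

```python
import statistics

def standardize(values):
    """Convert raw scores to z scores using the sample SD (n - 1)."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]

# Hypothetical idealism scores (illustration only, not from the data file).
scores = [3.5, 6.2, 7.0, 8.4, 5.1]
z_scores = standardize(scores)

for raw, z in zip(scores, z_scores):
    print(f"raw = {raw:.1f}, z = {z:+.3f}")
```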