Activity #2: Regression Diagnostics
When requested, SPSS provides detailed output that allows you to perform diagnostics of your
regression model. Using the example from Activity #1, we are still predicting socioeconomic
index from gender, number of science courses taken and level of education. Examine all of the
output below. Detail what assumptions are being tested and whether the model meets these
assumptions.
Activities – Logistic Regression
1. Activity #1: Use the dataset to practice using SPSS to run logistic regression
If you have a device loaded with SPSS, please use the same dataset (Logistic Regression Dataset.sav) to rerun the logistic
regression with additional predictors. Otherwise, please proceed with Activity #2. The additional predictors are:
IV5: 30-Day use of alcohol (e6a), ratio level
IV6: ATOD attitudes Baseline (attitud1), ratio level
IV7: Friends ATOD attitudes Baseline (frdatt1), ratio level
IV8: Friends ATOD use (frduse1), ratio level
IV9: Usefulness for engaging in risky behaviors (userisk1), ratio level
2. Activity #2: Read and interpret the example outputs
Block 1: Method = Enter
Omnibus Tests of Model Coefficients
                 Chi-square   df   Sig.
Step 1   Step        59.706    9   .000
         Block       59.706    9   .000
         Model       59.706    9   .000
Residency 3 & 4: Logistic Regression
Page 1 of 3
Model Summary
Step   -2 Log likelihood   Cox & Snell R Square   Nagelkerke R Square
1      384.995(a)          .145                   .211
a. Estimation terminated at iteration number 5 because parameter estimates changed by less than .001.
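The pseudo R-squares in the Model Summary can be recovered from numbers already on this handout. The sketch below is a check under stated assumptions, not SPSS's own code: it takes the sample size from the four cell counts in the Classification Table, treats the omnibus Model chi-square as the drop in -2LL from the intercept-only model, and uses the standard Cox & Snell and Nagelkerke formulas. The variable names are illustrative.

```python
import math

# Values copied from the handout's output tables.
model_chi_square = 59.706      # Omnibus Tests, Model row
minus_2ll_model = 384.995      # Model Summary, -2 Log likelihood
n_cases = 31 + 72 + 15 + 263   # cell counts in the Classification Table

# The omnibus chi-square is the reduction in -2LL relative to the
# intercept-only (null) model, so the null -2LL can be backed out.
minus_2ll_null = minus_2ll_model + model_chi_square

# Cox & Snell: 1 - (L_null / L_model)^(2/n), written in terms of -2LL.
cox_snell = 1 - math.exp(-model_chi_square / n_cases)

# Nagelkerke rescales Cox & Snell by its maximum attainable value.
max_cox_snell = 1 - math.exp(-minus_2ll_null / n_cases)
nagelkerke = cox_snell / max_cox_snell

print(f"Cox & Snell R Square: {cox_snell:.3f}")
print(f"Nagelkerke R Square:  {nagelkerke:.3f}")
```

Both values round to the figures in the table (.145 and .211), which also confirms that the analysis used 381 cases.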
Hosmer and Lemeshow Test
Step   Chi-square   df   Sig.
1          14.768    8   .064
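As a check on the Sig. value reported for the Hosmer and Lemeshow test, the upper-tail chi-square probability can be computed directly. The sketch below is illustrative only: the function name chi_square_sf_even_df is hypothetical, and the closed-form sum it uses applies only when df is even (as it is here, df = 8).

```python
import math

def chi_square_sf_even_df(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable.

    Uses the closed form exp(-x/2) * sum_{i<df/2} (x/2)^i / i!,
    which is valid only for even degrees of freedom.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive, even df")
    half = x / 2.0
    term = 1.0    # current term (x/2)^i / i!, starting at i = 0
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total

# Chi-square and df copied from the Hosmer and Lemeshow Test table.
p_value = chi_square_sf_even_df(14.768, 8)
print(f"Hosmer-Lemeshow p-value: {p_value:.3f}")
```

The result rounds to .064, matching the table. Because this is above .05, the test does not flag a significant lack of fit at that level.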
Classification Table(a)
                                         Predicted: 8-Day use of cigarettes
Observed: 8-Day use of cigarettes        No Cigarettes Smoked   Smoked Cigarettes   Percentage Correct
Step 1   No Cigarettes Smoked                              31                  72                 30.1
         Smoked Cigarettes                                 15                 263                 94.6
         Overall Percentage                                                                       77.2
a. The cut value is .500
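The percentages in the Classification Table follow directly from its four cell counts. The short sketch below recomputes them; the variable names are illustrative, and "positive" is assumed to mean the Smoked Cigarettes group.

```python
# Cell counts copied from the Classification Table (cut value .500).
true_neg, false_pos = 31, 72     # observed No Cigarettes Smoked
false_neg, true_pos = 15, 263    # observed Smoked Cigarettes

total = true_neg + false_pos + false_neg + true_pos

# Percentage correct within each observed group, and overall.
specificity = 100 * true_neg / (true_neg + false_pos)
sensitivity = 100 * true_pos / (true_pos + false_neg)
overall = 100 * (true_neg + true_pos) / total

# Accuracy from always predicting the larger observed group.
baseline = 100 * max(true_neg + false_pos, true_pos + false_neg) / total

print(f"No Cigarettes Smoked correct: {specificity:.1f}%")
print(f"Smoked Cigarettes correct:    {sensitivity:.1f}%")
print(f"Overall correct:              {overall:.1f}%")
print(f"Largest-group baseline:       {baseline:.1f}%")
```

The first three values reproduce the table (30.1, 94.6, and 77.2). The baseline of about 73.0% is a useful comparison point: the model classifies smokers well but misclassifies most non-smokers, so the overall gain over always predicting "Smoked Cigarettes" is modest.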
Variables in the Equation
                                                              95% C.I. for EXP(B)
                           B   S.E.    Wald  df   Sig.  Exp(B)   Lower   Upper
Step 1(a)  sex(1)      -.582   .300   3.758   1   .053    .559    .310   1.006
           age          .139   .061   5.149   1   .023   1.149   1.019   1.295
           e6a          .009   .015    .320   1   .571   1.009    .979   1.040
           fatmosp1    -.315   .203   2.408   1   .121    .730    .490   1.086
           scpruse1    -.280   .233   1.441   1   .230    .756    .479   1.194
           attitud1     .791   .278   8.114   1   .004   2.205   1.280   3.799
           frdatt1     -.142   .244    .340   1   .560    .868    .538   1.399
           frduse1      .929   .264  12.409   1   .000   2.532   1.510   4.246
           userisk1     .231   .331    .487   1   .485   1.260    .659   2.410
           Constant   -2.667  1.490   3.204   1   .073    .069
a. Variable(s) entered on step 1: sex, age, e6a, fatmosp1, scpruse1, attitud1, frdatt1, frduse1, userisk1.
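The Wald, Exp(B), and confidence interval columns are all functions of B and S.E. The sketch below shows the usual relationships for one predictor (attitud1), assuming a normal-approximation interval with z = 1.96. The helper name odds_ratio_summary is hypothetical, and because the handout reports B and S.E. to three decimals, the recomputed values agree with the table only to about two decimals.

```python
import math

def odds_ratio_summary(b, se, z=1.96):
    """Derive the Wald statistic, odds ratio, and 95% CI from B and S.E."""
    return {
        "wald": (b / se) ** 2,           # Wald chi-square, df = 1
        "exp_b": math.exp(b),            # odds ratio, Exp(B)
        "ci_lower": math.exp(b - z * se),
        "ci_upper": math.exp(b + z * se),
    }

# B and S.E. for attitud1, copied from Variables in the Equation.
attitud1 = odds_ratio_summary(0.791, 0.278)
for name, value in attitud1.items():
    print(f"{name}: {value:.3f}")
```

Read this way, each one-unit increase in attitud1 multiplies the odds of the outcome by roughly 2.2, holding the other predictors constant, and the interval excludes 1, consistent with Sig. = .004. For sex(1), by contrast, the interval (.310 to 1.006) includes 1, consistent with Sig. = .053.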
Please check the above output and address the following questions:
1. What is the overall fit of the model? Please use the following criteria: Pearson Chi-Square Statistic, Deviance (-2LL), Hosmer-Lemeshow Test, Classification Table, and Pseudo R-squares. Are they better or worse than the model demonstrated earlier?
2. Are the individual predictors significant? How do you interpret the odds ratio (Exp(B))? More importantly, how does the result
of each individual predictor differ from the results in the model demonstrated earlier?
3. Synthesize the results of your logistic regression analysis. Include a brief summary of the sample characteristics and the major
findings. Interpret the findings so that your readers will have an understanding of the fit of your overall model and the impact
of predictors on the outcome.
Walden University
Academic Residencies:
Logistic Regression
PhD Residency 3 & 4
Session Learning Objectives
• Understand the nature and requirements of logistic
regression models
• Align research question and hypothesis with
regression models
• Practice running regression models
• Interpret the analysis results
• Evaluate the models
Session Agenda
• Overview
– Purpose and Types of Logistic Regression
– Model Assumptions
– Research Questions and Hypotheses
• Demonstration: Logistic Regression with SPSS
• Group Activity: SPSS Exercise and Output
• Question and Answer
What Is Logistic Regression?
• Logistic regression allows one to predict a discrete
dependent variable, such as group membership,
from one or multiple independent variables.
• Addresses the same questions that discriminant
analysis and multiple regression do, but with no
distributional assumptions on the predictors.
• The predictors do not have to be normally
distributed, linearly related, or have equal variance in
each group.
What Is Logistic Regression?
• Logistic regression is often used because the
relationship between the discrete dependent variable (DV)
and one or more of the independent variables is nonlinear.
• It uses maximum likelihood estimation rather than
the least squares estimation used in traditional
multiple regression.
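To make the maximum likelihood idea concrete, the sketch below fits a one-predictor logistic model by Newton-Raphson iteration, stopping when the parameter estimates change by less than a small tolerance. It is a teaching sketch under stated assumptions, not SPSS's algorithm: the function names are hypothetical, and the eight data points are made up for illustration.

```python
import math

def log_likelihood(b0, b1, xs, ys):
    """Bernoulli log-likelihood of the model logit(p) = b0 + b1 * x."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return total

def fit_logistic_newton(xs, ys, tol=1e-8, max_iter=25):
    """Maximize the log-likelihood by Newton-Raphson; returns (b0, b1, iterations)."""
    b0, b1 = 0.0, 0.0
    for iteration in range(1, max_iter + 1):
        ps = [1.0 / (1.0 + math.exp(-(b0 + b1 * x))) for x in xs]
        # Gradient of the log-likelihood with respect to b0 and b1.
        g0 = sum(y - p for y, p in zip(ys, ps))
        g1 = sum((y - p) * x for x, y, p in zip(xs, ys, ps))
        # Information matrix X'WX with weights p(1 - p).
        ws = [p * (1 - p) for p in ps]
        h00 = sum(ws)
        h01 = sum(w * x for w, x in zip(ws, xs))
        h11 = sum(w * x * x for w, x in zip(ws, xs))
        det = h00 * h11 - h01 * h01
        # Newton step: inverse(X'WX) times the gradient (2x2 case).
        step0 = (h11 * g0 - h01 * g1) / det
        step1 = (h00 * g1 - h01 * g0) / det
        b0 += step0
        b1 += step1
        if max(abs(step0), abs(step1)) < tol:
            break
    return b0, b1, iteration

# Made-up illustrative data: x is a predictor, y is a 0/1 outcome.
xs = [0, 1, 2, 3, 4, 5, 6, 7]
ys = [0, 0, 1, 0, 1, 0, 1, 1]

b0, b1, n_iter = fit_logistic_newton(xs, ys)
print(f"b0 = {b0:.3f}, b1 = {b1:.3f}, iterations = {n_iter}")
print(f"-2LL at start: {-2 * log_likelihood(0.0, 0.0, xs, ys):.3f}")
print(f"-2LL at fit:   {-2 * log_likelihood(b0, b1, xs, ys):.3f}")
```

Unlike least squares, there is no closed-form solution here, which is why logistic regression output reports an iteration count and a convergence criterion.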
Types of Logistic Regression
• Binary
– Dependent variable: dichotomous
– e.g., disease (yes or no)
• Multinomial
– Dependent variable: nominal level with more than 2
groups
– e.g., program (general, academic, or vocational)
• Ordinal
– Dependent variable: ordinal level
– e.g., smoker (light, moderate, or heavy)
Assumptions
• The model is correctly specified.
– The true conditional probabilities are a logistic function of
the independent variables.
– No important variables are omitted.
– No extraneous variables are included.
– The independent variables are measured without error.
• The cases are independent.
Assumptions
• The independent variables are not linear
combinations of each other.
– There are no extremely high correlations among
predictors.
• There are adequate expected frequencies and power.
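One simple screen for the collinearity assumption above is to compute pairwise correlations among the predictors and flag any that are very high. The sketch below does this in plain Python. The function names, the made-up data for x1, x2, and x3, and the .80 cutoff are all assumptions for illustration; the slides do not specify a threshold.

```python
import math
from itertools import combinations

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length numeric lists."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def flag_high_correlations(predictors, threshold=0.80):
    """Return (name_a, name_b, r) for every pair with |r| >= threshold."""
    flagged = []
    for a, b in combinations(predictors, 2):
        r = pearson_r(predictors[a], predictors[b])
        if abs(r) >= threshold:
            flagged.append((a, b, r))
    return flagged

# Made-up illustrative predictors; x2 is nearly a multiple of x1.
predictors = {
    "x1": [1, 2, 3, 4, 5, 6],
    "x2": [2.1, 3.9, 6.2, 8.0, 9.8, 12.1],
    "x3": [5, 1, 4, 2, 6, 3],
}

flagged = flag_high_correlations(predictors)
for a, b, r in flagged:
    print(f"{a} and {b}: r = {r:.3f}")
```

Pairwise correlations will not catch a predictor that is a combination of several others, so this is a first screen rather than a complete check.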
Common Research Questions
• Prediction of group membership
– Can the group membership be predicted from the
set of predictors?
• Importance of predictors
– Which predictors predict the outcome?
– Does a predictor make the model better or worse
or have no effect?
• Classification of cases
– How good is the model at classifying cases?
Common Research Questions
• Parameter estimates
– How can parameter estimates be used to calculate
and interpret odds?
• Interactions among predictors
– Does adding interactions among predictors
(continuous or categorical) improve the model?
• Effect Size
– How strong is the relationship between outcome
and the set of predictors?
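For the parameter estimates question above, it helps to see how log-odds, odds, and probability relate. The sketch below uses the age coefficient from the handout output (B = .139) together with two hypothetical baseline log-odds values chosen only for illustration. It shows that a one-unit increase multiplies the odds by the same factor, Exp(B), at any baseline, while the change in probability depends on where the case starts.

```python
import math

def probability_from_log_odds(log_odds):
    """Convert log-odds to a probability via odds / (1 + odds)."""
    odds = math.exp(log_odds)
    return odds / (1 + odds)

b_age = 0.139  # B for age, from Variables in the Equation

results = []
for baseline_log_odds in (-1.0, 1.0):   # hypothetical starting points
    odds_before = math.exp(baseline_log_odds)
    odds_after = math.exp(baseline_log_odds + b_age)
    p_before = probability_from_log_odds(baseline_log_odds)
    p_after = probability_from_log_odds(baseline_log_odds + b_age)
    results.append((odds_after / odds_before, p_after - p_before))
    print(f"baseline log-odds {baseline_log_odds:+.1f}: "
          f"odds ratio = {odds_after / odds_before:.3f}, "
          f"probability {p_before:.3f} -> {p_after:.3f}")
```

The odds ratio is 1.149 in both cases, matching Exp(B) for age, whereas the probability shifts by different amounts. This is why logistic regression results are usually reported as odds ratios rather than as changes in probability.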
Demonstration
• Dataset: Logistic Regression Dataset.sav
• Research Questions
– RQ1: Can 8-day use of cigarettes be predicted by gender,
age, family atmosphere at baseline, and school peer ATOD
use baseline?
– RQ2: Which predictors (sex, age, fatmosp1, or scpruse1)
predict the 8-day use of cigarettes?
Demonstration
• Variables:
– DV: 8-day use of cigarettes (e3) with 1 = Yes and 0 = No
– IV1: Gender (sex) with 1 = female and 0 = male
– IV2: Age, interval level
– IV3: Family atmosphere at baseline (fatmosp1),
ratio level
– IV4: School peer ATOD use baseline (scpruse1),
ratio level
• SPSS → Analyze → Regression → Binary Logistic
Model Evaluation Criteria
• Assessing goodness-of-fit of models
– Pearson Chi-square statistic
– Deviance (-2LL)
– Hosmer-Lemeshow tests
– Classification table
– Pseudo R-squares
• Tests of individual variables
– Wald test
– Likelihood ratio test for adding or omitting a
predictor
Group Exercise
• Dataset: Logistic Regression Dataset.sav
• Handout: Logistic Regression Group Exercise
– Use the dataset to practice using SPSS to run
logistic regression.
– Read and interpret the example outputs.
Questions
Final Assignment
• Take a few minutes right now to reflect on what you
learned from this session.
• Write down notes for your Final Assignment:
– Application to my Dissertation
– Next steps/questions I need to answer for my
dissertation
Feedback Survey
• Please take 2 minutes right now to complete
feedback for this session in the Residency App.
Preparation for Logistic Regression Session
1. Download SPSS: Prior to attending the Residency, install the SPSS software on your
laptop. If you don’t have the software already installed, please download it from
http://academicguides.waldenu.edu/researchcenter/resources/SPSS
2. Download and open the data files: Before attending the session, download the SPSS
files and make sure that you can open them in SPSS.
• Logistic Regression Dataset.sav
• Logistic Regression Output.spv
• Logistic Regression Syntax.sps
If you encounter trouble with the syntax (.sps) and output (.spv) files, refer to one of
these instead:
• Logistic Regression Output.pdf
• Logistic Regression Syntax.doc
3. Please make sure that you can open the data files and have them on your screen
prior to the beginning of the session.