To obtain this output:
Comments in blue are by the instructor and are not part of SPSS output.
Logistic Regression

The case processing table above shows missing values are not an issue for these data.

The Dependent Variable Encoding table above shows the dependent variable, minority, is coded with the reference category=1="yes", and the non-minority category is coded 0. This is conventional for logistic analysis, which here focuses on the probability that minority=1.

Above is SPSS's parameterization of the two categorical independent variables. Note that the parameter codings for the last category of each such variable are all 0's, indicating the last category is the omitted value for that set of dummy variables. The parameter codings are the X values for the dummy variables. They are multiplied by the logit (effect) coefficients as part of obtaining the predicted values of the dependent, much as one would compute an OLS regression estimate.
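
To make the role of the parameter codings concrete, here is a minimal sketch of how codings and logit coefficients combine into a predicted probability. The constant, the two dummy coefficients, and the three-category factor are hypothetical values chosen for illustration; they are not taken from the SPSS output.

```python
import math

def predicted_probability(constant, coefficients, x_values):
    """Return P(dependent = 1) from the logit: constant + sum(b * x)."""
    logit = constant + sum(b * x for b, x in zip(coefficients, x_values))
    return 1.0 / (1.0 + math.exp(-logit))

# Hypothetical values, not from the output above.
constant = -1.0
b_dummies = [0.8, -0.4]  # one logit coefficient per dummy variable

# Parameter codings for a three-category factor. The last category is
# the omitted one, so it is coded 0 on both dummies.
codings = {
    "category 1": [1, 0],
    "category 2": [0, 1],
    "category 3 (omitted)": [0, 0],
}

for label, x in codings.items():
    print(label, round(predicted_probability(constant, b_dummies, x), 3))
```

For the omitted category every coding is 0, so its prediction depends on the constant alone.
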
Block 0: Beginning Block

The classification table above is a 2 x 2 table which tallies correct and incorrect estimates for the null model with only the constant. The columns are the two predicted values of the dependent, while the rows are the two observed (actual) values of the dependent. In a perfect model, all cases will be on the diagonal and the overall percent correct will be 100%. If the logistic model has homoscedasticity (not a logistic regression assumption), the percent correct will be approximately the same for both rows. Here it is not, with the model predicting non-minority cases but not predicting any minority cases. While the overall percent correctly predicted seems moderately good at 78.1%, the researcher must note that blindly estimating the most frequent category (non-minority) for all cases would yield the same percent correct (78.1%).
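
The baseline comparison described above can be sketched in a few lines. The counts of 370 non-minority and 104 minority cases are an assumption, chosen only because they reproduce the 78.1% figure; the function name is likewise illustrative.

```python
def null_model_percent_correct(n_zero, n_one):
    """Percent correct when every case is assigned to the larger group."""
    return 100.0 * max(n_zero, n_one) / (n_zero + n_one)

# Assumed counts (not read from the output), consistent with 78.1%.
n_non_minority = 370
n_minority = 104

print(round(null_model_percent_correct(n_non_minority, n_minority), 1))
```

Any later model should be judged against this figure rather than against 50%.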

Above SPSS prints the initial test for the model in which the coefficients for all the independent variables are 0. The finding of significance above indicates this null model should be rejected.

Block 1: Method = Enter

The chi-square goodness-of-fit test tests whether the step is justified. Its null hypothesis is that the coefficients of the variables changed in the step are all 0. Here the step is from the constant-only model to the all-independents model. When, as here, the step was to add a variable or variables, the inclusion is justified if the significance of the step is less than 0.05. Had the step been to drop variables from the equation, then the exclusion would have been justified if the significance of the change was large (e.g., over 0.10).
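
As a rough illustration of the step test, the sketch below assumes the step chi-square is the drop in -2 log likelihood between the two models, with degrees of freedom equal to the number of parameters added. The -2LL values and the df are hypothetical, and `chi2_sf` is a hand-written helper that uses the finite-series forms of the chi-square tail for integer df, so that no statistics library is needed.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate, integer df >= 1."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum over i < df/2 of (x/2)^i / i!
        term = total = 1.0
        for i in range(1, df // 2):
            term *= (x / 2.0) / i
            total += term
        return math.exp(-x / 2.0) * total
    # Odd df: two-sided normal tail plus a finite series.
    root = math.sqrt(x)
    tail = math.erfc(root / math.sqrt(2.0))
    term = root  # first series term: x^(1/2) / 1
    total = 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)  # next term: x^((2r-1)/2) / (1*3*...*(2r-1))
        total += term
    density = math.exp(-x / 2.0) / math.sqrt(2.0 * math.pi)
    return tail + 2.0 * density * total

def step_test(neg2ll_before, neg2ll_after, df_added, alpha=0.05):
    """Return (chi-square, significance, justified?) for adding variables."""
    chi_square = neg2ll_before - neg2ll_after
    p = chi2_sf(chi_square, df_added)
    return chi_square, p, p < alpha

# Hypothetical -2LL values and df, for illustration only.
print(step_test(neg2ll_before=500.0, neg2ll_after=470.0, df_added=4))
```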

The Cox-Snell R2 and Nagelkerke R2 are attempts to provide a logistic analogy to R2 in OLS regression. The Nagelkerke measure adapts the Cox-Snell measure so that it varies from 0 to 1, as does R2 in OLS.
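
The relation between the two measures can be sketched from their usual definitions in terms of -2 log likelihood. The function names and the input values are hypothetical; the sketch assumes the standard formulas, in which Nagelkerke divides Cox-Snell by the maximum value Cox-Snell can attain.

```python
import math

def cox_snell_r2(neg2ll_null, neg2ll_model, n):
    """Cox-Snell R2 = 1 - (L_null / L_model)^(2/n), written with -2LL."""
    return 1.0 - math.exp(-(neg2ll_null - neg2ll_model) / n)

def nagelkerke_r2(neg2ll_null, neg2ll_model, n):
    """Rescale Cox-Snell by its maximum so the range becomes 0 to 1."""
    max_cox_snell = 1.0 - math.exp(-neg2ll_null / n)
    return cox_snell_r2(neg2ll_null, neg2ll_model, n) / max_cox_snell

# Hypothetical -2LL values and sample size, for illustration only.
print(round(cox_snell_r2(500.0, 470.0, 474), 3))
print(round(nagelkerke_r2(500.0, 470.0, 474), 3))
```

Because the divisor is less than 1, the Nagelkerke value is always at least as large as the Cox-Snell value for the same model.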

The Hosmer and Lemeshow Goodness-of-Fit Test divides subjects into deciles based on predicted probabilities, then computes a chi-square from observed and expected frequencies. The p-value=0.051 here is computed from the chi-square distribution with 8 degrees of freedom and indicates that the logistic model is a (barely) good fit. That is, if the significance of the Hosmer and Lemeshow Goodness-of-Fit test statistic is .05 or less, we reject the null hypothesis that there is no difference between the observed and predicted values of the dependent; if it is greater, as we want, we fail to reject the null hypothesis that there is no difference, implying that the model's estimates fit the data at an acceptable level. As here, this does not mean that the model explains much of the variance in the dependent, only that it does so to a significant degree.
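
The computation described above can be sketched as follows. The ten groups of (cases, observed events, expected events) are made up for illustration, and the helper names are hypothetical. The p-value helper uses the closed form of the chi-square tail that holds for even degrees of freedom, which covers the 8 df case here.

```python
import math

def hosmer_lemeshow(groups):
    """groups: list of (n_cases, observed_events, expected_events)."""
    chi_square = 0.0
    for n, observed, expected in groups:
        # Events and non-events each contribute (O - E)^2 / E.
        chi_square += (observed - expected) ** 2 / expected
        chi_square += ((n - observed) - (n - expected)) ** 2 / (n - expected)
    return chi_square, len(groups) - 2  # df = number of groups - 2

def chi2_sf_even(x, df):
    """Upper-tail chi-square probability; valid for even df only."""
    term = total = 1.0
    for i in range(1, df // 2):
        term *= (x / 2.0) / i
        total += term
    return math.exp(-x / 2.0) * total

# Made-up deciles: (cases, observed events, expected events).
deciles = [
    (47, 3, 4.1), (47, 6, 5.8), (48, 5, 7.0), (47, 9, 8.2), (48, 12, 9.5),
    (47, 8, 10.4), (48, 14, 11.9), (47, 13, 13.3), (48, 15, 15.2), (47, 19, 18.6),
]

chi_square, df = hosmer_lemeshow(deciles)
p_value = chi2_sf_even(chi_square, df)
print(round(chi_square, 3), df, round(p_value, 3))
print("acceptable fit" if p_value > 0.05 else "poor fit")
```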


The classification table above is a 2 x 2 table which tallies correct and incorrect estimates for the full model with the independents as well as the constant. The columns are the two predicted values of the dependent, while the rows are the two observed (actual) values of the dependent. In a perfect model, all cases will be on the diagonal and the overall percent correct will be 100%. If the logistic model has homoscedasticity (not a logistic regression assumption), the percent correct will be approximately the same for both rows. Here it is not, with the model correctly predicting all but seven non-minority cases but only one minority case. While the overall percent correctly predicted seems moderately good at 76.8%, the researcher must note that blindly estimating the most frequent category (non-minority) for all cases would yield an even higher percent correct (78.1%), as noted above. This implies minority status cannot be differentiated on the basis of education, job experience, job category, and gender for these data.
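
A short sketch of how the row and overall percentages in such a table are computed. The cell counts are assumptions: the text gives seven misclassified non-minority cases and one correctly classified minority case, and the row totals of 370 and 104 are chosen because they reproduce the 76.8% and 78.1% figures.

```python
def classification_summary(table):
    """table[observed][predicted] holds the case counts of a 2 x 2 table."""
    row_percent = {}
    for observed, counts in table.items():
        row_percent[observed] = 100.0 * counts[observed] / sum(counts.values())
    correct = sum(table[group][group] for group in table)
    total = sum(sum(counts.values()) for counts in table.values())
    return row_percent, 100.0 * correct / total

# Assumed counts, consistent with the percentages quoted in the text.
table = {
    "no": {"no": 363, "yes": 7},     # observed non-minority
    "yes": {"no": 103, "yes": 1},    # observed minority
}

row_percent, overall = classification_summary(table)
print({group: round(pct, 1) for group, pct in row_percent.items()})
print(round(overall, 1))
```

The large gap between the two row percentages is what the overall figure hides.
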
The Wald statistic above and the corresponding significance level test the significance of each of the covariate and dummy independents in the model. The ratio of the logistic coefficient B to its standard error S.E., squared, equals the Wald statistic. If the Wald statistic is significant (i.e., its significance level is less than 0.05) then the parameter is significant in the model. Of the independents, jobcat and gender are significant but educ and prevexp are not.
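
The Wald computation for a single coefficient can be sketched as below. The B and S.E. values are hypothetical. The sketch assumes a test with 1 degree of freedom, which applies to one coefficient at a time and not to the overall test of a multi-category factor such as jobcat.

```python
import math

def wald_test(b, se):
    """Return (Wald statistic, significance) for one coefficient, 1 df."""
    wald = (b / se) ** 2
    # Chi-square upper tail with 1 df equals the two-sided normal tail.
    significance = math.erfc(math.sqrt(wald / 2.0))
    return wald, significance

# Hypothetical coefficient and standard error, for illustration only.
wald, sig = wald_test(b=0.45, se=0.15)
print(round(wald, 3), round(sig, 4), "significant" if sig < 0.05 else "not significant")
```
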
The "Exp(b)" column is SPSS's label for the odds ratio of the row independent with the dependent (minority). It is the factor by which the predicted odds of minority=1 are multiplied for a unit increase in the corresponding independent variable. Odds ratios less than 1.0 correspond to decreases and odds ratios greater than 1.0 correspond to increases in the odds. Odds ratios close to 1.0 indicate that unit changes in that independent variable have little effect on the odds of the dependent.
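
To illustrate how an odds ratio is read, the sketch below converts a logit coefficient into Exp(B), a percent change in odds, and the effect on a starting probability. The coefficient and the starting probability are hypothetical values, not taken from the output.

```python
import math

def odds_ratio(b):
    """Exp(B): factor by which the odds change per unit increase in X."""
    return math.exp(b)

def percent_change_in_odds(b):
    """Percent increase (positive) or decrease (negative) in the odds."""
    return 100.0 * (math.exp(b) - 1.0)

def updated_probability(p, b):
    """Probability after a one-unit increase in X, starting from p."""
    new_odds = (p / (1.0 - p)) * math.exp(b)
    return new_odds / (1.0 + new_odds)

# Hypothetical coefficient and starting probability.
b = -0.25
print(round(odds_ratio(b), 3))
print(round(percent_change_in_odds(b), 1))
print(round(updated_probability(0.30, b), 3))
```

Note that the odds ratio is constant across cases, while the change in probability depends on the starting probability.
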
[Classplot, Step number 1: Observed Groups and Predicted Probabilities. Y axis: frequency, with scale marks at 40, 80, 120, and 160. X axis: predicted probability of membership for "Yes," from 0 to 1, with group N predicted below .5 and group Y at or above it. The cut value is .50. Symbols: N = No, Y = Yes. Each symbol represents 10 cases.]
The classplot above is an alternative way of assessing correct and incorrect predictions under logistic regression. The X axis is the predicted probability from 0.0 to 1.0 of the dependent being classified "1" (minority status). The Y axis is frequency: the number of cases classified. Inside the plot are columns of observed 1's and 0's, which it here codes as Y's (for minority status) and N's (not minority), with 10 cases per symbol. Examining this plot shows, among other things, how well the model classifies difficult cases (ones near p = .5). In this case, it also shows that nearly all cases are classified as being in the N (not minority status) group, even when in reality they are in the Y (minority) group.
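
The tally behind a classplot can be sketched as a simple binning of predicted probabilities by observed group. This is only an approximation of the idea, not SPSS's plotting routine. The probabilities and observed values are made up, and each symbol here stands for one case instead of 10.

```python
def classplot_counts(probabilities, observed, bins=10):
    """Count observed N (0) and Y (1) cases in equal-width probability bins."""
    counts = [{"N": 0, "Y": 0} for _ in range(bins)]
    for p, y in zip(probabilities, observed):
        index = min(int(p * bins), bins - 1)  # p = 1.0 falls in the last bin
        counts[index]["Y" if y == 1 else "N"] += 1
    return counts

# Made-up predicted probabilities and observed outcomes.
probabilities = [0.05, 0.12, 0.18, 0.22, 0.27, 0.31, 0.33, 0.48, 0.55, 0.71]
observed = [0, 0, 0, 1, 0, 0, 1, 1, 0, 1]

for i, cell in enumerate(classplot_counts(probabilities, observed)):
    low = i / 10
    print(f"{low:.1f}-{low + 0.1:.1f} | " + "N" * cell["N"] + "Y" * cell["Y"])
```

Y symbols in bins below the .50 cut value and N symbols above it are the misclassified cases.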
