Overview
Logistic regression can be used to predict a dependent variable on the basis of continuous and/or categorical independents and to determine the percent of variance in the dependent variable explained by the independents; to rank the relative importance of independents; to assess interaction effects; and to understand the impact of covariate control variables. The impact of predictor variables is usually explained in terms of odds ratios. Logistic regression applies maximum likelihood estimation after transforming the dependent into a logit variable (the natural log of the odds of the dependent occurring or not). In this way, logistic regression estimates the odds of a certain event occurring. Note that logistic regression calculates changes in the log odds of the dependent, not changes in the dependent itself as OLS regression does. Logistic regression has many analogies to OLS regression: logit coefficients correspond to b coefficients in the logistic regression equation, the standardized logit coefficients correspond to beta weights, and a pseudo R2 statistic is available to summarize the strength of the relationship. Unlike OLS regression, however, logistic regression does not assume linearity of relationship between the independent variables and the dependent, does not require normally distributed variables, does not assume homoscedasticity, and in general has less stringent requirements. It does, however, require that observations be independent and that the independent variables be linearly related to the logit of the dependent. The predictive success of the logistic regression can be assessed by looking at the classification table, showing correct and incorrect classifications of the dichotomous, ordinal, or polytomous dependent. Goodness-of-fit tests such as the likelihood ratio test are available as indicators of model appropriateness, as is the Wald statistic to test the significance of individual independent variables.
In PASW/SPSS, binary logistic regression, sometimes called binomial logistic regression, is under Analyze - Regression - Binary Logistic, and the multinomial version is under Analyze - Regression - Multinomial Logistic. Logit regression, discussed separately, is another related option in PASW/SPSS for using loglinear methods to analyze one or more dependents. Where both are applicable, logit regression has numerically equivalent results to logistic regression, but with different output options. For the same class of problems, logistic regression has become more popular among social scientists.
z = b0 + b1X1 + b2X2 + ..... + bkXk
The "z" is the logit, also called the log odds.
The "b" terms are the logistic regression coefficients, also called parameter estimates.
Exp(b) = the odds ratio for an independent variable.
The odds ratio is the factor by which a one-unit increase in the independent multiplies the odds of the dependent: it increases the odds when greater than 1 and decreases them when less than 1 (see the discussion of interpreting b parameters below).
Exp(z) = the odds that the dependent equals the level of interest rather than the reference level. In binary logistic regression, this is usually the odds the dependent = 1 rather than 0. In multinomial logistic regression, this is usually the odds the dependent = the given level rather than the highest level.
Thus for a one-independent model, z would equal the constant plus the b coefficient times the value of X1, when predicting odds(event) for persons with a particular value of X1, where the event is by default the value "1" in the binary case. If X1 is a binary (0,1) variable, then z = b0 (that is, the constant) for the "0" group on X1 and equals the constant plus the b coefficient for the "1" group. To convert the log odds (which is z, the logit) back into odds, the natural logarithmic base e is raised to the zth power: odds(event) = exp(z) = odds the binary dependent is 1 rather than 0. If X1 is a continuous variable, then z equals the constant plus the b coefficient times the value of X1. For models with additional independent variables, z is the constant plus the sum of the products of the b coefficients and the values of the X (independent) variables. In every case z is the log odds of the dependent, and exp(z) is the estimate of odds(event).
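The arithmetic in the preceding paragraph can be sketched in a few lines of Python. This is only an illustration: the constant (-1.0) and coefficient (0.8) are made-up values, and the function names are hypothetical rather than part of any statistical package.

```python
import math

def logit(b0, coefs, xs):
    """Return z = b0 + b1*X1 + ... + bk*Xk, the log odds."""
    return b0 + sum(b * x for b, x in zip(coefs, xs))

def odds_from_logit(z):
    """odds(event) = exp(z)."""
    return math.exp(z)

def prob_from_odds(odds):
    """Convert odds to a probability: p = odds / (1 + odds)."""
    return odds / (1.0 + odds)

# Hypothetical one-predictor model with a binary (0,1) X1.
b0, b1 = -1.0, 0.8
z_group0 = logit(b0, [b1], [0])      # the constant alone
z_group1 = logit(b0, [b1], [1])      # the constant plus b1
odds_group0 = odds_from_logit(z_group0)
odds_group1 = odds_from_logit(z_group1)

# The ratio of the two odds is exp(b1), the odds ratio for X1.
odds_ratio = odds_group1 / odds_group0
```

Note that the ratio of the two group odds does not depend on the constant, which is why Exp(b) can be read directly as the odds ratio.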
It is important to be careful to specify the desired reference category of the dependent variable, which should be meaningful.
| BINARY LOGISTIC REGRESSION: DEPENDENTS | OUTCOME: |
| Binary variable is entered as a dependent | Highest is predicted, lowest is reference |
| The reference level cannot be changed. | |
| MULTINOMIAL LOGISTIC REGRESSION: DEPENDENTS | |
| Binary or multinomial variable entered as dependent | Highest is reference, all others compared to it by default. |
| Click "Reference Category" button to override default. |
| BINARY LOGISTIC REGRESSION: PREDICTORS | OUTCOME: |
| Categorical coding: | highest is reference (ex., 2=female) |
| Covariate coding: | lowest is reference (ex., 1=male) |
| MULTINOMIAL LOGISTIC REGRESSION: PREDICTORS | |
| Factor coding: | lowest is reference (ex., 1=male) |
| Covariate coding: | highest is reference (ex., 2=female) |
Put another way, OLS can be seen as a subtype of ML for the special case of a linear model characterized by normally distributed disturbances around the regression line, where the regression coefficients computed maximize the likelihood of obtaining the least sum of squared disturbances. When error is not normally distributed or when the dependent variable is not normally distributed, ML estimates are preferred to OLS estimates because they are unbiased beyond the special case handled by OLS.
The -2LL statistic is the likelihood ratio. It is also called goodness of fit, deviance chi-square, scaled deviance, deviation chi-square, DM, or L-square. It reflects the significance of the unexplained variance in the dependent. In PASW/SPSS output, this statistic is found in the "-2 Log Likelihood" column of the "Model Fitting Information" table for the Final row. The likelihood ratio is not used directly in significance testing, but it is the basis for the likelihood ratio test, which is the test of the difference between two likelihood ratios (two -2LL's), as discussed below. In general, as the model becomes better, -2LL will decrease in magnitude.
The likelihood ratio test looks at model chi-square (chi square difference) by subtracting deviance (-2LL) for the final (full) model from deviance for the intercept-only model. Degrees of freedom in this test equal the number of terms in the model minus 1 (for the constant). This is the same as the difference in the number of terms between the two models, since the null model has only one term. Model chi-square measures the improvement in fit that the explanatory variables make compared to the null model.
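As a rough sketch of the computation just described, the Python below subtracts the two deviances and looks up the chi-square tail probability. The deviance values (520 and 490) and the term count are invented for illustration, and the helper names are hypothetical. The tail probability is computed with the standard recurrence for the regularized incomplete gamma function so that no statistics library is needed.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 1:
        q, a = math.erfc(math.sqrt(half)), 0.5   # exact value for df = 1
    else:
        q, a = math.exp(-half), 1.0              # exact value for df = 2
    # Recurrence: Q(a + 1, h) = Q(a, h) + h**a * exp(-h) / Gamma(a + 1)
    while a < df / 2.0:
        q += half ** a * math.exp(-half) / math.gamma(a + 1.0)
        a += 1.0
    return q

def likelihood_ratio_test(neg2ll_null, neg2ll_full, n_terms_full, n_terms_null=1):
    """Model chi-square = -2LL(null) - -2LL(full); df = difference in terms."""
    model_chi2 = neg2ll_null - neg2ll_full
    df = n_terms_full - n_terms_null
    return model_chi2, df, chi2_sf(model_chi2, df)

# Hypothetical example: constant plus three predictors in the full model.
model_chi2, df, p_value = likelihood_ratio_test(520.0, 490.0, n_terms_full=4)
```

With these made-up deviances the model chi-square is 30 on 3 degrees of freedom, so the explanatory variables would be judged to improve significantly on the null model.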
Warning: If the log-likelihood test statistic shows a small p value (<=.05) for a model with a large effect size, ignore contrary findings based on the Wald statistic discussed below as it is biased toward Type II errors in such instances - instead assume good model fit overall.
In the table above, the response variable is Gunown, a binary variable indicating whether or not there is a gun in the home. Predictors are marital status, race, attitude toward the death penalty (Cappun), and age. The likelihood ratio tests of individual parameters show that the model without Age is not significantly different from the final (full) model and therefore Age should be dropped based on preference for the more parsimonious reduced model. For the significant variables, the larger the chi-square value, the greater the loss of model fit if that term is dropped. In this example, dropping "marital" would result in the greatest loss of model fit.
Binary logistic regression in PASW/SPSS offers these variants in the Method area of the main binary logistic regression dialog: forward conditional, forward LR, forward Wald, backward conditional, backward LR, or backward Wald. The conditional options use a computationally faster version of the likelihood ratio test, the LR options use the likelihood ratio test (chi-square difference), and the Wald options use the Wald test. The LR option is most often preferred. The likelihood ratio test computes -2LL for the current model, then reestimates -2LL with the target variable removed. The conditional option is preferred when LR estimation proves too computationally time-consuming. The conditional statistic is considered less accurate than the likelihood ratio test but more accurate than the third possible criterion, the Wald test. Stepwise procedures are selected in the Method drop-down list of the binary logistic regression dialog.
Multinomial logistic regression offers these variants under the Model button if a Custom model is specified: forward stepwise, backward stepwise, forward entry, and backward elimination. These four options are described in the FAQ section below. All are based on maximum likelihood estimation (MLE), with forward methods using the likelihood ratio or score statistic and backward methods using the likelihood ratio or Wald's statistic. LR is the default, but score and Wald alternatives are available under the Options button. Forward entry adds terms to the model until no omitted variable would contribute significantly to the model. Forward stepwise determines the forward entry model and then alternates between backward elimination and forward entry until all variables not in the model fail to meet entry or removal criteria. Backward elimination and backward stepwise are similar, but begin with all terms in the model and work backward, with backward elimination stopping when the model contains only terms which are significant and with backward stepwise taking this result and further alternating between forward entry and backward elimination until no omitted variable would contribute significantly to the model.
The illustration below shows forward stepwise modeling of a binary dependent, having a gun in the home or not (Gunown), as predicted by the categorical variables marital status (marital, with four categories), race (three categories), and attitude on the death penalty (Cappun, binary). The forward stepwise procedure adds Marital first, then Race, then Cappun.
The LOGISTIC (binomial) procedure will predict the "1" category of the dependent variable, making the "0" category the reference category. In contrast, NOMREG (the multinomial procedure) by default uses the highest category as the reference category and thus for a binomial variable, will predict the "0" category, using the "1" category as the reference. For a multinomial dependent variable, the b coefficient in NOMREG reflects the effect of being in the given category of the predictor variable on the odds of being in the given category of the dependent, where odds refers to the probability of being in the given category of the dependent versus the probability of being in the reference category of the dependent. Setting reference categories is discussed above in the sections on dependent variables and on factors.
As discussed above, logits are the log odds of the event occurring (usually, that the dependent = 1 rather than 0). Where OLS regression has an identity link function, logistic regression has a logit link function (that is, logistic regression calculates changes in the log odds of the dependent, not changes in the dependent itself as OLS regression does). Parameter estimates (b coefficients) associated with explanatory variables are estimators of the change in the logit caused by a unit change in the independent. In PASW/SPSS output, the parameter estimates appear in the "B" column of the "Variables in the Equation" table. Logits do not appear but must be estimated using the logistic regression equation above, inserting appropriate values for the constant and X variable(s). The b coefficients vary between plus and minus infinity, with 0 indicating the given explanatory variable does not affect the logit (that is, makes no difference in the probability of the dependent value equaling the value of the event, usually 1); positive or negative b coefficients indicate the explanatory variable increases or decreases the logit of the dependent. Exp(b) is the odds ratio for the explanatory variable, discussed below.
If, on the other hand, sex is left as a dichotomous covariate, as in the lower portion of the figure below, then the reference category is the lower category (usually the 0 category, but here with (1,2) coding, the lower category is 1 = male). For covariate coding of sex, the odds ratio is 1.751. We can say that the odds of opposing the death penalty compared to favoring it are increased by a factor of 1.751 by being female rather than male, controlling for other variables in the model. Or we could say that the odds a female opposes the death penalty are 1.751 times the odds a male opposes it, controlling for other variables in the model.
For the first category of marital in the example above, the odds ratio is .587. Recall binary logistic regression by default predicts the higher category of the dependent, which is cappun = 2 = opposing capital punishment. We would therefore say that the odds of opposing capital punishment compared to favoring it are decreased by a factor of .587 when the respondent is married compared to being never married, controlling for other variables in the model. Similar statements would be made about the other levels of marital, all making comparison to the reference category, never married. Note, however, that the other three contrasts are not significant (marital 2, 3, or 4 versus 5).
To take a third example, let income be a continuous explanatory variable measured in ten thousands of dollars, with a parameter estimate of 1.5 in a model predicting home ownership=1, no home ownership=0. A 1 unit increase in income (one $10,000 unit) is then associated with a 1.5 increase in the log odds of home ownership. However, it is more intuitive to convert to an odds ratio: exp(1.5) = 4.48, allowing one to say that a unit ($10,000) change in income increases the odds of the event ownership=1 about 4.5 times.
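The conversion in the income example can be checked with a short Python sketch. The coefficient 1.5 is the one given in the text; the helper name `odds_multiplier` and the two-unit extension are illustrative assumptions added here.

```python
import math

b_income = 1.5                       # change in log odds per $10,000 (from the text)
odds_ratio = math.exp(b_income)      # about 4.48

def odds_multiplier(b, delta_units):
    """Factor by which the odds change for a delta_units change in X."""
    return math.exp(b * delta_units)

# A two-unit ($20,000) increase multiplies the odds by the odds ratio squared,
# because effects are additive on the log odds scale and multiplicative on odds.
two_unit_factor = odds_multiplier(b_income, 2)
```

The point of the second calculation is that odds ratios compound: the effect of a larger change in X is the one-unit odds ratio raised to the number of units, not the odds ratio multiplied by the number of units.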
Recall this chart:
| BINARY LOGISTIC REGRESSION | OUTCOME: |
| Categorical coding: | highest is reference (ex., 2=female) |
| Covariate coding: | lowest is reference (ex., 1=male) |
| MULTINOMIAL LOGISTIC REGRESSION | |
| Factor coding: | lowest is reference (ex., 1=male) |
| Covariate coding: | highest is reference (ex., 2=female) |
Keep firmly in mind that in PASW/SPSS, the default reference coding for independent variables in multinomial logistic regression is the opposite of that in binary logistic regression! One may enter a binary dependent in multinomial regression, but to get the same odds ratio as in binary logistic regression with categorical coding of a dichotomy like sex, one must in multinomial regression specify covariate coding, for instance.
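Why the choice of reference category matters can be seen numerically. For a dichotomous predictor, switching the reference category reverses the sign of b, so the odds ratio becomes its reciprocal. The sketch below uses the odds ratio 1.751 from the covariate-coded sex example earlier in this section; the variable names are illustrative only.

```python
import math

odds_ratio_female_vs_male = 1.751            # male is the reference category
b_female_vs_male = math.log(odds_ratio_female_vs_male)

# With female as the reference instead, the coefficient changes sign ...
b_male_vs_female = -b_female_vs_male
# ... and the odds ratio is the reciprocal of the original one.
odds_ratio_male_vs_female = math.exp(b_male_vs_female)
```

Both odds ratios describe the same relationship, so an output showing roughly .571 where 1.751 was expected is usually a sign that the reference category has flipped rather than that the model has changed.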
In the figure above, for Work = 2 = part-time, the odds ratio for sex is 4.552, where sex is entered as a covariate dichotomy. We can therefore say that the odds of being part-time rather than unemployed are increased by a factor of about 4.6 by being male rather than female, controlling for other variables in the model. We cannot make a corresponding statement about full-time work, as that odds ratio is non-significant.
Note that R2-like measures below are not goodness-of-fit tests but rather attempt to measure strength of association. Unfortunately, the pseudo-R2 measures reflect and confound effect strength with goodness of fit. For small samples, for instance, an R2-like measure might be high when goodness of fit was unacceptable by the likelihood ratio test. PASW/SPSS supports three R2-like measures: Cox and Snell's, Nagelkerke's, and McFadden's, as illustrated below. Output is identical for binomial and multinomial logistic regression and in PASW/SPSS appears in the "Pseudo R Square" table.
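For readers who want to see how the three measures relate to -2LL, the sketch below uses the usual textbook definitions. These formulas are not given in the text above and are stated here as an assumption about what the package reports; the deviances and sample size are made-up numbers, and `pseudo_r2` is a hypothetical helper.

```python
import math

def pseudo_r2(neg2ll_null, neg2ll_model, n):
    """Textbook pseudo R-square measures from null and model -2LL and sample size n."""
    # McFadden: 1 - LL(model) / LL(null), which equals 1 - D(model) / D(null).
    mcfadden = 1.0 - neg2ll_model / neg2ll_null
    # Cox and Snell: 1 - (L(null) / L(model)) ** (2 / n).
    cox_snell = 1.0 - math.exp(-(neg2ll_null - neg2ll_model) / n)
    # Nagelkerke rescales Cox and Snell by its maximum attainable value.
    max_cox_snell = 1.0 - math.exp(-neg2ll_null / n)
    nagelkerke = cox_snell / max_cox_snell
    return {"mcfadden": mcfadden, "cox_snell": cox_snell, "nagelkerke": nagelkerke}

# Hypothetical values: null -2LL = 520, model -2LL = 490, n = 400 cases.
example = pseudo_r2(520.0, 490.0, 400)
```

Because Cox and Snell's measure cannot reach 1, Nagelkerke's version is always at least as large, which is one reason the three figures should not be compared with one another or with an OLS R-square.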
This is the proportional reduction in error definition of "chance hit rate." While no particular split of the dependent variable is assumed, the split makes a difference in understanding this rate. In the example above, where the dependent is whether or not there is a gun in the home, 65% of respondents say there is not. If for all cases one guessed "No", one would be right 65% of the time. The model percent correct of 63.9% is thus not as good as it might first appear. Now suppose the dependent is split 99:1. Then one could guess the value of the dependent correctly 99% of the time just by always selecting the more common value. The classification table will likely show 0 predictions in the predicted column for the 1% value of the dependent. The closer to 50:50, the easier it is for a predictor variable to have an effect. Even at some intermediate but lopsided split, such as 85:15, it can be difficult for a predictor to improve on simple guessing (that is, on 85%). A strong predictor variable could improve on the 85% but a weak one might not. This does not mean the predictor variables are non-significant, just that they do not move the estimates enough to make a difference compared to pure guessing. When the classification table for a dichotomous dependent has a zero "Predicted" column, it is likely that the raw correlations of the predictor variables with the dependent variable are not high enough to make a difference.
By either improvement criterion, the observed hit rate does not indicate a "good model" for these data. However, it is up to the researcher to define what is "good". The most liberal criterion would be any observed hit rate above the PC baseline (above .545), which would make the model above a "good" one. The most conservative criterion would be any observed hit rate above the percentage-improved PRE criterion (above .813). "Good" is in the eye of the beholder and how "chance" is defined. Nearly all researchers would agree, however, that an observed hit rate below the proportional by chance baseline is a poor model.
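The baselines quoted above can be reproduced from the 65:35 split of the gun ownership example. In the sketch below, the proportional-by-chance formula (the sum of squared category proportions) gives .545. The .813 figure is matched by taking a 25 percent improvement over the rate from always guessing the larger category; that 25 percent value is an assumption made here because it reproduces the cited number, not something stated in the text. Function names are hypothetical.

```python
def proportional_by_chance(p):
    """Expected hit rate when guessing in proportion to the observed split."""
    return p ** 2 + (1.0 - p) ** 2

def maximum_by_chance(p):
    """Hit rate from always guessing the more common category."""
    return max(p, 1.0 - p)

def improved_criterion(baseline, improvement=0.25):
    """Baseline raised by a chosen fraction (0.25 is assumed, see above)."""
    return baseline * (1.0 + improvement)

p_no_gun = 0.65        # share of respondents with no gun in the home
observed = 0.639       # model percent correct reported in the text

pc_baseline = proportional_by_chance(p_no_gun)                    # 0.545
max_baseline = maximum_by_chance(p_no_gun)                        # 0.65
strict_criterion = improved_criterion(max_baseline)               # 0.8125

beats_liberal = observed > pc_baseline
beats_majority_guess = observed > max_baseline
beats_strict = observed > strict_criterion
```

On these figures the model clears only the most liberal baseline, which is the sense in which "good" depends on how chance is defined.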
Parameter codings for indicator contrasts
------------------------------------------------
                          Parameter Coding
         Value    Freq      (1)       (2)
GROUP    1        106      1.000      .000
         2        116       .000     1.000
         3        107       .000      .000
------------------------------------------------
This example shows a three-level categorical independent (labeled GROUP), with category values of 1, 2, and 3.
The predictor here is called simply GROUP. It takes on the values 1-3, with frequencies listed in the "Freq" column. The two "Coding" columns are the internal values (parameter codings) assigned by PASW/SPSS under indicator coding. There are two columns of codings because two dummy variables are created for the three-level variable GROUP. For the first variable, which is Coding (1), cases with a value of 1 for GROUP get a 1, while all other cases get a 0. For the second, cases with a 2 for GROUP get a 1, with all other cases getting a 0.
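The indicator coding in the table can be mimicked with a small Python function. This is a sketch of the coding scheme only, not of how PASW/SPSS builds it internally; `indicator_coding` is a hypothetical name, and the last category is taken as the reference, as in the table.

```python
def indicator_coding(values, categories, reference=None):
    """Return one row of 0/1 dummy codes per value, omitting the reference category."""
    if reference is None:
        reference = categories[-1]          # last category is the reference
    kept = [c for c in categories if c != reference]
    return [[1.0 if v == c else 0.0 for c in kept] for v in values]

# One case from each level of GROUP (values 1, 2, 3).
rows = indicator_coding([1, 2, 3], categories=[1, 2, 3])
# rows == [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
```

The reference category (GROUP = 3) is the row of all zeros, so each coefficient compares its category with that omitted level.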
In the example above, both the likelihood ratio tests table and the parameter estimates table show for these data that Test (test score) is significant in differentiating those promoted from those not (i.e., subjects from matches), controlling for variables used for matching (age, gender). In addition, Rating (supervisor's rating) and Race are not significant.
LOGISTIC REGRESSION /VARIABLES income WITH age SES gender opinion1
    opinion2 region
  /CATEGORICAL=gender, opinion1, opinion2, region
  /CONTRAST(region)=INDICATOR(4)
  /METHOD FSTEP(LR)
  /CLASSPLOT.
Above is the PASW/SPSS syntax in simplified form. The dependent variable is the variable immediately after the VARIABLES term. The independent variables are those immediately after the WITH term. The CATEGORICAL subcommand specifies any categorical variables; note these must also be listed in the VARIABLES statement. The CONTRAST subcommand tells PASW/SPSS which category of a categorical variable is to be dropped when it automatically constructs dummy variables (here it is the 4th value of "region"; this value is the fourth one and is not necessarily coded "4"). The METHOD subcommand sets the method of computation, here specified as FSTEP to indicate forward stepwise logistic regression. Alternatives are BSTEP (backward stepwise logistic regression) and ENTER (enter terms as listed, usually because their order is set by theories which the researcher is testing). ENTER is the default method. The (LR) term following FSTEP specifies that likelihood ratio criteria are to be used in the stepwise addition of variables to the model. The /CLASSPLOT option specifies that a histogram of predicted probabilities is to be output (see above).
The full syntax is below:
LOGISTIC REGRESSION VARIABLES = dependent var
[WITH independent varlist [BY var [BY var] ... ]]
[/CATEGORICAL = var1, var2, ... ]
[/CONTRAST (categorical var) = [{INDICATOR [(refcat)] }]]
{DEVIATION [(refcat)] }
{SIMPLE [(refcat)] }
{DIFFERENCE }
{HELMERT }
{REPEATED }
{POLYNOMIAL[({1,2,3...})]}
{metric }
{SPECIAL (matrix) }
[/METHOD = {ENTER** } [{ALL }]]
{BSTEP [{COND}]} {varlist}
{LR }
{WALD}
{FSTEP [{COND}]}
{LR }
{WALD}
[/SELECT = {ALL** }]
{varname relation value}
[/{NOORIGIN**}]
{ORIGIN }
[/ID = [variable]]
[/PRINT = [DEFAULT**] [SUMMARY] [CORR] [ALL] [ITER [({1})]] [GOODFIT]]
{n}
[CI(level)]
[/CRITERIA = [BCON ({0.001**})] [ITERATE({20**})] [LCON({0** })]
{value } {n } {value }
[PIN({0.05**})] [POUT({0.10**})] [EPS({.00000001**})]]
{value } {value } {value }
[CUT[{0.5** }]]
{value }
[/CLASSPLOT]
[/MISSING = {EXCLUDE **}]
{INCLUDE }
[/CASEWISE = [tempvarlist] [OUTLIER({2 })]]
{value}
[/SAVE = tempvar[(newname)] tempvar[(newname)]...]
[/OUTFILE = [{MODEL }(filename)]]
{PARAMETER}
[/EXTERNAL]
**Default if the subcommand or keyword is omitted.
The syntax for multinomial logistic regression is:
NOMREG dependent varname [(BASE = {FIRST } ORDER = {ASCENDING**})] [BY factor list]
{LAST**} {DATA }
{value } {DESCENDING }
[WITH covariate list]
[/CRITERIA = [CIN({95**})] [DELTA({0**})] [MXITER({100**})] [MXSTEP({5**})]
{n } {n } {n } {n }
[LCONVERGE({0**})] [PCONVERGE({1.0E-6**})] [SINGULAR({1E-8**})]
{n } {n } {n }
[BIAS({0**})] [CHKSEP({20**})] ]
{n } {n }
[/FULLFACTORIAL]
[/INTERCEPT = {EXCLUDE }]
{INCLUDE** }
[/MISSING = {EXCLUDE**}]
{INCLUDE }
[/MODEL = {[effect effect ...]} [| {BACKWARD} = { effect effect ...}]]
{FORWARD }
{BSTEP }
{FSTEP }
[/STEPWISE =[RULE({SINGLE** })][MINEFFECT({0** })][MAXEFFECT(n)]]
{SFACTOR } {value}
{CONTAINMENT}
{NONE }
[PIN({0.05**})] [POUT({0.10**})]
{value } {value }
[ENTRYMETHOD({LR** })] [REMOVALMETHOD({LR**})]
{SCORE} {WALD}
[/OUTFILE = [{MODEL }(filename)]]
{PARAMETER}
[/PRINT = [CELLPROB] [CLASSTABLE] [CORB] [HISTORY({1**})] [IC] ]
{n }
[SUMMARY ] [PARAMETER ] [COVB] [FIT] [LRT] [KERNEL]
[ASSOCIATION] [CPS**] [STEP**] [MFI**] [NONE]
[/SAVE = [ACPROB[(newname)]] [ESTPROB[(rootname[:{25**}])] ]
{n }
[PCPROB[(newname)]] [PREDCAT[(newname)]]
[/SCALE = {1** }]
{n }
{DEVIANCE}
{PEARSON }
[/SUBPOP = varlist]
[/TEST[(valuelist)] = {['label'] effect valuelist effect valuelist...;}]
{['label'] ALL list; }
{['label'] ALL list }
** Default if the subcommand is omitted.
As there is no direct counterpart to R-squared in logistic regression, VIF cannot be computed -- though obviously one could apply the same logic to various pseudo-R-squared measures. Unfortunately, I am not aware of a VIF-type test for logistic regression, and I would think that the same obstacles would exist as for creating a true equivalent to OLS R-squared.
A high odds ratio would not be evidence of multicollinearity in itself.
To the extent that one independent is linearly or nonlinearly related to another independent, multicollinearity could be a problem in logistic regression since, unlike OLS regression, logistic regression does not assume linearity of relationship among independents. Some authors use the VIF test in OLS regression to screen for multicollinearity in logistic regression if nonlinearity is ruled out. In an OLS regression context, nonlinearity exists when eta-square is significantly higher than R-square. In a logistic regression context, the Box-Tidwell transformation and orthogonal polynomial contrasts are ways of testing linearity among the independents.
When an ordinal variable has been entered as a set of dummy variables, the interaction of another variable with the ordinal variable will involve multiple interaction terms. In this case the significance of the interaction of the two variables is the significance of the change of R-square of the equation with the interaction terms and the equation without the set of terms associated with the ordinal variable. (See the StatNotes section on "Regression" for computing the significance of the difference of two R-squares).
FORWARD ENTRY
1. Estimate the parameter and likelihood function for the initial model and let it be our current model.
2. Based on the MLEs of the current model, calculate the score or LR statistic for every variable eligible for inclusion and find its significance.
3. Choose the variable with the smallest significance. If that significance is less than the probability for a variable to enter, then go to step 4; otherwise, stop FORWARD.
4. Update the current model by adding a new variable. If there are no more eligible variables left, stop FORWARD; otherwise, go to step 2.
FORWARD STEPWISE
1. Estimate the parameter and likelihood function for the initial model and let it be our current model.
2. Based on the MLEs of the current model, calculate the score statistic or likelihood ratio statistic for every variable eligible for inclusion and find its significance.
3. Choose the variable with the smallest significance (p-value). If that significance is less than the probability for a variable to enter, then go to step 4; otherwise, stop FSTEP.
4. Update the current model by adding a new variable. If this results in a model which has already been evaluated, stop FSTEP.
5. Calculate the significance for each variable in the current model using LR or Wald's test.
6. Choose the variable with the largest significance. If its significance is less than the probability for variable removal, then go back to step 2. If the current model with the variable deleted is the same as a previous model, stop FSTEP; otherwise go to the next step.
7. Modify the current model by removing the variable with the largest significance from the previous model. Estimate the parameters for the modified model and go back to step 5.
BACKWARD ELIMINATION
1. Estimate the parameters for the full model that includes all eligible variables. Let the current model be the full model.
2. Based on the MLEs of the current model, calculate the LR or Wald's statistic for all variables eligible for removal and find its significance.
3. Choose the variable with the largest significance. If that significance is less than the probability for a variable removal, then stop BACKWARD; otherwise, go to the next step.
4. Modify the current model by removing the variable with the largest significance from the model. Estimate the parameters for the modified model. If all the variables in the BACKWARD list are removed then stop BACKWARD; otherwise, go back to step 2.
BACKWARD STEPWISE
1. Estimate the parameters for the full model that includes the final model from previous method and all eligible variables. Only variables listed on the BSTEP variable list are eligible for entry and removal. Let current model be the full model.
2. Based on the MLEs of the current model, calculate the LR or Wald's statistic for every variable in the BSTEP list and find its significance.
3. Choose the variable with the largest significance. If that significance is less than the probability for a variable removal, then go to step 5. If the current model without the variable with the largest significance is the same as the previous model, stop BSTEP; otherwise go to the next step.
4. Modify the current model by removing the variable with the largest significance from the model. Estimate the parameters for the modified model and go back to step 2.
5. Check to see whether any eligible variable is not in the model. If there is none, stop BSTEP; otherwise, go to the next step.
6. Based on the MLEs of the current model, calculate LR statistic or score statistic for every variable not in the model and find its significance.
7. Choose the variable with the smallest significance. If that significance is less than the probability for the variable entry, then go to the next step; otherwise, stop BSTEP.
8. Add the variable with the smallest significance to the current model. If the model is not the same as any previous models, estimate the parameters for the new model and go back to step 2; otherwise, stop BSTEP.
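The control flow of the simplest of these procedures, forward entry, can be sketched as a short loop. This is an illustration of the steps listed above, not PASW/SPSS's implementation: the `significance` callable is a hypothetical stand-in for refitting the model and computing the score or LR p-value, and the p-values in the toy table are invented.

```python
def forward_entry(candidates, significance, p_in=0.05):
    """Add variables one at a time until none meets the entry criterion.

    significance(model, var) must return the p-value for adding var to model.
    """
    model = []                        # step 1: start from the initial model
    eligible = list(candidates)
    while eligible:                   # step 4: stop when nothing is left
        # Step 2: significance of every variable eligible for inclusion.
        pvals = {var: significance(model, var) for var in eligible}
        # Step 3: pick the smallest significance and test it against p_in.
        best = min(pvals, key=pvals.get)
        if pvals[best] >= p_in:
            break
        model.append(best)            # step 4: update the current model
        eligible.remove(best)
    return model

# Toy stand-in: fixed, made-up p-values that ignore the current model.
toy_pvalues = {"marital": 0.001, "race": 0.010, "cappun": 0.030, "age": 0.400}
selected = forward_entry(toy_pvalues, lambda model, var: toy_pvalues[var])
# selected == ["marital", "race", "cappun"]
```

In real use the p-values change after each entry because the model is re-estimated, which is why the significance function receives the current model as an argument.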
GLM nonparametric regression allows the logit of the dependent variable to be a nonlinear function of the parameter estimates of the independent variables. While GLM techniques like logistic regression are nonlinear in that they employ a transform (for logistic regression, the natural log of the odds of a dependent variable) which is nonlinear, in traditional form the result of that transform (the logit of the dependent variable) is a linear function of the terms on the right-hand side of the equation. GLM non-parametric regression relaxes the linearity assumption to allow nonlinear relations over and beyond those of the link function (logit) transformation.
Generalized nonparametric regression is a GLM equivalent to OLS local regression (local polynomial nonparametric regression), which makes the dependent variable a single nonlinear function of the independent variables. The same problems noted for OLS local regression still exist, notably difficulty of interpretation as independent variables increase.
Generalized additive regression is the GLM equivalent to OLS additive regression, which allows the dependent variable to be the additive sum of nonlinear functions which are different for each of the independent variables. Fox (2000: 74-77) argues that generalized additive regression can reveal nonlinear relationships under certain circumstances where they are obscured using partial residual plots alone, notably when a strong nonlinear relationship among independents exists alongside a strong nonlinear relationship between an independent and a dependent.
The second strategy is to create an indicator (dummy) variable or set of variables which reflects membership/non-membership in the group, and also to have interaction terms between the indicator dummies and other independent variables, such that the significant interactions are interpreted as indicating significant differences across groups for the corresponding independent variables. When an indicator variable has been entered as a set of dummy variables, its interaction with another variable will involve multiple interaction terms. In this case the significance of the interaction of the indicator variable and another independent variable is the significance of the change of R-square of the equation with the interaction terms and the equation without the set of terms associated with the indicator variable. (See the StatNotes section on "Regression" for computing the significance of the difference of two R-squares).
Allison (1999: 186) has shown that "Both methods may lead to invalid conclusions if residual variation differs across groups." Unequal residual variation across groups will occur, for instance, whenever an unobserved variable (whose effect is incorporated in the disturbance term) has different impacts on the dependent variable depending on the group. Allison suggests that, as a rule of thumb, if "one group has coefficients that are consistently higher or lower than those in another group, it is a good indication of a potential problem ..." (p. 199). Allison explicated a new test to adjust for unequal residual variation, presenting the code for computation of this test in SAS, LIMDEP, BMDP, and STATA. The test is not implemented directly by PASW/SPSS or SAS, at least as of 1999. Note Allison's test is conservative in that it will always yield a chi-square which is smaller than the conventional test, making it harder to prove the existence of cross-group differences.
Copyright 1998, 2008, 2009 by G. David Garson.
Last update 8/7/09.