Overview
Data requirements. In all GLM models, the dependent variable(s) is/are numeric. The independents may be categorical factors (including both numeric and string types) or quantitative covariates. Data are assumed to come from a random sample for purposes of significance testing. The variance(s) of the dependent variable(s) is/are assumed to be the same for each cell formed by categories of the factor(s) (this is the homogeneity of variances assumption). Regression in GLM is simply a matter of entering the independent variables as covariates and, if there are sets of dummy variables (ex., Region, which would be translated into dummy variables in OLS regression, for ex., South = 1 or 0), entering the set variable (ex., Region) as a fixed factor, with no need for the researcher to create dummy variables manually. The b coefficients will be identical whether the regression model is run under ordinary regression (in SPSS, under Analyze, Regression, Linear) or under GLM (in SPSS, under Analyze, General Linear Model, Univariate). [However, in GLM the researcher must ask for "Parameter estimates" under the Options button in the GLM dialog, whereas their printing in the Regression procedure is automatic.] The R-square from the Regression procedure will equal the partial Eta squared from the GLM regression model.
Anova family. Although regression models may be run easily in GLM, as a practical matter univariate GLM is used primarily to run analysis of variance (ANOVA) and analysis of covariance (ANCOVA) models. Multivariate GLM is used primarily to run multiple analysis of variance (MANOVA) and multiple analysis of covariance (MANCOVA) models.
The key statistic in ANOVA is the F-test of difference of group means, testing if the means of the groups formed by values of the independent variable (or combinations of values for multiple independent variables) are different enough not to have occurred by chance. If the group means do not differ significantly then it is inferred that the independent variable(s) did not have an effect on the dependent variable. If the F test shows that overall the independent variable(s) is (are) related to the dependent variable, then multiple comparison tests of significance are used to explore just which values of the independent(s) have the most to do with the relationship. If the data involve repeated measures of the same variable, as in before-after or matched pairs tests, the F-test is computed differently from the usual between-groups design, but the inference logic is the same. There is also a large variety of other ANOVA designs for special purposes, all with the same general logic. Note that analysis of variance tests the null hypothesis that group means do not differ. It is not a test of differences in variances, but rather assumes relative homogeneity of variances. Thus some key ANOVA assumptions are that the groups formed by the independent variable(s) are relatively equal in size and have similar variances on the dependent variable ("homogeneity of variances"). Like regression, ANOVA is a parametric procedure which assumes normality (the dependent has a normal distribution for each value category of the independent(s)). Analysis of covariance (ANCOVA) is used to test the main and interaction effects of categorical variables on a continuous dependent variable, controlling for the effects of selected other continuous variables which covary with the dependent. The control variable is called the "covariate." There may be more than one covariate.
One may also perform planned comparisons or post hoc comparisons to see which values of a factor contribute most to the explanation of the dependent. ANCOVA uses built-in regression using the covariates to predict the dependent, then does an ANOVA on the residuals (the actual minus the predicted values of the dependent variable) to see if the factors are still significantly related to the dependent variable after the variation due to the covariates has been removed. In SPSS, select Analyze, General Linear Model, Univariate; enter the dependent variable, the factor(s), and the covariate(s); click the Model button and accept the default, which is Full Factorial (if you select Custom, your model should not include interactions of factors with covariates: that is used beforehand in testing the equality of regressions assumption discussed below in the "Assumptions" section, but not in the ANCOVA model itself). The Full Factorial model contains the intercept, all factor and covariate main effects, and all factor-by-factor interactions. For instance, for three factors A, B, and C, it includes the effects A, B, C, A*B, A*C, B*C, and A*B*C. It does not contain factor-by-covariate interactions. Covariates will be listed in the /DESIGN statement after the WITH keyword. The maximum number of covariates SPSS will process is 10.
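The list of effects in a full factorial model can be enumerated mechanically. The following is a minimal Python sketch, assuming the three factors A, B, and C from the text; the function name `full_factorial_effects` is hypothetical and is not part of SPSS.

```python
from itertools import combinations

def full_factorial_effects(factors):
    """Return all main effects and factor-by-factor interactions, lowest order first."""
    effects = []
    for order in range(1, len(factors) + 1):
        # Every subset of `order` factors defines one effect of that order.
        for combo in combinations(factors, order):
            effects.append("*".join(combo))
    return effects

effects = full_factorial_effects(["A", "B", "C"])
print(effects)
```

Covariates are deliberately left out of the enumeration, since the full factorial model adds them only as main effects and never crosses them with factors.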
Linear mixed models (LMM) and their subset cousin, analysis of variance components (VC), perform many of the same functions as analysis of variance under GLM. A comparison of GLM with both LMM and VC, illustrated with data, is found in the section on linear mixed models. While both GLM and LMM accept the use of random effects in models, LMM is preferred for reasons given in the comparison.
The formulas for the t-test (a special case of one-way ANOVA), and for the F-test used in ANOVA, thus reflect three things: the difference in means, group sample sizes, and the group variances. That is, the ANOVA F-test is a function of the variance of the set of group means, the overall mean of all observations, and the variances of the observations in each group weighted for group sample size. Thus, the larger the difference in means, the larger the sample sizes, and/or the lower the variances, the more likely a finding of significance.
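The relationship between the two tests can be checked numerically. The sketch below, in plain Python with made-up illustrative scores, computes the one-way ANOVA F ratio from group means, sizes, and variances, and shows that for two groups F equals the square of the equal-variance t statistic. The helper names `one_way_f` and `pooled_t` are hypothetical.

```python
from statistics import mean

def one_way_f(groups):
    """Return (F, df_between, df_within) for a list of groups of observations."""
    all_obs = [x for g in groups for x in g]
    grand = mean(all_obs)
    k, n_total = len(groups), len(all_obs)
    # Between-groups SS: squared deviations of group means, weighted by group size.
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    # Within-groups SS: squared deviations of observations from their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_b, df_w = k - 1, n_total - k
    return (ss_between / df_b) / (ss_within / df_w), df_b, df_w

def pooled_t(a, b):
    """Equal-variance independent-samples t statistic."""
    na, nb = len(a), len(b)
    ss = sum((x - mean(a)) ** 2 for x in a) + sum((x - mean(b)) ** 2 for x in b)
    pooled_var = ss / (na + nb - 2)
    return (mean(a) - mean(b)) / (pooled_var * (1 / na + 1 / nb)) ** 0.5

a = [4.0, 5.0, 6.0, 5.0]   # illustrative scores, group 1
b = [7.0, 8.0, 6.0, 9.0]   # illustrative scores, group 2
f_stat, df_b, df_w = one_way_f([a, b])
t_stat = pooled_t(a, b)
print(f_stat, t_stat ** 2)  # the two values agree up to rounding error
```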
In SPSS, select Analyze, Compare Means, One-Way ANOVA; enter the dependent variable in the Dependent list; enter the independent variable as the Factor.
Likewise, the F test of overall model significance shown for the "Corrected Model" row of the GLM Univariate table is the same as that in the Regression ANOVA table. And the R² effect size measure in Regression output corresponds to the partial eta² coefficient in GLM output.
This is the usual ANOVA design. There is one set of subjects: the "groups" refer to the subset of subjects associated with each category of the independent variable (in one-way ANOVA) or with each cell formed by multiple categorical independents (in factorial ANOVA), as in the illustration above. After measurements are taken for each group, analysis of variance is computed to see if the variance on the dependent variable between groups is different from the variance within groups. Just by chance, one would expect the variance between groups to be as large as the variance within groups. If the variance between groups is enough larger than the variance within groups, as measured by the F ratio (discussed below), then it is concluded that the grouping factor (the independent variable(s)) does/do have a significant effect.
In this example, using the figure above, the rows would be the four classes. The columns would be the four textbooks, administered in each of four periods. In period 1, teacher A would use the first textbook for class 1; in period 2, teacher A would use textbook 2 for class 4; in period 3, teacher A would use textbook 3 for class 3; and in period 4, teacher A would use textbook 4 for class 2. The other three teachers would rotate similarly, according to the design schedule above. In the schedule, each class starts with a different teacher and text, ruling out the chance that results would be attributable to different classes starting with the same treatment. Since no two classes ever have the same textbook in the same period, results cannot be attributed to a period effect either.
The figure above represents a 2x3x2 factorial design where there are treatment and control groups, each with two groups by sex (male, female) who are administered three levels of treatment (noise = low, medium, high) and some interval measurement is taken for each group on some variable (ex., test scores). The figure only shows the design factors. There may be one or more covariates as well, such as age. A full factorial design will model the main effects of the factors noise and sex; the main effect of the covariate age; and the interaction of noise*sex. It will not model noise*age or sex*age.
Thus, in the example above, in RCB Design, there are three blocks, one for each age group, where age group is the blocking factor. Within each block there are all six possible brand-dosage treatments (ex., Brand A, Dosage 2), assigned in random order to subjects within each of the three blocks.
In a typical split-plot repeated measures design, Subjects will be measured on some Score over a number of Trials. Subjects will also be split by some Group variable. In SPSS, Analyze, General Linear Model, Univariate; enter Score as the dependent; enter Trial and Group as fixed factors; enter Subject as a random factor; Press the Model button and choose Custom, asking for the Main effects for Group and Trial, and the interaction effect of Trial*Group; then click the Paste button and modify the /DESIGN statement to also include Subject(Group) to get the Subject-within-Group effect; then select Run All in the syntax window to execute.
In SPSS, Analyze, General Linear Model, Univariate; specify the main factor as fixed or random, then specify the nested factor as random; click the Model button and enter the main effects of the main (not nested) factor(s); click the Paste button and modify the /DESIGN statement to a format such as /DESIGN = mainfactor nestedfactor(mainfactor), signifying the model is the main effect of the fixed factor plus the effects of the random nested factor at each value of the main fixed factor. In the syntax window, Run All. In the resulting ANOVA table, a significant nestedfactor(mainfactor) effect means that the dependent variable varies by the nested factor even within the same level of (controlling for) the main factor.
Put another way, if a random factor is treated as a fixed factor, the researcher opens his or her research up to the charge that the findings pertain only to the particular arbitrary cases studied and findings and inferences might well be quite different if alternative cases had been selected. The purpose of using a random effect model is to avoid these potential criticisms by taking into account the variability of the replications or random effects when computing the error term which forms the denominator of the F test for random effect models.
In the figure above, a between-subjects data design is contrasted with a within-subjects (repeated measures) data design on the same topic: what is the effect of different sign colors on stopping distance in feet? In a between-subjects design, each subject experiences a different treatment (color). In a within-subjects design, each subject experiences all three treatments (colors), and fewer subjects are needed.
A factor is a random factor if only a random sample of its values is measured, which may be the case when a factor has a very large number of values. Thus "city" would be a random factor if its values were "1=NYC", "2=Atlanta", "3=Miami", "4=Chicago", and "5=Los Angeles".

Below is a second example (anova.xls) which may be implemented on one's spreadsheet so that students can play with the numbers to see different main and interaction effects. This example shows the interaction of learning type (control vs. classroom vs. online) and hours of instruction (low, medium, high). The upper set of lines in the graph is the means, the lower is the standard deviations. Normally the researcher is primarily interested in the set of means. For the means set, that the black control group means line is below and does not cross the others shows that online and classroom education is associated with higher scores for all hours of instruction categories. That the aqua classroom means line crosses the green online means line, shows there is an interaction of learning type with hours category. For low hours, online subjects score higher, but for medium and high hours, classroom subjects score higher.

| value of d | % of comparison group below | % of non-overlap |
|---|---|---|
| 0 | 50 | 0 |
| .2 | 58 | 14.7 |
| .4 | 66 | 27.4 |
| .6 | 73 | 38.2 |
| .8 | 79 | 47.4 |
| 1.0 | 84 | 55.4 |
| 1.5 | 93.3 | 70.7 |
| 2.0 | 97.7 | 81.1 |
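The table values can be approximated from the normal distribution. The sketch below is an assumption about how such a table is built, not a statement of the author's method: it takes both groups to be normal with equal variances, uses the normal CDF at d for the percent of the comparison group below the treatment mean, and uses Cohen's U1 formula for percent non-overlap. Results agree with the table to within rounding.

```python
from math import erf, sqrt

def phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def pct_below(d):
    """Percent of the comparison group falling below the other group's mean."""
    return 100.0 * phi(d)

def pct_nonoverlap(d):
    """Cohen's U1: percent of the combined area not shared by the two distributions."""
    p = phi(abs(d) / 2.0)
    return 100.0 * (2.0 * p - 1.0) / p

for d in (0, 0.2, 0.4, 0.6, 0.8, 1.0, 1.5, 2.0):
    print(f"d={d:<4} below={pct_below(d):5.1f}  non-overlap={pct_nonoverlap(d):5.1f}")
```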
If the computed F value is around 1.0, differences in group means are only random variations. If the computed F score is significantly greater than 1, then there is more variation between groups than within groups, from which we infer that the grouping variable does make a difference. Note that the significant difference may be very small for large samples. The researcher should report not only significance, but also strength of association, discussed below.
The lack of fit F test is a test of the difference of a full vs. reduced model. The reduced model is the researcher's fitted model, which must be a non-full factorial model. The full model to which it is compared is a full factorial model. The sum of squares for the reduced model is partitioned into sum of squares for pure error (SSPE) and sum of squares for lack of fit (SS(LOF)). Thus SS(LOF) = SSE(Reduced) - SSPE, where SSE(Reduced) refers to the SSE for the researcher's fitted model (in SPSS, this is found in the Error row of the Sum of Squares column of the "Tests of Between-Subjects Effects" table). The lack of fit test is described further in Khuri (1985) and in Levy & Neill (1990).
The difference between planned multiple comparison tests discussed in this section and post-hoc multiple comparison tests discussed in the next section is one of power, not purpose. Some, including SPSS, lump all the tests together as "post hoc tests", as illustrated below. This figure shows the SPSS post hoc tests dialog after the Post Hoc button is pressed in the GLM Univariate dialog. (There is a similar dialog when Analyze, Compare Means, One-Way ANOVA is chosen, invoking the SPSS ONEWAY procedure, which the GLM procedure has superseded.) The essential difference is that the planned multiple comparison tests in this section are based on the t-test, which generally has more power than the post-hoc tests listed in the next section.
Warning! The model, discussed above, will make a difference for multiple comparison tests. A factor (ex., race) may display different multiple comparison results depending on what other factors are in the model. Covariates cannot be in the model at all for these tests to be done. Interactions may be in the model, but multiple comparison tests are not available to test them. Also note that all these t-tests are subject to the equality of variances assumption and therefore the data must meet Levene's test, discussed below. Finally, note that the significance level (.05 is default) may be set using the Options button off the main GLM dialog.
SPSS. A simple t-test, with or without Bonferroni adjustment, may be obtained by selecting Analyze, Compare Means, One-Way ANOVA.
The Bonferroni method applies the simple t-test, but then adjusts the significance level by multiplying by the number of comparisons being made. For instance, a finding of .01 significance for 9 comparisons becomes .09. This is equivalent to saying that if the target alpha significance level is .05, then the t-test must show alpha/9 (ex., .05/9 = .0056) or lower for a finding of significance to be made. Bonferroni-adjusted multiple t-tests are usually employed only when there are few comparisons, as with many it quickly becomes practically impossible to show significance. If the independents formed 8 groups there would be 8!/(6!2!) = 28 comparisons and if one used the .05 significance level, one would expect at least one of the comparisons to generate a false positive (thinking you had a relationship when you did not). Note this adjustment may be applied to F-tests as well as t-tests. That is, it can handle nonpairwise as well as pairwise comparisons.
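The arithmetic in this paragraph can be reproduced in a few lines. This is a minimal Python sketch; the function name `bonferroni_adjust` is hypothetical, and the p-values other than .01 are placeholders chosen only to make up a set of 9 comparisons.

```python
from math import comb

def bonferroni_adjust(p_values):
    """Multiply each p-value by the number of comparisons, capping at 1.0."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]

# One observed p of .01 among 9 comparisons (the other eight are placeholders).
adjusted = bonferroni_adjust([0.01] + [0.5] * 8)
print(round(adjusted[0], 4))           # adjusted significance for the .01 finding

# Equivalent view: the per-test threshold needed for an overall alpha of .05.
per_test_alpha = 0.05 / 9
print(round(per_test_alpha, 4))

# Pairwise comparisons among 8 groups, and expected false positives at alpha = .05
# when every null hypothesis is true and no adjustment is made.
n_pairs = comb(8, 2)
expected_false_pos = n_pairs * 0.05
print(n_pairs, round(expected_false_pos, 2))
```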
The Bonferroni-adjusted t-test imposes an extremely small alpha significance level as the number of comparisons becomes large. That is, this method is not recommended when the number of comparisons is large because the power of the test becomes low. Klockars and Sax (1986: 38-39) recommend using a simple .05 alpha rate when there are few comparisons, but using the more stringent Bonferroni-adjusted multiple t-test when the number of planned comparisons is greater than the number of degrees of freedom for between-groups mean square (which is k-1, where k is the number of groups). Nonetheless, researchers still try to limit the number of comparisons, trying to reduce the probability of Type II errors (accepting a false null hypothesis). This test is not recommended when the researcher wishes to perform all possible pairwise comparisons.
By the Bonferroni test, the figure above shows whites are significantly different from blacks but not from "other" races, with respect to mean highest year of education completed (the dependent variable).
In comparing group means on a post-hoc basis, one is comparing the means on the dependent variable for each of the k groups formed by the categories of the independent factor(s). The possible number of comparisons is k(k-1)/2. Multiple comparisons help specify the exact nature of the overall effect determined by the F test. However, note that post hoc tests do not control for the levels of other factors or for covariates (that is, interaction and control effects are not taken into account). Findings of significance or nonsignificance between factor levels must be understood in the context of full ANOVA F-test findings, not just post hoc tests, which are subordinate to the overall F test. Note the model cannot contain covariates when employing these tests.
Computation. The q-statistic, also called the q range statistic or the Studentized range statistic, is commonly used in coefficients for post-hoc multiple comparisons, though some post hoc tests use the t statistic. In contrast to the planned comparison t-test, coefficients based on the q-statistic are commonly used for post-hoc comparisons (that is, when the researcher wishes to explore the data to uncover large differences, without limiting investigation by a priori theory). Both the q and t statistics use the difference of means in the numerator, but where the t statistic uses the standard error of difference between the means in the denominator, q uses the standard error of the mean. Consequently, where the t test tests the difference between two means, the q-statistic tests the probability that the largest mean and smallest mean among the k groups formed by the categories of the independent(s) were sampled from the same population. If the q-statistic computed for the two sample means is not as large as the criterion q value in a table of critical q values, then the researcher cannot reject the null hypothesis that the groups do not differ at the given alpha significance level (usually .05). If the null hypothesis is not rejected for the largest compared to smallest group means, it follows that all intermediate groups are also drawn from the same population -- so the q-statistic is thus also a test of homogeneity for all k groups formed by the independent variable(s).
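The difference between the two denominators can be seen in a small computation. The sketch below assumes equal group sizes and uses the pooled within-groups mean square as the variance estimate for both statistics; the scores are made up for illustration and the function names are hypothetical. The critical q value must still be taken from a table and is not computed here.

```python
from math import sqrt
from statistics import mean

def ms_within(groups):
    """Pooled within-groups mean square."""
    ss = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df = sum(len(g) for g in groups) - len(groups)
    return ss / df

def q_statistic(groups):
    """Studentized range: (largest mean - smallest mean) / standard error of a mean."""
    n = len(groups[0])  # assumes every group has the same size
    means = [mean(g) for g in groups]
    return (max(means) - min(means)) / sqrt(ms_within(groups) / n)

def t_extremes(groups):
    """t for the same two extreme means, using the standard error of a difference."""
    n = len(groups[0])
    means = [mean(g) for g in groups]
    return (max(means) - min(means)) / sqrt(2 * ms_within(groups) / n)

groups = [[4, 5, 6, 5], [7, 8, 6, 9], [5, 6, 7, 6]]  # illustrative data
q = q_statistic(groups)
t = t_extremes(groups)
print(q, t, t * sqrt(2))
```

With equal group sizes the two statistics differ only by a factor of the square root of 2, which is why q and t need different critical-value tables.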
Output formats: pairwise vs. multiple range. In pairwise comparisons tests, output is produced similar to the Bonferroni and Sidak tests above, for the LSD, Games-Howell, Tamhane's T2 and T3, Dunnett's C, and Dunnett's T3 tests. Homogeneous subsets for range tests are provided for S-N-K, Tukey's b, Duncan, R-E-G-W F, R-E-G-W Q, and Waller. Some tests are of both types: Tukey's honestly significant difference test, Hochberg's GT2, Gabriel's test, and Scheffé's test.
Warning! The model, discussed above, will make a difference for post hoc tests. A factor (ex., race) may display different multiple comparison results depending on what other factors are in the model. Covariates cannot be in the model at all for these tests to be done. Interactions may be in the model, but multiple comparison tests are not available to test them. Also note that all the post-hoc tests are subject to the equality of variances assumption and therefore the data must meet Levene's test, discussed below, with the exception of Tamhane's T2, Dunnett's T3, Games-Howell, and Dunnett's C, all of which are tailored for data where equal variances cannot be assumed. Finally, note that the significance level (.05 is default) may be set using the Options button off the main GLM dialog.
LSD is the most liberal of the post-hoc tests (it is most likely to reject the null hypothesis in favor of finding groups do differ). It controls the experimentwise Type I error rate at a selected alpha level (typically 5%), but only for the omnibus (overall) test of the null hypothesis. LSD allows higher Type I errors for the partial null hypotheses involved in the comparisons. Toothaker (1993: 42) recommends against any use of LSD on the grounds that it has poor control of experimentwise alpha significance, and better alternatives exist such as Shaffer-Ryan, discussed below. Others, such as Cardinal & Aitken (2005: 86) recommend its use only for factors with three levels. However, the LSD test is the default in SPSS for pairwise comparisons in its GLM or UNIANOVA procedures. As illustrated below, the LSD test is interpreted in the same manner as the Bonferroni test above and for this example yields the same substantive results: whites differ significantly from blacks but not other races on mean highest school year completed.
While the Scheffé test has the advantage of maintaining an experimentwise .05 significance level in the face of multiple comparisons, it does so at the cost of a loss in statistical power (more Type II errors may be made -- thinking you do not have a relationship when you do). That is, the Scheffé test is a very conservative one (more conservative than Dunn or Tukey, for ex.), not appropriate for planned comparisons but rather restricted to post hoc comparisons. Even for post hoc comparisons, the test is used for complex comparisons and is not recommended for pairwise comparisons due to "an unacceptably high level of Type II errors" (Brown and Melamed, 1990: 35). Toothaker (1993: 28) recommends the Scheffé test only for complex comparisons, or when the number of comparisons is large. The Scheffé test is low in power and thus not preferred for particular comparisons, but it can be used when one wishes to do all or a large number of comparisons. Tukey's HSD is preferred for making all pairwise comparisons among group means, and Scheffé for making all or a large number of other linear combinations of group means.
Output for the example of Gender as a factor predicting the dependent Highest year of school completed, looks like this:
In the output below, the Plots dialog was used to ask for a profile plot of the interaction of sex*race, and the same interaction was specified for estimated marginal means in the Options dialog. That the profile plot lines are not parallel shows there is an interaction effect between sex and race, albeit not a strong one. This is also indicated by the estimated marginal means in the table, though perhaps less easy to observe quickly.
Thus in the figure above, analysis of variance tests whether group means differ, in this case for a factor with three levels (groups). The homogeneity of variances assumption is met by the width of the distributional curve for each group being approximately the same, which it is in the figure. (In addition, the normality assumption, discussed below, is met by each group displaying a bell-shaped curve. And the more the positions of the group curves differ, the more likely a finding that the factor is significant.)
However, ANOVA is robust for small and even moderate departures from homogeneity of variance (Box, 1954). Still, a rule of thumb is that the ratio of largest to smallest group variances should be 3:1 or less. Moore (1995) suggests the more lenient standard of 4:1. When choosing rules of thumb, remember that the more unequal the sample sizes, the smaller the differences in variances which are acceptable. Marked violations of the homogeneity of variances assumption can lead to either over- or under-estimation of the significance level and so disrupt the F-test.
In the figure above, a full factorial model is tested, in which years of education is predicted from the fixed factors region, race, and gender, using the covariate number of siblings. Since Levene's test is significant, the researcher concludes that the groups do not have equal variances. Group variances may be examined in the Descriptive Statistics table, illustrated below, by squaring the standard deviation. The largest variance in years of education is 5.891² = 34.70 for males in the Southeast who are "other" in race (nonwhite, nonblack); the smallest variance is 1.773² = 3.14 for Northeast black females, as shown in the partial output below. Since the ratio of the largest to smallest group variance exceeds 10, there is a substantial violation of the assumption of homogeneity of variances. This can be expected to increase Type I errors on F tests in ANOVA (recall Type I errors are false positives, concluding a relationship is significant when it is not). Put another way, if the computed significance for an F test comes out to be .04, it is likely to be worse than that and while the .04 would say the relationship is significant, that conclusion could well be a Type I error for these data.
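The variance-ratio check described here is simple to reproduce from the reported standard deviations. This is a minimal Python sketch using only the two extreme standard deviations quoted in the text; the function name `variance_ratio` is hypothetical.

```python
def variance_ratio(standard_deviations):
    """Ratio of the largest to the smallest group variance."""
    variances = [sd ** 2 for sd in standard_deviations]
    return max(variances) / min(variances)

largest_sd, smallest_sd = 5.891, 1.773   # values quoted from the Descriptive Statistics table
ratio = variance_ratio([largest_sd, smallest_sd])
print(round(largest_sd ** 2, 2), round(smallest_sd ** 2, 2), round(ratio, 1))

# Compare against the 3:1 and 4:1 rules of thumb for homogeneity of variances.
print("violates 3:1 rule:", ratio > 3, "| violates 4:1 rule:", ratio > 4)
```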
Spread vs. Level plots. One may also inspect for homogeneity of variances visually by asking for a spread vs. level plot under the Univariate GLM Options button:
In the plot above, the factors with levels in parentheses are region (3), race (3), and sex (2), jointly giving 18 factor groups corresponding to the 18 dots on the plot. The more the dots are within a narrow band of variances on the Y axis, the greater the homogeneity of variances. Additionally, since the X axis is the means of the factor groups, one can visually inspect to see if there is a trend for variances to increase as means increase or some other pattern. Here there is no such clear pattern, but there is considerable diversity in the variances of the factor groups.
In SPSS, Analyze, Compare Means, One-Way ANOVA; click Options; select Brown-Forsythe.
In SPSS, select Analyze, General Linear Model, Univariate; click Save; select the residual res_1 for the dependent; click Plots; select Normality Plots.
Balanced ANOVA designs have equal group sizes, unbalanced ANOVA does not. Unbalanced designs require adjustments in how ANOVA is computed. This is done automatically in ANOVA and MANOVA in SPSS. In SAS, unless a recent version has changed it, no correction is made in PROC ANOVA but correction for unequal groups is done in PROC GLM.
Equal group sizes are not assumed by the t or F tests for the overall model. The range tests based on the q statistic do require a common n, but this is derived by computing the harmonic mean of the unequal group n's when differences are small, and by computing the harmonic mean of the two groups being compared when differences are larger.
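The harmonic mean used as the common n is easy to compute. The sketch below uses Python's standard `statistics.harmonic_mean`; the group sizes are made up for illustration.

```python
from statistics import harmonic_mean, mean

group_sizes = [8, 10, 12]          # illustrative unequal group n's

# Common n across all groups (used when differences in group size are small).
common_n = harmonic_mean(group_sizes)

# Common n for one pair of groups (used when differences are larger).
pair_n = harmonic_mean([8, 12])

print(round(common_n, 2), round(pair_n, 2), mean(group_sizes))
```

The harmonic mean never exceeds the arithmetic mean, so the derived common n leans toward the smaller groups.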
Epsilon. If the researcher wishes to correct the univariate F test, this is done by using Huynh-Feldt or Greenhouse-Geisser epsilon. The closer epsilon is to 1.0, the greater the sphericity. Recall that F is the ratio of between-groups to within-groups mean square variance. The degrees of freedom for between-groups is (k-1), where k = the number of groups. The degrees of freedom for within-groups is k(n-1), where n is the number of cases in each group. To correct F given a finding of lack of sphericity, the researcher multiplies both the numerator and denominator degrees of freedom by the value of epsilon. SPSS supplies Huynh-Feldt epsilon and the more conservative Greenhouse-Geisser epsilon (which in turn is an extension of Box's epsilon, no longer widely used). For more severe departures from sphericity (epsilon < .75), the more conservative Greenhouse-Geisser epsilon is used, while Huynh-Feldt epsilon is used for less severe violations of the sphericity assumption. The researcher rounds degrees of freedom down to the nearest whole number and looks up the critical F value in a table using the corrected degrees of freedom.
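The correction can be sketched as follows. This is an illustrative Python sketch under stated assumptions: the .75 cutoff is compared against the Greenhouse-Geisser estimate, epsilon is applied to both the numerator and denominator degrees of freedom (the usual form of these corrections), and the epsilon values and degrees of freedom are made up. The function names are hypothetical.

```python
from math import floor

def choose_epsilon(greenhouse_geisser, huynh_feldt, threshold=0.75):
    """Use Greenhouse-Geisser for severe departures, Huynh-Feldt otherwise."""
    return greenhouse_geisser if greenhouse_geisser < threshold else huynh_feldt

def corrected_df(df_numerator, df_denominator, epsilon):
    """Scale both degrees of freedom by epsilon (assumed form of the correction)."""
    return df_numerator * epsilon, df_denominator * epsilon

eps = choose_epsilon(greenhouse_geisser=0.62, huynh_feldt=0.70)   # illustrative values
df1, df2 = corrected_df(3, 27, eps)

# Round down to whole numbers before looking up the critical F in a table.
table_df = (floor(df1), floor(df2))
print(eps, round(df1, 2), round(df2, 2), table_df)
```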
GLM dependent var [BY factor list [WITH covariate list]]
 [/RANDOM=factor factor...]
 [/REGWGT=varname]
 [/METHOD=SSTYPE({1 })]
                 {2 }
                 {3**}
                 {4 }
 [/INTERCEPT=[INCLUDE**] [EXCLUDE]]
 [/MISSING=[INCLUDE] [EXCLUDE**]]
 [/CRITERIA=[EPS({1E-8**})][ALPHA({0.05**})]
                 {a }             {a }
 [/PRINT = [DESCRIPTIVE] [HOMOGENEITY] [PARAMETER][ETASQ]
           [GEF] [LOF] [OPOWER] [TEST(LMATRIX)]]
 [/PLOT=[SPREADLEVEL] [RESIDUALS]
        [PROFILE (factor factor*factor factor*factor*factor ...)
        [WITH(covariate={value} [...])]]]
                        {MEAN }
 [/TEST=effect VS {linear combination [DF(df)]}]
                  {value DF (df) }
 [/LMATRIX={["label"] effect list effect list ...;...}]
           {["label"] effect list effect list ... }
           {["label"] ALL list; ALL... }
           {["label"] ALL list }
 [/KMATRIX= {number }]
            {number;...}
 [/CONTRAST (factor name)={DEVIATION[(refcat)]** }]
                          {SIMPLE [(refcat)] }
                          {DIFFERENCE }
                          {HELMERT }
                          {REPEATED }
                          {POLYNOMIAL [({1,2,3...})]}
                                        {metric }
                          {SPECIAL (matrix) }
 [/POSTHOC =effect [effect...]
            ([SNK] [TUKEY] [BTUKEY][DUNCAN]
            [SCHEFFE] [DUNNETT(refcat)] [DUNNETTL(refcat)]
            [DUNNETTR(refcat)] [BONFERRONI] [LSD] [SIDAK]
            [GT2] [GABRIEL] [FREGW] [QREGW] [T2] [T3] [GH] [C]
            [WALLER ({100** })])]
                     {kratio}
            [VS effect]
 [/EMMEANS=TABLES({OVERALL }) [WITH(covariate={value} [...])]]
                  {factor }                   {MEAN }
                  {factor*factor..
           [COMPARE ADJ(LSD) (BONFERRONI) (SIDAK)]
 [/SAVE=[tempvar [(name)]] [tempvar [(name)]]...]
 [/OUTFILE=[{COVB('savfile'|'dataset')}]
            {CORB('savfile'|'dataset')}
           [EFFECT('savfile'|'dataset')] [DESIGN('savfile'|'dataset')]
 [/DESIGN={[INTERCEPT...] }]
          {[effect effect...]}
** Default if the subcommand or keyword is omitted.
| | SS | df | MS | F |
|---|---|---|---|---|
| between or explained | 64 | 2 | 32 | 9.88 |
| within | 68 | 21 | 3.24 | |
| total | 132 | 23 | | |
SS is the sum of squares (the variation), df the degrees of freedom, MS the mean square (the variance, which is SS/df), and F is the F ratio (which is between MS divided by within MS). As the MS for between-groups is much greater than the MS for within-groups, this table shows the grouping variable does have an effect, as indicated by the F ratio being greater than 1. The grouping variable had three groups (high, medium, low), which is why the between-groups df was (3-1)=2. There are 8 people per group, so the within-groups d.f. is number of groups times one less than the number of people per group: 3*(8-1)=21. These are the df for the numerator and denominator respectively. We look in the F table for the .05 significance level with 2 and 21 d.f., and find the critical F value is 3.47. As the computed F value is considerably more (9.88), we can be 95% confident that the grouping (independent) variable makes a difference in the dependent variable. (In fact, the F is high enough to be significant at the .001 level, and some computer programs will print this out in the ANOVA table along with or instead of the F value).
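The entries of the one-way table follow directly from the sums of squares and the design. A minimal Python sketch, using only the numbers given in the text (the critical value 3.47 is taken from the text, not computed):

```python
# Design: 3 groups with 8 people each.
k, n_per_group = 3, 8
df_between = k - 1                   # numerator degrees of freedom
df_within = k * (n_per_group - 1)    # denominator degrees of freedom

ss_between, ss_within = 64, 68
ms_between = ss_between / df_between   # mean square = SS / df
ms_within = ss_within / df_within
f_ratio = ms_between / ms_within

critical_f = 3.47                      # .05 level with 2 and 21 d.f., as quoted in the text
print(df_between, df_within, round(ms_within, 2), round(f_ratio, 2))
print("significant at .05:", f_ratio > critical_f)
```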
The table for two-way ANOVA is similar, but there are additional rows for the main (dependent on independent) effects and for interaction effects, as well as total explained and residual portions:
| | SS | df | MS | F |
|---|---|---|---|---|
| Main Effects | 88 | 3 | 29.333 | 18.857 |
| X1 | 24 | 1 | 24 | 15.429 |
| X2 | 64 | 2 | 32 | 20.571 |
| 2-Way Interaction Effects | 16 | 2 | 8 | 5.143 |
| X1 X2 | 16 | 2 | 8 | 5.143 |
| explained | 104 | 5 | 20.8 | 13.371 |
| residual | 28 | 18 | 1.556 | |
| total | 132 | 23 | 5.739 | |
The two-way table is interpreted in the same way, except now there are rows for assessing the between-groups (main effects) variation overall and for each independent, and there are rows for assessing the interaction effects overall and for each interaction (here there is just one interaction, which is thus the same as the overall interaction row). The Explained row now reflects the combined main and interaction effects of the grouping variables, and the Residual is the remaining within-groups variation (the total variation minus the explained variation).
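The two-way table can be rebuilt from its sums of squares and degrees of freedom, which makes the role of the residual mean square as the common denominator explicit. A minimal Python sketch using the figures in the table:

```python
# (SS, df) for each effect in the two-way table.
effects = {
    "X1": (24, 1),
    "X2": (64, 2),
    "X1*X2": (16, 2),
}
residual_ss, residual_df = 28, 18
ms_residual = residual_ss / residual_df

# Each F is the effect's mean square divided by the residual mean square.
f_ratios = {name: (ss / df) / ms_residual for name, (ss, df) in effects.items()}

# "Explained" pools the main and interaction effects.
explained_ss = sum(ss for ss, _ in effects.values())
explained_df = sum(df for _, df in effects.values())
explained_f = (explained_ss / explained_df) / ms_residual

for name, f in f_ratios.items():
    print(name, round(f, 3))
print("explained", explained_ss, explained_df, round(explained_f, 3))
print("total SS", explained_ss + residual_ss)
```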
The two-way ANOVA table can be interpreted in terms of the difference of mean differences. The F test for either of the main effects in the table above is reflected in the difference between row means or between column means (depending on whether X1 or X2 is the row or column variable) in a table (not shown) where X1 and X2 are independent factors and the cell entries are means on the dependent variable. The F test for the interaction effect is reflected in the difference of these two mean differences.
Under counterbalancing, treatments are introduced in different orders for different subjects, at random, such that overall each treatment occurs equally often at each time stage (here, four time periods) and equally often before and after every other treatment. Some algorithms have been devised to help the researcher set the sequences for each subject.
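One well-known construction for such sequences is the Williams design, a Latin square in which each treatment appears once in every time period and each treatment is immediately followed by every other treatment exactly once. The text does not name a specific algorithm, so the following Python sketch is offered only as one possible approach, valid for an even number of treatments; the function names are hypothetical.

```python
def williams_first_row(n):
    """First sequence of a Williams design: 0, 1, n-1, 2, n-2, ..."""
    row, low, high, take_low = [0], 1, n - 1, True
    while len(row) < n:
        if take_low:
            row.append(low)
            low += 1
        else:
            row.append(high)
            high -= 1
        take_low = not take_low
    return row

def williams_square(n):
    """Return n treatment sequences (rows) for an even number n of treatments."""
    if n % 2 != 0:
        raise ValueError("this simple construction requires an even number of treatments")
    first = williams_first_row(n)
    # Each later row shifts every treatment label by one, modulo n.
    return [[(t + shift) % n for t in first] for shift in range(n)]

square = williams_square(4)
for sequence_id, order in enumerate(square, start=1):
    # Print treatments as 1..4 across the four time periods.
    print(f"sequence {sequence_id}:", [t + 1 for t in order])
```

Subjects would then be assigned to the resulting sequences at random, in equal numbers per sequence.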
MANOVA Y BY a (1,3) b(1, 4)
 /DESIGN = a VS 1
           a BY b = 1 VS WITHIN
           b VS WITHIN
The SPSS ANOVA procedure can be used as a test for the existence of linear, quadratic, and other polynomial relationships, using the "Contrasts" option. Of course, nonlinear effects can be modeled in regression using polynomial, logarithmic, or other nonlinear data transformations. A polynomial contrast partitions the between-groups sums of squares into trend components, which can be used to test for a trend (ex., a linear trend) of the dependent variable across the ordered levels of the categorical independent variable. SPSS supports 1st, 2nd, 3rd, 4th, and 5th degree polynomials.
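The partition into trend components can be illustrated for a factor with three equally spaced levels and equal group sizes. The sketch below uses the standard orthogonal polynomial coefficients for three levels, (-1, 0, 1) for the linear trend and (1, -2, 1) for the quadratic; the group means and n are made up for illustration, and the function name `contrast_ss` is hypothetical.

```python
def contrast_ss(means, coefficients, n_per_group):
    """Sum of squares for a contrast among group means with equal group sizes."""
    estimate = sum(c * m for c, m in zip(coefficients, means))
    return estimate ** 2 / (sum(c ** 2 for c in coefficients) / n_per_group)

means = [2.0, 3.0, 7.0]      # illustrative means at three ordered levels
n = 8                        # illustrative cases per group

ss_linear = contrast_ss(means, [-1, 0, 1], n)
ss_quadratic = contrast_ss(means, [1, -2, 1], n)

# Between-groups sum of squares computed directly from the group means.
grand_mean = sum(means) / len(means)
ss_between = sum(n * (m - grand_mean) ** 2 for m in means)

print(ss_linear, ss_quadratic, ss_between)
```

Because the two contrasts are orthogonal and there are two between-groups degrees of freedom, the linear and quadratic sums of squares add up to the between-groups sum of squares.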
Copyright 1998, 2008, 2009 by G. David Garson.
Last updated 2/6/09.