Overview
Multivariate analysis of variance (MANOVA) is used to see the main and interaction effects of categorical variables on multiple dependent interval variables. MANOVA uses one or more categorical independents as predictors, like ANOVA, but unlike ANOVA, there is more than one dependent variable. Where ANOVA tests the differences in means of the interval dependent for various categories of the independent(s), MANOVA tests the differences in the centroid (vector) of means of the multiple interval dependents, for various categories of the independent(s). One may also perform planned comparisons or post hoc comparisons to see which values of a factor contribute most to the explanation of the dependents. There are multiple potential purposes for MANOVA.
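As a rough illustration of what a "centroid" means here, the sketch below computes the vector of dependent-variable means for each level of a single factor. The function name, the data, and the group codes are made up for illustration; this only describes the quantities MANOVA compares and does not perform any significance test.

```python
import numpy as np

def group_centroids(y, groups):
    """Return {factor level: mean vector of the dependents} for each level."""
    y = np.asarray(y, dtype=float)
    groups = np.asarray(groups)
    return {
        level.item(): y[groups == level].mean(axis=0)
        for level in np.unique(groups)
    }

# Hypothetical data: two dependents (columns) measured on four cases,
# with a two-level factor coded 1 and 2.
dependents = [[1, 10], [3, 14], [5, 20], [7, 24]]
factor = [1, 1, 2, 2]

centroids = group_centroids(dependents, factor)
for level, centroid in centroids.items():
    print(level, centroid)
```

ANOVA would compare the groups one column at a time; MANOVA asks whether the two mean vectors as a whole differ.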
Multivariate analysis of covariance (MANCOVA) is similar to MANOVA, but interval independents may be added as "covariates." These covariates serve as control variables for the independent factors, serving to reduce the error term in the model. Like other control procedures, MANCOVA can be seen as a form of "what if" analysis, asking what would happen if all cases scored equally on the covariates, so that the effect of the factors over and beyond the covariates can be isolated. The discussion of concepts in the ANOVA section also applies, including the discussion of assumptions.
Note: To get partial eta-square in this table in SPSS one must check "Estimates of effect size" in the Options button dialog. To get power, one must check "Observed power" also.
In the Tests of Between Subjects Effects table, partial eta-square serves as an effect size measure. The noncentrality index is used to compute the power level, which by rule of thumb should be equal to or greater than .80 to accept with confidence that the chance of Type II error is low enough for a finding of non-significance by the F test (that is, to be confident that the relationship does not exist).
In the example above, the researcher concludes that the main effect of all factors and covariates is significant except for the effect of sex on happiness, and for that effect there is insufficient power to do more than fail to reject the null hypothesis that sex is unrelated to general happiness. A number of the interaction effects are non-significant in this full factorial model, some with sufficient power (>.80) to conclude no relationship exists (as opposed to merely failing to reject the null hypothesis).
The figure above is a partial table. The full table contains similar information for all levels of all effects for each dependent variable. Here only one dependent (happiness) is shown, and not all interaction effect levels are shown even for this one.
For factors, the univariate t-test significance for levels of a factor (ex., marital in the figure above) refers to the significance of the contrast between the given level and the reference level (the last category if the default is accepted; here, marital = 5 = never married). It is quite possible for these univariate tests to be different from the multivariate test shown in the "Multivariate Tests" and "Tests of Between Subjects Factors" tables above. For instance, the "Marital Status" variable in the parameters table above shows all levels to be non-significant with an effect size of .000, yet the multivariate tests show marital status to be significant overall and for each of the three dependents. Again, the multivariate and between-groups tests, not the b parameters and their t-tests, are the appropriate basis for assessing factors in multivariate GLM and are what are reported.
For covariates (ex., educ above), the t-test significance level (the "Sig." column) will be the same as in the F-test in the "Tests of Between Subjects Effects" table above and refers to whether or not the covariate contributes significantly to the model, with the same partial eta-squared effect size.
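A minimal sketch of why the t-test and F-test for a covariate give the same partial eta-squared: for a one-degree-of-freedom effect, F equals t squared, and partial eta-squared can be written either in terms of sums of squares or in terms of F and its degrees of freedom. The function names and all numbers below are hypothetical and are not taken from the SPSS output discussed here.

```python
import math

def partial_eta_squared(ss_effect, ss_error):
    """Partial eta-squared = SS_effect / (SS_effect + SS_error)."""
    return ss_effect / (ss_effect + ss_error)

def partial_eta_squared_from_f(f_value, df_effect, df_error):
    """Same quantity, recovered from F and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)

# Hypothetical one-df covariate effect.
ss_effect, ss_error = 20.0, 80.0
df_effect, df_error = 1, 40

f_value = (ss_effect / df_effect) / (ss_error / df_error)  # mean squares ratio
t_value = math.sqrt(f_value)                               # t**2 == F when df_effect == 1

from_ss = partial_eta_squared(ss_effect, ss_error)
from_f = partial_eta_squared_from_f(f_value, df_effect, df_error)
from_t = partial_eta_squared_from_f(t_value ** 2, 1, df_error)
print(from_ss, from_f, from_t)
```

All three routes give the same value, which is why the effect size in the parameter estimates table matches the one in the between-subjects table for a covariate.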
While the reader is referred to the univariate GLM section for discussion of multiple comparison and post-hoc tests, the figure above illustrates two such tests: the Tukey honestly significant difference (HSD) test and the Bonferroni adjusted t-test. Typically the former would be applied for post hoc analysis and the latter for planned comparisons. The table is partial output, just for the "general happiness" dependent variable. Full output covers all dependent variables. While coefficients of significance vary, both tests lead to the same substantive conclusions. For instance, divorced and widowed respondents are not significantly different from never married respondents on general happiness while married and separated respondents are significantly different.
Since general happiness was coded such that higher scores indicate less happiness, the signs in the mean difference column indicate that never married respondents are significantly less happy (had higher general happiness scores, due to reverse coding) than married respondents while the reverse was true of separated respondents (as indicated by the negative mean difference sign). Asterisks after the mean differences indicate which ones are significant, which is the same as also indicated in the "Sig." column.
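The Bonferroni adjustment mentioned above can be sketched in a few lines: each unadjusted p-value is multiplied by the number of comparisons and capped at 1. The p-values below are hypothetical, and the function is an illustration of the adjustment only, not a reproduction of the SPSS post hoc procedure (which also computes the pairwise t-tests themselves).

```python
def bonferroni_adjust(p_values):
    """Multiply each p-value by the number of comparisons, capping at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]

# Hypothetical unadjusted p-values for three pairwise comparisons.
raw_p = [0.01, 0.04, 0.50]
adjusted_p = bonferroni_adjust(raw_p)
significant = [p < 0.05 for p in adjusted_p]
print(adjusted_p, significant)
```

In this made-up case the second comparison is significant before adjustment but not after, which is the kind of difference in "coefficients of significance" that can arise between adjusted and unadjusted tests.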
To illustrate briefly, in the figure above, simple contrasts were requested for "Marital status" (marital). Simple contrasts compare the given level with the last level by default. Comparing marital = 1 ("Married") with the reference category (marital = 5 = "Never married") shows a significant difference between the two factor levels (married vs. never married) for general happiness and number of children, but not for labor force status. In contrast, level 2 ("Widowed") is not significantly different from level 5 on any of the three dependent variables. See the univariate GLM section for further discussion of contrast tests.
The more the means of the dependents vary by factor level, the stronger the relation of the factor to the dependent. In the example above, means vary only a little, indicating a weak relationship. By examining the overlap of upper and lower bounds one finds that most regional comparisons are not significant. An exception is South East vs. West for "Having trouble with one's boss," which is significant.
In profile plots, additional factors can be represented by additional lines, as shown below, where parallel lines indicate no interaction and crossing lines (as in this example) indicate interaction among the factors (here, interaction between education and defaulting on a loan, when dependent variables are types of debt).
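The "parallel lines mean no interaction" reading of a profile plot can be expressed numerically. In the sketch below each row of a cell-means table is one line in the plot and each column is one point on the horizontal axis; if the lines are parallel, the gap between any line and the first line is constant across columns. The function name and the cell means are hypothetical, and the result is purely descriptive (it is not a significance test of the interaction).

```python
import numpy as np

def departure_from_parallel(cell_means):
    """Largest variation, across the x-axis, in the gap between a line and the first line.

    Returns 0.0 when all profile lines are exactly parallel.
    """
    means = np.asarray(cell_means, dtype=float)
    gaps = means - means[0]  # gap between each line and the first line
    return float(np.max(gaps.max(axis=1) - gaps.min(axis=1)))

# Hypothetical cell means: rows = lines (levels of one factor),
# columns = x-axis positions (levels of the other factor).
parallel_lines = [[1, 2, 3], [3, 4, 5]]
crossing_lines = [[1, 2, 3], [3, 2, 1]]

print(departure_from_parallel(parallel_lines))
print(departure_from_parallel(crossing_lines))
```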
Note that profile analysis in MANOVA has been superseded to some extent by multidimensional scaling, mixed model ANOVA, and/or random effects regression models.
/LMATRIX = GENDER 1 -1 /MMATRIX t1 1 t2 -1; t2 1 t3 -1; t3 1 t4 -1.
/LMATRIX = gender 1 -1 /MMATRIX = t1 .25 t2 .25 t3 .25 t4 .25.
/LMATRIX = INTERCEPT 1 gender .5 .5 /MMATRIX t1 1 t2 -1; t2 1 t3 -1; t3 1 t4 -1.
The LMATRIX command specifies a contrast between the two equally weighted values of gender and the intercept. The MMATRIX command asks for contrasts between t1 and t2, between t2 and t3, and between t3 and t4. SPSS output will be in a section labeled "Custom Hypothesis Tests."
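To make the LMATRIX and MMATRIX specifications above more concrete, the sketch below writes them as ordinary matrices and applies them to a table of cell means. This is a simplification made for illustration: SPSS applies L to the model's parameter matrix (including the intercept) and tests whether L B M equals zero, whereas here L is applied directly to hypothetical gender-by-occasion cell means and only the contrast estimates are computed. All numbers and variable names are assumptions.

```python
import numpy as np

# Hypothetical cell means: rows = gender (1, 2), columns = t1..t4.
cell_means = np.array([
    [10.0, 12.0, 15.0, 19.0],
    [10.0, 11.0, 12.0, 13.0],
])

# gender 1 -1: difference between the two gender rows.
l_gender_diff = np.array([1.0, -1.0])
# gender .5 .5: equally weighted average of the two gender rows.
l_gender_avg = np.array([0.5, 0.5])

# t1 1 t2 -1; t2 1 t3 -1; t3 1 t4 -1: one column per adjacent-occasion contrast.
m_adjacent = np.array([
    [1.0, 0.0, 0.0],
    [-1.0, 1.0, 0.0],
    [0.0, -1.0, 1.0],
    [0.0, 0.0, -1.0],
])
# t1 .25 t2 .25 t3 .25 t4 .25: average over the four occasions.
m_average = np.full((4, 1), 0.25)

# Gender difference in each adjacent-occasion change (gender by time).
gender_by_time = l_gender_diff @ cell_means @ m_adjacent
# Gender difference averaged over occasions (gender main effect).
gender_overall = l_gender_diff @ cell_means @ m_average
# Adjacent-occasion changes averaged over gender (time main effect).
time_overall = l_gender_avg @ cell_means @ m_adjacent

print(gender_by_time, gender_overall, time_overall)
```

Read this way, the three syntax lines correspond to the interaction, the between-subjects main effect, and the within-subjects main effect, respectively.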
Input Data: C:\Program Files (x86)\SPSSInc\Statistics17\Samples\English\survey_sample.sav
Syntax:
MANOVA age childs WITH educ paeduc maeduc
/DISCRIM ALL ALPHA(1)
/PRINT SIGNIF(MULTIV UNIV EIGEN DIMENR).
Eigenvalues and Canonical Correlations
Root No. Eigenvalue Pct. Cum. Pct. Canon Cor. Sq. Cor
1 .19741 93.10624 93.10624 .40604 .16487
2 .01462 6.89376 100.00000 .12003 .01441
The output above indicates that the independent set of variables (educ, paeduc, maeduc for respondent's, father's, and mother's education respectively) is related to the dependent set (age, childs, for respondent's age and number of children) by two dimensions. The first dimension explains 93.1% of the variance and the second dimension explains 6.9%. The canonical correlation for the first dimension is .41 and is .12 for the second dimension.
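The columns of the eigenvalue table are tied together by simple arithmetic, sketched below using the two eigenvalues printed above. The percentage for each root is its eigenvalue divided by the sum of eigenvalues, and the squared canonical correlation is eigenvalue / (1 + eigenvalue). Because the printed eigenvalues are rounded to five decimals, the recomputed figures agree with the SPSS output only to about three or four decimal places.

```python
import math

# Eigenvalues as printed in the output above (rounded to five decimals).
eigenvalues = [0.19741, 0.01462]
total = sum(eigenvalues)

rows = []
cumulative = 0.0
for root_no, eigenvalue in enumerate(eigenvalues, start=1):
    pct = 100.0 * eigenvalue / total          # share of the explained variance
    cumulative += pct
    sq_cor = eigenvalue / (1.0 + eigenvalue)  # squared canonical correlation
    rows.append({
        "root": root_no,
        "pct": pct,
        "cum_pct": cumulative,
        "canon_cor": math.sqrt(sq_cor),
        "sq_cor": sq_cor,
    })

for row in rows:
    print(row)
```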
Each canonical root represents a dimension of meaning, but what? What meaningful label do we give to each canonical root (which SPSS labels merely 1, 2, etc.)? In factor analysis one ascribes a label to each factor based on the factor loadings of each measured variable on the factor. In MANOVA, this is done on multiple bases, using the standardized weights and structure correlations shown below. The structure correlations are often the most useful for this purpose when there is more than one significant canonical root. Structure correlations are the correlations between the measured variables and the canonical roots. In MANOVA, there will be one set of MDA output for each main and interaction effect.
- - - - - - - - - - - - - -
Standardized canonical coefficients for DEPENDENT variables
Function No.
Variable 1 2
age .79451 -.78299
childs .34987 1.05920
- - - - - - - - - - - - - -
Correlations between DEPENDENT and canonical variables
Function No.
Variable 1 2
age .94954 -.31365
childs .70192 .71225
- - - - - - - - - - - - - -
The output above shows age loads heavily in a positive direction on dimension 1. Age loads negatively and childs loads positively on dimension 2. Dimension 2 is most heavily correlated with childs. However, if the researcher labels dimension 1 as "the age dimension" and dimension 2 as the "number of children dimension," this is a half-truth oversimplification. As in factor analysis, it is difficult to label dimensions based on loadings and correlations.
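One reason the standardized coefficients and the structure correlations tell different stories is that the two dependents are themselves correlated. Under the usual assumption that canonical variates are scaled to unit variance, the structure correlations equal the correlation matrix of the dependents times the standardized coefficients. The sketch below checks this against the two tables above. The age-childs correlation is not reported in the output shown here, so it is backed out from a single table entry and then used to reproduce the other three; treat it as an implied value, not a reported one.

```python
import numpy as np

# Rows = age, childs; columns = function 1, function 2 (from the tables above).
std_coef = np.array([
    [0.79451, -0.78299],
    [0.34987, 1.05920],
])
structure = np.array([
    [0.94954, -0.31365],
    [0.70192, 0.71225],
])

# Assumed relation: structure = R @ std_coef, with R = [[1, r], [r, 1]].
# Solve for r using only the age / function 1 entry.
implied_r = (structure[0, 0] - std_coef[0, 0]) / std_coef[1, 0]

r_matrix = np.array([[1.0, implied_r], [implied_r, 1.0]])
reconstructed = r_matrix @ std_coef

print(round(float(implied_r), 3))
print(np.round(reconstructed, 5))
```

With the implied correlation of roughly .44 between age and number of children, all four structure correlations are recovered to within rounding, which illustrates why age correlates strongly with dimension 1 and childs with both dimensions even though the weights differ.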
See also the separate section on canonical correlation and the annotated MANOVA output found there.
For the credit card case on the right, for observed by predicted there is a mostly random pattern, indicative of a weak or nonexistent relationship between the predictors and the dependent. The upward linear pattern of residuals by observed shows that as the observed value of credit card debt goes up, model error increases. For the example on the left, for a dichotomous dependent, as the observed value goes from 0 to 1, predicted values are a bit higher but the overlap is very large, again indicating a weak or possibly nonsignificant relationship. With a dichotomy, the residual plots cannot assume a random cloud pattern, but it is again seen that as observed increases in value, so does residual (error) value.
GLM dependent varlist [BY factor list [WITH covariate list]]
[/REGWGT=varname]
[/METHOD=SSTYPE({1 })]
{2 }
{3**}
{4 }
[/INTERCEPT=[INCLUDE**] [EXCLUDE]]
[/MISSING=[INCLUDE] [EXCLUDE**]]
[/CRITERIA=[EPS({1E-8**})] [ALPHA({0.05**})]
{a } {a }
[/PRINT = [DESCRIPTIVE] [HOMOGENEITY] [PARAMETER][ETASQ] [RSSCP]
[GEF] [LOF] [OPOWER] [TEST [([SSCP] [LMATRIX] [MMATRIX])]]
[/PLOT=[SPREADLEVEL] [RESIDUALS]
[PROFILE (factor factor*factor factor*factor*factor ...) [WITH(covariate={value} [...])]]
{MEAN }
[/LMATRIX={["label"] effect list effect list ...;...}]
{["label"] effect list effect list ... }
{["label"] ALL list; ALL... }
{["label"] ALL list }
[/MMATRIX= {["label"] depvar value depvar value ...;["label"]...}]
{["label"] depvar value depvar value ... }
{["label"] ALL list; ["label"] ... }
{["label"] ALL list }
[/KMATRIX= {list of numbers }]
{list of numbers;...}
[/SAVE=[tempvar [(list of names)]] [tempvar [(list of names)]]...]
[DESIGN]
[/OUTFILE=[{COVB('savfile'|'dataset')}]
{CORB('savfile'|'dataset')}
[EFFECT('savfile'|'dataset')] [DESIGN('savfile'|'dataset')]
[/DESIGN={[INTERCEPT...] }]
{[effect effect...]}
** Default if the subcommand or keyword is omitted.
Example
GLM SCORE1 TO SCORE4 BY METHOD(1,3).
MANOVA dependent varlist [BY factor list (min,max)[factor list...]
[WITH covariate list]]
[/WSFACTORS=varname (levels) [varname...] ]
[/WSDESIGN]*
[/TRANSFORM [(dependent varlist [/dependent varlist])]=
[ORTHONORM] [{CONTRAST}] {DEVIATION (refcat) } ]
{BASIS } {DIFFERENCE }
{HELMERT }
{SIMPLE (refcat) }
{REPEATED }
{POLYNOMIAL [({1,2,3...})]}
{ {metric } }
{SPECIAL (matrix) }
[/MEASURE=newname newname...]
[/RENAME={newname} {newname}...]
{* } {* }
[/ERROR={WITHIN } ]
{RESIDUAL }
{WITHIN + RESIDUAL}
{n }
[/CONTRAST (factorname)={DEVIATION** [(refcat)] }] †
{POLYNOMIAL**[({1,2,3...})]}
{ {metric } }
{SIMPLE [(refcat)] }
{DIFFERENCE }
{HELMERT }
{REPEATED }
{SPECIAL (matrix) }
[/PARTITION (factorname)[=({1,1... })]]
{n1,n2...}
[/METHOD=[{UNIQUE** }] [{CONSTANT**}] [{QR** }]]
{SEQUENTIAL} {NOCONSTANT} {CHOLESKY}
[/{PRINT }= [CELLINFO [([MEANS] [SSCP] [COV] [COR] [ALL])]]
{NOPRINT} [HOMOGENEITY [([ALL] [BARTLETT] [COCHRAN] [BOXM])]]
[DESIGN [([OVERALL] [ONEWAY] [DECOMP] [BIAS] [SOLUTION]
[REDUNDANCY] [COLLINEARITY] [ALL])]]
[PARAMETERS [([ESTIM] [ORTHO][COR][NEGSUM][EFSIZE][OPTIMAL][ALL])]]
[SIGNIF [[(SINGLEDF)]
[(MULTIV**)] [(EIGEN)] [(DIMENR)]
[(UNIV**)] [(HYPOTH)][(STEPDOWN)] [(BRIEF)]
[{(AVERF**)}] [(HF)] [(GG)] [(EFSIZE)]]
{(AVONLY) }
[ERROR[(STDDEV)][(COR)][(COV)][(SSCP)]]
[/OMEANS =[VARIABLES(varlist)] [TABLES ({factor name }] ]
{factor BY factor}
{CONSTANT }
[/PMEANS =[VARIABLES(varlist)] [TABLES ({factor name })] [PLOT]] ]
{factor BY factor}
{CONSTANT }
[/RESIDUALS=[CASEWISE] [PLOT] ]
[/POWER=[T({.05**})] [F({.05**})] [{APPROXIMATE}]]
{a } {a } {EXACT }
[/CINTERVAL=[{INDIVIDUAL}][({.95}) ]
{JOINT } {a }
[UNIVARIATE ({SCHEFFE})]
{BONFER }
[MULTIVARIATE ({ROY })] ]
{PILLAI }
{BONFER }
{HOTELLING}
{WILKS }
[/PCOMPS [COR] [COV] [ROTATE(rottype)]
[NCOMP(n)] [MINEIGEN(eigencut)] [ALL] ]
[/PLOT=[BOXPLOTS] [CELLPLOTS] [NORMAL] [ALL] ]
[/DISCRIM [RAW] [STAN] [ESTIM] [COR] [ALL]
[ROTATE(rottype)] [ALPHA({.25**})]]
{a }
[/MISSING=[LISTWISE**] [{EXCLUDE**}] ]
{INCLUDE }
[/MATRIX=[IN({file})] [OUT({file})]]
{[*] } {[*] }
[/ANALYSIS [({UNCONDITIONAL**})]=[(]dependent varlist
{CONDITIONAL } [WITH covariate varlist]
[/dependent varlist...][)][WITH varlist] ]
[/DESIGN={factor [(n)] }[BY factor[(n)]] [WITHIN factor[(n)]][WITHIN...]
{[POOL(varlist)}
[+ {factor [(n)] }...]
{POOL(varlist)}
[[= n] {AGAINST} {WITHIN }
{VS } {RESIDUAL}
{WR }
{n }
[{factor [(n)] } ... ]
{POOL(varlist)}
[MWITHIN factor(n)]
[MUPLUS]
[CONSTANT [=n] ]
* WSDESIGN uses the same specification as DESIGN, with only within-subjects factors.
† DEVIATION is the default for between-subjects factors, while POLYNOMIAL is the default for within-subjects factors.
** Default if the subcommand or keyword is omitted.
This command reads the active dataset and causes execution of any pending commands. See Command Order for more information.
Example 1
* Analysis of Variance
MANOVA RESULT BY TREATMNT(1,4) GROUP(1,2).
Example 2
* Analysis of Covariance
MANOVA RESULT BY TREATMNT(1,4) GROUP(1,2) WITH RAINFALL.
Example 3
* Repeated Measures Analysis
MANOVA SCORE1 TO SCORE4 BY CLASS(1,2)
/WSFACTORS=MONTH(4).
Example 4
* Parallelism Test with Crossed Factors
MANOVA YIELD BY PLOT(1,4) TYPEFERT(1,3) WITH FERT
/ANALYSIS YIELD
/DESIGN FERT, PLOT, TYPEFERT, PLOT BY TYPEFERT,
FERT BY PLOT + FERT BY TYPEFERT
+ FERT BY PLOT BY TYPEFERT.
With regard to repeated measures designs, multiple univariate ANOVA runs are not equivalent to the multivariate analysis done by a single MANOVA and will have less power (more chance of Type II errors, which are false negatives: concluding there is no relationship when there is one). In repeated measures designs, when contrasts of group means are correlated with each other, multiple univariate ANOVA tests may fail to reject the null hypothesis of no group differences. This is because ANOVA tests differences in means under assumptions of independence, whereas MANOVA is sensitive to mean differences but does not require the assumption of orthogonality (lack of correlation) among the contrasts. Correlation of contrasts is tested by Mauchly's sphericity test, which, if significant, indicates a violation of the sphericity model (a model which implies uncorrelated contrasts) and thus requires the researcher to employ MANOVA rather than a series of univariate ANOVAs. In general, MANOVA is used to test group differences when more than two levels are involved in a repeated measures design because, unlike multiple univariate ANOVA, MANOVA does not require the assumption that contrasts be uncorrelated. When multiple univariate ANOVA test results differ from MANOVA results on the same data, this indicates that the contrasts between the levels of the repeated measures factors are correlated across subjects.
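The sphericity idea can be illustrated numerically. The sketch below does not implement Mauchly's test; instead it computes the Greenhouse-Geisser epsilon (the GG keyword in the MANOVA syntax), a descriptive index that equals 1 when the covariance matrix of the repeated measures satisfies sphericity and falls toward 1/(k-1) as the violation grows. The covariance matrices are hypothetical, and the formula is written in terms of the double-centered covariance matrix, which gives the same result as working with orthonormal contrasts.

```python
import numpy as np

def greenhouse_geisser_epsilon(cov):
    """Greenhouse-Geisser epsilon for a k x k covariance matrix of repeated measures."""
    cov = np.asarray(cov, dtype=float)
    k = cov.shape[0]
    centering = np.eye(k) - np.ones((k, k)) / k   # removes the mean across occasions
    centered = centering @ cov @ centering
    return float(np.trace(centered) ** 2 / ((k - 1) * np.trace(centered @ centered)))

# Hypothetical compound-symmetric matrix (equal variances, equal covariances):
# sphericity holds, so epsilon should be 1.
spherical = np.array([
    [4.0, 2.0, 2.0],
    [2.0, 4.0, 2.0],
    [2.0, 2.0, 4.0],
])
# Hypothetical matrix with one occasion far more variable than the others.
non_spherical = np.diag([1.0, 1.0, 10.0])

eps_spherical = greenhouse_geisser_epsilon(spherical)
eps_non_spherical = greenhouse_geisser_epsilon(non_spherical)
print(eps_spherical, eps_non_spherical)
```

An epsilon well below 1 signals the situation described above, in which the univariate approach is questionable and the multivariate tests are the safer choice.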
The write-up would first state the results of the overall test of inter-group differences: "The Wilks Lambda multivariate test of overall differences among groups was statistically significant (p=0.032)." Depending on the publication, the parenthetical term might contain the F parameters and value (ex., "F(4,90) = 6.28"). Optionally, one might also state, "The F statistic for Wilks' Lambda was exact," which is reported under the relevant SPSS table. If there had been more than one factor, then the overall test is reported similarly for each factor. Also, one should always report effect size as well as significance. Before running the analysis, under the Options button, check "Estimates of effect size," which will cause partial eta-squared to be reported. State, "Although significant, the effect size of this relationship was weak, as indicated by partial eta-squared = .11."
The "Tests of Between-Subjects Effects" table following the overall multivariate tests gives significance levels and partial eta-squared for each dependent. The write-up would state something like, "Univariate between-subjects tests showed that media exposure was significantly and moderately related to recycling paper (p=.0001; partial eta-squared = .21) and cans (p=.041; partial eta-squared = .10), but not to plastics (p=.421; partial eta-squared = .01)."
Since the overall test when significant only shows at least one group is significantly different from another, the next research statement in the write-up must report post-hoc contrast tests among groups. For instance, in the table of "Multiple Comparisons" SPSS prints the mean difference on the dependent variable between any two groups and its corresponding significance. After using this test, the researcher might state, "Follow-up univariate post-hoc comparisons between groups using F statistics and Bonferroni-type simultaneous confidence intervals based on Student's t distribution also showed that the movie treatment group was significantly related to the recycling of paper (p=0.002) and cans (p=0.01) but not plastics (p=.055)." A similar statement would be made about the pamphlet group. The control group, which is the omitted category, would not have a corresponding statement. There are no partial eta-squareds to report. Optionally, one might list F parameters and values as well.
Line 1: The MANOVA command word is followed by the three variables opinion1, opinion2, and opinion3. These represent the three levels of the within-subjects factor Ideology. The BY keyword tells SPSS that what follows are the groups or between-subjects factors; in this case, EducLevel and SESlevel. Following each of the two between-subjects factors are two numbers between parentheses. SESlevel (1,3) simply means that the variable SESlevel has three levels coded as 1, 2, and 3. One may have no grouping variable and thus no BY clause.
Line 2: The slash mark indicates a subcommand. The WSFACTORS subcommand tells SPSS that there is one repeated factor called Ideology and that it has three levels (matching the three opinion measurements listed after the MANOVA keyword). This is needed by SPSS to interpret the list of dependent variables in line 1. The WSFACTORS subcommand follows the MANOVA command when there is a within-subjects factor, which is to say when there is a repeated measures design.
Line 3: The WSDESIGN subcommand tells SPSS to test the within-subjects hypotheses for repeated measures designs.
Line 4: The PRINT subcommand specifies the output. CELLINFO (MEANS) prints cell means and standard deviations used to evaluate patterns in the data. Many additional statistics could be requested.
Line 5: The DESIGN subcommand causes SPSS to test the between-subjects hypotheses.
The general MANOVA syntax, from the SPSS manual, is:
MANOVA
depvarlist [BY indvarlist (min,max) [indvarlist (min, max) ...]
[WITH covarlist]]
[/WSFACTORS = name (levels) name...]
[/{PRINT | NOPRINT} = [CELLINFO [(MEANS SSCP COV COR ALL)]]
[HOMOGENEITY [(BARTLETT COCHRAN BOXM ALL)]]
[SIGNIF (MULTIV UNIV AVERF AVONLY EFSIZE ALL)]]
[/OMEANS [VARIABLES(varlist)] [TABLES ([CONSTANT] [factor BY factor])]]
[/CONTRAST(factor) = {POLYNOMIAL[(#)] | SPECIAL(k1s + contrasts)}]
[/CONTR ...}
[/WSDESIGN = effect ...] [/DESIGN = effect ...]
Notes on Effects
Keywords: BY, W or WITHIN, MWITHIN
Varname(#): # = one of k-1 contrasts or one of k levels
In SPSS, make sure that the dependents are entered in the desired order, then in the MANOVA syntax, enter PRINT SIGNIF(STEPDOWN) or simply PRINT SIG(STEP). For example:
manova var1 var2 var3 BY gender(1,2) /print signif(stepdown) ...
See James Stevens, Applied Multivariate Analysis for the Social Sciences, 2nd Edition.
Using discriminant analysis, the MANCOVA dependents are used as predictor variables to classify a factor (treatment) variable, and the discriminant beta weights are used to assess the relative strength of relation of the MANOVA dependents to the factor. The beta weights indicate the strength of relation of a given dependent controlling for all other dependents.