Overview
Advantages of SEM compared to multiple regression include more flexible assumptions (particularly allowing interpretation even in the face of multicollinearity), use of confirmatory factor analysis to reduce measurement error by having multiple indicators per latent variable, the attraction of SEM's graphical modeling interface, the desirability of testing models overall rather than coefficients individually, the ability to test models with multiple dependents, the ability to model mediating variables rather than be restricted to an additive model (in OLS regression the dependent is a function of the Var1 effect plus the Var2 effect plus the Var3 effect, etc.), the ability to model error terms, the ability to test coefficients across multiple between-subjects groups, and the ability to handle difficult data (time series with autocorrelated error, non-normal data, incomplete data). Moreover, where regression is highly susceptible to error of interpretation by misspecification, the SEM strategy of comparing alternative models to assess relative model fit makes it more robust.
SEM is usually viewed as a confirmatory rather than exploratory procedure, using one of three approaches:
SEM is a family of statistical techniques which incorporates and integrates path analysis and factor analysis. In fact, use of SEM software for a model in which each variable has only one indicator is a type of path analysis. Use of SEM software for a model in which each variable has multiple indicators but there are no direct effects (arrows) connecting the variables is a type of factor analysis. Usually, however, SEM refers to a hybrid model with both multiple indicators for each variable (called latent variables or factors), and paths specified connecting the latent variables. Synonyms for SEM are covariance structure analysis, covariance structure modeling, and analysis of covariance structures.
Although these synonyms rightly indicate that analysis of covariance is the focus of SEM, be aware that SEM can also analyze the mean structure of a model. See also partial least squares regression, which is an alternative method of modeling the relationship among latent variables, also generating path coefficients for a SEM-type model, but without SEM's data distribution assumptions. PLS path modeling is sometimes called "soft modeling" because it makes soft or relaxed assumptions about data...
The AMOS interface looks like this (large, initially blank area to draw the path diagram on the right is not shown):
In AMOS, the general process of structural modeling is to use the icons above to draw a circle-and-arrow path diagram, associate the diagram with data (a correlation matrix or raw data), then select Analyze, Calculate Estimates from the menu.
Example. In the AMOS example above, the latent variable PriorAbility is measured by the indicator variables pretest1 and pretest2. The latent variable PostAbility is measured by the indicator variables posttst1 and posttst2. Indicator and other measured variables are depicted as rectangles by convention. Latent variables are depicted as ovals by convention. The e1 to e4 terms are the error terms associated with each indicator variable. The arrows hypothesize that PostAbility is caused by PriorAbility and by the practical performance experience of the exogenous measured variable Perform. The two-headed arrow indicates Perform is thought to be correlated with PriorAbility, which is an exogenous latent variable. As is usual, there is a disturbance or error term, Dist, associated with the endogenous latent variable, PostAbility. The 1's next to certain arrows are the regression weights necessary to set metrics in the model, as discussed below. Also shown is the Data Files window (select File, Data Files from the AMOS menu), showing the associated data file, structur1.sav, which for this example is a correlation matrix with information on n, standard deviations, and means.
For this example, correlation matrix input looks like this (though note conventional raw data input is possible also and, indeed, is necessary if certain operations such as Data Recode, discussed below, are requested).
Warning: Indicator variables cannot be combined arbitrarily to form latent variables. For instance, combining gender, race, or other demographic variables to form a latent variable called "background factors" would be improper because it would not represent any single underlying continuum of meaning. The confirmatory factor analysis step in SEM is a test of the meaningfulness of latent variables and their indicators, but the researcher may wish to apply traditional tests (ex., Cronbach's alpha) or conduct traditional factor analysis (ex., principal axis factoring).
In the illustration below, the AMOS Object Properties window has been opened to the Parameters tab to show 1 entered as the metric for the regression line from the Dist disturbance term to the PostAbility latent variable, whose path is also labeled 1 on the diagram. Object Properties may be opened on any object by right-clicking the object in the diagram, then selecting Object Properties from the context menu. Though the metric of 1 is set automatically, this is where the researcher may constrain any parameter to any value, or, alternatively, erase a setting to free the parameter to be freely estimated.
Alternatively, one may set the factor variances to 1, thereby effectively obtaining a standardized solution. This alternative is inconsistent with multiple group analysis. Note also that if the researcher does not explicitly set metrics to 1.0 but instead relies on an automatic standardization feature built into some SEM software, one may encounter underidentification error messages -- hence explicitly setting the metric of a reference variable to 1.0 is recommended. See step 2 in the computer output example. Warning: LISREL Version 8 defaulted to setting factor variances to 1 if the user did not set the loading of a reference variable to 1.
Example. In the illustration above, the highest modification indexes have to do with correlated error terms between pre- and post-tests, particularly between pretest1 and posttest1 (error terms 1 and 3). Modification indexes are also presented to suggest adding paths (regression lines), such as from posttest1 to pretest1. However, that would violate chronological logic for these data and the path therefore should not be added. In general, one should have sound theoretical reason for adding paths suggested by MIs.
Likewise, one can have good fit in a misspecified model. One indicator of this occurring is if there are high modification indexes in spite of good fit. High MI's indicate multicollinearity in the model and/or correlated error.
A good fit doesn't mean each particular part of the model fits well. Many equivalent and alternative models may yield as good a fit -- that is, fit indexes rule out bad models but do not prove good models. Also, a good fit doesn't mean the exogenous variables are causing the endogenous variables (for instance, one may get a good fit precisely because one's model accurately reflects that most of the exogenous variables have little to do with the endogenous variables). Also keep in mind that one may get a bad fit not because the structural model is in error, but because of a poor measurement model.
All other things equal, a model with fewer indicators per factor will have a higher apparent fit than a model with more indicators per factor. Fit coefficients which reward parsimony, discussed below, are one way to adjust for this tendency.
There are three ways, listed below, in which the chi-square test may be misleading. Because of these reasons, many researchers who use SEM believe that with a reasonable sample size (ex., > 200) and good approximate fit as indicated by other fit tests (ex., NNFI, CFI, RMSEA, and others discussed below), the significance of the chi-square test may be discounted and that a significant chi-square is not a reason by itself to modify the model.
Also, when degrees of freedom are large relative to sample size, GFI is biased downward except when the number of parameters (p) is very large. Under these circumstances, Steiger recommends an adjusted GFI (GFI-hat). GFI-hat = p / (p + 2 * F-hat), where F-hat is the population estimate of the minimum value of the discrepancy function, F, computed as F-hat = (chisquare - df) / (n - 1), where df is degrees of freedom and n is sample size. GFI-hat adjusts GFI upwards. Also, GFI tends to be larger as sample size increases; correspondingly, AGFI may underestimate fit for small sample sizes, according to Bollen (1990a).
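The GFI-hat arithmetic described above can be sketched in a few lines of Python. This is a minimal illustration, not AMOS or LISREL code: the function name and argument names are hypothetical, p is taken as defined in the text, and flooring F-hat at zero when chi-square is smaller than df is an added assumption.

```python
def gfi_hat(chisq, df, n, p):
    """Adjusted GFI as given in the text: p / (p + 2 * F-hat)."""
    # F-hat = (chisq - df) / (n - 1), the population estimate of the
    # minimum discrepancy. Flooring at 0 is an assumption made here so
    # that the index cannot exceed 1 when chisq < df.
    f_hat = max((chisq - df) / (n - 1), 0.0)
    return p / (p + 2.0 * f_hat)


# Hypothetical values: chi-square 30 on 20 df, n = 201, p = 10.
print(round(gfi_hat(chisq=30.0, df=20, n=201, p=10), 4))
```

With these made-up inputs, F-hat is 10/200 = .05, so GFI-hat is 10/10.1.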
The absolute value of AIC has no intuitive value, except by comparison with another AIC, in which case the lower AIC reflects the better-fitting model. AIC close to zero reflects good fit. It is possible to obtain AIC values < 0. In model development, the researcher stops modifying when AIC starts rising.
AIC is computed as (chisq/n) + (2k/(n-1)), where chisq is model chi-square, n is the number of subjects, and k is (.5v(v+1))-df, where v is the number of variables and df is degrees of freedom. See Burnham and Anderson (1998) for further discussion of AIC and related information theory measures.
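As a worked illustration of the formula just given, the sketch below computes AIC for two hypothetical models and picks the lower value. The function name and all numbers are invented for the example; the scaling follows the formula in the text, and other software may report AIC on a different scale.

```python
def aic_from_text(chisq, n, v, df):
    """AIC per the formula in the text: (chisq/n) + (2k/(n-1))."""
    # k = .5 * v * (v + 1) - df, where v is the number of variables.
    k = 0.5 * v * (v + 1) - df
    return chisq / n + 2.0 * k / (n - 1)


# Two hypothetical models fitted to the same 5 variables, n = 201.
model_a = aic_from_text(chisq=30.0, n=201, v=5, df=3)
model_b = aic_from_text(chisq=45.0, n=201, v=5, df=4)
better = "A" if model_a < model_b else "B"
print(model_a, model_b, better)
```

Only the comparison between the two values is meaningful, which matches the point that AIC has no intuitive absolute value.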
BIC is an approximation to the log of a Bayes factor for the model of interest compared to the saturated model. BIC became popular in sociology after it was popularized by Raftery in the 1980s. See Raftery (1995) on BIC's derivation. Recently, however, the limitations of BIC have been highlighted. See Winship, ed. (1999), on controversies surrounding BIC. BIC uses sample size n to estimate the amount of information associated with a given dataset. A model based on a large n but which has little variance in its variables and/or highly collinear independents may yield misleading model fit using BIC.
CFI is similar in meaning to NFI (see below) but penalizes for sample size. CFI and RMSEA are among the measures least affected by sample size (Fan, Thompson, and Wang, 1999). CFI varies from 0 to 1 (if outside this range it is reset to 0 or 1). CFI close to 1 indicates a very good fit. CFI is also used in testing modifier variables (those which create a heteroscedastic relation between an independent and a dependent, such that the relationship varies by class of the modifier). By convention, CFI should be equal to or greater than .90 to accept the model, indicating that 90% of the covariation in the data can be reproduced by the given model. It is computed as 1 - max(chisq-df, 0) / max(chisq-df, chisqn-dfn, 0), where chisq and chisqn are model chi-square for the given and null models, and df and dfn are the corresponding degrees of freedom. Note Raykov (2000, 2005) and Curran et al. (2002) have argued that CFI, because it is based on noncentrality, is biased as a model fit measure.
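A minimal sketch of the CFI computation follows, using the standard noncentrality form 1 - max(chisq-df, 0) / max(chisq-df, chisqn-dfn, 0). The function name and the example values are hypothetical, and returning 1.0 when the denominator is zero is an assumption made to avoid division by zero.

```python
def cfi(chisq, df, chisq_null, df_null):
    """Comparative fit index from model and null-model chi-squares."""
    # Noncentrality of the given model, floored at 0.
    d_model = max(chisq - df, 0.0)
    # Denominator: the larger of the model and null noncentralities.
    d_null = max(chisq - df, chisq_null - df_null, 0.0)
    if d_null == 0.0:
        return 1.0  # assumption: treat a zero denominator as perfect fit
    return 1.0 - d_model / d_null


# Hypothetical values: model chi-square 30 on 20 df, null 530 on 30 df.
print(cfi(30.0, 20, 530.0, 30))
```

Here the model noncentrality is 10 and the null noncentrality is 500, giving a CFI of .98, which would pass the conventional .90 cutoff.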
NNFI close to 1 indicates a good fit. Rarely, some authors have used a cutoff as low as .80 since NNFI (also known as TLI) tends to run lower than GFI. However, more recently, Hu and Bentler (1999) have suggested NNFI >= .95 as the cutoff for a good model fit and this is widely accepted (ex., by Schumacker & Lomax, 2004: 82). NNFI values below .90 indicate a need to respecify the model.
It may be said that RMSEA corrects for model complexity (penalizes for lack of parsimony), as shown by the fact that df is in its denominator. However, degrees of freedom is an imperfect measure of model complexity. Since RMSEA computes average lack of fit per degree of freedom, one could have near-zero lack of fit in both a complex and in a simple model and RMSEA would compute to be near zero in both, yet most methodologists would judge the simpler model to be better on parsimony grounds. Therefore model comparisons using RMSEA should be interpreted in the light of the parsimony ratio, which reflects model complexity according to its formula, PR = df(model)/df(maximum possible df). Also, RMSEA is normally reported with its confidence intervals. In a well-fitting model, the lower 90% confidence limit includes or is very close to 0, while the upper limit is less than .08.
\agfi Adjusted goodness of fit index (AGFI)
\aic Akaike information criterion (AIC)
\bcc Browne-Cudeck criterion (BCC)
\bic Bayes information criterion (BIC)
\caic Consistent AIC (CAIC)
\cfi Comparative fit index (CFI)
\cmin Minimum value of the discrepancy function C in Appendix B
\cmindf Minimum value of the discrepancy function divided by degrees of freedom
\datafilename The name of the data file. \longdatafilename displays the fully qualified path name of the data file.
\datatablename The name of the data table (for those file formats that allow a single file to contain multiple data tables, such as Excel workbooks).
\date Today's date in short format. \longdate displays today's date in long format. The displayed date is made current whenever the path diagram is read from a file, saved or printed.
\df Degrees of freedom
\ecvi Expected cross-validation index (ECVI)
\ecvihi Upper bound of 90% confidence interval on ECVI
\ecvilo Lower bound of 90% confidence interval on ECVI
\f0 Estimated population discrepancy (F0)
\f0hi Upper bound of 90% confidence interval on F0
\f0lo Lower bound of 90% confidence interval on F0
\filename Name of the current AMW file. Use \longfilename to display the complete path to the current AMW file.
\fmin Minimum value of discrepancy function F in Appendix B
\format Format name (See Formats tab.)
\gfi Goodness of fit index (GFI)
\group Group name (See Manage groups.)
\hfive Hoelter's critical N for alpha = .05
\hone Hoelter's critical N for alpha = .01
\ifi Incremental fit index (IFI)
\longdatafilename The fully qualified path name of the data file. \datafilename displays the data file name without the path.
\longdate Today's date in long format. \date displays today's date in short format. The displayed date is made current whenever the path diagram is read from a file, saved or printed.
\longfilename Fully qualified path name of the current AMW file. Use \filename to display the file name without the path.
\longtime The time in long format. \time displays the time in short format. The displayed time is made current whenever the path diagram is read from a file, saved or printed.
\mecvi Modified ECVI (MECVI)
\model Model name (See Manage models.)
\ncp Estimate of non-centrality parameter (NCP)
\ncphi Upper bound of 90% confidence interval on NCP
\ncplo Lower bound of 90% confidence interval on NCP
\nfi Normed fit index (NFI)
\npar Number of distinct parameters
\p "p value" associated with discrepancy function (test of perfect fit)
\pcfi Parsimonious comparative fit index (PCFI)
\pclose "p value" for testing the null hypothesis of close fit (RMSEA < .05)
\pgfi Parsimonious goodness of fit index (PGFI)
\pnfi Parsimonious normed fit index (PNFI)
\pratio Parsimony ratio
\rfi Relative fit index
\rmr Root mean square residual
\rmsea Root mean square error of approximation (RMSEA)
\rmseahi Upper bound of 90% confidence interval on RMSEA
\rmsealo Lower bound of 90% confidence interval on RMSEA
\time The time in short format. \longtime displays the time in long format. The displayed time is made current whenever the path diagram is read from a file, saved or printed.
\tli Tucker-Lewis index (TLI)
Warning: Specification search is based on data-driven model fitting and, like stepwise methods for other procedures (ex., stepwise multiple regression), is often disparaged by researchers because it may well result in overfitting of models (fitting to noise in the data, with the consequence that the model will not generalize to new data). If stepwise methods are used, the overfitting problem is mitigated by cross-validation (developing the model on one dataset, then validating it on a hold-out validation sample).
Example. In the example above, the path from Perform to PriorAbility has been made optional (here shown in green, but yellow in AMOS). Therefore the specification search generates two default models, one with and one without the optional arrow. In the output, the original full model with the optional arrow is the one with the larger number of parameters (12). Various fit measures are shown, explained below. By most (but not all) measures, such as AIC, the original model is best-fitting.
The next step is to associate the group names with actual data files under Data, Data Files, as illustrated below, using the Data Files button as usual to add the files, here males.sav and females.sav.
Before testing for measurement invariance across groups, the researcher first checks to see if the model as drawn has acceptable fit for each of the multiple groups (in this example, for males and females). Often the researcher tests one-sample models separately first. For instance one might test the model separately for a male sample and for a female sample. Separate testing provides an overview of how consistent the model results are, but it does not constitute testing for significant differences in the model's parameters between groups. If consistency is found, then the researcher will proceed to multigroup testing. First a baseline chi-square value is derived by computing model fit for the pooled sample of all groups. To accomplish this, one simply selects Analyze, Calculate Estimates. View, Text Output, will reveal (1) the usual overall goodness of fit measures, which should show the model has acceptable fit; and (2) separate regression parameter estimates for each of the groups, and these parameters should be significant for all groups. These findings establish that the given path model is plausible for the multiple groups and set the stage for testing measurement invariance across groups.
First, the researcher can click the "View output path" icon as illustrated below, then alternately select the groups to display the path parameter estimates to verify that they do not differ strongly by group. If the path parameters seem similar, the researcher has reason to suspect that measurement invariance is upheld (or structural invariance in the case of path parameters pertaining to the structural model).
The parameters noted in the illustration above are:
The selection of parameters to constrain corresponds to the researcher's purpose. For instance, in testing measurement invariance for a factor analysis model, the weights to constrain to be equal across groups would be the regression weights from the latent variables (the factors) to the indicator variables.
After selecting the constraint model(s) using Analyze, Multiple-Group Analysis from the AMOS menu, the researcher selects Analyze, Calculate Estimates. In the main AMOS display as illustrated below, each of the models is evaluated. The "OK" next to each model indicates it could be fitted (was not unidentified).
Then the researcher selects View, Text Output, to see the output, including the (excerpted) goodness of fit measures illustrated below. Each measure is shown for the unconstrained model and each constrained model selected under Analyze, Multiple-Group Analysis. In this example, only three such models were requested. The P value for CMIN is the model chi-square test, which for these models is non-significant, meaning that none of the models are significantly different from the saturated (perfect explanation) model and all are acceptable. By GFI, fit was acceptable for all three, meaning that constraining the groups to be equal still yielded acceptable model fit.
Cheung & Rensvold (2002) examined 20 goodness of fit measures for use when testing for measurement invariance across multiple groups, recommending the use of CFI, NCP, and GFI because these measures were independent of model complexity and sample size and were uncorrelated with model chi-square.
Regardless of distributional assumptions, CR squared (the square of the critical ratio) is how much model chi-square will increase if the parameter is fixed to 0. Note CR is computed for raw data input and is not available if input is a correlation matrix or standardized regression weights or if the estimation method is ULS or SLS.
Because mixture modeling relies on Bayesian estimation, it can model each subgroup validly even if the model is inadequate for the entire population at large. That is, mixture modeling is appropriate for a model which is incorrect overall but is nonetheless correct when the population is divided into certain subgroups. For instance, sports activity might be positively related to job performance for one group and negatively related in another group. The overall model would show little or no relationship, but when the population was grouped, a regression model might explain significant variance in performance in each group separately. To take a second example, sports and performance might be positively related for each of two groups, having the same regression slope, but the intercept might be different between the two groups, as indicated by parallel regression lines in a scatterplot.
If the researcher does not know what the groups should be, the researcher can enter names like Group1, Group2, etc. It is necessary to specify the number of groups a priori, but it is possible to run mixture analysis multiple times, specifying a different number of groups each time.
Note that estimated population proportions for each group are shown at the bottom of the Bayesian output. One can right-click on this proportions row to select either a prior table of proportions or a posterior graphic of the proportion in a given category (that is, Bayesian methods are iterative and the displayed proportion is simply the mean of a normal distribution of many estimates; these estimates can be displayed as a curve or as a histogram).
The purpose of latent growth curve modeling in SEM is to determine if a researcher-specified change model (ex., constant linear growth) is valid for some dependent variable, and if so, to see what the effects of covariates are on the rate of growth. Other inferences may be made as discussed below.
The Bollen-Stine bootstrap and Satorra-Bentler adjusted chi-square are used for inference of exact structural fit when there is reason to think there is lack of multivariate normality or other distributional misspecification. In Amos, this is selected under View, Analysis Properties, Bootstrap tab. Other non-MLE methods of estimation exist, some (like ADF) not requiring the assumption of multivariate normality. In Amos, this is selected under View, Analysis Properties, Estimation tab. See also Bollen (1989).
In general, simulation studies (Kline, 1998a: 209) suggest that under conditions of severe non-normality of data, SEM parameter estimates (ex., path estimates) are still fairly accurate but corresponding significance coefficients are too high. Chi-square values, for instance, are inflated. Recall for the chi-square test of goodness of fit of the model as a whole, the chi-square value should not be significant if there is a good model fit: the higher the chi-square, the more the difference of the model-estimated and actual covariance matrices, hence the worse the model fit. Inflated chi-square could lead researchers to think their models were more in need of modification than they actually were. Lack of multivariate normality usually inflates the chi-square statistic such that the overall chi-square fit statistic for the model as a whole is biased toward Type I error (rejecting a model which should not be rejected). The same bias also occurs for other indexes of fit beside model chi-square. Violation of multivariate normality also tends to deflate (underestimate) standard errors moderately to severely. These smaller-than-they-should-be standard errors mean that regression paths and factor/error covariances are found to be statistically significant more often than they should be. Many if not most SEM studies in the literature fail to concern themselves with this assumption in spite of its importance.
Testing for normality and using transforms to normalize data are discussed in the StatNotes section on data assumptions and below with respect to AMOS. Note, however, SEM is still unbiased and efficient in the absence of multivariate normality if residuals are multivariate normally distributed with means of 0 and have constant variance across the independents, and the residuals are not correlated with each other or with the independents. PRELIS, a statistical package which tests for multivariate normality, accompanies LISREL and provides a chi-square test of multivariate normality.
As a rule of thumb, discrete data (categorical data, ordinal data with < 15 values) may be assumed to be normal if skew and kurtosis are within the range of +/- 1.0 (some say +/- 1.5 or even 2.0) (Schumacker & Lomax, 2004: 69).
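The rule of thumb above can be checked with a short script. The sketch below is only illustrative: it uses simple moment-based skew and excess kurtosis (statistical packages often apply small-sample corrections, so their values may differ slightly), the function names are hypothetical, and the data are made up.

```python
def skew_kurtosis(values):
    """Moment-based skew and excess kurtosis (no small-sample correction)."""
    n = len(values)
    mean = sum(values) / n
    # Central moments of order 2, 3 and 4; m2 must be nonzero.
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    skew = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0  # 0 for a normal distribution
    return skew, excess_kurtosis


def within_rule_of_thumb(values, limit=1.0):
    """True if both skew and excess kurtosis lie within +/- limit."""
    skew, kurt = skew_kurtosis(values)
    return abs(skew) <= limit and abs(kurt) <= limit


# Made-up five-point item with one response in each category.
responses = [1, 2, 3, 4, 5]
print(skew_kurtosis(responses), within_rule_of_thumb(responses, limit=1.5))
```

The limit argument lets the researcher apply the stricter +/- 1.0 criterion or the more lenient 1.5 or 2.0 variants mentioned in the text.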
One might think SEM's use of MLE estimation meant linearity was not assumed, as in logistic regression. However, in SEM, MLE is estimating the parameters which best reproduce the sample covariance matrix, and the covariance matrix assumes linearity. That is, while the parameters are estimated in a nonlinear way, what they are in turn reflecting is a matrix requiring linear assumptions.
A model is underidentified if there are more parameters to be estimated than there are elements in the covariance matrix. The mathematical properties of underidentified models prevent a unique solution to the parameter estimates and prevent goodness of fit tests on the model.
Researchers want an overidentified model, which means one where the number of knowns (observed variable variances and covariances) is greater than the number of unknowns (parameters to be estimated). When one has overidentification, the number of degrees of freedom will be positive (recall AMOS has a DF tool icon to check this easily). Thus, in SEM software output, the listing for degrees of freedom for model chi square is a measure of the degree of overidentification of the model.
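The counting of knowns and unknowns described above can be sketched as follows. The function name and the example numbers are hypothetical, and the check is only the counting rule: a non-negative result is necessary for identification but does not by itself guarantee it.

```python
def identification_status(n_observed, n_free_params):
    """Compare knowns (variances and covariances) with free parameters."""
    # Knowns: distinct elements of the covariance matrix, v(v+1)/2.
    knowns = n_observed * (n_observed + 1) // 2
    df = knowns - n_free_params
    if df < 0:
        status = "underidentified"
    elif df == 0:
        status = "just identified"
    else:
        status = "overidentified"
    return knowns, df, status


# Hypothetical model: 5 observed variables, 12 free parameters.
print(identification_status(5, 12))
```

For these made-up values there are 15 knowns and 12 unknowns, leaving 3 degrees of freedom.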
The researcher is well advised to run SEM on pretest or fictional data prior to data collection, since this will usually reveal underidentification or just identification. One good reason to do this is because one solution to underidentification is adding more exogenous variables, which must be done prior to collecting data. If underidentified, the program may issue an error message (ex., failure to converge), generate non-sensical estimates (ex., negative error variances), display very large standard errors for one or more path coefficients, yield unusually high correlation estimates (ex., over .9) among the estimated path coefficients, and/or even stall or crash. The AMOS package notifies the researcher of identification problems and suggests solutions, such as adding more constraints to the model. Alternatively, there are ways of estimating identification without actually running a model-estimation package.
If a model is underidentified or just identified (saturated), then one must do one or more of the following (not all model fitting computer packages support all strategies):
Signs of high multicollinearity:
Maximum likelihood estimation (MLE) is less effective when used with binary data. One option is to use weighted least squares (WLS) or robust weighted least squares (RWLS) estimation instead. Some researchers run the analysis twice, once with ML and once with RWLS estimation, then if results are similar use the ML output, which provides more information. AMOS supports WLS.
One rule of thumb found in the literature is that sample size should be at least 50 more than 8 times the number of variables in the model. Mitchell (1993) advances the rule of thumb that there be 10 to 20 times as many cases as variables. Another rule of thumb, based on Stevens (1996), is to have at least 15 cases per measured variable or indicator. Bentler and Chou (1987) allow as few as 5 cases per parameter estimate (including error terms as well as path coefficients) if one has met all data assumptions. The researcher should go beyond these minimum sample size recommendations particularly when data are non-normal (skewed, kurtotic) or incomplete. Note also that to compute the asymptotic covariance matrix, one needs k(k+1)/2 observations, where k is the number of variables; PRELIS will give an error message when one has fewer observations. Sample size estimation is discussed by Jaccard and Wan (1996: 70-74).
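To compare these rules of thumb for a particular design, one might tabulate them as in the sketch below. The function name, dictionary keys, and example counts are all hypothetical; the arithmetic simply restates the rules cited in the paragraph above.

```python
def sample_size_rules(n_variables, n_indicators, n_parameters):
    """Minimum sample sizes implied by the rules of thumb in the text."""
    return {
        # At least 50 more than 8 times the number of variables.
        "fifty_plus_8v": 50 + 8 * n_variables,
        # Mitchell (1993): 10 to 20 cases per variable.
        "mitchell_range": (10 * n_variables, 20 * n_variables),
        # Stevens (1996): 15 cases per measured variable or indicator.
        "stevens": 15 * n_indicators,
        # Bentler and Chou (1987): 5 cases per parameter estimate.
        "bentler_chou": 5 * n_parameters,
        # k(k+1)/2 observations for the asymptotic covariance matrix.
        "acm_minimum": n_variables * (n_variables + 1) // 2,
    }


# Hypothetical model: 5 variables, 5 indicators, 12 parameters.
for rule, minimum in sample_size_rules(5, 5, 12).items():
    print(rule, minimum)
```

These are lower bounds only; as noted above, non-normal or incomplete data call for larger samples.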
Doing Things in AMOS
A SEM diagram commonly has certain standard elements: latents are ellipses, indicators are rectangles, error and residual terms are circles, single-headed arrows are causal relations (note causality goes from a latent to its indicators), and double-headed arrows are correlations between indicators or between exogenous latents. Path coefficient values may be placed on the arrows from latents to indicators, or from one latent to another, or from an error term to an indicator, or from a residual term to a latent.
Each endogenous variable (the one 'Dependent variable' in the model below) has an error term, sometimes called a disturbance term or residual error, not to be confused with indicator error, e, associated with each indicator variable.
When listwise deletion cannot be used, some form of data imputation is recommended. Imputation means the missing values are estimated. In mean imputation the mean of the variable is substituted. Regression imputation predicts the missing value based on other variables which are not missing. LISREL uses pattern matching imputation: the missing data is replaced by the response to that variable on a case whose values on all other variables match the given case. Note that imputation by substituting mean values is not recommended as this shrinks the variances of the variables involved.
AMOS uses maximum likelihood imputation, which several studies show to have the least bias. To invoke maximum likelihood imputation in AMOS, select View, Analysis Properties, then select the Estimation tab and check "Estimate means and intercepts". Then select Analyze, Data imputation. In one example, Byrne (2001: 296-297) compared the output from an incomplete data model with output from a complete data sample and found ML imputation yielded very similar chi-square and fit measures despite 25% data loss in the incomplete data model. Warning: In AMOS 17 and 18, as noted above, if imputation is selected, latent variable names must conform to the old SPSS requirement of being 8 characters or less with no spaces or special characters.
Alternatively, SPSS's optional module Missing Value Analysis may be used to establish that data are missing at random, completely at random, and so on.
Pairwise deletion is never recommended as it can substantially bias chi-square statistics, among other problems.
Note on AMOS: AMOS version 4 uses zero for means in the null model. If the researcher has used 0 as the indicator for missing values, AMOS will fit the missing values, with the result that goodness of fit indices will be misleadingly higher than they should be. The researcher should use listwise deletion or some other procedure prior to using AMOS.
However, WLS requires very large sample sizes (>2,000 in one simulation study) for dependable results. Moreover, even when WLS is theoretically called for, empirical studies suggest WLS typically leads to similar fit statistics as maximum likelihood estimation and to no differences in interpretation.
Various types of correlation coefficients may be used in SEM:
The usual procedure is to create a latent variable (ex., Gender) which is measured by a single indicator (sex). The path from sex to gender must be specified with a value of 1 and the error variance must be specified as 0. Attempting to estimate either of these parameters instead of setting them as constraints would cause the model to be underidentified, preventing a convergent solution of the SEM model. If one has a variable one wants to include which has lower reliability, say .80, then the measurement error term for that variable would be constrained to (1 - .80) = .20 times its observed variance (that is, to the estimated error variance in the variable).
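The error-variance constraint for a single-indicator latent variable is simple arithmetic, sketched below. The function name and the observed variance of 4.0 are hypothetical; the resulting value is what the researcher would enter as the fixed error variance in the SEM package.

```python
def fixed_error_variance(observed_variance, reliability):
    """Error variance to fix for a single indicator: (1 - reliability) * variance."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    return (1.0 - reliability) * observed_variance


# Hypothetical indicator with observed variance 4.0.
print(fixed_error_variance(4.0, 0.80))  # reliability .80, as in the text
print(fixed_error_variance(4.0, 1.0))   # perfectly reliable, error variance 0
```

A reliability of 1.0 reproduces the case described above, where the error variance is specified as 0.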
As the linked reading above discusses, the focus of SEM analysis for CFA purposes is on analysis of the error terms of the indicator variables. SEM packages usually return the unstandardized estimated measurement error variance for each given indicator. Dividing this by the observed indicator variance yields the percent of variance unexplained by the latent variables. The percent explained by the factors is 1 minus this.
In a standard CFA model each indicator is specified to load only on one factor, measurement error terms are specified to be uncorrelated with each other, and all factors are allowed to correlate with each other. One-factor standard models are identified if the factor has three or more indicators. Multi-factor standard models are identified if each factor has two or more indicators.
Non-standard CFA models, where indicators load on multiple factors and/or measurement errors are correlated, may nonetheless be identified. It is probably easiest to test identification for such models by running SEM as a pretest on fictional data for the model, since SEM programs normally generate error messages signaling any underidentification problems. Non-standard models will not be identified if there are more parameters than observations. (Observations equal v(v+1)/2, where v is the number of observed indicator variables in the model. Parameters equal the number of unconstrained arrows from the latent variables to the indicator variables [that is, all such arrows except the one per latent variable constrained to 1.0, used to set the metric for that latent variable], plus the number of two-headed arrows in the model [indicating correlation of factors and/or of measurement errors], plus the number of variances [which equals the number of indicator variables plus the number of latent variables].) Note that meeting the observations >= parameters test does not guarantee identification, however.
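The counting rule in the parenthetical above can be sketched as follows. This is an illustration under stated assumptions: the function name and the example model (two correlated factors with three indicators each) are hypothetical, and passing the count is necessary but not sufficient for identification.

```python
def cfa_counts(n_indicators, n_latents, n_loadings, n_two_headed):
    """Return (observations, parameters) for a CFA model.

    n_loadings counts every arrow from a latent to an indicator, including
    the one per latent that is constrained to 1.0 to set its metric.
    """
    observations = n_indicators * (n_indicators + 1) // 2
    free_loadings = n_loadings - n_latents    # drop the arrows fixed at 1.0
    variances = n_indicators + n_latents      # error variances + latent variances
    parameters = free_loadings + n_two_headed + variances
    return observations, parameters

# Hypothetical model: 6 indicators, 2 latents, each indicator loading on one
# latent (6 arrows), and one two-headed arrow correlating the two factors.
obs, params = cfa_counts(n_indicators=6, n_latents=2, n_loadings=6, n_two_headed=1)
passes_counting_rule = obs >= params
```

For this hypothetical model there are 21 observations and 13 parameters, so the counting rule is met.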
Other reasons why significance is of less importance in SEM:
Confirmatory tetrad analysis (CTA) is an alternative to the likelihood ratio test for validating measurement models. CTA has the advantages of being able to test underidentified models, non-nested models, and models which have convergence problems under the likelihood ratio test. Also, CTA as an alternative approach provides an assessment of fit under the likelihood ratio test (Bollen & Ting, 1993).
A "tetrad" for a set of four indicator variables is the difference of the product of one pair of covariances and the product of a second pair of covariances. A model with n indicator variables will have n!/[(n-4)!4!] sets of four variables. Since four indicator variables involve six covariances, there will be three tetrads in each set of four indicators:
Tetrad 1234 = (Cov12*Cov34) - (Cov13*Cov24)
Tetrad 1342 = (Cov13*Cov42) - (Cov14*Cov32)
Tetrad 1423 = (Cov14*Cov23) - (Cov12*Cov43)
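The three tetrad formulas can be checked numerically. The sketch below is illustrative only: the loadings and error variances are made-up values for a hypothetical one-factor model (factor variance 1.0), whose implied covariances should make all three tetrads vanish.

```python
from math import comb

def tetrads(cov, a, b, c, d):
    """Return the three tetrads for indicators a, b, c, d (0-based indices)."""
    t1 = cov[a][b] * cov[c][d] - cov[a][c] * cov[b][d]  # (Cov12*Cov34) - (Cov13*Cov24)
    t2 = cov[a][c] * cov[d][b] - cov[a][d] * cov[c][b]  # (Cov13*Cov42) - (Cov14*Cov32)
    t3 = cov[a][d] * cov[b][c] - cov[a][b] * cov[d][c]  # (Cov14*Cov23) - (Cov12*Cov43)
    return t1, t2, t3

# Hypothetical one-factor model with four effect indicators.
loadings = [1.0, 0.8, 0.6, 0.5]
error_vars = [0.5, 0.4, 0.3, 0.2]

# Model-implied covariance matrix: loading_i * loading_j off the diagonal,
# plus the error variance on the diagonal.
cov = [
    [loadings[i] * loadings[j] + (error_vars[i] if i == j else 0.0) for j in range(4)]
    for i in range(4)
]

vanishing = tetrads(cov, 0, 1, 2, 3)

# Number of sets of four indicators in a model with 6 indicators: n!/[(n-4)!4!].
n_sets = comb(6, 4)
```

With sample covariances the tetrads will not be exactly zero, which is why a significance test is needed.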
Latent variables with more than four indicators can still be analyzed by creating sets of four (Bollen & Ting, 1993). Latent variables with fewer than four indicators must "borrow" an indicator from another latent construct to make up a set of four. CTA can also be extended to models with censored, ordinal, or dichotomous variables (Hipp & Bollen, 2003).
Bollen (1990) and Bollen & Ting (1993, 2000) observed that all model-implied tetrads equal 0 (a "vanishing tetrad") in a fully reflective model (a model in which every indicator is an effect indicator of its latent variable). They proposed CTA as a means of distinguishing between causal and effect indicators in SEM models. If all model tetrads test as not significantly different from zero, then the null hypothesis H0 is not rejected and the researcher's effect indicator model is upheld. In contrast, "a significant test statistic supports H1 that casts doubt on the effect indicator model in favor of the alternative cause indicator model" (Gudergan, Ringle, Wende, & Will, 2008: 1239). Note that as multiple tests are involved in tetrad analysis, researchers use a Bonferroni adjustment to the alpha significance level, as suggested by Bollen (1990: 88), by dividing alpha by the number of tetrads (ex., if alpha=.05 and there are 5 tetrads, each is tested at the .01 level).
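The Bonferroni adjustment just described can be sketched as below. The function name and the five p-values are hypothetical; in practice the p-values would come from the tetrad tests themselves.

```python
def bonferroni_tetrad_test(p_values, alpha=0.05):
    """Divide alpha by the number of tetrads and flag each tetrad whose
    p-value falls below the adjusted level."""
    adjusted_alpha = alpha / len(p_values)
    significant = [p < adjusted_alpha for p in p_values]
    return adjusted_alpha, significant

# Hypothetical p-values for 5 tetrads; alpha = .05 gives an adjusted level of .01.
adjusted_alpha, flags = bonferroni_tetrad_test([0.20, 0.04, 0.008, 0.51, 0.03])

# Any significant tetrad casts doubt on the effect indicator model.
doubt_effect_indicator_model = any(flags)
```

Note that two of the hypothetical p-values (.04 and .03) would be significant at an unadjusted .05 level but are not after adjustment.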
How are degrees of freedom computed? Degrees of freedom equal sample moments minus free parameters. The number of sample moments equals the number of variances plus covariances of indicator variables (for n indicator variables, this equals n[n+1]/2). The number of free parameters equals the sum of the number of error variances plus the number of factor (latent variable) variances plus the number of free covariances (two-headed arrows) plus the number of regression coefficients (not counting those constrained to be 1's).
Tests related to non-recursive models:
Bollen's (1989) two-step rule is a sufficient condition to establish identification:
Also, no model can be identified if there are more parameters (unknowns) than observations (knowns). If a model passes the two-step rule above, it will also pass the observations >= parameters test.
Observations. The number of observations is (v(v+1))/2, where v is the number of observed variables in the model.
Parameters. The number of parameters (unknowns to be estimated) is (x + i + f + c + (i - v) + e), where:
Example 7
A nonrecursive model
A reciprocal causation model of
perceived academic ability, using
the female subsample of the Felson
and Bohrnstedt (1979) dataset.
$Standardized ! requests correlations and standardized regression weights
! in addition to default covariances and unstandardized weights
$Smc ! requests squared multiple correlation output
$Structure
academic <--- GPA
ACADEMIC <--- ATTRACT
ACADEMIC <--- ERROR1 (1)
ATTRACT <--- HEIGHT
ATTRACT <--- WEIGHT
ATTRACT <--- RATING
ATTRACT <--- ACADEMIC
ATTRACT <--- ERROR2 (1)
ERROR2 <--> error1
$Input variables
academic ! Perception of
! academic ability.
athletic ! Perception of
! athletic ability.
attract ! Perception of physical
! attractiveness.
GPA ! Grade point average.
!
height ! Height minus group
! mean for age and sex.
weight ! Weight with height
! 'controlled'.
rating ! Strangers' rating of
! attractiveness.
$Sample size = 209
$Correlations
1.00
.43 1.00
.50 .48 1.00
.49 .22 .32 1.00
.10 -.04 -.03 .18 1.00
.04 .02 -.16 -.10 .34 1.00
.09 .14 .43 .15 -.16 -.27 1.00
$Standard deviations
.16 .07 .49 3.49 2.91 19.32 1.01
$Means
.12 .05 .42 10.34 .00 94.13 2.65
As can be seen, a correlation matrix is part of the input, along with a listing of standard deviations and means, and a list of indicators and their correspondence to latent variables. Constraints could also be entered in the input file, but there aren't matrices of the LISREL input type.
The LISREL code for this is found in Jaccard and Wan, 1996: 25-29. Jaccard and Wan also generalize this to three-way interactions (the modifier has a modifier) and more than two categories (pp. 31-37). Note this procedure is preferable to using regression (or some other procedure) to preprocess data by partialing the effects of a covariate out of variables used in the SEM model. Including the modifier variable in the SEM model is analogous to using it as a covariate under ANOVA.
1. Run the measurement model to get the factor loadings.
2. Run the structural model to get the maximum likelihood R-square and the model chi-square.
3. Add to the model an interaction latent with indicators. Each indicator has an error term, as usual.
4. From the measurement model output, select a few pairs of indicators for crossproducts. Use ones that have high factor loadings.
This follows Jonsson (1998) who showed only some crossproducts need to be used. Compute these crossproduct variables in the
raw data and save as an SPSS .sav file (raw data is needed for robust estimates later). Note crossproducts are only one (albeit
common) functional form for interactions; failure to find an interaction effect with the crossproduct form does not rule out the
presence of other forms of interaction. Note also that non-normally distributed indicators may bias the variance of the crossproducts
and make the interaction latent less effective when testing for interaction effects. One can, of course, apply transforms to the
indicators to attempt to bring them into normality first.
5. The regression weights (factor loadings) connecting the crossproduct indicators to the interaction latent are simply the
products of the regression coefficients of their components in the measurement model.
6. The error variance for any given crossproduct indicator equals (the measurement model factor loading squared for the first
paired indicator times the variance of its latent (1.0, so it doesn't really matter) times the error variance of the second paired
indicator) plus the corresponding term with the roles of the two indicators exchanged, plus the product of the two error variances.
7. The interaction model is specified using the coefficients computed in steps 5 and 6. The indicators for the regular latents
are set equal to their regression weights (factor loadings) from the measurement model run in step 1 times their corresponding
latent factor plus the error term loading from step 1 times the error term. For the crossproduct indicator variables, these have
similar formulas, but using the regression weights from step 5 and the error term loadings from step 6.
8. The interaction model sets the paths for each independent latent to their values as computed in the structural model in Step 2.
The path for the interaction latent is left to vary (an unknown to be computed), as is the path to the error term for the dependent latent.
9. The SEM package then computes the path coefficient for the interaction latent as well as the R-square for the model. When
running the interaction model, ask for robust estimation of parameters (this requires input of raw data, not just covariance matrices).
Robust estimation gives distribution-free standard errors as well as computes Satorra-Bentler scaled chi-square, an adjustment to
chi-square which penalizes chi-square for the amount of kurtosis in the data.
Note, however, the interaction latent may still display multicollinearity with its constituent observed variables, which are
indicators for other latents. There is no good solution to this possible source of bias, but one can compute the correlation of
the factor scores for the interaction latent with its constituent observed variables (not crossproducts) to assess the degree of
multicollinearity.
10. The difference of the two R-squareds can be tested with an F test of difference to determine if the models are significantly
different. Or one may use the likelihood ratio test of difference. Or one may look to see if the path coefficient for the interaction
latent to the dependent is significant.
11. If there is a finding of non-significance in step 10, then the interaction model is not significantly better than the model
without interactions and on parsimony grounds, the more complex interaction model is rejected.
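The arithmetic of steps 5 and 6 can be sketched as follows. This is an illustration under an explicit assumption: each indicator equals its loading times its latent plus an error term, with latents and errors mutually independent and centered at zero, so that expanding the product of two indicators gives the loading and error variance computed below. The function name and the numeric estimates are hypothetical.

```python
def crossproduct_indicator(loading_x, error_var_x, loading_z, error_var_z,
                           var_x=1.0, var_z=1.0):
    """Loading and error variance for the crossproduct of indicators x and z.

    Assumes x = loading_x * X + e_x and z = loading_z * Z + e_z with X, Z,
    e_x, e_z independent and mean zero (an assumption of this sketch).
    """
    # Step 5: the loading is the product of the component loadings.
    loading_xz = loading_x * loading_z
    # Step 6: variance of the error part of the product x * z.
    error_var_xz = (loading_x ** 2 * var_x * error_var_z
                    + loading_z ** 2 * var_z * error_var_x
                    + error_var_x * error_var_z)
    return loading_xz, error_var_xz

# Hypothetical measurement-model estimates with latent variances fixed at 1.0.
loading_xz, error_var_xz = crossproduct_indicator(
    loading_x=0.9, error_var_x=0.19, loading_z=0.8, error_var_z=0.36)
```

These two values are the ones that would be entered as fixed constraints for that crossproduct indicator in step 7.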
One does not simply add the crossproducts as additional independents as one would do in OLS regression. In a model with two latent independents, each with three indicators, there will be 3*3 = 9 possible crossproduct interaction terms. For simplicity, it is recommended (Joreskog and Yang, 1996; Jaccard and Wan, 1996: 55) that only one of these crossproducts be modeled in testing the interaction of the two latent variables. Jonsson (1998) recommends using only a few. To model such an interaction, the researcher must add four additional input matrices to LISREL: Kappa, Alpha, Tau-X, and Tau-Y (see above) and in them specify a complex series of constraints (see Jaccard and Wan, 1996: 56-57). This topic and LISREL coding for it are discussed in Jaccard and Wan, 1996: 53-68.
1. Factor scores for the latents in a model are computed and saved.
2. An interaction latent variable is constructed based on crossproducts of the factor scores.
3. The interaction latent is modeled as an additional cause of the dependent latent.
4. In the output the researcher looks to see if the path coefficient of the interaction latent is significant. If it is, there is significant interaction between the latents.
1. Separate the sample into two (or more) groups defined by the categorical indicator and for each group, run two models: (i) an unconstrained model, and (ii) a model in which certain parameters are constrained to be equal. In Amos, an equality constraint is created when the same label is assigned to two or more parameters.
2. There is disagreement among methodologists on just which and how many parameters to constrain to be equal. One common approach is to constrain the measurement model to be equal across groups by constraining the loadings of indicators on their respective factors to be equal. However, one could also test for structural interaction effects by constraining the path coefficients connecting latents to be equal. Even more rigorously, one could constrain error term variances to be equal, though in practice this practically guarantees that group differences will be found to be significant.
3. If the goodness of fit is similar for both the constrained and unconstrained analyses, then the unstandardized path coefficients for the model as applied to the two groups separately may be compared. If the goodness of fit of the constrained model is worse than that for the corresponding unconstrained model, then the researcher concludes that model direct effects differ by group. Depending on what was constrained, the researcher may conclude, for instance, that the measurement model differs between groups. That is, the slopes and intercepts differ when predicting the factor from the indicators. Put another way, a given indicator may be less useful for one group compared to another. This would be shown by the fact that its slope counted for less and the constant counted for more in the path from the indicator to the latent.
Warning: It is not a good idea to test interaction using a multiple group approach on a categorical variable created by collapsing a continuous variable (ex., collapsing income in dollars to be just high and low income). This is because (i) information is lost; (ii) tests are being done on smaller samples when the total sample is divided into groups; and (iii) the selection of a cutting point to divide the continuous variable may well have significant, unexamined effects on the parameters and conclusions.
Log-linear analysis with latent variables is a sub-interval analog to SEM. It combines log-linear analysis with latent class analysis.
There are several reasons why one may get negative variance estimates.
For more on causes and handling of negative error variance, see Chen, Bollen, Paxton, Curran, and Kirby (2001).
Solutions. Ordinarily the researcher will delete the offending indicator from the model, or will constrain the model by specifying a small positive value for that particular error term, and will otherwise work to specify a better-fitting model. Other strategies include dropping outliers from the data, applying nonlinear transforms to input data if nonlinear relations exist among variables, making sure there are at least three indicators per latent variable, specifying better starting values (better prior estimates), and gathering data on more cases. One may also drop MLE estimation in favor of GLS (generalized least squares) or even OLS (ordinary least squares).
AMOS is distinguished by having a very user-friendly graphical interface, including model-drawing tools, and has strong support for bootstrapped estimation. LISREL has a more comprehensive set of options, including nonlinear constraints on parameter estimates, and its companion PRELIS2 package can be used to generate covariance matrix input for LISREL using dichotomous or ordinal variables, or bootstrapped samples. EQS is noted for extensive data management features, flexible options for tests associated with respecifying models, and estimation procedures for non-normal data. There are also other differences in output. For instance, aside from differences in user-friendliness and output features, note that SPSS applies Bartlett's correction to chi-square whereas LISREL does not, accounting for differences in statistical output for the same data (as of 1997).
SAS PROC CALIS note: The default in CALIS is to analyze the correlation matrix; researchers should use the COV option to get the standard form of SEM analysis based on the variance/covariance matrix.
Before proceeding to the structural model (arrows connecting the latent variables), the researcher reads data into AMOS using File, Data Files, File Name. If the data are an SPSS file, you can also launch SPSS by clicking on View Data. AMOS also reads Access, dBASE, Excel, FoxPro, and Lotus files. The researcher may or may not want to click on the Grouping Variable button to set up multiple group models. Note: in reading in data, AMOS will treat blank cells as missing; it will treat 0 cells as zeros, not missing. After the data file is opened (click Open), select "Variables in Dataset" from the View/Set menu. From the popup variable list, drag appropriate variables to the corresponding locations on the diagram. You may need to reformat the labels by clicking on the "Resize Diagram to Fit Page" tool to enlarge the diagram. There is also a "Shape Change" tool to make wider rectangles. To name the latent variables, double-click on the latent variable in the diagram and enter a name in the Variable Name textbox which appears. Alternatively you can let AMOS assign default names by selecting Tools, Name Unobserved Variables. Use the "Add Unique Variable" tool to add an error/residual term for a latent variable. Use the single-headed arrow tool to represent relationships among the latent variables, and use the double-headed arrow tool for unexamined correlations between exogenous latent variables. Remember to choose File, Save As, to save your model diagram, which will have a .amw extension.
To run the SEM model, select View/Set, Analysis Properties, and set your options in the various tabs of the Analysis Properties dialog box. For instance, on the Output tab you can choose whether or not to have standardized estimates or if you want tests of normality. On the Estimation tab you can ask to have AMOS estimate means and intercepts (required if you have missing data). Choose File, Save As, again, prior to running the model, to save your specifications.
To run the model, choose Model Fit, Calculate Estimates, or click the Calculate Estimates (abacus) icon. When the run is finished, the word "Finished" will appear at the bottom of the screen, right after "Writing output" and the (model) chi-square value and degrees of freedom for the model.
To view the model with the parameter values on the arrows, click on the View Output Path Diagram icon in the upper left corner of the AMOS screen.
Most of the statistical output, however, is stored by AMOS in spreadsheet format, accessed by clicking on the View Table Output tool, whose icon looks like a descending histogram forming a triangle. When the output measures table comes up there will be a menu on the left with choices like Estimates, Matrices, and Fit, as well as subcategories for each. Clicking on Fit Measures 1, for instance, brings up the portion of the spreadsheet with fit measures like RMR, GFI, BFI, RMSEA, and many others discussed elsewhere in this section. The column labeled "Default model" contains the fit measures for your model. The column labeled "Saturated" contains the fit measures for a just-identified model with as many parameters as available degrees of freedom. The column labeled "Independence" contains the fit measures for the null model of uncorrelated variables. The rows labeled Discrepancy, Degrees of Freedom, and P give model chi-square and its significance level (which should be > .05 to fail to reject the null hypothesis that your model fits the data). Normal or relative chi-square is reported below this as "Discrepancy/df."
Note the last column in the statistical output, labeled Macro, contains the names of each output measure and these variable names may be placed on your model's graphical diagram if you want. For instance, the macro name for model chi-square is CMIN, and the CMIN variable could be used to display model fit on your diagram.
Top row, left to right:
draw a rectangle for an indicator
draw an oval for a latent
draw an oval for a latent with its associated indicators and their error terms
draw a single-headed arrow indicating a causal (regression) path
draw a double-headed arrow indicating a covariance
add an error term to an already-drawn indicator
add a title (caption)
list the variables in the model
list the variables in the working dataset
select a single object
select all objects
deselect all objects
copy an object
move an object to a new location
erase an object
Middle row, left to right:
change shape of an existing object
rotate an object
reverse direction of indicator variables
move parameter values to an alternate location
scroll the diagram to a new screen location
rearrange path arrows
select and read in a data file
run Analysis Properties
run Calculate Estimates
copy diagram to clipboard
view output in text mode
view output in spreadsheet mode
define object properties
drag (copy) properties of one object to one or more others
symmetrically reposition selected objects
Third row, left to right:
zoom selected area
zoom in
zoom out
resize path to fit in window
resize path to fit on page
magnify path diagram with a magnifying glass
display model degrees of freedom
link selected objects
print path diagram
undo last step
redo last undo
redraw (refresh) path diagram
In multigroup analysis, there may be multiple data files. In AMOS, select "Manage Groups" from the Model-Fit menu, or click on the Manage Groups Icon. Click "New" and enter a group name in place of the default name (ex., in place of "Group Number 3"). Open the Data File dialog box, select each group in turn, click on "File Name," and associate a file with each group.
For group: Girls
NOTE:
The model is recursive.
Assessment of normality
                    min      max     skew     c.r. kurtosis     c.r.
               -------- -------- -------- -------- -------- --------
wordmean          2.000   41.000    0.575    2.004   -0.212   -0.370
sentence          4.000   28.000   -0.836   -2.915    0.537    0.936
paragrap          2.000   19.000    0.374    1.305   -0.239   -0.416
lozenges          3.000   36.000    0.833    2.906    0.127    0.221
cubes             9.000   37.000   -0.131   -0.457    1.439    2.510
visperc          11.000   45.000   -0.406   -1.418   -0.281   -0.490
Multivariate                                          3.102    1.353
Observations farthest from the centroid (Mahalanobis distance)
  Observation   Mahalanobis
       number     d-squared            p1            p2
------------- ------------- ------------- -------------
           42        18.747         0.005         0.286
           20        17.201         0.009         0.130
            3        13.264         0.039         0.546
           35        12.954         0.044         0.397
The multivariate kurtosis value of 3.102 is Mardia's coefficient, and 1.353 is its critical ratio (c.r.). A critical ratio of 1.96 or less in absolute value means there is non-significant kurtosis. Values > 1.96 mean there is significant kurtosis, which means significant non-normality. The higher the Mahalanobis d-squared distance for a case, the more improbably far it is from the solution centroid under assumptions of normality. The cases are listed in descending order of d-squared. The researcher may wish to consider the cases with the highest d-squared to be outliers and might delete them from the analysis. This should be done with theoretical justification (ex., rationale why the outlier cases need to be explained by a different model). After deletion, it may be the data will be found normal by Mardia's coefficient when model fit is re-run. In EQS, one may use the Satorra-Bentler scaled chi-square adjustment if kurtosis is detected (not available in AMOS).
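The 1.96 screening rule can be applied mechanically to the critical ratios in the Girls output above. The sketch below is illustrative; applying the same 1.96 cutoff to the univariate skew and kurtosis critical ratios, and the function name, are assumptions of this sketch rather than anything AMOS does for the researcher.

```python
CRITICAL_VALUE = 1.96

# (variable, skew c.r., kurtosis c.r.) copied from the output for group Girls.
critical_ratios = [
    ("wordmean", 2.004, -0.370),
    ("sentence", -2.915, 0.936),
    ("paragrap", 1.305, -0.416),
    ("lozenges", 2.906, 0.221),
    ("cubes", -0.457, 2.510),
    ("visperc", -1.418, -0.490),
]

def flag_nonnormal(rows, critical=CRITICAL_VALUE):
    """Map each variable to the list of statistics whose |c.r.| exceeds the cutoff."""
    flagged = {}
    for name, skew_cr, kurtosis_cr in rows:
        problems = []
        if abs(skew_cr) > critical:
            problems.append("skew")
        if abs(kurtosis_cr) > critical:
            problems.append("kurtosis")
        if problems:
            flagged[name] = problems
    return flagged

flagged = flag_nonnormal(critical_ratios)

# Multivariate kurtosis: the critical ratio reported in the output is 1.353.
multivariate_kurtosis_significant = abs(1.353) > CRITICAL_VALUE
```

On these figures several individual variables are flagged even though the multivariate critical ratio is below the cutoff.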
Bootstrapping assumes that the sample is representative of the underlying population, making it inappropriate for non-random samples in most cases. Bootstrapping also assumes observations are independent. Though small samples increase the chances of violating the normality assumption, bootstrapping does not solve this problem entirely, since the larger the sample, the greater the precision of bootstrapped error estimates. Bootstrapping in SEM still requires moderately large samples. If bootstrapping is used, factor variances should not be constrained, else bootstrapped standard error estimates will be highly inflated.
In bootstrapping, a large number of samples with replacement are taken (ex., several hundred) and parameter estimates are computed for each, typically using MLE. (Actually any statistic can be bootstrapped, including path coefficients and fit indices). The bootstrapped estimates can be averaged and their standard error computed, to give a way of assessing the stability of MLE estimates for the original sample. Some modeling software also supports bootstrapped goodness of fit indexes and bootstrapped chi-square difference coefficients. AMOS, EQS, and LISREL using its PRELIS2 package, all support bootstrapped estimates. AMOS is particularly strong in this area. In AMOS, the $Bootml command yields frequency distributions of the differences between model-implied and observed covariances for alternative estimation methods.
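The resampling logic described above can be sketched in a few lines. To stay self-contained, this illustration bootstraps an ordinary least squares slope on made-up data rather than an MLE estimate from an SEM model; the data, function names, and seed are all hypothetical, and this is not how AMOS, EQS, or PRELIS2 implement bootstrapping internally.

```python
import random
import statistics

def ols_slope(pairs):
    """Slope of y on x by ordinary least squares."""
    mean_x = statistics.fmean(x for x, _ in pairs)
    mean_y = statistics.fmean(y for _, y in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    return sxy / sxx

def bootstrap(pairs, statistic, n_samples=500, seed=0):
    """Resample cases with replacement and recompute the statistic each time.

    Returns the mean of the bootstrapped estimates and their standard
    deviation, which serves as the bootstrapped standard error.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_samples):
        resample = [rng.choice(pairs) for _ in pairs]  # same size as the original sample
        estimates.append(statistic(resample))
    return statistics.fmean(estimates), statistics.stdev(estimates)

# Hypothetical data: y is roughly 2*x plus random noise.
data_rng = random.Random(42)
data = [(float(x), 2.0 * x + data_rng.gauss(0.0, 1.0)) for x in range(30)]

original_estimate = ols_slope(data)
boot_mean, boot_se = bootstrap(data, ols_slope)
```

Comparing original_estimate with boot_mean gives a rough sense of bias, and boot_se indicates how stable the estimate is across resamples.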
To invoke bootstrapping in the AMOS graphical interface mode, choose View, Analysis Properties, and select the Bootstrap tab. Then click on "Perform bootstrapping." Also in the Bootstrap tab, set the number of bootstrap samples (ex., 500) and check to request "Bias-corrected confidence intervals" and set the corresponding confidence level (ex., 95). Also check "Bootstrap ML." Then select Model-Fit, Calculate Estimates as usual. The bootstrapped chi-square and its df will appear on the left-hand side of the Amos workspace. Interpretation of AMOS bootstrap output is discussed further below.
Bollen-Stine bootstrap p. The Bollen-Stine bootstrap is a bootstrap modification of model chi-square, used to test model fit, adjusting for distributional misspecification of the model (ex., adjusting for lack of multivariate normality). AMOS provides this option under View, Analysis Properties, on the Bootstrap tab: check "Bollen-Stine bootstrap." If Bollen-Stine bootstrap p < .05, the model is rejected. However, like model chi-square, Bollen-Stine p is strongly affected by a large sample size and the researcher is advised to use other measures of fit as a criterion for model acceptance/rejection when sample size is large.
Amos Input. In Amos, select View, Analysis Properties, Bootstrap tab. Click the Perform Bootstrap checkbox and other options wanted.
Amos Output. Requesting bootstrapped path estimates in AMOS will result in output containing two sets of regression parameter standard error estimates because AMOS still presents the default maximum likelihood (ML) estimates first, then the bootstrapped estimates. In Amos bootstrap output for regression weights, there will be six columns. The first three are the label of the regression in question, the ML (or other) estimate, and the standard error (labeled S.E.). These are followed by the three bootstrap columns:
If standard errors are similar and bias low, then the ML (or other) estimates can be interpreted without fear that departures from multivariate normality or small sample size have biased the calculation of parameters.
AMOS can also be requested to print out the confidence intervals for the estimated regression weights. If zero is not within the confidence limits, we may conclude the estimate is significantly different from zero, justifying the drawing of that particular arrow on the path diagram.
Copyright 1998, 2008, 2009 by G. David Garson.
Last updated 11/18/2009.