This content is now available from Statistical Associates Publishers.
Below is the overview and table of contents.
Overview

Partial least squares (PLS) analysis is an alternative to OLS regression, canonical correlation, or structural equation modeling (SEM) for analyzing systems of independent and response variables. In fact, PLS is sometimes called "component-based SEM," in contrast to "covariance-based SEM," which is the usual type and which is implemented by Amos, LISREL, EQS, and other major software packages. On the response side, PLS can relate the set of independent variables to multiple dependent (response) variables. On the predictor side, PLS can handle many independent variables, even when the predictors display multicollinearity.

PLS may be implemented as a regression model, predicting one or more dependents from a set of one or more independents; or it can be implemented as a path model, handling causal paths relating predictors as well as paths relating the predictors to the response variable(s). PLS is implemented as a regression model by SPSS and by SAS's PROC PLS. SmartPLS is the most prevalent implementation as a path model.

PLS is characterized as a technique most suitable where the research purpose is prediction or exploratory modeling. In general, covariance-based SEM is preferred when the research purpose is confirmatory modeling. PLS is less than satisfactory as an explanatory technique because it is low in power to filter out variables of minor causal importance (Tobias, 1997: 1).

The advantages of PLS include the ability to model multiple dependents as well as multiple independents; the ability to handle multicollinearity among the independents; robustness in the face of data noise and missing data; and the creation of independent latents directly on the basis of crossproducts involving the response variable(s), making for stronger predictions.
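To make the last point concrete, the following is a minimal sketch of how a PLS latent variable can be built from crossproducts with the response. It uses a NIPALS-style loop for a single response, written in plain Python. The helper names (center, pls1_components) and the toy data are hypothetical illustrations; this is not the routine used by SPSS, SAS, or SmartPLS.

```python
import math


def center(values):
    """Subtract the mean so crossproducts measure co-variation."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def pls1_components(X, y, max_components, tol=1e-12):
    """NIPALS-style PLS for a single response (hypothetical helper).

    X is a list of rows, y a list of responses. Returns one
    (weights, scores, y_loading) tuple per extracted component.
    """
    n, k = len(X), len(X[0])
    cols = [center([row[j] for row in X]) for j in range(k)]
    E = [[cols[j][i] for j in range(k)] for i in range(n)]  # centered X
    f = center(y)                                            # centered y
    components = []
    for _ in range(max_components):
        # Weights come from the crossproducts X'y, not from X'X alone.
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(k)]
        norm = math.sqrt(sum(v * v for v in w))
        if norm < tol:  # nothing left in X that co-varies with y
            break
        w = [v / norm for v in w]
        # Scores: the value of the latent variable for each case.
        t = [sum(E[i][j] * w[j] for j in range(k)) for i in range(n)]
        tt = sum(v * v for v in t)
        # X loadings and y loading for this component.
        p = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(k)]
        q = sum(f[i] * t[i] for i in range(n)) / tt
        # Deflate: remove what this component explained.
        E = [[E[i][j] - t[i] * p[j] for j in range(k)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        components.append((w, t, q))
    return components


# Toy data (made up): the second predictor is exactly twice the first,
# so X'X is singular and OLS has no unique solution.
X = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0], [5.0, 10.0]]
y = [1.1, 1.9, 3.2, 3.9, 5.1]

components = pls1_components(X, y, max_components=2)
weights, scores, y_loading = components[0]
print(len(components), [round(v, 4) for v in weights])
```

In this toy case the two predictors carry the same information, so the sketch extracts a single latent variable and stops, which is one way of seeing why multicollinearity among the independents does not block a PLS solution.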
Disadvantages of PLS include greater difficulty in interpreting the loadings of the independent latent variables (which are based on crossproduct relations with the response variables rather than, as in common factor analysis, on covariances among the manifest independents) and the fact that, because the distributional properties of the estimates are not known, the researcher cannot assess significance except through bootstrap induction.

Overall, the mix of advantages and disadvantages means PLS is favored as a predictive technique and not as an interpretive technique, except for exploratory analysis as a prelude to an interpretive technique such as multiple linear regression or covariance-based structural equation modeling. Henseler, Ringle, and Sinkovics (2009: 282) thus state, "PLS path modeling is recommended in an early stage of theoretical development in order to test and validate exploratory models."

Developed by Herman Wold (Wold, 1981, 1985) for econometrics and chemometrics, PLS has since spread to research in education (e.g., Campbell & Yates, 2011), marketing (e.g., Albers, 2009, cites PLS as the method of choice in success factors marketing research), and the social sciences (e.g., Jacobs et al., 2011).

Table of Contents

Overview
Key Concepts and Terms
  Background
  Models
  Regression vs. path models
  PLS-DA models
  Mixed methods
  Reflective vs. formative models
  Confirmatory vs. exploratory models
  Inner (structural) model vs. outer (measurement) model
  Variables
  Case identifier variable
  Measured factors and covariates
  Modeled factors and response variables
  Measurement level of variables
  Parameter estimates
  Cross-validation and goodness-of-fit
  PRESS and optimal number of dimensions
PLS path modeling with SmartPLS
  Creating a PLS project and importing data
  Validating the data
  Creating the path model in SmartPLS
  Reflective vs. formative models
  Hiding the measurement model
  Estimation options in SmartPLS
  Finite mixture PLS
  Running the path model in SmartPLS
  Data metric for centered data
  Weighting scheme
  SmartPLS Output
  Path coefficients
  Bootstrapped significance
  Options
  Saving the model
  SmartPLS Output
  Model report
  Model fit coefficients
  Latent variable correlations
  Latent variable crossloadings
  Measurement model coefficients (outer model coefficients)
  Structural model path coefficients (inner model coefficients)
  Factor scores
Multigroup/finite mixture analysis (MGA) using FIMIX-PLS
  Overview
  Comparing models with differing numbers of segments
  Entropy
  Comparing path coefficients between segments
  T-test of differences in path coefficients
  Labeling the segments
PLS regression modeling with SmartPLS
  PLS regression: SmartPLS vs. SPSS or SAS
  PLS regression: SPSS vs. SAS
  Example
  Creating a simple regression model in SmartPLS
  Statistical output
PLS regression modeling with SPSS
  Example
  Model input
  Statistical output
  Proportion of variance explained by latent factors
  PRESS (predictive error sum of squares)
  Latent factor weights and loadings
  Variable importance in the projection (VIP) for the independent variables
  Regression parameter estimates by dependent variable
  Charts/plots
  Plots of latent factor weights
  Residual and normal quantile plots
  Saving variables
PLS regression modeling using SAS
  Overview
  Example
  SAS syntax
  SAS output
  Parameter estimates
  Percent variation accounted for
  Correlation loading plot
Software for PLS
  PLS Regression
  PLS Path Analysis
Assumptions
  Robustness
  Distribution-free
  Bootstrap estimates of significance
  Independent observations
  Data level
  Homogeneity
  Linearity
  Outliers
  Residuals
  Appropriate sample size
  Model specification
  Multicollinearity
  Proper use of dummy variables
  Standardized variables
Frequently Asked Questions
  Why is PLS sometimes described as a 'soft modeling' technique?
  You said PLS could handle large numbers of independents, but can't OLS regression do this too?
  How does PLS path modeling compare to path modeling in structural equation modeling (SEM) using packages like AMOS?
  Is PLS always a linear technique?
  How is PLS related to principal components regression (PCR) and maximum redundancy analysis (MRA)?
  What are the SIMPLS and PCR methods in PROC PLS in SAS?
  What are the NIPALS and SVD algorithms?
  How does PLS relate to two-stage least squares (2SLS)?
  How does PLS relate to neural network analysis (NNA)?
  What is confirmatory tetrad analysis (CTA) in PLS?
Bibliography