This content is now available from Statistical Associates Publishers.

Below is the unformatted overview and table of contents.

Overview

Partial least squares (PLS) analysis is an alternative to OLS regression, canonical correlation, or structural equation modeling (SEM) of systems of independent and response variables. In fact, PLS is sometimes called "component-based SEM," in contrast to "covariance-based SEM," which is the usual type and which is implemented by Amos, LISREL, EQS and other major software packages. On the response side, PLS can relate the set of independent variables to multiple dependent (response) variables. On the predictor side, PLS can handle many independent variables, even when predictors display multicollinearity. PLS may be implemented as a regression model, predicting one or more dependents from a set of one or more independents; or it can be implemented as a path model, handling causal paths relating predictors as well as paths relating the predictors to the response variable(s). PLS is implemented as a regression model by SPSS and by SAS's PROC PLS. SmartPLS is the most prevalent implementation as a path model. 
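The regression form of PLS described above can be illustrated with a minimal sketch of a one-component PLS regression for a single response. This is not the procedure used by SPSS, SAS, or SmartPLS, which extract multiple components and offer several algorithms. The function names and the small data set are hypothetical and chosen only for illustration. The two predictors are deliberately almost collinear, yet the latent component is still well defined because its weights come from the crossproducts of each predictor with the response.

```python
import math

def center(values):
    """Return the values with their mean subtracted."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]

def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))

def pls1_one_component(x_cols, y):
    """Fit a one-component PLS regression of y on the predictor columns.

    Returns (weights, scores, coefs); coefs apply to mean-centered predictors.
    """
    xc = [center(col) for col in x_cols]
    yc = center(y)
    # Weights: crossproduct of each predictor with the response, scaled to unit length.
    w = [dot(col, yc) for col in xc]
    norm = math.sqrt(dot(w, w))
    w = [wj / norm for wj in w]
    # Latent scores: weighted sum of the centered predictors for each case.
    t = [sum(wj * col[i] for wj, col in zip(w, xc)) for i in range(len(y))]
    # Regress the centered response on the single latent score.
    q = dot(t, yc) / dot(t, t)
    # With one component, the predictor coefficients are the weights times q.
    coefs = [q * wj for wj in w]
    return w, t, coefs

# Hypothetical data: x1 and x2 are nearly collinear.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [1.1, 1.9, 3.2, 3.9, 5.1, 6.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.0]

weights, scores, coefs = pls1_one_component([x1, x2], y)

# Proportion of response variance reproduced by the single component.
yc = center(y)
xc = [center(x1), center(x2)]
fitted = [sum(b * col[i] for b, col in zip(coefs, xc)) for i in range(len(y))]
ss_res = sum((a - f) ** 2 for a, f in zip(yc, fitted))
r_squared = 1 - ss_res / dot(yc, yc)

print("weights:", weights)
print("coefficients:", coefs)
print("R-squared:", r_squared)
```

In this sketch the near-collinearity of the predictors does not destabilize the estimates, because no inverse of the predictor crossproduct matrix is required. A fuller implementation would deflate the predictors and extract further components.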

PLS is generally regarded as most suitable where the research purpose is prediction or exploratory modeling. In general, covariance-based SEM is preferred when the research purpose is confirmatory modeling. PLS is less than satisfactory as an explanatory technique because it has little power to filter out variables of minor causal importance (Tobias, 1997: 1).

The advantages of PLS include the ability to model multiple dependents as well as multiple independents; the ability to handle multicollinearity among the independents; robustness in the face of data noise and missing data; and the creation of independent latents directly on the basis of crossproducts involving the response variable(s), making for stronger predictions. The disadvantages of PLS include greater difficulty in interpreting the loadings of the independent latent variables (which are based on crossproduct relations with the response variables, not, as in common factor analysis, on covariances among the manifest independents) and, because the distributional properties of the estimates are not known, the inability to assess significance except through bootstrap induction. Overall, the mix of advantages and disadvantages means PLS is favored as a predictive technique and not as an interpretive technique, except for exploratory analysis as a prelude to an interpretive technique such as multiple linear regression or covariance-based structural equation modeling. Henseler, Ringle, and Sinkovics (2009: 282) thus state, "PLS path modeling is recommended in an early stage of theoretical development in order to test and validate exploratory models."
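Because significance in PLS is assessed by bootstrapping, a generic sketch of the idea may help. The example below resamples cases with replacement and takes a percentile interval for a simple least-squares slope. It is an assumption-laden illustration, not the bootstrap routine of SmartPLS or any other package: the statistic, the function names, the data, the number of resamples, and the seed are all hypothetical. In a PLS application the statistic would be a path coefficient or weight re-estimated on each resample.

```python
import random

def slope(pairs):
    """Least-squares slope of y on x for a list of (x, y) pairs."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    return sxy / sxx

def bootstrap_ci(pairs, statistic, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for a statistic, resampling cases."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(pairs)
    estimates = []
    for _ in range(n_boot):
        # Draw n cases with replacement from the original sample.
        sample = [pairs[rng.randrange(n)] for _ in range(n)]
        try:
            estimates.append(statistic(sample))
        except ZeroDivisionError:
            # Skip degenerate resamples in which x does not vary.
            continue
    estimates.sort()
    m = len(estimates)
    lower = estimates[int((alpha / 2) * m)]
    upper = estimates[int((1 - alpha / 2) * m) - 1]
    return lower, upper

# Hypothetical data: y rises by roughly 2 for each unit of x.
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
ys = [2.3, 3.8, 6.1, 8.4, 9.7, 12.2, 13.9, 16.3, 17.8, 20.1]
pairs = list(zip(xs, ys))

point_estimate = slope(pairs)
lower, upper = bootstrap_ci(pairs, slope)

print("slope:", point_estimate)
print("95% percentile interval:", (lower, upper))
```

An interval that excludes zero is read as evidence that the coefficient differs from zero. The percentile method is only one of several bootstrap interval constructions.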

Developed by Herman Wold (Wold, 1981, 1985) for econometrics and chemometrics, PLS has since spread to research in education (e.g., Campbell & Yates, 2011), marketing (e.g., Albers, 2009, who cites PLS as the method of choice in success factors marketing research), and the social sciences (e.g., Jacobs et al., 2011).


Table of Contents
Overview
Key Concepts and Terms
Background
Models
Regression vs. path models
PLS-DA models
Mixed methods
Reflective vs. formative models
Confirmatory vs. exploratory models
Inner (structural) model vs. outer (measurement) model
Variables
Case identifier variable
Measured factors and covariates
Modeled factors and response variables
Measurement level of variables
Parameter estimates
Cross-validation and goodness-of-fit
PRESS and optimal number of dimensions
PLS path modeling with SmartPLS
Creating a PLS project and importing data
Validating the data
Creating the path model in SmartPLS
Reflective vs. formative models
Hiding the measurement model
Estimation options in SmartPLS
Finite mixture PLS
Running the path model in SmartPLS
Data metric for centered data
Weighting scheme
SmartPLS Output
Path coefficients
Bootstrapped significance
Options
Saving the model
SmartPLS Output
Model report
Model fit coefficients
Latent variable correlations
Latent variable crossloadings
Measurement model coefficients (outer model coefficients)
Structural model path coefficients (inner model coefficients)
Factor scores
Multigroup/finite mixture analysis (MGA) using FIMIX-PLS
Overview
Comparing models with differing numbers of segments
Entropy
Comparing path coefficients between segments
T-test of differences in path coefficients
Labeling the segments
PLS regression modeling with SmartPLS
PLS regression: SmartPLS vs SPSS or SAS
PLS regression: SPSS vs. SAS
Example
Creating a simple regression model in SmartPLS
Statistical output
PLS regression modeling with SPSS
Example
Model input
Statistical output
Proportion of variance explained by latent factors
PRESS (predictive error sum of squares)
Latent factor weights and loadings
Variable importance in the projection (VIP) for the independent variables
Regression parameter estimates by dependent variable
Charts/plots
Plots of latent factor weights
Residual and normal quantile plots
Saving variables
PLS regression modeling using SAS
Overview
Example
SAS syntax
SAS output
Parameter estimates
Percent variation accounted for
Correlation loading plot
Software for PLS
PLS Regression
PLS Path Analysis
Assumptions
Robustness
Distribution-free
Bootstrap estimates of significance
Independent observations
Data level
Homogeneity
Linearity
Outliers
Residuals
Appropriate sample size
Model specification
Multicollinearity
Proper use of dummy variables
Standardized variables
Frequently Asked Questions
Why is PLS sometimes described as a 'soft modeling' technique?
You said PLS could handle large numbers of independents, but can't OLS regression do this too?
How does PLS path modeling compare to path modeling in structural equation modeling (SEM) using packages like AMOS?
Is PLS always a linear technique?
How is PLS related to principal components regression (PCR) and maximum redundancy analysis (MRA)?
What are the SIMPLS and PCR methods in proc PLS in SAS?
What are the NIPALS and SVD algorithms?
How does PLS relate to two-stage least squares (2SLS)?
How does PLS relate to neural network analysis (NNA)?
What is confirmatory tetrad analysis (CTA) in PLS?
Bibliography