Correlation is a bivariate measure of association, that is, of the strength (effect size) of the relationship between two variables. It ranges from -1 (perfect negative linear relationship) through 0 (no linear relationship) to +1 (perfect positive linear relationship). It is usually reported in terms of its square (r²), interpreted as the percent of variance explained. For instance, if r² is .25, the independent variable is said to explain 25% of the variance in the dependent variable.
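As a concrete illustration of r and r², the following minimal sketch computes both with NumPy; the variable names and data values are made up for this example and are not from the text.

```python
import numpy as np

# Hypothetical data: exam score tends to rise with hours studied.
hours = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
score = np.array([52.0, 55.0, 61.0, 60.0, 68.0, 71.0])

# Pearson's r is the off-diagonal entry of the 2x2 correlation matrix.
r = np.corrcoef(hours, score)[0, 1]

# r squared: the proportion of variance in score "explained" by hours.
r_squared = r ** 2

print(round(r, 3), round(r_squared, 3))
```

Here r is strongly positive, and r² can be read directly as the percent of variance explained in the sense described above.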
There are several common pitfalls in using correlation. Correlation is symmetric, providing no evidence of which way causation flows. If unmeasured variables also cause the dependent variable, any covariance they share with the given independent variable may be falsely attributed to that independent variable. To the extent that the relationship between the two variables is nonlinear, correlation will understate it. Correlation is also attenuated by measurement error, including use of data below the interval level or artificial truncation of the range of the data. Correlation can be a misleading average if the relationship varies depending on the value of the independent variable (lack of homoscedasticity). And, of course, atheoretical or post hoc running of many correlations runs the risk that, at the .05 level, about 5% of the coefficients will appear significant by chance alone.
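The nonlinearity pitfall in particular is easy to demonstrate numerically. In this sketch (illustrative data, not from the text), y is perfectly determined by x, yet Pearson's r is essentially zero because the relationship has no linear component:

```python
import numpy as np

# A deterministic but U-shaped relationship over a symmetric range of x.
x = np.linspace(-3, 3, 61)
y = x ** 2

r = np.corrcoef(x, y)[0, 1]
# r is (near) zero even though y is fully determined by x,
# because Pearson's r measures only the linear part of the association.
print(round(r, 3))
```

This is an extreme case; with milder curvature, r is merely attenuated rather than zeroed out, which is why plotting the data before correlating is standard advice.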
Besides Pearsonian correlation (r), by far the most common type, there are other special types of correlation to handle the particular characteristics of such variables as dichotomies, and there are other measures of association for nominal and ordinal variables. Regression procedures produce the multiple correlation, R, which is the correlation of multiple independent variables with a single dependent variable. There is also partial correlation, the correlation of one variable with another while controlling both the given variable and the dependent for a third or additional variables; it is typically used to model three to five variables. And there is part correlation, the correlation of one variable with another while controlling only the given variable for a third or additional variables. The b coefficients in regression are part (semi-partial) coefficients. These topics are discussed in separate volumes of the "Blue Book" series.
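The distinction between partial and part correlation can be made concrete with the residual method: partial correlation correlates the residuals of both variables after regressing each on the control, while part correlation residualizes only the given independent variable. The sketch below uses simulated data invented for this illustration (the coefficients 0.7, 0.5, 0.4 are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data: z influences both x and y; x also influences y directly.
n = 500
z = rng.normal(size=n)
x = 0.7 * z + rng.normal(size=n)
y = 0.5 * z + 0.4 * x + rng.normal(size=n)

def residuals(a, b):
    """Residuals of a simple least-squares regression of a on b (with intercept)."""
    slope, intercept = np.polyfit(b, a, 1)
    return a - (slope * b + intercept)

# Partial correlation of x and y controlling for z on BOTH sides.
partial = np.corrcoef(residuals(x, z), residuals(y, z))[0, 1]

# Part (semipartial) correlation: control for z on the x side only.
part = np.corrcoef(residuals(x, z), y)[0, 1]

print(round(partial, 3), round(part, 3))
```

In general the part correlation is no larger in absolute value than the corresponding partial correlation, since the control variable's variance is removed from only one side.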
The full content is now available from Statistical Associates Publishers.
Below is the table of contents.

CORRELATION

Table of Contents

- Overview
- Key Concepts and Terms
  - Basic terms
    - Deviation
    - Covariance
    - Standardization
    - Correlation
- Correlation for interval data
  - Pearsonian correlation
    - Pearson's r
    - Coefficient of determination, r²
    - Attenuation of correlation
- Ordinal correlation
  - Correlation for ordinal and dichotomous data
  - Spearman's rho
  - Kendall's tau-b
  - Polyserial correlation
  - Polychoric correlation
- Pearsonian and ordinal correlation in SPSS
  - Example
  - SPSS correlation dialog
  - Options
  - Pearson correlation output
  - Ordinal correlation output
- Pearsonian and ordinal correlation with SAS
  - Example
  - SAS syntax
  - SAS PROC CORR output
  - PLOTS output
  - Cronbach coefficient alpha table
- Correlation for dichotomies
  - Point-biserial correlation
  - Biserial correlation
    - Converting point-biserial to biserial correlation
  - Rank biserial correlation
  - Phi
- Other types of correlation
  - Tetrachoric correlation
  - Correlation ratio, eta
  - Coefficient of intraclass correlation (ICC)
- Assumptions
  - Interval level data
  - Linear relationships
  - Homoscedasticity
  - No outliers
  - Minimal measurement error
  - Unrestricted variance
  - Similar underlying distributions
  - Common underlying normal distributions
  - Normally distributed error terms
- Frequently Asked Questions
  - Do I want one-tailed or two-tailed significance?
  - How many correlations will there be among k variables?
  - What rules exist for determining the appropriate significance level for testing correlation coefficients?
  - How do I convert correlations into z scores?
    - Z-Score Conversions of Pearson's r
  - How is the significance of a correlation coefficient computed?
    - Significance of r
    - Significance of the difference between two correlations from two independent samples
    - Significance of the difference between two dependent correlations from the same sample
  - How do I set confidence limits on my correlation coefficients?
  - I have ordinal variables and thus used Spearman's rho. How do I use these ordinal correlations in SPSS for partial correlation, regression, and other procedures?
  - What is the relation of correlation to ANOVA?
  - What is the relation of correlation to validity?
  - What is the SPSS syntax for correlation?
- Bibliography