For enumeration data, effect size measures (e.g., correlation) are reported, but significance tests are not. Because significance levels confound effect size with sample size, they are a misleading guide to which findings are important; effect size measures should be used instead. Nonetheless, some publishers insist on significance levels. If significance levels are reported for enumeration data, a footnote may be added stating, "Computed significance levels are reported in order to follow social science convention. However, as the data are an enumeration of all cases, the actual significance level for all findings is .000, not the computed level, which assumes the data are a random sample of size n." (where n is the size of the researcher's population).
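The confounding of effect size and sample size can be illustrated with a short sketch. It holds a correlation fixed at r = .10 and varies only n, using the Fisher z approximation for the two-sided test of a zero population correlation. The function name and the sample sizes are illustrative assumptions, not part of the original text.

```python
from math import atanh, sqrt
from statistics import NormalDist


def approx_p_for_correlation(r, n):
    """Approximate two-sided p-value for H0: rho = 0 (Fisher z method)."""
    z = atanh(r) * sqrt(n - 3)  # Fisher-transformed r scaled by its standard error
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Same effect size, different sample sizes (hypothetical values).
for n in (50, 500, 5000):
    print(n, approx_p_for_correlation(0.10, n))
```

The effect size is identical in all three runs; only the sample size moves the computed significance level, which is why the significance level alone is a poor guide to importance.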
The best that can be done with non-random sampled data is to establish that characteristics of the non-random sample are proportionate to the population to which the researcher wishes to generalize. For example, in a non-random sample of students at a college, college-level data may be available on gender, and the researcher may establish that the proportion of females in the non-random sample matches the known percentage of females in the college. Ideally the researcher would like to establish proportionality for key variables in the study. While significance tests do not make non-random data generalizable, the chi-square goodness of fit test can be used to test whether sampled data conform to a known distribution. This is discussed in the section on chi-square, where a spreadsheet example is provided.
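As a minimal sketch of the goodness of fit check described above, the following compares the gender split in a sample with a known college percentage. All figures are hypothetical, the function name is an assumption, and the code handles only the two-category case (1 degree of freedom), for which the chi-square upper tail can be computed with the standard library.

```python
from math import erfc, sqrt


def gof_two_categories(observed, population_share):
    """Chi-square goodness of fit for a two-category variable (df = 1).

    observed: counts [in_category, not_in_category] from the sample.
    population_share: known population proportion for the first category.
    """
    n = sum(observed)
    expected = [n * population_share, n * (1 - population_share)]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    p_value = erfc(sqrt(chi2 / 2))  # upper-tail probability of chi-square, 1 df
    return chi2, p_value


# Hypothetical data: 118 women and 82 men sampled; the college is 55% female.
chi2, p = gof_two_categories([118, 82], 0.55)
print(chi2, p)
```

A non-significant result here means only that the sample does not differ detectably from the population on this one variable; it does not make the sample random.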
Establishing proportionality for every variable is a daunting and usually impossible task, and even if all measured variables in the study could be shown to be proportionate to the population, unmeasured variables may not be proportionate and may still influence the findings. Confirmatory research is almost always impossible with non-random data, and findings should be reported as exploratory, with limitations of the data reported and analyzed. If significance levels are reported for non-random data, a footnote may be added stating, "Computed significance levels are reported in order to follow social science convention. However, as the data are non-random, reported significance levels are in error to an unknown degree and are presented for exploratory purposes only."
For experimental data based on randomization of subjects, a significance level at or below .05 means that, if the treatment had no effect and the same subjects were randomized again, one would expect a result as strong as the observed one no more than 5% of the time. A significant result means only that the result is unlikely to be due to the chance of randomization. It says nothing about whether, if one took another sample of subjects on which to base randomization, one would be likely to get results as strong as those observed. The researcher is not entitled to generalize to a larger population, for the same reasons cited above with regard to any other non-random sample. That experimental researchers routinely violate this principle does not change the fact that research based on experiments with randomized subjects can only be generalized to the intended population if the data are a random sample of that population. With a large enough number of subjects, randomization does control for unmeasured variables, but if the sample is not random, significance levels will be in error to an unknown degree, just as the sample is unrepresentative to an unknown degree.
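The logic of a randomization-based significance level can be sketched by enumerating every way the same subjects could have been assigned to the two groups and counting how often the difference in means is at least as large as the observed one. The scores and the function name below are hypothetical, and full enumeration is feasible only for small groups.

```python
from itertools import combinations


def exact_randomization_p(treatment, control):
    """Two-sided exact randomization p-value for a difference in group means."""
    pooled = list(treatment) + list(control)
    k = len(treatment)
    m = len(pooled) - k
    total = sum(pooled)
    observed = abs(sum(treatment) / k - sum(control) / m)

    as_extreme = 0
    n_splits = 0
    # Every possible assignment of k of the pooled subjects to "treatment".
    for idx in combinations(range(len(pooled)), k):
        t_sum = sum(pooled[i] for i in idx)
        diff = abs(t_sum / k - (total - t_sum) / m)
        if diff >= observed - 1e-12:  # small tolerance for float rounding
            as_extreme += 1
        n_splits += 1
    return as_extreme / n_splits


# Hypothetical outcome scores for ten randomized subjects.
p = exact_randomization_p([12, 14, 15, 17, 18], [8, 9, 10, 11, 13])
print(p)
```

The reference set consists only of re-assignments of these particular subjects, which is why the resulting significance level speaks to the randomization and not to any larger population.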
For problematic large samples, Monte Carlo significance estimates are available, also in the Exact Tests module. Monte Carlo estimates are derived from repeated sampling of the current dataset to build an empirical distribution (as opposed, for instance, to assuming a normal distribution). Monte Carlo estimates are data-driven and may overfit the data (that is, they may reflect noise in the current dataset), but they make no distributional assumptions.
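One common form of Monte Carlo significance estimate can be sketched as follows: instead of enumerating every rearrangement of the data, a fixed number of random rearrangements is drawn and the p-value is estimated from them. This is an illustrative sketch under that assumption, not a description of the algorithm used in the Exact Tests module; the function name, data, number of draws, and seed are all hypothetical.

```python
import random


def monte_carlo_p(group_a, group_b, n_draws=10_000, seed=0):
    """Monte Carlo estimate of a two-sided p-value for a difference in means."""
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    pooled = list(group_a) + list(group_b)
    k = len(group_a)

    def mean_gap(values):
        # Absolute difference between the mean of the first k values and the rest.
        return abs(sum(values[:k]) / k - sum(values[k:]) / (len(values) - k))

    observed = mean_gap(pooled)
    hits = 0
    for _ in range(n_draws):
        rng.shuffle(pooled)  # one random re-assignment of the same cases
        if mean_gap(pooled) >= observed - 1e-12:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_draws + 1)


p_hat = monte_carlo_p([12, 14, 15, 17, 18], [8, 9, 10, 11, 13])
print(p_hat)
```

Because the estimate is built entirely from the data at hand, a different seed or number of draws gives a slightly different value, and any noise in the dataset carries through to the estimate.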
Technically, there is a difference between failing to reject the null hypothesis when a finding is non-significant and actually accepting the null hypothesis. To accept the null hypothesis given a finding of non-significance, power should be .80 or higher. It is not enough that the significance level be greater than .05.
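The .80 power criterion can be checked with a rough calculation. The sketch below approximates the power of a two-sided, two-group comparison of means using the normal distribution; the standardized effect size, group sizes, and function name are hypothetical, and an exact calculation would use the noncentral t distribution instead.

```python
from math import sqrt
from statistics import NormalDist


def approx_power_two_groups(effect_size_d, n_per_group, alpha=0.05):
    """Normal-approximation power for a two-sided, two-sample test of means."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)       # critical value, e.g. 1.96
    shift = effect_size_d * sqrt(n_per_group / 2)    # expected test statistic
    # Probability of landing in either rejection region when the effect is real.
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


# Hypothetical medium effect (d = 0.5) at two sample sizes.
for n in (20, 64):
    print(n, approx_power_two_groups(0.5, n))
```

With the smaller group size, power falls well short of .80, so a non-significant result in such a study would justify only failing to reject the null hypothesis, not accepting it.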
Copyright 1998, 2008, 2009, 2011 by G. David Garson.
Last update, 12/16/2011.