Overview

Normal curve means tests, commonly called simply "hypothesis tests," are a basic method of exploring possible differences between two samples, or of testing the null hypothesis that an observed sample mean does not differ significantly from zero. The normal curve test is a parametric test that assumes a normal distribution, and when its assumptions are met it is more powerful than the corresponding two-sample nonparametric tests. The normal curve z-test is used when sample sizes are large (e.g., n > 29), but with smaller samples the t-test is used. With large samples the two tests give essentially the same results, because the t distribution converges to the normal distribution as sample size increases. With polytomous independent variables (more than two samples), researchers use ANOVA.

Sample standard deviation is a conservative adjustment statisticians make when dealing with sample data. It uses the same formula as the population standard deviation, but with (n - 1) in the denominator rather than n, the sample size. Using n - 1 makes the estimate of the population variance unbiased (its expected value equals the true value) and largely removes the downward bias in the estimated standard deviation. The larger the n, the more closely the sample standard deviation approximates the population standard deviation. In practical terms, replacing n with n - 1 matters only for smaller samples.
The estimated standard error (SE) of the mean is $SE = sd / \sqrt{n}$, where sd is the standard deviation for a variable and n is the sample size. That is, SE is estimated to diminish in inverse proportion to the square root of n: the larger the n, the smaller the SE. Often estimated standard error is just called "standard error."
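As a minimal sketch of the two denominators and of the standard error, the following uses only Python's standard library. The five data values are hypothetical and chosen purely for illustration; `statistics.pstdev` divides by n and `statistics.stdev` divides by n - 1.

```python
import math
import statistics

# Hypothetical sample of five ages (illustrative data, not from the text).
sample = [52, 55, 58, 61, 64]
n = len(sample)

# Standard deviation with n in the denominator (population formula).
sd_with_n = statistics.pstdev(sample)

# Standard deviation with n - 1 in the denominator (sample formula).
sd_with_n_minus_1 = statistics.stdev(sample)

# Estimated standard error of the mean: sd / sqrt(n).
standard_error = sd_with_n_minus_1 / math.sqrt(n)

print(f"sd using n:      {sd_with_n:.3f}")
print(f"sd using n - 1:  {sd_with_n_minus_1:.3f}")
print(f"standard error:  {standard_error:.3f}")
```

With so small a sample the two standard deviations differ noticeably; rerunning the sketch with a few hundred values would show the gap shrinking toward zero.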

The binomial distribution follows the formula $(p + q)^n$, where p is the probability of one outcome (Republicans, in this example, with p = .5), q is the probability of non-occurrence (q = 1 - p), and n is the number of trials (4 in this example). Thus,

$$(p + q)^4 = p^4 + 4p^3q + 6p^2q^2 + 4pq^3 + q^4 = \tfrac{1}{16} + \tfrac{4}{16} + \tfrac{6}{16} + \tfrac{4}{16} + \tfrac{1}{16} = 1$$

As can be seen, this binomial expansion corresponds to the distribution shown in the figure above.
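The terms of the expansion can also be computed directly. The sketch below assumes the example's values (p = .5, n = 4); the function name `binomial_probabilities` is a hypothetical helper introduced here for illustration.

```python
from math import comb


def binomial_probabilities(n, p):
    """Return P(k occurrences) for k = 0..n, the terms of (p + q)^n."""
    q = 1 - p
    return [comb(n, k) * p**k * q ** (n - k) for k in range(n + 1)]


# Example from the text: 4 trials, probability .5 of drawing a Republican.
probs = binomial_probabilities(4, 0.5)
for k, prob in enumerate(probs):
    print(f"{k} Republicans in 4 trials: {prob:.4f}")
```

Each probability is one term of the expansion, and the five terms sum to 1.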


The area under the normal curve represents probability: 68.26% of cases will lie within 1 standard deviation of the mean, 95.44% within 2 standard deviations, and 99.74% within 3 standard deviations. Often this is simplified by rounding to say that 1 s.d. corresponds to 2/3 of the cases, 2 s.d. to 95%, and 3 s.d. to 99%. Another way to put this is to say there is less than a .05 chance that a sampled case will lie more than 2 standard deviations from the mean, and less than a .01 chance that it will lie more than 3 standard deviations from the mean. This statement is analogous to statements pertaining to significance levels of .05 and .01.
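These areas can be checked without a printed table. The following sketch uses `statistics.NormalDist` from Python's standard library; `area_within` is a hypothetical helper name.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1


def area_within(k):
    """Proportion of cases within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    inside = area_within(k)
    print(f"within {k} s.d.: {inside:.4%}   outside: {1 - inside:.4f}")
```

Figures derived from printed tables of four-decimal areas can differ from the computed values in the last digit.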

Thus, if the mean in our sample is 20 and the standard deviation is 12, then if the data are normally distributed and randomly sampled, we would estimate that 95% of the cases will be within the range of 20 plus or minus 1.96*12 = 23.52, which is the range -3.52 to 43.52. By the same token the chance of a given case being 43.52 or higher, or -3.52 or lower, is .05. This calculation is a two-tailed test. The chance of a given case being 43.52 or higher is .025, which is the corresponding one-tailed test. Note that the significance level of a two-tailed test is numerically twice that of a one-tailed test. Because a numerically lower significance level (closer to 0) indicates a substantively more significant result (less likelihood that the observation is due merely to the chance of random sampling), the one-tailed test has substantively better significance by a factor of two.
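The arithmetic in this example can be reproduced as follows. This is a sketch using the example's mean of 20 and standard deviation of 12; the critical value is computed from the normal distribution rather than read from a table, so it is 1.95996... rather than exactly 1.96.

```python
from statistics import NormalDist

mean, sd = 20, 12

# Two-tailed critical value for .05: leaves .025 in each tail (about 1.96).
z_critical = NormalDist().inv_cdf(0.975)

margin = z_critical * sd
lower, upper = mean - margin, mean + margin

# Tail probabilities for individual cases under a normal curve
# with this mean and standard deviation.
cases = NormalDist(mu=mean, sigma=sd)
upper_tail = 1 - cases.cdf(upper)             # one-tailed
both_tails = cases.cdf(lower) + upper_tail    # two-tailed

print(f"95% range: {lower:.2f} to {upper:.2f}")
print(f"one-tailed probability: {upper_tail:.3f}")
print(f"two-tailed probability: {both_tails:.3f}")
```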
In normal curve terms, if we hypothesize that there is a normal distribution of ages around a mean of 55 and were to take samples from this distribution, what percentage of the time would we get a sample mean age that differs from 55 by 4 years or more? This is similar to the two-tailed test illustrated in the figure above. If the distance from the hypothesized real mean of 55 to the sample mean of 59 (4 years) is 1.96 standard errors or greater, then the proportion of sample means in that tail is .025 or less, and the proportion in both tails is .05 or less. Recall that standard error is the standard deviation of sample means, which is what this example involves, but the logic is the same as for standard deviations of cases. We want the two-tailed situation because the hypothesis deals with "different from," whereas had it dealt only with "more than," we would want the one-tailed test. Dividing 4 by 1.96 gives 2.04, so if the standard error is 2.04 or less, then 59 is at least 1.96 standard errors away from 55. We can then say that we can be 95% confident that our sample mean of 59 is significantly different from the hypothesized real mean of 55. Equivalently, we can say that the sample mean is significantly different at the .05 significance level.
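The threshold reasoning in this example can be sketched in a few lines. The means of 55 and 59 and the critical value of 1.96 come from the text; the two standard errors tried at the end (2.0 and 2.5) are hypothetical values chosen to fall on either side of the threshold.

```python
hypothesized_mean = 55
sample_mean = 59
z_critical = 1.96  # two-tailed critical value for the .05 level

difference = sample_mean - hypothesized_mean

# Largest standard error for which a 4-year difference is still
# at least 1.96 standard errors from the hypothesized mean.
max_standard_error = difference / z_critical


def is_significant(standard_error):
    """True if the difference is at least z_critical standard errors."""
    return abs(difference) / standard_error >= z_critical


print(f"maximum standard error: {max_standard_error:.2f}")
for se in (2.0, 2.5):  # hypothetical standard errors
    print(f"SE = {se}: significant at .05? {is_significant(se)}")
```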
One Sample Formula for z values for means tests. It is conventional to denote the value we look up in a table of areas under the normal curve as "z." In the example above, z was 1.96, but it may calculate to any number according to this formula:

$$z = \frac{\bar{x} - \mu}{s.d. / \sqrt{n}}$$

where $\bar{x}$ is the sample mean, $\mu$ is the hypothesized mean, s.d. is the sample standard deviation, used as an estimate of the unknown population standard deviation, and n is the sample size. Note that the denominator term is the standard error, discussed above. The researcher uses this formula to compute the z value, then compares it with the critical value (e.g., 1.96 for significance = .05) in a table of areas under the normal curve. In a two-tailed test, if the absolute value of z is 1.96 or higher, then the difference of means is significant at the .05 level.
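A minimal sketch of the one-sample computation follows, assuming the usual form z = (sample mean - hypothesized mean) / (s.d. / sqrt(n)). The function name `one_sample_z` and the inputs s.d. = 12 and n = 36 are hypothetical, chosen so that the standard error comes out to 2; the means 59 and 55 reuse the age example.

```python
import math
from statistics import NormalDist


def one_sample_z(sample_mean, hypothesized_mean, sd, n):
    """Return (z, two-tailed p) for a one-sample normal curve means test."""
    standard_error = sd / math.sqrt(n)  # the denominator of the z formula
    z = (sample_mean - hypothesized_mean) / standard_error
    # Area in both tails beyond |z| under the standard normal curve.
    p_two_tailed = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_two_tailed


# Hypothetical inputs: s.d. = 12 and n = 36 give a standard error of 2.
z, p = one_sample_z(sample_mean=59, hypothesized_mean=55, sd=12, n=36)
print(f"z = {z:.2f}, two-tailed p = {p:.4f}")
print("significant at .05" if abs(z) >= 1.96 else "not significant at .05")
```

Computing the tail area directly plays the same role as looking up z in a table of areas under the normal curve.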
The sample z value is compared to critical values found in a table of areas under the normal curve, as in means tests.
| Independent Samples (Uncorrelated Data) |
|---|
| Dependent Samples (Correlated Data) |
|---|
Copyright 1998, 2008 by G. David Garson.
Last update 4/16/08.