Up to this point, we have learned how to estimate the population parameter for the mean using sample data and a sample statistic. From one point of view, this makes sense: we have one value for our parameter, so we use a single value (called a point estimate) to estimate it. However, we have seen that all statistics have sampling error and that the value we find for the sample mean will bounce around based on the people in our sample, simply due to random chance. Thinking about estimation from this perspective, it would make more sense to take that error into account rather than relying just on our point estimate. To do this, we calculate what is known as a confidence interval. A confidence interval starts with our point estimate and then creates a range of scores (this is the "interval" part) considered plausible based on our standard deviation, our sample size, and the level of confidence with which we would like to estimate the parameter. The distance that this range extends in each direction away from the point estimate is called the margin of error. We calculate the margin of error by multiplying our two-tailed critical t-score by our standard error:
\[\text{Margin of Error} = t \times \left(\frac{s}{\sqrt{N}}\right) \nonumber \]
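As a rough illustration, the margin of error can be computed from raw scores with a few lines of Python. This is only a sketch: the function name `margin_of_error`, the list of scores, and the sample size are invented for illustration, and the critical value is read from a t table rather than computed.

```python
import math
import statistics


def margin_of_error(scores, t_critical):
    """Margin of error = two-tailed critical t * standard error (s / sqrt(N))."""
    n = len(scores)
    s = statistics.stdev(scores)        # sample standard deviation (N - 1 in the denominator)
    standard_error = s / math.sqrt(n)
    return t_critical * standard_error


# Hypothetical scores, invented purely for illustration (not from the text).
scores = [36, 41, 39, 44, 38, 40, 35, 43]
t_critical = 2.365  # table value: two-tailed, alpha = 0.05, df = 8 - 1 = 7

point_estimate = statistics.mean(scores)
moe = margin_of_error(scores, t_critical)
print(f"{point_estimate:.2f} +/- {moe:.2f}")
```

The point estimate sits at the center of the interval, and the margin of error is the amount added to and subtracted from it.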
As a function of how they are constructed, we can also use confidence intervals to test hypotheses. Once a confidence interval has been constructed, using it to test a hypothesis is simple. If the range of the confidence interval brackets (or contains, or is around) the null hypothesis value, we fail to reject the null hypothesis. If it does not bracket the null hypothesis value (i.e., if the entire range is above the null hypothesis value or below it), we reject the null hypothesis. The reason for this is clear if we think about what a confidence interval represents. Remember: a confidence interval is a range of values that we consider reasonable or plausible based on our data. Thus, if the null hypothesis value is in that range, then it is a value that is plausible based on our observations. If the null hypothesis value is plausible, then we have no reason to reject it. Thus, if our confidence interval brackets the null hypothesis value, thereby making it a reasonable or plausible value based on our observed data, then we have no evidence against the null hypothesis and fail to reject it. However, if we build a confidence interval of reasonable values based on our observations and it does not contain the null hypothesis value, then we have no empirical (observed) reason to believe the null hypothesis value and therefore reject the null hypothesis.
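The decision rule just described can be sketched as a small Python function. The name `ci_decision` and the interval used in the example calls are hypothetical. Treating a null value that lands exactly on a bound as "bracketed" is an assumption made here for simplicity.

```python
def ci_decision(lower, upper, null_value):
    """Return the hypothesis-test decision implied by a confidence interval."""
    # Assumption: a null value exactly on a bound counts as inside the interval.
    if lower <= null_value <= upper:
        return "fail to reject"
    return "reject"


# Hypothetical interval, invented purely for illustration.
inside = ci_decision(10.0, 14.0, 12.0)   # null value lies within the interval
outside = ci_decision(10.0, 14.0, 15.0)  # entire interval is below the null value
print(inside, "|", outside)
```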
Let’s see an example. You hear that the national average on a measure of friendliness is 38 points. You want to know if people in your community are more or less friendly than people nationwide, so you collect data from 30 random people in town to look for a difference. We’ll follow the same four-step hypothesis testing procedure as before.
We need our critical values in order to determine the width of our margin of error. We will assume a significance level of \(\alpha = 0.05\) (which will give us a 95% CI). Because a two-tailed (non-directional) test splits \(\alpha\) between the two tails, the critical value at \(\alpha = 0.05\) is found under \(p = 0.025\) in the table of critical values for t. With 29 degrees of freedom (\(N - 1 = 30 - 1 = 29\)) and a tail probability of 0.025, the critical t-score is 2.045.
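For readers who would like to check the table lookup numerically, here is a minimal sketch that uses only the Python standard library. It integrates the t density with Simpson's rule and searches for the cutoff by bisection. The function names, the step count, and the search range are all choices made for this illustration. In practice a statistics library would normally be used instead (for example, `scipy.stats.t.ppf(0.975, 29)`).

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def upper_tail_area(t, df, steps=2000):
    """P(T > t) for t >= 0: one half minus the Simpson's-rule area on [0, t]."""
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3


def critical_t(alpha, df):
    """Two-tailed critical t: the cutoff that leaves alpha / 2 in each tail."""
    target = alpha / 2
    lo, hi = 0.0, 50.0  # assumed search range, wide enough for common alpha and df
    for _ in range(60):  # bisection: halve the bracket each pass
        mid = (lo + hi) / 2
        if upper_tail_area(mid, df) > target:
            lo = mid  # too much area in the tail, so move the cutoff outward
        else:
            hi = mid
    return (lo + hi) / 2


t_crit = critical_t(0.05, 29)
print(round(t_crit, 3))  # should agree with the table entry of 2.045
```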
Now we can construct our confidence interval. After we collect our data, we find that the average person in our community scored 39.85, or \(\overline{X} = 39.85\), and our standard deviation was \(s = 5.61\). Now we can put the standard deviation, our point estimate for the sample mean, and our critical value from step 2 into the formula for a confidence interval:
\[95\% \text{ CI} = 39.85 \pm 2.045(1.02) \nonumber \]
\[95\% \text{ CI} = \overline{X} \pm \left(t \times \frac{s}{\sqrt{N}}\right) = 39.85 \pm \left(2.045 \times \frac{5.61}{\sqrt{30}}\right) = 39.85 \pm \left(2.045 \times \frac{5.61}{5.48}\right) = 39.85 \pm (2.045 \times 1.02) = 39.85 \pm 2.09 \nonumber \]
\[\text{Lower Bound} = 39.85 - 2.09 = 37.76 \nonumber \]
\[\text{Upper Bound} = 39.85 + 2.09 = 41.94 \nonumber \]
\[95\% \text{ CI} = (37.76, 41.94) \nonumber \]
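The arithmetic above can be reproduced with a short Python sketch. The numbers are the ones given in the example, and the variable names are our own. The code carries the unrounded standard error through the calculation rather than rounding it to 1.02 first, and the bounds still round to the same two decimals.

```python
import math

# Values taken from the friendliness example in the text.
sample_mean = 39.85   # point estimate
s = 5.61              # sample standard deviation
n = 30                # sample size
t_critical = 2.045    # two-tailed critical t, df = 29, alpha = 0.05

standard_error = s / math.sqrt(n)
margin = t_critical * standard_error

lower = sample_mean - margin
upper = sample_mean + margin
print(f"95% CI = ({lower:.2f}, {upper:.2f})")
```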
Finally, we can compare our confidence interval to our null hypothesis value. The null value of 38 is higher than our lower bound of 37.76 and lower than our upper bound of 41.94. Thus, the confidence interval brackets our null hypothesis value, and we retain (fail to reject) the null hypothesis.
Based on our sample of 30 people, our community is not different in average friendliness (\(\overline{X} = 39.85\)) from the nation as a whole, \(95\%\) CI = (37.76, 41.94).
Note that we don’t report a test statistic or \(p\)-value because that is not how we tested the hypothesis, but we do report the value we found for our confidence interval.
An important characteristic of hypothesis testing is that both methods (comparing a test statistic to a critical value, and checking whether a confidence interval brackets the null hypothesis value) will always give you the same result, provided both use the same two-tailed \(\alpha\) level. That is because both are based on the same standard error and critical value in their calculations. To check this, we can calculate a t-statistic for the example above and find it to be \(t = 1.81\), which is smaller than our critical value of 2.045, so we again fail to reject the null hypothesis.
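This agreement can be checked with a brief Python sketch. It computes the t-statistic for the example and then compares the two decision rules across a handful of other null values. Those extra null values are hypothetical and are included only to show that the two rules move together. The function names are our own.

```python
import math

# Values from the friendliness example in the text.
sample_mean, s, n = 39.85, 5.61, 30
t_critical = 2.045
standard_error = s / math.sqrt(n)


def rejects_by_t(null_value):
    """Two-tailed test: reject when |t| exceeds the critical value."""
    t = (sample_mean - null_value) / standard_error
    return abs(t) > t_critical


def rejects_by_ci(null_value):
    """Confidence interval test: reject when the interval misses the null value."""
    margin = t_critical * standard_error
    return not (sample_mean - margin <= null_value <= sample_mean + margin)


t_stat = (sample_mean - 38) / standard_error
print(f"t = {t_stat:.2f}")

# 38 is the null value from the text; the others are hypothetical alternatives.
null_values = [36, 37, 38, 39, 40, 42, 43]
agreement = [rejects_by_t(v) == rejects_by_ci(v) for v in null_values]
print(all(agreement))
```

Both rules compare the distance between the sample mean and the null value against the same margin of error, which is why they cannot disagree.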
This page titled 8.5: Confidence Intervals is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or curated by Foster et al. (University of Missouri’s Affordable and Open Access Educational Resources Initiative) via source content that was edited to the style and standards of the LibreTexts platform.