8.5: Confidence Intervals

Up to this point, we have learned how to estimate the population parameter for the mean using sample data and a sample statistic. From one point of view, this makes sense: we have one value for our parameter, so we use a single value (called a point estimate) to estimate it. However, we have seen that all statistics have sampling error and that the value we find for the sample mean will bounce around based on the people in our sample, simply due to random chance. Thinking about estimation from this perspective, it would make more sense to take that error into account rather than relying just on our point estimate. To do this, we calculate what is known as a confidence interval.

A confidence interval starts with our point estimate and then creates a range of scores (this is the "interval" part) considered plausible based on our standard deviation, our sample size, and the level of confidence with which we would like to estimate the parameter. The range extends equally in both directions away from the point estimate, and the distance it extends in each direction is called the margin of error. We calculate the margin of error by multiplying our two-tailed critical t-score by our standard error:

\[\text{Margin of Error} = t \times \left(\dfrac{s}{\sqrt{n}}\right) \nonumber \]

The critical value we use will be based on a chosen level of confidence, which is equal to \(1 - \alpha\). Thus, a \(95\%\) level of confidence corresponds to \(\alpha = 0.05\), and at the 0.05 level of significance we create a 95% confidence interval. How to interpret that is discussed further on.
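As a rough illustration of the margin of error calculation described above, here is a minimal Python sketch. The function name `margin_of_error` and the five data values are hypothetical (they are not the data used in this chapter), and the critical t-score is looked up by hand from a t-table rather than computed.

```python
import math
import statistics


def margin_of_error(sample, t_critical):
    """Return critical t times the standard error, s / sqrt(n)."""
    n = len(sample)
    s = statistics.stdev(sample)        # sample standard deviation (n - 1 in the denominator)
    standard_error = s / math.sqrt(n)
    return t_critical * standard_error


# Hypothetical data: minutes studied by five students (not from the text).
minutes = [40, 55, 60, 45, 50]

# 2.776 is the two-tailed critical t for alpha = 0.05 with df = n - 1 = 4,
# taken from a standard t-table.
moe = margin_of_error(minutes, t_critical=2.776)
print(f"Margin of error: {moe:.2f} minutes")
```

Passing the critical value in as an argument keeps the sketch free of third-party libraries; a different sample size or confidence level would need a different value from the table.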
Once we have our margin of error calculated, we add it to our point estimate for the mean to get an upper bound for the confidence interval and subtract it from the point estimate for the mean to get a lower bound for the confidence interval:

\[\begin{array}{l} \text{Upper Bound} = \overline{X} + \text{Margin of Error} \\ \text{Lower Bound} = \overline{X} - \text{Margin of Error} \end{array} \nonumber\]

Or simply:

\[\text{Confidence Interval} = \overline{X} \pm t \times \left(\dfrac{s}{\sqrt{n}}\right) \nonumber \]

Let’s see what this looks like with some actual numbers by taking our studying for weekly quizzes data and using it to create a 95% confidence interval estimating the average length of time spent studying. We already found that our average was \(\overline{X} = 53.75\) minutes, our standard error (the denominator) was 6.86, and our critical t-score was 2.353. With that, we have all the pieces we need to construct our confidence interval:

\[95 \% \text{ CI} = 53.75 \pm 2.353(6.86) = 53.75 \pm 16.14 \nonumber \]

\[\text{Lower Bound} = 53.75 - 16.14 = 37.61 \nonumber \]

\[\text{Upper Bound} = 53.75 + 16.14 = 69.89 \nonumber \]

\[95 \% \text{ CI} = (37.61, 69.89) \nonumber \]

So we find that our 95% confidence interval runs from 37.61 minutes to 69.89 minutes, but what does that actually mean? The range (37.61 to 69.89) represents values of the mean that we consider reasonable or plausible based on our observed data. It includes our point estimate of the mean, \(\overline{X} = 53.75\), in the center, but it also has a range of values that could also have been the case based on what we know about how much these scores vary (i.e. our standard error).

It is very tempting to also interpret this interval by saying that we are 95% confident that the true population mean falls within the range (37.61 to 69.89), but this is not true. Phrasing our interpretation this way suggests that we have firmly established an interval and that the population mean either does or does not fall into it, as if the interval were fixed and the population mean were the thing that moves around. However, the population mean is an absolute that does not change; it is our interval that will vary from data collection to data collection, even taking into account our standard error. The correct interpretation, then, is that we are \(95\%\) confident that the range (37.61 to 69.89) brackets the true population mean. This is a very subtle difference, but it is an important one.
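The arithmetic in the worked example can be checked with a short Python sketch. The function name `confidence_interval` is hypothetical; the three inputs (mean 53.75, standard error 6.86, critical t-score 2.353) are the values quoted in the example.

```python
def confidence_interval(mean, standard_error, t_critical):
    """Return (lower, upper) bounds: mean minus/plus t * standard error."""
    margin = t_critical * standard_error
    return mean - margin, mean + margin


# Values quoted in the studying-for-weekly-quizzes example.
lower, upper = confidence_interval(mean=53.75, standard_error=6.86, t_critical=2.353)
print(f"Margin of error: {2.353 * 6.86:.2f}")
print(f"Confidence interval: ({lower:.2f}, {upper:.2f})")
```

The sketch takes the standard error directly rather than recomputing it from raw scores, because the raw scores are not repeated in this section.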

Hypothesis Testing with Confidence Intervals

As a function of how they are constructed, we can also use confidence intervals to test hypotheses. Once a confidence interval has been constructed, using it to test a hypothesis is simple. If the range of the confidence interval brackets (or contains, or is around) the null hypothesis value, we fail to reject the null hypothesis. If it does not bracket the null hypothesis value (i.e. if the entire range is above the null hypothesis value or below it), we reject the null hypothesis. The reason for this is clear if we think about what a confidence interval represents. Remember: a confidence interval is a range of values that we consider reasonable or plausible based on our data. Thus, if the null hypothesis value is in that range, then it is a value that is plausible based on our observations, and if the null hypothesis is plausible, we have no reason to reject it. In other words, when our confidence interval brackets the null hypothesis value, we have no evidence against the null hypothesis and fail to reject it. However, if we build a confidence interval of reasonable values based on our observations and it does not contain the null hypothesis value, then we have no empirical (observed) reason to believe the null hypothesis value and therefore reject the null hypothesis.
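The decision rule just described can be sketched in a few lines of Python. The function name `ci_decision` and the interval and null values used below are hypothetical, chosen only to show both outcomes; treating a null value that sits exactly on a bound as inside the interval is also an assumption of this sketch.

```python
def ci_decision(lower, upper, null_value):
    """Apply the confidence-interval decision rule to a null hypothesis value."""
    if lower <= null_value <= upper:
        # The interval brackets the null value, so the null value is plausible.
        return "fail to reject the null hypothesis"
    # The whole interval lies above or below the null value.
    return "reject the null hypothesis"


# Hypothetical interval and null values, for illustration only.
print(ci_decision(45.0, 55.0, null_value=50.0))  # null value inside the interval
print(ci_decision(45.0, 55.0, null_value=60.0))  # interval entirely below the null value
```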

Scenario

Let’s see an example. You hear that the national average on a measure of friendliness is 38 points. You want to know if people in your community are more or less friendly than people nationwide, so you collect data from 30 random people in town to look for a difference. We’ll follow the same four-step hypothesis testing procedure as before.

Step 1: State the Hypotheses