Interval Estimation
   

   


Confidence Intervals

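We saw last week that, using sampling distributions, we can find an interval of the real line that will include the transformed statistic any specified proportion of the time in repeated samples. For example,

\[ P\left(-1.96 \le \frac{\bar{X}-\mu}{\sigma/\sqrt{n}} \le 1.96\right) = .95, \]

or

[second displayed probability statement, based on the .025 and .975 quantiles; not recoverable from the source]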

In fact, such a probability inequality can be written down for each transformed statistic we studied last week. We used .025 and .975 in the two examples above so that the total probability would be .95 = .975 - .025. It is traditional in statistics to denote this "inclusion probability" by $1-\alpha$, which in our example means that $\alpha = .05$; the .025 is then $\alpha/2$ and the .975 is $1-\alpha/2$.

Thus if we wanted the inclusion probability to be .99 instead of .95, we would use the .005 and .995 quantiles in place of the .025 and .975 quantiles ($\alpha = .01$, so $\alpha/2 = .005$ and $1-\alpha/2 = .995$).
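To make the role of $\alpha$ concrete, here is a minimal Python sketch that looks up the $\alpha/2$ and $1-\alpha/2$ quantiles for a chosen inclusion probability. It covers only the standard normal case; the function name `normal_quantile_pair` is an illustrative choice, and the quantiles come from `statistics.NormalDist` in the Python standard library (Python 3.8 or later).

```python
from statistics import NormalDist


def normal_quantile_pair(confidence):
    """Return the alpha/2 and 1 - alpha/2 quantiles of the standard normal."""
    alpha = 1.0 - confidence            # e.g. confidence .95 gives alpha = .05
    std_normal = NormalDist()           # mean 0, standard deviation 1
    lower = std_normal.inv_cdf(alpha / 2)
    upper = std_normal.inv_cdf(1 - alpha / 2)
    return lower, upper


for level in (0.95, 0.99):
    lo, hi = normal_quantile_pair(level)
    print(f"inclusion probability {level:.2f}: {lo:.3f} to {hi:.3f}")
```

For .95 this gives the familiar -1.960 and 1.960; for .99 it gives about -2.576 and 2.576, which is commonly rounded to 2.58 in tables.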

Once we have an inequality such as the two above, it is a simple matter to solve it so that the parameter of interest is by itself in the middle. This gives what is called a "confidence interval" for the parameter, that is, an interval of the real line that will include the value of the parameter $100(1-\alpha)$ percent of the time. What we have been calling the inclusion probability is called the confidence level of the confidence interval. Recall again that such a confidence interval will contain the value of the parameter (a population mean, for example) in $100(1-\alpha)$ percent of all samples from the population.
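As an illustration of the "solve the inequality" step, take the standard normal statement $P\left(-1.96 \le (\bar{X}-\mu)/(\sigma/\sqrt{n}) \le 1.96\right) = .95$ and assume $\sigma$ is known. A sketch of the algebra that isolates $\mu$:

```latex
\begin{align*}
-1.96 \le \frac{\bar{X}-\mu}{\sigma/\sqrt{n}} \le 1.96
  &\iff -1.96\,\frac{\sigma}{\sqrt{n}} \le \bar{X}-\mu \le 1.96\,\frac{\sigma}{\sqrt{n}} \\
  &\iff \bar{X} - 1.96\,\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + 1.96\,\frac{\sigma}{\sqrt{n}}
\end{align*}
```

The last line is the 95% confidence interval for $\mu$ in this case. The event is the same in every line, so its probability is still .95.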

Rather than go through all of the algebra of the inequalities, we have given a list of confidence intervals at the end of this chapter. The list is organized by 11 "case numbers", which correspond to the dialog box in the "Calculating Confidence Intervals" concept lab. For examples of using confidence intervals, see that lab.

 

 

[Table of confidence intervals by case number]

 

 

StataQuest makes it easy to get these intervals for a given data set.

 

Sample Size Determination

Estimating $\mu$ with a $100(1-\alpha)$% confidence interval of length 2B

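Here we derive the sample size needed to obtain an interval of length 2B, where B is the largest possible distance between any $\mu$ in the confidence interval and the sample mean $\bar{X}$. Recall that the interval is $\bar{X} \pm z_{1-\alpha/2}\,\sigma/\sqrt{n}$, where $z_{1-\alpha/2}$ is the $1-\alpha/2$ quantile of the standard normal distribution, so that

\[ B = z_{1-\alpha/2}\,\frac{\sigma}{\sqrt{n}}. \]

Solving for n we obtain

\[ n = \left(\frac{z_{1-\alpha/2}\,\sigma}{B}\right)^2. \]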

Use of the formula above requires $\sigma$ to be known, which seldom happens in practice. When $\sigma$ is unknown, one either estimates it with the sample standard deviation from a previous study or uses 1/4 of the range (the difference between the largest and the smallest observations) as a rough guess.

EXAMPLE: A factory claims that the average working hours $\mu$ of its employees is 40. A random sample of 49 workers has average working hours of 42, with a standard deviation of 6. For future studies, if we want to construct a 99% confidence interval whose total length is at most two hours, how large a sample will we need? (Assume $\sigma = 6$.)

Here 2B = 2, so B = 1, and $z_{1-\alpha/2} = z_{.995} = 2.58$, so $n = (2.58 \times 6 / 1)^2 \approx 239.6$.

We would need n=240 to get a 99% confidence interval whose length is at most 2. Note that we always round up.
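The calculation in this example can be sketched in Python as follows. This is a minimal illustration, assuming $\sigma$ is known: the function name `sample_size_for_mean` and its parameters are made up for this sketch. The optional `z` argument lets you pass a rounded table value such as 2.58; otherwise the quantile is computed with `statistics.NormalDist`.

```python
import math
from statistics import NormalDist


def sample_size_for_mean(sigma, B, confidence=0.95, z=None):
    """Smallest n whose interval half-width z * sigma / sqrt(n) is at most B."""
    if z is None:
        # 1 - alpha/2 quantile of the standard normal distribution
        z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # n = (z * sigma / B)^2, always rounded up to a whole observation
    return math.ceil((z * sigma / B) ** 2)


# Factory example: sigma = 6, B = 1, table value z = 2.58
print(sample_size_for_mean(sigma=6, B=1, z=2.58))           # 240
# Same example with the unrounded 99% quantile (about 2.5758)
print(sample_size_for_mean(sigma=6, B=1, confidence=0.99))  # 239
```

The two answers differ by one because 2.58 is a rounded-up table value; the rounded value errs on the side of a slightly larger sample.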


 

Estimating $\pi$ with a $100(1-\alpha)$% confidence interval of length 2B

Here we derive the sample size needed to obtain an interval of length 2B, where B is the largest possible distance between any $\pi$ in the confidence interval and the sample proportion p. Recall that

\[ B = z_{1-\alpha/2}\,\sqrt{\frac{p(1-p)}{n}}, \]

where $z_{1-\alpha/2}$ is the $1-\alpha/2$ quantile of the standard normal distribution. Solving for n we obtain

\[ n = \frac{z_{1-\alpha/2}^2\,p(1-p)}{B^2}. \]

There is a problem with this equation: the value of p depends on n, because p is computed from the very sample whose size we are trying to choose. However, for p between 0 and 1, p(1-p) has a maximum value of 0.25. If we plug this maximum into our equation, we get

\[ n = \frac{0.25\,z_{1-\alpha/2}^2}{B^2} = \frac{z_{1-\alpha/2}^2}{4B^2}. \]

We are guaranteed that if we take a sample of size n, our confidence interval will be no wider than 2B.
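The bound on p(1-p) used above follows from completing the square, which also shows where the maximum is attained:

```latex
\[
p(1-p) = p - p^2 = \frac{1}{4} - \left(p - \frac{1}{2}\right)^2 \le \frac{1}{4},
\qquad \text{with equality exactly when } p = \tfrac{1}{2}.
\]
```

So using 0.25 in place of p(1-p) can only overstate the required n, never understate it.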

EXAMPLE: A produce supplier claims that 75% of his tomatoes will be ripe upon arrival at a distribution center. To test this claim, a random sample of tomatoes was selected from a shipment. Let $\pi$ denote the true proportion of ripe tomatoes in the particular shipment. What does n have to be to get a 95% confidence interval of length no more than 0.1?

Here 2B = 0.1, so B = 0.05, and $z_{1-\alpha/2} = z_{.975} = 1.96$:

\[ n = \frac{(1.96)^2}{4(0.05)^2} = 384.16 \]

We would need n=385 to get a 95% confidence interval whose length is at most 0.1. Note again that we always round up.
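The same computation can be sketched in Python. This is a minimal illustration, not a definitive implementation: the function name `sample_size_for_proportion` is made up for this sketch. The optional `p` argument is an addition of the sketch; it defaults to 0.5, which gives the worst-case value p(1-p) = 0.25 used in the text.

```python
import math
from statistics import NormalDist


def sample_size_for_proportion(B, confidence=0.95, p=0.5, z=None):
    """Smallest n whose half-width z * sqrt(p(1-p)/n) is at most B.

    With the default p = 0.5, p * (1 - p) is at its maximum of 0.25,
    so the result is the conservative (worst-case) sample size.
    """
    if z is None:
        # 1 - alpha/2 quantile of the standard normal distribution
        z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # n = z^2 * p(1-p) / B^2, always rounded up
    return math.ceil(z ** 2 * p * (1 - p) / B ** 2)


# Tomato example: interval length 0.1 means B = 0.05, table value z = 1.96
print(sample_size_for_proportion(B=0.05, z=1.96))  # 385
```

Passing a planning value such as p=0.75 (the supplier's claim) gives a smaller n, but then the guarantee on the interval's width no longer holds if the sample proportion turns out closer to 0.5.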


 


Computer Lab for Week 6

Applicable StataQuest Commands:

Data -> Generate/Replace -> Random numbers, to generate random Normals

Summaries -> Confidence intervals, to generate t confidence intervals for data you generated or data from a file
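StataQuest is menu-driven, so the following is not StataQuest syntax. It is a minimal Python sketch of the same kind of exercise under simplifying assumptions: it generates random Normal samples and checks how often an interval covers the true mean. It uses the known-sigma interval $\bar{X} \pm z_{1-\alpha/2}\,\sigma/\sqrt{n}$ rather than the t interval the menu produces. The function name `coverage_rate`, the seed, and the values mu=40, sigma=6, n=49 are illustrative choices, not part of the lab.

```python
import math
import random
from statistics import NormalDist, fmean


def coverage_rate(mu, sigma, n, confidence=0.95, repetitions=2000, seed=0):
    """Fraction of simulated samples whose z interval contains the true mean."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z * sigma / math.sqrt(n)  # this is B in the notation above
    hits = 0
    for _ in range(repetitions):
        # Draw one sample of n Normal(mu, sigma) observations
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = fmean(sample)
        if xbar - half_width <= mu <= xbar + half_width:
            hits += 1
    return hits / repetitions


print(coverage_rate(mu=40, sigma=6, n=49))
```

The printed fraction should land close to the confidence level, which is the "in repeated samples" interpretation of a confidence interval.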

 

Concept Lab for Week 6

 
Ch 10: Minimum Variance Estimation
Ch 8: Z, t, Chi-square, F
Ch 9: Sampling Distributions
Ch 12: Interpreting Confidence Intervals


 


 


 

 

The webmaster and author of the Math Help site is Graeme McRae.