
Here's what you'll find in this section:

• Sampling Distributions

There are two basic questions asked by inferential statistics:

1. How close is the value of a statistic to the corresponding parameter of the entire population? For example, if we have a sample of 30 elements from a population and we compute the sample mean $\bar{x}$, we would like to know how far this might be from the mean $\mu$ of the entire population.
2. In many cases, someone has hypothesized a particular value for the parameter of a population or some relationship between the parameters of two or more populations. Is the hypothesis consistent with the data? For example, experience may show that the mean IQ of all people is 100, and someone may want to test whether a particular teaching method leads to a higher mean IQ. Similarly, someone may wonder whether the mean IQ of men (call it $\mu_1$) is the same as that of women (call it $\mu_2$).
Both of these inferential questions are answered using an idea called the sampling distribution of a transformed statistic, which we study in this section.

• Statistics and Random Samples

A random sample of size n from a population is a set of n elements from the population that are chosen in such a way that every set of n elements has the same probability of being chosen.
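As a rough illustration of this definition (not the StataQuest procedure used in the course), the following Python sketch draws a simple random sample without replacement. The function name `simple_random_sample` and the population of integers 1 to 100 are made up for the example.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n distinct elements so that every n-element subset is equally likely."""
    rng = random.Random(seed)          # a seed makes the draw reproducible
    return rng.sample(population, n)   # sampling without replacement

# Hypothetical population: the integers 1..100
population = list(range(1, 101))
sample = simple_random_sample(population, 10, seed=1)
print(sample)
```

Passing a different `seed` (or none) gives a different sample; each set of 10 elements has the same chance of being chosen.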

Computers are very good at selecting random samples, and we will use StataQuest throughout the course to choose samples.

A statistic is a number calculated from a sample. Examples include the sample mean $\bar{x}$, the sample variance $s^2$, and so on.
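To make the idea concrete, here is a minimal Python sketch that computes two statistics from a sample. The data values are invented for illustration, and the sample variance is assumed to use the usual n-1 divisor.

```python
import statistics

# Made-up sample of 8 measurements (for illustration only)
sample = [98, 105, 110, 93, 101, 99, 107, 95]

x_bar = statistics.mean(sample)    # sample mean
s2 = statistics.variance(sample)   # sample variance, divides by n - 1

print("sample mean:", x_bar)
print("sample variance:", s2)
```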

• Sampling Schemes and Inferences from Samples

Much of what we do in this course consists of

1. Taking a random sample (or samples) from a population (or populations),
2. Calculating a statistic (or statistics),
3. Making inferences about parameters of the whole population from the corresponding statistics calculated from the samples.

In particular, we consider the following situations:

• The Central Limit Theorem

One remarkable fact about the normal distribution is that if we took many samples of size n from a population having mean $\mu$ and variance $\sigma^2$ (with any distribution we want), then the population of $\bar{x}$'s would be approximately normally distributed with mean $\mu$ and variance $\sigma^2/n$. The larger n is, the better the approximation is.

These facts are known collectively as the Central Limit Theorem. They allow us to make inferences about population means using the normal distribution no matter what distribution the sampled population has. See the "Central Limit Theorem" concept lab for more about this.
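The theorem can be checked informally by simulation. The sketch below is an illustration only: it assumes an exponential population with mean 1 and variance 1 (a clearly non-normal choice made for this example), draws many samples of size n = 30, and compares the mean and variance of the sample means with the population mean and with the population variance divided by n. The helper name `sample_means` is hypothetical.

```python
import random
import statistics

def sample_means(draw, n, num_samples, seed=0):
    """Return the means of num_samples random samples, each of size n."""
    rng = random.Random(seed)
    return [
        statistics.fmean([draw(rng) for _ in range(n)])
        for _ in range(num_samples)
    ]

# Assumed population: exponential with rate 1 (mean 1, variance 1)
n = 30
means = sample_means(lambda rng: rng.expovariate(1.0), n, num_samples=5000)

print("mean of sample means:", statistics.fmean(means))         # near 1
print("variance of sample means:", statistics.variance(means))  # near 1/30
```

Increasing `n` should make a histogram of `means` look more nearly normal, in line with the statement that the approximation improves as n grows.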

• Normal Approximation to Binomial

A particularly useful example of the Central Limit Theorem arises when we sample from a 0-1 population. In this case, the number of 1's observed has the binomial distribution, which is difficult to make calculations from. But notice that $\bar{x}$ for the sample is in fact the sample proportion $p$, and the Central Limit Theorem says that $p$ is approximately normal with mean equal to the mean of the 0-1 population (also known as $\pi$, the proportion of 1's in the population) and variance $\pi(1-\pi)/n$. See the "Z, t, Chi-square, F" concept lab.
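As a rough check of this approximation, the Python sketch below compares an exact binomial probability with its normal approximation. The values n = 100 and a population proportion of 0.3 are arbitrary choices for the example, the helper `binomial_cdf` is hypothetical, and the approximating normal curve is assumed to have mean equal to the population proportion and variance equal to proportion times (1 - proportion) divided by n.

```python
import math
from statistics import NormalDist

def binomial_cdf(k, n, prop):
    """Exact P(number of 1's <= k) for n draws from a 0-1 population."""
    return sum(
        math.comb(n, j) * prop**j * (1 - prop) ** (n - j)
        for j in range(k + 1)
    )

n, prop = 100, 0.3   # arbitrary illustrative values

# Probability that the sample proportion is at most 0.35 (at most 35 ones)
exact = binomial_cdf(35, n, prop)

# Normal approximation for the sample proportion (no continuity correction)
approx = NormalDist(mu=prop, sigma=math.sqrt(prop * (1 - prop) / n)).cdf(0.35)

print("exact binomial:", exact)
print("normal approximation:", approx)
```

The two numbers differ slightly because the binomial is discrete; the gap shrinks as n grows.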

• The Transformed Statistics: Z, t, $\chi^2$, F

The basic idea of statistical inference is that we can determine (using what are called sampling distributions) the likely values of a number that measures how far a statistic is from the corresponding parameter. For example, we can measure how far the statistic $\bar{x}$ is from the parameter $\mu$ by calculating the number (called a "transformed statistic")

$$Z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$$

and noting that if $\bar{x}$ is close to $\mu$, then Z should be close to 0. Similarly, we can measure how close $s^2$ is to $\sigma^2$ by calculating

$$\chi^2 = \frac{(n-1)s^2}{\sigma^2}$$

which should be close to n-1 if $s^2$ is close to $\sigma^2$ (we will see in a minute why we use the symbols Z and $\chi^2$ to represent the numbers).
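The two transformed statistics can be computed directly from a sample. The sketch below assumes the standard forms, Z = (sample mean - population mean) / (population standard deviation / sqrt(n)) and chi-square = (n - 1) * sample variance / population variance. The function names, the data, and the parameter values 100 and 15 are all invented for illustration.

```python
import math
import statistics

def z_statistic(sample, mu, sigma):
    """Z = (x_bar - mu) / (sigma / sqrt(n)); near 0 when x_bar is near mu."""
    n = len(sample)
    return (statistics.fmean(sample) - mu) / (sigma / math.sqrt(n))

def chi_square_statistic(sample, sigma):
    """chi^2 = (n - 1) * s^2 / sigma^2; near n - 1 when s^2 is near sigma^2."""
    n = len(sample)
    return (n - 1) * statistics.variance(sample) / sigma**2

# Made-up sample and assumed population parameters
sample = [98, 105, 110, 93, 101, 99, 107, 95]
z = z_statistic(sample, mu=100, sigma=15)
chi2 = chi_square_statistic(sample, sigma=15)

print("Z:", z)
print("chi-square:", chi2, "(compare with n - 1 =", len(sample) - 1, ")")
```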

In the table below, we write down a number of transformed statistics and what they should be close to. You may wonder why we use these transformations rather than some simple measure of distance such as $\bar{x} - \mu$. The answer is that statisticians have learned over the past 100 years that the more complicated transformations listed in the table allow them to find the desired likely values while simple distance measures are much more difficult to work with.

• Sampling Distributions of the Transformed Statistics

So what good are these transformed statistics? As we said, we know what they should be close to if our statistic is close to the true parameter. The miracle is that (if certain assumptions are met) statisticians have determined mathematically the intervals of the real line into which a transformed statistic will fall with a specified probability.

For example, the first transformed statistic is labeled Z because statisticians have shown that if the population is normally distributed, then the transformed statistic has the Z distribution (the standard normal curve). Thus if we repeatedly selected random samples of size n and calculated Z for each one, about 95% of the samples would have a Z between -1.96 and 1.96. (You will use the "Sampling Distribution" concept lab to experiment with this idea.)
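The 95% claim can be tried out with a small simulation, in the spirit of the concept lab but not a reproduction of it. This sketch assumes a normal population with mean 100 and standard deviation 15 and samples of size 25 (all illustrative choices), and takes Z to be (sample mean - population mean) / (population standard deviation / sqrt(n)). The function name `fraction_within` is hypothetical.

```python
import math
import random
import statistics

def fraction_within(num_samples, n, mu, sigma, z_crit=1.96, seed=0):
    """Fraction of simulated samples whose Z falls in [-z_crit, z_crit]."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(num_samples):
        x_bar = statistics.fmean([rng.gauss(mu, sigma) for _ in range(n)])
        z = (x_bar - mu) / (sigma / math.sqrt(n))
        if -z_crit <= z <= z_crit:
            hits += 1
    return hits / num_samples

# Assumed normal population: mean 100, standard deviation 15
frac = fraction_within(10000, n=25, mu=100, sigma=15)
print("fraction of samples with -1.96 <= Z <= 1.96:", frac)  # close to 0.95
```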

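Thus, how close is $\bar{x}$ to $\mu$ in this situation? We saw earlier this week that 95% of the area under a Z curve falls between -1.96 and 1.96. This tells us that 95% of all samples will have

$$-1.96 \le \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} \le 1.96$$

that is, 95% of all samples will have $\bar{x}$ within $1.96\,\sigma/\sqrt{n}$ of $\mu$. For example, 95% of all samples of 25 IQ's (remember that IQ's are thought to be normally distributed with $\mu = 100$ and $\sigma = 15$) will have $\bar{x}$ in $100 \pm 1.96(15/\sqrt{25}) = 100 \pm 5.88$, that is, 95% of all samples will have $\bar{x}$ within about 5.88 of $\mu$ (1.96 standard errors, where the standard error is $15/\sqrt{25} = 3$).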
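The arithmetic for the IQ example can be sketched in a few lines of Python. This assumes a population mean of 100 and a population standard deviation of 15 (the conventional IQ scale), so that the standard error for samples of 25 is 15 / sqrt(25) = 3 and the 95% margin is about 1.96 standard errors.

```python
import math
from statistics import NormalDist

mu, sigma, n = 100, 15, 25             # assumed IQ parameters and sample size

z_crit = NormalDist().inv_cdf(0.975)   # about 1.96 for the middle 95%
se = sigma / math.sqrt(n)              # standard error of the sample mean
margin = z_crit * se                   # half-width of the 95% interval

low, high = mu - margin, mu + margin
print("standard error:", se)
print("95% of sample means fall between", round(low, 2), "and", round(high, 2))
```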

• Computer Lab

Applicable StataQuest Commands:

Data > Generate/Replace > Random numbers: to generate random Normals, Binomials, etc.

Data > Generate/Replace > Formula: to generate z-scores

Calculator > Statistical tables > Normal: to find probabilities

Calculator > Inverse statistical tables > Normal: to find z-scores

• Concept Lab

• Ch 8: Z, t, Chi-square, F Normal Curves
• Ch 5: Sampling From 0-1 Populations
• Ch 2: Random Sampling
• Ch 7: Central Limit Theorem


The webmaster and author of this Math Help site is Graeme McRae.