In the previous chapter, we found that by computing a confidence interval, we
could obtain a range of likely values for the population parameter we're
estimating. Not only that, but we could do a heuristic "test" to see whether
claims were correct by checking whether the confidence interval captured the
claimed value. For example, suppose a manufacturer claims that the average
lifetime of an electronic component is 32 hours. We could take a sample of n
electronic components and measure their lifetimes. From the sample mean and
variance, we can compute a 95% confidence interval. If 32 fell within our
interval, we said we would believe the manufacturer's claim; if it didn't fall
within the interval, we wouldn't believe the claim. Hypothesis testing is a
formal way of testing claims such as these and is closely related to confidence
intervals.
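As a minimal sketch of this confidence-interval check, here is what the
manufacturer example might look like in Python. The lifetime data here are
made up for illustration, and the large-sample critical value 1.96 is used
for a 95% interval:

```python
import math
import statistics

# Hypothetical lifetimes (hours) for a sample of n = 20 components
lifetimes = [31.2, 33.5, 30.8, 32.9, 31.7, 34.1, 29.8, 32.4, 31.1, 33.0,
             30.5, 32.2, 33.8, 31.9, 30.1, 32.7, 31.4, 33.3, 30.9, 32.0]

n = len(lifetimes)
xbar = statistics.mean(lifetimes)
s = statistics.stdev(lifetimes)          # sample standard deviation

# Large-sample 95% confidence interval: xbar +/- 1.96 * s / sqrt(n)
margin = 1.96 * s / math.sqrt(n)
lo, hi = xbar - margin, xbar + margin

claimed = 32
print(f"95% CI: ({lo:.2f}, {hi:.2f})")
print("claim is plausible" if lo <= claimed <= hi else "claim is doubtful")
```

If the claimed value 32 lands inside the interval, the heuristic "test"
does not contradict the manufacturer.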


Heuristic Introduction to Hypothesis Testing
Hypothesis testing in science is a lot like the criminal court system in
the United States. How do we decide guilt?
- Assume innocence until "proven" guilty.
- Evidence is presented at a trial.
- Proof has to be "beyond a reasonable doubt."
A jury's possible decisions are "guilty" and "not guilty."
Note that a jury cannot declare somebody "innocent," just "not guilty."
This is an important point. Do juries ever make mistakes?
- If a person is really innocent, but the jury decides (s)he's guilty,
then they've sent an innocent person to jail. (In the language introduced
below, this is a Type I error.)
- If a person is really guilty, but the jury finds him/her not guilty, a
criminal is walking free on the streets. (This is a Type II error.)
In our criminal court system, a Type I error is considered more serious
than a Type II error, so we protect against a Type I error to the detriment
of a Type II error. The same is true in statistics.


Null and Alternative Hypotheses

Science, in general, operates by disproving unsatisfactory
hypotheses and proposing new and improved hypotheses which are testable.
The approach we take in statistics is exactly this scientific method. We
start with a hypothesis which we assume is correct. We call
this the null hypothesis, or H0, and our goal is to reject H0
in favor of the alternative hypothesis, Ha.

Type I and Type II Errors

The kinds of errors we can make are
- Type I: Reject H0 when H0 is really true.
- Type II: Fail to reject H0 when H0 is really false.
It is important to emphasize that we can either reject H0 or fail to
reject H0 (in the same sense, a jury can only find someone "guilty" or "not
guilty," not "innocent"). Some books will call the latter accepting H0,
but we will try to be careful in using terminology.
In the one- and two-sample situations, the alternative hypothesis Ha will
always take one of three forms:
- Ha: μ ≠ μ0
- Ha: μ > μ0
- Ha: μ < μ0
Note that hypotheses are always about population parameters. The
first hypothesis above, Ha: μ ≠ μ0, gives a two-sided or two-tailed test,
while the second and third are one-sided or one-tailed hypotheses.

Review: Hypothesis Testing Facts

- Hypotheses:
  - Null Hypothesis H0: The accepted explanation, the status quo. This
is what we're trying to disprove.
  - Alternative Hypothesis Ha: What the researcher or scientist thinks
might really be going on, a (possibly) better explanation than the null.
- Test:
  - The goal of the test is to reject H0 in favor of Ha. We do this by
calculating a test statistic and comparing its value with a value from a
table in the book, the critical value.
  - If our test statistic is more extreme than our critical value, then it
falls within the rejection region of our test and we reject H0. We can set
up the rejection region before computing our test statistic.
- Decisions:
  - Reject H0.
  - Fail to reject H0.
- Errors:
  - Type I: Reject H0 when H0 is really true.
  - Type II: Fail to reject H0 when H0 is really false.


General Method for Hypothesis Testing

We will generally use the following steps in hypothesis testing:
1. Identify from a word problem which category we're in (that is, what the
appropriate test statistic is).
2. Determine H0 and Ha.
3. Set up the rejection region by looking up the critical value in the
appropriate table.
4. Calculate the test statistic.
5. Draw a conclusion: reject or fail to reject H0.
6. Interpret the results: say in words what the conclusion means.
Thus, just as we did with confidence intervals, all we have to do is
decide which test to use in which situation.
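As a hedged sketch, here is how these steps might play out for a one-sample
z-test with known population standard deviation; all of the numbers below
(sample size, sample mean, σ, and the claimed mean μ0) are hypothetical:

```python
import math

# Hypothetical setup: test H0: mu = 32 against Ha: mu != 32 at alpha = 0.05,
# assuming the population standard deviation sigma is known.
n, xbar, sigma, mu0 = 36, 31.2, 1.8, 32.0

# Set up the rejection region: for a two-sided z-test at alpha = 0.05,
# the critical value is z_{alpha/2} = 1.96, so we reject H0 if |z| > 1.96.
z_crit = 1.96

# Calculate the test statistic.
z = (xbar - mu0) / (sigma / math.sqrt(n))

# Draw the conclusion.
conclusion = "reject H0" if abs(z) > z_crit else "fail to reject H0"
print(f"z = {z:.2f}: {conclusion}")
```

Interpreting the result in words, as the last step requires, a rejection
here would mean the data are inconsistent with a true mean lifetime of 32.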

Reporting the p-value of a test

Often, statisticians will report their test result as a p-value.
The p-value is the chance of obtaining a test statistic more extreme than
the observed one when H0 is true. The rule is always that we reject H0
if the p-value is less than α, the significance level of the test.
The formula for the p-value is given in the next section. See the Tests
of Significance concept lab for more about p-values.
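A p-value for a two-sided z-test can be sketched as follows; the observed
test statistic here is a made-up number, and the standard normal CDF is
built from the error function so no extra libraries are needed:

```python
import math

def std_normal_cdf(x):
    # Phi(x), the standard normal CDF, via the error function
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2)))

z = -2.67                      # hypothetical observed test statistic
# Two-sided p-value: probability of a statistic at least this extreme
p_value = 2 * (1 - std_normal_cdf(abs(z)))

alpha = 0.05
print(f"p-value = {p_value:.4f}")
print("reject H0" if p_value < alpha else "fail to reject H0")
```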

Formulas

The formulas for the 11 cases considered in the "Calculating Tests of
Hypotheses" concept lab are given in the table at the end of this chapter.
For some examples, see the chapter for that lab.

The webmaster and author of this Math Help site is
Graeme McRae.