HYPOTHESIS TESTS
We are interested in the population, but the sample is in our hands.
Some assumption is made about the population (e.g. the value of $\mu$ and/or $\sigma$), and this assumption is accepted or rejected based on the data.
Could the data come from a distribution …? E.g. is $\mu = \mu_0$?
Null hypothesis: $H_0: \mu = \mu_0$
Alternative hypothesis: $H_1: \mu \neq \mu_0$
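z-test
$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$
$z_0 = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$ (test statistic)
If $H_0$ is true, $z_0 = z$.
If $z_0$ takes its value in the usual range, $H_0$ is accepted.
$P\left(-z_{\alpha/2} < \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} \le z_{\alpha/2}\right) = 1 - \alpha$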
$\alpha$ is the significance level.
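The decision rule of the z-test can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `z_test` and the numbers in the usage line are hypothetical, and only the standard library (`statistics.NormalDist`) is used to obtain $z_{\alpha/2}$.

```python
from math import sqrt
from statistics import NormalDist


def z_test(x_bar, mu0, sigma, n, alpha=0.05):
    """Two-sided z-test of H0: mu = mu0 when sigma is known."""
    z0 = (x_bar - mu0) / (sigma / sqrt(n))        # test statistic
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # z_{alpha/2}
    accept = -z_crit < z0 <= z_crit               # is z0 in the usual range?
    return z0, z_crit, accept


# Hypothetical numbers, chosen only to exercise the function
z0, z_crit, accept = z_test(x_bar=10.3, mu0=10.0, sigma=0.5, n=25)
print(f"z0 = {z0:.2f}, z_crit = {z_crit:.2f}, accept H0: {accept}")
```

The function returns the critical value together with the decision so that the caller can see how far the statistic lies from the boundary of the usual range.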
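Region of acceptance
$P\left(-z_{\alpha/2} < z_0 \le z_{\alpha/2} \mid H_0\right) = 1 - \alpha$
$P\left(-z_{\alpha/2} < \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} \le z_{\alpha/2} \mid H_0\right) = 1 - \alpha$
$P\left(\mu_0 - z_{\alpha/2}\,\sigma/\sqrt{n} < \bar{x} \le \mu_0 + z_{\alpha/2}\,\sigma/\sqrt{n}\right) = 1 - \alpha$
[Figure: density of $z_0$ under $H_0$; accept between $-z_{\alpha/2}$ and $z_{\alpha/2}$, reject in the two tails of area $\alpha/2$ each]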
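Confidence interval and hypothesis test
$P\left(-z_{\alpha/2} < \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} \le z_{\alpha/2} \mid H_0\right) = 1 - \alpha$
$P\left(\bar{x} - z_{\alpha/2}\,\sigma/\sqrt{n} < \mu_0 \le \bar{x} + z_{\alpha/2}\,\sigma/\sqrt{n}\right) = 1 - \alpha$
Confidence interval for $\mu$:
$P\left(\bar{x} - z_{\alpha/2}\,\sigma/\sqrt{n} < \mu \le \bar{x} + z_{\alpha/2}\,\sigma/\sqrt{n}\right) = 1 - \alpha$
If the confidence interval contains the hypothesised value $\mu_0$, $H_0$ is accepted.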
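The link between the confidence interval and the test can be illustrated with a short Python sketch. The helper names and the numbers are hypothetical; the point is only that checking whether the interval contains $\mu_0$ gives the same decision as checking whether $z_0$ lies in the region of acceptance (the two rules can differ only exactly on a boundary, which has probability zero for continuous data).

```python
from math import sqrt
from statistics import NormalDist


def z_confidence_interval(x_bar, sigma, n, alpha=0.05):
    """(1 - alpha) confidence interval for mu when sigma is known."""
    half_width = NormalDist().inv_cdf(1 - alpha / 2) * sigma / sqrt(n)
    return x_bar - half_width, x_bar + half_width


def accept_by_interval(x_bar, mu0, sigma, n, alpha=0.05):
    """Accept H0 if the confidence interval contains mu0."""
    lower, upper = z_confidence_interval(x_bar, sigma, n, alpha)
    return lower < mu0 <= upper


def accept_by_statistic(x_bar, mu0, sigma, n, alpha=0.05):
    """Accept H0 if z0 lies in the region of acceptance."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    z0 = (x_bar - mu0) / (sigma / sqrt(n))
    return -z_crit < z0 <= z_crit


# Hypothetical sample means tested against mu0 = 10 (sigma = 0.5, n = 25)
means = [9.7, 9.9, 10.1, 10.3]
by_interval = [accept_by_interval(m, 10.0, 0.5, 25) for m in means]
by_statistic = [accept_by_statistic(m, 10.0, 0.5, 25) for m in means]
print(by_interval, by_statistic)
```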
Example 1
The mass of an object is measured with 4 repeated measurements.
The sample mean is 5.0125 g.
From historical data the variance is known to be $\sigma^2 = 10^{-4}\ \mathrm{g}^2$. May we believe (based on the data) that the expected value (the true mass of the object if the balance is unbiased) is 5.0000 g?
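$H_0: \mu = \mu_0$, $H_1: \mu \neq \mu_0$
$P\left(-z_{\alpha/2} < \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} \le z_{\alpha/2}\right) = 1 - \alpha$
Is the value of the test statistic in the region of acceptance?
E.g. in case of $H_1: \mu \neq \mu_0$:
[Figure: density of $z_0$; accept between $-z_{\alpha/2}$ and $z_{\alpha/2}$, reject in the two tails of area $\alpha/2$ each]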
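$\bar{x} = 5.0125$, $\sigma^2 = 10^{-4}$, $n = 4$, $\alpha = 0.05$
$H_0: \mu = \mu_0 = 5.0000$, $H_1: \mu \neq \mu_0 = 5.0000$
$z_0 = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \frac{5.0125 - 5.0000}{0.01/\sqrt{4}} = 2.5$
$z_{\alpha/2} = z_{0.025} = 1.96$
Since $z_0 = 2.5 > z_{\alpha/2} = 1.96$, the test statistic falls outside the region of acceptance and $H_0$ is rejected.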
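The arithmetic of Example 1 can be checked with a short Python sketch. The data are those of the example; the variable names are illustrative, and the region of acceptance for $\bar{x}$ and the confidence interval for $\mu$ are computed alongside the test statistic as alternative views of the same decision.

```python
from math import sqrt
from statistics import NormalDist

# Data of Example 1
x_bar, mu0, variance, n, alpha = 5.0125, 5.0000, 1e-4, 4, 0.05

std_err = sqrt(variance) / sqrt(n)            # sigma / sqrt(n)
z0 = (x_bar - mu0) / std_err                  # test statistic
z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # z_{alpha/2}
accept_h0 = -z_crit < z0 <= z_crit

# Region of acceptance expressed for the sample mean
region = (mu0 - z_crit * std_err, mu0 + z_crit * std_err)
# Confidence interval for mu centred on the sample mean
ci = (x_bar - z_crit * std_err, x_bar + z_crit * std_err)

print(f"z0 = {z0:.3f}, z_crit = {z_crit:.3f}, accept H0: {accept_h0}")
print(f"region of acceptance for x_bar: ({region[0]:.4f}, {region[1]:.4f})")
print(f"confidence interval for mu:     ({ci[0]:.4f}, {ci[1]:.4f})")
```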
Error of first and second kind
Decision          The H0 hypothesis is
                  accepted (“fail to reject”)    rejected
H0 is true        Proper decision                Error of first kind (α)
H0 is false       Error of second kind (β)       Proper decision
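The two error probabilities can be estimated by simulation. The following Python sketch is an illustration under stated assumptions: it reuses $\sigma = 0.01$, $n = 4$ and $\mu_0 = 5.0$ from Example 1, while the alternative mean 5.01, the number of trials, the seed and the function name `rejection_rate` are hypothetical choices made only for this demonstration.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def rejection_rate(mu_true, mu0, sigma, n, alpha=0.05, trials=20000, seed=1):
    """Fraction of simulated samples for which the z-test rejects H0."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejected = 0
    for _ in range(trials):
        sample = [rng.gauss(mu_true, sigma) for _ in range(n)]
        z0 = (fmean(sample) - mu0) / (sigma / sqrt(n))
        if not (-z_crit < z0 <= z_crit):
            rejected += 1
    return rejected / trials


# H0 true: every rejection is an error of the first kind
alpha_hat = rejection_rate(mu_true=5.00, mu0=5.0, sigma=0.01, n=4)
# H0 false (assumed true mean 5.01): every acceptance is an error of the second kind
beta_hat = 1 - rejection_rate(mu_true=5.01, mu0=5.0, sigma=0.01, n=4)

print(f"estimated alpha: {alpha_hat:.3f}")
print(f"estimated beta:  {beta_hat:.3f}")
```

The estimate of α should settle near the chosen significance level, whereas β depends on how far the true mean lies from $\mu_0$.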
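Probability of committing an error of second kind
[Figure: densities $f(z_0 \mid H_0)$ and $f(z_0 \mid H_1)$, shifted apart by $(\mu_1 - \mu_0)/(\sigma/\sqrt{n})$; the two tails of $f(z_0 \mid H_0)$ outside the region of acceptance have area $\alpha/2$ each, and the area of $f(z_0 \mid H_1)$ inside the region of acceptance is $\beta$]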
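The probability $\beta$ can be written in closed form for the z-test. The short derivation below is a sketch that assumes the true expected value is $\mu_1$ and that $\sigma$ is known; the symbols $\delta$ and $\Phi$ (the standard normal distribution function) are introduced here for convenience and do not appear in the slides.

```latex
% If the true mean is \mu_1, then z_0 is normal with unit variance and mean \delta:
\delta = \frac{\mu_1 - \mu_0}{\sigma/\sqrt{n}}, \qquad z_0 \sim N(\delta, 1)

% beta is the probability that z_0 still falls in the region of acceptance:
\beta = P\left(-z_{\alpha/2} < z_0 \le z_{\alpha/2} \mid \mu = \mu_1\right)
      = \Phi\left(z_{\alpha/2} - \delta\right) - \Phi\left(-z_{\alpha/2} - \delta\right)
```

For $\mu_1 = \mu_0$ the shift $\delta$ is zero and the expression reduces to $1 - \alpha$.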
OC (operating characteristic) curve
[Figure: $\beta$ (0.0 to 1.0) plotted against $\mu_1$ (5.000 to 5.020)]
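Points of an OC curve can be tabulated with a few lines of Python. This is a sketch under assumptions: the settings of Example 1 ($\mu_0 = 5.0$, $\sigma = 0.01$, $n = 4$, $\alpha = 0.05$) are assumed to be the ones behind the plotted curve because the $\mu_1$ axis starts at 5.000, and the function name `beta` is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def beta(mu1, mu0, sigma, n, alpha=0.05):
    """Probability of accepting H0: mu = mu0 when the true mean is mu1 (z-test)."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = (mu1 - mu0) / (sigma / sqrt(n))
    return std_normal.cdf(z_crit - shift) - std_normal.cdf(-z_crit - shift)


# Grid taken from the axis of the OC curve; other settings assumed from Example 1
mu1_grid = [5.000, 5.005, 5.010, 5.015, 5.020]
oc_curve = [(mu1, beta(mu1, mu0=5.0, sigma=0.01, n=4)) for mu1 in mu1_grid]

for mu1, b in oc_curve:
    print(f"mu1 = {mu1:.3f}   beta = {b:.3f}")
```

The curve starts at $1 - \alpha$ for $\mu_1 = \mu_0$ and falls towards zero as the true mean moves away from the hypothesised one.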
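One-sample t test
$H_0: \mu = \mu_0$, $H_1: \mu \neq \mu_0$
$z_0 = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$ ($\sigma$ known)
$t_0 = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$ ($\sigma$ unknown, estimated by the sample standard deviation $s$)
$P\left(-t_{\alpha/2} < t_0 \le t_{\alpha/2}\right) = P\left(-t_{\alpha/2} < \frac{\bar{x} - \mu_0}{s/\sqrt{n}} \le t_{\alpha/2}\right) = 1 - \alpha$
$P\left(\bar{x} - t_{\alpha/2}\,s/\sqrt{n} < \mu_0 \le \bar{x} + t_{\alpha/2}\,s/\sqrt{n}\right) = 1 - \alpha$
If the CI contains the hypothesised value $\mu_0$, $H_0$ is accepted.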
Example 2
Checking the bias of a gauge
$\mu_0 = x_\mathrm{ref} = 6.0$ (standard)

 i     xi     xi−xref
 1     5.8    -0.2
 2     5.7    -0.3
 3     5.9    -0.1
 4     5.9    -0.1
 5     6.0     0.0
 6     6.1     0.1
 7     6.0     0.0
 8     6.1     0.1
 9     6.4     0.4
10     6.3     0.3
11     6.0     0.0
12     6.1     0.1
13     6.2     0.2
14     5.6    -0.4
15     6.0     0.0

$H_0: E(x) = x_\mathrm{ref}$
$H_1: E(x) \neq x_\mathrm{ref}$
$t_0 = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$
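The test of Example 2 can be reproduced with the Python standard library. This is a sketch under one loud assumption: the critical value $t_{0.025} = 2.145$ for 14 degrees of freedom is taken from a standard t table and does not appear in the slides. The data are the 15 readings of the example.

```python
from math import sqrt
from statistics import fmean, stdev

# Gauge readings of Example 2 and the reference value
x = [5.8, 5.7, 5.9, 5.9, 6.0, 6.1, 6.0, 6.1, 6.4, 6.3, 6.0, 6.1, 6.2, 5.6, 6.0]
x_ref = 6.0

# Assumption: t_{alpha/2} for alpha = 0.05 and n - 1 = 14 degrees of freedom,
# read from a t table (not part of the original example)
T_CRIT = 2.145

n = len(x)
x_bar = fmean(x)
s = stdev(x)                 # sample standard deviation (n - 1 in the denominator)
std_err = s / sqrt(n)        # standard error of the mean
t0 = (x_bar - x_ref) / std_err

accept_h0 = -T_CRIT < t0 <= T_CRIT
ci = (x_bar - T_CRIT * std_err, x_bar + T_CRIT * std_err)

print(f"mean = {x_bar:.4f}, s = {s:.4f}, std. err. = {std_err:.4f}")
print(f"t0 = {t0:.3f}, accept H0: {accept_h0}")
print(f"95% CI for the mean: ({ci[0]:.3f}, {ci[1]:.3f})")
```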
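p is the probability of obtaining this or a more extreme result if $H_0$ is true (the smallest significance level $\alpha$, i.e. probability of an error of the first kind, at which $H_0$ would be rejected).
Std. Err.: standard error of the mean.
The CI contains the hypothesised value $\mu_0 = 6.0$, so $H_0$ is accepted.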
[Figure: Mean; Whisker: Mean ± 0.95 Conf. Interval, shown for four cases I, II, III, IV]
J. H. Steiger, R.T. Fouladi: Noncentrality Interval Estimation and the Evaluation of Statistical Models, Chapter 9 in: L.L. Harlow, S.A. Mulaik, J.H. Steiger: What if there were no significance tests? Mahwah, NJ: Erlbaum (1997)
Power, statistically significant difference
[Figure: 1 Sample t-Test: Power Calculation. One Mean, t-Test (H0: Mu = Mu0). Power vs. Es (N = 15, Alpha = 0.05); horizontal axis: Standardized Effect (Es), vertical axis: Power]
Power = 1 − β: certainty of detection
Power depends on: (µ − µ0), σ, n, α
standardized effect: $E_s = \frac{\mu - \mu_0}{\sigma}$
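How power grows with the standardized effect can be sketched in Python. This is an approximation, not the calculation behind the plotted curve: σ is treated as known and the normal distribution is used, whereas a t-test power calculation relies on the noncentral t distribution and gives slightly lower values for the same n. The function name `approx_power` is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(es, n, alpha=0.05):
    """Normal-approximation power of a two-sided test for standardized effect es."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = es * sqrt(n)  # (mu - mu0) / (sigma / sqrt(n))
    beta = std_normal.cdf(z_crit - shift) - std_normal.cdf(-z_crit - shift)
    return 1 - beta


# Same range of standardized effects as on the power plot, with n = 15
effects = [0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
powers = [approx_power(es, n=15) for es in effects]

for es, p in zip(effects, powers):
    print(f"Es = {es:.1f}   power = {p:.3f}")
```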
The sample size (n = 15) and the probability of an error of the first kind (α = 0.05) are fixed, and σ = 0.212.
What difference (µ − µ0) can be detected with 90% probability (β = 0.1)?
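A rough answer can be obtained with the normal approximation, in which the required standardized effect is about $(z_{\alpha/2} + z_\beta)/\sqrt{n}$. The Python sketch below makes that assumption explicit: it treats σ = 0.212 as known and neglects the far rejection tail, so the noncentral-t calculation behind the power plot gives a somewhat larger difference. The function name `detectable_difference` is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def detectable_difference(sigma, n, alpha=0.05, beta=0.10):
    """Approximate |mu - mu0| detectable with power 1 - beta (normal approximation)."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # z_{alpha/2}
    z_beta = std_normal.inv_cdf(1 - beta)        # z_beta
    es = (z_alpha + z_beta) / sqrt(n)            # required standardized effect
    return es * sigma


difference = detectable_difference(sigma=0.212, n=15)
print(f"detectable difference: about {difference:.3f}")
```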