1 HYPOTHESIS TESTS
We are interested in the population, but the sample is in our hands.
An assumption is made about the population (e.g. the value of µ and/or σ), and this assumption is accepted or rejected based on the data.
Could the data come from a distribution …? E.g. is µ = µ0?
Null hypothesis: $H_0: \mu = \mu_0$
Alternative hypothesis: $H_1: \mu \neq \mu_0$
z-test
$$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$$
Test statistic:
$$z_0 = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$$
If $H_0$ is true, $z_0 = z$.
If $z_0$ takes its value in the usual range, $H_0$ is accepted:
$$P\left(-z_{\alpha/2} < \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} \le z_{\alpha/2}\right) = 1 - \alpha$$
$\alpha$ is the significance level.
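$$P\left(-z_{\alpha/2} < z_0 \le z_{\alpha/2} \mid H_0\right) = 1 - \alpha$$
$$P\left(-z_{\alpha/2} < \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} \le z_{\alpha/2} \mid H_0\right) = 1 - \alpha$$
$$P\left(\mu_0 - z_{\alpha/2}\,\sigma/\sqrt{n} < \bar{x} \le \mu_0 + z_{\alpha/2}\,\sigma/\sqrt{n}\right) = 1 - \alpha$$
[Figure: density of $z_0 = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$; the region of acceptance ("accept") lies between $-z_{\alpha/2}$ and $z_{\alpha/2}$, with a "reject" region of area $\alpha/2$ in each tail.]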
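Confidence interval and hypothesis test
Confidence interval for $\mu$:
$$P\left(\bar{x} - z_{\alpha/2}\,\sigma/\sqrt{n} < \mu \le \bar{x} + z_{\alpha/2}\,\sigma/\sqrt{n}\right) = 1 - \alpha$$
Hypothesis test:
$$P\left(-z_{\alpha/2} < \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} \le z_{\alpha/2} \mid H_0\right) = 1 - \alpha$$
$$P\left(\bar{x} - z_{\alpha/2}\,\sigma/\sqrt{n} < \mu_0 \le \bar{x} + z_{\alpha/2}\,\sigma/\sqrt{n}\right) = 1 - \alpha$$
If the confidence interval contains the hypothesised value $\mu_0$, $H_0$ is accepted.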
Example 1
The mass of an object is measured with 4 repeated measurements.
The sample mean is 5.0125 g.
From historical data the variance is known to be $\sigma^2 = 10^{-4}\ \mathrm{g}^2$. Can we believe, based on the data, that the expected value (the true mass of the object, if the balance is unbiased) is 5.0000 g?
$H_0: \mu = \mu_0$, $H_1: \mu \neq \mu_0$
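Is the value of the test statistic in the region of acceptance?
E.g. in the case of $H_1: \mu \neq \mu_0$:
$$P\left(-z_{\alpha/2} < \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} \le z_{\alpha/2}\right) = 1 - \alpha$$
[Figure: "accept" region between $-z_{\alpha/2}$ and $z_{\alpha/2}$, "reject" regions of area $\alpha/2$ in each tail.]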
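$\bar{x} = 5.0125$, $\sigma^2 = 10^{-4}$, $n = 4$, $\alpha = 0.05$
$H_0: \mu = \mu_0 = 5.0000$, $H_1: \mu \neq \mu_0 = 5.0000$
$$z_0 = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \frac{5.0125 - 5.0000}{10^{-2}/\sqrt{4}} = 2.5$$
$$z_{\alpha/2} = z_{0.025} = 1.96$$
Since $z_0 = 2.5 > z_{\alpha/2} = 1.96$, the test statistic falls outside the region of acceptance and $H_0$ is rejected.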
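The arithmetic of Example 1 can be checked with a few lines of Python. This is only a sketch using the standard library; the variable names are illustrative and not part of the original slides. It evaluates the test statistic, the critical value, and the equivalent confidence-interval check.

```python
from math import sqrt
from statistics import NormalDist

# Values taken from Example 1
x_bar = 5.0125       # sample mean, g
mu_0 = 5.0000        # hypothesised expected value, g
sigma = sqrt(1e-4)   # known standard deviation, g (variance 1e-4 g^2)
n = 4                # number of repeated measurements
alpha = 0.05         # significance level

std_err = sigma / sqrt(n)
z_0 = (x_bar - mu_0) / std_err                 # test statistic
z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # z_{alpha/2}

# Decision from the test statistic: is z_0 in the region of acceptance?
accept_by_test = -z_crit < z_0 <= z_crit

# Decision from the confidence interval: does it contain mu_0?
ci_low = x_bar - z_crit * std_err
ci_high = x_bar + z_crit * std_err
accept_by_ci = ci_low < mu_0 <= ci_high

print(f"z_0 = {z_0:.3f}, z_crit = {z_crit:.3f}")
print(f"CI for mu: ({ci_low:.4f}, {ci_high:.4f})")
print("H0 accepted" if accept_by_test else "H0 rejected")
```

Both routes give the same decision here, which illustrates the link between the confidence interval and the hypothesis test.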
Error of first and second kind

                 Decision: H0 is accepted      Decision: H0 is rejected
                 (“fail to reject”)
H0 is true       Proper decision               Error of first kind (α)
H0 is false      Error of second kind (β)      Proper decision
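The meaning of α as the probability of an error of first kind can be illustrated by simulation. The sketch below is not from the original slides: it assumes normally distributed measurements with the settings of Example 1, draws many samples for which H0 is true, and counts how often the z-test rejects H0 anyway. The seed and the number of trials are arbitrary choices.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

random.seed(0)  # fixed seed so that the run is repeatable (arbitrary choice)

# Assumed settings, borrowed from Example 1
mu_0, sigma, n, alpha = 5.0, 0.01, 4, 0.05
z_crit = NormalDist().inv_cdf(1 - alpha / 2)
trials = 20_000  # arbitrary number of simulated samples

rejections = 0
for _ in range(trials):
    # H0 is true here: the data really come from N(mu_0, sigma^2)
    sample = [random.gauss(mu_0, sigma) for _ in range(n)]
    z_0 = (fmean(sample) - mu_0) / (sigma / sqrt(n))
    if not (-z_crit < z_0 <= z_crit):
        rejections += 1  # H0 rejected although it is true

type_1_rate = rejections / trials
print(f"observed rate of errors of first kind: {type_1_rate:.3f}")
```

The observed rate should come out close to the chosen α, up to simulation noise.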
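Probability of committing an error of second kind
[Figure: the densities $f(z_0 \mid H_0)$ and $f(z_0 \mid H_1)$; the $H_1$ density is shifted by $(\mu_1 - \mu_0)/(\sigma/\sqrt{n})$; marked areas: $\alpha/2$ in each tail of the $H_0$ density, and $\beta$.]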
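The probability of an error of second kind can also be written as a formula. As a sketch, assume the true expected value is $\mu_1$, so that $z_0$ is normally distributed with mean $\delta$ and unit variance. The symbols $\delta$ and $\Phi$ (the standard normal distribution function) are introduced here for convenience and do not appear in the slides.

```latex
\delta = \frac{\mu_1 - \mu_0}{\sigma/\sqrt{n}}, \qquad
\beta = P\left(-z_{\alpha/2} < z_0 \le z_{\alpha/2} \mid H_1\right)
      = \Phi\left(z_{\alpha/2} - \delta\right) - \Phi\left(-z_{\alpha/2} - \delta\right)
```

For $\mu_1 = \mu_0$ the shift $\delta$ is zero and the expression reduces to $1 - \alpha$.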
OC (operating characteristic) curve
[Figure: OC curve, $\beta$ (vertical axis, 0.0 to 1.0) plotted against $\mu_1$ (horizontal axis, 5.000 to 5.020); $\mu_0$ is marked.]
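Points of an OC curve can be tabulated numerically. The sketch below assumes that the curve refers to the z-test of Example 1 (µ0 = 5.000, σ = 0.01, n = 4, α = 0.05), which the axis range suggests but the slide does not state. The function name `beta` is illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Assumed settings, borrowed from Example 1
mu_0, sigma, n, alpha = 5.000, 0.01, 4, 0.05

std_normal = NormalDist()
z_crit = std_normal.inv_cdf(1 - alpha / 2)  # z_{alpha/2}


def beta(mu_1):
    """Probability of accepting H0: mu = mu_0 when the true mean is mu_1."""
    delta = (mu_1 - mu_0) / (sigma / sqrt(n))  # shift of the z_0 distribution
    return std_normal.cdf(z_crit - delta) - std_normal.cdf(-z_crit - delta)


# Grid of true means matching the horizontal axis of the OC curve
oc_curve = [(mu_1, beta(mu_1)) for mu_1 in (5.000, 5.005, 5.010, 5.015, 5.020)]
for mu_1, b in oc_curve:
    print(f"mu_1 = {mu_1:.3f}   beta = {b:.3f}")
```

The further µ1 lies from µ0, the smaller β becomes, so larger deviations are less likely to go undetected.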
One-sample t test
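When $\sigma$ is unknown, it is replaced by the sample standard deviation $s$, and $z_0 = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$ becomes
$$t_0 = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$$
$H_0: \mu = \mu_0$, $H_1: \mu \neq \mu_0$
$$P\left(-t_{\alpha/2} < t_0 \le t_{\alpha/2}\right) = P\left(-t_{\alpha/2} < \frac{\bar{x} - \mu_0}{s/\sqrt{n}} \le t_{\alpha/2}\right) = 1 - \alpha$$
$$P\left(\bar{x} - t_{\alpha/2}\,s/\sqrt{n} < \mu_0 \le \bar{x} + t_{\alpha/2}\,s/\sqrt{n}\right) = 1 - \alpha$$
If the CI contains the hypothesised value $\mu_0$, $H_0$ is accepted.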
Example 2
Checking the bias of a gauge
$x_{ref} = 6.0$ (standard)

 i    x_i    x_i − x_ref
 1    5.8    -0.2
 2    5.7    -0.3
 3    5.9    -0.1
 4    5.9    -0.1
 5    6.0     0.0
 6    6.1     0.1
 7    6.0     0.0
 8    6.1     0.1
 9    6.4     0.4
10    6.3     0.3
11    6.0     0.0
12    6.1     0.1
13    6.2     0.2
14    5.6    -0.4
15    6.0     0.0

$H_0: E(x) = x_{ref}$, $H_1: E(x) \neq x_{ref}$, with $\mu_0 = x_{ref}$
$$t_0 = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$$
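The test of Example 2 can be reproduced from the tabulated readings. The sketch below is not part of the original slides. It assumes α = 0.05 and takes the two-sided critical value for 14 degrees of freedom (2.145) from a standard t table rather than computing it, so that only the Python standard library is needed.

```python
from math import sqrt
from statistics import fmean, stdev

# Readings x_i from the table of Example 2
x = [5.8, 5.7, 5.9, 5.9, 6.0, 6.1, 6.0, 6.1, 6.4, 6.3,
     6.0, 6.1, 6.2, 5.6, 6.0]
mu_0 = 6.0   # x_ref, the value of the standard
n = len(x)

x_bar = fmean(x)
s = stdev(x)                     # sample standard deviation (n - 1 in the denominator)
std_err = s / sqrt(n)            # standard error of the mean
t_0 = (x_bar - mu_0) / std_err   # test statistic

# Assumption: alpha = 0.05; t_{0.025} for n - 1 = 14 degrees of freedom,
# copied from a standard t table
t_crit = 2.145

accept_by_test = -t_crit < t_0 <= t_crit
ci_low = x_bar - t_crit * std_err
ci_high = x_bar + t_crit * std_err
accept_by_ci = ci_low < mu_0 <= ci_high

print(f"x_bar = {x_bar:.4f}, s = {s:.4f}, std. err. = {std_err:.4f}")
print(f"t_0 = {t_0:.3f}, t_crit = {t_crit}")
print(f"CI for mu: ({ci_low:.3f}, {ci_high:.3f})")
print("H0 accepted" if accept_by_test else "H0 rejected")
```

With these data the interval contains 6.0, so the readings give no evidence that the gauge is biased.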
p is the probability of obtaining this or a more extreme result if $H_0$ is true (probability of error of first kind).
Std. Err.: standard error of the mean.
The CI contains the hypothesised value $\mu_0 = 6.0$, so $H_0$ is accepted.
[Figure: means of cases I, II, III and IV with whiskers showing Mean ± 0.95 Conf. Interval; axis from -0.4 to 0.8.]
J. H. Steiger, R.T. Fouladi: Noncentrality Interval Estimation and the Evaluation of Statistical Models, Chapter 9 in: L.L. Harlow, S.A. Mulaik, J.H. Steiger: What if there were no significance tests? Mahwah, NJ: Erlbaum (1997)
Power, statistically significant difference
[Figure: "1 Sample t-Test: Power Calculation. One Mean, t-Test (H0: Mu = Mu0). Power vs. Es (N = 15, Alpha = 0.05)". Horizontal axis: Standardized Effect (Es), 0.2 to 1.0; vertical axis: Power, 0.1 to 1.0.]
Power = 1 − β: certainty of detection.
Power depends on: $(\mu - \mu_0)$, $\sigma$, $n$, $\alpha$.
Standardized effect:
$$E_s = \frac{\mu - \mu_0}{\sigma}$$
The sample size ($n = 15$) and the probability of an error of first kind ($\alpha = 0.05$) are fixed, and $\sigma = 0.212$.
What difference $(\mu - \mu_0)$ can be detected with 90% probability ($\beta = 0.1$)?
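A rough answer can be obtained without the power chart. The sketch below uses a normal approximation, which is an assumption and not the method of the slides: it treats σ as known, uses z quantiles instead of the noncentral t distribution behind the t-test power curve, and neglects the far rejection tail. Because the t-test is somewhat less powerful than the z-test, the exact detectable difference is expected to be a little larger than this estimate.

```python
from math import sqrt
from statistics import NormalDist

# Values from the question
n, alpha, beta, sigma = 15, 0.05, 0.10, 0.212

std_normal = NormalDist()
z_alpha_half = std_normal.inv_cdf(1 - alpha / 2)  # z_{alpha/2}
z_beta = std_normal.inv_cdf(1 - beta)             # z_{beta}

# Normal approximation (assumption): power of 1 - beta is reached when
# Es * sqrt(n) is about z_{alpha/2} + z_{beta}
es_approx = (z_alpha_half + z_beta) / sqrt(n)  # standardized effect
diff_approx = es_approx * sigma                # difference mu - mu_0

print(f"approximate standardized effect Es: {es_approx:.3f}")
print(f"approximate detectable difference:  {diff_approx:.3f}")
```

Reading the required Es off the power curve for N = 15 and multiplying it by σ gives the t-based counterpart of this estimate.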