HYPOTHESIS TESTS


We are interested in the population, but the sample is in our hands.

Some assumption is made about the population (e.g. the value of $\mu$ and/or $\sigma$), and this assumption is accepted or rejected based on the data.

Could the data come from a distribution …? E.g. $\mu = \mu_0$?

$H_0: \mu = \mu_0$ (null hypothesis)

$H_1: \mu \neq \mu_0$ (alternative hypothesis)

z-test

$H_0: \mu = \mu_0$, $H_1: \mu \neq \mu_0$

$$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$$

$$z_0 = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} \quad \text{(test statistic)}$$

If $H_0$ is true, $z_0 = z$.

If $z_0$ takes its value in the usual range, $H_0$ is accepted.

$$P\left(-z_{\alpha/2} < \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} \le z_{\alpha/2}\right) = 1 - \alpha$$
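The decision rule above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the function name `z_test_two_sided` and the numbers in the usage line are hypothetical and do not come from the slides.

```python
from math import sqrt
from statistics import NormalDist


def z_test_two_sided(x_bar, mu0, sigma, n, alpha=0.05):
    """Two-sided z-test of H0: mu = mu0 with known sigma.

    Returns (z0, z_crit, accept), where accept is True when
    -z_crit < z0 <= z_crit, i.e. z0 lies in the region of acceptance.
    """
    z0 = (x_bar - mu0) / (sigma / sqrt(n))          # test statistic
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)    # z_{alpha/2}
    accept = -z_crit < z0 <= z_crit
    return z0, z_crit, accept


# Hypothetical illustration values (assumed, not taken from the text).
z0, z_crit, accept = z_test_two_sided(x_bar=10.3, mu0=10.0, sigma=0.5, n=9)
print(f"z0 = {z0:.3f}, z_crit = {z_crit:.3f}, accept H0: {accept}")
```

With these assumed inputs the test statistic is 1.8, which is inside the region of acceptance for $\alpha = 0.05$.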

Here $\alpha$ is the significance level.

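$$P\left(-z_{\alpha/2} < z_0 \le z_{\alpha/2} \mid H_0\right) = 1 - \alpha$$

$$P\left(-z_{\alpha/2} < \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} \le z_{\alpha/2} \mid H_0\right) = 1 - \alpha$$

$$P\left(\mu_0 - z_{\alpha/2}\,\sigma/\sqrt{n} < \bar{x} \le \mu_0 + z_{\alpha/2}\,\sigma/\sqrt{n}\right) = 1 - \alpha$$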

[Figure: density of $z_0$ under $H_0$. Region of acceptance between $-z_{\alpha/2}$ and $z_{\alpha/2}$ ("accept"); rejection regions in both tails ("reject"), each with area $\alpha/2$.]

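Confidence interval and hypothesis test

Confidence interval for $\mu$:

$$P\left(\bar{x} - z_{\alpha/2}\,\sigma/\sqrt{n} < \mu \le \bar{x} + z_{\alpha/2}\,\sigma/\sqrt{n}\right) = 1 - \alpha$$

Under $H_0$:

$$P\left(-z_{\alpha/2} < \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} \le z_{\alpha/2} \mid H_0\right) = 1 - \alpha$$

$$P\left(\bar{x} - z_{\alpha/2}\,\sigma/\sqrt{n} < \mu_0 \le \bar{x} + z_{\alpha/2}\,\sigma/\sqrt{n}\right) = 1 - \alpha$$

If the confidence interval contains the hypothesised value $\mu_0$, $H_0$ is accepted.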
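The link between the region of acceptance and the confidence interval follows from rearranging the inequality for $z_0$. A short derivation sketch, using only the symbols already defined:

```latex
\begin{aligned}
-z_{\alpha/2} &< \frac{\bar{x}-\mu_0}{\sigma/\sqrt{n}} \le z_{\alpha/2} \\
-z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} &< \bar{x}-\mu_0 \le z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} \\
\bar{x}-z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} &\le \mu_0 < \bar{x}+z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}
\end{aligned}
```

Solving for $\mu_0$ moves the strict inequality to the other end of the interval. Because $\bar{x}$ is a continuous random variable, the end points have zero probability, so the probability $1-\alpha$ is the same whichever end carries the strict sign. The test statistic is in the region of acceptance exactly when $\mu_0$ lies inside the interval.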

Example 1

The mass of an object is measured with 4 repeated measurements.

The sample mean is 5.0125 g.

From historical data the variance is known to be $\sigma^2 = 10^{-4}$ g². May we believe (based on the data) that the expected value (the true mass of the object, if the balance is unbiased) is 5.0000 g?

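$H_0: \mu = \mu_0$, $H_1: \mu \neq \mu_0$

Is the value of the test statistic in the region of acceptance? E.g. in the case of $H_1: \mu \neq \mu_0$:

$$P\left(-z_{\alpha/2} < \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} \le z_{\alpha/2}\right) = 1 - \alpha$$

[Figure: region of acceptance between $-z_{\alpha/2}$ and $z_{\alpha/2}$, rejection regions of area $\alpha/2$ in each tail.]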

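$\bar{x} = 5.0125$ g, $\sigma^2 = 10^{-4}$ g², $n = 4$, $\alpha = 0.05$

$H_0: \mu = \mu_0 = 5.0000$, $H_1: \mu \neq \mu_0 = 5.0000$

$$z_0 = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \frac{5.0125 - 5.0000}{0.01/\sqrt{4}} = 2.5$$

$$z_{\alpha/2} = z_{0.025} = 1.96$$

Since $z_0 = 2.5 > z_{\alpha/2} = 1.96$, the test statistic falls outside the region of acceptance and $H_0$ is rejected at the 5% significance level.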
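The numbers of Example 1 can be checked with a short script. This is a sketch using only the Python standard library; the variable names are illustrative choices. It also computes the 95% confidence interval for $\mu$ to show that the interval and the test lead to the same decision.

```python
from math import sqrt
from statistics import NormalDist

x_bar, mu0 = 5.0125, 5.0000   # sample mean and hypothesised mass, g
sigma = sqrt(1e-4)            # known standard deviation, g
n, alpha = 4, 0.05

se = sigma / sqrt(n)                          # standard error of the mean
z0 = (x_bar - mu0) / se                       # test statistic
z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # z_{alpha/2}

# Decision from the region of acceptance.
accept = -z_crit < z0 <= z_crit

# Decision from the confidence interval for mu.
ci_low = x_bar - z_crit * se
ci_high = x_bar + z_crit * se
mu0_in_ci = ci_low <= mu0 < ci_high

print(f"z0 = {z0:.3f}, z_crit = {z_crit:.3f}, accept H0: {accept}")
print(f"95% CI for mu: [{ci_low:.4f}, {ci_high:.4f}], contains mu0: {mu0_in_ci}")
```

The test statistic is 2.5 and the interval runs from about 5.0027 g to 5.0223 g, so 5.0000 g lies outside it and both routes reject $H_0$.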

Error of first and second kind

| | Decision: $H_0$ is accepted (“fail to reject”) | Decision: $H_0$ is rejected |
|---|---|---|
| $H_0$ is true | Proper decision | Error of first kind ($\alpha$) |
| $H_0$ is false | Error of second kind ($\beta$) | Proper decision |
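The two kinds of error can be made concrete with a small simulation. The sketch below uses the setting of Example 1 ($\sigma = 0.01$, $n = 4$, $\alpha = 0.05$) and repeatedly draws normal samples; the helper name `rejection_rate`, the seed, the number of trials and the alternative mean 5.010 are assumptions made for illustration only.

```python
import random
from math import sqrt
from statistics import NormalDist


def rejection_rate(mu_true, mu0, sigma, n, alpha, trials, seed=0):
    """Fraction of simulated samples for which the two-sided z-test rejects H0."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    se = sigma / sqrt(n)
    rejected = 0
    for _ in range(trials):
        sample = [rng.gauss(mu_true, sigma) for _ in range(n)]
        z0 = (sum(sample) / n - mu0) / se
        if not (-z_crit < z0 <= z_crit):
            rejected += 1
    return rejected / trials


# H0 true: every rejection is an error of the first kind.
type1_rate = rejection_rate(5.000, 5.000, 0.01, 4, 0.05, trials=20000)

# H0 false (assumed true mean 5.010): every acceptance is an error of the second kind.
type2_rate = 1 - rejection_rate(5.010, 5.000, 0.01, 4, 0.05, trials=20000)

print(f"estimated alpha = {type1_rate:.3f}, estimated beta = {type2_rate:.3f}")
```

The estimated rate of first-kind errors should come out close to the chosen $\alpha$, while the rate of second-kind errors depends on how far the true mean is from $\mu_0$.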

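Probability of committing an error of second kind

[Figure: densities $f(z_0 \mid H_0)$ and $f(z_0 \mid H_1)$; the second is shifted by $(\mu_1 - \mu_0)/(\sigma/\sqrt{n})$. The two tails of $f(z_0 \mid H_0)$ outside the region of acceptance have area $\alpha/2$ each; the area of $f(z_0 \mid H_1)$ inside the region of acceptance is $\beta$.]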

OC (operating characteristic) curve

[Figure: OC curve, $\beta$ (vertical axis, 0.0 to 1.0) plotted against $\mu_1$ (horizontal axis, 5.000 to 5.020), with $\mu_0$ marked on the horizontal axis.]
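Points of an OC curve can be computed directly from the shifted distribution of $z_0$. The sketch below assumes the setting of Example 1 ($\mu_0 = 5.000$, $\sigma = 0.01$, $n = 4$, $\alpha = 0.05$), which matches the axis range of the figure but is not stated on the slide; the function name `beta_two_sided` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def beta_two_sided(mu1, mu0, sigma, n, alpha=0.05):
    """Probability of accepting H0: mu = mu0 when the true mean is mu1."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = (mu1 - mu0) / (sigma / sqrt(n))   # mean of z0 under H1
    # z0 ~ N(shift, 1); beta is the mass inside (-z_crit, z_crit].
    return std_normal.cdf(z_crit - shift) - std_normal.cdf(-z_crit - shift)


mu0, sigma, n = 5.000, 0.01, 4
oc_points = [(mu1, beta_two_sided(mu1, mu0, sigma, n))
             for mu1 in (5.000, 5.005, 5.010, 5.015, 5.020)]

for mu1, beta in oc_points:
    print(f"mu1 = {mu1:.3f}  beta = {beta:.3f}")
```

At $\mu_1 = \mu_0$ the value is $1 - \alpha = 0.95$, and $\beta$ falls as $\mu_1$ moves away from $\mu_0$; at $\mu_1 = 5.010$ it is roughly 0.48 under these assumptions.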

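One-sample t test

$H_0: \mu = \mu_0$, $H_1: \mu \neq \mu_0$

With known $\sigma$:

$$z_0 = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$$

With $\sigma$ estimated by the sample standard deviation $s$:

$$t_0 = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$$

$$P\left(-t_{\alpha/2} < t_0 \le t_{\alpha/2} \mid \mu_0\right) = P\left(-t_{\alpha/2} < \frac{\bar{x} - \mu_0}{s/\sqrt{n}} \le t_{\alpha/2}\right) = 1 - \alpha$$

$$P\left(\bar{x} - t_{\alpha/2}\,s/\sqrt{n} < \mu_0 \le \bar{x} + t_{\alpha/2}\,s/\sqrt{n}\right) = 1 - \alpha$$

If the CI contains the hypothesised value $\mu_0$, $H_0$ is accepted.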

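Example 2

Checking the bias of a gauge

$x_{\mathrm{ref}} = 6.0$ (standard)

| $i$ | $x_i$ | $x_i - x_{\mathrm{ref}}$ |
|---|---|---|
| 1 | 5.8 | -0.2 |
| 2 | 5.7 | -0.3 |
| 3 | 5.9 | -0.1 |
| 4 | 5.9 | -0.1 |
| 5 | 6.0 | 0.0 |
| 6 | 6.1 | 0.1 |
| 7 | 6.0 | 0.0 |
| 8 | 6.1 | 0.1 |
| 9 | 6.4 | 0.4 |
| 10 | 6.3 | 0.3 |
| 11 | 6.0 | 0.0 |
| 12 | 6.1 | 0.1 |
| 13 | 6.2 | 0.2 |
| 14 | 5.6 | -0.4 |
| 15 | 6.0 | 0.0 |

$H_0: E(x - x_{\mathrm{ref}}) = 0$, $H_1: E(x - x_{\mathrm{ref}}) \neq 0$

$$t_0 = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$$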

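$p$ is the probability of obtaining this or a more extreme result if $H_0$ is true (the smallest significance level, i.e. probability of an error of the first kind, at which $H_0$ would still be rejected). Std. Err.: standard error of the mean.

The CI contains the hypothesised value $\mu_0 = 6.0$, so $H_0$ is accepted.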
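The t test of Example 2 can be reproduced from the 15 readings. The sketch below uses only the Python standard library, which has no t distribution, so the two-sided critical value for 14 degrees of freedom is copied from a standard t table (an assumption of this sketch, not a value given on the slides).

```python
from math import sqrt
from statistics import mean, stdev

# Gauge readings x_i from Example 2 and the reference value.
x = [5.8, 5.7, 5.9, 5.9, 6.0, 6.1, 6.0, 6.1, 6.4, 6.3, 6.0, 6.1, 6.2, 5.6, 6.0]
x_ref = 6.0
n = len(x)

x_bar = mean(x)
s = stdev(x)           # sample standard deviation (n - 1 in the denominator)
se = s / sqrt(n)       # standard error of the mean
t0 = (x_bar - x_ref) / se

# t_{0.025} for n - 1 = 14 degrees of freedom, taken from a t table (assumed).
t_crit = 2.145

accept = -t_crit < t0 <= t_crit
ci = (x_bar - t_crit * se, x_bar + t_crit * se)

print(f"mean = {x_bar:.4f}, s = {s:.4f}, std. err. = {se:.4f}")
print(f"t0 = {t0:.3f}, accept H0: {accept}")
print(f"95% CI: [{ci[0]:.3f}, {ci[1]:.3f}]")
```

The test statistic is about 0.12, well inside the region of acceptance, and the interval (about 5.889 to 6.124) contains 6.0, so no bias of the gauge is detected.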

[Figure: Mean; Whisker: Mean ± 0.95 Conf. Interval, for four cases I, II, III and IV; axis range -0.4 to 0.8.]

J. H. Steiger, R.T. Fouladi: Noncentrality Interval Estimation and the Evaluation of Statistical Models, Chapter 9 in: L.L. Harlow, S.A. Mulaik, J.H. Steiger: What if there were no significance tests? Mahwah, NJ: Erlbaum (1997)


Power, statistically significant difference

Power $= 1 - \beta$: certainty of detection.

Power depends on: $(\mu - \mu_0)$, $\sigma$, $n$, $\alpha$

Standardized effect:

$$E_s = \frac{\mu - \mu_0}{\sigma}$$

[Figure: 1 Sample t-Test: Power Calculation, One Mean, t-Test (H0: Mu = Mu0). Power vs. Es (N = 15, Alpha = 0.05). Horizontal axis: Standardized Effect (Es), 0.2 to 1.0; vertical axis: Power, 0.1 to 1.0.]

The sample size ($n = 15$) and the probability of an error of the first kind ($\alpha = 0.05$) are fixed, and $\sigma = 0.212$.

What difference $(\mu - \mu_0)$ can be detected with 90% probability ($\beta = 0.1$)?
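A rough answer can be obtained with a normal approximation, treating $\sigma$ as known and neglecting the tiny chance of rejecting in the wrong tail: $\mu - \mu_0 \approx (z_{\alpha/2} + z_{\beta})\,\sigma/\sqrt{n}$. This is a sketch, not the calculation behind the power chart, which is based on the t test and can be expected to give a somewhat larger value because $s$ is estimated from only 15 points. The function name `detectable_difference` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def detectable_difference(sigma, n, alpha=0.05, beta=0.10):
    """Approximate |mu - mu0| detected with power 1 - beta (two-sided test).

    Normal approximation: sigma is treated as known and the probability of
    rejecting in the opposite tail is neglected.
    """
    std_normal = NormalDist()
    z_alpha_half = std_normal.inv_cdf(1 - alpha / 2)
    z_beta = std_normal.inv_cdf(1 - beta)
    return (z_alpha_half + z_beta) * sigma / sqrt(n)


sigma, n = 0.212, 15
delta = detectable_difference(sigma, n, alpha=0.05, beta=0.10)
es = delta / sigma   # corresponding standardized effect

print(f"detectable difference = {delta:.3f}, standardized effect = {es:.3f}")
```

Under this approximation the detectable difference is about 0.18, i.e. a standardized effect of roughly 0.84.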
