Mathematical Approach to Everyday Life

Academic year: 2023

Dr Ivana Djolović idjolovic@tfbor.bg.ac.rs

University of Belgrade, Technical Faculty in Bor, Bor, Serbia

Today, in the modern information society, we are surrounded by media stories involving predictions, claims, confidence levels and conclusions. Verbal expressions and everyday phrases are presented to the audience in order to warn or simply inform people, but the mathematics stays "backstage". Correctly used, mathematics and statistics can be a powerful tool for explaining many situations in everyday life.

This talk is devoted to statistical interpretations of real-life situations. Starting from a real situation, we will discover where the statistical interpretation is hidden. We will also emphasize potential traps in understanding the situation.

(2)

• ...9 out of 10 women recommend anti-age cream...
• ...30% chance of snow...
• ...the average lifetime of a light bulb is 562 days...
• ...a certain medication is the best solution for headache...
• ...6-year-old children spend 200 minutes watching TV...
• ...less than 5% of our items are defective...
• ...washing detergent A is more effective than others...
• ...drinking 2 liters of water per day is healthy...
• ...100% success in teaching...

• Can I believe all those numbers?
• How did they get those numbers?
• Real life or suspicious information?
• Who was included in the survey?

Can we test and check such claims?

(3)

Claim: 3-month-old babies sleep an average of 20 hours in a 24-hour period.

Mathematical (statistical) interpretation 1:

A past study claimed that 3-month-old babies sleep an average of 20 hours in a 24-hour period. A researcher took a random sample of 20 babies and found that they slept an average of 19 hours 15 minutes in a 24-hour period. Assume that the sleeping times of all 3-month-old babies are normally distributed and that the population standard deviation is 45 minutes. Using the 5% significance level, test the claim of the earlier study.

Mathematical (statistical) interpretation 2:

A past study claimed that 3-month-old babies sleep an average of 20 hours in a 24-hour period. A researcher took a random sample of 20 babies and found that they slept an average of 19 hours 15 minutes in a 24-hour period, with a standard deviation of 45 minutes. Assume that the sleeping times of all 3-month-old babies are normally distributed. Using the 5% significance level, test the claim of the earlier study.

The same problem? The same text? NO!!! In the first version 45 minutes is the population standard deviation (σ known); in the second it is the sample standard deviation (σ unknown).

(4)

Hypothesis Testing: hypothesis tests about the mean

(Hypothesis tests are used to confirm (accept) or deny (reject) a claim that is made about a population.)

X: random variable (characteristic)
$(x_1, x_2, x_3, \dots, x_n)$: sample
n: sample size

Population: μ is the population mean, σ is the population standard deviation.
Sample: $\bar{x}$ is the sample mean, s is the sample standard deviation.

$$\bar{x} = \frac{x_1 + x_2 + x_3 + \dots + x_n}{n}$$

$$s^2 = \frac{(x_1-\bar{x})^2 + (x_2-\bar{x})^2 + (x_3-\bar{x})^2 + \dots + (x_n-\bar{x})^2}{n}, \qquad s = \sqrt{s^2}$$

$$\hat{s}^2 = \frac{(x_1-\bar{x})^2 + (x_2-\bar{x})^2 + (x_3-\bar{x})^2 + \dots + (x_n-\bar{x})^2}{n-1}, \qquad \hat{s} = \sqrt{\hat{s}^2}$$

Notation:

s: sample standard deviation
$\hat{s}$: improved sample standard deviation
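The three formulas above can be checked on a small data set. The following is a minimal Python sketch, not part of the original talk; the function name `sample_stats` and the data values are made up for illustration.

```python
import math

def sample_stats(xs):
    """Return (sample mean, s, s_hat) using the definitions above.

    s divides the sum of squared deviations by n,
    s_hat (the "improved" standard deviation) divides it by n - 1.
    """
    n = len(xs)
    mean = sum(xs) / n
    ss = sum((x - mean) ** 2 for x in xs)  # sum of squared deviations
    s = math.sqrt(ss / n)
    s_hat = math.sqrt(ss / (n - 1))
    return mean, s, s_hat

# Hypothetical data, chosen only so that the numbers come out round.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
mean, s, s_hat = sample_stats(data)
print(mean, s, round(s_hat, 3))
```

The two standard deviations are linked by $\hat{s} = s\sqrt{n/(n-1)}$, which is why the two versions of the t statistic used later in the talk give the same value.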

(5)

Elements in the hypothesis tests:

• Null hypothesis $H_0$ (a claim about a population parameter that is assumed to be true until it is declared false)
• Alternative hypothesis $H_1$ (true if the null hypothesis is false)

Null hypothesis vs alternative hypothesis

| Decision | Real situation: $H_0$ true | Real situation: $H_0$ false |
| Accept $H_0$ | OK | Type II error (β) |
| Reject $H_0$ | Type I error (α) | OK |

• α: the significance level
• C: the rejection region
• T: test statistic (random variable)
• Statistically significant = significantly different (the null hypothesis is rejected; there is a very small probability of the result happening just by chance; the difference between $\bar{x}$ and μ is statistically significant)
• (Statistically) not significantly different (the difference between $\bar{x}$ and μ is so small that it may have occurred just by chance)

(6)

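Hypothesis tests about the mean μ

1. σ known

$X \sim N(\mu, \sigma^2)$

Null hypothesis: $H_0(\mu = \mu_0)$

• Alternative hypothesis: $H_1(\mu \neq \mu_0)$
Two-tailed test; the rejection region: $C = (-\infty, -z_\alpha] \cup [z_\alpha, \infty)$, where $\Phi(z_\alpha) = \frac{1-\alpha}{2}$;

• Alternative hypothesis: $H_1(\mu > \mu_0)$
Right-tailed test; the rejection region: $C = [z_\alpha, \infty)$, where $\Phi(z_\alpha) = \frac{1-2\alpha}{2}$;

• Alternative hypothesis: $H_1(\mu < \mu_0)$
Left-tailed test; the rejection region: $C = (-\infty, -z_\alpha]$, where $\Phi(z_\alpha) = \frac{1-2\alpha}{2}$.

Here Φ(z) is the area under the standard normal curve between 0 and z, so that, for example, Φ(1.96) = 0.475.

Test statistic and its observed value:

$$T = \frac{\bar{X} - \mu}{\sigma}\sqrt{n}, \qquad t = \frac{\bar{x} - \mu_0}{\sigma}\sqrt{n}$$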

(7)

| $\Phi(z_\alpha)$ | 0.4 | 0.45 | 0.475 | 0.48 | 0.49 | 0.495 |
| $z_\alpha$ | 1.285 | 1.645 | 1.96 | 2.055 | 2.325 | 2.575 |
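The tabulated values can be reproduced numerically. The sketch below assumes, as the table values suggest, that Φ(z) denotes the area between 0 and z under the standard normal curve; the helper name `z_from_phi` is hypothetical. It uses `statistics.NormalDist` from the Python standard library.

```python
from statistics import NormalDist

def z_from_phi(p):
    """Return z such that P(0 <= Z <= z) = p for a standard normal Z.

    The usual cumulative distribution function equals 0.5 + p at that z.
    """
    return NormalDist().inv_cdf(0.5 + p)

# Values copied from the table above (rounded table look-ups).
table = {0.4: 1.285, 0.45: 1.645, 0.475: 1.96,
         0.48: 2.055, 0.49: 2.325, 0.495: 2.575}

for p, z_table in table.items():
    print(p, z_table, round(z_from_phi(p), 4))
```

The computed quantiles differ from the table only in the third decimal place, which is the effect of interpolating between neighbouring rows of a printed table.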

(8)

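$\Phi(z_\alpha) = 0.475 \Rightarrow z_\alpha = 1.96$

$\Phi(z_\alpha) = 0.49$: the table gives $2.32 < z_\alpha < 2.33$, so $z_\alpha \approx 2.325$ (or $z_\alpha \approx 2.33$, or $z_\alpha \approx 2.32$)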

(9)

Research 1:

A past study claimed that 3-month-old babies sleep an average of 20 hours in a 24-hour period. A researcher took a random sample of 20 babies and found that they slept an average of 19 hours 15 minutes in a 24-hour period. Assume that the sleeping times of all 3-month-old babies are normally distributed and that the population standard deviation is 45 minutes. Using the 5% significance level, test the claim of the earlier study.

X: the sleeping times of all 3-month-old babies; $X \sim N(\mu, \sigma^2)$
$\sigma = 45\ \text{min} = 0.75\ \text{h}$ (σ known): population standard deviation
$n = 20$ (sample size)
$\bar{x} = 19\ \text{h}\ 15\ \text{min} = 19.25\ \text{h}$ (sample mean)
$\alpha = 5\% = 0.05$ (significance level)

Test statistic:

$$t = \frac{\bar{x} - \mu_0}{\sigma}\sqrt{n} = \frac{19.25 - 20}{0.75}\sqrt{20} \approx -4.47$$

1) $H_0(\mu = 20)$ vs $H_1(\mu \neq 20)$
two-tailed test;
the rejection region: $C = (-\infty, -z_\alpha] \cup [z_\alpha, \infty)$
$\Phi(z_\alpha) = \frac{1 - 0.05}{2} = 0.475 \Rightarrow z_\alpha = 1.96$, so $C = (-\infty, -1.96] \cup [1.96, \infty)$
Since $t \approx -4.47 \in C$, we reject $H_0(\mu = 20)$, the claim of the earlier study.

2) $H_0(\mu = 20)$ vs $H_1(\mu > 20)$
right-tailed test;
the rejection region: $C = [z_\alpha, \infty)$
$\Phi(z_\alpha) = \frac{1 - 2 \cdot 0.05}{2} = 0.45 \Rightarrow z_\alpha = 1.645$, so $C = [1.645, \infty)$
Since $t \approx -4.47 \notin C$, we accept $H_0(\mu = 20)$, the claim of the earlier study.

3) $H_0(\mu = 20)$ vs $H_1(\mu < 20)$
left-tailed test;
the rejection region: $C = (-\infty, -z_\alpha]$
$\Phi(z_\alpha) = \frac{1 - 2 \cdot 0.05}{2} = 0.45 \Rightarrow z_\alpha = 1.645$, so $C = (-\infty, -1.645]$
Since $t \approx -4.47 \in C$, we reject $H_0(\mu = 20)$, the claim of the earlier study.
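The three decisions in Research 1 can be reproduced with a few lines of Python. This is a sketch under the stated assumptions (normal population, σ known); the function name `z_test` and its `tail` argument are hypothetical names introduced here, and the critical values come from `statistics.NormalDist` rather than from a printed table.

```python
import math
from statistics import NormalDist

def z_test(x_bar, mu0, sigma, n, alpha, tail):
    """Test H0: mu = mu0 with sigma known.

    tail is "two", "right" or "left".
    Returns (t, critical value, True if H0 is rejected).
    """
    t = (x_bar - mu0) / sigma * math.sqrt(n)
    std_normal = NormalDist()
    if tail == "two":
        z_crit = std_normal.inv_cdf(1 - alpha / 2)  # Phi(z) = (1 - alpha)/2
        reject = abs(t) >= z_crit
    elif tail == "right":
        z_crit = std_normal.inv_cdf(1 - alpha)      # Phi(z) = (1 - 2*alpha)/2
        reject = t >= z_crit
    elif tail == "left":
        z_crit = std_normal.inv_cdf(1 - alpha)
        reject = t <= -z_crit
    else:
        raise ValueError("tail must be 'two', 'right' or 'left'")
    return t, z_crit, reject

# Research 1: n = 20, sample mean 19.25 h, sigma = 0.75 h, alpha = 0.05
results = {tail: z_test(19.25, 20.0, 0.75, 20, 0.05, tail)
           for tail in ("two", "right", "left")}
for tail, (t, z_crit, reject) in results.items():
    print(tail, round(t, 2), round(z_crit, 3), "reject H0" if reject else "accept H0")
```

The same observed value of t leads to different decisions depending on the alternative hypothesis, which is the point of treating the three cases separately.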

(10)

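2. σ not known

$X \sim N(\mu, \sigma^2)$

Null hypothesis: $H_0(\mu = \mu_0)$

• Alternative hypothesis: $H_1(\mu \neq \mu_0)$
Two-tailed test; the rejection region: $C = (-\infty, -t_{n-1;\alpha}] \cup [t_{n-1;\alpha}, \infty)$

• Alternative hypothesis: $H_1(\mu > \mu_0)$
Right-tailed test; the rejection region: $C = [t_{n-1;2\alpha}, \infty)$

• Alternative hypothesis: $H_1(\mu < \mu_0)$
Left-tailed test; the rejection region: $C = (-\infty, -t_{n-1;2\alpha}]$

t-distribution (Student's t-distribution)
n − 1: degrees of freedom

Test statistic and its observed value:

$$T = \frac{\bar{X} - \mu}{S}\sqrt{n-1}, \qquad t = \frac{\bar{x} - \mu_0}{s}\sqrt{n-1}$$

OR

$$T = \frac{\bar{X} - \mu}{\hat{S}}\sqrt{n}, \qquad t = \frac{\bar{x} - \mu_0}{\hat{s}}\sqrt{n}$$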

(11)
(12)
(13)

Research 2:

A past study claimed that 3-month-old babies sleep an average of 20 hours in a 24-hour period. A researcher took a random sample of 20 babies and found that they slept an average of 19 hours 15 minutes in a 24-hour period, with a standard deviation of 45 minutes. Assume that the sleeping times of all 3-month-old babies are normally distributed. Using the 5% significance level, test the claim of the earlier study.

X: the sleeping times of all 3-month-old babies; $X \sim N(\mu, \sigma^2)$
$s = 45\ \text{min} = 0.75\ \text{h}$: sample standard deviation
σ unknown: population standard deviation
$n = 20$ (sample size)
$\bar{x} = 19\ \text{h}\ 15\ \text{min} = 19.25\ \text{h}$ (sample mean)
$\alpha = 5\% = 0.05$ (significance level)

Test statistic:

$$t = \frac{\bar{x} - \mu_0}{s}\sqrt{n-1} = \frac{19.25 - 20}{0.75}\sqrt{20-1} \approx -4.36$$

1) $H_0(\mu = 20)$ vs $H_1(\mu \neq 20)$
two-tailed test;
the rejection region: $C = (-\infty, -t_{n-1;\alpha}] \cup [t_{n-1;\alpha}, \infty)$
$t_{n-1;\alpha} = t_{20-1;0.05} = t_{19;0.05} = 2.093$, so $C = (-\infty, -2.093] \cup [2.093, \infty)$
Since $t \approx -4.36 \in C$, we reject $H_0(\mu = 20)$, the claim of the earlier study.

2) $H_0(\mu = 20)$ vs $H_1(\mu > 20)$
right-tailed test;
the rejection region: $C = [t_{n-1;2\alpha}, \infty)$
$t_{n-1;2\alpha} = t_{20-1;2 \cdot 0.05} = t_{19;0.10} = 1.729$, so $C = [1.729, \infty)$
Since $t \approx -4.36 \notin C$, we accept $H_0(\mu = 20)$, the claim of the earlier study.

3) $H_0(\mu = 20)$ vs $H_1(\mu < 20)$
left-tailed test;
the rejection region: $C = (-\infty, -t_{n-1;2\alpha}]$
$t_{n-1;2\alpha} = t_{20-1;2 \cdot 0.05} = t_{19;0.10} = 1.729$, so $C = (-\infty, -1.729]$
Since $t \approx -4.36 \in C$, we reject $H_0(\mu = 20)$, the claim of the earlier study.
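Research 2 can be checked in the same spirit. The Python standard library has no Student t quantile function, so this sketch simply hard-codes the two critical values quoted in the text (2.093 and 1.729 for 19 degrees of freedom). It also assumes that the reported 45 minutes is s, the standard deviation with divisor n, as the $\sqrt{n-1}$ form of the statistic implies. All variable names are illustrative.

```python
import math

# Critical values copied from the text (t table, 19 degrees of freedom).
T_19_005 = 2.093  # two-tailed test at alpha = 0.05
T_19_010 = 1.729  # one-tailed test at alpha = 0.05 (table column 2*alpha = 0.10)

n, x_bar, mu0, s = 20, 19.25, 20.0, 0.75

# Version with the sample standard deviation s (divisor n).
t_with_s = (x_bar - mu0) / s * math.sqrt(n - 1)

# Version with the improved standard deviation s_hat (divisor n - 1).
s_hat = s * math.sqrt(n / (n - 1))
t_with_s_hat = (x_bar - mu0) / s_hat * math.sqrt(n)

decisions = {
    "two-tailed": abs(t_with_s) >= T_19_005,
    "right-tailed": t_with_s >= T_19_010,
    "left-tailed": t_with_s <= -T_19_010,
}
print(round(t_with_s, 2), round(t_with_s_hat, 2))
for test, reject in decisions.items():
    print(test, "reject H0" if reject else "accept H0")
```

Both forms of the statistic give the same number, so the choice between s and $\hat{s}$ only matters for which square root accompanies it.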

(14)

Research 3:

A past study claimed that 3-month-old babies sleep an average of 20 hours in a 24-hour period. A researcher took a random sample of 2000 babies and found that they slept an average of 19 hours 15 minutes in a 24-hour period, with a standard deviation of 45 minutes. Using the 5% significance level, test the claim of the earlier study.

Research 4:

A past study claimed that 3-month-old babies sleep an average of 20 hours in a 24-hour period. A researcher took a random sample of 2000 babies and found that they slept an average of 19 hours 15 minutes in a 24-hour period. Assume that the population standard deviation is 45 minutes. Using the 5% significance level, test the claim of the earlier study.

Where is the assumption that the sleeping times of all 3-month-old babies are normally distributed?

(15)

Central Limit Theorem

If one takes random samples of size n from a population of mean μ and standard deviation σ, then, as n gets large, $\bar{X}$ approaches the normal distribution, that is: $\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$.

X: random variable (characteristic)
$(X_1, X_2, X_3, \dots, X_n)$: sample
n: sample size

$$\bar{X} = \frac{X_1 + X_2 + \dots + X_n}{n}$$

$$E(X_1) = E(X_2) = \dots = E(X_n) = E(X) = \mu$$

$$\sigma^2(X_1) = \sigma^2(X_2) = \dots = \sigma^2(X_n) = \sigma^2(X) = \sigma^2$$

$$E(\bar{X}) = E\left(\frac{X_1 + X_2 + \dots + X_n}{n}\right) = E(X) = \mu,$$

$$\sigma^2(\bar{X}) = \sigma^2\left(\frac{X_1 + X_2 + \dots + X_n}{n}\right) = \frac{\sigma^2(X)}{n} = \frac{\sigma^2}{n}.$$

If $X \sim N(\mu, \sigma^2)$, then $\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$ for all n (either a small sample, n < 30, or a large sample).

If X has an unknown distribution (not a normal distribution) and known standard deviation σ, then $\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$ approximately, for a large sample, $n \geq 30$.

BUT

for the CLT, we need the following:

• a large sample size
• known standard deviation σ

What about the case where the sample is large and σ is not known?
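A small simulation can make the statement of the theorem concrete. The sketch below is an illustration added here, not part of the original talk: it draws repeated samples from an exponential distribution (clearly not normal, with μ = σ = 1) and checks that the sample means have mean close to μ and variance close to σ²/n. The sample size, the number of repetitions and the seed are arbitrary choices.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so that the run is repeatable

MU = 1.0        # mean of the exponential distribution with rate 1
SIGMA = 1.0     # its standard deviation
N = 50          # size of each sample (a "large" sample, n >= 30)
REPEATS = 4000  # number of samples drawn

def sample_mean(n):
    """Mean of one random sample of size n from the exponential distribution."""
    return statistics.fmean([random.expovariate(1.0) for _ in range(n)])

means = [sample_mean(N) for _ in range(REPEATS)]

mean_of_means = statistics.fmean(means)
var_of_means = statistics.pvariance(means)

# Share of sample means within 1.96 standard errors of mu;
# for a normal distribution this share would be 0.95.
half_width = 1.96 * SIGMA / math.sqrt(N)
coverage = sum(abs(m - MU) <= half_width for m in means) / REPEATS

print("mean of sample means:", round(mean_of_means, 3), "expected", MU)
print("variance of sample means:", round(var_of_means, 4), "expected", SIGMA**2 / N)
print("share within 1.96 standard errors:", round(coverage, 3))
```

Changing `N` to a small value such as 3 shows the other side of the theorem: the mean and variance formulas still hold, but the distribution of the sample mean is visibly skewed and the normal approximation is poor.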

(16)

Hypothesis tests about the mean μ according to the sample size

1. Small sample
1.1. σ known
1.2. σ not known

2. Large sample
2.1. σ known (CLT): $\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$, that is: $\frac{\bar{X} - \mu}{\sigma}\sqrt{n} \sim N(0, 1)$
2.2. σ not known: the distribution of the random variable $\frac{\bar{X} - \mu}{\hat{S}}\sqrt{n}$ can be approximated with the normal distribution (as the sample size becomes larger, the t-distribution approaches the standard normal distribution)

The rejection regions can be obtained in the following way:

• two-tailed test: $C = (-\infty, -z_\alpha] \cup [z_\alpha, \infty)$ or $C = (-\infty, -t_{n-1;\alpha}] \cup [t_{n-1;\alpha}, \infty)$, where $\Phi(z_\alpha) = \frac{1-\alpha}{2}$;

• right-tailed test: $C = [z_\alpha, \infty)$ or $C = [t_{n-1;2\alpha}, \infty)$, where $\Phi(z_\alpha) = \frac{1-2\alpha}{2}$;

• left-tailed test: $C = (-\infty, -z_\alpha]$ or $C = (-\infty, -t_{n-1;2\alpha}]$, where $\Phi(z_\alpha) = \frac{1-2\alpha}{2}$.

(17)

Research 3:

A past study claimed that 3-month-old babies sleep an average of 20 hours in a 24-hour period. A researcher took a random sample of 2000 babies and found that they slept an average of 19 hours 15 minutes in a 24-hour period, with a standard deviation of 45 minutes. Using the 5% significance level, test the claim of the earlier study.

X: the sleeping times of all 3-month-old babies; no assumption about the distribution!!!
$s = 45\ \text{min} = 0.75\ \text{h}$: sample standard deviation
σ unknown: population standard deviation
(as the sample size becomes larger, the t-distribution approaches the standard normal distribution)
$n = 2000$ (LARGE sample)
$\bar{x} = 19\ \text{h}\ 15\ \text{min} = 19.25\ \text{h}$ (sample mean)
$\alpha = 5\% = 0.05$ (significance level)

Test statistic:

$$t = \frac{\bar{x} - \mu_0}{s}\sqrt{n-1} = \frac{19.25 - 20}{0.75}\sqrt{2000-1} \approx -44.7$$

1) $H_0(\mu = 20)$ vs $H_1(\mu \neq 20)$
two-tailed test;
the rejection region: $C = (-\infty, -t_{n-1;\alpha}] \cup [t_{n-1;\alpha}, \infty)$
$t_{n-1;\alpha} = t_{2000-1;0.05} = t_{1999;0.05} \approx t_{\infty;0.05} = 1.96$, so $C = (-\infty, -1.96] \cup [1.96, \infty)$

BUT the rejection region can also be $C = (-\infty, -z_\alpha] \cup [z_\alpha, \infty)$, where $\Phi(z_\alpha) = \frac{1-\alpha}{2}$.
For α = 0.05 we have $\Phi(z_\alpha) = \frac{1 - 0.05}{2} = 0.475 \Rightarrow z_\alpha = 1.96 = t_{\infty;0.05}$

Since $t \approx -44.7 \in C$, we reject $H_0(\mu = 20)$, the claim of the earlier study.

(18)

2) $H_0(\mu = 20)$ vs $H_1(\mu > 20)$
right-tailed test;
the rejection region: $C = [t_{n-1;2\alpha}, \infty)$
$t_{n-1;2\alpha} = t_{2000-1;2 \cdot 0.05} = t_{1999;0.10} \approx t_{\infty;0.10} = 1.645$, so $C = [1.645, \infty)$

BUT the rejection region can also be $C = [z_\alpha, \infty)$, where $\Phi(z_\alpha) = \frac{1-2\alpha}{2}$.
For α = 0.05 we have $\Phi(z_\alpha) = \frac{1 - 2 \cdot 0.05}{2} = 0.45 \Rightarrow z_\alpha = 1.645 = t_{\infty;0.10}$

Since $t \approx -44.7 \notin C$, we accept $H_0(\mu = 20)$, the claim of the earlier study.

3) $H_0(\mu = 20)$ vs $H_1(\mu < 20)$
left-tailed test;
the rejection region: $C = (-\infty, -t_{n-1;2\alpha}]$
$t_{n-1;2\alpha} = t_{2000-1;2 \cdot 0.05} = t_{1999;0.10} = 1.645$, so $C = (-\infty, -1.645]$

BUT the rejection region can also be $C = (-\infty, -z_\alpha]$, where $\Phi(z_\alpha) = \frac{1-2\alpha}{2}$.
For α = 0.05 we have $\Phi(z_\alpha) = \frac{1 - 2 \cdot 0.05}{2} = 0.45 \Rightarrow z_\alpha = 1.645 = t_{\infty;0.10}$

Since $t \approx -44.7 \in C$, we reject $H_0(\mu = 20)$, the claim of the earlier study.

(19)

Research 4:

A past study claimed that 3-month-old babies sleep an average of 20 hours in a 24-hour period. A researcher took a random sample of 2000 babies and found that they slept an average of 19 hours 15 minutes in a 24-hour period. Assume that the population standard deviation is 45 minutes. Using the 5% significance level, test the claim of the earlier study.

X: the sleeping times of all 3-month-old babies; no assumption about the distribution!!! CLT!!!
$\sigma = 45\ \text{min} = 0.75\ \text{h}$ (σ known): population standard deviation
$n = 2000$ (LARGE sample)
$\bar{x} = 19\ \text{h}\ 15\ \text{min} = 19.25\ \text{h}$ (sample mean)
$\alpha = 5\% = 0.05$ (significance level)

Test statistic:

$$t = \frac{\bar{x} - \mu_0}{\sigma}\sqrt{n} = \frac{19.25 - 20}{0.75}\sqrt{2000} \approx -44.72$$

1) $H_0(\mu = 20)$ vs $H_1(\mu \neq 20)$
two-tailed test;
the rejection region: $C = (-\infty, -z_\alpha] \cup [z_\alpha, \infty)$
$\Phi(z_\alpha) = \frac{1 - 0.05}{2} = 0.475 \Rightarrow z_\alpha = 1.96$, so $C = (-\infty, -1.96] \cup [1.96, \infty)$
Since $t \approx -44.72 \in C$, we reject $H_0(\mu = 20)$, the claim of the earlier study.

2) $H_0(\mu = 20)$ vs $H_1(\mu > 20)$
right-tailed test;
the rejection region: $C = [z_\alpha, \infty)$
$\Phi(z_\alpha) = \frac{1 - 2 \cdot 0.05}{2} = 0.45 \Rightarrow z_\alpha = 1.645$, so $C = [1.645, \infty)$
Since $t \approx -44.72 \notin C$, we accept $H_0(\mu = 20)$, the claim of the earlier study.

3) $H_0(\mu = 20)$ vs $H_1(\mu < 20)$
left-tailed test;
the rejection region: $C = (-\infty, -z_\alpha]$
$\Phi(z_\alpha) = \frac{1 - 2 \cdot 0.05}{2} = 0.45 \Rightarrow z_\alpha = 1.645$, so $C = (-\infty, -1.645]$
Since $t \approx -44.72 \in C$, we reject $H_0(\mu = 20)$, the claim of the earlier study.
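With n = 2000 the two large-sample statistics (Research 3 with s and $\sqrt{n-1}$, Research 4 with σ and $\sqrt{n}$) are almost identical, and both are compared with normal critical values. The following short Python sketch, with variable names chosen here for illustration, shows the two numbers side by side.

```python
import math
from statistics import NormalDist

n, x_bar, mu0, sd = 2000, 19.25, 20.0, 0.75
alpha = 0.05

# Research 3: 0.75 h is the sample standard deviation s, so use sqrt(n - 1).
t_research3 = (x_bar - mu0) / sd * math.sqrt(n - 1)
# Research 4: 0.75 h is the population standard deviation sigma, so use sqrt(n).
t_research4 = (x_bar - mu0) / sd * math.sqrt(n)

# Normal critical values, used for both cases because the sample is large.
z_two_tailed = NormalDist().inv_cdf(1 - alpha / 2)
z_one_tailed = NormalDist().inv_cdf(1 - alpha)

print(round(t_research3, 2), round(t_research4, 2))
print(round(z_two_tailed, 3), round(z_one_tailed, 3))
print("left-tailed test rejects H0:", t_research3 <= -z_one_tailed)
```

Compared with Research 1 and 2, the sample mean and standard deviation are unchanged; only n grew, and the statistic moved from about −4.4 to about −44.7.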

(20)

Example A: A farmer is supposed to deliver potatoes to a grocery store in packages (bags) that weigh 20 kilos (kg) on average. The grocery store claims that the packages weigh less than 20 kilos on average. A random sample of 50 packages of potatoes has an average of 19.4 kilos and a standard deviation of 1.9 kilos. Test the claim of the store at the 1% significance level.

X: the weights of the farmer's packages
σ unknown: population standard deviation
$n = 50$ (large sample)
$\bar{x} = 19.4\ \text{kg}$ (sample mean)
$s = 1.9\ \text{kg}$: sample standard deviation
$\alpha = 1\% = 0.01$ (significance level)

Test statistic:

$$t = \frac{\bar{x} - \mu_0}{s}\sqrt{n-1} = \frac{19.4 - 20}{1.9}\sqrt{50-1} \approx -2.21$$

$H_0(\mu = 20)$ vs $H_1(\mu < 20)$
left-tailed test;
the rejection region: $C = (-\infty, -t_{n-1;2\alpha}]$
$t_{n-1;2\alpha} = t_{50-1;2 \cdot 0.01} = t_{49;0.02} \approx 2.4$, so $C = (-\infty, -2.4]$
Since $t \approx -2.21 \notin C$, we accept $H_0(\mu = 20)$, the claim of the farmer.

Notice:

Since the sample is large, the rejection region can also be $C = (-\infty, -z_\alpha]$, where $\Phi(z_\alpha) = \frac{1-2\alpha}{2}$.
For α = 0.01 we have $\Phi(z_\alpha) = \frac{1 - 2 \cdot 0.01}{2} = 0.49 \Rightarrow z_\alpha = 2.325 \approx t_{\infty;0.02}$
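Example A can be verified numerically. In this Python sketch the t critical value 2.4 is copied from the text, while the normal critical value is computed with `statistics.NormalDist`; the variable names are illustrative only.

```python
import math
from statistics import NormalDist

n, x_bar, mu0, s, alpha = 50, 19.4, 20.0, 1.9, 0.01

# Observed value of the test statistic (s has divisor n, hence sqrt(n - 1)).
t = (x_bar - mu0) / s * math.sqrt(n - 1)

t_crit_table = 2.4                         # t_{49;0.02}, as quoted in the text
z_crit = NormalDist().inv_cdf(1 - alpha)   # large-sample alternative

reject_with_t = t <= -t_crit_table
reject_with_z = t <= -z_crit

print(round(t, 2), round(z_crit, 3))
print("reject H0 (t table):", reject_with_t)
print("reject H0 (normal approximation):", reject_with_z)
```

Both critical values lead to the same decision here: the observed shortfall of 0.6 kg is not large enough to reject the farmer's claim at the 1% level.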

(21)

Example B: A journalist claims that all adults in her city spend an average of 2 hours or more per week on jogging. A researcher wanted to test this claim. (S)he took a sample of 25 adults from that city and asked them about the time they spend per week on jogging. Their responses are as follows:

30 min, 1 h, 20 min, 0 min, 1 h 15 min, 45 min, 1 h, 2 h, 2 h 15 min, 3 h, 0 min, 30 min, 1 h 45 min, 1 h 30 min, 2 h 30 min, 1 h, 2 h 30 min, 3 h, 3 h 30 min, 1 h, 0 min, 15 min, 20 min, 45 min, 1 h 15 min.

Assume that the times spent on jogging per week by all adults from this city are normally distributed. Using the 10% significance level, test the claim of the journalist.

X: the times spent on jogging per week by all adults from the city; $X \sim N(\mu, \sigma^2)$
σ unknown: population standard deviation
$n = 25$ (small sample)

$\bar{x} = ?$

$$\bar{x} = \frac{30 + 60 + 20 + 0 + 75 + 45 + 60 + 120 + 135 + 180 + \dots + 20 + 45 + 75}{25}\ \text{min} = 76.6\ \text{min}$$

$s = ?$, $\hat{s} = ?$

$$s^2 = \frac{30^2 + 60^2 + 20^2 + 0^2 + 75^2 + 45^2 + 60^2 + 120^2 + \dots + 20^2 + 45^2 + 75^2}{25}\ \text{min}^2 - 76.6^2\ \text{min}^2 = 3659.44\ \text{min}^2$$

or

$$s^2 = \frac{(30 - 76.6)^2 + (60 - 76.6)^2 + (20 - 76.6)^2 + \dots + (45 - 76.6)^2 + (75 - 76.6)^2}{25}\ \text{min}^2 = 3659.44\ \text{min}^2$$

$$\hat{s}^2 = \frac{(30 - 76.6)^2 + (60 - 76.6)^2 + (20 - 76.6)^2 + \dots + (45 - 76.6)^2 + (75 - 76.6)^2}{24}\ \text{min}^2 \approx 3811.92\ \text{min}^2$$

(22)

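$\bar{x} = 76.6\ \text{min}$ (sample mean)
$s = \sqrt{3659.44\ \text{min}^2} \approx 60.49\ \text{min}$: sample standard deviation
$\hat{s} = \sqrt{3811.92\ \text{min}^2} \approx 61.74\ \text{min}$: improved sample standard deviation
$\alpha = 10\% = 0.10$ (significance level)

Test statistic:

$$t = \frac{\bar{x} - \mu_0}{s}\sqrt{n-1} = \frac{76.6 - 120}{60.49}\sqrt{25-1} \approx -3.515$$

or

$$t = \frac{\bar{x} - \mu_0}{\hat{s}}\sqrt{n} = \frac{76.6 - 120}{61.74}\sqrt{25} \approx -3.515$$

$H_0(\mu = 120)$ vs $H_1(\mu < 120)$, or $H_0(\mu \geq 120)$ vs $H_1(\mu < 120)$
left-tailed test;
the rejection region: $C = (-\infty, -t_{n-1;2\alpha}]$
$t_{n-1;2\alpha} = t_{25-1;2 \cdot 0.10} = t_{24;0.20} = 1.318$, so $C = (-\infty, -1.318]$
Since $t \approx -3.515 \in C$, we reject $H_0(\mu = 120)$ (that is, $H_0(\mu \geq 120)$), the claim of the journalist.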
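The whole of Example B, from the raw responses to the decision, can be reproduced as follows. This is a sketch in Python: the responses are converted to minutes by hand, the critical value 1.318 is copied from the text, and the variable names are illustrative.

```python
import math

# The 25 responses from Example B, converted to minutes.
times = [30, 60, 20, 0, 75, 45, 60, 120, 135, 180,
         0, 30, 105, 90, 150, 60, 150, 180, 210, 60,
         0, 15, 20, 45, 75]

n = len(times)
mu0 = 120.0  # 2 hours, the value claimed by the journalist

x_bar = sum(times) / n
ss = sum((x - x_bar) ** 2 for x in times)  # sum of squared deviations
s2 = ss / n                # sample variance (divisor n)
s2_hat = ss / (n - 1)      # improved sample variance (divisor n - 1)

t_with_s = (x_bar - mu0) / math.sqrt(s2) * math.sqrt(n - 1)
t_with_s_hat = (x_bar - mu0) / math.sqrt(s2_hat) * math.sqrt(n)

T_24_020 = 1.318  # t_{24;0.20}, copied from the text
reject = t_with_s <= -T_24_020

print(n, round(x_bar, 1), round(s2, 2), round(s2_hat, 2))
print(round(t_with_s, 3), round(t_with_s_hat, 3))
print("reject H0:", reject)
```

Working from the raw list also guards against a transcription slip in the long sums above, since the count of 25 values and the mean of 76.6 minutes are checked directly.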

(23)

Example C: A recent study claimed that the mean yield per apple plant of sort "G" is 60 kilos. A researcher measured the yields of 55 apple plants of sort "G" and obtained the following:

| Yield (kg) per plant | [51, 53) | [53, 55) | [55, 57) | [57, 59) | [59, 61) | [61, 63] |
| Number of plants | 6 | 9 | 10 | 12 | 10 | 8 |

Test the claim of the recent study at the 10% significance level.

X: the yields per apple plant of sort "G"
σ unknown: population standard deviation
$n = 55$ (large sample)

$\bar{x} = ?$

$$\bar{x} = \frac{6 \cdot 52 + 9 \cdot 54 + 10 \cdot 56 + 12 \cdot 58 + 10 \cdot 60 + 8 \cdot 62}{55}\ \text{kg} \approx 57.27\ \text{kg}$$

$s = ?$, $\hat{s} = ?$

$$s^2 = \frac{6 \cdot 52^2 + 9 \cdot 54^2 + 10 \cdot 56^2 + 12 \cdot 58^2 + 10 \cdot 60^2 + 8 \cdot 62^2}{55}\ \text{kg}^2 - 57.27^2\ \text{kg}^2$$

or

$$s^2 = \frac{6(52 - 57.27)^2 + 9(54 - 57.27)^2 + 10(56 - 57.27)^2 + 12(58 - 57.27)^2 + 10(60 - 57.27)^2 + 8(62 - 57.27)^2}{55}\ \text{kg}^2$$

or

$$\hat{s}^2 = \frac{6(52 - 57.27)^2 + 9(54 - 57.27)^2 + 10(56 - 57.27)^2 + 12(58 - 57.27)^2 + 10(60 - 57.27)^2 + 8(62 - 57.27)^2}{54}\ \text{kg}^2$$

$\hat{s}^2 = 9.98\ \text{kg}^2$

$s^2 \approx 9.80\ \text{kg}^2$
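For grouped data such as the yield table, each interval is represented by its midpoint and weighted by the number of plants in it. The Python sketch below reproduces the mean and the two variances only; it uses the unrounded mean, so the last digits may differ slightly from a hand calculation with 57.27. The variable names are illustrative.

```python
# Interval midpoints and frequencies from the yield table in Example C.
midpoints = [52, 54, 56, 58, 60, 62]
counts = [6, 9, 10, 12, 10, 8]

n = sum(counts)
x_bar = sum(f * m for f, m in zip(counts, midpoints)) / n

# Weighted sum of squared deviations from the mean.
ss = sum(f * (m - x_bar) ** 2 for f, m in zip(counts, midpoints))
s2 = ss / n            # sample variance (divisor n)
s2_hat = ss / (n - 1)  # improved sample variance (divisor n - 1)

print(n, round(x_bar, 2), round(s2, 2), round(s2_hat, 2))
```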
