
Hypothesis Test of Two Population Proportions

In document Biometry (pages 95-104)

4. HYPOTHESIS TESTING

4.4. Two-Sample Parametric Tests

4.4.4. Hypothesis Test of Two Population Proportions

In this section the method for testing two independent population proportions will be presented.

As discussed in Chapter 3.6, the sampling distribution of the difference of two sample proportions is approximately normal when the samples are large. The test rests on two basic assumptions: the samples are large enough for this normal approximation to hold, and the two populations are independent of each other, so the two sampling distributions are also independent.

First, the null and the alternate hypotheses are defined. The null hypothesis states that H0: P1 − P2 = c, where c is a constant. As the values of P1 and P2 are between 0 and 1, the value of their difference is between −1 and 1. In most cases it is assumed that c is equal to 0, to test the equality of the proportions. In this case the null hypothesis states that the population proportions are equal, while the alternate hypothesis states that there is a difference, either regardless of direction or in one of the two possible directions (Table 12).

Table 12 The hypotheses for the two population proportions

| Two Proportions | Left-tailed | Two-tailed | Right-tailed |
|---|---|---|---|
| Null hypothesis | H0: P1 ≥ P2 | H0: P1 = P2 | H0: P1 ≤ P2 |
| Alternate hypothesis | H1: P1 < P2 | H1: P1 ≠ P2 | H1: P1 > P2 |

After setting the level of significance, the test statistic with a known theoretical distribution has to be defined. As discussed earlier, if the sample sizes n1 and n2 are greater than 30 and the values of P1 and P2 are not close to 0 or 1, the test statistic has the standard normal distribution:

$$ z = \frac{(p_1 - p_2) - (P_1 - P_2)}{\sigma_{p_1 - p_2}} $$

Since the values of P1 and P2 are unknown, $\sigma_{p_1 - p_2}$ is estimated by $s_{p_1 - p_2}$, which is the standard error of the estimated difference between the proportions. When the null hypothesis states that the population proportions are equal, H0: P1 = P2 = P, the best estimate for this common value P (based on both samples) is the pooled or combined proportion

$$ \bar{p} = \frac{f_1 + f_2}{n_1 + n_2} = \frac{n_1 \cdot p_1 + n_2 \cdot p_2}{n_1 + n_2} $$

(f is the number of successes out of n trials in each sample), and the estimate of the standard error can be calculated as

$$ s_{p_1 - p_2} = \sqrt{\bar{p}\,(1 - \bar{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} $$

where $\bar{p}$ is the pooled proportion.

The critical value (z), obtained from the tables in the Appendices or determined by statistical software (as presented in Chapter 1.6), defines the rejection and acceptance regions (similar to Figure 19). The value of the test statistic calculated from the samples is compared with the critical value(s). If the value of the test statistic is more extreme than the critical value, that is, it falls in a rejection region, the decision will be to reject the null hypothesis.
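The procedure described above can be sketched in Python. This is a minimal illustration, assuming c = 0 and the pooled standard error, and using only the standard library; the function name `two_proportion_z_test`, its parameters and the counts in the usage line are hypothetical and do not come from the text.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(f1, n1, f2, n2, alpha=0.05, tail="two"):
    """Pooled z-test of H0: P1 = P2 (hypothetical helper, not from the source).

    f1, f2: number of successes; n1, n2: sample sizes.
    tail: "two" (H1: P1 != P2), "left" (H1: P1 < P2) or "right" (H1: P1 > P2).
    Returns (z, critical value(s), reject_H0).
    """
    p1, p2 = f1 / n1, f2 / n2                      # sample proportions
    p_pooled = (f1 + f2) / (n1 + n2)               # pooled proportion
    se = sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se                             # test statistic under H0

    std_normal = NormalDist()                      # standard normal distribution
    if tail == "two":
        crit = std_normal.inv_cdf(1 - alpha / 2)
        return z, (-crit, crit), abs(z) > crit
    if tail == "left":
        crit = std_normal.inv_cdf(alpha)
        return z, (crit,), z < crit
    if tail == "right":
        crit = std_normal.inv_cdf(1 - alpha)
        return z, (crit,), z > crit
    raise ValueError("tail must be 'two', 'left' or 'right'")


# Made-up counts, purely for illustration
z, critical, reject = two_proportion_z_test(40, 100, 55, 100)
print(round(z, 3), critical, reject)
```

The `tail` argument mirrors the three columns of Table 12; the normal approximation is only reasonable under the sample-size conditions stated earlier in this section.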

EXAMPLE 4.17

The proportion of carp was examined in the fish stock of Lake Balaton on the northern and southern shores. Of the 250 fish caught on the north shore, 98 were carp, while on the south shore, 149 of the 250 fish were carp. With a 0.05 level of significance, can it be assumed that the proportion of carp does not differ on the northern and southern shores? (Hunkár, 2011)

1. Stating the hypotheses: ‘there is a significant difference between the proportion of carp on the northern and southern shores’. With index 1 for the north shore and index 2 for the south shore, the hypotheses are:

H0: P1 = P2, that is, H0: P1 − P2 = 0

H1: P1 ≠ P2. This indicates that a two-sided test is appropriate.

2. ‘Use α = 0.05’

3. The equation is used for determining the test statistic and its distribution when H0 is correct:

$$ z = \frac{(p_1 - p_2) - (P_1 - P_2)}{s_{p_1 - p_2}} $$

4. Determine the critical value(s) and define a rejection and an acceptance region for α: as it is a two-sided test, there will be two critical values.

The upper critical value is z0.975 = 1.96 and the lower is z0.025 = −z0.975 = −1.96; the acceptance region lies between these values.

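5. Calculate the value of the test statistic from the samples. The necessary sample statistics are:

n1 = 250, f1 = 98, p1 = 98/250 = 0.392 (north shore)

n2 = 250, f2 = 149, p2 = 149/250 = 0.596 (south shore)

The pooled or combined proportion, on which the formula for the standard error is based, is:

$$ \bar{p} = \frac{98 + 149}{250 + 250} = \frac{247}{500} = 0.494 $$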

6. Make a decision regarding H0 by comparing the test statistic value with the critical values: the test statistic is lower than the critical value … (z = -4.660 < z0.025 = -1.96)

7. Give a conclusion on the original problem, interpret the results: The proportions of carp differ significantly between the northern and southern shores; at the 5% level of significance the evidence is sufficient to reject the null hypothesis.
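The arithmetic of Example 4.17 can be checked with a short Python sketch using only the standard library. It computes the test statistic with both the pooled and the unpooled standard error, because the value z = -4.660 quoted in step 6 is reproduced by the unpooled form, while the pooled form gives about -4.56; which of the two the original calculation used cannot be confirmed from the text. The variable names are illustrative, and index 1 is assumed to denote the north shore.

```python
from math import sqrt
from statistics import NormalDist

# Data from Example 4.17 (assumed indexing: 1 = north shore, 2 = south shore)
f1, n1 = 98, 250
f2, n2 = 149, 250

p1, p2 = f1 / n1, f2 / n2              # sample proportions
p_pooled = (f1 + f2) / (n1 + n2)       # pooled (combined) proportion

# Standard error based on the pooled proportion (H0: P1 = P2)
se_pooled = sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
# Standard error based on the separate sample proportions
se_unpooled = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

z_pooled = (p1 - p2) / se_pooled
z_unpooled = (p1 - p2) / se_unpooled

# Two-sided critical value for alpha = 0.05
z_crit = NormalDist().inv_cdf(0.975)

print(f"p1 = {p1:.3f}, p2 = {p2:.3f}, pooled = {p_pooled:.3f}")
print(f"z (pooled SE)   = {z_pooled:.3f}")
print(f"z (unpooled SE) = {z_unpooled:.3f}")
print(f"critical values = +/-{z_crit:.2f}")
print(f"reject H0: {abs(z_pooled) > z_crit and abs(z_unpooled) > z_crit}")
```

Either way the statistic lies far below the lower critical value of -1.96, so the decision to reject H0 does not depend on the choice of standard error here.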


EXERCISES

Broiler chickens were fed with four types of feed (four types of soybean A, B, C, CP). Types A and B are new soybean varieties (with less antinutritive material), C is an old variety and CP is Type C flushed with flocked lupine. The experimental animals were housed in 6 cages per treatment. Individual weights were measured in grams on days 10, 24, and 42.

1. Classify the variables

• as qualitative or quantitative,

• by their scale of measurement.

Give an example of a random variable.

2. Suppose the weights of the chickens (X) on day 10 are normally distributed with a mean of 210 g and a standard deviation of 37 g.

Find the following probabilities:

• P(X ≤ 200)

• P(X < 250)

• P(X > 240)

• P(190 ≤ X < 220)

3. On day 10

• find the descriptive statistics for the individual weight, by feed type,

• construct a frequency table, a histogram and a frequency polygon, by feed type,

• check the normality by feed type.

4. Find the 95% and 99% confidence intervals for the unknown population mean of chicken weight by feed types A and CP, compare the two intervals and the margins of error.

5. Find the 90% and 95% confidence intervals for the unknown population variance of chicken weight and draw some conclusions.

6. Find the 95% and 99% confidence intervals for the difference of the two unknown population means of soybean A and CP and draw some conclusions.

7. Can it be assumed that the population variance for the different feed types is equal to 37² on day 10?

8. Can it be assumed that the population means of Feed A and Feed CP are higher than 210 g on day 10?

9. Can it be assumed that the population mean of Feed A is lower than that of Feed CP on day 10?
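As a hint for Exercise 2, probabilities of a normally distributed variable can be obtained from its cumulative distribution function. The sketch below uses `statistics.NormalDist` from the Python standard library with the mean and standard deviation stated in the exercise; the variable names are illustrative. For a continuous variable, P(X ≤ a) and P(X < a) coincide.

```python
from statistics import NormalDist

# Day-10 weight model stated in Exercise 2: mean 210 g, standard deviation 37 g
weight = NormalDist(mu=210, sigma=37)

p_le_200 = weight.cdf(200)                       # P(X <= 200)
p_lt_250 = weight.cdf(250)                       # P(X < 250)
p_gt_240 = 1 - weight.cdf(240)                   # P(X > 240)
p_190_220 = weight.cdf(220) - weight.cdf(190)    # P(190 <= X < 220)

for label, value in [
    ("P(X <= 200)", p_le_200),
    ("P(X < 250)", p_lt_250),
    ("P(X > 240)", p_gt_240),
    ("P(190 <= X < 220)", p_190_220),
]:
    print(f"{label} = {value:.4f}")
```

The same values can be read from Table A.1 after standardising, z = (x − 210) / 37.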


BIBLIOGRAPHY

Bajpai, N. (2010): Business Statistics. Dorling Kindersley. ISBN 978-81-317-2602-0

Barlow, R. J. (1999): A Guide to the Use of Statistical Methods in the Physical Sciences. John Wiley & Sons. ISBN 0-471-92294-3

Bolla M. – Krámli A. (2005): Statisztikai következtetések elmélete. Typotex. ISBN 963-9548-41-3

Brase, C. H. – Brase, C. P. (2010): Understandable Statistics: Concepts and Methods. Brooks/Cole, Boston. ISBN 978-1-4390-4779-8

Cseh E. – Farkas B. – Kocsis L. – Korcz E. – Tóth É. – Poór J. (2015): Növényi kivonatok hatásának in vitro vizsgálata BOTRYTIS CINEREA PERS. esetében. Növényvédelem 51 (6), pp. 249-256.

Ireland, C. (2010): Experimental Statistics for Agriculture & Horticulture. Cambridge University Press, Cambridge. ISBN 978-1-84593-537-5

Freedman, D. – Pisani, R. – Purves, R. (2005): Statisztika. TYPOTEX, Budapest. ISBN 963-9548-63-4

Gaál M. (2004): A biometria számítógépes alkalmazásai a környezeti- és agrártudományokban. Aula Kiadó Kft. Budapesti CORVINUS Egyetem. ISBN 978-9-639-58530-0

Hajtman B. (1968): Bevezetés a matematikai statisztikába. Akadémiai Kiadó, Budapest.

Hanke, J. E. – Reitsch A. G. (1991): Understanding Business Statistics. Richard D. Irwin, Boston. ISBN 0-256-06627-2

Harnos Zs. – Ladányi M. (2005): Biometria agrártudományi alkalmazásokkal. Aula Kiadó Kft. Budapesti CORVINUS Egyetem. ISBN 978-9-639-58551-5

Healey, J. F. (2009): Statistics: A Tool for Social Research. Wadsworth Cengage Learning Academic Resource Center. ISBN 978-0-495-09655-9

Hooda, R. P. (2013): Statistics for Business and Economics. Vikas Publishing House. ISBN 978-93-259-6120-3

Hunkár, M. (2011): Biometria – Feladatgyűjtemény. Pannon Egyetem, Georgikon Kar.

Hunyadi L. – Vita L. (2002): Statisztika közgazdászoknak. KSH, Budapest. ISBN 963-215-498-3

Jánossy A. – Muraközy T. – Aradszky G.né (1966): Biometriai értelmező szótár. Mezőgazdasági Kiadó, Budapest.

Jolicoeur, P. (1999): Introduction to Biometry. Springer Science – Business Media, LLC. New York. ISBN 978-1-4613-7163-2

Kaps, M. – Lamberson, W. R. (2004): Biostatistics for Animal Science. Wallingford, Oxfordshire, UK; Cambridge, MA : CABI Pub. ISBN 0-85199-820-8

Kovács E. (2014): Többváltozós statisztika. Typotex. ISBN 978-963-279-243-9

Kozak, A. – Kozak, R. A. – Staudhammer, C. L. – Watts, S. B. (2008): Introductory Probability and Statistics. Applications for Forestry and Natural Sciences. Wallingford, Oxfordshire, UK ; Cambridge, MA : CABI Pub. ISBN 978-1-84593-275-6

Mann, P. S. (2010): Introductory Statistics. John Wiley & Sons. ISBN 978-0-470-44466-5


Marques de Sá, J. P. (2003): Applied Statistics using SPSS, STATISTICA and MATLAB. Springer-Verlag, Berlin Heidelberg. ISBN 3-540-01156-0

Márton, A. (2018): A takarmányozás hatása az anyajuhok szaporodásbiológiai tulajdonságaira. Doktori PhD értekezés. Pannon Egyetem, Festetics Doktori Iskola.

McDonald, J. H. (2009): Handbook of Biological Statistics. Sparky House Publishing.

Mead, R. – Curnow, R. N. – Hasted, A. M. (2003): Statistical Methods in Agriculture and Experimental Biology. Chapman & Hall/CRC. ISBN 1-58488-187-9

Nadar, E. N. (2015): Statistics. PHI Learning Private Limited, New Delhi. ISBN 978-81-203-5086-1

Norman, G. R. – Streiner, D. L. (2008): Biostatistics: The Bare Essentials. B. C. Decker Inc., Hamilton. ISBN 978-1-55009-347-6

Pituch, K. – Stevens, J. P. (2016): Applied Multivariate Statistics for the Social Sciences: Analyses with SAS and IBM's SPSS. Routledge, Taylor & Francis Group.

Poór, J. (2014): Descriptive Statistics. Kaposvár University – University of Pannonia – Cereal Research Non-Profit Ltd. ISBN 978-963-9639-65-2

Rangaswamy, R. (2006): A Text Book of Agricultural Statistics. New Age International Publishers, New Delhi. ISBN 81-224-0758-7

Rosner, B. (2011): Fundamentals of Biostatistics. Brooks/Cole, Cengage Learning. ISBN 978-0-538-73349-6

Sakdeo, B. M. (2017): Fundamentals of biometry. Laxmi Book Publication. ISBN 978-1-365-28684-1

Sharma, A. K. (2005): Textbook of Biostatistics I. Discovery Publishing House, New Delhi. ISBN 81-8356-030-X

Sharma, A. K. (2005): Textbook of Biostatistics II. Discovery Publishing House, New Delhi. ISBN 81-8356-031-8

Sahoo, P. (2003): Probability and Mathematical Statistics. Department of Mathematics, University of Louisville, Louisville.

Sokal, R. R. – Rohlf, F. J. (2009): Introduction to Biostatistics. Dover Publications. Inc. ISBN 978-0486-46961-4

Sváb J. (1980): Biometria jegyzet. Budapest, MÉM Mérnök- és Vezetőtovábbképző Intézet.

Sváb J. (1981): Biometriai módszerek a kutatásban. Mezőgazdasági Kiadó, Budapest. ISBN 963-231-013-6

Sváb J. (1979): Többváltozós módszerek a biometriában. Mezőgazdasági Kiadó, Budapest. ISBN 963-230-011-4

Upton, G. – Cook, I. (2001): Introducing Statistics. Oxford University Press. ISBN 978-0-19-914-801-1

Wasserman, L. (2005): All of Statistics. A Concise Course in Statistical Inference. Springer. ISBN 0-387-40272-1

Wegner, T. (2007): Applied Business Statistics – Methods and Excel-based Applications. Juta & Co., Wetton. ISBN 978-0-702-17286-1

Winner, L. (2004): Introduction to Biostatistics. University of Florida Department of Statistics.

Zar, J. H. (2010): Biostatistical Analysis. Pearson Prentice Hall. ISBN 978-0-13-100846-5


APPENDICES

TABLE A.1 AREA UNDER THE STANDARD NORMAL CURVE

TABLE A.2 CRITICAL VALUES OF THE t DISTRIBUTION

TABLE A.3 CRITICAL VALUES OF THE χ² DISTRIBUTION

TABLE A.4 CRITICAL VALUES OF THE F DISTRIBUTION


Table A.1 Area under the standard normal curve

z 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09

Table A.2 Critical values of the t distribution

Percentile values for the t distribution with v degrees of freedom

Table A.3 Critical values of the χ² distribution

Percentile values for the chi-square distribution with v degrees of freedom

| v | 0.005 | 0.01 | 0.025 | 0.05 | 0.10 | 0.25 | 0.50 | 0.75 | 0.90 | 0.95 | 0.975 | 0.99 | 0.995 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 80 | 51.1719 | 53.5401 | 57.1532 | 60.3915 | 64.2778 | 71.1445 | 79.3343 | 88.1303 | 96.5782 | 101.8795 | 106.6286 | 112.3288 | 116.3211 |
| 90 | 59.1963 | 61.7541 | 65.6466 | 69.1260 | 73.2911 | 80.6247 | 89.3342 | 98.6499 | 107.5650 | 113.1453 | 118.1359 | 124.1163 | 128.2989 |
| 100 | 67.3276 | 70.0649 | 74.2219 | 77.9295 | 82.3581 | 90.1332 | 99.3341 | 109.1412 | 118.4980 | 124.3421 | 129.5612 | 135.8067 | 140.1695 |
| 120 | 83.8516 | 86.9233 | 91.5726 | 95.7046 | 100.6236 | 109.2197 | 119.3340 | 130.0546 | 140.2326 | 146.5674 | 152.2114 | 158.9502 | 163.6482 |


Table A.4 Critical values of the F distribution (F > F0.95)

Numerator degrees of freedom


Manuscript closed: 31 December 2020

All rights reserved. No part of this work may be reproduced, used or transmitted in any form, or by any means – graphic, electronic or mechanical, including photocopying, recording, or information storage and retrieval systems – without the written permission of the author.
