
Values of random variables with the same expected value can deviate differently from that common expected value. We would like to introduce a number that characterizes the dispersion of the values of the random variable $X$ around the expected value $E(X)$. The expected value of $X - E(X)$ is not a suitable measure since, as we mentioned in Section 4.7, it is equal to zero. The value $E(|X - E(X)|)$ is difficult to handle mathematically.

An appropriate measure is the following.

The variance of the random variable $X$ is defined as $D^2(X) := E([X - E(X)]^2)$, and its standard deviation as
$$D(X) := \sqrt{D^2(X)} = \sqrt{E([X - E(X)]^2)},$$
provided that both $X$ and $[X - E(X)]^2$ have an expected value.

One should note that the existence of $D^2(X)$ implies the existence of $E(X)$. At the same time, the converse statement does not hold.

A greater standard deviation expresses a greater average dispersion from the expected value.

We remark that if the non-negative random variable $[X - E(X)]^2$ does not have an expected value, the standard deviation of $X$ can be regarded as plus infinity.

The following useful formula reduces the calculation of the variance to the determination of two expected values.

If $X$ is a random variable, and the random variable $X^2$ has an expected value, then $X$ has both an expected value and a variance, the latter being
$$D^2(X) = E(X^2) - (E(X))^2. \qquad (4.3)$$
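For completeness, formula (4.3) can be verified in one line by expanding the square and using the linearity of the expected value:
$$D^2(X) = E\bigl(X^2 - 2XE(X) + (E(X))^2\bigr) = E(X^2) - 2(E(X))^2 + (E(X))^2 = E(X^2) - (E(X))^2.$$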

Example 4.9. In the example of rolling a die, for the random variable $X(\omega) := \omega$, $\omega = 1, \ldots, 6$, which gives the value of the roll, we have $E(X) = \frac{7}{2}$ (see Example 4.7) and
$$E(X^2) = \frac{1}{6}\cdot 1^2 + \frac{1}{6}\cdot 2^2 + \frac{1}{6}\cdot 3^2 + \frac{1}{6}\cdot 4^2 + \frac{1}{6}\cdot 5^2 + \frac{1}{6}\cdot 6^2 = \frac{91}{6}.$$
This implies
$$D^2(X) = E(X^2) - (E(X))^2 = \frac{91}{6} - \left(\frac{7}{2}\right)^2 = \frac{35}{12}, \qquad D(X) = \sqrt{\frac{35}{12}}.$$
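As a numerical cross-check of Example 4.9, the following minimal Python sketch (an illustration added here, not part of the original text) computes the same moments exactly with rational arithmetic:

```python
from fractions import Fraction

# Fair die: the values 1, ..., 6 each occur with probability 1/6.
p = Fraction(1, 6)
values = range(1, 7)

E_X = sum(p * k for k in values)       # E(X) = 7/2
E_X2 = sum(p * k**2 for k in values)   # E(X^2) = 91/6
variance = E_X2 - E_X**2               # formula (4.3): 35/12

print(E_X, E_X2, variance)
```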

According to the above considerations, the standard deviation is a measure of the dispersion of the range of a random variable around its expected value. Therefore, we intuitively expect that a constant random variable has zero standard deviation, that shifting the range does not change the standard deviation, and that stretching the range from the origin of the real line by a positive factor multiplies the standard deviation by the same factor. The following statement shows that our expectations are fulfilled.

Let $c$ be a real number and let $X$ be a random variable. Then the following statements are valid.

(i) If $X = c$ is constant, then it has a variance, namely, $D^2(X) = 0$.

(ii) If the random variable $X$ has a variance, then the random variable $X + c$ also has a variance, namely, $D^2(X + c) = D^2(X)$.

(iii) If the random variable $X$ has a variance, then the random variable $cX$ also has a variance, namely, $D^2(cX) = c^2 D^2(X)$.

Let $X$, $Y$ be independent random variables in a probability space, both having a variance. Then the random variable $X + Y$ also has a variance, and it satisfies the equality
$$D^2(X + Y) = D^2(X) + D^2(Y). \qquad (4.4)$$

An important case is where all the independent random variables $X_1, \ldots, X_n$ have the same standard deviation. If $\sigma := D(X_1) = \ldots = D(X_n)$, then
$$D(X_1 + \ldots + X_n) = \sqrt{D^2(X_1) + \ldots + D^2(X_n)} = \sqrt{n}\,\sigma.$$
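The following Monte Carlo sketch illustrates the $\sqrt{n}$ law; the choice of Uniform(0,1) summands and the sample size are arbitrary assumptions made for the illustration:

```python
import random
import statistics

# Estimate the standard deviation of a sum of n independent
# Uniform(0,1) random variables from repeated samples.
def sd_of_sum(n, samples=100_000):
    sums = [sum(random.random() for _ in range(n)) for _ in range(samples)]
    return statistics.pstdev(sums)

sigma = (1 / 12) ** 0.5  # D(X) of a single Uniform(0,1) term
for n in (1, 4, 16):
    print(n, round(sd_of_sum(n), 4), round(n**0.5 * sigma, 4))
```

The two printed columns agree up to simulation error: quadrupling the number of terms only doubles the standard deviation of the sum.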

One can see that the standard deviation of the sum increases at a slower and slower pace as the number of terms increases (see also Section 8.1).

If the random variable $X$ is bounded, and the smallest closed interval for which $P(a \le X \le b) = 1$ is the interval $[a, b]$, then the size of the random variable, or the size of the probability distribution of the random variable $X$, is defined as the length $b - a$ of the interval.

Chapter 5

Frequently applied probability distributions

In this chapter, we present the probability distributions frequently encountered in biological applications.

We recall that in order to define a probability distribution one does not need to specify a probability space. In the case of discrete distributions we give the values of the random variable with the corresponding positive probabilities. Continuous distributions will be described by their density functions. We also give the expected value and the variance of the distributions.

5.1 Discrete probability distributions

5.1.1 The discrete uniform distribution

The discrete random variable $X$ has a discrete uniform distribution with parameters $x_1, x_2, \ldots, x_n$ if its values with positive probability are $R(X) := \{x_1, \ldots, x_n\}$, and every value has the same probability, i.e., $p_1 = \ldots = p_n = \frac{1}{n}$.

We gained a probability distribution indeed, since $p_k \ge 0$, $k = 1, \ldots, n$, and $\sum_{k=1}^{n} p_k = 1$.

The expected value of the distribution is
$$E(X) = \frac{1}{n} \sum_{k=1}^{n} x_k,$$
and its variance is
$$D^2(X) = \frac{1}{n} \sum_{k=1}^{n} x_k^2 - \frac{1}{n^2} \left( \sum_{k=1}^{n} x_k \right)^2.$$
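A small helper (a sketch under the definitions above; the function name is ours) that evaluates these two formulas exactly for any finite list of values:

```python
from fractions import Fraction

# Expected value and variance of the discrete uniform distribution
# on the values x_1, ..., x_n, each taken with probability 1/n.
def uniform_moments(xs):
    xs = list(xs)
    n = len(xs)
    mean = Fraction(sum(xs), n)
    variance = Fraction(sum(x * x for x in xs), n) - mean**2
    return mean, variance

print(uniform_moments(range(1, 7)))  # die: (7/2, 35/12), as in Example 5.1
```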

We remark that if the range of the random variable is a countably infinite set, then its elements cannot have the same probability: if that common probability were positive or 0, the sum of the probabilities would not be equal to 1, but to $+\infty$ or $0$, respectively.

Example 5.1. In the example of rolling a die, for the random variable $X(\omega) := \omega$, $\omega = 1, \ldots, 6$, the expected value and the variance are $E(X) = \frac{7}{2}$ and $D^2(X) = \frac{35}{12}$ (see Examples 4.7 and 4.9).

5.1.2 The binomial distribution

Let $n$ be a positive integer, and $0 < p < 1$ a real number. We say that the random variable $X$ has a binomial distribution with parameters $(n, p)$ if
$$x_k := k, \qquad p_k := \binom{n}{k} p^k (1-p)^{n-k}, \qquad k = 0, 1, \ldots, n.$$

An illustration of the binomial distribution for $n = 20$ and three different values of the parameter $p$ is shown in Fig. 5.1.

Figure 5.1: Binomial distribution with three different parameters $p$ in the case $n = 20$.

It is worth introducing the notation $q := 1 - p$, and then we have $p_k = \binom{n}{k} p^k q^{n-k}$, $k = 0, 1, \ldots, n$.

We really gained a probability distribution because the probabilities are non-negative, and, according to the binomial theorem,
$$\sum_{k=0}^{n} \binom{n}{k} p^k q^{n-k} = (p + q)^n = 1.$$

The expected value and variance of the binomial distribution are $E(X) = np$ and $D^2(X) = np(1-p)$.

If $np$ is an integer, then the distribution is unimodal and $p_{np}$ is the greatest probability value; if $np$ is not an integer, then either one or both of the two neighbouring integers is (are) the index (indices) of maximal probability.

In Example 2.4 on sampling with replacement, the number of selected red balls, and in Section 3.6, in Bernoulli trials, the number of occurrences of the given event, together with the corresponding probabilities, follow a binomial distribution.
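As a sanity check, here is a short sketch (standard-library Python only) that evaluates the probabilities $p_k$ directly from the definition and verifies numerically that they sum to 1 and reproduce $E(X) = np$:

```python
from math import comb

# Binomial probabilities p_k = C(n,k) p^k (1-p)^(n-k), k = 0, ..., n
def binomial_pmf(n, p):
    return [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

n, p = 20, 0.3
pk = binomial_pmf(n, p)
print(sum(pk))                                     # 1.0 (up to rounding)
print(sum(k * prob for k, prob in enumerate(pk)))  # n*p = 6.0
```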

5.1.3 The hypergeometric distribution

Let $N$ be a positive integer, $K$ and $n$ be non-negative integers, moreover, $n \le N$ and $K \le N$. We say that the random variable $X$ has a hypergeometric distribution with parameters $(N, K, n)$ if
$$x_k := k, \qquad p_k := \frac{\binom{K}{k} \binom{N-K}{n-k}}{\binom{N}{n}}, \qquad k = 0, \ldots, n.$$

We remind the reader that $\binom{r}{s} = 0$ if $r < s$, and so there may be probability values that are equal to 0.

We have really obtained a probability distribution, since all the probabilities are non-negative, and one can show that their sum is equal to 1.

The expected value and variance of the distribution are
$$E(X) = n\frac{K}{N}, \qquad D^2(X) = n\frac{K}{N}\left(1 - \frac{K}{N}\right)\left(1 - \frac{n-1}{N-1}\right).$$

The probabilities of the hypergeometric distribution with parameters $(N, K, n)$ are well approximated by the corresponding probabilities of the binomial distribution with parameters $\left(n, \frac{K}{N}\right)$ if $K$ and $N - K$ are large and $n$ is small. The expected values of both distributions are equal to $n\frac{K}{N}$.

Example 5.2. An urn contains $K$ white and $N - K$ green balls. We draw $n$ balls at random, sampling without replacement. Then the probability that $k$ white balls will be drawn is
$$p_k = \frac{\binom{K}{k} \binom{N-K}{n-k}}{\binom{N}{n}}, \qquad k = 0, \ldots, n.$$
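The binomial approximation mentioned above is easy to observe numerically; in the sketch below the parameter values are arbitrary assumptions chosen so that $K$ and $N - K$ are large and $n$ is small:

```python
from math import comb

# Hypergeometric probabilities with parameters (N, K, n)...
def hypergeometric_pmf(N, K, n):
    return [comb(K, k) * comb(N - K, n - k) / comb(N, n) for k in range(n + 1)]

# ...and the approximating binomial probabilities with parameters (n, K/N).
def binomial_pmf(n, p):
    return [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

N, K, n = 10_000, 3_000, 10
for h, b in zip(hypergeometric_pmf(N, K, n), binomial_pmf(n, K / N)):
    print(f"{h:.5f}  {b:.5f}")  # the two columns agree to about 3 decimals
```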

5.1.4 The Poisson distribution

Let $\lambda$ be a positive real parameter. We say that the random variable $X$ has a Poisson distribution with parameter $\lambda$ if
$$x_k := k, \qquad p_k := e^{-\lambda} \frac{\lambda^k}{k!}, \qquad k = 0, 1, 2, \ldots$$

This is a probability distribution indeed because the probabilities are non-negative, and their sum is
$$\sum_{k=0}^{\infty} e^{-\lambda} \frac{\lambda^k}{k!} = e^{-\lambda} \sum_{k=0}^{\infty} \frac{\lambda^k}{k!} = e^{-\lambda} e^{\lambda} = 1,$$
since, from the Taylor series expansion of the exponential function around $0$, the sum $\sum_{k=0}^{\infty} \frac{\lambda^k}{k!}$ equals $e^{\lambda}$.

An illustration of the Poisson distribution with different parameters is shown in Fig. 5.2.

Figure 5.2: Poisson distribution with three different parameters λ.

The expected value and variance of the distribution are $E(X) = \lambda$ and $D^2(X) = \lambda$.

Therefore, the Poisson distribution is completely determined by its expected value.

If the expected value $\lambda$ is not an integer, then $p_{[\lambda]}$ is the greatest probability, so the distribution is unimodal with mode $[\lambda]$ (here $[\lambda]$ denotes the integer part of $\lambda$). If $\lambda$ is an integer, then $p_\lambda = p_{\lambda-1}$ is the greatest probability, so the distribution is bimodal, with $\lambda$ and $\lambda - 1$ being its two modes.

The Poisson distribution is often encountered in applications. The number of "rare events", i.e., when the number of possible events is large and each of them is rare (for example, the number of a given type of blood cells in a small field of view of a microscope, the number of individuals of a plant species in a randomly chosen sample quadrat in a meadow, or the number of tiny particles floating in a randomly chosen unit volume of air or water), often follows a Poisson distribution.

Example 5.3. The relative frequency of a disease is 0.005 in the population. We assume that the cases of the individual people are independent events. We determine the probability that exactly 2 people get the disease in a village with 1000 inhabitants. We calculate the expected value and the standard deviation of the number of diseased people. Then we compare the results with those obtained by the Poisson approximation.

Solution: Let $\Omega$ be the set of inhabitants of the given village and, in the classical probability space defined on it, let the random variable $X_1$ denote the number of diseased people in the village. This random variable has a binomial distribution with parameters $n := 1000$ and $p := 0.005$.

The probability that exactly 2 people get the disease in this village is
$$P(X_1 = 2) = \binom{1000}{2} \cdot 0.005^2 \cdot 0.995^{998} \approx 0.0839.$$

The expected value and standard deviation of the number of diseased villagers are
$$E(X_1) = np = 5, \qquad D(X_1) = \sqrt{np(1-p)} \approx 2.230.$$

By the Poisson approximation: let $\lambda := np = 5$ and let $X_2$ be a random variable having a Poisson distribution with parameter $\lambda$; then
$$P(X_2 = 2) = e^{-5} \frac{5^2}{2!} \approx 0.0842,$$
moreover,
$$E(X_2) = \lambda = 5, \qquad D(X_2) = \sqrt{\lambda} \approx 2.236.$$

So the error of the approximation by the Poisson distribution has an order of magnitude of $10^{-3}$.
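The figures in Example 5.3 can be reproduced with a few lines of standard-library Python (a verification sketch, not part of the original solution):

```python
from math import comb, exp, factorial, sqrt

n, p = 1000, 0.005
lam = n * p  # Poisson parameter lambda = n*p = 5

# Exact binomial values
print(comb(n, 2) * p**2 * (1 - p)**(n - 2))  # P(X1 = 2) ~ 0.0839
print(n * p, sqrt(n * p * (1 - p)))          # E(X1) = 5, D(X1) ~ 2.230

# Poisson approximation
print(exp(-lam) * lam**2 / factorial(2))     # P(X2 = 2) ~ 0.0842
print(lam, sqrt(lam))                        # E(X2) = 5, D(X2) ~ 2.236
```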

In the previous example the binomial distribution was "well" approximated by the Poisson distribution. This is always the case when the parameter $n$ of the binomial distribution is "sufficiently large" and its parameter $p$ is "sufficiently small". The reason for this is that the Poisson distribution can be obtained from the binomial distribution by taking a limit as follows:

If $\lambda$ is a given positive number, $p_n := \frac{\lambda}{n}$, and $p_0^{(n)}, \ldots, p_n^{(n)}$ denote the probabilities of the binomial distribution with parameters $(n, p_n)$, then
$$\lim_{n \to \infty} p_k^{(n)} = \frac{\lambda^k}{k!} e^{-\lambda}, \qquad k = 0, 1, 2, \ldots,$$
so the limit is the probability of the Poisson distribution with parameter $\lambda$ corresponding to the same index.

We remark that the binomial and Poisson distributions in this statement have the same expected value, since
$$E_{\text{binomial}} = np_n = \lambda = E_{\text{Poisson}}.$$
Furthermore, if $n \to \infty$, then $p_n = \frac{\lambda}{n} \to 0$, therefore
$$D_{\text{binomial}} = \sqrt{np_n(1-p_n)} = \sqrt{\lambda(1-p_n)} \to \sqrt{\lambda} = D_{\text{Poisson}}.$$

5.1.5 The geometric distribution

Let $0 < p < 1$ be a real number. We say that the random variable $X$ has a geometric distribution with parameter $p$ if
$$x_k := k, \qquad p_k := (1-p)^{k-1} p, \qquad k = 1, 2, \ldots$$

It is again worth introducing the notation $q := 1 - p$; then $p_k = q^{k-1} p$, $k = 1, 2, \ldots$ We really gained a probability distribution because the probabilities are non-negative and, by using the sum of the geometric series with quotient $q$,
$$\sum_{k=1}^{\infty} q^{k-1} p = p \sum_{k=1}^{\infty} q^{k-1} = p \cdot \frac{1}{1-q} = 1.$$

The expected value and variance of the distribution are
$$E(X) = \frac{1}{p}, \qquad D^2(X) = \frac{1-p}{p^2}.$$
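A quick simulation sketch of these two formulas (the parameter and the sample size are arbitrary choices made for the illustration):

```python
import random
import statistics

# Number of independent trials until the first success of an event
# that occurs with probability p in each trial.
def trials_until_success(p):
    k = 1
    while random.random() >= p:
        k += 1
    return k

p = 0.25
xs = [trials_until_success(p) for _ in range(200_000)]
print(statistics.mean(xs), 1 / p)                # both ~ 4.0
print(statistics.pvariance(xs), (1 - p) / p**2)  # both ~ 12.0
```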

Example 5.4. Let $E$ be an event with probability $p$ in the probability space $(\Omega, \mathcal{A}, P)$. We repeatedly execute the experiment independently until event $E$ occurs. The number of executions of the experiment is a random variable having a geometric distribution with parameter $p$.

Solution: Indeed, if the experiment has been executed $k$ times and $E$ first occurred the last time, then in the first $k-1$ experiments the complement $\overline{E}$ occurred, each time with probability $1-p$. Due to the independence of the experiments,
$$P(E \text{ first occurred in the } k\text{th experiment}) = (1-p)^{k-1} p.$$

We remark that if we denote the number of executed experiments by $X$, then the sum of the events
$$X = 1, \quad X = 2, \quad X = 3, \ldots \qquad (5.1)$$
is not the sure event, since it can happen that for every positive integer $n$ event $E$ does not occur in the $n$th iteration of the experiment. On the other hand, the sum of the probabilities of the events in (5.1) is equal to 1, therefore the probability that "$E$ never occurs in an infinite sequence of independent experiments" is 0.

5.1.6 The negative binomial distribution

Let $n$ be a positive integer, and $0 < p < 1$ a real number. We say that the random variable $X$ has a negative binomial distribution with parameters $(n, p)$ if
$$x_k := n + k, \qquad p_k := \binom{n+k-1}{k} (1-p)^k p^n, \qquad k = 0, 1, 2, \ldots$$
The notation $q := 1 - p$ can again be used, and then $p_k = \binom{n+k-1}{k} q^k p^n$, $k = 0, 1, 2, \ldots$ It is easy to show that we gained a probability distribution indeed. The expected value and variance are

$$E(X) = \frac{n}{p}, \qquad D^2(X) = \frac{n(1-p)}{p^2}.$$

Example 5.5. Assume that in a population the children born one after the other are boys with probability $p$ (and girls with probability $1-p$). For a given positive integer $n$, let $X$ be the random variable that gives which child is the $n$th boy. Then $X$ has a negative binomial distribution with parameters $(n, p)$.
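Example 5.5 is easy to simulate; the following sketch (with arbitrarily chosen $n$ and $p$) estimates $E(X)$ and $D^2(X)$ and compares them with $n/p$ and $n(1-p)/p^2$:

```python
import random
import statistics

# Count the children born until the n-th boy, each child being a boy
# independently with probability p (Example 5.5).
def children_until_nth_boy(n, p):
    boys = children = 0
    while boys < n:
        children += 1
        if random.random() < p:
            boys += 1
    return children

n, p = 3, 0.51
xs = [children_until_nth_boy(n, p) for _ in range(100_000)]
print(statistics.mean(xs), n / p)                    # both ~ 5.88
print(statistics.pvariance(xs), n * (1 - p) / p**2)  # both ~ 5.65
```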