3 Growth of Functions

The order of growth of the running time of an algorithm, defined in Chapter 2, gives a simple characterization of the algorithm's efficiency and also allows us to compare the relative performance of alternative algorithms. Once the input size n becomes large enough, merge sort, with its Θ(n lg n) worst-case running time, beats insertion sort, whose worst-case running time is Θ(n²). Although we can sometimes determine the exact running time of an algorithm, as we did for insertion sort in Chapter 2, the extra precision is not usually worth the effort of computing it. For large enough inputs, the multiplicative constants and lower-order terms of an exact running time are dominated by the effects of the input size itself.

When we look at input sizes large enough to make only the order of growth of the running time relevant, we are studying the asymptotic efficiency of algorithms. That is, we are concerned with how the running time of an algorithm increases with the size of the input in the limit, as the size of the input increases without bound. Usually, an algorithm that is asymptotically more efficient will be the best choice for all but very small inputs.

This chapter gives several standard methods for simplifying the asymptotic analysis of algorithms. The next section begins by defining several types of "asymptotic notation," of which we have already seen an example in Θ-notation. Several notational conventions used throughout this book are then presented, and finally we review the behavior of functions that commonly arise in the analysis of algorithms.

3.1 Asymptotic notation

The notations we use to describe the asymptotic running time of an algorithm are defined in terms of functions whose domains are the set of natural numbers ℕ = {0, 1, 2, . . .}. Such notations are convenient for describing the worst-case running-time function T(n), which is usually defined only on integer input sizes.

It is sometimes convenient, however, to abuse asymptotic notation in a variety of ways. For example, the notation is easily extended to the domain of real numbers or, alternatively, restricted to a subset of the natural numbers. It is important, however, to understand the precise meaning of the notation so that when it is abused, it is not misused. This section defines the basic asymptotic notations and also introduces some common abuses.

Θ-notation

In Chapter 2, we found that the worst-case running time of insertion sort is T(n) = Θ(n²). Let us define what this notation means. For a given function g(n), we denote by Θ(g(n)) the set of functions

Θ(g(n)) = { f(n) : there exist positive constants c₁, c₂, and n₀ such that
            0 ≤ c₁g(n) ≤ f(n) ≤ c₂g(n) for all n ≥ n₀ }.¹

¹Within set notation, a colon should be read as "such that."

A function f(n) belongs to the set Θ(g(n)) if there exist positive constants c₁ and c₂ such that it can be "sandwiched" between c₁g(n) and c₂g(n), for sufficiently large n. Because Θ(g(n)) is a set, we could write "f(n) ∈ Θ(g(n))" to indicate that f(n) is a member of Θ(g(n)). Instead, we will usually write "f(n) = Θ(g(n))" to express the same notion. This abuse of equality to denote set membership may at first appear confusing, but we shall see later in this section that it has advantages.

Figure 3.1(a) gives an intuitive picture of functions f(n) and g(n), where we have that f(n) = Θ(g(n)). For all values of n to the right of n₀, the value of f(n) lies at or above c₁g(n) and at or below c₂g(n). In other words, for all n ≥ n₀, the function f(n) is equal to g(n) to within a constant factor. We say that g(n) is an asymptotically tight bound for f(n).

The definition of Θ(g(n)) requires that every member f(n) ∈ Θ(g(n)) be asymptotically nonnegative, that is, that f(n) be nonnegative whenever n is sufficiently large. (An asymptotically positive function is one that is positive for all sufficiently large n.) Consequently, the function g(n) itself must be asymptotically nonnegative, or else the set Θ(g(n)) is empty. We shall therefore assume that every function used within Θ-notation is asymptotically nonnegative. This assumption holds for the other asymptotic notations defined in this chapter as well.

Figure 3.1 (diagram omitted in this extraction) Graphic examples of the Θ, O, and Ω notations. In each part, the value of n₀ shown is the minimum possible value; any greater value would also work. (a) Θ-notation bounds a function to within constant factors. We write f(n) = Θ(g(n)) if there exist positive constants n₀, c₁, and c₂ such that to the right of n₀, the value of f(n) always lies between c₁g(n) and c₂g(n) inclusive. (b) O-notation gives an upper bound for a function to within a constant factor. We write f(n) = O(g(n)) if there are positive constants n₀ and c such that to the right of n₀, the value of f(n) always lies on or below cg(n). (c) Ω-notation gives a lower bound for a function to within a constant factor. We write f(n) = Ω(g(n)) if there are positive constants n₀ and c such that to the right of n₀, the value of f(n) always lies on or above cg(n).

In Chapter 2, we introduced an informal notion of Θ-notation that amounted to throwing away lower-order terms and ignoring the leading coefficient of the highest-order term. Let us briefly justify this intuition by using the formal definition to show that (1/2)n² − 3n = Θ(n²). To do so, we must determine positive constants c₁, c₂, and n₀ such that

c₁n² ≤ (1/2)n² − 3n ≤ c₂n²

for all n ≥ n₀. Dividing by n² yields

c₁ ≤ 1/2 − 3/n ≤ c₂ .

The right-hand inequality can be made to hold for any value of n ≥ 1 by choosing c₂ ≥ 1/2. Likewise, the left-hand inequality can be made to hold for any value of n ≥ 7 by choosing c₁ ≤ 1/14. Thus, by choosing c₁ = 1/14, c₂ = 1/2, and n₀ = 7, we can verify that (1/2)n² − 3n = Θ(n²). Certainly, other choices for the constants exist, but the important thing is that some choice exists. Note that these constants depend on the function (1/2)n² − 3n; a different function belonging to Θ(n²) would usually require different constants.
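These constants are easy to spot-check numerically. The following sketch (an illustration only; a finite sample cannot establish a claim about all n ≥ n₀) samples values of n and confirms the sandwich with c₁ = 1/14, c₂ = 1/2, n₀ = 7:

```python
# Finite spot-check (not a proof) of the constants chosen above:
# c1 = 1/14, c2 = 1/2, n0 = 7 for f(n) = (1/2)n^2 - 3n and g(n) = n^2.
c1, c2, n0 = 1/14, 1/2, 7

def f(n):
    return n * n / 2 - 3 * n

assert all(c1 * n * n <= f(n) <= c2 * n * n for n in range(n0, 10**6, 997))
print("sandwich holds at every sampled n >= n0")
```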

We can also use the formal definition to verify that 6n³ ≠ Θ(n²). Suppose for the purpose of contradiction that c₂ and n₀ exist such that 6n³ ≤ c₂n² for all n ≥ n₀. But then n ≤ c₂/6, which cannot possibly hold for arbitrarily large n, since c₂ is constant.

Intuitively, the lower-order terms of an asymptotically positive function can be ignored in determining asymptotically tight bounds because they are insignificant for large n. A tiny fraction of the highest-order term is enough to dominate the lower-order terms. Thus, setting c₁ to a value that is slightly smaller than the coefficient of the highest-order term and setting c₂ to a value that is slightly larger permits the inequalities in the definition of Θ-notation to be satisfied. The coefficient of the highest-order term can likewise be ignored, since it only changes c₁ and c₂ by a constant factor equal to the coefficient.

As an example, consider any quadratic function f(n) = an² + bn + c, where a, b, and c are constants and a > 0. Throwing away the lower-order terms and ignoring the constant yields f(n) = Θ(n²). Formally, to show the same thing, we take the constants c₁ = a/4, c₂ = 7a/4, and n₀ = 2·max(|b|/a, √(|c|/a)). The reader may verify that 0 ≤ c₁n² ≤ an² + bn + c ≤ c₂n² for all n ≥ n₀. In general, for any polynomial p(n) = Σ_{i=0}^{d} aᵢnⁱ, where the aᵢ are constants and a_d > 0, we have p(n) = Θ(n^d) (see Problem 3-1).

Since any constant is a degree-0 polynomial, we can express any constant function as Θ(n⁰), or Θ(1). This latter notation is a minor abuse, however, because it is not clear what variable is tending to infinity.² We shall often use the notation Θ(1) to mean either a constant or a constant function with respect to some variable.

O-notation

The Θ-notation asymptotically bounds a function from above and below. When we have only an asymptotic upper bound, we use O-notation. For a given function g(n), we denote by O(g(n)) (pronounced "big-oh of g of n" or sometimes just "oh of g of n") the set of functions

O(g(n)) = { f(n) : there exist positive constants c and n₀ such that
            0 ≤ f(n) ≤ cg(n) for all n ≥ n₀ }.

We use O-notation to give an upper bound on a function, to within a constant factor. Figure 3.1(b) shows the intuition behind O-notation. For all values n to the right of n₀, the value of the function f(n) is on or below cg(n).

We write f(n) = O(g(n)) to indicate that a function f(n) is a member of the set O(g(n)). Note that f(n) = Θ(g(n)) implies f(n) = O(g(n)), since Θ-notation is a stronger notion than O-notation. Written set-theoretically, we have Θ(g(n)) ⊆ O(g(n)). Thus, our proof that any quadratic function an² + bn + c, where a > 0, is in Θ(n²) also shows that any quadratic function is in O(n²). What may be more surprising is that any linear function an + b is in O(n²), which is easily verified by taking c = a + |b| and n₀ = 1.

²The real problem is that our ordinary notation for functions does not distinguish functions from values. In λ-calculus, the parameters to a function are clearly specified: the function n² could be written as λn.n², or even λr.r². Adopting a more rigorous notation, however, would complicate algebraic manipulations, and so we choose to tolerate the abuse.
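The claim above that any linear function an + b is in O(n²) is easy to spot-check numerically; a sketch with a few arbitrarily chosen pairs (a, b):

```python
# Spot-check that an + b <= (a + |b|) n^2 for all n >= 1 (c = a + |b|,
# n0 = 1, as in the text); the (a, b) pairs are arbitrary samples.
for a, b in [(1, 0), (3, -5), (2, 100)]:
    c = a + abs(b)
    assert all(a * n + b <= c * n * n for n in range(1, 10**4))
print("linear functions stay below c * n^2 on the sample")
```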


Some readers who have seen O-notation before may find it strange that we should write, for example, n = O(n²). In the literature, O-notation is sometimes used informally to describe asymptotically tight bounds, that is, what we have defined using Θ-notation. In this book, however, when we write f(n) = O(g(n)), we are merely claiming that some constant multiple of g(n) is an asymptotic upper bound on f(n), with no claim about how tight an upper bound it is. Distinguishing asymptotic upper bounds from asymptotically tight bounds has now become standard in the algorithms literature.

Using O-notation, we can often describe the running time of an algorithm merely by inspecting the algorithm's overall structure. For example, the doubly nested loop structure of the insertion sort algorithm from Chapter 2 immediately yields an O(n²) upper bound on the worst-case running time: the cost of each iteration of the inner loop is bounded from above by O(1) (constant), the indices i and j are both at most n, and the inner loop is executed at most once for each of the n² pairs of values for i and j.
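For concreteness, here is a minimal Python transcription of the insertion sort pseudocode from Chapter 2 (the rendering is ours); the comments mark the doubly nested loop structure from which the O(n²) bound is read off:

```python
def insertion_sort(A):
    """Sort list A in place; the doubly nested loop structure gives
    the O(n^2) worst-case bound discussed above."""
    for j in range(1, len(A)):           # outer loop: j = 1, ..., n-1
        key = A[j]
        i = j - 1
        # inner loop: executed at most n times per outer iteration
        while i >= 0 and A[i] > key:
            A[i + 1] = A[i]              # shift larger elements right
            i -= 1
        A[i + 1] = key
    return A

print(insertion_sort([5, 2, 4, 6, 1, 3]))   # [1, 2, 3, 4, 5, 6]
```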

Since O-notation describes an upper bound, when we use it to bound the worst-case running time of an algorithm, we have a bound on the running time of the algorithm on every input. Thus, the O(n²) bound on the worst-case running time of insertion sort also applies to its running time on every input. The Θ(n²) bound on the worst-case running time of insertion sort, however, does not imply a Θ(n²) bound on the running time of insertion sort on every input. For example, we saw in Chapter 2 that when the input is already sorted, insertion sort runs in Θ(n) time.

Technically, it is an abuse to say that the running time of insertion sort is O(n²), since for a given n, the actual running time varies, depending on the particular input of size n. When we say "the running time is O(n²)," we mean that there is a function f(n) that is O(n²) such that for any value of n, no matter what particular input of size n is chosen, the running time on that input is bounded from above by the value f(n). Equivalently, we mean that the worst-case running time is O(n²).

Ω-notation

Just as O-notation provides an asymptotic upper bound on a function, Ω-notation provides an asymptotic lower bound. For a given function g(n), we denote by Ω(g(n)) (pronounced "big-omega of g of n" or sometimes just "omega of g of n") the set of functions

Ω(g(n)) = { f(n) : there exist positive constants c and n₀ such that
            0 ≤ cg(n) ≤ f(n) for all n ≥ n₀ }.

The intuition behind Ω-notation is shown in Figure 3.1(c). For all values n to the right of n₀, the value of f(n) is on or above cg(n).

From the definitions of the asymptotic notations we have seen thus far, it is easy to prove the following important theorem (see Exercise 3.1-5).


Theorem 3.1

For any two functions f(n) and g(n), we have f(n) = Θ(g(n)) if and only if f(n) = O(g(n)) and f(n) = Ω(g(n)).

As an example of the application of this theorem, our proof that an² + bn + c = Θ(n²) for any constants a, b, and c, where a > 0, immediately implies that an² + bn + c = Ω(n²) and an² + bn + c = O(n²). In practice, rather than using Theorem 3.1 to obtain asymptotic upper and lower bounds from asymptotically tight bounds, as we did for this example, we usually use it to prove asymptotically tight bounds from asymptotic upper and lower bounds.

Since Ω-notation describes a lower bound, when we use it to bound the best-case running time of an algorithm, by implication we also bound the running time of the algorithm on arbitrary inputs as well. For example, the best-case running time of insertion sort is Ω(n), which implies that the running time of insertion sort is Ω(n).

The running time of insertion sort therefore falls between Ω(n) and O(n²), since it falls anywhere between a linear function of n and a quadratic function of n. Moreover, these bounds are asymptotically as tight as possible: for instance, the running time of insertion sort is not Ω(n²), since there exists an input for which insertion sort runs in Θ(n) time (e.g., when the input is already sorted). It is not contradictory, however, to say that the worst-case running time of insertion sort is Ω(n²), since there exists an input that causes the algorithm to take Ω(n²) time.

When we say that the running time (no modifier) of an algorithm is Ω(g(n)), we mean that no matter what particular input of size n is chosen for each value of n, the running time on that input is at least a constant times g(n), for sufficiently large n.

Asymptotic notation in equations and inequalities

We have already seen how asymptotic notation can be used within mathematical formulas. For example, in introducing O-notation, we wrote "n = O(n²)." We might also write 2n² + 3n + 1 = 2n² + Θ(n). How do we interpret such formulas?

When the asymptotic notation stands alone on the right-hand side of an equation (or inequality), as in n = O(n²), we have already defined the equal sign to mean set membership: n ∈ O(n²). In general, however, when asymptotic notation appears in a formula, we interpret it as standing for some anonymous function that we do not care to name. For example, the formula 2n² + 3n + 1 = 2n² + Θ(n) means that 2n² + 3n + 1 = 2n² + f(n), where f(n) is some function in the set Θ(n). In this case, f(n) = 3n + 1, which indeed is in Θ(n).

Using asymptotic notation in this manner can help eliminate inessential detail and clutter in an equation. For example, in Chapter 2 we expressed the worst-case running time of merge sort as the recurrence

T(n) = 2T(n/2) + Θ(n) .


If we are interested only in the asymptotic behavior of T(n), there is no point in specifying all the lower-order terms exactly; they are all understood to be included in the anonymous function denoted by the term Θ(n).

The number of anonymous functions in an expression is understood to be equal to the number of times the asymptotic notation appears. For example, in the expression

Σ_{i=1}^{n} O(i) ,

there is only a single anonymous function (a function of i). This expression is thus not the same as O(1) + O(2) + ⋯ + O(n), which doesn't really have a clean interpretation.

In some cases, asymptotic notation appears on the left-hand side of an equation, as in

2n² + Θ(n) = Θ(n²) .

We interpret such equations using the following rule: No matter how the anonymous functions are chosen on the left of the equal sign, there is a way to choose the anonymous functions on the right of the equal sign to make the equation valid.

Thus, the meaning of our example is that for any function f(n) ∈ Θ(n), there is some function g(n) ∈ Θ(n²) such that 2n² + f(n) = g(n) for all n. In other words, the right-hand side of an equation provides a coarser level of detail than the left-hand side.

A number of such relationships can be chained together, as in

2n² + 3n + 1 = 2n² + Θ(n)
             = Θ(n²) .

We can interpret each equation separately by the rule above. The first equation says that there is some function f(n) ∈ Θ(n) such that 2n² + 3n + 1 = 2n² + f(n) for all n. The second equation says that for any function g(n) ∈ Θ(n) (such as the f(n) just mentioned), there is some function h(n) ∈ Θ(n²) such that 2n² + g(n) = h(n) for all n. Note that this interpretation implies that 2n² + 3n + 1 = Θ(n²), which is what the chaining of equations intuitively gives us.

o-notation

The asymptotic upper bound provided by O-notation may or may not be asymptotically tight. The bound 2n² = O(n²) is asymptotically tight, but the bound 2n = O(n²) is not. We use o-notation to denote an upper bound that is not asymptotically tight. We formally define o(g(n)) ("little-oh of g of n") as the set

o(g(n)) = { f(n) : for any positive constant c > 0, there exists a constant
            n₀ > 0 such that 0 ≤ f(n) < cg(n) for all n ≥ n₀ }.

For example, 2n = o(n²), but 2n² ≠ o(n²).

The definitions of O-notation and o-notation are similar. The main difference is that in f(n) = O(g(n)), the bound 0 ≤ f(n) ≤ cg(n) holds for some constant c > 0, but in f(n) = o(g(n)), the bound 0 ≤ f(n) < cg(n) holds for all constants c > 0. Intuitively, in the o-notation, the function f(n) becomes insignificant relative to g(n) as n approaches infinity; that is,

lim_{n→∞} f(n)/g(n) = 0 .   (3.1)

Some authors use this limit as a definition of the o-notation; the definition in this book also restricts the anonymous functions to be asymptotically nonnegative.
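The limit (3.1) is easy to observe numerically. A small sketch contrasting 2n = o(n²) with 2n², whose ratio to n² never shrinks:

```python
# The ratio f(n)/g(n) for f(n) = 2n, g(n) = n^2 heads to 0 (so 2n = o(n^2)),
# while 2n^2/n^2 is stuck at the constant 2 (so 2n^2 is not o(n^2)).
for n in [10, 100, 1000, 10000]:
    print(n, 2 * n / n**2, 2 * n**2 / n**2)
# ratios: 0.2, 0.02, 0.002, 0.0002   versus   2.0, 2.0, 2.0, 2.0
```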

ω-notation

By analogy, ω-notation is to Ω-notation as o-notation is to O-notation. We use ω-notation to denote a lower bound that is not asymptotically tight. One way to define it is by

f(n) ∈ ω(g(n)) if and only if g(n) ∈ o(f(n)) .

Formally, however, we define ω(g(n)) ("little-omega of g of n") as the set

ω(g(n)) = { f(n) : for any positive constant c > 0, there exists a constant
            n₀ > 0 such that 0 ≤ cg(n) < f(n) for all n ≥ n₀ }.

For example, n²/2 = ω(n), but n²/2 ≠ ω(n²). The relation f(n) = ω(g(n)) implies that

lim_{n→∞} f(n)/g(n) = ∞ ,

if the limit exists. That is, f(n) becomes arbitrarily large relative to g(n) as n approaches infinity.

Comparison of functions

Many of the relational properties of real numbers apply to asymptotic comparisons as well. For the following, assume that f(n) and g(n) are asymptotically positive.


Transitivity:

f(n) = Θ(g(n)) and g(n) = Θ(h(n)) imply f(n) = Θ(h(n)) ,
f(n) = O(g(n)) and g(n) = O(h(n)) imply f(n) = O(h(n)) ,
f(n) = Ω(g(n)) and g(n) = Ω(h(n)) imply f(n) = Ω(h(n)) ,
f(n) = o(g(n)) and g(n) = o(h(n)) imply f(n) = o(h(n)) ,
f(n) = ω(g(n)) and g(n) = ω(h(n)) imply f(n) = ω(h(n)) .

Reflexivity:

f(n) = Θ(f(n)) ,
f(n) = O(f(n)) ,
f(n) = Ω(f(n)) .

Symmetry:

f(n) = Θ(g(n)) if and only if g(n) = Θ(f(n)) .

Transpose symmetry:

f(n) = O(g(n)) if and only if g(n) = Ω(f(n)) ,
f(n) = o(g(n)) if and only if g(n) = ω(f(n)) .

Because these properties hold for asymptotic notations, one can draw an analogy between the asymptotic comparison of two functions f and g and the comparison of two real numbers a and b:

f(n) = O(g(n)) ≈ a ≤ b ,
f(n) = Ω(g(n)) ≈ a ≥ b ,
f(n) = Θ(g(n)) ≈ a = b ,
f(n) = o(g(n)) ≈ a < b ,
f(n) = ω(g(n)) ≈ a > b .

We say that f(n) is asymptotically smaller than g(n) if f(n) = o(g(n)), and f(n) is asymptotically larger than g(n) if f(n) = ω(g(n)).

One property of real numbers, however, does not carry over to asymptotic notation:

Trichotomy: For any two real numbers a and b, exactly one of the following must hold: a < b, a = b, or a > b.

Although any two real numbers can be compared, not all functions are asymptotically comparable. That is, for two functions f(n) and g(n), it may be the case that neither f(n) = O(g(n)) nor f(n) = Ω(g(n)) holds. For example, the functions n and n^{1+sin n} cannot be compared using asymptotic notation, since the value of the exponent in n^{1+sin n} oscillates between 0 and 2, taking on all values in between.

Exercises

3.1-1
Let f(n) and g(n) be asymptotically nonnegative functions. Using the basic definition of Θ-notation, prove that max(f(n), g(n)) = Θ(f(n) + g(n)).

3.1-2

Show that for any real constants a and b, where b > 0,

(n + a)^b = Θ(n^b) .   (3.2)

3.1-3

Explain why the statement, "The running time of algorithm A is at least O(n²)," is meaningless.

3.1-4

Is 2^{n+1} = O(2^n)? Is 2^{2n} = O(2^n)?

3.1-5

Prove Theorem 3.1.

3.1-6

Prove that the running time of an algorithm is Θ(g(n)) if and only if its worst-case running time is O(g(n)) and its best-case running time is Ω(g(n)).

3.1-7

Prove that o(g(n)) ∩ ω(g(n)) is the empty set.

3.1-8

We can extend our notation to the case of two parameters n and m that can go to infinity independently at different rates. For a given function g(n, m), we denote by O(g(n, m)) the set of functions

O(g(n, m)) = { f(n, m) : there exist positive constants c, n₀, and m₀
               such that 0 ≤ f(n, m) ≤ cg(n, m) for all n ≥ n₀ and m ≥ m₀ }.

Give corresponding definitions for Ω(g(n, m)) and Θ(g(n, m)).


3.2 Standard notations and common functions

This section reviews some standard mathematical functions and notations and explores the relationships among them. It also illustrates the use of the asymptotic notations.

Monotonicity

A function f(n) is monotonically increasing if m ≤ n implies f(m) ≤ f(n). Similarly, it is monotonically decreasing if m ≤ n implies f(m) ≥ f(n). A function f(n) is strictly increasing if m < n implies f(m) < f(n) and strictly decreasing if m < n implies f(m) > f(n).

Floors and ceilings

For any real number x, we denote the greatest integer less than or equal to x by ⌊x⌋ (read "the floor of x") and the least integer greater than or equal to x by ⌈x⌉ (read "the ceiling of x"). For all real x,

x − 1 < ⌊x⌋ ≤ x ≤ ⌈x⌉ < x + 1 .   (3.3)

For any integer n,

⌈n/2⌉ + ⌊n/2⌋ = n ,

and for any real number n ≥ 0 and integers a, b > 0,

⌈⌈n/a⌉/b⌉ = ⌈n/(ab)⌉ ,   (3.4)
⌊⌊n/a⌋/b⌋ = ⌊n/(ab)⌋ ,   (3.5)
⌈a/b⌉ ≤ (a + (b − 1))/b ,   (3.6)
⌊a/b⌋ ≥ (a − (b − 1))/b .   (3.7)

The floor function f(x) = ⌊x⌋ is monotonically increasing, as is the ceiling function f(x) = ⌈x⌉.
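Identities (3.4) through (3.7) can be spot-checked with Python's math.floor and math.ceil; the sketch below samples random integer inputs (integers sidestep floating-point edge cases, though the text states (3.4) and (3.5) for real n ≥ 0 as well):

```python
import math
import random

# Spot-check of identities (3.4)-(3.7); random samples, not a proof.
for _ in range(10000):
    n = random.randint(0, 10**6)
    a = random.randint(1, 50)
    b = random.randint(1, 50)
    assert math.ceil(math.ceil(n / a) / b) == math.ceil(n / (a * b))      # (3.4)
    assert math.floor(math.floor(n / a) / b) == math.floor(n / (a * b))   # (3.5)
    assert math.ceil(a / b) <= (a + (b - 1)) / b                          # (3.6)
    assert math.floor(a / b) >= (a - (b - 1)) / b                         # (3.7)
print("identities (3.4)-(3.7) hold on the sample")
```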

Modular arithmetic

For any integer a and any positive integer n, the value a mod n is the remainder (or residue) of the quotient a/n:

a mod n = a − ⌊a/n⌋ n .   (3.8)

Given a well-defined notion of the remainder of one integer when divided by another, it is convenient to provide special notation to indicate equality of remainders. If (a mod n) = (b mod n), we write a ≡ b (mod n) and say that a is equivalent to b, modulo n. In other words, a ≡ b (mod n) if a and b have the same remainder when divided by n. Equivalently, a ≡ b (mod n) if and only if n is a divisor of b − a. We write a ≢ b (mod n) if a is not equivalent to b, modulo n.
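A short sketch of definition (3.8) and the equivalence test; note that Python's % operator already agrees with equation (3.8) when n is positive, even for negative a:

```python
def mod(a, n):
    # Equation (3.8): a mod n = a - floor(a/n) * n; Python's // is floor division.
    return a - (a // n) * n

for a, b, n in [(17, 5, 12), (-3, 9, 12), (100, 30, 7), (100, 31, 7)]:
    assert mod(a, n) == a % n                 # % matches (3.8) for n > 0
    equivalent = (a % n == b % n)             # a = b (mod n)?
    assert equivalent == ((b - a) % n == 0)   # iff n divides b - a
    print(a, b, n, equivalent)
```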

Polynomials

Given a nonnegative integer d, a polynomial in n of degree d is a function p(n) of the form

p(n) = Σ_{i=0}^{d} aᵢnⁱ ,

where the constants a₀, a₁, . . . , a_d are the coefficients of the polynomial and a_d ≠ 0. A polynomial is asymptotically positive if and only if a_d > 0. For an asymptotically positive polynomial p(n) of degree d, we have p(n) = Θ(n^d). For any real constant a ≥ 0, the function n^a is monotonically increasing, and for any real constant a ≤ 0, the function n^a is monotonically decreasing. We say that a function f(n) is polynomially bounded if f(n) = O(n^k) for some constant k.

Exponentials

For all real a > 0, m, and n, we have the following identities:

a⁰ = 1 ,
a¹ = a ,
a⁻¹ = 1/a ,
(a^m)^n = a^{mn} ,
(a^m)^n = (a^n)^m ,
a^m a^n = a^{m+n} .

For all n and a ≥ 1, the function a^n is monotonically increasing in n. When convenient, we shall assume 0⁰ = 1.

The rates of growth of polynomials and exponentials can be related by the following fact. For all real constants a and b such that a > 1,

lim_{n→∞} n^b / a^n = 0 ,   (3.9)

from which we can conclude that n^b = o(a^n). Thus, any exponential function with a base strictly greater than 1 grows faster than any polynomial function.


Using e to denote 2.71828 . . . , the base of the natural logarithm function, we have for all real x,

e^x = 1 + x + x²/2! + x³/3! + ⋯ = Σ_{i=0}^{∞} xⁱ/i! ,   (3.10)

where "!" denotes the factorial function defined later in this section. For all real x, we have the inequality

e^x ≥ 1 + x ,   (3.11)

where equality holds only when x = 0. When |x| ≤ 1, we have the approximation

1 + x ≤ e^x ≤ 1 + x + x² .   (3.12)

When x → 0, the approximation of e^x by 1 + x is quite good:

e^x = 1 + x + Θ(x²) .

(In this equation, the asymptotic notation is used to describe the limiting behavior as x → 0 rather than as x → ∞.) We have for all x,

lim_{n→∞} (1 + x/n)^n = e^x .   (3.13)
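Inequalities (3.11) and (3.12) and the limit (3.13) can all be observed numerically; a small sketch (samples only, not a proof):

```python
import math

# Numeric look at (3.11), (3.12), and (3.13).
for x in [-0.9, -0.1, 0.0, 0.5, 1.0]:
    assert math.exp(x) >= 1 + x                         # (3.11)
    if abs(x) <= 1:
        assert 1 + x <= math.exp(x) <= 1 + x + x * x    # (3.12)

x = 1.0
for n in [10, 100, 10000]:
    print(n, (1 + x / n) ** n)   # approaches e = 2.71828... as n grows, per (3.13)
```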

Logarithms

We shall use the following notations:

lg n = log₂ n  (binary logarithm) ,
ln n = log_e n  (natural logarithm) ,
lg^k n = (lg n)^k  (exponentiation) ,
lg lg n = lg(lg n)  (composition) .

An important notational convention we shall adopt is that logarithm functions will apply only to the next term in the formula, so that lg n + k will mean (lg n) + k and not lg(n + k). If we hold b > 1 constant, then for n > 0, the function log_b n is strictly increasing.

For all real a > 0, b > 0, c > 0, and n,

a = b^{log_b a} ,
log_c(ab) = log_c a + log_c b ,
log_b a^n = n log_b a ,
log_b a = log_c a / log_c b ,   (3.14)
log_b(1/a) = −log_b a ,
log_b a = 1 / log_a b ,
a^{log_b c} = c^{log_b a} ,   (3.15)

where, in each equation above, logarithm bases are not 1.

By equation (3.14), changing the base of a logarithm from one constant to another only changes the value of the logarithm by a constant factor, and so we shall often use the notation "lg n" when we don't care about constant factors, such as in O-notation. Computer scientists find 2 to be the most natural base for logarithms because so many algorithms and data structures involve splitting a problem into two parts.

There is a simple series expansion for ln(1 + x) when |x| < 1:

ln(1 + x) = x − x²/2 + x³/3 − x⁴/4 + x⁵/5 − ⋯ .

We also have the following inequalities for x > −1:

x/(1 + x) ≤ ln(1 + x) ≤ x ,   (3.16)

where equality holds only for x = 0.

We say that a function f(n) is polylogarithmically bounded if f(n) = O(lg^k n) for some constant k. We can relate the growth of polynomials and polylogarithms by substituting lg n for n and 2^a for a in equation (3.9), yielding

lim_{n→∞} lg^b n / (2^a)^{lg n} = lim_{n→∞} lg^b n / n^a = 0 .

From this limit, we can conclude that

lg^b n = o(n^a)

for any constant a > 0. Thus, any positive polynomial function grows faster than any polylogarithmic function.

Factorials

The notation n! (read "n factorial") is defined for integers n ≥ 0 as

n! = 1              if n = 0 ,
n! = n · (n − 1)!   if n > 0 .

Thus, n! = 1 · 2 · 3 ⋯ n.

A weak upper bound on the factorial function is n! ≤ n^n, since each of the n terms in the factorial product is at most n. Stirling's approximation,

n! = √(2πn) (n/e)^n (1 + Θ(1/n)) ,   (3.17)

where e is the base of the natural logarithm, gives us a tighter upper bound, and a lower bound as well. One can prove (see Exercise 3.2-3)

n! = o(n^n) ,
n! = ω(2^n) ,
lg(n!) = Θ(n lg n) ,   (3.18)

where Stirling's approximation is helpful in proving equation (3.18). The following equation also holds for all n ≥ 1:

n! = √(2πn) (n/e)^n e^{αₙ} ,   (3.19)

where

1/(12n + 1) < αₙ < 1/(12n) .   (3.20)
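Equations (3.19) and (3.20) invite a direct numeric check: solve (3.19) for αₙ and compare it against the two bounds. A sketch for a few values of n:

```python
import math

# Check (3.19)-(3.20): alpha_n = ln(n! / (sqrt(2*pi*n) * (n/e)^n)) must lie
# strictly between 1/(12n + 1) and 1/(12n).
for n in [1, 2, 5, 10, 20, 50]:
    stirling = math.sqrt(2 * math.pi * n) * (n / math.e) ** n
    alpha = math.log(math.factorial(n) / stirling)
    assert 1 / (12 * n + 1) < alpha < 1 / (12 * n)
print("alpha_n lies within the bounds (3.20) for the sampled n")
```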

Functional iteration

We use the notation f^(i)(n) to denote the function f(n) iteratively applied i times to an initial value of n. Formally, let f(n) be a function over the reals. For nonnegative integers i, we recursively define

f^(i)(n) = n                if i = 0 ,
f^(i)(n) = f(f^(i−1)(n))    if i > 0 .

For example, if f(n) = 2n, then f^(i)(n) = 2^i n.
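This definition translates directly into code; the helper name iterate below is ours, not notation from the text:

```python
# Direct rendering of the definition: apply f to n exactly i times
# (i = 0 returns n unchanged).
def iterate(f, n, i):
    for _ in range(i):
        n = f(n)
    return n

double = lambda n: 2 * n
print([iterate(double, 3, i) for i in range(5)])   # [3, 6, 12, 24, 48] = 2^i * 3
```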

The iterated logarithm function

We use the notation lg* n (read "log star of n") to denote the iterated logarithm, which is defined as follows. Let lg^(i) n be as defined above, with f(n) = lg n. Because the logarithm of a nonpositive number is undefined, lg^(i) n is defined only if lg^(i−1) n > 0. Be sure to distinguish lg^(i) n (the logarithm function applied i times in succession, starting with argument n) from lg^i n (the logarithm of n raised to the i-th power). The iterated logarithm function is defined as

lg* n = min { i ≥ 0 : lg^(i) n ≤ 1 } .

The iterated logarithm is a very slowly growing function:

lg* 2 = 1 ,
lg* 4 = 2 ,
lg* 16 = 3 ,
lg* 65536 = 4 ,
lg*(2^65536) = 5 .

Since the number of atoms in the observable universe is estimated to be about 10^80, which is much less than 2^65536, we rarely encounter an input size n such that lg* n > 5.
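A sketch of lg* that reproduces the table above; Python's math.log2 accepts arbitrarily large integers, so even 2^65536 poses no problem:

```python
import math

# lg*(n): the least i >= 0 such that i applications of lg bring n to <= 1.
def lg_star(n):
    i = 0
    while n > 1:
        n = math.log2(n)
        i += 1
    return i

for n in [2, 4, 16, 65536, 2 ** 65536]:
    print(lg_star(n))   # 1, 2, 3, 4, 5 -- matching the table above
```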

Fibonacci numbers

The Fibonacci numbers are defined by the following recurrence:

F₀ = 0 ,
F₁ = 1 ,   (3.21)
Fᵢ = Fᵢ₋₁ + Fᵢ₋₂  for i ≥ 2 .

Thus, each Fibonacci number is the sum of the two previous ones, yielding the sequence

0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, . . . .

Fibonacci numbers are related to the golden ratio φ and to its conjugate φ̂, which are given by the following formulas:

φ = (1 + √5)/2 = 1.61803 . . . ,   (3.22)
φ̂ = (1 − √5)/2 = −0.61803 . . . .

Specifically, we have

Fᵢ = (φ^i − φ̂^i)/√5 ,   (3.23)

which can be proved by induction (Exercise 3.2-6). Since |φ̂| < 1, we have |φ̂^i|/√5 < 1/√5 < 1/2, so that the i-th Fibonacci number Fᵢ is equal to φ^i/√5 rounded to the nearest integer. Thus, Fibonacci numbers grow exponentially.


Exercises

3.2-1
Show that if f(n) and g(n) are monotonically increasing functions, then so are the functions f(n) + g(n) and f(g(n)), and if f(n) and g(n) are in addition nonnegative, then f(n) · g(n) is monotonically increasing.

3.2-2

Prove equation (3.15).

3.2-3

Prove equation (3.18). Also prove that n! = ω(2^n) and n! = o(n^n).

3.2-4

Is the function ⌈lg n⌉! polynomially bounded? Is the function ⌈lg lg n⌉! polynomially bounded?

3.2-5

Which is asymptotically larger: lg(lg* n) or lg*(lg n)?

3.2-6

Prove by induction that the i-th Fibonacci number satisfies the equality

Fᵢ = (φ^i − φ̂^i)/√5 ,

where φ is the golden ratio and φ̂ is its conjugate.

3.2-7

Prove that for i ≥ 0, the (i + 2)nd Fibonacci number satisfies F_{i+2} ≥ φ^i.

Problems

3-1 Asymptotic behavior of polynomials
Let

p(n) = Σ_{i=0}^{d} aᵢnⁱ ,

where a_d > 0, be a degree-d polynomial in n, and let k be a constant. Use the definitions of the asymptotic notations to prove the following properties.

a. If k ≥ d, then p(n) = O(n^k).

b. If k ≤ d, then p(n) = Ω(n^k).

c. If k = d, then p(n) = Θ(n^k).

d. If k > d, then p(n) = o(n^k).

e. If k < d, then p(n) = ω(n^k).

3-2 Relative asymptotic growths

Indicate, for each pair of expressions (A, B) in the table below, whether A is O, o, Ω, ω, or Θ of B. Assume that k ≥ 1, ε > 0, and c > 1 are constants. Your answer should be in the form of the table with "yes" or "no" written in each box.

       A           B           O   o   Ω   ω   Θ
a.   lg^k n      n^ε
b.   n^k         c^n
c.   √n          n^{sin n}
d.   2^n         2^{n/2}
e.   n^{lg c}    c^{lg n}
f.   lg(n!)      lg(n^n)

3-3 Ordering by asymptotic growth rates

a. Rank the following functions by order of growth; that is, find an arrangement g₁, g₂, . . . , g₃₀ of the functions satisfying g₁ = Ω(g₂), g₂ = Ω(g₃), . . . , g₂₉ = Ω(g₃₀). Partition your list into equivalence classes such that f(n) and g(n) are in the same class if and only if f(n) = Θ(g(n)).

lg(lg* n)    2^{lg* n}    (√2)^{lg n}    n²    n!    (lg n)!
(3/2)^n    n³    lg² n    lg(n!)    2^{2^n}    n^{1/lg n}
ln ln n    lg* n    n · 2^n    n^{lg lg n}    ln n    1
2^{lg n}    (lg n)^{lg n}    e^n    4^{lg n}    (n + 1)!    √(lg n)
lg*(lg n)    2^{√(2 lg n)}    n    2^n    n lg n    2^{2^{n+1}}

b. Give an example of a single nonnegative function f(n) such that for all functions gᵢ(n) in part (a), f(n) is neither O(gᵢ(n)) nor Ω(gᵢ(n)).


3-4 Asymptotic notation properties

Let f(n) and g(n) be asymptotically positive functions. Prove or disprove each of the following conjectures.

a. f(n) = O(g(n)) implies g(n) = O(f(n)).

b. f(n) + g(n) = Θ(min(f(n), g(n))).

c. f(n) = O(g(n)) implies lg(f(n)) = O(lg(g(n))), where lg(g(n)) ≥ 1 and f(n) ≥ 1 for all sufficiently large n.

d. f(n) = O(g(n)) implies 2^{f(n)} = O(2^{g(n)}).

e. f(n) = O((f(n))²).

f. f(n) = O(g(n)) implies g(n) = Ω(f(n)).

g. f(n) = Θ(f(n/2)).

h. f(n) + o(f(n)) = Θ(f(n)).

3-5 Variations on O and Ω
Some authors define Ω in a slightly different way than we do; let's use Ω∞ (read "omega infinity"; the ∞ is conventionally written as a superscript) for this alternative definition. We say that f(n) = Ω∞(g(n)) if there exists a positive constant c such that f(n) ≥ cg(n) ≥ 0 for infinitely many integers n.

a. Show that for any two functions f(n) and g(n) that are asymptotically nonnegative, either f(n) = O(g(n)) or f(n) = Ω∞(g(n)) or both, whereas this is not true if we use Ω in place of Ω∞.

b. Describe the potential advantages and disadvantages of using Ω∞ instead of Ω to characterize the running times of programs.

Some authors also define O in a slightly different manner; let's use O′ for the alternative definition. We say that f(n) = O′(g(n)) if and only if |f(n)| = O(g(n)).

c. What happens to each direction of the "if and only if" in Theorem 3.1 if we substitute O′ for O but still use Ω?

Some authors define Õ (read "soft-oh") to mean O with logarithmic factors ignored:

Õ(g(n)) = { f(n) : there exist positive constants c, k, and n₀ such that
            0 ≤ f(n) ≤ cg(n) lg^k(n) for all n ≥ n₀ }.

d. Define Ω̃ and Θ̃ in a similar manner. Prove the corresponding analog to Theorem 3.1.

3-6 Iterated functions

The iteration operator * used in the lg* function can be applied to any monotonically increasing function f(n) over the reals. For a given constant c ∈ ℝ, we define the iterated function f_c* by

f_c*(n) = min { i ≥ 0 : f^(i)(n) ≤ c } ,

which need not be well defined in all cases. In other words, the quantity f_c*(n) is the number of iterated applications of the function f required to reduce its argument down to c or less.

For each of the following functions f(n) and constants c, give as tight a bound as possible on f_c*(n).

        f(n)        c     f_c*(n)
a.   n − 1          0
b.   lg n           1
c.   n/2            1
d.   n/2            2
e.   √n             2
f.   √n             1
g.   n^{1/3}        2
h.   n/lg n         2

Chapter notes

Knuth [182] traces the origin of the O-notation to a number-theory text by P. Bachmann in 1892. The o-notation was invented by E. Landau in 1909 for his discussion of the distribution of prime numbers. The Ω and Θ notations were advocated by Knuth [186] to correct the popular, but technically sloppy, practice in the literature of using O-notation for both upper and lower bounds. Many people continue to use the O-notation where the Θ-notation is more technically precise. Further discussion of the history and development of asymptotic notations can be found in Knuth [182, 186] and Brassard and Bratley [46].

Not all authors define the asymptotic notations in the same way, although the various definitions agree in most common situations. Some of the alternative definitions encompass functions that are not asymptotically nonnegative, as long as their absolute values are appropriately bounded.

Equation (3.19) is due to Robbins [260]. Other properties of elementary mathematical functions can be found in any good mathematical reference, such as Abramowitz and Stegun [1] or Zwillinger [320], or in a calculus book, such as Apostol [18] or Thomas and Finney [296]. Knuth [182] and Graham, Knuth, and Patashnik [132] contain a wealth of material on discrete mathematics as used in computer science.

4 Recurrences

As noted in Section 2.3.2, when an algorithm contains a recursive call to itself, its running time can often be described by a recurrence. A recurrence is an equation or inequality that describes a function in terms of its value on smaller inputs. For example, we saw in Section 2.3.2 that the worst-case running time T(n) of the MERGE-SORT procedure could be described by the recurrence

T(n) = Θ(1)               if n = 1 ,
T(n) = 2T(n/2) + Θ(n)     if n > 1 ,   (4.1)

whose solution was claimed to be T(n) = Θ(n lg n).

This chapter offers three methods for solving recurrences, that is, for obtaining asymptotic "Θ" or "O" bounds on the solution. In the substitution method, we guess a bound and then use mathematical induction to prove our guess correct. The recursion-tree method converts the recurrence into a tree whose nodes represent the costs incurred at various levels of the recursion; we use techniques for bounding summations to solve the recurrence. The master method provides bounds for recurrences of the form

T(n) = aT(n/b) + f(n) ,

where a ≥ 1, b > 1, and f(n) is a given function; it requires memorization of three cases, but once you do that, determining asymptotic bounds for many simple recurrences is easy.

Technicalities

In practice, we neglect certain technical details when we state and solve recurrences. A good example of a detail that is often glossed over is the assumption of integer arguments to functions. Normally, the running time T(n) of an algorithm is only defined when n is an integer, since for most algorithms, the size of the input is always an integer. For example, the recurrence describing the worst-case running time of MERGE-SORT is really

T(n) = Θ(1)                               if n = 1 ,
T(n) = T(⌈n/2⌉) + T(⌊n/2⌋) + Θ(n)         if n > 1 .   (4.2)

Boundary conditions represent another class of details that we typically ignore.

Since the running time of an algorithm on a constant-sized input is a constant, the recurrences that arise from the running times of algorithms generally have T(n) = Θ(1) for sufficiently small n. Consequently, for convenience, we shall generally omit statements of the boundary conditions of recurrences and assume that T(n) is constant for small n. For example, we normally state recurrence (4.1) as

T(n) = 2T(n/2) + Θ(n) ,   (4.3)

without explicitly giving values for small n. The reason is that although changing the value of T(1) changes the solution to the recurrence, the solution typically doesn't change by more than a constant factor, so the order of growth is unchanged.

When we state and solve recurrences, we often omit floors, ceilings, and boundary conditions. We forge ahead without these details and later determine whether or not they matter. They usually don't, but it is important to know when they do. Experience helps, and so do some theorems stating that these details don't affect the asymptotic bounds of many recurrences encountered in the analysis of algorithms (see Theorem 4.1). In this chapter, however, we shall address some of these details to show the fine points of recurrence solution methods.

4.1 The substitution method

The substitution method for solving recurrences entails two steps:

1. Guess the form of the solution.

2. Use mathematical induction to find the constants and show that the solution works.

The name comes from the substitution of the guessed answer for the function when the inductive hypothesis is applied to smaller values. This method is powerful, but it obviously can be applied only in cases when it is easy to guess the form of the answer.

The substitution method can be used to establish either upper or lower bounds on a recurrence. As an example, let us determine an upper bound on the recurrence

T(n) = 2T(⌊n/2⌋) + n ,   (4.4)

which is similar to recurrences (4.2) and (4.3). We guess that the solution is T(n) = O(n lg n). Our method is to prove that T(n) ≤ cn lg n for an appropriate choice of the constant c > 0. We start by assuming that this bound holds for ⌊n/2⌋, that is, that T(⌊n/2⌋) ≤ c⌊n/2⌋ lg(⌊n/2⌋). Substituting into the recurrence yields

T(n) ≤ 2(c⌊n/2⌋ lg(⌊n/2⌋)) + n
     ≤ cn lg(n/2) + n
     = cn lg n − cn lg 2 + n
     = cn lg n − cn + n
     ≤ cn lg n ,

where the last step holds as long as c ≥ 1.
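Before doing the induction, the guess can be sanity-checked numerically. The sketch below assumes the boundary condition T(1) = 1 used in the next paragraph and tries c = 2 (a value the discussion below shows is large enough); as always, a finite check proves nothing by itself:

```python
import math
from functools import lru_cache

# Finite sanity check (not the induction itself) of the guess for (4.4),
# taking the boundary condition T(1) = 1 and the constant c = 2.
@lru_cache(maxsize=None)
def T(n):
    return 1 if n == 1 else 2 * T(n // 2) + n

assert all(T(n) <= 2 * n * math.log2(n) for n in range(2, 20000))
print("T(n) <= 2 n lg n for all 2 <= n < 20000")
```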

Mathematical induction now requires us to show that our solution holds for the boundary conditions. Typically, we do so by showing that the boundary conditions are suitable as base cases for the inductive proof. For the recurrence (4.4), we must show that we can choose the constant c large enough so that the bound T(n) ≤ cn lg n works for the boundary conditions as well. This requirement can sometimes lead to problems. Let us assume, for the sake of argument, that T(1) = 1 is the sole boundary condition of the recurrence. Then for n = 1, the bound T(n) ≤ cn lg n yields T(1) ≤ c · 1 · lg 1 = 0, which is at odds with T(1) = 1. Consequently, the base case of our inductive proof fails to hold.

This difficulty in proving an inductive hypothesis for a specific boundary condition can be easily overcome. For example, in the recurrence (4.4), we take advantage of asymptotic notation only requiring us to prove T(n) ≤ cn lg n for n ≥ n₀, where n₀ is a constant of our choosing. The idea is to remove the difficult boundary condition T(1) = 1 from consideration in the inductive proof. Observe that for n > 3, the recurrence does not depend directly on T(1). Thus, we can replace T(1) by T(2) and T(3) as the base cases in the inductive proof, letting n₀ = 2. Note that we make a distinction between the base case of the recurrence (n = 1) and the base cases of the inductive proof (n = 2 and n = 3). We derive from the recurrence that T(2) = 4 and T(3) = 5. The inductive proof that T(n) ≤ cn lg n for some constant c ≥ 1 can now be completed by choosing c large enough so that T(2) ≤ c · 2 lg 2 and T(3) ≤ c · 3 lg 3. As it turns out, any choice of c ≥ 2 suffices for the base cases of n = 2 and n = 3 to hold. For most of the recurrences we shall examine, it is straightforward to extend boundary conditions to make the inductive assumption work for small n.

Making a good guess

Unfortunately, there is no general way to guess the correct solutions to recurrences. Guessing a solution takes experience and, occasionally, creativity. Fortunately, though, there are some heuristics that can help you become a good guesser. You can also use recursion trees, which we shall see in Section 4.2, to generate good guesses.

If a recurrence is similar to one you have seen before, then guessing a similar solution is reasonable. As an example, consider the recurrence

T(n) = 2T(⌊n/2⌋ + 17) + n ,

which looks difficult because of the added "17" in the argument to T on the right-hand side. Intuitively, however, this additional term cannot substantially affect the solution to the recurrence. When n is large, the difference between T(⌊n/2⌋) and T(⌊n/2⌋ + 17) is not that large: both cut n nearly evenly in half. Consequently, we make the guess that T(n) = O(n lg n), which you can verify as correct by using the substitution method (see Exercise 4.1-5).

Another way to make a good guess is to prove loose upper and lower bounds on the recurrence and then reduce the range of uncertainty. For example, we might start with a lower bound of T(n) = Ω(n) for the recurrence (4.4), since we have the term n in the recurrence, and we can prove an initial upper bound of T(n) = O(n²). Then, we can gradually lower the upper bound and raise the lower bound until we converge on the correct, asymptotically tight solution of T(n) = Θ(n lg n).

Subtleties

There are times when you can correctly guess at an asymptotic bound on the solution of a recurrence, but somehow the math doesn't seem to work out in the induction. Usually, the problem is that the inductive assumption isn't strong enough to prove the detailed bound. When you hit such a snag, revising the guess by subtracting a lower-order term often permits the math to go through.

Consider the recurrence

T(n) = T(⌊n/2⌋) + T(⌈n/2⌉) + 1 .

We guess that the solution is O(n), and we try to show that T(n) ≤ cn for an appropriate choice of the constant c. Substituting our guess in the recurrence, we obtain

T(n) ≤ c⌊n/2⌋ + c⌈n/2⌉ + 1
     = cn + 1 ,

which does not imply T(n) ≤ cn for any choice of c. It's tempting to try a larger guess, say T(n) = O(n²), which can be made to work, but in fact, our guess that the solution is T(n) = O(n) is correct. In order to show this, however, we must make a stronger inductive hypothesis.

Intuitively, our guess is nearly right: we're only off by the constant 1, a lower-order term. Nevertheless, mathematical induction doesn't work unless we prove the exact form of the inductive hypothesis. We overcome our difficulty by subtracting a lower-order term from our previous guess. Our new guess is T(n) ≤ cn − b, where b ≥ 0 is constant. We now have

T(n) ≤ (c⌊n/2⌋ − b) + (c⌈n/2⌉ − b) + 1
     = cn − 2b + 1
     ≤ cn − b ,

as long as b ≥ 1. As before, the constant c must be chosen large enough to handle the boundary conditions.

Most people find the idea of subtracting a lower-order term counterintuitive. After all, if the math doesn't work out, shouldn't we be increasing our guess? The key to understanding this step is to remember that we are using mathematical induction: we can prove something stronger for a given value by assuming something stronger for smaller values.

Avoiding pitfalls

It is easy to err in the use of asymptotic notation. For example, in the recurrence (4.4) we can falsely "prove" T(n) = O(n) by guessing T(n) ≤ cn and then arguing

T(n) ≤ 2(c⌊n/2⌋) + n
     ≤ cn + n
     = O(n) ,   ⇐ wrong!!

since c is a constant. The error is that we haven't proved the exact form of the inductive hypothesis, that is, that T(n) ≤ cn.

Changing variables

Sometimes, a little algebraic manipulation can make an unknown recurrence similar to one you have seen before. As an example, consider the recurrence

T(n) = 2T(⌊√n⌋) + lg n ,

which looks difficult. We can simplify this recurrence, though, with a change of variables. For convenience, we shall not worry about rounding off values, such as √n, to be integers. Renaming m = lg n yields

T(2^m) = 2T(2^{m/2}) + m .

We can now rename S(m) = T(2^m) to produce the new recurrence

S(m) = 2S(m/2) + m ,

which is very much like recurrence (4.4). Indeed, this new recurrence has the same solution: S(m) = O(m lg m). Changing back from S(m) to T(n), we obtain T(n) = T(2^m) = S(m) = O(m lg m) = O(lg n lg lg n).
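A numeric look at the original recurrence supports this answer: if T(n) = O(lg n lg lg n), the ratio T(n)/(lg n lg lg n) should remain bounded as n grows. The boundary condition chosen below is arbitrary, for illustration only:

```python
import math
from functools import lru_cache

# Ratio T(n) / (lg n * lg lg n) for T(n) = 2 T(floor(sqrt(n))) + lg n;
# boundedness of the ratio is consistent with T(n) = O(lg n lg lg n).
@lru_cache(maxsize=None)
def T(n):
    if n <= 2:
        return 1          # arbitrary boundary condition
    return 2 * T(math.isqrt(n)) + math.log2(n)

for k in [4, 16, 64, 256]:   # n = 2^k
    n = 2 ** k
    print(k, T(n) / (math.log2(n) * math.log2(math.log2(n))))
# the printed ratios stay bounded (they drift slowly toward 1)
```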


Exercises

4.1-1
Show that the solution of T(n) = T(⌈n/2⌉) + 1 is O(lg n).

4.1-2

We saw that the solution of T(n) = 2T(⌊n/2⌋) + n is O(n lg n). Show that the solution of this recurrence is also Ω(n lg n). Conclude that the solution is Θ(n lg n).

4.1-3

Show that by making a different inductive hypothesis, we can overcome the difficulty with the boundary condition T(1) = 1 for the recurrence (4.4) without adjusting the boundary conditions for the inductive proof.

4.1-4

Show that Θ(n lg n) is the solution to the "exact" recurrence (4.2) for merge sort.

4.1-5

Show that the solution to T(n) = 2T(⌊n/2⌋ + 17) + n is O(n lg n).

4.1-6

Solve the recurrence T(n) = 2T(√n) + 1 by making a change of variables. Your solution should be asymptotically tight. Do not worry about whether values are integral.

4.2 The recursion-tree method

Although the substitution method can provide a succinct proof that a solution to a recurrence is correct, it is sometimes difficult to come up with a good guess. Drawing out a recursion tree, as we did in our analysis of the merge sort recurrence in Section 2.3.2, is a straightforward way to devise a good guess. In a recursion tree, each node represents the cost of a single subproblem somewhere in the set of recursive function invocations. We sum the costs within each level of the tree to obtain a set of per-level costs, and then we sum all the per-level costs to determine the total cost of all levels of the recursion. Recursion trees are particularly useful when the recurrence describes the running time of a divide-and-conquer algorithm.

A recursion tree is best used to generate a good guess, which is then verified by the substitution method. When using a recursion tree to generate a good guess, you can often tolerate a small amount of "sloppiness," since you will be verifying your guess later on. If you are very careful when drawing out a recursion tree and summing the costs, however, you can use a recursion tree as a direct proof of a solution to a recurrence.
