Change of Basis - Linear Maps - Introduction to the Theory of Computing I.

2.6 Linear Maps

2.6.5 Change of Basis

We have defined the matrix of a linear map with respect to some bases, but so far we have said nothing about the significance of the bases. One may think that the choice of the standard bases simplifies the calculations, but this is only an illusion caused by the simplicity of the notation. Roughly speaking, choosing a basis means choosing a point of view, and one can have different reasons for changing the perspective. It can happen that the simplicity of the formulae is important, but maybe the viewpoint is fixed for some reason and we simply want to adjust the computations to it. In this section we show how the coordinate vectors and the matrices of a linear map w.r.t. different bases are connected to each other.

First we take a closer look at the situation when a vector x of Rⁿ is given by its coordinate vector with respect to the basis B₁ = {v₁, . . . , v_n} and we have to change the basis, i.e. the coordinate vector with respect to another basis B₂ = {v′₁, . . . , v′_n} of Rⁿ is needed. In this case we can simply apply Theorem 2.6.7 to the identity map id_Rⁿ of Rⁿ and the bases B₂ and B₁. This way we obtain

[x]_B₁ = [id_Rⁿ]_{B₂,B₁} [x]_B₂.

The jth column of the matrix [id_Rⁿ]_{B₂,B₁} comes from the equation v′_j = a_{1,j}v₁ + · · · + a_{n,j}v_n: it is the coordinate vector (a_{1,j}, . . . , a_{n,j})^T of v′_j w.r.t. B₁.

Now let g : Rⁿ → Rⁿ be the linear map that maps the basis elements of B₁ to the elements of B₂, that is, g(v_j) = v′_j holds for every 1 ≤ j ≤ n. Observe that the jth column of the matrix [id_Rⁿ]_{B₂,B₁} is by definition the same as the jth column of [g]_{B₁,B₁}, so these matrices are the same. At this point we introduce a notation:

Notation. If f : Rⁿ → Rⁿ is a linear map and B is a basis of Rⁿ, then for simplicity we write [f]_B instead of [f]_{B,B}, and we say that [f]_B is the matrix of f w.r.t. the basis B.

Using this notation we have [g]_{B₁,B₁} = [g]_B₁ and hence [x]_B₁ = [g]_B₁ [x]_B₂.

All the elements of the basis B₂ are in Im g, and so are their linear combinations, so in fact Im g = Rⁿ holds. This means that g is surjective and hence by Corollary 2.6.5 it is a bijection, so its inverse g⁻¹ is defined and linear by Theorem 2.6.13; moreover, [g⁻¹]_B₁ = [g]⁻¹_B₁ holds as well. Multiplying the identity above by this matrix from the left we obtain:

Theorem 2.6.14. Assume that x ∈ Rⁿ, and B₁ = {v₁, . . . , v_n}, B₂ = {v′₁, . . . , v′_n} are bases of Rⁿ. If g : Rⁿ → Rⁿ is the linear map for which g(v_j) = v′_j holds for every 1 ≤ j ≤ n, then

[x]_B₂ = [g]⁻¹_B₁ [x]_B₁.
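As a small numerical illustration of Theorem 2.6.14, the following Python sketch computes [x]_B₂ from [x]_B₁ in R². The bases B1 and B2, the vector x and all helper names (mat_vec, inverse_2x2, columns_to_matrix, coords) are assumptions made for this example only and do not come from the text. Exact rational arithmetic is used so that the final comparison is not affected by rounding.

```python
from fractions import Fraction

def mat_vec(A, x):
    """Multiply the matrix A (a list of rows) by the vector x."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def inverse_2x2(A):
    """Inverse of a 2x2 matrix, computed with exact fractions."""
    (a, b), (c, d) = A
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]

def columns_to_matrix(cols):
    """Build the matrix whose columns are the given vectors."""
    return [list(row) for row in zip(*cols)]

def coords(basis, x):
    """Coordinate vector of x w.r.t. `basis` (vectors given in standard coordinates)."""
    return mat_vec(inverse_2x2(columns_to_matrix(basis)), x)

# Hypothetical bases of R^2, chosen only for illustration.
B1 = [[1, 1], [1, -1]]   # v_1, v_2
B2 = [[2, 0], [1, 3]]    # v'_1, v'_2

# Column j of [g]_{B1} is the coordinate vector of g(v_j) = v'_j w.r.t. B1.
g_B1 = columns_to_matrix([coords(B1, v) for v in B2])

x = [5, 1]               # a vector in standard coordinates
x_B1 = coords(B1, x)     # its coordinate vector w.r.t. B1

# Theorem 2.6.14: [x]_{B2} = [g]_{B1}^{-1} [x]_{B1}.
x_B2 = mat_vec(inverse_2x2(g_B1), x_B1)

print(x_B1, x_B2)
```

In this sketch x_B2 can be compared with coords(B2, x), the coordinate vector obtained by expressing x in the basis B2 directly; the theorem says the two agree.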

Now we turn to the change of the bases in the case of linear maps:

Theorem 2.6.15. Assume that f : Rⁿ → R^k is a linear map, B₁ = {v₁, . . . , v_n} and C₁ = {v′₁, . . . , v′_n} are bases of Rⁿ, while B₂ = {w₁, . . . , w_k} and C₂ = {w′₁, . . . , w′_k} are bases of R^k. Let g : Rⁿ → Rⁿ be the uniquely determined linear map for which g(v_j) = v′_j holds for every 1 ≤ j ≤ n (that is, g maps the vectors of the basis B₁ to the elements of the basis C₁).

Also, let h : R^k → R^k be the uniquely determined linear map for which h(w_i) = w′_i holds for every 1 ≤ i ≤ k (that is, h maps the vectors of the basis B₂ to the elements of the basis C₂).

Then

[f]_{C₁,C₂} = [h]⁻¹_B₂ [f]_{B₁,B₂} [g]_B₁.

The formula above simplifies a lot when f maps Rⁿ into itself and we use only one "old" and one "new" basis for the description of the map by a matrix:

Corollary 2.6.16. Assume that f : Rⁿ → Rⁿ is a linear map and let B = {v₁, . . . , v_n} and C = {v′₁, . . . , v′_n} be bases of Rⁿ. Let g : Rⁿ → Rⁿ be the uniquely determined linear map for which g(v_j) = v′_j holds for every 1 ≤ j ≤ n. Then

[f]_C = [g]⁻¹_B [f]_B[g]_B.

Proof of Theorem 2.6.15. Assume that f(v′_j) = a′_{1,j}w′₁ + · · · + a′_{k,j}w′_k. This means that the jth column of [f]_{C₁,C₂} is the vector (a′_{1,j}, . . . , a′_{k,j})^T. As g(v_j) = v′_j for every 1 ≤ j ≤ n and h(w_i) = w′_i for every 1 ≤ i ≤ k, we have

(f ◦ g)(v_j) = f(g(v_j)) = f(v′_j) = a′_{1,j}w′₁ + · · · + a′_{k,j}w′_k

= a′_{1,j}h(w₁) + · · · + a′_{k,j}h(w_k)

= h(a′_{1,j}w₁ + · · · + a′_{k,j}w_k).

As the elements w′₁, . . . , w′_k of the basis C₂ of R^k are in Im h, we have Im h = R^k, that is, h is surjective and hence by Corollary 2.6.5 it is a bijection, so its inverse h⁻¹ is defined and linear. Thus,

(h⁻¹ ◦ f ◦ g)(v_j) = a′_{1,j}w₁ + · · · + a′_{k,j}w_k,

which means that the jth column of the matrix [h⁻¹ ◦ f ◦ g]_{B₁,B₂} is the same as the jth column of [f]_{C₁,C₂}. This holds for every 1 ≤ j ≤ n, hence

[f]_{C₁,C₂} = [h⁻¹ ◦ f ◦ g]_{B₁,B₂}.

Now we apply Theorem 2.6.11 first to h⁻¹, f ◦ g and the bases B₁, B₂ and B₃ = B₂ to obtain [h⁻¹ ◦ f ◦ g]_{B₁,B₂} = [h⁻¹]_B₂ [f ◦ g]_{B₁,B₂}. Next we apply the same theorem to the maps f and g and the (ordered) triple of bases B₁, B₁, B₂, which yields [f ◦ g]_{B₁,B₂} = [f]_{B₁,B₂} [g]_B₁. Finally, applying Theorem 2.6.13 to [h⁻¹]_B₂, that is, replacing it by [h]⁻¹_B₂, gives the statement.
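The identity of Theorem 2.6.15 can also be checked numerically. The sketch below does this for one map f : R² → R³. The matrix F, the four bases and every helper name are hypothetical choices made for illustration, not part of the text. Each matrix is built straight from the definition (column j holds the coordinates of the image of the jth basis vector), and the two sides of the formula are then compared with exact fractions.

```python
from fractions import Fraction

def mat_mul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def inverse(A):
    """Gauss-Jordan inverse of a square matrix, using exact fractions."""
    n = len(A)
    # Augment A with the identity matrix.
    M = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for c in range(n):
        pivot = next((r for r in range(c, n) if M[r][c] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular")
        M[c], M[pivot] = M[pivot], M[c]
        p = M[c][c]
        M[c] = [v / p for v in M[c]]
        for r in range(n):
            if r != c:
                factor = M[r][c]
                M[r] = [a - factor * b for a, b in zip(M[r], M[c])]
    return [row[n:] for row in M]

def from_columns(vectors):
    """Matrix whose columns are the given vectors."""
    return [list(row) for row in zip(*vectors)]

def coord_matrix(basis, vectors):
    """Matrix whose jth column is the coordinate vector of vectors[j] w.r.t. basis."""
    return mat_mul(inverse(from_columns(basis)), from_columns(vectors))

def apply(F, v):
    """Image of v under the map x -> F x."""
    return [sum(a * b for a, b in zip(row, v)) for row in F]

# Hypothetical data: f in standard coordinates and two pairs of bases.
F = [[1, 2], [0, 1], [3, -1]]
B1 = [[1, 0], [1, 1]]                      # v_1, v_2
C1 = [[2, 1], [1, 1]]                      # v'_1, v'_2
B2 = [[1, 0, 0], [1, 1, 0], [1, 1, 1]]     # w_1, w_2, w_3
C2 = [[1, 0, 1], [0, 1, 1], [1, 1, 0]]     # w'_1, w'_2, w'_3

f_B = coord_matrix(B2, [apply(F, v) for v in B1])   # [f]_{B1,B2}
f_C = coord_matrix(C2, [apply(F, v) for v in C1])   # [f]_{C1,C2}
g_B1 = coord_matrix(B1, C1)                         # [g]_{B1}, since g(v_j) = v'_j
h_B2 = coord_matrix(B2, C2)                         # [h]_{B2}, since h(w_i) = w'_i

# Right-hand side of Theorem 2.6.15.
rhs = mat_mul(inverse(h_B2), mat_mul(f_B, g_B1))
print(f_C == rhs)
```

Such a check on one example is of course no substitute for the proof; it only confirms that the order of the factors and the placement of the inverse have been read correctly.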

By Exercise 2.5.2 we have rank(AB) ≤ rank(A) for any matrices A, B for which the product AB is defined. Note that then

rank(AB) = rank((AB)^T) = rank(B^T A^T) ≤ rank(B^T) = rank(B)

follows as well. Assume that f : Rⁿ → R^k is a linear map, B₁ is a basis of Rⁿ and B₂ is a basis of R^k. If g : Rⁿ → Rⁿ is the linear map which maps the elements of the standard basis of Rⁿ to the elements of B₁ and h : R^k → R^k is the linear map that maps the elements of the standard basis of R^k to the elements of the basis B₂, then

rank([f]_{B₁,B₂}) = rank([h]⁻¹[f][g]) ≤ rank([h]⁻¹[f]) ≤ rank([f])

by Theorem 2.6.15 and by the remarks above. Similarly, as the equation [h][f]_{B₁,B₂}[g]⁻¹ = [f] holds as well, rank([f]) ≤ rank([f]_{B₁,B₂}) also follows, hence these ranks are the same. Thus, by Corollary 2.6.8 we infer that

rank(f) = rank([f]) = rank([f]_{B₁,B₂})

for any bases B₁ and B₂, as we have already mentioned in Section 2.6.3.
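The invariance of the rank can be observed on a concrete example. In the following sketch the matrices F, G, H (standing for [f], [g], [h]) are hypothetical and chosen only for illustration, and the rank is computed by a plain Gaussian elimination over the rationals, which is an assumption about method rather than anything prescribed by the text.

```python
from fractions import Fraction

def mat_mul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def rank(A):
    """Rank of A: the number of pivots found by Gaussian elimination."""
    M = [[Fraction(v) for v in row] for row in A]
    cols = len(M[0]) if M else 0
    r = 0  # index of the next pivot row
    for c in range(cols):
        pivot = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, len(M)):
            factor = M[i][c] / M[r][c]
            M[i] = [a - factor * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

# Hypothetical data for a map f: R^3 -> R^3.
F = [[1, 2, 3], [2, 4, 6], [1, 0, 1]]         # [f]; the second row is twice the first
G = [[2, 1, 0], [1, 1, 0], [0, 0, 1]]         # [g]; its columns form the basis B1
H = [[1, 1, 0], [0, 1, 1], [0, 0, 1]]         # [h]; its columns form the basis B2
H_inv = [[1, -1, 1], [0, 1, -1], [0, 0, 1]]   # claimed inverse of [h], checked below

identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
assert mat_mul(H, H_inv) == identity

# [f]_{B1,B2} = [h]^{-1} [f] [g] by Theorem 2.6.15.
f_B1_B2 = mat_mul(H_inv, mat_mul(F, G))

print(rank(F), rank(f_B1_B2))
```

Both ranks printed by the sketch are equal, in line with rank([f]) = rank([f]_{B₁,B₂}).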

Example 2.6.6. Finally, we show an application of these tools. Assume that f : R² → R² is the reflection in the line that goes through the origin and is parallel to the vector u = (1, 2)^T. We are looking for its matrix w.r.t. the standard basis.

The calculation of the images of the standard basis elements requires some work, so we choose another way instead and use some images that can be determined easily. For example, f fixes the vector u, that is, f(u) = u. Also, rotating u about the origin by 90° we get v = (−2, 1)^T, and obviously f(v) = −v holds, since v is perpendicular to the line of reflection. Therefore, it is easy to give the matrix of f w.r.t. the basis B = {u, v}. Indeed, we have

[f]_B = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.

If g : R² → R² is the linear map that maps the standard basis to B, then its matrix w.r.t. the standard basis is

[g] = \begin{pmatrix} 1 & -2 \\ 2 & 1 \end{pmatrix}, and [g]^{-1} = \frac{1}{5} \begin{pmatrix} 1 & 2 \\ -2 & 1 \end{pmatrix}

by the formula for the inverse in (10). By Corollary 2.6.16 we get that

[f]_B = [g]⁻¹[f][g] ⇐⇒ [g][f]_B[g]⁻¹ = [f],

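so the matrix of f w.r.t. the standard basis is

[f] = \frac{1}{5} \begin{pmatrix} 1 & -2 \\ 2 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \begin{pmatrix} 1 & 2 \\ -2 & 1 \end{pmatrix} = \frac{1}{5} \begin{pmatrix} 1 & -2 \\ 2 & 1 \end{pmatrix} \begin{pmatrix} 1 & 2 \\ 2 & -1 \end{pmatrix} = \begin{pmatrix} -3/5 & 4/5 \\ 4/5 & 3/5 \end{pmatrix}.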
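The computation of Example 2.6.6 can be reproduced with a few lines of Python. This is only a sketch: the helper names mat_mul and inverse_2x2 are assumptions introduced here, and exact fractions are used so that the entries come out as −3/5, 4/5, 4/5, 3/5 rather than as rounded decimals.

```python
from fractions import Fraction

def mat_mul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def inverse_2x2(A):
    """Inverse of a 2x2 matrix, computed with exact fractions."""
    (a, b), (c, d) = A
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]

g = [[1, -2], [2, 1]]      # [g]: its columns are u = (1,2)^T and v = (-2,1)^T
f_B = [[1, 0], [0, -1]]    # [f]_B: f(u) = u and f(v) = -v

# [f] = [g] [f]_B [g]^{-1}, as obtained from Corollary 2.6.16.
f_std = mat_mul(mat_mul(g, f_B), inverse_2x2(g))

# Sanity checks: f fixes u, sends v to -v, and applying f twice gives the identity.
u = [[1], [2]]
v = [[-2], [1]]
assert mat_mul(f_std, u) == u
assert mat_mul(f_std, v) == [[2], [-1]]
assert mat_mul(f_std, f_std) == [[1, 0], [0, 1]]

print(f_std)
```

The three assertions express the geometric properties of a reflection used in the example, so they give an independent check of the matrix obtained above.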
