

2.6.3 The Matrix of a Linear Map

Theorem 2.6.6. Assume that $b_1, \dots, b_n \in \mathbb{R}^n$ is a basis in $\mathbb{R}^n$ and $c_1, \dots, c_n \in \mathbb{R}^k$ are arbitrary vectors. Then there is exactly one linear map $f:\mathbb{R}^n\to\mathbb{R}^k$ for which $f(b_i) = c_i$ for every $1\le i\le n$. That is, the images of the basis elements determine a linear map uniquely.

Proof. Assume first that $f:\mathbb{R}^n\to\mathbb{R}^k$ is a linear map for which $f(b_i) = c_i$ holds for every $1\le i\le n$. If $x\in\mathbb{R}^n$, then by Theorem 2.2.14 it can be written uniquely as a linear combination of the basis elements $b_1,\dots,b_n$, i.e.

\[
x = \lambda_1 b_1 + \dots + \lambda_n b_n,
\]

where the scalars $\lambda_1,\dots,\lambda_n\in\mathbb{R}$ are determined uniquely by $x$ and the basis $B = \{b_1,\dots,b_n\}$.

Note that the scalars $\lambda_i$ are the coordinates of $x$ relative to $B$. Hence by Proposition 2.6.1 we have

\begin{align}
f(x) &= f(\lambda_1 b_1 + \dots + \lambda_n b_n) \nonumber\\
&= \lambda_1 f(b_1) + \dots + \lambda_n f(b_n) \tag{13}\\
&= \lambda_1 c_1 + \dots + \lambda_n c_n. \nonumber
\end{align}

This means that the values of $f$ are determined uniquely, so there is at most one linear map $f:\mathbb{R}^n\to\mathbb{R}^k$ that satisfies $f(b_i) = c_i$ for every $i$.

We show that the map defined by (13) is linear and satisfies the conditions in the statement. This will complete the proof of the theorem. So for every $x\in\mathbb{R}^n$ we define

\[
f(x) = \lambda_1 c_1 + \dots + \lambda_n c_n,
\]

where $\lambda_1,\dots,\lambda_n\in\mathbb{R}$ are the (uniquely determined) coordinates of $x$ relative to $B$. First of all, the coordinates of a basis element $b_i$ relative to $B$ are zero except for $\lambda_i = 1$, hence $f(b_i) = c_i$ holds for every $1\le i\le n$.

It remains to show that $f$ is linear. Assume that $x, y\in\mathbb{R}^n$ and their coordinate vectors relative to $B$ are $[x]_B = (\lambda_1,\dots,\lambda_n)^T$ and $[y]_B = (\mu_1,\dots,\mu_n)^T$. Then

\begin{align*}
x + y &= (\lambda_1 b_1 + \dots + \lambda_n b_n) + (\mu_1 b_1 + \dots + \mu_n b_n) \\
&= (\lambda_1 + \mu_1) b_1 + \dots + (\lambda_n + \mu_n) b_n,
\end{align*}
hence we have that $[x+y]_B = (\lambda_1+\mu_1,\dots,\lambda_n+\mu_n)^T$. Therefore,

\begin{align*}
f(x+y) &= (\lambda_1+\mu_1) c_1 + \dots + (\lambda_n+\mu_n) c_n \\
&= (\lambda_1 c_1 + \dots + \lambda_n c_n) + (\mu_1 c_1 + \dots + \mu_n c_n) \\
&= f(x) + f(y),
\end{align*}

which means that $f$ is additive. Similarly, if $\alpha\in\mathbb{R}$, then the coordinates of $\alpha x$ relative to $B$ are $\alpha\lambda_1,\dots,\alpha\lambda_n$, hence

\begin{align*}
f(\alpha x) &= \alpha\lambda_1 c_1 + \dots + \alpha\lambda_n c_n \\
&= \alpha(\lambda_1 c_1 + \dots + \lambda_n c_n) \\
&= \alpha f(x),
\end{align*}

so $f$ is homogeneous, and together with additivity this yields that $f$ is linear, and we are done.
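The two halves of this proof translate directly into a computation: solve for the coordinates $\lambda_i$ and form the combination (13). Below is a minimal numerical sketch, assuming NumPy is available; the basis vectors and target vectors are arbitrary illustrative choices, not taken from the text.

```python
import numpy as np

# A basis of R^2 (columns of B) and arbitrary target vectors in R^3 (columns of C);
# these particular vectors are illustrative choices only.
B = np.array([[1.0, 1.0],
              [0.0, 1.0]])       # b1 = (1,0)^T, b2 = (1,1)^T
C = np.array([[1.0, 0.0],
              [2.0, 1.0],
              [0.0, 3.0]])       # c1, c2 in R^3

def f(x):
    """The unique linear map with f(b_i) = c_i, evaluated as in (13)."""
    lam = np.linalg.solve(B, x)  # coordinates of x relative to {b1, b2}
    return C @ lam               # lambda_1 * c1 + lambda_2 * c2

# f maps each basis vector to its prescribed image...
assert np.allclose(f(B[:, 0]), C[:, 0])
assert np.allclose(f(B[:, 1]), C[:, 1])
# ...and it is additive and homogeneous, e.g. f(2*b1 + b2) = 2*c1 + c2:
assert np.allclose(f(2 * B[:, 0] + B[:, 1]), 2 * C[:, 0] + C[:, 1])
```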

Now we are going to assign a matrix to a linear map $f:\mathbb{R}^n\to\mathbb{R}^k$ once a basis is chosen in both $\mathbb{R}^n$ and $\mathbb{R}^k$.

Definition 2.6.3. Assume that $f:\mathbb{R}^n\to\mathbb{R}^k$ is a linear map, $B_1 = \{v_1,\dots,v_n\}$ is a basis of $\mathbb{R}^n$ and $B_2 = \{w_1,\dots,w_k\}$ is a basis of $\mathbb{R}^k$. If $f(v_i) = a_{1,i} w_1 + \dots + a_{k,i} w_k$, that is, the uniquely determined coordinates of $f(v_i)$ relative to $B_2$ are $a_{1,i},\dots,a_{k,i}$, then the matrix of the linear map $f$ with respect to the bases $B_1$ and $B_2$ is

\[
[f]_{B_1,B_2} =
\begin{pmatrix}
a_{1,1} & a_{1,2} & \dots & a_{1,n} \\
a_{2,1} & a_{2,2} & \dots & a_{2,n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{k,1} & a_{k,2} & \dots & a_{k,n}
\end{pmatrix}.
\]

In the special case when $B_1$ is the standard basis of $\mathbb{R}^n$ and $B_2$ is the standard basis of $\mathbb{R}^k$, we omit the indices $B_1$ and $B_2$ in the notation and write simply $[f]$ for the matrix of $f$ with respect to the standard bases.
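The definition can be turned into a computation: the $i$th column of $[f]_{B_1,B_2}$ consists of the $B_2$-coordinates of $f(v_i)$, i.e. the solution of a linear system. Here is a sketch, again assuming NumPy; the map $f$ and the bases are illustrative choices.

```python
import numpy as np

# Illustrative linear map f : R^2 -> R^2 and bases B1, B2 (basis vectors as columns).
def f(x):
    return np.array([3 * x[0] + x[1], x[0] - x[1]])

B1 = np.array([[1.0, 1.0],
               [0.0, 1.0]])     # v1 = (1,0)^T, v2 = (1,1)^T
B2 = np.array([[2.0, 0.0],
               [0.0, 1.0]])     # w1 = (2,0)^T, w2 = (0,1)^T

# Column i of [f]_{B1,B2} holds the B2-coordinates of f(v_i),
# i.e. the solution a of the linear system  B2 @ a = f(v_i).
M = np.column_stack([np.linalg.solve(B2, f(B1[:, i]))
                     for i in range(B1.shape[1])])
print(M)                        # [[1.5  2. ]
                                #  [1.   0. ]]
```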

The matrix above depends not only on $f$ but also on the chosen bases. The theorem above, together with the uniqueness of the coordinates of a vector relative to a basis, assures that once the bases $B_1$ and $B_2$ are fixed, the matrix $[f]_{B_1,B_2}$ is determined uniquely by $f$. In other words, keeping the bases $B_1, B_2$ fixed, there is a one-to-one correspondence between the linear maps from $\mathbb{R}^n$ to $\mathbb{R}^k$ and the matrices in $\mathbb{R}^{k\times n}$ (but again, we get different matrices for the same linear map if we choose different bases).

This means that we can specify a linear map by giving its matrix, and an advantage of this is that the matrix can be used to calculate the values of the map for an arbitrary vector:

Theorem 2.6.7. Assume that $f:\mathbb{R}^n\to\mathbb{R}^k$ is a linear map, $B_1 = \{v_1,\dots,v_n\}$ is a basis of $\mathbb{R}^n$ and $B_2 = \{w_1,\dots,w_k\}$ is a basis of $\mathbb{R}^k$. If $x\in\mathbb{R}^n$, then

\[
[f(x)]_{B_2} = [f]_{B_1,B_2}\cdot[x]_{B_1}. \tag{14}
\]

That is, if we multiply the matrix of $f$ with respect to $B_1$ and $B_2$ by the coordinate vector of $x$ relative to $B_1$ from the right, then we obtain the coordinate vector of the vector $f(x)$ relative to the basis $B_2$. In the special case when $B_1$ and $B_2$ are the standard bases in $\mathbb{R}^n$ and $\mathbb{R}^k$, respectively, we obtain

\[
f(x) = [f]\cdot x.
\]

Proof. If
\[
[x]_{B_1} =
\begin{pmatrix}
\lambda_1 \\ \lambda_2 \\ \vdots \\ \lambda_n
\end{pmatrix}
\quad\text{and}\quad
[f]_{B_1,B_2} =
\begin{pmatrix}
a_{1,1} & a_{1,2} & \dots & a_{1,n} \\
a_{2,1} & a_{2,2} & \dots & a_{2,n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{k,1} & a_{k,2} & \dots & a_{k,n}
\end{pmatrix},
\]
then

\begin{align*}
f(x) &= f(\lambda_1 v_1 + \dots + \lambda_n v_n) \\
&= \lambda_1 f(v_1) + \dots + \lambda_n f(v_n) \\
&= \lambda_1(a_{1,1} w_1 + \dots + a_{k,1} w_k) + \dots + \lambda_n(a_{1,n} w_1 + \dots + a_{k,n} w_k) \\
&= (a_{1,1}\lambda_1 + a_{1,2}\lambda_2 + \dots + a_{1,n}\lambda_n) w_1 + \dots + (a_{k,1}\lambda_1 + \dots + a_{k,n}\lambda_n) w_k,
\end{align*}
and this gives (14), since the coefficient of $w_j$ here is exactly the $j$th entry of the product $[f]_{B_1,B_2}\cdot[x]_{B_1}$.

If $B_1$ is the standard basis in $\mathbb{R}^n$ and $B_2$ is the standard basis in $\mathbb{R}^k$, then $[x]_{B_1} = x$ and $[f(x)]_{B_2} = f(x)$, hence the second statement follows.
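Formula (14) can also be checked numerically. The sketch below reuses the illustrative map and bases from the previous snippet (all of them assumptions, not data from the text) and verifies that the two sides of (14) agree.

```python
import numpy as np

# Same illustrative setup as before: a map f and bases B1, B2 (columns).
def f(x):
    return np.array([3 * x[0] + x[1], x[0] - x[1]])

B1 = np.array([[1.0, 1.0], [0.0, 1.0]])
B2 = np.array([[2.0, 0.0], [0.0, 1.0]])
M = np.column_stack([np.linalg.solve(B2, f(B1[:, i])) for i in range(2)])

x = np.array([4.0, -1.0])            # an arbitrary test vector
lhs = np.linalg.solve(B2, f(x))      # [f(x)]_{B2}
rhs = M @ np.linalg.solve(B1, x)     # [f]_{B1,B2} . [x]_{B1}
assert np.allclose(lhs, rhs)         # this is exactly (14)
```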

Corollary 2.6.8. Assume that $f:\mathbb{R}^n\to\mathbb{R}^k$ is a linear map and $[f]\in\mathbb{R}^{k\times n}$ is its matrix with respect to the standard bases of $\mathbb{R}^n$ and $\mathbb{R}^k$. If $c_1,\dots,c_n$ are the columns of $[f]$, then $\operatorname{Im} f = \operatorname{span}\{c_1,\dots,c_n\}$. Moreover, $\operatorname{rank}(f) = \operatorname{rank}([f])$ holds.

Proof. Let us introduce the notation $W = \operatorname{span}\{c_1,\dots,c_n\}$. If $x\in\mathbb{R}^n$, then $f(x) = [f]\cdot x$ by the previous theorem, and the product on the right-hand side is a linear combination of the columns of $[f]$, so $\operatorname{Im} f\subset W$. On the other hand, if $y = x_1 c_1 + \dots + x_n c_n\in W$ is a linear combination of the columns, then $y = [f]\cdot x = f(x)$ for $x = (x_1,\dots,x_n)^T$, hence $W\subset\operatorname{Im} f$ and the first statement of the corollary holds. Moreover,

\[
\operatorname{rank}(f) = \dim\operatorname{Im} f = \dim W = \operatorname{rank}([f])
\]
holds by Theorem 2.5.17.
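A small sketch of the corollary, assuming NumPy; the matrix is an illustrative choice. The rank of the map can be read off its standard matrix, and each value $f(x)$ is visibly a combination of the columns.

```python
import numpy as np

# An illustrative standard matrix [f] of a linear map f : R^3 -> R^2.
F = np.array([[1.0, 2.0, 3.0],
              [0.0, 1.0, 1.0]])

# rank(f) = rank([f]) can be read off the matrix directly:
print(np.linalg.matrix_rank(F))               # 2

# Every value f(x) = [f] . x is a linear combination of the columns of [f]:
x = np.array([2.0, -1.0, 4.0])
assert np.allclose(F @ x, 2 * F[:, 0] - F[:, 1] + 4 * F[:, 2])
```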

As promised, we now give a formula for the rotation of the plane about the origin by the angle $\alpha$. More precisely, we give the matrix of this map:

Proposition 2.6.9. Let $f_\alpha:\mathbb{R}^2\to\mathbb{R}^2$ be the rotation about the origin by the angle $\alpha$. Then $f_\alpha$ is linear and its matrix with respect to the standard bases is

\[
[f_\alpha] =
\begin{pmatrix}
\cos\alpha & -\sin\alpha \\
\sin\alpha & \cos\alpha
\end{pmatrix}.
\]

Proof. We have already seen in Section 2.6.1 that $f_\alpha$ is linear. The first column of $[f_\alpha]$ is
\[
f_\alpha((1,0)^T) = (\cos\alpha, \sin\alpha)^T
\]
by the definition of $\cos\alpha$ and $\sin\alpha$. As $(0,1)^T$ is obtained by rotating $(1,0)^T$ about the origin by the angle $90^\circ$, its image under $f_\alpha$ is obtained from $(\cos\alpha,\sin\alpha)^T$ in the same way, hence
\[
f_\alpha((0,1)^T) = (-\sin\alpha, \cos\alpha)^T,
\]
which is the second column of $[f_\alpha]$, so we are done.
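For a quick numerical sanity check (assuming NumPy; the angle is an arbitrary choice), this matrix indeed rotates $(1,0)^T$ to $(0,1)^T$ when $\alpha = 90^\circ$:

```python
import numpy as np

def rotation_matrix(alpha):
    """[f_alpha] with respect to the standard bases."""
    return np.array([[np.cos(alpha), -np.sin(alpha)],
                     [np.sin(alpha),  np.cos(alpha)]])

# Rotating (1,0)^T by 90 degrees gives (0,1)^T, up to floating-point rounding:
print(rotation_matrix(np.pi / 2) @ np.array([1.0, 0.0]))   # approx. [0. 1.]
```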

Consider the following problem: given a linear map $f$, we are looking for a basis of $\operatorname{Im} f$. We get the matrix $[f]\in\mathbb{R}^{k\times n}$ of $f$ by writing the coordinates of $f(e_i)$ in the $i$th column of a matrix ($1\le i\le n$), where $e_1,\dots,e_n\in\mathbb{R}^n$ is the standard basis. By the previous corollary we need to find a basis of the subspace spanned by the columns of $[f]$. As we have already seen in the proof of Theorem 2.5.17, this means that we have to find a maximal set of independent column vectors of $[f]$, and the details of the algorithm for this task were given at the end of Section 2.5.4.

Now we handle the same problem for the subspace $\ker f$. By Theorem 2.6.4 we have
\[
\dim\ker f = n - \dim\operatorname{Im} f = n - \operatorname{rank}([f]).
\]

The subspace $\ker f$ consists of those vectors $x\in\mathbb{R}^n$ for which the equation $[f]\cdot x = 0$ holds.

Hence we need to find $n - \operatorname{rank}([f])$ independent vectors among the solutions of the equation above, which is equivalent to a system of linear equations. When we apply Gaussian elimination to this system, we get $n - \operatorname{rank}(f)$ free parameters which can be chosen freely, and after that the values of the other variables are determined uniquely. Every solution gives the coordinates of a vector $x$ which solves the matrix equation $[f]\cdot x = 0$. It is easy to see that if we take those vectors that come from the solutions where exactly one of the free parameters is $1$ while the other free parameters are zero, then we get $n - \operatorname{rank}([f])$ independent vectors, so they form a basis in $\ker f$. Indeed, assume that $m = n - \operatorname{rank}(f)$ and $x_{j_1},\dots,x_{j_m}$ are the free parameters, where $1\le j_1 < j_2 < \dots < j_m\le n$ are their indices. For $1\le i\le m$, let $y_i$ be the solution of $[f]\cdot x = 0$ whose $j_i$th coordinate is $1$ but whose $j_l$th coordinate is $0$ for every $1\le l\le m$, $l\ne i$. Then the matrix $Y\in\mathbb{R}^{n\times m}$ whose $i$th column is $y_i$ contains the identity matrix $I_m$ as a sub-matrix, hence $m = r_d(Y) = \operatorname{rank}(Y) = r_c(Y)$, so the columns of $Y$ are independent.
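This free-parameter construction can be sketched in code. The snippet below assumes SymPy (whose `rref` returns the reduced row echelon form together with the pivot columns) and an illustrative matrix; it builds one kernel basis vector per free variable exactly as described above.

```python
import sympy as sp

# An illustrative matrix of a linear map with respect to the standard bases.
F = sp.Matrix([[1, 2, 0, 3],
               [0, 0, 1, 4]])

R, pivots = F.rref()                    # reduced row echelon form, pivot columns
free = [j for j in range(F.cols) if j not in pivots]

# One kernel basis vector per free variable: set that free variable to 1, the
# other free variables to 0, and read the pivot variables off the rows of R.
basis = []
for j in free:
    v = sp.zeros(F.cols, 1)
    v[j] = 1
    for row, p in enumerate(pivots):
        v[p] = -R[row, j]
    basis.append(v)

assert all(F * v == sp.zeros(F.rows, 1) for v in basis)
print(basis)                            # the same vectors as F.nullspace()
```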

Note that the special choice of the bases (the standard ones) has no real significance here; we made this choice only for simplicity. The argument above can be repeated (with appropriate minor changes) for the matrix $[f]_{B_1,B_2}$, where $B_1$ and $B_2$ are arbitrary bases of $\mathbb{R}^n$ and $\mathbb{R}^k$, respectively. The details are left to the reader.

Example 2.6.4. Let $f:\mathbb{R}^5\to\mathbb{R}^4$ be the linear map given by the matrix

\[
[f] = A =
\begin{pmatrix}
2 & 8 & 6 & 4 & 2 \\
1 & 2 & -1 & 12 & 7 \\
-1 & -1 & 3 & -12 & 0 \\
5 & 22 & 19 & 4 & 7
\end{pmatrix}.
\]

After applying Gaussian elimination to the system given by the matrix $(A\,|\,0)$ we obtain the following reduced row echelon form:

\[
\left(\begin{array}{ccccc|c}
1 & 0 & -5 & 0 & -31 & 0 \\
0 & 1 & 2 & 0 & 7 & 0 \\
0 & 0 & 0 & 1 & 2 & 0
\end{array}\right).
\]

The columns that contain a leading coefficient are independent, and the corresponding columns of $A$ (that is, the first, second and fourth columns) give a basis of $\operatorname{Im} f$.

The free parameters are the third and the fifth variables (say $x_3$ and $x_5$). The solution that comes from $x_3 = 1$ and $x_5 = 0$ is $(5,-2,1,0,0)^T$, while the solution that comes from $x_3 = 0$ and $x_5 = 1$ is $(31,-7,0,-2,1)^T$. These two vectors form a basis in $\ker f$.
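The computation in this example can be double-checked in a few lines, assuming SymPy is available (this tooling is not part of the text); SymPy's `nullspace` follows the same free-parameter convention as the construction above.

```python
import sympy as sp

A = sp.Matrix([[ 2,  8,  6,   4, 2],
               [ 1,  2, -1,  12, 7],
               [-1, -1,  3, -12, 0],
               [ 5, 22, 19,   4, 7]])

R, pivots = A.rref()
print(pivots)         # (0, 1, 3): the first, second and fourth columns are pivots
print(A.nullspace())  # the two kernel basis vectors found in the example
```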

Exercise 2.6.1. Give an alternative proof of the dimension theorem: show (without using it) that the vectors in $\ker f$ that are obtained by the method above are not only independent but in fact span $\ker f$. Deduce the statement of the theorem from this.