


2.5.3 The Inverse of a Matrix and its Calculation

A system of linear equations can be written in the form $Ax = b$ by Theorem 2.5.7, where $A$ is the coefficient matrix and $b$ is the vector whose coordinates are the constants on the right hand sides of the equations. As there is formally a matrix multiplication on the left hand side, it is a natural question whether there is an analogue of division for matrices, since in that case we could hope to solve the system by "dividing both sides by $A$". It turns out that the answer to this question is (at least in part) positive: in some cases such an analogue exists. To understand the following notion correctly, note that division by a real number $a$ is nothing else than multiplication by its reciprocal $1/a = a^{-1}$. Now we introduce the corresponding notion for matrices (and use the latter notation to emphasize the similarity):

Definition 2.5.4. Assume that $A \in \mathbb{R}^{n \times n}$. The matrix $X \in \mathbb{R}^{n \times n}$ is called the inverse of $A$ if $AX = I_n = XA$ holds. In this case we use the notation $X = A^{-1}$.

It is an important part of the definition that the inverse is defined only for square matrices.

Also, if it exists, then it is unique. Indeed, if $XA = I = AX$ and $YA = I = AY$ hold for the matrices $X$ and $Y$, then

\[ X = XI = X(AY) = (XA)Y = IY = Y \]

by Proposition 2.5.4 and by the associativity of matrix multiplication. So the notation $A^{-1}$ is justified by the uniqueness, and from now on we can talk about the inverse of a matrix, at least if it exists. It is easy to see that there are matrices whose inverse exists, for example $I^{-1} = I$ by Proposition 2.5.4. Unfortunately this is not always the case, but the next theorem gives a complete answer to this question:

Theorem 2.5.10. The matrix $A \in \mathbb{R}^{n \times n}$ has an inverse if and only if $\det A \neq 0$.

Proof. Assume first that $A^{-1}$ exists. It follows easily from the definition of the determinant (or by part (ii) of Theorem 2.4.2) that $\det I_n = 1$ for every $n$. Then by Theorem 2.5.5 we have $1 = \det I_n = \det(AA^{-1}) = \det A \cdot \det A^{-1}$, and hence $\det A \neq 0$ must hold.

For the other direction we need the following lemma:

Lemma 2.5.11. If $A \in \mathbb{R}^{n \times n}$ and $\det A \neq 0$, then there exists a unique matrix $X \in \mathbb{R}^{n \times n}$ for which $AX = I_n$ holds.

Proof. If $AX = I_n$ holds, then of course $X$ must be of size $n \times n$. So let $x_1, \ldots, x_n \in \mathbb{R}^n$ be the columns of the matrix $X$. Observe that the $i$th column of $AX$ is $Ax_i$ by the definition of matrix multiplication. Hence the equation $AX = I_n$ holds if and only if the equation $Ax_i = e_i$ holds for every $1 \le i \le n$, where $e_1, e_2, \ldots, e_n$ are the columns of $I_n$, i.e. the vectors of the standard basis of $\mathbb{R}^n$ (see page 46). Since $\det A \neq 0$ by our assumption, each of these equations has a unique solution by Theorem 2.4.6, and the statement follows.

Now we return to the proof of the theorem. By the lemma there is a unique matrix $X$ for which $AX = I$ holds. We will show that in this case $XA = I$ holds as well. By Theorem 2.5.5 we have $1 = \det I = \det(AX) = \det A \cdot \det X$, so $\det X \neq 0$ holds, and hence the lemma above is applicable to $X$ as well. Thus there is a unique matrix $Y$ for which $XY = I$ holds.

Now Proposition 2.5.4 and the associativity of matrix multiplication give

\[ Y = IY = (AX)Y = A(XY) = AI = A, \]

hence $XA = I$ and the theorem follows.

Now if a system of linear equations is given by the matrix equation $Ax = b$, where $A \in \mathbb{R}^{n \times n}$ (which means that the number of equations is the same as the number of variables), and $\det A \neq 0$ also holds, then we can multiply the equation by $A^{-1}$ from the left to obtain

(7) $\quad x = I_n x = (A^{-1}A)x = A^{-1}(Ax) = A^{-1}b$,

so the unique solution of the system is $A^{-1}b$. This means that the system can be solved by a matrix multiplication if the matrix $A^{-1}$ is known. But observe that the proof above also gives a method for the computation of the inverse (if $\det A \neq 0$). By its last paragraph it is enough to determine the matrix $X$ for which $AX = I$ holds; it will automatically be the inverse of $A$. By the proof of Lemma 2.5.11 this can be done by solving the systems given by $Ax = e_i$ for every $1 \le i \le n$ using Gaussian elimination (for example).
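For illustration (a minimal sketch of our own, assuming NumPy is available and using a small example matrix of our choosing), the inverse can be assembled column by column by solving $Ax_i = e_i$ for the standard basis vectors:

```python
import numpy as np

# A small invertible example matrix (chosen for illustration only).
A = np.array([[2., 1.],
              [1., 1.]])
n = A.shape[0]

# Solve A x_i = e_i for each column e_i of the identity matrix;
# the solutions x_1, ..., x_n are the columns of A^{-1} (Lemma 2.5.11).
columns = [np.linalg.solve(A, np.eye(n)[:, i]) for i in range(n)]
A_inv = np.column_stack(columns)

print(A_inv)                              # [[ 1. -1.], [-1.  2.]]
print(np.allclose(A @ A_inv, np.eye(n)))  # True
```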

An advantage of this method is that it is not necessary to run the Gaussian elimination $n$ times: the systems can be solved simultaneously. Indeed, the coefficient matrices of the systems agree, so the algorithm makes the same steps in all cases; we only have to keep track of the changes of all the vectors on the right hand sides of the equations. That is, we write down the matrix $(A \mid e_1\, e_2 \ldots e_n)$ and run the Gaussian elimination for this matrix (note that the steps are determined only by the coefficient matrix $A$). If the determinant is 0, then at some point we get a 0 in the main diagonal such that all the entries below it in its column are also 0, and we can stop (just like in line 16 of the algorithm for the calculation of the determinant, see page 67). Otherwise, after the algorithm stops, we obtain the reduced row echelon form on the left side of the vertical line, which is the matrix $I_n$ in this case. So the result will be of the form $(I_n \mid x_1 \ldots x_n)$, where $x_i$ is the solution of the system $Ax = e_i$; that is, on the right side of the line we get the inverse of $A$.
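The following sketch (our own, in Python, using exact arithmetic via the fractions module) implements this simultaneous elimination on the augmented matrix $(A \mid I_n)$. It is a compact Gauss-Jordan variant that normalizes each pivot and eliminates both below and above it in one pass, rather than in the two phases described in the text, and it returns None when a zero pivot with only zeros below it is found (that is, when $\det A = 0$):

```python
from fractions import Fraction

def invert(A):
    """Gauss-Jordan elimination on the augmented matrix (A | I_n).

    Returns the inverse of A, or None if det A = 0 (a zero pivot is found
    whose column contains only zeros below it).
    """
    n = len(A)
    # Build (A | I_n) with exact rational entries.
    M = [[Fraction(A[i][j]) for j in range(n)] +
         [Fraction(1 if i == j else 0) for j in range(n)] for i in range(n)]

    for col in range(n):
        # Find a non-zero pivot in this column (swapping rows if necessary).
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return None  # det A = 0, so there is no inverse
        M[col], M[pivot] = M[pivot], M[col]
        # Scale the pivot row so that the leading coefficient is 1.
        p = M[col][col]
        M[col] = [x / p for x in M[col]]
        # Eliminate every other entry of this column (below and above).
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [M[r][j] - factor * M[col][j] for j in range(2 * n)]

    # The right half of the augmented matrix is now A^{-1}.
    return [row[n:] for row in M]

A = [[1, -3, 7], [-1, 3, -6], [2, -5, 12]]
print(invert(A))  # [[-6, -1, 3], [0, 2, 1], [1, 1, 0]] (as Fractions)
```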

We demonstrate this method by an example. Let us calculate the inverse of the matrix

\[ A = \begin{pmatrix} 1 & -3 & 7 \\ -1 & 3 & -6 \\ 2 & -5 & 12 \end{pmatrix}. \]

We are going to run the Gaussian elimination for the matrix $(A \mid I_3)$:

\[ \left(\begin{array}{ccc|ccc} 1 & -3 & 7 & 1 & 0 & 0 \\ -1 & 3 & -6 & 0 & 1 & 0 \\ 2 & -5 & 12 & 0 & 0 & 1 \end{array}\right) \]

The leading coefficient in the first row is already 1 in the beginning, so in the first step we eliminate the non-zero elements below it: we add the first row to the second one and subtract 2 times the first row from the third one. After these steps the second entry of the second row is zero, but we can swap the second and the third row to obtain a non-zero entry in the main diagonal:

\[ \left(\begin{array}{ccc|ccc} 1 & -3 & 7 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 & 1 & 0 \\ 0 & 1 & -2 & -2 & 0 & 1 \end{array}\right) \longrightarrow \left(\begin{array}{ccc|ccc} 1 & -3 & 7 & 1 & 0 & 0 \\ 0 & 1 & -2 & -2 & 0 & 1 \\ 0 & 0 & 1 & 1 & 1 & 0 \end{array}\right) \]

The arising matrix is already in row echelon form, so we turn to the second phase of the elimination. We eliminate the non-zero entries above the leading coefficient in the last row by adding 2 times the last row to the second one and subtracting 7 times the last row from the first one. Finally, we add 3 times the second row to the first one:

\[ \left(\begin{array}{ccc|ccc} 1 & -3 & 0 & -6 & -7 & 0 \\ 0 & 1 & 0 & 0 & 2 & 1 \\ 0 & 0 & 1 & 1 & 1 & 0 \end{array}\right) \longrightarrow \left(\begin{array}{ccc|ccc} 1 & 0 & 0 & -6 & -1 & 3 \\ 0 & 1 & 0 & 0 & 2 & 1 \\ 0 & 0 & 1 & 1 & 1 & 0 \end{array}\right) \]

At this point the algorithm stops and the columns on the right side of the vertical line form the inverse of $A$:

\[ A^{-1} = \begin{pmatrix} -6 & -1 & 3 \\ 0 & 2 & 1 \\ 1 & 1 & 0 \end{pmatrix}. \]

Now assume that we want to solve the system

\[ \begin{aligned} x_1 - 3x_2 + 7x_3 &= p, \\ -x_1 + 3x_2 - 6x_3 &= q, \\ 2x_1 - 5x_2 + 12x_3 &= r. \end{aligned} \]

This is equivalent to the equation $Ax = b$, where $A$ is the matrix above and $b = (p, q, r)^T$. As $\det A \neq 0$ and hence its inverse exists, we get by (7) that

\[ x = A^{-1}b = \begin{pmatrix} -6 & -1 & 3 \\ 0 & 2 & 1 \\ 1 & 1 & 0 \end{pmatrix} \begin{pmatrix} p \\ q \\ r \end{pmatrix} = \begin{pmatrix} -6p - q + 3r \\ 2q + r \\ p + q \end{pmatrix}. \]
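As a quick cross-check of this worked example (an addition of ours, assuming NumPy is available), we can verify numerically that the matrix obtained above is indeed $A^{-1}$ and that it reproduces the solution of the system:

```python
import numpy as np

A = np.array([[1., -3., 7.],
              [-1., 3., -6.],
              [2., -5., 12.]])
A_inv = np.array([[-6., -1., 3.],
                  [0., 2., 1.],
                  [1., 1., 0.]])

# The matrix computed by hand really is the inverse of A.
print(np.allclose(A @ A_inv, np.eye(3)))   # True

# For a concrete right hand side b = (p, q, r)^T the formula x = A^{-1} b
# reproduces the solution found by np.linalg.solve.
p, q, r = 1.0, 2.0, 3.0
b = np.array([p, q, r])
print(A_inv @ b)                                       # [1. 7. 3.] = (-6p-q+3r, 2q+r, p+q)
print(np.allclose(A_inv @ b, np.linalg.solve(A, b)))   # True
```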

The algorithm above works well in practice, but now we also give a formula for the inverse which is often useful in theoretical arguments. If $A \in \mathbb{R}^{n \times n}$, then let $\hat{A} \in \mathbb{R}^{n \times n}$ be the matrix whose entry $\hat{a}_{i,j}$ in the $i$th row and $j$th column is the cofactor $C_{j,i}$ assigned to the entry $a_{j,i}$ of $A$ (given in Definition 2.4.5). Note that the indices $i, j$ are swapped in the definition of $\hat{a}_{i,j}$. Then Theorem 2.4.7 and Corollary 2.4.8 together give that

(8) $\quad A\hat{A} = (\det A)\, I_n$.

If $\det A \neq 0$, then dividing by $\det A$ shows that the matrix $\frac{1}{\det A}\hat{A}$ satisfies $AX = I_n$, and hence by the proof of Theorem 2.5.10 it is the inverse of $A$. This gives the formula mentioned above:

Theorem 2.5.12. If $A \in \mathbb{R}^{n \times n}$ and $\det A \neq 0$, then

(9) $\quad A^{-1} = \dfrac{1}{\det A}\, \hat{A}$.

In general it is rather tiresome to calculate $\hat{A}$ based on its definition, but for $n = 2$ it is in fact very easy, since the minors of the matrix are $1 \times 1$ determinants whose values are just their single entries. So if

\[ A = \begin{pmatrix} a_{1,1} & a_{1,2} \\ a_{2,1} & a_{2,2} \end{pmatrix}, \]

whose determinant (given explicitly in (5)) is non-zero, then

(10) $\quad A^{-1} = \dfrac{1}{a_{1,1}a_{2,2} - a_{1,2}a_{2,1}} \begin{pmatrix} a_{2,2} & -a_{1,2} \\ -a_{2,1} & a_{1,1} \end{pmatrix}$.
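To make the definition of $\hat{A}$ concrete, here is a short sketch in Python (our own illustration; the helper names minor, det and adjugate are chosen for this example) that computes $\hat{A}$ from the cofactors and checks identity (8) and formula (9) on the matrix of the worked example above, using exact fractions:

```python
from fractions import Fraction

def minor(A, i, j):
    """The matrix obtained from A by deleting row i and column j."""
    return [row[:j] + row[j+1:] for k, row in enumerate(A) if k != i]

def det(A):
    """Determinant via Laplace expansion along the first row."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det(minor(A, 0, j)) for j in range(len(A)))

def adjugate(A):
    """The matrix with entries a_hat[i][j] = C_{j,i} (cofactors, indices swapped)."""
    n = len(A)
    return [[(-1) ** (i + j) * det(minor(A, j, i)) for j in range(n)] for i in range(n)]

A = [[Fraction(1), Fraction(-3), Fraction(7)],
     [Fraction(-1), Fraction(3), Fraction(-6)],
     [Fraction(2), Fraction(-5), Fraction(12)]]

A_hat = adjugate(A)
d = det(A)
n = len(A)

# Identity (8): A * A_hat equals (det A) times the identity matrix.
prod = [[sum(A[i][k] * A_hat[k][j] for k in range(n)) for j in range(n)] for i in range(n)]
print(prod == [[d if i == j else 0 for j in range(n)] for i in range(n)])  # True

# Formula (9): the inverse is A_hat divided by det A.
A_inv = [[x / d for x in row] for row in A_hat]
print(A_inv)  # [[-6, -1, 3], [0, 2, 1], [1, 1, 0]] (as Fractions)
```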

We can use the theorem above to give an exact formula for the unique solution of a system $Ax = b$, where $A \in \mathbb{R}^{n \times n}$ and $\det A \neq 0$. As we have seen in (7), this is

\[ A^{-1}b = \frac{1}{\det A}\, \hat{A} b, \]

so the value of the variable $x_i$ is

\[ x_i = \frac{1}{\det A} \sum_{j=1}^{n} \hat{a}_{i,j}\, b_j = \frac{1}{\det A} \sum_{j=1}^{n} b_j\, C_{j,i}. \]

If $B_i$ is the matrix which is obtained from $A$ when replacing its $i$th column by $b$, then the latter sum is just the determinant of $B_i$ expanded along the $i$th column, so we have

Theorem 2.5.13 (Cramer's rule). Assume that the system of linear equations is given by $Ax = b$, where $A \in \mathbb{R}^{n \times n}$ and $\det A \neq 0$. Then the unique solution of the system is

\[ x_i = \frac{\det B_i}{\det A} \qquad (1 \le i \le n), \]

where $B_i$ is the matrix obtained from $A$ when replacing its $i$th column by the vector $b$.
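A small sketch of Cramer's rule in Python (our own illustration, assuming NumPy; the function name cramer is ours), checked against the worked example above with $(p, q, r) = (1, 2, 3)$:

```python
import numpy as np

def cramer(A, b):
    """Solve Ax = b via Cramer's rule: x_i = det(B_i) / det(A),
    where B_i is A with its i-th column replaced by b."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    d = np.linalg.det(A)
    if np.isclose(d, 0.0):
        raise ValueError("det A = 0: Cramer's rule does not apply")
    x = np.empty(len(b))
    for i in range(len(b)):
        B_i = A.copy()
        B_i[:, i] = b          # replace the i-th column by b
        x[i] = np.linalg.det(B_i) / d
    return x

# The example system from above with (p, q, r) = (1, 2, 3):
A = [[1, -3, 7], [-1, 3, -6], [2, -5, 12]]
b = [1, 2, 3]
print(cramer(A, b))           # approximately [1. 7. 3.]
print(np.linalg.solve(A, b))  # the same solution
```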

As in the case of Theorem 2.5.12 we have to warn the reader that this formula is not practical for the calculation of the solution in general. But for $n = 2$ it is simple enough to take a closer look at it. So if we have the system

\[ \begin{aligned} a_{1,1}x_1 + a_{1,2}x_2 &= b_1, \\ a_{2,1}x_1 + a_{2,2}x_2 &= b_2, \end{aligned} \]

then

\[ A = \begin{pmatrix} a_{1,1} & a_{1,2} \\ a_{2,1} & a_{2,2} \end{pmatrix}, \qquad B_1 = \begin{pmatrix} b_1 & a_{1,2} \\ b_2 & a_{2,2} \end{pmatrix}, \qquad B_2 = \begin{pmatrix} a_{1,1} & b_1 \\ a_{2,1} & b_2 \end{pmatrix}, \]

and if $\det A = a_{1,1}a_{2,2} - a_{1,2}a_{2,1} \neq 0$, then the unique solution is given by

\[ x_1 = \frac{\det B_1}{\det A} = \frac{a_{2,2}b_1 - a_{1,2}b_2}{a_{1,1}a_{2,2} - a_{1,2}a_{2,1}}, \qquad x_2 = \frac{\det B_2}{\det A} = \frac{a_{1,1}b_2 - a_{2,1}b_1}{a_{1,1}a_{2,2} - a_{1,2}a_{2,1}}. \]

Recall that these are the same formulae that were given at the beginning of Section 2.4.
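For instance (a small numerical example of our own), for the system $2x_1 + x_2 = 5$, $x_1 + 3x_2 = 5$ the rule gives

\[ \det A = \begin{vmatrix} 2 & 1 \\ 1 & 3 \end{vmatrix} = 5, \qquad x_1 = \frac{1}{5}\begin{vmatrix} 5 & 1 \\ 5 & 3 \end{vmatrix} = \frac{10}{5} = 2, \qquad x_2 = \frac{1}{5}\begin{vmatrix} 2 & 5 \\ 1 & 5 \end{vmatrix} = \frac{5}{5} = 1, \]

and indeed $(x_1, x_2) = (2, 1)$ satisfies both equations.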

Finally, we show another quick application of Theorem 2.5.12. Note that if the entries of the matrix $A$ are rational numbers, then so are the entries of $\hat{A}$ (because its entries are determinants of matrices with rational entries multiplied by $\pm 1$, so every operation that we make in the calculation of $\hat{A}$ gives a rational result). Also, the determinant of $A$ is rational, so (9) gives that $A^{-1}$ has rational entries. In fact this also follows from the algorithm above for the calculation of the inverse. But the formula in (9) gives even more when we repeat this argument with integer entries. Namely, if the entries of $A$ are integers, then so are the entries of $\hat{A}$, so we immediately get

Corollary 2.5.14. Assume that $A \in \mathbb{R}^{n \times n}$ and the entries of $A$ are integers. If $\det A = \pm 1$, then the entries of $A^{-1}$ are also integers.
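For instance, the matrix $A$ of the worked example above has integer entries, and one can check that $\det A = -1$; and indeed the inverse computed there,

\[ A^{-1} = \begin{pmatrix} -6 & -1 & 3 \\ 0 & 2 & 1 \\ 1 & 1 & 0 \end{pmatrix}, \]

has integer entries, as the corollary predicts.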

This is a basic (and very important) fact in number theory, but we do not go into that direction.