Gaussian Elimination - Systems of Linear Equations

2.3 Systems of Linear Equations

2.3.2 Gaussian Elimination





[ 1  1   2  0   7 |  5 ]
[ 0  1   3  0   5 |  1 ]
[ 0  0   0  1  −3 | −3 ]

∼

[ 1  0  −1  0   2 |  4 ]
[ 0  1   3  0   5 |  1 ]
[ 0  0   0  1  −3 | −3 ]





We have reached the reduced echelon form and the algorithm stops. This form gives us all the solutions of the system in a manageable form. To see this we consider the equations that correspond to the rows of the last matrix:

x₁ − x₃ + 2x₅ = 4,
x₂ + 3x₃ + 5x₅ = 1,
x₄ − 3x₅ = −3.

It is easy to see that the values of x₃ and x₅ can be chosen freely, and that the values of the other variables are then uniquely determined by them. Hence the solutions of the system can be written in the following form:

x₃ = α ∈ ℝ, x₅ = β ∈ ℝ,
x₁ = 4 − 2β + α, x₂ = 1 − 5β − 3α, x₄ = −3 + 3β.

We call the variables x₃ and x₅ free parameters (since their values can be chosen freely).

They are those variables for which there is no leading coefficient in the corresponding column of the coefficient matrix.
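The parametrized solution can be checked numerically. The following Python sketch substitutes a few values of α and β into both matrices of the last elimination step; the names `BEFORE`, `REDUCED`, `solution` and `satisfies` are chosen here for illustration only, and the sample parameter values are arbitrary.

```python
from fractions import Fraction

# The two augmented matrices of the last elimination step
# (five variables; the last column is the right-hand side).
BEFORE = [
    [1, 1, 2, 0, 7, 5],
    [0, 1, 3, 0, 5, 1],
    [0, 0, 0, 1, -3, -3],
]
REDUCED = [
    [1, 0, -1, 0, 2, 4],
    [0, 1, 3, 0, 5, 1],
    [0, 0, 0, 1, -3, -3],
]

def solution(alpha, beta):
    """Return [x1, ..., x5] for the free parameters x3 = alpha, x5 = beta."""
    x1 = 4 - 2 * beta + alpha
    x2 = 1 - 5 * beta - 3 * alpha
    x4 = -3 + 3 * beta
    return [x1, x2, alpha, x4, beta]

def satisfies(matrix, xs):
    """Check a_1*x_1 + ... + a_n*x_n = b for every row of an augmented matrix."""
    return all(
        sum(a * x for a, x in zip(row[:-1], xs)) == row[-1]
        for row in matrix
    )

# A few arbitrary parameter choices, including non-integer ones.
samples = [(0, 0), (1, -2), (Fraction(1, 3), Fraction(5, 7))]
checks = [
    satisfies(BEFORE, solution(a, b)) and satisfies(REDUCED, solution(a, b))
    for a, b in samples
]
print(checks)
```

Every choice of the two parameters passes the check for both matrices, which is what we expect from two matrices describing the same solution set.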

2.3.2 Gaussian Elimination

For a system of equations with n variables and k equations we fix the notation x_j for the variables (1 ≤ j ≤ n) and a_i,j for the coefficient of x_j in the ith equation (1 ≤ i ≤ k, 1 ≤ j ≤ n). The constant on the right-hand side of the ith equation will be denoted by b_i. We also say that the system is of size (k × n), and its equations and augmented coefficient matrix are the following:

a_1,1x₁ + a_1,2x₂ + ··· + a_1,nx_n = b₁
a_2,1x₁ + a_2,2x₂ + ··· + a_2,nx_n = b₂
⋮
a_k,1x₁ + a_k,2x₂ + ··· + a_k,nx_n = b_k

[ a_1,1  a_1,2  ...  a_1,n | b₁  ]
[ a_2,1  a_2,2  ...  a_2,n | b₂  ]
[   ⋮      ⋮     ⋱     ⋮   |  ⋮  ]
[ a_k,1  a_k,2  ...  a_k,n | b_k ]







Definition 2.3.1. If a system of equations of size (k × n) is given with its augmented coefficient matrix, then we call the following operations elementary row operations: for every 1 ≤ i, j ≤ k, i ≠ j and λ ∈ ℝ,

(i) the (element-wise) multiplication of a row by λ if λ ≠ 0,

(ii) replacement of the ith row of the matrix by the sum of itself and λ times the jth row,

(iii) swapping of the ith and the jth rows,

(iv) omission of a row which contains only zero elements.

Proposition 2.3.1. The operations given in the previous definition are equivalent transformations of the coefficient matrix, i.e. the numbers y₁, . . . , y_n constitute a solution of the system given by the coefficient matrix before the operations if and only if they give a solution after the operations.

Proof. We prove the statement only for operation (ii); the remaining part of the proof is left to the reader. Assume that y₁, . . . , y_n is a solution of the system given by the matrix. Then a_i,1y₁ + ··· + a_i,ny_n = b_i and a_j,1y₁ + ··· + a_j,ny_n = b_j hold. Adding λ times the second equation to the first we obtain (a_i,1 + λa_j,1)y₁ + ··· + (a_i,n + λa_j,n)y_n = b_i + λb_j. But this means that the equation belonging to the ith row of the matrix after the operation holds. As the other rows do not change, y₁, . . . , y_n is a solution of the new system.

On the other hand, if y₁, . . . , y_n is a solution of the system that is described by the coefficient matrix after the operation, then (a_i,1 + λa_j,1)y₁ + ··· + (a_i,n + λa_j,n)y_n = b_i + λb_j and a_j,1y₁ + ··· + a_j,ny_n = b_j hold (these correspond to the ith and jth rows of the new matrix, respectively). Multiplying the latter equation by λ and subtracting the result from the former one we get a_i,1y₁ + ··· + a_i,ny_n = b_i. Hence y₁, . . . , y_n satisfy the ith equation of the original system, and since the other equations do not change, y₁, . . . , y_n is a solution of the original system.
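The argument of the proof can be tried on a small system. The Python sketch below applies operation (ii) to a (2 × 2) example, checks that solutions and non-solutions keep their status, and undoes the operation by applying it again with −λ. The function names and the sample system are assumptions made for this illustration, and rows are indexed from 0 in the code.

```python
def add_multiple(m, i, j, lam):
    """Operation (ii): replace row i by (row i) + lam * (row j), with i != j."""
    if i == j:
        raise ValueError("i and j must differ")
    return [
        [a + lam * b for a, b in zip(row, m[j])] if r == i else row[:]
        for r, row in enumerate(m)
    ]

def is_solution(m, ys):
    """True if ys satisfies every equation of the augmented matrix m."""
    return all(
        sum(a * y for a, y in zip(row[:-1], ys)) == row[-1]
        for row in m
    )

# x + y = 3, 2x - y = 0 has the solution (1, 2).
system = [[1, 1, 3], [2, -1, 0]]
after = add_multiple(system, 1, 0, -2)      # second row becomes [0, -3, -6]
restored = add_multiple(after, 1, 0, 2)     # the same operation with -lam undoes it

candidates = [(1, 2), (3, 0), (0, 0)]
before_flags = [is_solution(system, c) for c in candidates]
after_flags = [is_solution(after, c) for c in candidates]
print(before_flags, after_flags, restored == system)
```

The two lists of flags agree, and the second application restores the original matrix, mirroring the two directions of the proof.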

Definition 2.3.2. If a system of equations of size (k × n) is given with its augmented coefficient matrix, then we say that it is of row echelon form, when the following hold:

(i) every row of the matrix contains a non-zero element before the vertical line, and the first non-zero element (the so-called leading coefficient) of the row is 1,

(ii) if 1 ≤ i < j ≤ k, and the leading coefficient of the ith row is in the lth column, while the leading coefficient of the jth row is in the mth column, then l < m (and then every element below a leading coefficient in the corresponding column is zero; moreover, every element to the left of a leading coefficient in its row and in the rows below it is also zero).

We say that the coefficient matrix is of reduced row echelon form if the following holds besides (i) and (ii):

(iii) every element above a leading coefficient in the corresponding column is zero (i.e. a column of a leading coefficient contains only one non-zero element, namely the leading coefficient itself).

Here is an example of a matrix of row echelon form and another one which is of reduced row echelon form (every ∗ denotes an arbitrary real number):
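For a concrete illustration with numbers in place of the ∗ entries, the conditions of Definition 2.3.2 can also be checked mechanically. The following Python sketch is only an illustration: the function names and the three sample matrices are assumptions made here, and rows and columns are indexed from 0 in the code.

```python
def leading_columns(m):
    """Column of the first non-zero coefficient in each row (None if there is none).

    The last column (the right-hand side) is not searched.
    """
    return [
        next((c for c, a in enumerate(row[:-1]) if a != 0), None)
        for row in m
    ]

def is_row_echelon(m):
    cols = leading_columns(m)
    # (i) every row has a leading coefficient, and it equals 1
    if any(c is None or m[r][c] != 1 for r, c in enumerate(cols)):
        return False
    # (ii) leading coefficients move strictly to the right going down
    return all(c1 < c2 for c1, c2 in zip(cols, cols[1:]))

def is_reduced_row_echelon(m):
    if not is_row_echelon(m):
        return False
    cols = leading_columns(m)
    # (iii) only zeros above every leading coefficient
    return all(m[above][c] == 0 for r, c in enumerate(cols) for above in range(r))

ECHELON = [
    [1, 2, 0, 3, 5],
    [0, 0, 1, 4, 6],
    [0, 0, 0, 1, 7],
]
REDUCED = [
    [1, 2, 0, 0, 5],
    [0, 0, 1, 0, 6],
    [0, 0, 0, 1, 7],
]
NEITHER = [
    [0, 1, 2],
    [1, 0, 3],
]

for name, m in [("ECHELON", ECHELON), ("REDUCED", REDUCED), ("NEITHER", NEITHER)]:
    print(name, is_row_echelon(m), is_reduced_row_echelon(m))
```

In `ECHELON` the entry 3 above the last leading coefficient violates (iii), so it is of row echelon form only. In `NEITHER` the leading coefficients do not move to the right, so (ii) fails.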



If a system is of reduced row echelon form, then it is easy to find all of its solutions.

Indeed, if every column contains a leading coefficient (which is 1), then the unique solution is given by the values on the right of the vertical line (as in the first example of the previous section). If there are columns which do not contain a leading coefficient, then they correspond to free parameters, i.e. variables whose values can be chosen freely, and then the values of the other variables can be expressed in terms of the free parameters and the values on the right of the vertical line (see the last example of the previous section).

Gaussian elimination works in the following way: if a system of equations is given (by an augmented coefficient matrix), then we apply elementary row operations so that either we get a forbidden row, that is, a row which contains only zero elements on the left side of the vertical line and a non-zero last element (and then the system is inconsistent, i.e. it has no solution), or a matrix of reduced row echelon form is obtained (and then we can read off all of the solutions of the system). All this is ensured by Proposition 2.3.1. The process is divided into two phases. In the first one we reach a matrix of row echelon form or we get a row whose elements are all zero except for the last one. In the latter case we stop and give the output "there is no solution". Otherwise we continue with the second phase, where we reach a matrix of reduced row echelon form.

Gaussian Elimination - First Phase

Assume that the size of the system is (k × n). We store the number of the row (in the variable i) and the number of the column (in the variable j) where the next leading coefficient is supposed to be. Initially we set i = j = 1. In the first part of this phase we run a loop, whose body is described by the following two paragraphs.

If a_i,j ≠ 0, then we multiply the ith row by 1/a_i,j, and for every i < l ≤ k we multiply the (new) ith row by (−a_l,j) and add it to the lth row (obtaining zeros below the current leading coefficient). Now if i < k and j < n, then we increase i and j by 1 and continue from the beginning of the body; otherwise we break the loop and jump to the second part of the first phase (detailed below).

On the other hand, if a_i,j = 0 and a_l,j ≠ 0 for some i < l ≤ k, then we choose an l with this property (e.g. the least one), swap the ith and the lth rows, and continue as in the previous paragraph. If there is no such l, then we increase j by 1 when it is smaller than n and go back to the beginning of the body (i.e. to the previous paragraph). If we cannot increase j, then we decrease i by 1, break the loop and jump to the second part of the first phase.

In the second part of the first phase we do the following. If i = k, then we have reached the last row of the matrix and set its leading coefficient to 1. In this case the matrix is of row echelon form, so we finish the first phase. If i < k, then either we reached the nth column, set the leading coefficient to 1 in the ith row and nth column, and eliminated the non-zero elements below it, or we had a_l,n = 0 for every i + 1 ≤ l ≤ k anyway (but then a_i,n is not necessarily 1, or in the degenerate but still possible case i = 0 it is not even defined, since i starts at 1 and is decreased at most once). In all of these cases the lth row has only zeros on the left side of the vertical line for every i < l ≤ k. So if b_l ≠ 0 for some i < l ≤ k, then there are no solutions and the algorithm stops. Otherwise we omit the lth row for every i < l ≤ k.

We give the steps of this phase also in the form of a pseudocode:

GAUSSIAN ELIMINATION - FIRST PHASE

Input: a matrix A with k rows and n + 1 columns (the augmented coefficient matrix of a system of linear equations with n variables and k equations)

i ← 1; j ← 1
while true do
    if a_i,j ≠ 0 then
        multiply the ith row by 1/a_i,j
        if i < k then
            for every i < l ≤ k add (−a_l,j) times the ith row to the lth row
        if i = k or j = n then
            goto SECOND PART
        else
            i ← i + 1; j ← j + 1
    else
        if i < k and a_l,j ≠ 0 for some i < l ≤ k then
            swap the ith and the lth rows
        else
            if j = n then
                i ← i − 1
                goto SECOND PART
            else
                j ← j + 1
end while
SECOND PART:
if i < k then
    if b_l ≠ 0 for some i < l ≤ k then
        print "The system has no solution."; stop
    else
        omit the lth row for every i < l ≤ k
print "The matrix is of echelon form."; stop

Gaussian Elimination - Second Phase

In the second phase of the algorithm the input is an augmented coefficient matrix which is of row echelon form. Here we simply eliminate the non-zero elements above the leading coefficients. For example, if a_i,j = 1 is the leading coefficient of the ith row, then for every 1 ≤ l < i we multiply the ith row by a_l,j and subtract the result from the lth row (element-wise). We begin this phase with the last row and go backwards. Note that this order is not necessary, but it decreases the number of operations (because of the zeros that we produced in the previous steps of this phase). It is obvious that the resulting matrix is of reduced echelon form. We summarize this in the following theorem:

Theorem 2.3.2. If a system of equations is given by its augmented coefficient matrix and we apply Gaussian elimination, then exactly one of the following cases holds:

(i) We get a row at the end of the first phase whose elements are zero except for the last one. In this case the system has no solution.

(ii) We obtain a matrix of reduced row echelon form such that there is a leading coefficient in every column. Then the system has a unique solution.

(iii) We obtain a matrix of reduced row echelon form such that there are columns without a leading coefficient. Then the system has infinitely many solutions.

In the cases (ii) and (iii) the solutions can be read from the reduced echelon form as discussed above.
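Both phases and the three cases of Theorem 2.3.2 can be put together in a short program. The Python sketch below is one possible implementation under stated assumptions, not a transcription of the pseudocode: it loops over the columns instead of using goto, indexes rows and columns from 0, and uses exact rational arithmetic through `fractions.Fraction`. The function names are chosen here for illustration.

```python
from fractions import Fraction

def first_phase(matrix):
    """Return a row echelon form (zero rows omitted), or None on a forbidden row."""
    a = [[Fraction(x) for x in row] for row in matrix]
    k, n = len(a), len(a[0]) - 1
    i = 0  # row in which the next leading coefficient is placed
    for j in range(n):
        if i == k:
            break
        # Look for a non-zero entry in column j at or below row i.
        pivot = next((l for l in range(i, k) if a[l][j] != 0), None)
        if pivot is None:
            continue  # column j gets no leading coefficient
        a[i], a[pivot] = a[pivot], a[i]
        p = a[i][j]
        a[i] = [x / p for x in a[i]]          # leading coefficient becomes 1
        for l in range(i + 1, k):             # zeros below it
            f = a[l][j]
            a[l] = [x - f * y for x, y in zip(a[l], a[i])]
        i += 1
    # Rows i, ..., k-1 have only zeros on the left of the vertical line.
    if any(row[n] != 0 for row in a[i:]):
        return None
    return a[:i]

def second_phase(echelon):
    """Eliminate the entries above the leading coefficients, last row first."""
    a = [row[:] for row in echelon]
    for i in range(len(a) - 1, -1, -1):
        j = next(c for c, x in enumerate(a[i]) if x != 0)
        for l in range(i):
            f = a[l][j]
            a[l] = [x - f * y for x, y in zip(a[l], a[i])]
    return a

def gaussian_elimination(matrix):
    """Return (case, reduced matrix); the matrix is None if there is no solution."""
    echelon = first_phase(matrix)
    if echelon is None:
        return "no solution", None
    reduced = second_phase(echelon)
    n = len(matrix[0]) - 1
    if len(reduced) == n:
        return "unique solution", reduced
    return "infinitely many solutions", reduced

print(gaussian_elimination([[2, 1, 5], [1, -1, 1]]))
print(gaussian_elimination([[1, 1, 1], [1, 1, 2]]))
print(gaussian_elimination([[1, 1, 2], [2, 2, 4]]))
```

The three calls cover the three cases of the theorem: a system with the unique solution x₁ = 2, x₂ = 1, an inconsistent system, and a system whose second equation is a multiple of the first.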

Corollary 2.3.3. If a system of equations with k equations and n variables has a unique solution, then k ≥ n.

Proof. As the system has a solution, Gaussian elimination produces a matrix of reduced row echelon form with, say, k′ rows. Then k′ ≤ k since the algorithm does not increase the number of rows. Since the solution is unique, every column of the resulting matrix contains a leading coefficient, but the number of leading coefficients is the same as the number of rows, hence k′ = n and the claim follows.

Finally, we turn to the running time of the Gaussian elimination. It is not hard to see that in the case of a system with k equations and n variables the algorithm makes at most ck²n basic operations for some constant c, but the running time of these operations largely depends on how we store the numbers that are obtained during the process as results of previous operations, and also on how we actually implement these operations.

If all the inputs are rational numbers, then it would be possible to store the numerator and the denominator of them. But in this case we have to simplify the fractions after performing the operations on them, otherwise the magnitude of the numbers can get so large that the running time of the algorithm becomes exponential. This simplification can be done (for example) with the Euclidean algorithm, which still gives a polynomial running time, but unfortunately it is not fast enough for applications.
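The effect of skipping the simplification can be seen on a toy computation. The Python sketch below is not taken from an actual elimination run: it repeatedly adds a fraction to itself, once keeping the raw numerator and denominator and once simplifying after every step. The helper names are hypothetical; `math.gcd` returns the greatest common divisor, which is the quantity the Euclidean algorithm computes.

```python
from math import gcd

def add_raw(x, y):
    """Add two fractions given as (numerator, denominator) without simplifying."""
    (p1, q1), (p2, q2) = x, y
    return (p1 * q2 + p2 * q1, q1 * q2)

def simplify(x):
    """Divide numerator and denominator by their greatest common divisor."""
    p, q = x
    g = gcd(p, q)
    return (p // g, q // g)

raw = simplified = (1, 2)
for _ in range(10):
    raw = add_raw(raw, raw)                              # never simplified
    simplified = simplify(add_raw(simplified, simplified))

print(raw[1].bit_length())   # number of bits of the unsimplified denominator
print(simplified)            # the same value in lowest terms
```

Without simplification the denominator is squared in every step, so its length in bits roughly doubles each time, while the simplified pair stays small.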

Hence in practice an approximation (typically a floating-point format) is used giving a reasonable running time. The drawback of this is that the errors of the approximations can accumulate during the process resulting in an unacceptable outcome. This can happen for example when we divide by numbers which are very close to zero. Without giving any further details we summarize this as follows: the Gaussian elimination is an efficient algorithm when implemented carefully.
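A standard small example shows how division by a number close to zero can ruin the result. The Python sketch below solves the system 10⁻²⁰·x₁ + x₂ = 1, x₁ + x₂ = 2 in three ways: with floating-point numbers in the given row order, with floating-point numbers after swapping the two rows, and exactly with `fractions.Fraction`. The function `solve_square` is a hypothetical helper written for this illustration; it assumes a square system in which every pivot it meets is non-zero, and it clears the entries above and below a pivot in one pass.

```python
from fractions import Fraction

def solve_square(matrix):
    """Eliminate without row swaps on an n x (n+1) augmented matrix.

    Assumes every diagonal pivot met along the way is non-zero.
    """
    a = [row[:] for row in matrix]
    n = len(a)
    for i in range(n):
        p = a[i][i]
        a[i] = [x / p for x in a[i]]              # divide by the pivot
        for l in range(n):
            if l != i:
                f = a[l][i]
                a[l] = [x - f * y for x, y in zip(a[l], a[i])]
    return [row[n] for row in a]

eps = 1e-20
tiny_pivot = solve_square([[eps, 1, 1], [1, 1, 2]])      # divides by 1e-20
swapped = solve_square([[1, 1, 2], [eps, 1, 1]])         # same system, rows swapped
exact = solve_square([[Fraction(1, 10**20), 1, 1], [1, 1, 2]])

print(tiny_pivot)
print(swapped)
print([float(x) for x in exact])
```

The exact solution is very close to x₁ = x₂ = 1, and the swapped floating-point run agrees with it, but the run that divides by 10⁻²⁰ loses x₁ completely. Swapping rows so that a larger entry becomes the pivot (usually called partial pivoting) is one common remedy; it is a technique added here for illustration and is not discussed in the text.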

In document Introduction to the Theory of Computing I. (pages 54-58)