Improvement techniques of pricing

3.1 Updating the phase-1 simplex multiplier

As we saw in the previous chapter, the phase-2 simplex multiplier update formula (1.29) has been known since 1974 [99]. Below we show how the phase-1 simplex multiplier can be updated. As we saw in (1.25), this simplex multiplier is obtained by multiplying the vector $h$ by the actual basis inverse $B^{-1}$:

$$\varphi^T = h^T B^{-1}$$

Now we move from basis $B$ to $\bar{B}$, and the vector $h$ changes to $\bar{h}$. Let:

$$\Delta h = \bar{h} - h. \tag{3.1}$$

We suppose that $\Delta h$ contains far fewer nonzero components than $\bar{h}$, so we expect that the new simplex multiplier can be obtained faster by using $\Delta h$.

Determine the value of $\bar{\varphi}^T$ using $\varphi^T$, $\Delta h$ and $\bar{B}^{-1}$:

$$\bar{\varphi}^T = \bar{h}^T \bar{B}^{-1} = (h + \Delta h)^T \bar{B}^{-1} = h^T \bar{B}^{-1} + \Delta h^T \bar{B}^{-1} \tag{3.2}$$

We can compute the vector-matrix product $\Delta h^T \bar{B}^{-1}$ quickly, because $\Delta h$ has few nonzero components. However, we expect that $\bar{h}$ does not contain more nonzero components than $h$, so computing $h^T \bar{B}^{-1}$ directly can be slower than computing $\bar{\varphi}^T = \bar{h}^T \bar{B}^{-1}$ itself. Consequently, we need another way to compute $h^T \bar{B}^{-1}$.

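If we rearrange (1.25), we obtain

$$\varphi^T B = h^T.$$

Let $\hat{\varphi}$ be the vector for which the equality $h^T \bar{B}^{-1} = (\varphi + \hat{\varphi})^T$ holds. Rearranging this formula for $h^T$ gives $\hat{\varphi}$:

$$\begin{aligned}
(\varphi + \hat{\varphi})^T \bar{B} &= h^T, \\
(\varphi + \hat{\varphi})^T \bar{B} &= \varphi^T B, \\
\varphi^T \bar{B} + \hat{\varphi}^T \bar{B} &= \varphi^T B, \\
\hat{\varphi}^T \bar{B} &= \varphi^T B - \varphi^T \bar{B}, \\
\hat{\varphi}^T \bar{B} &= \varphi^T (B - \bar{B}),
\end{aligned}$$

$$\hat{\varphi}^T = \varphi^T (B - \bar{B}) \bar{B}^{-1}. \tag{3.3}$$

Finally, the updated simplex multiplier, if $h$ changes by $\Delta h$, based on (3.2), (3.1), and (3.3):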

$$\bar{\varphi}^T = \varphi^T + \left(\varphi^T (B - \bar{B}) + \Delta h^T\right) \bar{B}^{-1}. \tag{3.4}$$

The matrices $B$ and $\bar{B}$ differ only in the $p$th column, so every entry of $B - \bar{B}$ is zero except in the $p$th column. Thus in practice, computing $\varphi^T (B - \bar{B})$ needs only a vector subtraction and a dot product. As with the phase-2 reduced costs, these observations allow the phase-1 reduced costs to be updated efficiently. Denote the change in the phase-1 simplex multiplier by $\Delta\varphi^T$:

$$\Delta\varphi^T = \left(\varphi^T (B - \bar{B}) + \Delta h^T\right) \bar{B}^{-1}.$$
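The update formula (3.4) can be checked numerically on a small instance. The following is only a sketch: the function names, the 3×3 matrices and the vectors are made up for illustration, and a dense Gauss-Jordan solve stands in for the BTRAN operation that a real implementation would use.

```python
from fractions import Fraction


def solve_row(r, M):
    """Return x with x^T M = r^T (Gauss-Jordan on M^T x = r; M must be nonsingular)."""
    n = len(M)
    # Row i of the augmented system holds column i of M and the right-hand side r[i].
    A = [[Fraction(M[j][i]) for j in range(n)] + [Fraction(r[i])] for i in range(n)]
    for col in range(n):
        piv = next(i for i in range(col, n) if A[i][col] != 0)
        A[col], A[piv] = A[piv], A[col]
        A[col] = [x / A[col][col] for x in A[col]]
        for i in range(n):
            if i != col and A[i][col] != 0:
                f = A[i][col]
                A[i] = [x - f * y for x, y in zip(A[i], A[col])]
    return [A[i][n] for i in range(n)]


def row_times(x, M):
    """Return the row vector x^T M."""
    return [sum(x[j] * M[j][i] for j in range(len(M))) for i in range(len(M[0]))]


def updated_multiplier(phi, B, B_bar, dh):
    """Formula (3.4): phi_bar^T = phi^T + (phi^T (B - B_bar) + dh^T) B_bar^{-1}."""
    n = len(B)
    diff = [[B[i][j] - B_bar[i][j] for j in range(n)] for i in range(n)]
    rhs = [a + b for a, b in zip(row_times(phi, diff), dh)]
    dphi = solve_row(rhs, B_bar)  # dense stand-in for the BTRAN step
    return [a + b for a, b in zip(phi, dphi)]


# Illustrative data (assumed): B_bar equals B except in column p = 1.
B = [[2, 0, 1], [0, 1, 0], [1, 0, 1]]
B_bar = [[2, 1, 1], [0, 1, 0], [1, 2, 1]]
h = [1, 0, -1]
h_bar = [1, 1, -1]
dh = [hb - hh for hb, hh in zip(h_bar, h)]  # (3.1)

phi = solve_row(h, B)                     # phi^T = h^T B^{-1}
phi_bar_direct = solve_row(h_bar, B_bar)  # phi_bar^T = h_bar^T B_bar^{-1}
phi_bar_update = updated_multiplier(phi, B, B_bar, dh)
print(phi_bar_update == phi_bar_direct)   # True
```

Exact rational arithmetic is used here only so that the two results can be compared with `==`; it is not meant to reflect how a floating-point simplex code would work.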

In this case the phase-1 reduced costs can be computed as follows. We know that $d_R^T = \varphi^T R$, therefore

$$\bar{d}_R^T = \varphi^T R + \Delta\varphi^T R = d_R^T + \Delta\varphi^T R. \tag{3.5}$$

Before computing the simplex multiplier, it has to be checked whether the vector $\varphi^T (B - \bar{B}) + \Delta h^T$ has fewer nonzeros than $\bar{h}$, because if $\bar{h}$ has more nonzeros, then using (3.5) is preferred.
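A minimal sketch of the reduced cost update (3.5) follows. It assumes that the columns of $R$ are available as lists of (index, value) pairs and that $\Delta\varphi$ is a dense list; the function name and the data are hypothetical and only serve the illustration.

```python
def update_reduced_costs(d, dphi, R_columns):
    """Formula (3.5): d_bar[j] = d[j] + dphi^T R[:, j] for every nonbasic column j.

    R_columns[j] is a list of (idx, v) pairs holding the nonzeros of column j.
    """
    return [dj + sum(dphi[idx] * v for idx, v in col)
            for dj, col in zip(d, R_columns)]


# Illustrative data (assumed): two nonbasic columns with three rows.
R_columns = [[(0, 1), (2, 3)], [(1, 2)]]
phi = [2, 0, -3]
dphi = [0, 5, 0]  # change of the phase-1 simplex multiplier

# Old reduced costs, d_R^T = phi^T R.
d = [sum(phi[idx] * v for idx, v in col) for col in R_columns]
d_bar = update_reduced_costs(d, dphi, R_columns)
print(d, d_bar)  # [-7, 0] [-7, 10]
```

Only the rows where $\Delta\varphi$ is nonzero contribute to the correction, which is why the update pays off when $\Delta\varphi$ is sparse.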

The computation of $\varphi^T (B - \bar{B})$ can be implemented with three successive loops and appropriate vector representations. Since the coefficient matrix contains very few nonzeros, only these values are stored: vectors are stored as $(idx, v)$ pairs, where $idx$ gives the index of the value $v$ ($v \neq 0$) in the vector. This is the so-called sparse vector representation technique. Column and row vectors of $B$ are stored this way, as shown in Figure 3.1. It can slow down the algorithm if a single element of the vector has to be obtained, because this needs a search of complexity $O(n)$. However, the pricing module uses only the nonzeros of the necessary vectors, so it is sufficient to go through the list of $(idx, v)$ pairs.

Figure 3.1: An example for column-wise matrix representation

This technique makes the program faster, because it does not have to perform operations where one of the values is zero. The simplex multiplier $\varphi$, however, is stored in direct form, with its zero elements, since the algorithm reads the components of $\varphi$ in random order.
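The storage scheme described above can be sketched as follows. The helper names are hypothetical; the point is only to contrast the linear search needed for a single element with the cheap pass over the nonzeros that the pricing step actually needs.

```python
def sparse_from_dense(dense):
    """Keep only the nonzeros of a vector as (idx, v) pairs."""
    return [(i, v) for i, v in enumerate(dense) if v != 0]


def get(sparse, i):
    """Element access in sparse form: a linear search over the stored pairs."""
    for idx, v in sparse:
        if idx == i:
            return v
    return 0


def dot_direct_sparse(phi, sparse):
    """Dot product of a directly stored phi with a sparse vector.

    Only the stored nonzeros are visited; phi is indexed directly.
    """
    return sum(phi[idx] * v for idx, v in sparse)


b = sparse_from_dense([0, 4, 0, 0, -1])
phi = [2, 5, -3, 0, 1]
print(b)                          # [(1, 4), (4, -1)]
print(get(b, 4), get(b, 2))       # -1 0
print(dot_direct_sparse(phi, b))  # 19
```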

Usually, sparse representation does not guarantee that the $(idx, v)$ pairs are ordered by $idx$, because operations on the vectors can change the order of the elements, and maintaining the order is too expensive. The simplex method, however, changes the matrix $B$ only by replacing its columns; the elements within these vectors are unchanged. Since these elements do not vary, it makes sense to sort the $(idx, v)$ pairs by $idx$ once, in the initial phase. The question is which sorting algorithm is the most suitable, as different methods are preferred in different situations. When there are few pairs, selection sort is proposed. This algorithm has $O(n^2)$ complexity, where $n$ is the number of elements. For more elements, counting sort is the obvious choice, because its complexity is $O(n + k)$, where $k$ is the range of the elements. The counting sort can be simplified, because each index $idx$ in the pairs is unique.

With an inappropriate choice, ordering every vector of the coefficient matrix can be very time consuming, i.e. the sorting time can be comparable with the total execution time of the simplex method. If the sorting algorithm is chosen by a simple criterion, however, the running time of sorting is negligible. Let $n$ be the number of nonzeros in a vector, and $r$ the difference between the largest and the lowest nonzero index. The criterion for choosing the sorting algorithm is based on $n$ and $r$. Figure 3.2 shows an example of sorting times, where $r = 100$.
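The two sorting methods mentioned above might look like the following sketch. The function names are hypothetical, and the criterion that selects between the two methods is not reproduced here. The counting sort is simplified in the sense described in the text: since every index occurs at most once, each pair can be placed directly into the slot `idx - lowest index` without counting occurrences.

```python
def selection_sort_pairs(pairs):
    """O(n^2) selection sort of (idx, v) pairs by idx; suited to few pairs."""
    a = list(pairs)
    for i in range(len(a) - 1):
        m = min(range(i, len(a)), key=lambda j: a[j][0])
        a[i], a[m] = a[m], a[i]
    return a


def counting_sort_unique(pairs):
    """Simplified counting sort for pairs with unique indices, O(n + r)."""
    if not pairs:
        return []
    lo = min(idx for idx, _ in pairs)
    hi = max(idx for idx, _ in pairs)
    slots = [None] * (hi - lo + 1)  # r + 1 slots, one per possible index
    for idx, v in pairs:
        slots[idx - lo] = (idx, v)  # unique indices: no collisions
    return [p for p in slots if p is not None]


pairs = [(7, 1.5), (2, -3.0), (9, 4.0), (4, 2.0)]
print(selection_sort_pairs(pairs))  # [(2, -3.0), (4, 2.0), (7, 1.5), (9, 4.0)]
print(counting_sort_unique(pairs))  # [(2, -3.0), (4, 2.0), (7, 1.5), (9, 4.0)]
```

The sketch makes the trade-off visible: the counting variant allocates and scans $r + 1$ slots regardless of $n$, so it is wasteful when a vector has only a few widely spread nonzeros.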

We know that $B$ and $\bar{B}$ differ in one column vector; denote these vectors by $b$ and $\bar{b}$ respectively. This fact can be utilized to compute $\varphi^T (B - \bar{B})$. Moreover, as mentioned above, $b$ and $\bar{b}$ are stored in sparse form, with the pairs $(idx, v)$ in ascending order. Finally, $\varphi$ is in direct form, i.e. each component of $\varphi$ is directly accessible.

The algorithm computing $\varphi^T (B - \bar{B})$ utilizes these facts. It is similar to the merging step of merge sort: every nonzero of $b$ and $\bar{b}$ has to be visited. Let $(idx, v)^k_b$ denote the $k$th index-value pair in vector $b$, and similarly $(idx, v)^l_{\bar{b}}$ the $l$th pair in vector $\bar{b}$. The algorithm has to compute the following formula:

$$\sum_{i=1}^{n} \varphi_i (b_i - \bar{b}_i).$$

At the beginning, $l$ and $k$ are 1, and a number of iterations are executed. The index $k$ refers to the pair $(idx_1, v_1)$, and $l$ refers to $(idx_2, v_2)$. In each iteration the referred indices $idx_1$ and $idx_2$ are compared. Three cases can be distinguished:

Figure 3.2: Execution times of different sorting algorithms (selection sort and counting sort; horizontal axis: number of nonzeros, vertical axis: execution time [nanoseconds])

• $idx_1 = idx_2$: In this case the algorithm refers to the pairs $(idx_1, v_1)^k_b$ and $(idx_2, v_2)^l_{\bar{b}}$, so the value $\varphi_{idx_1}(v_1 - v_2)$ is computed.

• $idx_1 < idx_2$: In this case the algorithm refers to the pairs $(idx_1, v_1)^k_b$ and $(idx_2, v_2)^l_{\bar{b}}$. This means that there is no pair $(idx_2, v_2)^l_{\bar{b}}$ with $idx_1 = idx_2$, so the algorithm computes the value $\varphi_{idx_1}(v_1 - 0) = \varphi_{idx_1} v_1$.

• $idx_1 > idx_2$: In this case the algorithm refers to the pairs $(idx_1, v_1)^k_b$ and $(idx_2, v_2)^l_{\bar{b}}$. This means that there is no pair $(idx_1, v_1)^k_b$ with $idx_1 = idx_2$, so the algorithm computes the value $\varphi_{idx_2}(0 - v_2) = -\varphi_{idx_2} v_2$.

After these simple operations, the index of each pair that was used is increased by 1: both $k$ and $l$ in the first case, only $k$ in the second, and only $l$ in the third. Algorithm 3.1 shows the pseudo code of the algorithm, where $c(a)$ denotes the number of nonzeros in vector $a$, and the operator $nz(a, i)$ gives the pair $(idx, v)^i_a$ from vector $a$.
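The three cases above can be sketched in Python as follows. This is an illustration under stated assumptions, not a transcription of Algorithm 3.1: the function name is hypothetical, indices are 0-based rather than 1-based, and both sparse vectors are assumed to be sorted by index. The main loop and the two tail loops correspond to the three successive loops mentioned earlier.

```python
def phi_times_column_difference(phi, b, b_bar):
    """Return sum_i phi[i] * (b_i - b_bar_i).

    phi is stored directly; b and b_bar are lists of (idx, v) pairs
    sorted by idx in ascending order.
    """
    total = 0
    k = l = 0
    # Loop 1: merge-like pass while both vectors still have pairs.
    while k < len(b) and l < len(b_bar):
        idx1, v1 = b[k]
        idx2, v2 = b_bar[l]
        if idx1 == idx2:    # nonzero in both vectors
            total += phi[idx1] * (v1 - v2)
            k += 1
            l += 1
        elif idx1 < idx2:   # nonzero only in b
            total += phi[idx1] * v1
            k += 1
        else:               # nonzero only in b_bar
            total -= phi[idx2] * v2
            l += 1
    # Loop 2: remaining pairs of b.
    while k < len(b):
        idx1, v1 = b[k]
        total += phi[idx1] * v1
        k += 1
    # Loop 3: remaining pairs of b_bar.
    while l < len(b_bar):
        idx2, v2 = b_bar[l]
        total -= phi[idx2] * v2
        l += 1
    return total


phi = [2, 0, -3]
b = [(1, 1)]                      # column (0, 1, 0)
b_bar = [(0, 1), (1, 1), (2, 2)]  # column (1, 1, 2)
print(phi_times_column_difference(phi, b, b_bar))  # 4
```

Each stored pair is visited exactly once, so the cost is proportional to the number of nonzeros in $b$ and $\bar{b}$ together, independent of the dimension of $\varphi$.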

As computing (3.4) needs a BTRAN operation, it is useful to have a specific BTRAN implementation that exploits the sparsity of $\varphi^T (B - \bar{B}) + \Delta h^T$.

In document Numerically stable simplex method implementation (pages 54-57)