

3.5 Algorithm Evaluation

3.5.1 Goal of the Experiments

The goal of the following investigations is to measure the absolute and relative performance of muleGA. Absolute performance will be expressed by testing the algorithm on search spaces already evaluated by exhaustive search, while relative performance is calculated in comparison with traditional GAs. Emphasis is going to be put on the discovery and study of the assets and shortcomings of the multi-level evolutionary process. To this end, the parameters that determine the behavior of muleGA are going to be examined closely.

The inner cycle GAs and the single-level GA of the analyzed configurations are going to be parameterized with standard values. The effects of conventional GA parameters and GA variants on the multi-level evolutionary process are only briefly discussed.

As problem solving with muleGA involves the decomposition of the search space into subspaces, it is expected that the algorithm performs exceptionally well on problems whose subproblems are easily solvable with GAs and where the subproblems are not linked, or only loosely linked. The linkage of a variable is said to be of order k if it never depends on more than k−1 other variables [80]. However, in most real-life problems, there are some linkages between all of the variables, although with differing significances. When there are many conflicting objectives with a high-order linkage, GAs lose performance.
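As a hedged illustration of this definition (the function below is not from the cited source), a fitness function that decomposes into independent pairwise terms has linkage of order two, since every variable interacts with at most one other variable:

\[
f(x_1, \ldots, x_{2n}) = \sum_{i=1}^{n} g(x_{2i-1}, x_{2i}),
\]

whereas a term coupling all variables at once would raise the linkage order to 2n.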

It was shown in [110] that MOGAs are outperformed by traditional non-Pareto GAs with aggregated objective functions if the number of objectives is equal to or greater than four. For problems with more than ten objectives, a purely random search may perform better than the traditional MOGA approaches [111]. A detailed study of the optimization of many conflicting objectives is presented in [55]. It concludes that the number of objectives dramatically changes the behavior of MOGAs, thus the performance of an algorithm configuration for a small number of objectives cannot be generalized to a larger number of objectives, and good proximity to the trade-off surface with high diversity is hard to achieve. There are, however, variants of MOGAs that overcome these problems and provide good results for search spaces with 50 objectives.

The muleGA supports hierarchical decomposition to an unlimited number of levels, which can increase solution performance on high-dimensional problems. This study attempts to compare the performance of muleGA to traditional GAs (both with aggregating and Pareto-based techniques) on search spaces ranging from bi-objective to high-dimensional.

The variants of the Knapsack Problem (KP) are used for testing, as KPs are popular test problems for analyzing the performance of EAs, and they share many similarities with the real-world dietary menu planning problems presented in Section 4.4.1. Traditional numeric test functions for MOPs are also going to be utilized in testing to demonstrate the capabilities of muleGA. Note that these numeric functions are test beds for traditional MOGAs, which perform well on low-dimensional search spaces, usually two to four dimensional ones. This is why the traditional numeric functions implemented in MOGALib and presented in Section A.7 are suitable for this dimension range.

3.5.2 Test Problems and Measures of Performance

The performance of the algorithms is traditionally expressed as the distance of the found solution(s) from the optimal one(s) [112]. Other metrics are efficiency (CPU time, computational effort to obtain solutions, number of iterations, use of spatial and temporal resources) and effectiveness (accuracy and convergence, robustness, scalability). Test function characteristics are the following [11]: continuous vs. discontinuous vs. discrete, differentiable vs. non-differentiable, convex vs. concave, unimodal vs. multi-modal, numerical vs. alphanumeric, quadratic vs. non-quadratic, type of constraints (equalities, inequalities, linear, nonlinear, etc.), low vs. high dimensionality (genotype, phenotype), deceptive vs. non-deceptive, biased vs. non-biased portions of $PF_{true}$. Given the number of characteristic features a numerical test problem may have, a high number of possible test configurations are needed to test the performance of any new algorithm. From the No Free Lunch (NFL) theorems it follows that no algorithm is expected to perform equally well on every test problem. As a matter of fact, if every possible configuration of test problems is utilized, each algorithm comes out with an average performance.
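A minimal Python sketch of such a distance-based performance measure is given below; the generational-distance form is assumed here for illustration and is not prescribed by the text.

import numpy as np

def generational_distance(found_front, true_front):
    """Average Euclidean distance of each found objective vector to the
    nearest member of the enumerated true Pareto front (PF_true)."""
    found = np.asarray(found_front, dtype=float)   # shape: (n_found, n_objectives)
    true = np.asarray(true_front, dtype=float)     # shape: (n_true, n_objectives)
    # For every found point, take the distance to its closest true-front point.
    dists = np.linalg.norm(found[:, None, :] - true[None, :, :], axis=2).min(axis=1)
    return dists.mean()

# Usage on toy data (hypothetical values, for illustration only):
pf_true = [[0.0, 1.0], [0.5, 0.5], [1.0, 0.0]]
pf_found = [[0.1, 1.0], [0.6, 0.6]]
print(generational_distance(pf_found, pf_true))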

The test configurations which are going to be utilized here bear parameters through which muleGA's particular characteristics can be expressed. The main purpose is to determine the problem characteristics with which muleGA is expected to perform better than conventional GAs and MOGAs.

3.5.3 Experiment Design and Test Infrastructure

Two types of tests were assembled to assess muleGA performance. The first type of tests compares muleGA to traditional GAs with aggregating fitness functions in solving Multi-Level Multi-Objective 0-1 Knapsack Problems; the second type of tests deals with numerical MOP test functions and compares the performance of muleGA to MOGAs. The second type of tests was run on exhaustively evaluated search spaces, so the goodness of every outcome is expressed in light of the true optima. The results of the test runs were recorded in a PostgreSQL database; statistical hypothesis testing and visualization were run as batch processes using Matlab and GNUPlot. Results of the comparisons were also recorded in the database and figures were saved as image files.
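A minimal sketch of how a test-run outcome could be recorded in the PostgreSQL database is shown below; the table and column names (test_results, config_id, run_no, profit) are hypothetical, as the actual schema is not given in the text.

import psycopg2

def record_result(dsn, config_id, run_no, profit):
    """Store one test-run outcome; the schema referenced here is hypothetical."""
    with psycopg2.connect(dsn) as conn:
        with conn.cursor() as cur:
            cur.execute(
                "INSERT INTO test_results (config_id, run_no, profit) "
                "VALUES (%s, %s, %s)",
                (config_id, run_no, profit),
            )

# Example call (connection string and values are placeholders):
# record_result("dbname=mulega user=test", config_id=1, run_no=42, profit=0.83)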

3.5.4 Knapsack Problem Tests

Single-Level Knapsack Problems were generated randomly using the configurations detailed in Appendix B.1. Using these single-level problems, three Multi-Level KP configurations with second and third level extensions were constructed with the following parameters.

First knapsack configuration has one level-3 knapsack, two level-2 knapsacks and four base-level knapsacks. Each of the level-2 knapsacks constrains the contents of two base-level knapsacks, and the level-3 knapsack constrains the contents of the level-2 knapsacks. The weight constraints of the level-2 knapsacks equal the sum of the weight constraints of their corresponding base-level knapsacks. The level-3 knapsack's weight constraints equal the sum of those of the level-2 knapsacks.

Second knapsack configuration is generally the same as above, with the only difference being that 25% of the level-2 and level-3 weight constraints were randomly decreased to 80% of their original values.

Third knapsack configuration has no modifications to the weight constraints of the first configuration, but each level-2 knapsack constrains four base-level knapsacks, and there are eight base-level knapsacks in total. A sketch of the hierarchical structure of the first configuration is given below.
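The following minimal Python sketch illustrates the hierarchical structure of the first knapsack configuration; the class and field names are assumptions introduced for illustration, not identifiers from MOGALib.

from dataclasses import dataclass, field
from typing import List

@dataclass
class Knapsack:
    name: str
    weight_limits: List[float] = field(default_factory=list)  # one limit per weight dimension
    children: List["Knapsack"] = field(default_factory=list)  # constrained sub-knapsacks

def parent_of(name, children):
    """Upper-level knapsack whose weight limits are the sums of its children's limits."""
    dims = len(children[0].weight_limits)
    limits = [sum(c.weight_limits[d] for c in children) for d in range(dims)]
    return Knapsack(name, limits, children)

# First configuration: 4 base-level, 2 level-2 and 1 level-3 knapsack,
# with illustrative two-dimensional weight limits.
base = [Knapsack(f"base{i}", [10.0, 12.0]) for i in range(4)]
level2 = [parent_of("l2_a", base[:2]), parent_of("l2_b", base[2:])]
level3 = parent_of("l3", level2)
print(level3.weight_limits)  # [40.0, 48.0]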

The GAs compared in the experiments are configured as follows. The single-level GA runs an aggregating fitness function which rewards the individuals according to how much profit they represent. Whenever a weight constraint of any knapsack (base-level, level-2 or level-3) is breached, the individual's fitness is zeroed. The KPs are encoded as binary strings; the ith bit of a string encoding a KP (boxes-to-knapsack assignment) tells whether the ith box is in the knapsack. Each individual of the single-level GA encodes all of the various level knapsack problems by recording all of the assignments in a single genome. Uniform mutation is used in each GA, and mutation is done on a gene by randomly flipping its value (FlipMutation).
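A minimal sketch of this aggregating fitness function is given below; the data layout (per-box profit values and weight vectors, plus the constrained box indices and weight limits of every knapsack) is an assumption made for illustration, not the thesis implementation.

def aggregated_fitness(genome, profits, weights, knapsacks):
    """genome: 0/1 box assignments over all base-level boxes (one concatenated string).
    profits: per-box profit, already aggregated to one value per box (in the tests the
    profit dimensions are summed into this value).
    weights: per-box weight vector.
    knapsacks: every knapsack on every level as a dict with the indices of the boxes it
    (directly or transitively) constrains and its weight limits (assumed layout)."""
    # Zero fitness as soon as any weight dimension of any knapsack is breached.
    for ks in knapsacks:
        selected = [i for i in ks["box_indices"] if genome[i] == 1]
        for dim, limit in enumerate(ks["weight_limits"]):
            if sum(weights[i][dim] for i in selected) > limit:
                return 0.0
    # Otherwise reward the total profit the individual represents.
    return sum(profits[i] for i, bit in enumerate(genome) if bit == 1)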

The muleGA evolves the base-level knapsacks in separate populations, then collects their best individuals through input problem mapping and creates the top-level individuals from the first n best combinations of the input problem solutions, where n is the size of the top-level population. At each level, an aggregating fitness function is used. The base-level fitness functions only reward a single knapsack and only penalize breaching the weight constraint of the corresponding knapsack. The top-level fitness function rewards the level-2 and level-3 knapsacks and penalizes breaching their constraints.
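A minimal sketch of this input problem mapping step is given below, assuming the base-level results arrive as fitness-sorted lists of sub-genomes; the helper name, the naive enumeration of combinations and the ranking of a combination by the sum of its sub-fitness values are assumptions, not the thesis implementation.

from itertools import product

def map_input_problems(base_level_results, n):
    """base_level_results: one fitness-sorted list of (fitness, sub_genome) pairs per
    base-level GA. Returns the first n top-level individuals, built by concatenating the
    sub-genomes of the best input-problem solution combinations (best combinations first)."""
    # Naively enumerate combinations of the n best sub-solutions of each base-level GA.
    combos = product(*[results[:n] for results in base_level_results])
    # Rank a combination by the sum of its sub-fitness values (assumed scheme).
    ranked = sorted(combos, key=lambda c: -sum(fit for fit, _ in c))
    top_individuals = []
    for combo in ranked[:n]:
        genome = []
        for _, sub_genome in combo:
            genome.extend(sub_genome)  # concatenate sub-genomes into one top-level genome
        top_individuals.append(genome)
    return top_individuals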

Three pairs of test configurations were assembled, each comparing the results of a GA with its muleGA counterpart. The tests were configured as follows.


First Test Configuration has the single-level GA configured with the following parameters. Population size is 40, number of generations is 10 (the rate of convergence decreased significantly after ten iterations), chance of crossover is 90%, chance of mutation is 10%. The muleGA populations are sized 40 for each base-level GA and for the top-level GA as well. Base-level populations are evolved for 10 iterations each, while the top-level population is only evolved for 1 iteration (just collecting and once recombining and mutating its individuals). The two GAs are compared with each other on the first knapsack configuration. Note that each base-level GA in the muleGA configuration runs for the same number of iterations as the single-level GA. This, however, does not mean the muleGA has more iterations to run, because the single-level GA works with individuals encoding the whole level-3 problem, so the length of its chromosomes equals the sum of the lengths of the chromosomes of the base-level GAs. The computational resources the two configurations use are generally the same, so the comparison is not biased.
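As a hedged numerical check of this argument (using the smallest first-configuration instance with 16 boxes per base-level knapsack; the exact accounting is an assumption), the number of genes processed during evolution is the same for both setups:

\[
\underbrace{40 \cdot 10 \cdot 64}_{\text{single-level GA}} \;=\; 25\,600 \;=\; \underbrace{4 \cdot (40 \cdot 10 \cdot 16)}_{\text{four base-level GAs}},
\]

where 40 is the population size, 10 the number of generations, 64 the single-level chromosome length (4 · 16 bits) and 16 the base-level chromosome length; only the single top-level iteration (40 individuals of 64 genes) is added on the muleGA side.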

Second Test Configuration has the same GA configurations as the first one, but here the GAs evolve the problems defined in the second knapsack configuration. As the weight constraints of the upper-level knapsacks were decreased, an input problem mapping method was utilized for this configuration which does not allow the mapping of an input problem combination if it breaches some constraints of the parent problem. If it was necessary, the introduced input problem mapping function iteratively changed bits from one to zero in the mapped subproblems until the newly created individual satisfied all of the constraints.
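A minimal sketch of such a repair-based mapping is given below; which bit is cleared first is not specified in the text, so the sketch simply clears set bits from the left until the parent constraints hold (an assumption).

def repair_mapping(genome, parent_knapsacks, weights):
    """Clear 1-bits until every parent (level-2/level-3) weight constraint is satisfied.
    parent_knapsacks: dicts with the genome indices they constrain and their weight limits."""
    genome = list(genome)

    def violated():
        for ks in parent_knapsacks:
            selected = [i for i in ks["box_indices"] if genome[i] == 1]
            for dim, limit in enumerate(ks["weight_limits"]):
                if sum(weights[i][dim] for i in selected) > limit:
                    return True
        return False

    for i in range(len(genome)):
        if not violated():
            break
        if genome[i] == 1:
            genome[i] = 0  # drop a box from the mapped subproblem
    return genome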

Third Test Configuration evolves the third knapsack configuration problems with the GAs having the same parameters as in the first test configuration; only the number of base-level populations differs in the muleGA, as there are eight base-level GAs evolving the corresponding base-level knapsacks.

Each test configuration was run a hundred times and the results were analyzed with statistical hypothesis testing. The significance level of each test was chosen to be 5% uniformly. Normality of the distribution of the results was tested with Lilliefors tests [113]. If the results were normally distributed, two-sample t-tests were run to check whether the results of the single-level GA and the muleGA have equal means and variances.

Two-sample Kolmogorov-Smirnov tests [114] and paired, two-sided Wilcoxon signed rank tests [115] were both run to compare the results. The latter two tests do not require the distributions of the results to be normal.
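A minimal Python sketch of this testing pipeline is shown below (the thesis used Matlab batch scripts; the mapping of the named tests to these library functions is an assumption for illustration).

import numpy as np
from scipy import stats
from statsmodels.stats.diagnostic import lilliefors

ALPHA = 0.05  # 5% significance level, as in the experiments

def compare_runs(ga_profits, mulega_profits):
    """ga_profits, mulega_profits: top-level profits of the 100 runs of each algorithm."""
    ga, mu = np.asarray(ga_profits, dtype=float), np.asarray(mulega_profits, dtype=float)
    results = {}
    # Lilliefors test for normality of each sample.
    results["normal"] = (lilliefors(ga)[1] > ALPHA) and (lilliefors(mu)[1] > ALPHA)
    if results["normal"]:
        # Two-sample t-test on the means, run only for normally distributed results.
        results["t_p"] = stats.ttest_ind(ga, mu).pvalue
    # Distribution-free comparisons, run in every case.
    results["ks_p"] = stats.ks_2samp(ga, mu).pvalue
    results["wilcoxon_p"] = stats.wilcoxon(ga, mu).pvalue  # paired, two-sided by default
    return results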

Figures 3.8, 3.9(a), 3.9(b), 3.10 and 3.11 highlight the differences between the results of the single-level GA and muleGA. As shown in Table B.1 (in Appendix B), there are five configurations for the number of boxes ($2^1 \cdot 8 = 16, \ldots, 2^5 \cdot 8 = 256$) and ten configurations for the number of weight and profit dimensions ($2^1 = 2, \ldots, 2^{10} = 1024$).

The combinations of the settings give a total of 50 distinct configurations. The 3D figures 3.8, 3.10 and 3.11 show the differences of the results (the sum of the profit in the top-level knapsack) as a function of the number of boxes (decision space parameter) and the number of dimensions (objective space parameter).

Figure 3.8: The results of the first test configuration with the number of boxes ranging over $2^3 \cdot 2^i$, $(i = 1, \ldots, 5)$ (x-coord), weight and profit dimensions ranging over $2^j$, $(j = 1, \ldots, 10)$ (y-coord) and the difference between the results of muleGA and GA given in percentage, calculated by subtracting the latter from the former (z-coord). Note that the coloring is only for separating the configurations and increasing visibility.

Figure 3.9: Percentage differences as a function of decision space dimensions (number of boxes, panel (a)) and objective space dimensions (number of weight and profit dimensions, panel (b)), resulting from the first test configuration.


Figure 3.10: The results of the second test configuration with the number of boxes ranging over $2^3 \cdot 2^i$, $(i = 1, \ldots, 5)$ (x-coord), weight and profit dimensions ranging over $2^j$, $(j = 1, \ldots, 10)$ (y-coord) and the differences between the results of muleGA and GA given in percentage, calculated by subtracting the latter from the former (z-coord).

Figure 3.11: The results of the third test configuration with the number of boxes ranging over $2^3 \cdot 2^i$, $(i = 1, \ldots, 5)$ (x-coord), weight and profit dimensions ranging over $2^j$, $(j = 1, \ldots, 10)$ (y-coord) and the difference between the results of muleGA and GA given in percentage, calculated by subtracting the latter from the former (z-coord).

It can be concluded from the KP tests that muleGA is capable of finding significantly better solutions for the KPs in configurations where the number of boxes in the base-level knapsacks is low. Figure 3.9(a) highlights that muleGA runs result in some 17% better knapsack profit in 16 box base-knapsack configurations. This means the single-level GA reaches at most 83% of the maximum possible profit in solving the 64 (4 · 16) box 3-level KP. No exhaustive search was run on the KP test configurations, but from an iterative exploration of the promising regions of the search space it can be concluded that for the 4 · 16 box case, muleGA converges to the true optima while the single-level GA gets stuck just above 80%.

As the number of boxes increases, the potential of muleGA decreases. The phenomenon is the same in all test configurations, including the third one, where there are eight base-level GAs. The total number of knapsacks does not have any influence on the quality of the solution, in contrast to the number of boxes in the base-level knapsacks, which does. The perceivable performance loss as the number of boxes increases is attributable to the base-level GAs, which lose performance as the dimensions of their decision spaces increase. Therefore, they are not able to bring the top-level solution closer to the true optimum.

The statistical tests showed that the results of muleGA are significantly better than the results of the single-level GA in all configurations where the base-level box numbers were under 128. Only a handful of cases in high objective space dimensions produced insignificant differences between muleGA and GA results in 128 box base-level knapsack configurations.

In the 256 box base-level configurations, the muleGA and GA produce results which are statistically equivalent, regardless of the objective space dimensions.

As the figures show, the effect of the number of objective space dimensions on the results is minute. Only a mild decline is depicted by the results of the experiments as the number of dimensions increases. Note that aggregating fitness functions were used for testing, thus the goodness of each individual was summed into a one-dimensional profit value, which may explain why the results are insensitive to changes in the objective space dimension.

The tests on the multi-level KPs underline the effectiveness of using sub-level GAs for finding subsolutions in subspaces of multi-level KPs. These subsolutions can configure higher-level GAs, in which extra constraints can be introduced, just as they were in the second test configuration: even though a quarter of the weight limits were decreased to 80%, the muleGA still showed significantly better results than the single-level GA. Note that for KPs like those in the first and third test configurations, and unlike the second test configuration, there is no point in running the top-level GA for many iterations, since on the top level the dimension of the decision space is higher. However, for problems which introduce new constraints on the upper levels, like the problem in the second test configuration, iterating the upper-level GAs is necessary to have them evolve solutions which satisfy the upper-level constraints as well as the base-level ones. Note that aggregating fitness functions are not the most effective GA-based solution methods for solving KPs. Lamarckian and Baldwinian GAs [52] would find better solutions in the single-level implementation, but their advancements would also appear in the multi-level implementation.


Evolving only the base-level GAs is equivalent to the divide-and-conquer decomposition of the problem and solving the subproblems with GAs. The divide-and-conquer type of decomposition of the decision space and objective space, followed by solving the subproblems with GAs, has been tested and presented in [116] for multi-dimensional numerical test problems. Solving the subproblems with GAs and then combining their results was found significantly more effective than solving the whole problem with a single GA.

3.5.5 Numerical MOP Tests

The effectiveness of the divide-and-conquer decomposition of multi-dimensional numerical test problems is presented in [116]. The paper describes tests which employ scalable test problems [117, 118] with adjustable decision space and objective space dimensions.

The KPs are conveniently scalable, because it is easy to adjust the dimensions of the decision space and objective space, as the number of boxes and the dimension of the weight constraints are easily configurable. However, finding $P_{true}$ and $PF_{true}$ for KPs is a computationally expensive process, as the whole feasible region of the search space needs to be evaluated. This makes KPs less practical for testing purposes where the comparison of the results with $P_{true}$ and $PF_{true}$ is needed. Some of the numerical test problems are constructed in such a way that $P_{true}$ is known right away, as it can be formulated mathematically. The scalable test problems of Deb et al. [118] are problems of this type and became a standard in testing MOGAs in higher dimension (three or more) problem solving. However, these scalable problems are single-level problems, so their decomposition would be a top-down problem decomposition without introducing subproblem-specific constraints for each subproblem. To test muleGA in its intended function, namely in solving multi-level problems, multi-level numerical test problems are formulated here (with a bottom-up problem composition approach) by combining traditional numerical test problems presented in Appendix A.7. Combining single-objective or bi-objective MOPs into a higher dimension problem is the easiest way of creating multi-level test problems [117]. However, besides combining the traditional MOPs, new objectives are also introduced for the upper-level GA, just as has been done in the second configuration of the KP tests.

The traditional MOPs chosen for the experiments are the Binh1 [119, 120] and Poloni [121] problems, which are two-dimensional both in decision space and objective space.

The enumerated Pareto fronts of the problems are shown in Figure 3.12.
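As a hedged reference sketch (the exact formulations of Appendix A.7 are not reproduced in this section, so the definitions below follow the commonly cited forms of the Binh1 and Poloni problems and may differ in details such as variable bounds), the two test functions and the bottom-up composition used in the next subsection can be written as:

import math

def binh1(x, y):
    """Commonly cited form of the Binh1 bi-objective problem (both objectives minimized)."""
    f1 = x ** 2 + y ** 2
    f2 = (x - 5.0) ** 2 + (y - 5.0) ** 2
    return f1, f2

def poloni(x, y):
    """Commonly cited form of Poloni's bi-objective problem, with x, y in [-pi, pi]."""
    a1 = 0.5 * math.sin(1) - 2.0 * math.cos(1) + math.sin(2) - 1.5 * math.cos(2)
    a2 = 1.5 * math.sin(1) - math.cos(1) + 2.0 * math.sin(2) - 0.5 * math.cos(2)
    b1 = 0.5 * math.sin(x) - 2.0 * math.cos(x) + math.sin(y) - 1.5 * math.cos(y)
    b2 = 1.5 * math.sin(x) - math.cos(x) + 2.0 * math.sin(y) - 0.5 * math.cos(y)
    f1 = 1.0 + (a1 - b1) ** 2 + (a2 - b2) ** 2
    f2 = (x + 3.0) ** 2 + (y + 1.0) ** 2
    return f1, f2

def combine(problems, xs):
    """Bottom-up composition: concatenate the objective vectors of the sub-problems,
    e.g. two Binh1 instances give a 4-dimensional decision and objective space."""
    objs = []
    for p, (x, y) in zip(problems, xs):
        objs.extend(p(x, y))
    return tuple(objs)

# binh1.4D: two independent Binh1 sub-problems evaluated on a 4-dimensional decision vector.
print(combine([binh1, binh1], [(1.0, 2.0), (3.0, 4.0)]))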

3.5.5.1 Multi-Level Problem Formulation

The bi-objective MOPs (binh1 and poloni) are combined to form 4, 8, 16 and 32 dimensional problems, both in decision space and objective space. The $P_{true}$ sets of the new problems are the combinations of the base problems' $P_{true}$ sets, and the same holds true for the $PF_{true}$ sets.

\begin{align}
P_{true}^{binh1.4D} &= P_{true}^{binh1} \times P_{true}^{binh1} \tag{3.11}\\
P_{true}^{binh1.8D} &= P_{true}^{binh1.4D} \times P_{true}^{binh1.4D} \tag{3.12}\\
P_{true}^{binh1.16D} &= P_{true}^{binh1.8D} \times P_{true}^{binh1.8D} \tag{3.13}\\
P_{true}^{binh1.32D} &= P_{true}^{binh1.16D} \times P_{true}^{binh1.16D} \tag{3.14}
\end{align}

\begin{align}
P_{true}^{poloni.4D} &= P_{true}^{poloni} \times P_{true}^{poloni} \tag{3.15}\\
P_{true}^{poloni.8D} &= P_{true}^{poloni.4D} \times P_{true}^{poloni.4D} \tag{3.16}\\
P_{true}^{poloni.16D} &= P_{true}^{poloni.8D} \times P_{true}^{poloni.8D} \tag{3.17}\\
P_{true}^{poloni.32D} &= P_{true}^{poloni.16D} \times P_{true}^{poloni.16D} \tag{3.18}
\end{align}

So, the newly formed problems have 4, 8, 16 and 32 dimensional decision spaces and objective spaces, respectively. A traditional MOGA would have difficulties converging to the Pareto fronts of these newly formed problems, because of the following: in the high-dimensional objective space, there can be many individuals assigned rank 1, while
