Compile-time tasks of adaptive pattern matching

replacing contracted nodes, the tree structure has to be restored by removing the cycle edge from the forest that leads to the same free node to which the edge entering the cycle goes. In our case, the cycle edge leading to Tc is removed from the tree. Finally, the same procedure is repeated for the first cycle found, resulting in the search forest shown in Fig. 7.5(e).

To find a low-cost search plan in a given search forest, a simple greedy algorithm is used, which is sketched in Algorithm 7.2.

Algorithm 7.2 A greedy algorithm for generating a low-cost search plan, given a search forest.

Step 0: Set the counter to 1 and let S be the set consisting of the dummy node and the bound nodes.

Step 1: Select the smallest-weight forest edge e that goes out from S.

Step 2: Set the label of the target node of e to the value of the counter.

Step 3: Increment the counter by 1 and add the target node of e to S.

Step 4: If the search forest still has a node that is not in S, then go back to Step 1.
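The steps of Algorithm 7.2 can be sketched in Java as follows. This is only an illustrative sketch, not the implementation used in the thesis: the class and method names (GreedySearchPlan, Edge, label) are hypothetical, nodes are identified by plain strings, and ties between equal weights are broken by the order of the edge list, which is an assumption.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class GreedySearchPlan {
    /** A weighted search forest edge; nodes are identified by name in this sketch. */
    static final class Edge {
        final String source;
        final String target;
        final double weight;

        Edge(String source, String target, double weight) {
            this.source = source;
            this.target = target;
            this.weight = weight;
        }
    }

    /**
     * Assigns search plan labels to the nodes reachable from the start set.
     * The start set holds the dummy node and the bound nodes (Step 0).
     */
    static Map<String, Integer> label(Set<String> start, List<Edge> forest) {
        Set<String> s = new HashSet<>(start);             // Step 0: the set S
        Map<String, Integer> labels = new LinkedHashMap<>();
        int counter = 1;                                   // Step 0: counter
        while (true) {
            Edge best = null;
            for (Edge e : forest) {                        // Step 1: cheapest edge leaving S
                boolean leavesS = s.contains(e.source) && !s.contains(e.target);
                if (leavesS && (best == null || e.weight < best.weight)) {
                    best = e;
                }
            }
            // Step 4: in a forest rooted in S, no edge leaving S means every node is in S.
            if (best == null) {
                break;
            }
            labels.put(best.target, counter);              // Step 2: label the target node
            counter++;                                     // Step 3: increment the counter
            s.add(best.target);                            // Step 3: extend S
        }
        return labels;
    }
}
```

Calling label with a start set containing only the dummy node and a forest in which the dummy node leads to Tc, and Tc leads to Cc, Pc, and C with non-decreasing (made-up) weights, yields the labels 1 to 4 in that order.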

We do not claim that these simple algorithms provide optimal solutions for our special cost model, but best engineering practice suggests that if edges with weights giving the minimum sum are selected, then the search forest and the search plan consisting of the same edges also have low cost when our special cost function is employed. Simplicity and speed are further arguments in favour of applying such algorithms.

Example 26 To illustrate the operation of Algorithm 7.2, the search forest of Fig. 7.5(e) has been selected as a starting point. In the first iteration, since the algorithm has no alternatives, it selects the iteration edge connecting the dummy node to free node Tc. In the second iteration, 3 outgoing forest edges can be found, namely the ones connecting node Tc to Cc, Pc, and C, respectively. These forest edges are processed in this specific order as their weights form a non-decreasing sequence,¹ and no new outgoing forest edges get into the set of valid choices. After the fourth iteration, the algorithm has no alternatives for forest edge selection, thus, it assigns labels 5, 6, and 7 as shown in Fig. 7.4(a), which in turn depicts the resulting search plan.

7.4 Compile-time tasks of adaptive pattern matching

At this point, several search plans have been elaborated either by the model-specific approach of Sec. 7.3 or by any other search plan generation technique.

In a traditional pattern matching approach, only one search plan is selected for each pair of LHS pattern and adornment, and it is compiled in the code generation phase into executable code on the target platform. This approach is now extended to support adaptive execution in two ways: pattern matching code is generated for several search plans prepared for each LHS pattern and adornment combination, and a model-dependent cost function is defined, which can be evaluated at run-time to estimate the performance of a given pattern matching strategy before actually executing it.

¹ Edges leading to nodes Cc and Pc can also be processed in reverse order, as they have the same weight.

114 CHAPTER 7. ADAPTIVE GRAPH TRANSFORMATION

7.4.1 Theoretical foundations of compile-time support for adaptivity

Our solution for code generation uses the Strategy design pattern [48] (see Fig. 7.6), which means that a class extending the abstract PatternMatchingStrategy class is generated from each search plan.

[Figure 7.6: Strategy design pattern. The abstract class PatternMatchingStrategy declares the methods cost and match; StrategyFromModel1 and StrategyFromModel2 are concrete subclasses.]

The abstract class has two basic functionalities, which have to be supported by the generated classes as well.

(a) One method (i.e., match()) implements the actual pattern matching algorithm. Since our aim is to focus on the novelties of adaptation-related topics, the generation of pattern matching algorithms is not discussed in more detail than has already been presented in Sec. 4.2.5.

(b) The other relevant functionality is the calculation of cost for the given pattern matching strategy based on the statistics of the actual instance model available at run-time.
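The two functionalities can be arranged as in the following minimal Java sketch. It rests on several assumptions that are not taken from the thesis: the Matching class is a simplified stand-in, StubStrategy is a hypothetical placeholder for a generated strategy class with a fixed cost, the parameterless signature of cost() is a guess, and choosing the strategy with the lowest estimated cost is only one plausible way of using the run-time estimates.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Simplified stand-in for the matching container (assumption). */
final class Matching {
    private final Map<String, Object> bindings = new HashMap<>();

    void set(String patternNode, Object modelElement) {
        bindings.put(patternNode, modelElement);
    }

    Object get(String patternNode) {
        return bindings.get(patternNode);
    }
}

/** Abstract strategy offering the functionalities (a) and (b). */
abstract class PatternMatchingStrategy {
    /** (a) Executes the pattern matching algorithm of one search plan. */
    abstract Matching match(Matching initialMatching);

    /** (b) Estimates the cost of this strategy on the current instance model. */
    abstract double cost();
}

/** Hypothetical placeholder for a generated strategy class. */
final class StubStrategy extends PatternMatchingStrategy {
    final String name;
    private final double estimatedCost;

    StubStrategy(String name, double estimatedCost) {
        this.name = name;
        this.estimatedCost = estimatedCost;
    }

    @Override
    Matching match(Matching initialMatching) {
        return initialMatching; // no real search is performed in this sketch
    }

    @Override
    double cost() {
        return estimatedCost;   // a generated class would consult model statistics here
    }
}

final class StrategySelector {
    /** Returns the candidate with the lowest estimated cost. */
    static PatternMatchingStrategy cheapest(List<? extends PatternMatchingStrategy> candidates) {
        return candidates.stream()
                .min(Comparator.comparingDouble(PatternMatchingStrategy::cost))
                .orElseThrow(() -> new IllegalArgumentException("no strategies given"));
    }
}
```

With this arrangement, the caller evaluates cost() on every candidate before matching starts and then invokes match() only on the selected strategy.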

In order to support the cost calculation functionality of pattern matching strategies, a model-dependent cost function $c_{SP}(M)$ has to be specified for each search plan $SP$ that drives the corresponding strategy.

In this thesis, we use the cost function

\[ c_{SP}(M) = \sum_{i=1}^{|V_{SG}^{F}|} \prod_{k=1}^{i} c_k(M) \]

where operation cost $c_k(M)$ is defined by the following three rules.

• Cost of iteration. If the source node $d$ of the search forest edge leading to the free node $v$ with label $k$ is the dummy node, then the operation cost is given by the value of the object counter $\#_{t(v)}$ that has been defined for the direct type $t(v)$ of pattern node $v$, which is, in turn, the origin of free node $v$. Formally,

\[ \forall k \in \mathbb{Z}^{+} : 1 \le k \le |V_{SG}^{F}| \wedge SP(v) = k \wedge d \xrightarrow{z} v \in E_{SF} \wedge b(v) = v \implies c_k(M) = \#_{t(v)}(M) \]

• Cost of forward navigation. If the search forest edge $u \xrightarrow{z} v$ connects pattern node derivative $u$ to free node $v$ with label $k$ and goes in the same direction as its origin search graph edge $u \xrightarrow{z} v$, then the operation cost is calculated as follows. The value of the link counter $\#(t(z),t(u),t(v))$ declared for the type $t(z)$ of the pattern edge $z$ restricted by direct types $t(u)$ and $t(v)$ of source and target pattern nodes $u$ and $v$, respectively, is divided by the value of the object counter $\#_{t(u)}$ declared for the direct type $t(u)$ of the source pattern node $u$. Formally,

\[ \forall k \in \mathbb{Z}^{+} : 1 \le k \le |V_{SG}^{F}| \wedge SP(v) = k \wedge u \xrightarrow{z} v \in E_{SF} \wedge u \in V_{SG}^{P} \wedge b(u \xrightarrow{z} v) = u \xrightarrow{z} v \implies c_k(M) = \frac{\#(t(z),t(u),t(v))(M)}{\#_{t(u)}(M)} \]

• Cost of backward navigation. If the search forest edge $v \xrightarrow{z^{inv}} u$ leading from pattern node derivative $v$ to free node $u$ with label $k$ goes in the opposite direction to its origin search graph edge $u \xrightarrow{z} v$, then the operation cost is calculated as follows. The value of the link counter $\#(t(z),t(u),t(v))$ declared for the type $t(z)$ of the pattern edge $z$ restricted by direct types $t(u)$ and $t(v)$ of source and target pattern nodes $u$ and $v$, respectively, is divided by the value of the object counter $\#_{t(v)}$ declared for the direct type $t(v)$ of the target pattern node $v$. Formally,

\[ \forall k \in \mathbb{Z}^{+} : 1 \le k \le |V_{SG}^{F}| \wedge SP(u) = k \wedge v \xrightarrow{z^{inv}} u \in E_{SF} \wedge v \in V_{SG}^{P} \wedge b(v \xrightarrow{z^{inv}} u) = u \xrightarrow{z} v \implies c_k(M) = \frac{\#(t(z),t(u),t(v))(M)}{\#_{t(v)}(M)} \]

Note that the above-mentioned cost function is exactly the same as the one in Sec. 7.3.1 that has been used for defining model-specific search graphs for typical models in the optimization phase. However, the application schedules of these cost functions differ completely, as the one defined in the current section is evaluated at run-time, while the other is calculated at compile-time. It is worth emphasizing that the presented adaptive pattern matching approach also requires the statistics support of the underlying model repository, as it uses object and link counters for cost calculations.
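The cost function and its three rules can be illustrated by the short Java sketch below. It is a sketch under stated assumptions rather than the generated code: the class SearchPlanCost and its method names are hypothetical, counter values are passed in as plain numbers instead of being read from a model repository, and object counters used as divisors are assumed to be non-zero. Both navigation rules share one helper, because in either direction the divisor is the object counter of the type of the already bound node from which navigation starts.

```java
final class SearchPlanCost {
    /** Cost of iteration: the object counter of the direct type of the node to be bound. */
    static double iterationCost(long objectCount) {
        return objectCount;
    }

    /**
     * Cost of forward or backward navigation: the link counter divided by the
     * object counter of the type of the node where navigation starts.
     * Assumes startObjectCount is non-zero.
     */
    static double navigationCost(long linkCount, long startObjectCount) {
        return (double) linkCount / startObjectCount;
    }

    /**
     * Evaluates the sum over i of the product of the first i operation costs.
     * Operation costs must be given in increasing order of search plan labels.
     */
    static double planCost(double[] operationCosts) {
        double term = 1.0;    // product of the costs processed so far
        double result = 0.0;  // partial sum of the products
        for (double c : operationCosts) {
            term *= c;
            result += term;
        }
        return result;
    }
}
```

For example, with made-up counters of 10 tables and 40 links of a given type ending in tables, the first two operation costs are 10 and 40/10 = 4. A third operation of cost 0.5 then gives a plan cost of 10 + 10·4 + 10·4·0.5 = 70.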

7.4.2 Compile-time tasks in EJB3-based adaptive pattern matching

We present how an EJB3-based pattern matching engine can be made adaptive. To this end, code fragments generated for a sample concrete strategy subclass are shown, which implement the pattern matching and cost calculation functionalities.

Example 27 Pattern matching driven by the search plan of Fig. 7.4(a) is implemented by the code presented in Listing 7.1. Note that this method is structurally similar to code fragments of Listings 4.1 and 4.2, which have been shown earlier in Sec. 4.2.5.

This strategy starts with iterating over all tables and binding them to table Tc one by one (Lines 2–5). For each table in the model, a corresponding column and primary key are sought by navigating along links of type CF (Lines 6–9) and EO (Lines 10–13), respectively. At this point, the existence of a UF link between the previously bound column and primary key is checked (Lines 14–15). Then the class that has been transformed to the table assigned to Tc is determined and bound to C (Lines 16–17). Note that reference edges can be navigated without a while loop, as at-most-one multiplicity constraints have been defined for both ends of such edges in the metamodel. When AE, Rel, and Trel are already bound to a corresponding association end, association, and table by the code fragments of Lines 18–21, 22–23, and 24–25, respectively, the checking of the NAC follows (Lines 26–29), which requires a matching to be initialized with the mapping of AE. When the NAC check fails (i.e., no matching of the NAC is found), the mappings of Tc, Cc, Pc, C, AE, Rel, and Trel constitute a complete matching, which can be returned as a result (Lines 31–35).


 1 public Matching match(Matching initialMatching) {
 2   // Level 1 -- Binds tc : Table
 3   Iterator<Table> iTc = getAllTables();
 4   while (iTc.hasNext()) {
 5     Table tc = iTc.next();
 6     // Level 2 -- Binds cc : Column
 7     Iterator<Feature> iCc = tc.getCF();
 8     while (iCc.hasNext()) {
 9       Column cc = (Column) iCc.next();
10       // Level 3 -- Binds pc : PKey
11       Iterator<ModelElement> iPc = tc.getEO();
12       while (iPc.hasNext()) {
13         PKey pc = (PKey) iPc.next();
14         // Checks UF edge
15         if (pc.getUF().contains(cc)) {
16           // Level 4 -- Binds c : Class
17           Class c = (Class) tc.getRef();
18           // Level 5 -- Binds ae : AssocEnd
19           Iterator<Feature> iAE = c.getSFT();
20           while (iAE.hasNext()) {
21             AssocEnd ae = (AssocEnd) iAE.next();
22             // Level 6 -- Binds rel : Association
23             Association rel = (Association) ae.getCF();
24             // Level 7 -- Binds trel : Table
25             Table trel = (Table) rel.getRef();
26             // Checks NAC
27             Matching mNAC = new Matching();
28             mNAC.set("AE", ae);
29             if (! nacMatcher.match(mNAC)) {
30               // Prepares the result matching
31               Matching result = new Matching();
32               result.set("Tc",tc); result.set("Cc",cc); result.set("Pc",pc);
33               result.set("C",c); result.set("AE",ae); result.set("Rel",rel);
34               result.set("TRel",trel);
35               return result;
36   } } } } } }
37   return null; // assumption: null signals that no complete matching was found
38 }

Listing 7.1: Program code equivalent of the search plan of Fig. 7.4(a)

Example 28 By using the search plan of Fig. 7.4(a), a corresponding cost calculation method is also generated as shown by Listing 7.2.

The cost method processes search forest edges in increasing order according to the label attached to their target free node. Local variable term stores the product of the costs of the forest edges that have already been processed, while variable result denotes the partial result of the addition. For each forest edge, variable term is multiplied by the corresponding cost of the edge, then it is added to the partial result. E.g., the forest edge connecting Tc to Cc has label 2 at its target node derivative Cc, so it is processed in the second round (Lines 8–11). This forest edge represents a backward navigation along links of type CF starting from the table assigned to pattern node Tc, so its cost is calculated by dividing the link counter #(CF,Column,Table) by the object counter #(Table) declared for the direct type of the already fixed pattern node Tc.

In document: Gergely Varró (MSc in Technical Informatics), PhD Thesis, Advanced Techniques for the Implementation of Model Transformation Sy (pages 125–129). Supervisors: Dr. Katalin Friedl, PhD, associate professor; Dr. Dániel Varró, PhD, assistant professor; Prof. Dr. rer. nat. Andy Schürr, professor. Budapest, April 2008.