
4.2 Motivation: Combining Different Pattern Matching Approaches

As an alternative, incremental pattern matching (INC) approaches have recently become popular in the model transformation community. The core idea is to reduce the execution time of the time-consuming pattern matching phase at the cost of additional memory consumption. Essentially, the (partial) matches of graph patterns are stored explicitly, and these match sets are updated incrementally in accordance with elementary model changes. While model manipulation becomes slightly more complex, all matches of a graph pattern can in exchange be retrieved in constant time, because existing matches no longer need to be recomputed.

Initial benchmarking [17, 2] has shown that in many scenarios, the incremental pattern matching approach (as implemented in the Viatra2 framework) leads to orders-of-magnitude increases in speed. However, an important implication of caching match sets is increased memory consumption, which needs to be taken into account when scaling up to large models. Unfortunately, in many practical applications of model transformations, available memory is frequently constrained (e.g. when they are executed on average desktop computers and not on high performance servers).
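The trade-off described above can be illustrated with a minimal Python sketch. It is not the Viatra2 implementation: the class `IncrementalTwoHopMatcher`, the function `local_search_two_hop`, and the fixed two-hop path pattern (a -> b -> c) are hypothetical names and simplifications chosen for illustration only. The incremental matcher keeps the full match set in memory and patches it on every edge insertion or deletion, while the local search variant stores nothing and recomputes all matches on demand.

```python
from collections import defaultdict


class IncrementalTwoHopMatcher:
    """Caches all matches (a, b, c) of the pattern a -> b -> c (illustrative only)."""

    def __init__(self):
        self.succ = defaultdict(set)   # node -> successors
        self.pred = defaultdict(set)   # node -> predecessors
        self.matches = set()           # explicitly stored match set

    def add_edge(self, u, v):
        if v in self.succ[u]:
            return
        self.succ[u].add(v)
        self.pred[v].add(u)
        # Only matches that use the new edge can appear.
        self.matches |= {(p, u, v) for p in self.pred[u]}
        self.matches |= {(u, v, s) for s in self.succ[v]}

    def remove_edge(self, u, v):
        if v not in self.succ[u]:
            return
        # Only matches that use the deleted edge can disappear.
        affected = {(p, u, v) for p in self.pred[u]}
        affected |= {(u, v, s) for s in self.succ[v]}
        self.matches -= affected
        self.succ[u].discard(v)
        self.pred[v].discard(u)

    def get_matches(self):
        # Retrieval is a plain lookup; nothing is recomputed.
        return self.matches


def local_search_two_hop(edges):
    """Recomputes every match from scratch; no match set is kept in memory."""
    succ = defaultdict(set)
    for u, v in edges:
        succ[u].add(v)
    return {(a, b, c) for a in list(succ) for b in succ[a] for c in succ.get(b, ())}


model = [("a", "b"), ("b", "c"), ("c", "d")]
inc = IncrementalTwoHopMatcher()
for source, target in model:
    inc.add_edge(source, target)
```

In this sketch the memory cost of the incremental variant grows with the number of stored matches, which is the scalability concern raised in the paragraph above.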

Contribution to Hybrid Pattern Matching

To overcome this limitation, we propose a hybrid pattern matching approach which enables the transformation designer to combine local search based (LS) and incremental (INC) pattern matching to adapt to memory constraints. At design time, transformation engineers may select, separately for each graph pattern, whether it should be matched using the LS or the INC strategy. Moreover, based upon runtime monitoring, the execution engine may automatically switch from incremental pattern matching to a local search based technique once a certain memory limit has been reached.

However, selecting the appropriate matching strategy for complex model transformation programs requires a deep understanding of both pattern matching algorithms. Therefore, we examined typical transformation scenarios from the literature [VSV05] [24, 17, 27]. As a result, we identified a list of factors (metrics) [2] that we found to have a significant effect on run-time performance and memory consumption. Based on this analysis, we defined guidelines that help transformation designers decide when a graph pattern should be matched using the INC or the LS algorithm.

The main highlight of our approach, compared to other adaptive pattern matching algorithms [VVF05, JBK10, VDWS12], is the ability to combine conceptually different matching strategies. With each of them optimal for different scenarios, the transformation designer is able to fine-tune the performance of the transformation engine with regard to execution speed as well as size scalability.

4.2.1 Structure

The current part of the thesis is structured as follows.

• Chapter 5 introduces search plan driven pattern matching by defining the concepts of search graphs (in Section 5.1) and search plans (in Section 5.2), followed by the algorithms used for their generation from graph patterns in Section 5.3. Finally, details on how the proposed approach was realized in the Viatra framework are discussed in Section 5.4.

• Chapter 6 gives an overview of how hybrid pattern matching can be introduced into our search graph driven pattern matching concept. In Section 6.1 the basics of RETE based incremental pattern matching are discussed, followed by Section 6.2 defining the extensions to our search graph concept to support hybrid pattern matching. The benchmark evaluation results on our AntWorld case study using the introduced hybrid approach are discussed in Section 6.3. Moreover, key metrics for selection strategies between LS and INC approaches are presented in Section 6.4.

• Finally, related work and conclusions are discussed in Chapter 7.

Chapter 5 - Search Plan Driven Graph Pattern Matching

In the last decades, many local search based pattern matching algorithms ([Zün96, Dör95, GHS09] or [18]) have been developed. Within the graph transformation community, all of these are variants of the well-known Ullmann [Ull76] algorithm. Usually, the implementations of LS based graph pattern matching algorithms differ from Ullmann's work in how the candidates are computed and how the extended matches are checked.

The generation of search plans [Zün96] is a frequently used and efficient strategy to drive the execution of these local search based pattern matching algorithms. Informally, a search plan defines an order of pattern nodes, in which they are bound to objects of the instance model during pattern matching. In addition to simply specifying the binding order of pattern nodes, it often also includes an order of elementary operations that have to be executed to drive pattern matching.
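As a rough illustration of what "binding pattern nodes in a given order" means, the following Python sketch runs a naive backtracking matcher that follows a search plan. It is a simplification under stated assumptions, not the algorithm of any cited tool: the function `match_with_plan` is a hypothetical name, the pattern consists of untyped directed edges only, and matching is assumed to be injective.

```python
def match_with_plan(pattern_edges, plan, model_nodes, model_edges):
    """Enumerate all matches, binding pattern nodes in the order given by `plan`.

    pattern_edges: list of (source_var, target_var) pairs
    plan:          list of pattern node names, the binding order (search plan)
    """
    model_edges = set(model_edges)
    results = []

    def consistent(binding):
        # Check every pattern edge whose two end nodes are already bound.
        return all((binding[s], binding[t]) in model_edges
                   for s, t in pattern_edges
                   if s in binding and t in binding)

    def extend(index, binding):
        if index == len(plan):
            results.append(dict(binding))
            return
        var = plan[index]
        for candidate in model_nodes:
            if candidate in binding.values():   # injectivity (assumed)
                continue
            binding[var] = candidate
            if consistent(binding):
                extend(index + 1, binding)
            del binding[var]                    # backtrack

    extend(0, {})
    return results


pattern = [("x", "y"), ("y", "z")]
nodes = ["a", "b", "c", "d"]
edges = [("a", "b"), ("b", "c"), ("c", "d")]
forward = match_with_plan(pattern, ["x", "y", "z"], nodes, edges)
backward = match_with_plan(pattern, ["z", "y", "x"], nodes, edges)
```

Both plans yield the same set of matches; they differ only in how many candidate bindings are explored along the way, which is why the choice of search plan matters for performance.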

In the current thesis, we understand search plan driven pattern matching as a two phase process [Var08]:

• At compilation time, a search graph is constructed for each graph pattern body. It is a joint representation of pattern graph elements and operation constraints that drives the pattern matching process.

• At runtime, when patterns are invoked with bound input parameters, the search graph is adorned, which denotes whether a given pattern node or edge is initially bound or free. Based on this adorned search graph, a search plan is generated that is one possible traversal of the search graph. Finally, this traversal defines a totally ordered list of search operations, which represent the atomic units of pattern matching.
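The runtime phase can be sketched in Python as follows. This is only an illustrative greedy traversal, not the search plan generation algorithm of the thesis (which is the subject of Section 5.3): the function `generate_search_plan`, the operation kinds "check" and "extend", and the convention that type constants such as Path are treated as initially bound are all assumptions made for this example.

```python
def generate_search_plan(constraints, bound):
    """Turn an adorned search graph into a totally ordered list of operations.

    constraints: list of (name, (variable, ...)) pairs, the search graph edges
    bound:       set of variables that are initially bound (the adornment)
    """
    bound = set(bound)
    pending = list(constraints)
    plan = []
    while pending:
        # Greedy choice (an assumption): take the constraint with the fewest free variables.
        best = min(pending, key=lambda c: sum(v not in bound for v in c[1]))
        pending.remove(best)
        name, args = best
        free = [v for v in args if v not in bound]
        # All arguments bound: a pure check; otherwise the operation binds the free ones.
        plan.append(("check" if not free else "extend", name, args))
        bound.update(free)
    return plan


constraints = [
    ("trg", ("P", "Field2")),
    ("src", ("P", "Field1")),
    ("inst", ("P", "Path")),
]
# Adornment: Field1 is an input parameter, the type constant Path is always bound.
plan_from_field1 = generate_search_plan(constraints, bound={"Field1", "Path"})
# Adornment in which every variable is bound: only checks remain.
plan_all_bound = generate_search_plan(
    constraints, bound={"Field1", "Field2", "P", "Path"})
```

The same search graph produces different operation lists for different adornments, which is why plan generation is deferred to runtime in the two phase process above.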


5.1 Search Graphs

A search graph is a joint representation of pattern body graph elements and operation constraints that drives the pattern matching process. In our interpretation, a search graph is a hypergraph representing a constraint net, where graph nodes reflect variables, and hyperedges express constraints (predicates, similarly to Datalog [AV88]) between the variables. A search graph is directly derived from the pattern body graph as follows:

• Pattern variable: Each element (node or edge) of the pattern graph is mapped to a pattern variable. These elements represent the arguments of the constraints. This uniform representation makes it possible to support any kind of operation constraint between edges and nodes of the pattern body. This may be restricted if the underlying metamodeling language does not support this kind of freedom. For ease of understanding, we will not make any restrictions on operation constraints on edges.

There is a subset of pattern variables called constants. They represent elements from the metamodel on which the graph pattern is defined.

• Operation Constraint: Each constraint on the pattern graph (e.g., connectivity, instanceOf, injectivity etc.) is mapped to an n-ary (usually binary) edge of the search graph. The edge connects elements that are part of the constraint it defines (e.g., a source constraint connects the pattern variables of the source node and the edge). These edges represent operation predicates, which have to be fulfilled during the matching process.

Example 11 The search graph of the first pattern body of the moveAnywhereButHome GT rule (see Figure 3.1(e)) is illustrated in Figure 5.1.

Figure 5.1: Search graph of the first pattern body of the moveAnywhereButHome graph rule's precondition

The search graph contains nine pattern variables; SearcherAnt, Path, hasSearcherAnt and Field represent the type elements (constants) of the pattern graph (denoted by black ovals), while variables Ant, OldHasAnt, P, Field1 and Field2 represent the nodes and edges of the pattern graph itself (denoted by black ovals). The operation predicates directly define the constraints of the pattern graph: src, trg edges define the source and target node of an edge (denoted by green rectangles). For example, Field1 is the source of edge P and its target is Field2. The inst edges represent the direct instance of relations between a pattern variable and its type constant, for example between the P pattern variable and its Path type. The nac1 edge represents the negative application condition constraint with its single input parameter Field2. Finally, the two inj edges define the injectivity check between their input variables, for example between the two edges OldHasAnt and P. Note that, for clarity, the inj constraints have been defined separately for the edges and for the nodes of the search graph.
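The part of Example 11 that is spelled out in the text can be written down as a small hypergraph data structure. The Python sketch below is illustrative only: the classes `Constraint` and `SearchGraph` are hypothetical, and only the constraints explicitly named in the description above are encoded. The remaining edges of Figure 5.1 (for instance the source, target and type of OldHasAnt) are not given in the text and are deliberately left out. The argument order src(edge, node) follows Definition 12.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Constraint:
    kind: str     # e.g. "src", "trg", "inst", "inj", "nac1"
    args: tuple   # pattern variables and constants connected by this hyperedge


@dataclass
class SearchGraph:
    constants: set
    variables: set
    constraints: list = field(default_factory=list)

    def add(self, kind, *args):
        unknown = set(args) - (self.constants | self.variables)
        if unknown:
            raise ValueError(f"unknown search graph nodes: {sorted(unknown)}")
        self.constraints.append(Constraint(kind, args))

    def constraints_on(self, var):
        return [c for c in self.constraints if var in c.args]


sg = SearchGraph(
    constants={"SearcherAnt", "Path", "hasSearcherAnt", "Field"},
    variables={"Ant", "OldHasAnt", "P", "Field1", "Field2"},
)
sg.add("src", "P", "Field1")               # Field1 is the source of edge P
sg.add("trg", "P", "Field2")               # Field2 is the target of edge P
sg.add("inst", "P", "Path")                # P is a direct instance of Path
sg.add("nac1", "Field2")                   # NAC with single input parameter Field2
sg.add("inj", "OldHasAnt", "P")            # injectivity between the two edges
sg.add("inj", "Ant", "Field1", "Field2")   # injectivity between the three nodes
```

Keeping constants and variables in one node set, with constraints as tuples over them, mirrors the uniform treatment of nodes and edges as pattern variables described in this section.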

Definition 12 Given a metamodel $MM$ and a graph pattern $GP$ with a pattern body $PB$, the search graph $SG = (V_{SG}, E_{SG}, b)$ is a hypergraph with nodes $V_{SG}$ and edges $E_{SG}$ and a backward mapping $b : SG \mapsto PB$, which maps nodes and edges of the search graph $SG$ to the pattern body. The structure of the search graph is described by the following rules:

• Pattern variable: Nodes $V_{SG}$ of the search graph can be partitioned into two sets: (i) the constant node set $V_{SG}^{const}$ that contains all elements that are constant in the pattern body and (ii) the pattern variables $V_{SG}^{PV}$. Formally, $V_{SG} = V_{SG}^{const} \cup V_{SG}^{PV}$ and $V_{SG}^{const} \cap V_{SG}^{PV} = \emptyset$. The mapping rules are the following:

– Each type element $type$ of the pattern body $PB$ is mapped to a constant node $x$ in the search graph. Formally, $\forall\, instanceOf(A, x) \in SC_{PB},\ \exists x \in MM,\ \exists type \in V_{SG}^{const} : b(x) = type$.

– Each node $n$ of the pattern body $PB$ is mapped to a pattern variable $var$ in the search graph. Formally, $\forall n \in V_{SC_{PB}},\ \exists var \in V_{SG}^{PV} : b(var) = n$.

– Each edge $e$ of the pattern body $PB$ is mapped to a pattern variable $var$ in the search graph. Formally, $\forall e \in E_{SC_{PB}},\ \exists var \in V_{SG}^{PV} : b(var) = e$.

• Operation Constraint: Edges of the search graph can be partitioned into: (i) source edges $E_{SG}^{src}$, (ii) target edges $E_{SG}^{trg}$, (iii) instance of edges $E_{SG}^{inst}$, (iv) injectivity check edges $E_{SG}^{inj}$, (v) NAC check edges $E_{SG}^{NAC}$ and finally (vi) attribute condition check edges $E_{SG}^{attr}$. Formally, $E_{SG} = E_{SG}^{src} \cup E_{SG}^{trg} \cup E_{SG}^{inst} \cup E_{SG}^{inj} \cup E_{SG}^{NAC} \cup E_{SG}^{attr}$ and $\forall i, j \in \{src, trg, inst, inj, NAC, attr\},\ i \neq j : E_{SG}^{i} \cap E_{SG}^{j} = \emptyset$. The mapping rules for the edges are the following:

1. Simple predicates represent core constraints between two pattern variables.

– Each edge $e$ connecting the node $u$ to node $v$ in the pattern body is mapped to a pair of source $src(x, y)$ and target $trg(x, y)$ edges in the search graph connecting the corresponding pattern variables of $u$, $v$ to the pattern variable of $e$. Formally,

$\forall e \in E_{SC_{PB}},\ src_{SC_{PB}}(e) = u,\ \exists src(pvar_e, pvar_u) \in E_{SG}^{src} : b(src(pvar_e, pvar_u)) = e \wedge b(pvar_e) = e \wedge b(pvar_u) = u \wedge pvar_u, pvar_e \in V_{SG}^{PV}$

$\forall e \in E_{SC_{PB}},\ trg_{SC_{PB}}(e) = v,\ \exists trg(pvar_e, pvar_v) \in E_{SG}^{trg} : b(trg(pvar_e, pvar_v)) = e \wedge b(pvar_e) = e \wedge b(pvar_v) = v \wedge pvar_v, pvar_e \in V_{SG}^{PV}$

– Each instanceOf relationship between the node $u$ and type $type$ is mapped to an instance of edge $inst(pvar_u, pvar_{type})$ in the search graph connecting the corresponding pattern variables of $u$ and $type$. Formally,

$\forall\, instanceOf(u, type) \in SC_{PB},\ \exists inst(pvar_u, pvar_{type}) \in E_{SG}^{inst} : b(inst(pvar_u, pvar_{type})) = instanceOf(u, type) \wedge b(pvar_u) = u \wedge b(pvar_{type}) = type \wedge pvar_u \in V_{SG}^{PV} \wedge pvar_{type} \in V_{SG}^{const}$

2. Complex predicates are defined between an arbitrary number of pattern variables.

– One $inj(pvar_1, \ldots, pvar_n)$ injectivity (hyper)edge each is created between all pattern variables representing nodes ($inj_{node}$) and between all pattern variables representing edges ($inj_{edge}$) of the pattern body. Formally, $\forall n \in V_{SC_{PB}},\ \exists inj_{node} \in E_{SG}^{inj} : b(pvar_n) = n \wedge pvar_n \in inj_{node}$ and $\forall e \in E_{SC_{PB}},\ \exists inj_{edge} \in E_{SG}^{inj} : b(pvar_e) = e \wedge pvar_e \in inj_{edge}$.

– Each negative application condition $NAC_i = (GP_{NAC_i}, p_{NAC_i})$ of the pattern body $PB$ is mapped to a NAC (hyper)edge $nac_i(pvar_1, \ldots, pvar_n)$ in the search graph, where each corresponding pattern variable $pvar_{u_i}$ of a node or edge $u_i$ of the structural constraint that is shared with the negative application condition pattern (as defined by $p_{NAC}$) is part of $nac_i$. Formally, $\forall NAC_i,\ \exists nac_i \in E_{SG}^{NAC} : b(nac_i) = NAC_i$ and $\forall j = 1 \ldots n,\ u_j \in (V_{SC_{PB}} \cup E_{SC_{PB}}),\ \exists p_{NAC}(u_j) : b(pvar_{u_j}) = u_j \wedge u_j \in nac_i$.

– Each attribute constraint $AC_i$ of the pattern body $PB$ is mapped to an attr (hyper)edge $a_i(pvar_1, \ldots, pvar_n)$ in the search graph, where each corresponding pattern variable $pvar_{u_i}$ of node $u_i$ of the attribute constraint is part of $a_i$. Formally, $\forall AC_i,\ \exists a_i \in E_{SG}^{attr} : b(a_i) = AC_i$ and $\forall j = 1 \ldots n,\ u_j \in AC_i : b(pvar_{u_j}) = u_j \wedge u_j \in a_i$.
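The mapping rules of Definition 12 can be read as a construction procedure. The Python sketch below is one possible reading under stated assumptions, not the implementation used in the Viatra framework: the function `build_search_graph` and its input encoding are hypothetical, pattern variables reuse the names of the pattern body elements (so the backward mapping $b$ is the identity on them), and inst edges are also created for pattern edges, following Example 11 rather than the wording of the instanceOf rule, which mentions nodes only.

```python
def build_search_graph(nodes, edges, nacs=(), attrs=()):
    """Derive (constants, variables, constraint edges) from a pattern body.

    nodes: {node_name: type_name}
    edges: {edge_name: (source_node, target_node, type_name)}
    nacs:  iterable of tuples of pattern elements shared with each NAC
    attrs: iterable of tuples of pattern elements used by each attribute constraint
    """
    constants = set(nodes.values()) | {t for _, _, t in edges.values()}
    variables = set(nodes) | set(edges)
    if constants & variables:
        raise ValueError("constant and pattern variable sets must be disjoint")

    # One list per edge class; the classes are disjoint by construction.
    E = {kind: [] for kind in ("src", "trg", "inst", "inj", "NAC", "attr")}

    # Simple predicates: src/trg pair per pattern edge, inst per typed element.
    for e, (u, v, t) in edges.items():
        E["src"].append((e, u))
        E["trg"].append((e, v))
        E["inst"].append((e, t))   # assumption: edges are typed as in Example 11
    for n, t in nodes.items():
        E["inst"].append((n, t))

    # Complex predicates: one inj hyperedge over nodes, one over edges.
    if nodes:
        E["inj"].append(tuple(sorted(nodes)))
    if edges:
        E["inj"].append(tuple(sorted(edges)))
    for shared in nacs:
        E["NAC"].append(tuple(shared))
    for used in attrs:
        E["attr"].append(tuple(used))
    return constants, variables, E


# A fragment of Example 11: edge P of type Path from Field1 to Field2,
# with a NAC whose single shared element is Field2.
constants, variables, E = build_search_graph(
    nodes={"Field1": "Field", "Field2": "Field"},
    edges={"P": ("Field1", "Field2", "Path")},
    nacs=[("Field2",)],
)
```

Storing each edge class in its own list makes the partition of $E_{SG}$ explicit, and the disjointness check on the node sets corresponds to $V_{SG}^{const} \cap V_{SG}^{PV} = \emptyset$.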
