Graph pattern matching on EJB3 platform - Graph transformation with portable EJB QL queries

6.6 Graph transformation with portable EJB QL queries

6.6.2 Graph pattern matching on EJB3 platform

By using search plans of the LHS and embedded NAC patterns, we construct and execute a single SELECT EJB QL query that calculates and retrieves all the matchings of the precondition of a rule.

The general form of the query is as follows:

A pattern node variable is an identification variable declared in the FROM clause of the EJB QL query that represents a pattern node. Identification variables that represent bound and free nodes are called bound node variables and free node variables, respectively.

The SELECT clause of the query contains all pattern node variables as a comma-separated list.

In order to handle the mappings of (initially matched) bound nodes, an identification variable declaration of the form type_bound AS bound is appended to the FROM clause. Additionally, an expression of the form bound.id = obj is added as an initial matching constraint, in which obj denotes the object to which the bound node bound has been mapped by the initial matching.

If a free (pattern node) variable is reached by navigation in the FROM clause of an EJB QL query, then the type of this pattern node variable may be an ancestor of the type prescribed by the pattern node itself. As a result, the pattern node variable may have a larger set of matching objects than is allowed by the type restriction of the pattern node. In order to resolve this situation, an additional identification variable is declared for the same pattern node, and a type checking constraint is defined to narrow the set of matching objects for this pattern node.

Free node variable declarations and type checking constraints are generated during search plan traversal, while search plan edges are processed in increasing order.

• Processing iteration edges. If an iteration edge with a target node trg is being processed, then an expression type_trg AS trg is added to the end of the FROM clause, where type_trg is the type of the pattern node trg.

• Processing to-one navigation edges. If a to-one navigation edge of type assoc connecting node src to trg is being processed, then expressions src.assoc AS trg_sup and type_trg AS trg are appended to the end of the FROM clause, and a subformula trg_sup.id = trg.id is also added as a type checking constraint.

• Processing to-many navigation edges. If a to-many navigation edge of type assoc connecting node src to trg is being processed, then terms IN(src.assoc) AS trg_sup and type_trg AS trg are appended to the FROM clause, and a subformula trg_sup.id = trg.id is also added as a type checking constraint.
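The three edge processing rules above can be sketched in a few lines of Python. This is only an illustrative sketch, not the generator used in the thesis: the PlanEdge record and the process_edges function are hypothetical names, and the sample search plan borrows its types and associations from Listing 6.3.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class PlanEdge:
    """Hypothetical record describing one search plan edge."""
    kind: str        # "iter", "one", or "many"
    trg: str         # target pattern node variable
    trg_type: str    # type prescribed by the target pattern node
    src: str = ""    # source variable (unused for iteration edges)
    assoc: str = ""  # association name (unused for iteration edges)

def process_edges(edges):
    """Return (FROM terms, type checking constraints) for edges in search plan order."""
    from_terms, type_checks = [], []
    for e in edges:
        if e.kind == "iter":
            # Iteration edge: only the typed declaration is needed.
            from_terms.append(f"{e.trg_type} AS {e.trg}")
            continue
        sup = f"{e.trg}_sup"
        if e.kind == "one":
            from_terms.append(f"{e.src}.{e.assoc} AS {sup}")
        elif e.kind == "many":
            from_terms.append(f"IN({e.src}.{e.assoc}) AS {sup}")
        else:
            raise ValueError(f"unknown edge kind: {e.kind}")
        # Second variable with the prescribed type, tied to the first by id.
        from_terms.append(f"{e.trg_type} AS {e.trg}")
        type_checks.append(f"{sup}.id = {e.trg}.id")
    return from_terms, type_checks

# Search plan of the LHS of ClassRule, as read off Listing 6.3.
lhs_plan = [
    PlanEdge("iter", "s", "Schema"),
    PlanEdge("one", "p", "Package", src="s", assoc="ref"),
    PlanEdge("many", "c", "Class", src="p", assoc="eo"),
]
from_terms, type_checks = process_edges(lhs_plan)
print("FROM " + ", ".join(from_terms))
print("WHERE " + " AND ".join(type_checks))
```

For the sample plan, the sketch yields the same FROM terms and type checking constraints as Lines 2–6 of Listing 6.3.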

An edge checking constraint expresses a restriction caused by a pattern edge that has not been processed at all during the traversal of the search plan. For each pair of unnumbered navigation edges connecting nodes src and trg in both directions, we append a subformula src.assoc = trg.id or trg MEMBER OF src.assoc to the WHERE condition with a logical AND operator, if src.assoc represents a to-one or a to-many navigation edge, respectively.
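As a minimal sketch of this rule, the following Python function emits the subformula for one unprocessed pattern edge. The function name and the to_many flag are hypothetical. The two sample edges reuse the association names of the running example purely for illustration; in the running example itself both edges are processed by the search plan, so no such constraints are generated there.

```python
def edge_checking_constraint(src, assoc, trg, to_many):
    """Subformula for a pattern edge src --assoc--> trg skipped by the search plan."""
    if to_many:
        return f"{trg} MEMBER OF {src}.{assoc}"
    return f"{src}.{assoc} = {trg}.id"

# Illustration only: pretend both edges were left unnumbered in the search plan.
constraints = [
    edge_checking_constraint("s", "ref", "p", to_many=False),
    edge_checking_constraint("p", "eo", "c", to_many=True),
]
# The subformulas are joined to the WHERE condition with AND.
where_fragment = " AND ".join(constraints)
print(where_fragment)
```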

Injectivity constraints are defined for such pairs of pattern nodes where one member has a type that conforms to a supertype of the other. For each such pair node_i and node_j, we add a subformula of the form node_i.id <> node_j.id.
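The pairwise check can be sketched as follows. This is a hedged illustration: the SUPERTYPE table is a hypothetical metamodel fragment containing only the inheritance relation stated for the running example (Schema is a subclass of Package), and the sketch assumes that two nodes of the same type also count as a conforming pair.

```python
from itertools import combinations

# Hypothetical metamodel fragment: type -> direct supertype.
SUPERTYPE = {"Schema": "Package"}

def ancestors_or_self(type_name):
    """The type itself followed by all of its supertypes."""
    chain = [type_name]
    while type_name in SUPERTYPE:
        type_name = SUPERTYPE[type_name]
        chain.append(type_name)
    return chain

def injectivity_constraints(node_types):
    """node_types maps each pattern node variable to its type."""
    constraints = []
    for (a, type_a), (b, type_b) in combinations(node_types.items(), 2):
        # Two variables may denote the same object only if their types are related.
        if type_a in ancestors_or_self(type_b) or type_b in ancestors_or_self(type_a):
            constraints.append(f"{a}.id <> {b}.id")
    return constraints

lhs_constraints = injectivity_constraints({"s": "Schema", "p": "Package", "c": "Class"})
print(lhs_constraints)
```

For the LHS of the running example, only the pair s and p is related by inheritance, which corresponds to Line 9 of Listing 6.3.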

NAC constraints express restrictions formulated by NAC patterns that are embedded into the LHS pattern being processed. For each embedded NAC pattern, we add a constraint of the form NOT EXISTS (subquery), where subquery is the EJB QL query that is generated for the embedded NAC pattern. In NAC subqueries, bound node variables do not have to be declared in the FROM clause, as their declarations are already contained in the embedding query of the LHS. The NOT EXISTS constraint evaluates to true if and only if the subquery, which would list the matchings of the NAC pattern, returns no rows.
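The nesting of NAC subqueries can be sketched with a small string-assembling helper. The build_query function is a hypothetical name introduced for illustration, not part of the described tool; the clause contents are taken from Listing 6.3.

```python
def build_query(select_vars, from_terms, constraints, nac_subqueries=()):
    """Assemble one SELECT; each NAC subquery becomes a NOT EXISTS conjunct."""
    where = list(constraints)
    where += [f"NOT EXISTS ({sub})" for sub in nac_subqueries]
    query = f"SELECT {', '.join(select_vars)} FROM {', '.join(from_terms)}"
    if where:
        query += " WHERE " + " AND ".join(where)
    return query

# NAC subquery: c is a bound node variable here, so it is not redeclared in FROM.
nac_query = build_query(
    ["c", "tn"],
    ["c.ref AS tn_sup", "Table AS tn"],
    ["tn_sup.id = tn.id", "c.id <> tn.id"],
)

# Embedding LHS query, which declares c and carries the NAC as a constraint.
lhs_query = build_query(
    ["s", "p", "c"],
    ["Schema AS s", "s.ref AS p_sup", "Package AS p", "IN(p.eo) AS c_sup", "Class AS c"],
    ["p_sup.id = p.id", "c_sup.id = c.id", "s.id <> p.id"],
    [nac_query],
)
print(lhs_query)
```

The design choice worth noting is that the subquery is generated by the same routine as the embedding query; only the set of variables that must be declared differs.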

Example 20. To continue our running example, we present the SELECT statement (Listing 6.3) that is generated for the search plans of the LHS and NAC patterns of ClassRule (as depicted in the left and right parts of Fig. 6.3, respectively). Note that in the case of search plans prepared for EJB QL queries, NAC nodes are always marked with the maximum possible value, as the subqueries of NAC patterns are always evaluated after matching the LHS pattern.

Figure 6.3: Sample search plans of the LHS and the NAC of ClassRule

Lines 1–9 of the query are generated during the traversal of the search plan of the LHS, when its edges are processed in increasing order as shown by the comments at the ends of lines. (Expressions in parentheses denote the search plan edge processing method being used.) Since at least one edge is selected from each pair of navigation edges, no edge checking constraints are needed in the query.

Metamodel class Schema is a subclass of class Package, so schema S cannot be mapped to the same object as package P, as expressed by Lines 8–9. The query for the NAC pattern (Lines 10–16) is processed similarly, with the single exception that C now counts as a bound node, as a mapping for node C has already been found.

 1  SELECT s,p,c
 2  FROM Schema AS s,                        -- 1 (iter)
 3       s.ref AS p_sup, Package AS p,       -- 2 (one)
 4       IN(p.eo) AS c_sup, Class AS c       -- 3 (many)
 5  WHERE -- type checking constraints
 6        p_sup.id = p.id AND c_sup.id = c.id
 7        -- no edge checking constraints
 8        -- injectivity constraints
 9        AND s.id <> p.id
10        -- NAC constraint
11        AND NOT EXISTS (
12          SELECT c,tn
13          FROM c.ref AS tn_sup, Table AS tn  -- 1 (one)
14          WHERE tn_sup.id = tn.id
15            AND c.id <> tn.id
16        )

Listing 6.3: The EJB QL query describing the pattern matching defined by search plans of Fig. 6.3

6.7 Conclusion

In the current chapter, I elaborated a provably correct method for executing graph transformation built on top of a relational database, and assessed the performance of the approach in several measurement settings involving different combinations of databases, parameters and optimization strategies.

• Graph pattern matching in relational databases. I elaborated a provably correct method, which automatically transforms graph patterns to SQL queries whose behaviour corresponds to the pattern matching phase of graph transformation (Sections 6.4.2 and 6.4.3).

• Modification phase of graph transformation on top of relational databases. I elaborated a provably correct method, which generates SQL commands whose behaviour corresponds to the updating phase of graph transformation (Sec. 6.4.4).

• Quantitative performance analysis of the method. By using the object-relational mapping as a benchmark example, I examined the efficiency of this technique in several measurement settings involving different combinations of databases, parameters and optimization strategies (Sec. 6.5).

• Portable queries for graph pattern matching. I proposed a database independent and portable pattern matching approach that uses declarative EJB QL queries to implement graph pattern matching (Sec. 6.6).

These results are reported in [145, 149, 150, 151].

Relevance

The relevance of the presented approach can be summarized by examining the motivation goals laid out in Sec. 6.1.

As the RDBMS based graph transformation approach stores models on the disk, it has the ability to handle larger models compared to pure in-memory GT engines. Performance experiments of Sections 5.5 and 6.5 demonstrate that relational databases provide a feasible candidate as an implementation framework for graph transformation engines with promising results especially for large models.

However, performance is not the only aspect one needs to consider from a practical point of view when implementing model transformations. Our relational database approach automatically provides persistence and transactional services without further programming effort.

Persistence is very important in the case of MDA tools that store their UML models in relational databases, such as AMEOS of Aonix [4]. This tool offers powerful built-in means to capture model-to-code transformations, but model-to-model transformations (including model manipulations) are not supported; our technique could complement it to provide a general solution.

When model transformation is used in an enterprise environment, transformation plugins are typically implemented as session beans whose methods are generally executed in a transaction block, which should provide support for withdrawing the effects of rule execution. As RDBMSs are able to handle transactions by default, our approach can easily be integrated into an enterprise environment to function as a model transformation plugin, in which pattern matching is expressed by (SQL or EJB QL) queries.

A further advantage of the presented technique is that it is also suitable for in-memory RDBMSs like TimesTen [106], SQLite [124], Xcelerix [160], and HSQLDB [65] without modification. In such an approach, performance and model size are expected to be on the same order of magnitude as for any other in-memory solution.
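To illustrate the in-memory case, the sketch below runs a NAC-style query against an in-memory SQLite database through Python's standard sqlite3 module. The table layout (class_node, table_node, ref_edge) is a hypothetical stand-in and does not reproduce the database schema defined in Sec. 6.4; only the shape of the query, a match with a NOT EXISTS subquery, follows the text.

```python
import sqlite3

# ":memory:" asks SQLite for a database that lives entirely in main memory.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE class_node (id INTEGER PRIMARY KEY);
    CREATE TABLE table_node (id INTEGER PRIMARY KEY);
    CREATE TABLE ref_edge (src INTEGER, trg INTEGER);
    INSERT INTO class_node VALUES (1), (2);
    INSERT INTO table_node VALUES (10);
    INSERT INTO ref_edge VALUES (1, 10);
""")

# Classes for which no referenced table exists (the NAC forbids such a table).
rows = conn.execute("""
    SELECT c.id
    FROM class_node AS c
    WHERE NOT EXISTS (
        SELECT 1
        FROM ref_edge AS r JOIN table_node AS t ON t.id = r.trg
        WHERE r.src = c.id
    )
""").fetchall()
print(rows)  # [(2,)]
conn.close()
```

The same query text would run unchanged against a file-backed SQLite database; only the connection string differs.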

The practical applicability of RDBMS based graph transformation has recently been confirmed in [71] by the developers of the MOLA tool [69, 70], who used a similar approach for implementing their Transformation Execution Environment module.

Limitations

The presented approach suffers from certain runtime overhead due to several reasons. The underlying database performs statement parsing, optimization, table (or row) locking, and transaction handling tasks during the execution of each query. Though repeated parsing can be partially avoided by prepared statements, and both optimization and transaction handling can sometimes be guided by the pattern matcher, each remaining task obviously increases the execution time of the pattern matching engine.
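The remark on prepared statements can be illustrated with a parameterized query that is executed repeatedly with different initial matchings. This is a sketch under stated assumptions: it uses Python's sqlite3 module rather than the JDBC setting of the thesis, the tables package_node and eo_edge are hypothetical, and the placeholder plays the role of obj in the initial matching constraint bound.id = obj. Whether the statement is actually parsed only once depends on the driver; sqlite3 keeps a per-connection cache of compiled statements keyed by the SQL text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE package_node (id INTEGER PRIMARY KEY);
    CREATE TABLE eo_edge (src INTEGER, trg INTEGER);
    INSERT INTO package_node VALUES (1), (2);
    INSERT INTO eo_edge VALUES (1, 100), (1, 101), (2, 200);
""")

# One fixed SQL text; only the bound node id changes between executions.
MATCH_SQL = (
    "SELECT e.trg FROM package_node AS p "
    "JOIN eo_edge AS e ON e.src = p.id "
    "WHERE p.id = ? ORDER BY e.trg"
)

def matches_for(bound_id):
    """Matchings of the free node for a given mapping of the bound node."""
    return [row[0] for row in conn.execute(MATCH_SQL, (bound_id,))]

results = {pid: matches_for(pid) for pid in (1, 2)}
print(results)  # {1: [100, 101], 2: [200]}
conn.close()
```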

The application server also introduces performance degrading factors in the case of EJB3-based pattern matching solutions, as the objects in the returned matchings are also cached in main memory in addition to their original location on the disk, and the object-relational mapping service of the application server has to synchronize these copies. Additionally, the application server usually provides transaction handling services whose usage also worsens performance.

As a natural limitation of the EJB QL based approach, it is worth emphasizing the trade-off between portability and run-time performance when database-specific query optimizations are switched on and off.

Chapter 7 Adaptive Graph Transformation

In this chapter, I present a novel approach to implement adaptive and model-sensitive graph pattern matching modules. Based on the statistics of the instance model under transformation, these modules can dynamically select the optimal pattern matching strategy from a set of precompiled strategies that have been generated from model-sensitive search plans by using estimates for their expected performance on typical instance models available at transformation design time. As a result, a fast transformation engine can be obtained, which can dynamically modify its behaviour at run-time in the performance critical pattern matching phase by always selecting the strategy that is expected to be optimal for the instance model under transformation.

7.1 Motivation

As model transformation is becoming an engineering discipline (transware), conceptual and tool support is needed for the entire life-cycle, i.e., the specification, design, execution, validation and maintenance of transformations. However, different phases of transformation design frequently set up conflicting requirements, and it is difficult to find the best compromise. For instance, interpreted MT approaches have a clear advantage during the validation (e.g., by interactive simulation) or the maintenance phase due to their flexibility. On the other hand, the main driver in the execution phase is performance; therefore, a compiled MT approach is advantageous, as shown by survey [154].

The performance of the executable code is optimized at compile time by evaluating and optimizing different search plans [163] for the traversal of the LHS pattern, which typically exploits the multiplicity and type restrictions imposed by the metamodel of the problem domain.

Problem statement

• Lack of adaptivity. While in many cases types and multiplicities provide very powerful heuristics to prune the search space, in practical model transformation problems one has further domain-specific knowledge on the potential structure of instance models of the domain, which is typically not used in these approaches.

• Hard-wired pattern matching strategies. Furthermore, in case of intensive changes during the evolution of models, the characteristic structure of a model may change as well, therefore a pattern matching strategy that is generated a priori at compile time from a search plan might not be flexible and powerful enough.

Objectives

I propose a method for generating model-sensitive search plans for pattern traversal (as an extension to traditional multiplicity and type considerations of existing tools) by estimating the expected performance of search plans on typical instance models that are available at transformation design time. I also elaborate a method for finding low cost search plans with respect to the cost function that estimates the performance of a given strategy.

Furthermore, I propose an adaptive approach for graph pattern matching, where the optimal strategy can be selected from previously generated pattern matching strategies at run-time based on statistical data collected from the instance model under transformation.
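The idea of run-time strategy selection can be sketched as follows. This is a deliberately simplified illustration and not the cost model of this chapter: the cost function (the sum of the estimated numbers of partial matchings after each step), the statistics keys, and the Class.owner association are all assumptions made for the example.

```python
def estimated_cost(plan, stats):
    """Stand-in cost function: sum of estimated partial-matching counts per step."""
    partial, total = 1, 0
    for step in plan:
        partial *= stats[step]  # average branching factor observed for this step
        total += partial
    return total

def select_strategy(strategies, stats):
    """Pick the precompiled strategy with the lowest estimated cost."""
    return min(strategies, key=lambda name: estimated_cost(strategies[name], stats))

# Two precompiled strategies for a Package-Class pattern (hypothetical step labels).
strategies = {
    "from_package": ["iter:Package", "nav:Package.eo"],
    "from_class": ["iter:Class", "nav:Class.owner"],
}

# Statistics collected from two different instance models at run-time.
few_packages = {"iter:Package": 10, "nav:Package.eo": 50,
                "iter:Class": 500, "nav:Class.owner": 1}
few_classes = {"iter:Package": 1000, "nav:Package.eo": 2,
               "iter:Class": 10, "nav:Class.owner": 1}

choice_a = select_strategy(strategies, few_packages)  # costs: 510 vs 1000
choice_b = select_strategy(strategies, few_classes)   # costs: 3000 vs 20
print(choice_a, choice_b)
```

The point of the sketch is that the set of strategies is fixed in advance, while the choice among them is deferred until statistics of the actual instance model are known.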

Finally, I present a technique for preparing adaptive, compiled, stand-alone plugins for representing pattern matching strategies on the EJB 3.0 platform from model-sensitive search plans.

Related work

Sophisticated pattern matching strategies of the most advanced compiled graph transformation approaches are now surveyed.

• FUJABA [99] compiles visual specifications of transformations [44] into executable Java code based on an optimization technique using search graphs with a breadth-first traversal strategy penalizing many-to-many multiplicity constraints. Our approach is different from Fujaba in the use of EJB3 beans instead of pure Java classes and the model-sensitive generation of search plans.

• PROGRES [122] supports both interpreted and compiled execution (generating C code) of programmed graph transformation systems. It uses a sophisticated cost model [163] for defining a priori costs of basic operations (like the enumeration of nodes of a type and navigation along edges) for generating search plans. These costs are not domain-specific in the sense that they are based on assumptions about a typical problem domain on which the tool is intended to be used. Operation graphs of PROGRES, which are similar to search graphs of Sec. 4.2, additionally support the handling of path expressions and attribute conditions. The compiled version of PROGRES generates a search plan at compile-time by a greedy algorithm, which is based on the a priori costs of basic operations.

• The pattern matching engine of compiled GREAT [158] (generating C++ code) uses a breadth-first traversal strategy starting from a set of nodes that are initially matched. This initial binding is referred to as pivoted pattern matching in GREAT terms. This tool uses the Strategy design pattern for the purpose of future extensions and not for supporting different pattern matching strategies as in our approach.

Object-oriented database management systems also use efficient algorithms [126] for query optimization, but their strategy significantly differs as queries are formulated as object algebra expressions, which are later transformed to trees of special object manager operations during the query optimization process.

In the graph transformation community, adaptivity has already been used with a completely different meaning, when adaptive star grammars [33] have been defined for applications, which are used

In document: Advanced Techniques for the Implementation of Model Transformation Sy (pages 110-117). PhD Thesis, Gergely Varró, MSc in Technical Informatics. Supervisors: Dr. Katalin Friedl, PhD, associate professor; Dr. Dániel Varró, PhD, assistant professor; Prof. Dr. rer. nat. Andy Schürr, professor. Budapest, April 2008.