Case Study - Advisors:TihamérLevendovszkyPh.D.LászlóLengyelPh.D. MárkAsztalos by AutomatedOffli

In the following, we present a sample application domain that serves as an illustrative case study throughout this thesis. We will refer to the elements of this domain to illustrate the formal definitions in the next chapters and thus make our formalism easier to understand.

Our sample domain represents relational database schemata. The metamodel of the domain is presented in Figure 2.3. Instance models of the metamodel describe database instances with several tables and columns. A column has several attributes, including its name (CName), its type (Type), and others that specify, for example, whether the column is a primary key (IsPrimaryKey) or a foreign key (IsForeignKey). When a column is a foreign key, it must reference a primary key of another table using an edge of type ForeignKeyReference. Besides models of database schemata, we have a model transformation that performs normalization steps on such models. Database normal forms manage the functional dependencies inside a table [Haan and Koppelaars, 2007]. Functional dependencies can be modeled by a special type of edge called FunctionalDependencyRelation.

[Figure 2.3: Database repository metamodel. Node types and attributes: Database (DBName: string, CollationName: string); Table (TName: string); Column (CName: string, Type: string, Length: int, Constraints: list of strings, IsIndexed: bool, IsNullAllowed: bool, IsPrimaryKey: bool, IsForeignKey: bool, Refactor: string, ProcessingState: string). Edge types: TableContainment, TableColumnContainment, FunctionalDependencyRelation, and ForeignKeyReference, each with the multiplicities 0..1 and 0..* at its two ends.]
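To make the metamodel more tangible, the following minimal Python sketch encodes its node types, attributes, and edge types. The class layout, the default attribute values, the edge directions, and the helper names (Model, add_node, add_edge, EDGE_TYPES) are assumptions made for illustration only; they are not the representation used by VMTS.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Database:
    DBName: str
    CollationName: str = ""


@dataclass
class Table:
    TName: str


@dataclass
class Column:
    CName: str
    Type: str = "string"          # default values are assumptions of this sketch
    Length: int = 0
    Constraints: List[str] = field(default_factory=list)
    IsIndexed: bool = False
    IsNullAllowed: bool = True
    IsPrimaryKey: bool = False
    IsForeignKey: bool = False
    Refactor: str = ""
    ProcessingState: str = ""


# Edge type -> (source node type, target node type).
# The edge directions are an assumption of this sketch.
EDGE_TYPES = {
    "TableContainment": (Database, Table),
    "TableColumnContainment": (Table, Column),
    "FunctionalDependencyRelation": (Column, Column),
    "ForeignKeyReference": (Column, Column),
}


@dataclass
class Model:
    nodes: List[object] = field(default_factory=list)
    edges: List[Tuple[str, object, object]] = field(default_factory=list)

    def add_node(self, node):
        self.nodes.append(node)
        return node

    def add_edge(self, edge_type, source, target):
        """Add an edge only if its end points conform to the metamodel."""
        source_type, target_type = EDGE_TYPES[edge_type]
        if not (isinstance(source, source_type) and isinstance(target, target_type)):
            raise TypeError(f"{edge_type} cannot connect these node types")
        self.edges.append((edge_type, source, target))


# A tiny instance model: one table with one primary key column.
model = Model()
members = model.add_node(Table("Members"))
member_id = model.add_node(Column("MemberID", IsPrimaryKey=True))
model.add_edge("TableColumnContainment", members, member_id)
```

In this sketch, an edge whose end points do not match the declared node types is rejected, which mirrors the idea that instance models must conform to the metamodel.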

The model transformation DB2NF performs a normalization step used to make a table satisfy the requirements of the second normal form. This transformation removes from the table the columns that are functionally dependent on a non-key column and moves them into a new table.

Given an instance model, we can add edges of type FunctionalDependencyRelation between the columns. Informally, to make a database schema meet the requirements of the second normal form of relational databases, the model transformation DB2NF performs the modifications discussed below.

(1) If a table contains a column that has at least one outgoing FunctionalDependencyRelation edge, then this column (called the determinant) and the other columns that are functionally dependent on it (called the dependents) should be moved to a new table.

(2) In this new table, the primary key will be the determinant.

(3) In the original table, a new foreign key column should be created that references the primary key of the new table.

When a concrete model is transformed, the FunctionalDependencyRelation edges are processed.

When a new table is created, it is named after the Refactor attribute of the determinant; therefore, the input model must be configured properly before the transformation is executed. Note that the recognition of the functional dependencies is not the task of this transformation; we assume that the dependency relations are present correctly in the input models.

We show a pair of sample input and output models in Figure 2.4. The first model depicts a schema with a single table that stores data about the members of departments. For each member, we store not only the identifier of the corresponding department, but also the name and the address of the department. This design is redundant, because the unique identifier of the department determines its name and address. Therefore, functional dependencies can be defined between these columns. The output model still contains the ’Members’ table; however, the columns related to the departments are moved into a new table named ’Department’, because this is the value of the Refactor attribute of the source of the dependency relations (DepartmentID) in the input model. In Figure 2.4, we do not show all attributes of all elements; only those that are relevant to the transformation are presented. To make these models more understandable, we present the schema for each of them in Figure 2.5 using a more traditional notation.
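The effect of the informal steps (1) to (3) on the sample schema can be sketched at the level of plain table and column names. The function name db2nf_step and the dictionary-based schema encoding are hypothetical and chosen only for this illustration; the actual transformation operates on graph models through rewriting rules, not on dictionaries.

```python
from typing import Dict, List, Tuple

Schema = Dict[str, List[str]]  # table name -> column names


def db2nf_step(
    schema: Schema,
    dependencies: Dict[str, List[str]],  # determinant -> dependents (same table)
    refactor: Dict[str, str],            # determinant -> name of the new table
) -> Tuple[Schema, List[Tuple[str, str, str]]]:
    """Return the normalized schema and the created foreign keys."""
    result = {name: list(columns) for name, columns in schema.items()}
    foreign_keys = []
    for table, columns in schema.items():
        for determinant, dependents in dependencies.items():
            if determinant not in columns:
                continue
            new_table = refactor[determinant]
            # Steps (1) and (2): the new table holds the determinant as its
            # primary key, followed by the dependents.
            result[new_table] = [determinant] + list(dependents)
            # Step (1): the dependents leave the original table.
            result[table] = [c for c in result[table] if c not in dependents]
            # Step (3): the determinant stays behind as a foreign key.
            foreign_keys.append((table, determinant, new_table))
    return result, foreign_keys


input_schema = {
    "Members": [
        "MemberID", "Name", "PhoneNumber",
        "DepartmentID", "DepartmentName", "DepartmentAddress",
    ]
}
new_schema, foreign_keys = db2nf_step(
    input_schema,
    dependencies={"DepartmentID": ["DepartmentName", "DepartmentAddress"]},
    refactor={"DepartmentID": "Department"},
)
print(new_schema)
print(foreign_keys)
```

Under these assumptions, the resulting schema corresponds to the second schema described for Figure 2.5: Members keeps DepartmentID as a foreign key, and Department holds the department columns.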

The control mechanism of the model transformation is presented in Figure 2.6. This is a directed control flow graph where the nodes (except the start and the end nodes) represent rewriting rules.

[Figure 2.4: Sample input and output model of model transformation DB2NF. Input model: the table Members contains the columns MemberID (IsPrimaryKey: true), Name, PhoneNumber, DepartmentID (Type: Numeric, Refactor: Department), DepartmentName, and DepartmentAddress through the TableColumnContainment edges e1 to e6; e7 and e8 are FunctionalDependencyRelation edges. Output model: Members keeps MemberID, Name, PhoneNumber, and DepartmentID (IsForeignKey: true); the new table Department contains DepartmentID (IsPrimaryKey: true, Type: Numeric), DepartmentName, and DepartmentAddress through the TableColumnContainment edges e9 to e11; e12 is a ForeignKeyReference edge.]

[Figure 2.5: Database schemata conforming to the sample models of Figure 2.4. First schema: Members (MemberID (PK), Name, PhoneNumber, DepartmentID, DepartmentName, DepartmentAddress). Second schema: Members (MemberID (PK), Name, PhoneNumber, DepartmentID) and Department (DepartmentID (PK), DepartmentName, DepartmentAddress), connected by a foreign key.]

The transformation consists of two rules (Figure 2.7, Figure 2.8). Informally, the application of a rule is performed as follows: given an input model, the left-hand side of the rule is searched for in the model such that the constraints specified in the rule are satisfied, and this match is replaced by the right-hand side of the rule. Moreover, the attributes are modified according to the attached imperative code, if there is any. The first rule of the transformation creates the new table and the primary key column for each determinant column, while the second rule is responsible for moving the dependent columns. Both rules are applied exhaustively (denoted by a star next to the name of the rule), which means that they are applied repeatedly until they cannot be applied anymore.

[Figure 2.6: Transformation DB2NF: control flow graph. START → Rule1* → Rule2* → END]

[Figure 2.7: Transformation DB2NF: Rule 1. LHS: t [Table], determinant [Column], dependent [Column], and a [FunctionalDependency] edge. RHS: the elements of the LHS together with key [Column], tnew [Table], and a [ForeignKey] edge.]

CONSTRAINT CODE

    determinant.ProcessingState != "processed"

IMPERATIVE CODE

    tnew.TName = determinant.Refactor;
    key.CName = determinant.CName;
    key.Type = determinant.Type;
    key.IsForeignKey = false;
    key.IsPrimaryKey = true;
    determinant.ProcessingState = "processed";
    determinant.IsForeignKey = true;

[Figure 2.8: Transformation DB2NF: Rule 2. LHS: t1 [Table], c1 [Column], c2 [Column], c1ref [Column], t2 [Table], a [ForeignKey] edge, and a [FunctionalDependency] edge. RHS: the same elements without the [FunctionalDependency] edge.]

To summarize the execution of the transformation: given an input model, the first rule is applied exhaustively. This means that an instance of the rule’s left-hand side (LHS) pattern is searched for in the input model and is replaced by the right-hand side (RHS). The attributes of the appropriate elements are updated according to the imperative code. After each successful application of the rule, an instance of the LHS is searched for again. When the rule cannot be applied anymore, the execution moves on to the second rule. Note that it is not true in general that after the exhaustive application of a rule, all instances of the LHS will have been replaced by an instance of the RHS. This is because a new instance of the LHS is searched for after each application of the rule, and there may be overlapping instances of the LHS in the input model.
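The exhaustive application of the two rules can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the VMTS implementation: the names Node, Graph, apply_exhaustively, and the rule functions are hypothetical, and the end points of the containment, [ForeignKey], and [FunctionalDependency] edges are inferred from the textual description of the rules.

```python
class Node:
    def __init__(self, kind, **attrs):
        self.kind = kind      # "Table" or "Column"
        self.attrs = attrs


class Graph:
    def __init__(self):
        self.nodes = []
        self.edges = []       # (edge type, source, target)

    def add(self, kind, **attrs):
        node = Node(kind, **attrs)
        self.nodes.append(node)
        return node

    def link(self, edge_type, source, target):
        self.edges.append((edge_type, source, target))

    def table_of(self, column):
        for edge_type, source, target in self.edges:
            if edge_type == "Containment" and target is column:
                return source
        return None

    def columns(self, table_name):
        return [t.attrs["CName"] for et, s, t in self.edges
                if et == "Containment" and s.attrs.get("TName") == table_name]


def rule1_matches(g):
    # LHS: a table with a determinant and a dependent, plus the constraint code.
    for edge_type, det, dep in g.edges:
        if edge_type != "FunctionalDependency":
            continue
        if det.attrs.get("ProcessingState") == "processed":
            continue
        table = g.table_of(det)
        if table is not None and table is g.table_of(dep):
            yield table, det, dep


def rule1_apply(g, match):
    _, det, _ = match
    tnew = g.add("Table", TName=det.attrs["Refactor"])
    key = g.add("Column", CName=det.attrs["CName"], Type=det.attrs.get("Type"),
                IsForeignKey=False, IsPrimaryKey=True)
    g.link("Containment", tnew, key)
    g.link("ForeignKey", det, key)
    det.attrs["ProcessingState"] = "processed"
    det.attrs["IsForeignKey"] = True


def rule2_matches(g):
    for edge_type, c1, c2 in g.edges:
        if edge_type != "FunctionalDependency":
            continue
        t1 = g.table_of(c1)
        if t1 is None or g.table_of(c2) is not t1:
            continue
        for other_type, source, c1ref in g.edges:
            if other_type == "ForeignKey" and source is c1:
                yield t1, c1, c2, c1ref, g.table_of(c1ref)


def rule2_apply(g, match):
    t1, c1, c2, _, t2 = match
    g.edges.remove(("Containment", t1, c2))
    g.edges.remove(("FunctionalDependency", c1, c2))
    g.link("Containment", t2, c2)   # the dependent moves to the new table


def apply_exhaustively(g, matches, apply):
    """Search for a fresh match after every application; return the count."""
    applied = 0
    while True:
        match = next(matches(g), None)
        if match is None:
            return applied
        apply(g, match)
        applied += 1


g = Graph()
members = g.add("Table", TName="Members")
cols = {}
for name in ["MemberID", "Name", "PhoneNumber",
             "DepartmentID", "DepartmentName", "DepartmentAddress"]:
    cols[name] = g.add("Column", CName=name)
    g.link("Containment", members, cols[name])
cols["DepartmentID"].attrs.update(Type="Numeric", Refactor="Department")
g.link("FunctionalDependency", cols["DepartmentID"], cols["DepartmentName"])
g.link("FunctionalDependency", cols["DepartmentID"], cols["DepartmentAddress"])

rule1_count = apply_exhaustively(g, rule1_matches, rule1_apply)  # Rule1*
rule2_count = apply_exhaustively(g, rule2_matches, rule2_apply)  # Rule2*
```

In this sketch, the sample model contains two overlapping instances of the LHS of Rule 1 (one per dependent), yet the rule is applied only once, because the first application sets ProcessingState to "processed" and thereby invalidates the second match. Rule 2 is then applied once for each dependent column.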

The model transformation DB2NF is intended to be an illustrative case study; hence, we omitted the validation of the input model from the model transformation. This means, for example, that we assume that there is no functional dependency from a column of one table to a column of another table, and that no column depends on one of its own dependents. We will show that if the input model does not contain such semantically incorrect constructions, then the transformation works correctly.
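Although the validation is not part of DB2NF, the two assumptions above could be checked by a separate preprocessing step. The following sketch is one possible way to do so; the function validate, its input encoding, and the error messages are hypothetical. It also treats transitive cycles as invalid, which is a slightly stronger condition than the one stated in the text.

```python
def validate(table_of, dependencies):
    """Check the input assumptions of the transformation.

    table_of: column name -> name of the containing table
    dependencies: list of (determinant, dependent) column name pairs
    Returns a list of human-readable error messages (empty if valid).
    """
    errors = []

    # Assumption 1: no functional dependency crosses table boundaries.
    for determinant, dependent in dependencies:
        if table_of[determinant] != table_of[dependent]:
            errors.append(f"cross-table dependency: {determinant} -> {dependent}")

    # Assumption 2: no column depends (directly or transitively) on itself.
    successors = {}
    for determinant, dependent in dependencies:
        successors.setdefault(determinant, set()).add(dependent)

    def reachable(start):
        seen, stack = set(), list(successors.get(start, ()))
        while stack:
            column = stack.pop()
            if column not in seen:
                seen.add(column)
                stack.extend(successors.get(column, ()))
        return seen

    for column in successors:
        if column in reachable(column):
            errors.append(f"cyclic dependency through {column}")

    return errors


sample_tables = {
    "DepartmentID": "Members",
    "DepartmentName": "Members",
    "DepartmentAddress": "Members",
}
sample_dependencies = [
    ("DepartmentID", "DepartmentName"),
    ("DepartmentID", "DepartmentAddress"),
]
print(validate(sample_tables, sample_dependencies))
```

For the sample model of the case study, the returned list is empty, since both dependencies stay inside the Members table and form no cycle.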

3 Related Work

Our research focuses on the offline analysis and verification of functional properties of graph rewriting-based model transformations. Therefore, in this chapter, we summarize work related to the following topics:

• We present different approaches for the specification of model transformations in Section 3.1. The formal background and the way rewriting rules and the control mechanism are defined determine which methods can be used for their formal analysis. We briefly outline approaches for non-graph rewriting-based model processing programs; then we concentrate on different methods to define the rewriting rules and the control mechanism of graph rewriting-based transformations.

• Verification of graph rewriting-based model transformations is an algorithmically hard problem. Therefore, in industrial or academic tools, only certain parts of the verification workflow are supported by automated methods. Section 3.2 presents several model transformation tools. We selected tools in which the analysis of the developed transformations is supported by the system, or for which case studies with manual analysis have been published.

• Section 3.3 is devoted to work related to different verification and analysis methods of graph rewriting-based model transformations. Although this thesis focuses on functional properties, we also present an overview of the formal analysis of non-functional properties such as termination and confluence, because certain concepts applied in these methods are not restricted to the analysis of these non-functional properties. The second part of Section 3.3 presents general analysis methods and case studies of manually analyzed model transformations.

• In Section 3.4, work related to the application of reusable patterns that can be used in model transformation languages is discussed.

3.1 Specification of Model Transformations

Although the application of graph rewriting-based transformations has several advantageous properties, this method is not the only possible way to implement model processing programs. The tools referenced in this subsection will be presented in more detail in Section 3.2. [Czarnecki and Helsen, 2006] differentiate (among others) the following categories of model-to-model transformations:

• In direct manipulation approaches, a framework provides an internal model representation along with a direct Application Programming Interface (API) for its manipulation. Examples of this method are the Java Metadata Interface [Oracle, 2010], Microsoft’s Visual Modeling Tool [Microsoft, 2010], and the MetaEdit+ modeling tool [Tolvanen, 2006, Tolvanen and Rossi, 2003]. Our modeling framework, VMTS, also provides a direct API for each domain; however, graph rewriting-based model transformations (that are built on this API) are also available.

• Operational approaches are similar to direct manipulation, but they offer dedicated support for model transformations (e.g. by using metamodels, the Object Constraint Language [Object Management Group, 2006, Warmer and Kleppe, 2003], and imperative constructs). For example, this concept has been applied in Kermeta, which is a metaprogramming environment based on object-oriented DSLs [Triskell team, 2012].

• In graph transformation approaches, the model transformation framework is based on the theory of algebraic graph rewriting. However, each model transformation framework realizes the theoretical background of graph rewriting in a different way. This is because attribute handling is usually performed by imperative code written in a textual programming language, such as an imperative dialect of OCL, or in a general purpose language such as C++ [Lédeczi et al., 2001] or C# [VMTS, 2010]. In this chapter, we will present several of these frameworks.

Together with the category of hybrid frameworks, the previous categories cover the most commonly used approaches for implementing model processing programs.

Before continuing with the presentation of graph rewriting-based model transformations, we also mention certain publications that provide a general introduction to the role of model transformations in model-driven software development. [Mens et al., 2005] provides a better understanding of what model transformations are. In this work, the following questions are discussed in detail: (i) What needs to be transformed into what? (ii) What are the important characteristics of a model transformation? (iii) What are the success criteria for a transformation language or tool? (iv) Which mechanisms can be used for model transformations? Moreover, [Syriani and Vangheluwe, 2009] and [Klar et al., 2007] provide a general introduction to the application of model transformations and a comparison of a wide range of different approaches from a practical point of view.

In the following, we continue with the presentation of graph rewriting-based model transformation methods. As mentioned in Chapter 1, algebraic graph rewriting-based model transformations have several desirable properties that make formal analysis possible. A more detailed discussion about why the application of graph rewriting is advantageous can be found in [Blostein et al., 1996]. We also mention [Mens et al., 2006], where the topics of [Mens et al., 2005] are discussed with respect to graph rewriting-based methods. Several important works ([Syriani and Vangheluwe, 2009, Czarnecki and Helsen, 2006, Taentzer et al., 2005]) provide a comparison and a general overview of graph rewriting-based model transformations. Finally, [Giese and Henkler, 2006] discusses industrial aspects of model transformations. In the rest of this section, we discuss three important aspects of graph rewriting-based model transformations: (i) the formal background of graph rewriting, (ii) different methods for the formal description of rewriting rules and their applications, and (iii) different approaches for the definition of the control mechanism of model transformations.

Formal Background of Graph Rewriting

There are slightly different approaches to formalizing models during the analysis of model transformations. These approaches are all based on different categories of graphs. Typed graphs, typed graphs with inheritance [de Lara et al., 2007], and attributed and labeled graphs [Ehrig et al., 2006] form different categories in which graph transformations can be described by means of category theory.

Our formalism is based on typed graphs with inheritance as presented in [de Lara et al., 2007], but in our framework, attributes are handled in an abstract form: each attribute of a model element has a name and a value, but its type is not taken into account. A typical method to formalize attributes of models is the use of typed attributed graphs. In this case, we need to specify the data type algebra.

Our abstraction of the attributes makes it possible to work with attributes and constraints without explicitly stating anything about the type of the attributes. It also makes it possible to integrate general types of constraint logic systems.


Rewriting Rules

In this thesis, we assume that the application of the rewriting rules is based on the Double Pushout (DPO) approach (detailed in Chapter 2). Single Pushout (SPO) [Rozenberg, 1997b] is an alternative way to describe how the input is processed. The main difference between them is that DPO forbids the creation of dangling edges during the application of a rule. Dangling edges can arise when a node is deleted from a graph and this node has edges that are not matched, and therefore not deleted, by the rule. In this case, the rewriting engine must either delete these edges automatically or forbid the application of the rule along the concrete match; otherwise, an invalid graph structure would be created. The first solution (SPO) may be effective in practice, because the rule can always be applied when a match has been found. However, the transformation will not be invertible. In model transformation frameworks, DPO is more common; however, the transformation framework of the tool GrGen.NET [Geiß et al., 2006], for example, is based on the SPO approach.
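The different treatment of dangling edges can be illustrated with a small sketch. It models only the deletion of a single node and only the dangling condition; other parts of the DPO gluing condition, such as the identification condition, are left out. The function delete_node and the set-based graph encoding are hypothetical names and choices made for this illustration.

```python
def delete_node(nodes, edges, node, matched_edges, approach):
    """Delete `node` from the graph (nodes, edges).

    nodes: set of node names
    edges: set of (source, target) pairs
    matched_edges: the edges that the rule matches and deletes explicitly
    approach: "DPO" or "SPO"
    Returns the new (nodes, edges), or None if the rule is not applicable.
    """
    incident = {edge for edge in edges if node in edge}
    dangling = incident - matched_edges  # edges the rule does not know about

    if approach == "DPO":
        if dangling:
            return None  # dangling condition violated: application forbidden
        return nodes - {node}, edges - matched_edges
    if approach == "SPO":
        # Dangling edges are removed implicitly together with the node.
        return nodes - {node}, edges - matched_edges - dangling
    raise ValueError(f"unknown approach: {approach}")


nodes = {"a", "b", "c"}
edges = {("a", "b"), ("b", "c")}

# The rule deletes node b but matches only the edge (a, b).
dpo_result = delete_node(nodes, edges, "b", {("a", "b")}, "DPO")
spo_result = delete_node(nodes, edges, "b", {("a", "b")}, "SPO")
print(dpo_result)
print(spo_result)
```

In the SPO case of this sketch, the edge (b, c) disappears although the rule never mentioned it, which is why the step cannot be undone by simply applying the inverse rule.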

Control Mechanism

There are many approaches for the definition of the control mechanism (or scheduling) of model transformations. The role of the control mechanism is to define the execution order of the different rewriting rules. In [Blostein et al., 1996], the different approaches are grouped into the following categories:

• The application of the rules can be unordered, i.e. the rule to be applied is selected non-deterministically. This method is followed in graph grammars, where the goal of a transformation is to generate every possible state that can be reached from a concrete input graph. The transitions between the states are described by means of rewriting rules. The model transformation tools ATOM³ [de Lara et al., 2004] and AGG [Taentzer, 2004] apply such methods.

• For the developer, the easiest way is to define the execution order of the rules by an explicit control mechanism. The most typical method is to use a directed control flow graph (similar to UML activity diagrams), where the nodes are the rewriting rules and the execution follows the directed edges. Such methods are implemented in VMTS and in the GReAT [Agrawal et al., 2006] tool.

• In several frameworks, hybrid solutions are used. For example, the rewriting rules to be applied are selected non-deterministically from certain sets called layers, but the application order of the layers is explicitly specified. Similar behavior can be achieved by assigning priority values to the rules. Priority-based graph grammars can also be used in the ATOM³ framework.

• We also mention event-driven ordering of the rewriting rules. In this case, the transformation is executed interactively. The framework replies to events by the execution of rewriting rules. This method makes model animation and simulation possible ([Guerra and de Lara, 2007, Syriani and Vangheluwe, 2008]).
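The first three scheduling categories can be contrasted with a toy example in which the "model" is a string and each rule rewrites one character. This is only a sketch under simplifying assumptions; the class Rule and the functions run_unordered, run_control_flow, run_layered, and run_prioritized are hypothetical names and do not correspond to the API of any of the tools mentioned above.

```python
import random


class Rule:
    def __init__(self, name, source, target, priority=0):
        self.name = name
        self.source = source
        self.target = target
        self.priority = priority

    def applicable(self, state):
        return self.source in state

    def apply(self, state):
        # Rewrite the first occurrence only (one match per application).
        return state.replace(self.source, self.target, 1)


def run_unordered(state, rules, rng):
    """Pick any applicable rule non-deterministically until none applies."""
    trace = []
    while True:
        candidates = [rule for rule in rules if rule.applicable(state)]
        if not candidates:
            return state, trace
        rule = rng.choice(candidates)
        state = rule.apply(state)
        trace.append(rule.name)


def run_control_flow(state, sequence):
    """Follow an explicit order; each rule is applied exhaustively."""
    trace = []
    for rule in sequence:
        while rule.applicable(state):
            state = rule.apply(state)
            trace.append(rule.name)
    return state, trace


def run_layered(state, layers, rng):
    """Explicit order of layers, non-deterministic choice inside a layer."""
    trace = []
    for layer in layers:
        state, layer_trace = run_unordered(state, layer, rng)
        trace += layer_trace
    return state, trace


def run_prioritized(state, rules):
    """Always apply the applicable rule with the highest priority."""
    trace = []
    while True:
        candidates = [rule for rule in rules if rule.applicable(state)]
        if not candidates:
            return state, trace
        rule = max(candidates, key=lambda r: r.priority)
        state = rule.apply(state)
        trace.append(rule.name)


a_to_b = Rule("a->b", "a", "b", priority=2)
b_to_c = Rule("b->c", "b", "c", priority=1)
rng = random.Random(0)

unordered = run_unordered("aab", [a_to_b, b_to_c], rng)
control_flow = run_control_flow("aab", [a_to_b, b_to_c])
layered = run_layered("aab", [[a_to_b], [b_to_c]], rng)
prioritized = run_prioritized("aab", [a_to_b, b_to_c])
```

In this toy setting, every strategy reaches the same final state, and only the order of rule applications differs; in general, different scheduling strategies may lead to different results.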
