
In this section, I present the data structures needed for (i) maintaining, (ii) efficiently storing, (iii) invalidating, and (iv) notifying partial matchings, and for (v) accelerating the retrieval of complete matchings.

The algorithms of the incremental pattern matching engine that operate on these data structures are discussed later in Sec. 8.4.

Fig. 8.3 shows class diagrams depicting the different aspects of the data structures used by the incremental pattern matching engine.

8.3.1 Matching snapshots and snapshot trees

The concept of snapshots was introduced in Sec. 4.1.1, where pattern matching algorithms were discussed in general, to denote matchings for subgraphs (i.e., partial matchings for patterns). Since the incremental algorithm is search plan driven, the interpretation of snapshots (denoted by numbered circles in Fig. 8.4) is now restricted to matchings for subpatterns, which constitute a special subset of subgraphs determined by the currently active search plan. In all other respects, this new definition fully complies with the original concept.

[Figure 8.3: Data structures of the incremental pattern matching engine. Panel (a) Matchings: class diagram over Object, PatternNode, PatternNodeDerivative, Matching, Mapping, and Snapshot, with association ends value, mappings, patternNode, matching, parent, children, spNode, snapshot, invalidatedBy, and invalidates. Panel (b) Event processing: class diagram over Label, NotificationKey, Object, InsertKey, and DeleteKey, with isSrc, end, trg, src, and label.]

8.3. DATA STRUCTURES FOR INCREMENTAL PATTERN MATCHING 131

Since a snapshot is a logical representation of a (partial) matching, there is a one-to-one correspondence between the two terms, and consequently, they could be used interchangeably.

However, in the following, the term snapshot is used specifically for the data structure appearing in the incremental algorithm, while the word (partial) matching denotes the actual (partial) morphism between the pattern graph and the model.

To support incremental behaviour, a snapshot tree is maintained for each pattern graph. It consists of snapshots organized into a tree structure along parent-child edges (depicted by dashed arcs in Fig. 8.4). The root of the tree (i.e., the single node on the first level) denotes the initial matching, in which pattern nodes are bound only outside the pattern matching algorithm, and not by the algorithm itself. Snapshots denoting matchings for a given subpattern can always be found on the same level of the tree (marked by light grey areas in Fig. 8.4). The mapping of subpatterns to tree levels is determined by the search plan fixed for the pattern graph. A tree node on the (k+1)th level (i.e., at distance k from the root) represents a matching for the kth subpattern specified by the search plan. Each leaf represents a maximal partial matching for the pattern. If the pattern has n nodes to be matched, then each leaf on the (n+1)th (i.e., deepest possible) level represents a (complete) matching.

Example 31 The sample models of Figures 8.4(c), 8.4(e), and 8.4(g) and the corresponding data structure contents are presented in Figures 8.4(d), 8.4(f), and 8.4(h), respectively. Figures 8.4(d), 8.4(f), and 8.4(h) show snapshot trees in their top-right corner, binding arrays at the bottom, and notification arrays in their left part.

Fig. 8.4(f) contains two snapshot trees representing the partial matchings of the LHS and the NAC pattern, respectively. Snapshots 1 and 2 denote empty matchings. Snapshot 3 is located on the second level of the tree defined for the LHS pattern; thus, it is a matching for the first subpattern $LHS_1$, which contains a single mapping that assigns object c1 to pattern node C. Snapshot 3 is a child of snapshot 1, as the matching represented by the latter can be extended by the mapping of pattern node C.

In the context of Fig. 8.4(d), snapshot 3 is a maximal partial matching, as it cannot be extended further due to the lack of EO edges leading out of class c1. On the other hand, snapshot 3 is not a maximal partial matching in Fig. 8.4(f), as it can be extended, e.g., by mappings P to p and S to s to get the matching represented by snapshot 5. The latter is a (complete) matching for the LHS pattern, as snapshot 5 is located on the lowest tree level $LHS_3$.
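The tree organization described above can be sketched in Python as follows. This is a minimal illustration and not the engine's implementation: the `Snapshot` class, its `extend` method, and the `is_complete` helper are hypothetical names, and the chain 1 → 3 → 4 → 5 is an assumed reading of the LHS tree of Fig. 8.4(f) for a pattern with n = 3 nodes to be matched.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Snapshot:
    """Hypothetical tree node standing for one (partial) matching."""
    ident: int
    parent: Optional["Snapshot"] = field(default=None, repr=False)
    children: List["Snapshot"] = field(default_factory=list, repr=False)

    def extend(self, ident: int) -> "Snapshot":
        """Create a child snapshot, i.e., a matching for the next subpattern."""
        child = Snapshot(ident, parent=self)
        self.children.append(child)
        return child

    @property
    def level(self) -> int:
        """The root is on level 1; a node at distance k is on level k+1."""
        return 1 if self.parent is None else self.parent.level + 1

    @property
    def is_leaf(self) -> bool:
        return not self.children


def is_complete(snapshot: Snapshot, n: int) -> bool:
    """A leaf on level n+1 represents a complete matching of n nodes."""
    return snapshot.is_leaf and snapshot.level == n + 1


# Assumed shape of the LHS tree in Fig. 8.4(f): 1 -> 3 -> 4 -> 5.
s1 = Snapshot(1)
s3 = s1.extend(3)
s4 = s3.extend(4)
s5 = s4.extend(5)
```

With n = 3, `is_complete(s5, 3)` holds because snapshot 5 is a leaf on the fourth level, whereas snapshot 3 is neither a leaf nor on the deepest level.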

Definition 61 The snapshot universe $S_G$ denotes all possible partial matchings for a pattern graph $G$.

Definition 62 Given a search plan $SP$ defined for the adorned search graph $ASG$ of a pattern graph $G$ and a model $M$, a snapshot $s_{G_k}$ of pattern graph $G$ is a logical representation of a matching $m_{G_k}$ for the kth subpattern $G_k$ of pattern graph $G$ in model $M$. Formally, $s_{G_k} \in S_G$.

In the following, superscript $m$ is used for identifying the matching represented by a snapshot.

Definition 63 Given a search plan $SP$ defined for the adorned search graph $ASG$ of a pattern graph $G$ and a model $M$, a snapshot tree $ST_G = \langle ST_G, r_G, p_G \rangle$ is a data structure described by a triple consisting of the following parts.

• Tree nodes $ST_G$ are snapshots of the pattern graph $G$. Formally, $ST_G \subseteq S_G$.

• The root node $r_G$ is a tree node, which represents the matching $m_{G_0}$ for the 0th subpattern $G_0$ of pattern graph $G$ in model $M$. Formally, $\exists r_G \in ST_G : r_G^m = m_{G_0}$.

• The parent function $p : ST_G \to ST_G$ defines the parent node of each snapshot.

132 CHAPTER 8. INCREMENTAL GRAPH TRANSFORMATION

[Figure 8.4: Sample models and the corresponding data structures. Recoverable panel titles: (b) Notational guide for data structures; (d) Data structure contents for Model 1; (f) Data structure contents for Model 2; (h) Data structure contents for Model 3.]


The following restrictions ensure the well-formedness of the snapshot tree.

• Each tree node $s$ except for the root $r_G$ has a parent. Formally, $\forall s \in ST_G \setminus \{r_G\}, \exists t \in ST_G : p(s) = t$, and $\nexists s \in ST_G : p(r_G) = s$.

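• If tree node $s_{G_k}$ is not the root $r_G$, and it represents a matching $m_{G_k}$ for the kth subpattern $G_k$ of pattern graph $G$ in model $M$, then its parent $s_{G_{k-1}}$ must represent a matching $m_{G_{k-1}}$ for the (k–1)th subpattern $G_{k-1}$ of pattern graph $G$ in model $M$. Formally,

$$\forall k \in \mathbb{N} : 1 \le k \le \lvert V_{SG}^{F} \rvert \implies \left( \forall s_{G_k} \in ST_G \setminus \{r_G\}, \forall m_{G_k} : s_{G_k}^m = m_{G_k} \implies \exists s_{G_{k-1}} \in ST_G, \exists m_{G_{k-1}} : s_{G_{k-1}}^m = m_{G_{k-1}} \wedge p(s_{G_k}) = s_{G_{k-1}} \right).$$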
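The two well-formedness restrictions can be checked mechanically. The sketch below is only an illustration under stated assumptions: the function name `is_well_formed` and the dictionary-based encoding (a `level` map giving the subpattern index k of each snapshot, and a `parent` map for the parent function) are hypothetical and not taken from the engine.

```python
from typing import Dict, Hashable


def is_well_formed(root: Hashable,
                   level: Dict[Hashable, int],
                   parent: Dict[Hashable, Hashable]) -> bool:
    """Check the two restrictions on a snapshot tree.

    level[s]  -- index k of the subpattern matched by snapshot s (root: 0)
    parent[s] -- parent snapshot of s; the root must have no entry
    """
    # The root matches the 0th subpattern and has no parent.
    if level.get(root) != 0 or root in parent:
        return False
    for s, k in level.items():
        if s == root:
            continue
        p = parent.get(s)
        # Every non-root node needs a parent matching subpattern k-1.
        if p is None or level.get(p) != k - 1:
            return False
    return True


# Assumed encoding of the chain 1 -> 3 -> 4 -> 5.
levels = {1: 0, 3: 1, 4: 2, 5: 3}
good_parent = {3: 1, 4: 3, 5: 4}
bad_parent = {3: 1, 4: 1, 5: 4}  # snapshot 4 skips a level
```

Here `is_well_formed(1, levels, good_parent)` succeeds, while `bad_parent` is rejected because the parent of snapshot 4 would represent a matching for the 0th instead of the 1st subpattern.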

8.3.2 Binding arrays

In implementations, matchings are physically stored as one-dimensional binding arrays, which are indexed by the pattern nodes. Each entry of a binding array stores one pattern node–object pair of the corresponding matching. When one matching is an ancestor of another, the ancestor contains a subset of the mappings of the descendant, so the two can share a single binding array to reduce memory consumption. Consequently, for each pattern graph $G$ with $n$ variables, a binding array match[n] of size n is used. In Fig. 8.4, binding arrays are connected to snapshots by solid black lines.

Example 32 Since the LHS pattern has 3 nodes, snapshots of the LHS snapshot tree refer to binding arrays having 3 entries, as shown, e.g., in the lower part of Fig. 8.4(f). Each column of the binding array of the LHS snapshot tree represents a mapping, which shows the object (in the lower row) to which the pattern node (in the upper row) has been mapped. Note that the array that contains mappings C to c1, P to p, and S to s can be shared by snapshots 1, 3, 4, and 5, as they only consist of mappings of the first 0, 1, 2, and 3 free nodes, respectively.
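The sharing described in Example 32 can be illustrated with a short Python sketch. It assumes that each snapshot only needs to remember how many leading entries of the shared array belong to it; the `BindingArray` class and its `prefix` method are hypothetical names introduced for this illustration.

```python
from typing import Dict, List, Optional


class BindingArray:
    """One array shared by a snapshot and its ancestors (illustrative)."""

    def __init__(self, pattern_nodes: List[str]) -> None:
        # Pattern nodes are assumed to be listed in search plan order.
        self.pattern_nodes = list(pattern_nodes)
        self.objects: List[Optional[str]] = [None] * len(pattern_nodes)

    def bind(self, index: int, obj: str) -> None:
        """Store the object to which the pattern node at `index` is mapped."""
        self.objects[index] = obj

    def prefix(self, bound: int) -> Dict[str, Optional[str]]:
        """Return the mappings of the first `bound` pattern nodes."""
        return dict(zip(self.pattern_nodes[:bound], self.objects[:bound]))


# The array of Example 32: C -> c1, P -> p, S -> s.
match = BindingArray(["C", "P", "S"])
for i, obj in enumerate(["c1", "p", "s"]):
    match.bind(i, obj)

# Snapshots 1, 3, 4, and 5 share `match` and differ only in prefix length.
bound_nodes = {1: 0, 3: 1, 4: 2, 5: 3}
views = {sid: match.prefix(k) for sid, k in bound_nodes.items()}
```

Only one array of size 3 is allocated for the four snapshots; each snapshot's matching is recovered as a prefix of that array.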

8.3.3 Invalidation edges

Invalidation edges (denoted by thick (red) arcs) represent the invalidation of partial matchings of an LHS caused by (complete) matchings of a NAC.

Example 33 The red invalidation edge of Fig. 8.4(h) connecting snapshot 7 to snapshot 3 means that snapshot 7 represents a (complete) matching for the NAC pattern, which invalidates the partial matching of snapshot 3, as both map the shared node C to the same object c1. As long as snapshot 3 is invalidated (as shown by the incoming invalidation edge), it cannot be part of a (complete) matching for the LHS pattern, a fact marked by the empty subtree rooted at snapshot 3.

Definition 64 Given a model $M$ and snapshot universes $S_{LHS}$ and $S_{NAC}$ defined for pattern graphs $LHS$ and $NAC$, respectively, by also using search plans defined for the corresponding adorned search graphs, a snapshot $s_{NAC}$ invalidates snapshot $s_{LHS_k}$ (denoted here by $s_{NAC} \rightsquigarrow s_{LHS_k}$), if all shared nodes of $NAC$ and $LHS$ are mapped by matchings $s_{NAC}^m$ and $s_{LHS_k}^m$ defined for the $NAC$ and the kth subpattern $LHS_k$ of pattern graph $LHS$, respectively, and each shared node is mapped to the same object in model $M$ by both matchings. Formally, $\forall s_{LHS_k} \in S_{LHS}, \forall s_{NAC} \in S_{NAC}$:

$$s_{NAC} \rightsquigarrow s_{LHS_k} \iff V_{NAC}^{sh} \neq \emptyset \wedge \forall x \in V_{NAC}^{sh}, \exists c \in V_M : s_{LHS_k}^m(x) = c \wedge s_{NAC}^m(x) = c.$$
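The invalidation condition can be sketched as a predicate over two matchings. The code below is a hedged illustration, not the engine's code: matchings are assumed to be plain dictionaries from pattern node names to object names, and the NAC node `X` with object `x1` is a hypothetical addition needed only to make the NAC matching non-trivial.

```python
from typing import Dict, Set


def invalidates(nac_matching: Dict[str, str],
                lhs_matching: Dict[str, str],
                shared_nodes: Set[str]) -> bool:
    """True if the set of shared nodes is non-empty and every shared node
    is mapped to the same object by both matchings."""
    if not shared_nodes:
        return False
    return all(
        x in nac_matching
        and x in lhs_matching
        and nac_matching[x] == lhs_matching[x]
        for x in shared_nodes
    )


shared = {"C"}
snapshot7 = {"C": "c1", "X": "x1"}  # NAC matching; "X"/"x1" are hypothetical
snapshot3 = {"C": "c1"}             # LHS partial matching of Example 33
snapshot1 = {}                      # empty LHS matching at the root
```

Snapshot 7 invalidates snapshot 3 because both map C to c1, but it does not invalidate snapshot 1, which does not map the shared node at all.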


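Definition 65 Given a model $M$ and snapshot trees $ST_{LHS}$ and $ST_{NAC}$ defined for pattern graphs $LHS$ and $NAC$, respectively, by also using search plans defined for the corresponding adorned search graphs, a snapshot $s_{NAC}$ invalidates a subtree rooted at snapshot $s_{LHS_k}$ (denoted here by $s_{NAC} \rightsquigarrow \langle s_{LHS_k} \rangle$), if snapshot $s_{LHS_k}$ is invalidated by snapshot $s_{NAC}$, but its parent $p(s_{LHS_k})$ is not invalidated by the same snapshot $s_{NAC}$. Formally, $\forall s_{LHS_k} \in ST_{LHS}, \forall s_{NAC} \in ST_{NAC} : \left( s_{NAC} \rightsquigarrow \langle s_{LHS_k} \rangle \iff s_{NAC} \rightsquigarrow s_{LHS_k} \wedge \neg \left( s_{NAC} \rightsquigarrow p(s_{LHS_k}) \right) \right)$.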

Definition 66 Given a model $M$ and snapshot trees $ST_{LHS}$ and $\{ST_{NAC_i}\}$ defined for pattern graphs $LHS$ and $\{NAC_i\}$, respectively, by also using search plans defined for the corresponding adorned search graphs, invalidation edges $I$ denote the set of all such snapshot pairs $(s_{NAC_i}, s_{LHS_k})$, where the first snapshot $s_{NAC_i}$ invalidates the subtree rooted at snapshot $s_{LHS_k}$. Formally, $I \subseteq \bigcup_i ST_{NAC_i} \times ST_{LHS}$, and $\forall i, \forall s_{NAC_i} \in ST_{NAC_i}, \forall s_{LHS_k} \in ST_{LHS} : (s_{NAC_i}, s_{LHS_k}) \in I \iff s_{NAC_i} \rightsquigarrow \langle s_{LHS_k} \rangle$.
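The set of invalidation edges can be computed directly from the definitions: an edge is recorded only at the topmost invalidated snapshot of each subtree. The following Python sketch rests on several assumptions that are not part of the original text: matchings are dictionaries, the parent function is a dictionary without an entry for the root, the root's missing parent counts as not invalidated, and the LHS snapshots 4 and 5 are kept as hypothetical descendants of snapshot 3 purely to show that they receive no edge of their own.

```python
from typing import Dict, Hashable, Set, Tuple

Matching = Dict[str, str]


def invalidates(nac: Matching, lhs: Matching, shared: Set[str]) -> bool:
    """Shared node set is non-empty and both matchings agree on it."""
    return bool(shared) and all(
        x in nac and x in lhs and nac[x] == lhs[x] for x in shared
    )


def invalidation_edges(nac_matchings: Dict[Hashable, Matching],
                       lhs_matchings: Dict[Hashable, Matching],
                       lhs_parent: Dict[Hashable, Hashable],
                       shared: Set[str]) -> Set[Tuple[Hashable, Hashable]]:
    """Collect pairs (NAC snapshot, LHS snapshot) where the LHS snapshot is
    invalidated but its parent is not."""
    edges = set()
    for n_id, n in nac_matchings.items():
        for l_id, l in lhs_matchings.items():
            if not invalidates(n, l, shared):
                continue
            p = lhs_parent.get(l_id)  # None for the root
            if p is None or not invalidates(n, lhs_matchings[p], shared):
                edges.add((n_id, l_id))
    return edges


lhs = {1: {}, 3: {"C": "c1"}, 4: {"C": "c1", "P": "p"},
       5: {"C": "c1", "P": "p", "S": "s"}}
parents = {3: 1, 4: 3, 5: 4}
nac = {7: {"C": "c1", "X": "x1"}}  # "X"/"x1" are hypothetical
edges = invalidation_edges(nac, lhs, parents, {"C"})
```

The result contains the single pair (7, 3), which corresponds to the invalidation edge of Example 33; snapshots 4 and 5 are covered implicitly because they lie in the subtree rooted at snapshot 3.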

8.3.4 Notification arrays

Since the transformation engine sends notifications on model changes, notification related data structures (shown in Fig. 8.3(b)) are also needed. The incremental pattern matching engine has a single insert notification array and a single delete notification array consisting of notification entries.

• An entry in the insert notification array is a pair consisting of an insert key and a list of snapshots to be notified. An insert key denotes a trigger condition for initiating incremental algorithms when objects or links are inserted into the model.

• An entry in the delete notification array is a pair consisting of a delete key and a list of snapshots to be notified. A delete key denotes a trigger condition for initiating incremental algorithms when objects or links are removed from the model.

The exact role of insert and delete notification arrays is presented later by Algorithm 8.1.

Example 34 Sample notification arrays are presented, e.g., in the left part of Fig. 8.4(d). The INSERT notification array has 4 entries, of which the first is triggered by the insert key [∗, type, Class] and refers to snapshot 1. This entry means that snapshot 1 has to be notified when an object that conforms to class Class is inserted into the model. Similarly, the first entry in the DELETE notification array means that snapshot 3 must be notified if object c1, which conforms to class Class, is deleted.
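The two notification arrays can be pictured as maps from keys to lists of snapshots. The sketch below is only an assumption-laden illustration: the `NotificationArrays` class and its methods are hypothetical, keys are encoded as tuples, and the tuple used for the delete key of object c1 is an assumed encoding, since the exact form of delete keys is not given in this section.

```python
from typing import Dict, Hashable, List


class NotificationArrays:
    """Insert and delete notification arrays as key -> snapshot lists."""

    def __init__(self) -> None:
        self.insert: Dict[Hashable, List[int]] = {}
        self.delete: Dict[Hashable, List[int]] = {}

    def register_insert(self, key: Hashable, snapshot: int) -> None:
        self.insert.setdefault(key, []).append(snapshot)

    def register_delete(self, key: Hashable, snapshot: int) -> None:
        self.delete.setdefault(key, []).append(snapshot)

    def on_insert(self, key: Hashable) -> List[int]:
        """Snapshots to notify when a change matching `key` is inserted."""
        return list(self.insert.get(key, []))

    def on_delete(self, key: Hashable) -> List[int]:
        """Snapshots to notify when a change matching `key` is deleted."""
        return list(self.delete.get(key, []))


arrays = NotificationArrays()
# First INSERT entry of Example 34: key [*, type, Class] -> snapshot 1.
arrays.register_insert(("*", "type", "Class"), 1)
# First DELETE entry of Example 34; the tuple encoding is an assumption.
arrays.register_delete(("c1", "type", "Class"), 3)
```

Looking up a key that has no entry returns an empty list, so a model change that no snapshot is waiting for triggers no incremental work in this sketch.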

Definition 67 Given a metamodel $MM$ and a model $M$, insert keys $K^I$ denote trigger conditions for initiating incremental algorithms when objects or links are inserted into model $M$. Insert keys can be partitioned into three types (namely, $K_1^I$, $K_2^I$, and $K_3^I$). Formally, $K^I = K_1^I \cup K_2^I \cup K_3^I$, and $K_1^I \cap K_2^I \cap K_3^I = \emptyset$.

• An insert key $\langle * \xrightarrow{type} C \rangle$ of type $K_1^I$ is triggered if an object that conforms to class $C$ is inserted into model $M$. Formally, $K_1^I \subseteq (\{*\} \times \{type\} \times V_{MM})$.

• An insert key $\langle * \xrightarrow{A} b \rangle$ of type $K_2^I$ is triggered if a link of type $A$ leading into object $b$ is inserted into model $M$. Formally, $K_2^I \subseteq (\{*\} \times Assoc \times V_M)$.

• An insert key $\langle a \xrightarrow{A} * \rangle$ of type $K_3^I$ is triggered if a link of type $A$ leading out of object $a$ is inserted into model $M$. Formally, $K_3^I \subseteq (V_M \times Assoc \times \{*\})$.
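The three insert key types can be made concrete by listing the keys that a single model change may trigger. This is a minimal sketch under stated assumptions: keys are encoded as triples of strings, the function names are hypothetical, inheritance between classes is ignored, and the sample link of type EO from c1 to p is an illustrative guess at a model element rather than a fact taken from the figures.

```python
from typing import List, Tuple

Key = Tuple[str, str, str]
STAR = "*"
TYPE = "type"


def keys_on_object_insert(cls: str) -> List[Key]:
    """Key of the first type: an object conforming to `cls` is inserted."""
    return [(STAR, TYPE, cls)]


def keys_on_link_insert(src: str, assoc: str, trg: str) -> List[Key]:
    """Keys of the second and third type for a new link src -assoc-> trg:
    one for the link leading into `trg`, one for it leading out of `src`."""
    return [(STAR, assoc, trg), (src, assoc, STAR)]


object_keys = keys_on_object_insert("Class")
link_keys = keys_on_link_insert("c1", "EO", "p")
```

A single inserted link can thus trigger two different keys, one registered by snapshots waiting at its target and one by snapshots waiting at its source.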