Jahrestagung, Berlin, Springer, Informatik-Fachberichte 50, 1981

A CONCEPTUAL FOUNDATION FOR VIEW INTEGRATION

C. BATINI, M. LENZERINI

2. integration can even be performed during logical design

Synthesis algorithms in [3],[4] are examples of such an approach.

Most of the methodologies perform integration at the conceptual level. This approach can be considered as a tradeoff between two different requirements:

a. as in software design, in data base design too the cost of errors increases dramatically during the life cycle of the application. This aspect should justify, when possible, an integration "in the head of the designer".

b. on the other hand, owing to the great complexity of the integration process, it seems better to perform this activity only when formal, unambiguous representations have been produced.

3.3.2. Design strategies

As we pointed out in [2], in data base design we can use the terms "top-down" and "bottom-up" to characterize the different strategies proposed in the literature for conceptual design.

For instance, the refinement of an entity into a more complex structure that inherits its links in a conceptual schema can be considered a top-down activity, while the integration of two schemata (or of a new entity into a schema) is a bottom-up activity.

In principle, the designer of a conceptual schema should be allowed to intermix top-down and bottom-up activities. As a consequence (while methodologies usually propose two distinct and clearly specified activities for view modelling and schema integration), in general the integration step should be allowed at any level of refinement, so that the design can be carried on intermixing view modelling and schema integration.

Most of the existing methodologies for view integration do not address this problem: they assume that the view modelling process has been concluded, so that the schemata to be integrated are specified at the final level of refinement.

Some basic concepts regarding this aspect can be found in [17], where the proposed data model is based on abstraction mechanisms, and general guidelines are provided for integrating user views possibly specified at different levels of abstraction.

3.3.3. Order of integration between schemata

This aspect involves two related problems: giving a general strategy for the entire integration process in order to produce a global schema from several conceptual schemata, and providing criteria for the choice of the order of aggregation of such schemata.

With regard to the first point, the concept of integration tree can be introduced: let us call C1, ..., Cn the schemata to be integrated (user schemata in the following) and CS the global conceptual schema.

The procedurality of the integration process can be represented by means of a tree according to the following rules:

- the root represents the global schema CS

- the leaves represent the user schemata C1, ..., Cn

- the intermediate nodes represent partially integrated schemata

- for each node, its children represent schemata from which it has been derived by means of an integration step.
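
The rules above can be sketched as a small data structure. The following Python fragment is only an illustration under our own assumptions: the names SchemaNode, integrate and leaves are hypothetical and are not taken from any of the cited methodologies, and an integration step is reduced to recording which schemata were combined.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SchemaNode:
    """A node of an integration tree (hypothetical representation)."""
    name: str
    # Schemata from which this node was derived by one integration step.
    children: List["SchemaNode"] = field(default_factory=list)

    def is_leaf(self) -> bool:
        # Leaves stand for user schemata: they are not derived from others.
        return not self.children

    def leaves(self) -> List[str]:
        # Names of the user schemata that contribute to this node.
        if self.is_leaf():
            return [self.name]
        names: List[str] = []
        for child in self.children:
            names.extend(child.leaves())
        return names


def integrate(name: str, *parts: SchemaNode) -> SchemaNode:
    """Record one integration step; the actual merging is not modelled."""
    return SchemaNode(name, list(parts))


# Three user schemata integrated in two binary steps.
c1, c2, c3 = SchemaNode("C1"), SchemaNode("C2"), SchemaNode("C3")
partial = integrate("C1+C2", c1, c2)   # intermediate node
cs = integrate("CS", partial, c3)      # root: the global schema
```

Here cs.leaves() returns the user schemata in the order in which they enter the process, and each intermediate node keeps track of the integration step that produced it.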

Stating the structure of the integration tree corresponds to providing the general strategy for accomplishing the integration process. Most of the methodologies, for example, agree in adopting a binary tree, because the complexity of the integration step increases with the number of schemata to be integrated.

The proposals in [3],[20] can be considered exceptions to this rule: n-ary integration steps are allowed in their approaches, in which, however, the types of conflicts and situations taken into account in the analysis are quite limited.

With respect to the balancing of the integration tree, two alternative choices have been proposed: respectively a completely balanced [17] and a completely unbalanced binary tree [7].

In [17] it is argued that the balancing of the integration tree minimizes the number of comparisons between concepts of the schemata that are performed at intermediate steps in the integration process. In the approach of [7] the integration of schemata with higher relevance is anticipated so as to obtain a better convergence and stability in the construction of the partially integrated schema.
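
The two shapes can be contrasted with a minimal Python sketch. It is an illustration under our own assumptions, not the procedure of [17] or [7]: trees are nested tuples, the function names are hypothetical, and the input list is assumed to be already sorted by decreasing relevance. The sketch shows only the shape of the trees; it does not model the number of comparisons between concepts.

```python
def unbalanced_tree(schemata):
    """Completely unbalanced binary tree: one schema is added per step."""
    tree = schemata[0]
    for schema in schemata[1:]:
        # The partially integrated schema so far is merged with the next one.
        tree = (tree, schema)
    return tree


def balanced_tree(schemata):
    """Balanced binary tree: the list is split in halves recursively."""
    if len(schemata) == 1:
        return schemata[0]
    mid = (len(schemata) + 1) // 2
    return (balanced_tree(schemata[:mid]), balanced_tree(schemata[mid:]))


def depth(tree):
    """Number of integration steps on the longest path from root to leaf."""
    if isinstance(tree, tuple):
        return 1 + max(depth(subtree) for subtree in tree)
    return 0


user_schemata = ["C1", "C2", "C3", "C4"]   # assumed sorted by relevance
ladder = unbalanced_tree(user_schemata)    # ((("C1", "C2"), "C3"), "C4")
balanced = balanced_tree(user_schemata)    # (("C1", "C2"), ("C3", "C4"))
```

Both trees contain three binary integration steps for four user schemata; they differ in depth (3 against 2) and in how early the most relevant schemata meet.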

3.3.4. Order of integration between concepts to be merged

This aspect and the next one are meaningful when a procedurality has been chosen for the integration of schemata, and two or more schemata are to be integrated into a new one.

At this stage, in order to discipline the explosion of possible activities, at least two different strategies can be chosen.

A first class of strategies proceeds by imposing an order on the classification structures allowed in the model, and integrating the corresponding "layers" of the schemata in that order.

A possible criterion for the choice of the order is to anticipate the most critical choices as much as possible, so as to achieve quickly a first convergence of the design.

We show, for example (see fig. 2), the metaschema of an Entity Relationship Model [9] enriched with subset and generalization abstractions for entities (Sub and Gen relationships) and min and max cardinalities. We assume the metaschema to be self-explanatory, except for the symbol called underlying attribute, i.e. the attribute of the entity at the upper level of the generalization whose values correspond to the names of the entities at the lower level.

[Fig. 2: metaschema of the enriched Entity Relationship Model]

Since in such a model the entity concept is the most significant one, if this procedurality and this model are assumed it is useful to anticipate entity analysis. In several methodologies [10],[15] this criterion is widely applied.
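
A layered integration plan of this kind can be sketched as follows. The fragment is a hypothetical illustration: only the choice of analysing entities first comes from the discussion above, while the remaining order of the layers, the name layered_plan and the representation of a schema as a mapping from concept names to classification structures are our own assumptions.

```python
# Assumed layer order: only "entity first" is taken from the text.
LAYER_ORDER = ["entity", "relationship", "attribute"]


def layered_plan(schema_1, schema_2):
    """List, layer by layer, the concepts of two schemata to be compared.

    Each schema is a dict mapping a concept name to the classification
    structure used to model it (hypothetical representation).
    """
    plan = []
    for kind in LAYER_ORDER:
        from_1 = sorted(n for n, k in schema_1.items() if k == kind)
        from_2 = sorted(n for n, k in schema_2.items() if k == kind)
        plan.append((kind, from_1, from_2))
    return plan


view_1 = {"Employee": "entity", "WorksIn": "relationship", "salary": "attribute"}
view_2 = {"Dept": "entity", "name": "attribute"}
plan = layered_plan(view_1, view_2)
```

The first element of plan is the entity layer, so the entities of both views are analysed before any relationship or attribute is considered.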

3.3.5. Order of integration between modelling categories

A second order that could be chosen, with the goal of finding layers of the schemata to be subsequently integrated, is based on modelling categories, i.e.:

a. names

b. classification structures

c. integrity constraints.

The idea here is that the naming activity is the most primitive one when a Universe of Discourse is conceptualized in a schema. As a consequence, when two or more schemata are integrated, first of all the names of concepts are unified independently of the classification structures and integrity constraints chosen for modelling them. Subsequently, the classification structures of the concepts with the same name are analyzed, attempting to unify them according to transformations that preserve equivalence. Finally, integrity constraints are analyzed in order to check their compatibility. Such an approach is chosen in [7].
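
The three steps can be sketched in Python as follows. This is a deliberately simplified illustration and not the method of [7]: the function names, the synonym table and the representation of a concept as a dict with "structure" and "constraints" entries are hypothetical, equivalence-preserving transformations are not modelled (differing structures are only reported), and compatibility of constraints is reduced to equality.

```python
def unify_names(schema, synonyms):
    """Step a: rename concepts according to an assumed synonym table."""
    return {synonyms.get(name, name): concept for name, concept in schema.items()}


def integrate_by_category(schema_1, schema_2, synonyms):
    """Analyse two schemata by names, then structures, then constraints."""
    s1 = unify_names(schema_1, synonyms)
    s2 = unify_names(schema_2, synonyms)
    # Concepts that carry the same name after unification.
    shared = sorted(set(s1) & set(s2))
    # Step b: same name, different classification structure.
    by_structure = [n for n in shared
                    if s1[n]["structure"] != s2[n]["structure"]]
    # Step c: constraints are checked only where the structures agree.
    by_constraints = [n for n in shared
                      if n not in by_structure
                      and s1[n]["constraints"] != s2[n]["constraints"]]
    return shared, by_structure, by_constraints


view_1 = {"Employee": {"structure": "entity", "constraints": {"key: code"}},
          "Dept": {"structure": "entity", "constraints": set()}}
view_2 = {"Worker": {"structure": "entity", "constraints": {"key: name"}},
          "Dept": {"structure": "attribute", "constraints": set()}}
shared, by_structure, by_constraints = integrate_by_category(
    view_1, view_2, {"Worker": "Employee"})
```

In this example Worker is first renamed to Employee; Dept is then reported at the structure step, because it is an entity in one view and an attribute in the other, and Employee at the constraint step.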

In: Tanulmányok 150/1983, "SDM", Kecskemét, Hungary, May 22-27, 1983, Hungarian Academy of Sciences, Computer and Automation Institute, pp. 133-138.