Combining Description Logics and Object Oriented Models in an Information Integration Framework

Gergely Lukácsy and Péter Szeredi

Budapest University of Technology and Economics
Department of Computer Science and Information Theory
1117 Budapest, Magyar tudósok körútja 2., Hungary
Phone: +36 1 463-2585 Fax: +36 1 463-3157
{lukacsy,szeredi}@cs.bme.hu

Keywords: description logic, information integration, logic programming

Abstract. We present an information integration system called SINTAGMA which supports the semantic integration of heterogeneous information sources using a meta-data driven approach. The main idea of SINTAGMA is to build a so called Model Warehouse, containing several layers of integrated models connected by mappings. At the bottom of this hierarchy there are the models representing the actual information sources. Higher level models represent virtual databases which can be queried, as the mappings provide a precise description of how to populate these virtual sources using the concrete ones.

The implementation of SINTAGMA uses constraints and logic programming; for example, complex queries are translated into Prolog goals. This paper focuses on a recent development in SINTAGMA allowing the information expert to use Description Logic (DL) based ontologies in the development of high abstraction level conceptual models. Querying these models is performed using the Closed World Assumption, as we argue that traditional Open World DL reasoning is less appropriate in the context of database oriented information integration environments.

1 Introduction

This paper presents the Description Logic modelling capabilities of the SINTAGMA Enterprise Information Integration system.

SINTAGMA is based on the SILK tool-set, developed within the EU FP5 project SILK (System Integration via Logic & Knowledge) [3]. SILK is a Prolog based, data centered, monolithic information integration system supporting semi-automatic integration on relational and semi-structured sources.

The SINTAGMA system extends the original framework in several directions. As opposed to the monolithic SILK structure, SINTAGMA is built from loosely coupled distributed components. The functionality has become richer as, among others, the system now deals with Web Services as information sources. [...] integration expert to use Description Logic models in the integration process.

This paper is a revised and extended version of the paper presented at the ALPSWS'07 workshop in Porto [22]. It is structured as follows. Section 2 introduces description logic and logic programming. In Section 3 we give a general introduction to the SINTAGMA system, describing the main components, the SILan modelling language, and the query execution mechanism. In Section 4 we discuss the description logic extension of SILan: we introduce the syntactic constructs and the modelling methodology. Section 5 describes the execution mechanism used when querying Description Logic models. Section 6 presents a fairly complex example, demonstrating the tools and techniques we have discussed so far. In Section 7 we examine related work. Finally, we conclude with a summary of our results.

The examples we use in the upcoming discussions are part of the integration scenario described in detail in Section 6. This scenario represents a world where we attempt to integrate various information sources about writers, painters and their work (i.e. books, paintings, etc.) and present this information in the form of abstract views.

2 Background

Below we give a brief introduction to Description Logic and logic programming, as these technologies form the basis of our work.

2.1 Description Logic

Description Logics (DL) [17] is a family of simple logic languages used for knowledge representation. DLs are used for describing the various kinds of knowledge of a selected field. The terminological system of a description logic knowledge base consists of concepts, which represent sets of objects, and roles, describing binary relations between concepts. Objects are the instances occurring in the modelled application field, and thus are also called instances or individuals.

A description logic knowledge base consists of two disjoint parts: the TBox and the ABox. The TBox (terminology box), in its simplest form, contains terminology axioms of the form C ⊑ D (concept C is subsumed by D). The ABox (assertion box) stores knowledge about the individuals in the world: a concept assertion of the form C(i) denotes that i is an instance of concept C, while a role assertion R(i, j) means that the objects i and j are related through role R.

Concepts and roles may either be atomic (referred to by a concept name or a role name) or composite. A composite concept is built from atomic concepts using constructors. The expressiveness of a DL language depends on the constructors allowed for building composite concepts or roles. Obviously there is a trade-off between expressiveness and inference complexity.

We use the language ALCN(D) in this paper. ALCN(D) concept expressions (often simply referred to as concepts) are built from role names, concept names (atomic concepts) and the top and bottom concepts (⊤ and ⊥) using the following constructors: intersection (C ⊓ D), union (C ⊔ D), negation (¬C), value restriction (∀R.C), existential restriction (∃R.C) and number restrictions (≥ n R and ≤ n R). Here, C and D are concept expressions and R is a role name. The two kinds of number restrictions are jointly referred to as (⋈ n R). In ALCN(D) we can also use concrete domains, such as integers or strings, when building concepts. For a detailed introduction to description logics we refer the reader to the first two chapters of [1].

2.2 Logic programming and Prolog

The main idea of Logic Programming is to use mathematical logic as a programming language. The execution of a logic program can be viewed as a reasoning process.

Prolog (Programming in Logic) [26] is the first and so far the most widely used logic programming language. Prolog uses Horn clauses and SLD resolution [25] for reasoning. The basic elements of the Prolog execution process are procedure invocation based on unification, and backtracking [28].

Prolog, and logic programming in general, is successfully used in several areas of computer science. These include natural language processing, planning, different kinds of reasoning systems, and information integration.

The notion of term is a principal concept of the Prolog language. It is either (a) a simple value (number, string) or (b) a variable or (c) a structure with a name and an arbitrary number of arguments. These arguments are Prolog terms themselves. The name and the arity of a term together are referred to as the functor of the term. A Prolog structure with three arguments can be seen below:

    'Work:class:220'(DT, [A, B, C, D, E], _)        (1)

Here the name of the structure is 'Work:class:220'. The first and the third arguments are variables. These are denoted by identifiers starting with a capital letter or an underscore. A single underscore (_) is an anonymous variable, the value of which is of no interest. Multiple occurrences of such anonymous variables are considered different. The second argument of (1) is a structure in a special list notation. A list is actually a recursive structure [Head|Tail], consisting of a Head (its first element) and a Tail, which is a list of the remaining elements. The list in the second argument contains five variables and is given in a simplified notation, i.e. [A,B,C,D,E], in fact, corresponds to [A|[B|[C|[D|[E|[]]]]]]. Here [] represents an empty list (a list with no elements).

A Prolog program consists of a set of clauses of the form Head :- Body, meaning that Head is implied by Body. The Head is a term, while the Body is a term or a comma-separated sequence of terms. Here the comma denotes a conjunction. Clauses whose heads have the same functor are grouped together into predicates. The name of a predicate is the shared structure name of the heads of its clauses.

A Prolog goal (query) has the same form as a clause body. The execution of a goal w.r.t. a Prolog program succeeds if an instance of the goal can be deduced [...] substitutions as results. For example, let us consider the goal shown below.

    'Writer:class:234'(ID), 'Painter:class:236'(ID)

This complex goal consists of two goals, separated by a comma. It succeeds if there is an instantiation of variable ID under which both goals can be deduced from the given program (not shown here). The result of the execution is the enumeration of such IDs. Informally, this query enumerates those people who are writers and painters at the same time.

Further control constructs such as disjunction (Goal1 ; Goal2) and negation (\+ Goal) are also supported by Prolog. The latter is the so called negation by failure, which is not capable of enumerating solutions, but just checks whether the execution of Goal fails. There is a wide range of built-in predicates, including ones for collecting all solutions of a goal (e.g. bagof). For example,

    bagof(ID, ('Writer:class:234'(ID), 'Painter:class:236'(ID)), IDs)

will collect the identifiers of all people who are writers and painters into the list IDs. An important property of bagof is that it can return multiple solutions if not all variables in its second argument appear in the first. For example, consider a predicate edge describing the edges of a directed graph:

    edge(a,b). edge(a,c). edge(c,d). edge(d,a). edge(c,e).

By invoking the goal bagof(End, edge(Start, End), EndPoints) we collect the endpoints of the edges. This goal will produce three answers, one for each possible value of variable Start:

    Start = a, EndPoints = [b,c]
    Start = c, EndPoints = [d,e]
    Start = d, EndPoints = [a]
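The grouping behaviour of bagof can also be mimicked outside Prolog. The following Python sketch is only an illustration under our own naming (the function bagof_endpoints is hypothetical and not part of Prolog or SINTAGMA): it groups the End values of the edge facts by the free variable Start and lists one answer per Start value, in the order of those values.

```python
from collections import defaultdict

# The edge/2 facts from the text, written as (Start, End) pairs.
EDGES = [("a", "b"), ("a", "c"), ("c", "d"), ("d", "a"), ("c", "e")]


def bagof_endpoints(edges):
    """Group End values by the free variable Start, in the spirit of
    bagof(End, edge(Start, End), EndPoints)."""
    groups = defaultdict(list)
    for start, end in edges:
        groups[start].append(end)  # keep the solution order inside a group
    # One answer per binding of the free variable, ordered by that binding.
    return [(start, groups[start]) for start in sorted(groups)]


ANSWERS = bagof_endpoints(EDGES)
for start, end_points in ANSWERS:
    print(f"Start = {start}, EndPoints = {end_points}")
```

Unlike Prolog, the sketch computes all answers at once rather than enumerating them on backtracking.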

More about the Prolog language can be read in the ISO standard for Prolog [26] and in textbooks, such as [28,10].

3 SINTAGMA System Architecture

The overall architecture of the SINTAGMA system can be seen in Figure 1. The main idea of the system is to collect and manage meta-information on the sources to be integrated. These pieces of information are stored in the Model Warehouse, in the form of UML-like models [12], constraints and mappings. This way we can represent structural as well as non-structural information, such as class invariants, etc. The Model Warehouse resides in and is handled by the Model Manager component.

We use the term mediation to refer to the process of querying SINTAGMA models. Mediation decomposes complex integrated queries to simple queries answerable by individual information sources, and, having obtained data from these, composes the results into an integrated form. Mediation is the task of the Mediator component.

[Figure 1: architecture diagram. Components shown: Model Manager (Comparator, Model Verifier, Unifier, Corr. generator, Data Verifier, SpecAdvisor) with the Model Warehouse; Model Im(Ex)port Agent; Configurator; Mediator; Meta Server; Data Server; client programs (Browser, Shell, ...); modelling tools (Protege, Rose); wrappers (Relational, XML, RDF, HTML, Web Service).]

Fig. 1. The architecture of the SINTAGMA system

Access to heterogeneous information sources is supported by wrappers. Wrappers hide the syntactic differences between the sources of different kinds, by presenting them to upper layers uniformly, as UML models. These models (called interface models) are entered into the Model Warehouse automatically. The following subsections give a brief description of the main SINTAGMA components.

3.1 The Model Manager

The Model Manager is responsible for managing the Model Warehouse (MW) and providing integration support, such as model comparison and verification (not covered in this paper). Here we focus on the role of the Model Warehouse.

The content of the MW is given in the language called SILan, which is based on UML [12] and Description Logics [17]. The syntax of SILan resembles IDL, the Interface Description Language of CORBA [19]. We demonstrate the knowledge representation facilities of SINTAGMA by a simple SILan example showing the relevant features of the meta-data repository (Figure 2).

The example describes the model Art containing two classes, Artist and Work. It also contains an association hasWork between artists and their works. We will explain the details of this example below.

 1 model Art {
 2   class Artist: BuiltIns::DLAny {
 3     attribute String name;
 4     attribute Integer birthDate;
 5     constraint self.creation.date > 1900;
 6   };
 7
 8   class Work: BuiltIns::DLAny {
 9     attribute String title;
10     attribute String author;
11     attribute Integer date;
12     attribute String type;
13     primary key title;
14   };
15
16   association hasWork {
17     connection Artist as creator;
18     connection Work as creation;
19   };
20 };

Fig. 2. SILan representation of the model Art

Semantics of SILan models. The central elements of SILan models are classes and associations, since these are the carriers of information. A class denotes a set of entities called the instances of the class. Similarly, an n-ary association denotes a set of n-ary tuples of class instances called links.

Classes can have attributes, which are defined as functions mapping the class to a subset of values allowed by the type of the attribute. Classes can inherit from other classes. All instances of the descendant class are instances of the ancestor class as well. In our example both Artist and Work inherit from the built-in class BuiltIns::DLAny¹ (cf. lines 2 and 8). See Section 4.3 for more details.

¹ In SILan double colons (::) separate the model name from the name of its constituent (class, association, etc.).

Associations have connections; an n-ary association has n connections. In an association some of the connections can be named, providing intuitive navigation. For example, the connections of association hasWork, corresponding to classes Artist and Work, are called creator and creation, respectively (lines 17-18).

Classes can have a primary key, composed of one or more attributes. This specifies that the given subset of the attributes uniquely identifies an instance of the class. In our example, as a gross simplification, attribute title serves as a key in class Work, i.e. there cannot be two works (books, for example) with the same title.

[...] Object Constraint extension of UML, the OCL language [9]. Invariants give statements about instances of classes (and links of associations) that hold for each of them. The constraint in the declaration of Artist (line 5) is an invariant stating that the publication date of each work of an artist is greater than 1900². The identifier self refers to an arbitrary instance of the context, in this case the class Artist. Then two navigation steps follow. In the first step we navigate through the association hasWork to an arbitrary piece of work of the artist, while in the second step we go from the work to its publication date, and finally state that this date is always greater than 1900.

In addition to the object oriented modelling paradigm, the SILan language also supports constructs from the Description Logic (DL) world [17]. This recently added feature of SINTAGMA is discussed in Section 4.

Abstractions. For mediation, we need mappings between the different sources and the integrated model. These mappings are called abstractions because they often provide a more abstract view of the notions present in the lower level models. An example abstraction called w0 can be seen in Figure 3.

 1 abstraction w0 (m0: Interface::Product,
 2                 m1: Interface::Description
 3              -> m2: Art::Work) {
 4
 5   constraint
 6       m1.id = m0.id and
 7       m1.category = "artwork"
 8   implies
 9       m2.title = m0.name and
10       m2.author = m0.creator and
11       m2.date = m0.creation_date and
12       m2.type = m1.subcategory and
13       m2.DL_ID = m0.name;
14 };

Fig. 3. SILan representation of the abstraction populating class Work

This abstraction populates the class Work (cf. Figure 2) in the model Art using classes Product and Description, both from the model Interface (lines 1-3). This means that the abstraction specifies how to create a virtual instance of class Work, given that the other two classes are already populated (e.g. they correspond to real information sources). In lines 1-3 the identifiers m0, m1 and [...] to denote instances of the appropriate classes.

² This may be so because the underlying information sources are known to be dealing with works of art of the 20th century or later.

The abstraction describes that given an instance of class Product called m0 and an instance of class Description called m1, for which the conditions in lines 6-7 hold, there exists an instance m2 of class Work with attribute values specified by lines 9-13³. Note that line 6 specifies that the id attributes of the two instances have to be the same, and thus corresponds to a relational join operation. In our integration scenario (see Section 6) Product and Description actually correspond to real-world Oracle tables containing various products and their descriptions, including books and paintings.

These two sources share the key id (line 6). While the first one supplies four fields to Work objects (title, author, date and DL_ID), the contribution of the second one is a single field (type). However, this second table has information to ensure that only relevant products (works of art) are included in class Work, through the condition in line 7.

We note that other abstractions can also populate class Work. In this case the set of instances of Work will be the union of the instances produced by the appropriate abstractions. Note that if a new information source is added, we only have to specify a new abstraction corresponding to this source, while the existing abstractions do not have to be modified.

Notice that the abstraction in Figure 3 takes the form of an implication describing how the given sources can contribute to populating the high level class Art::Work. This is characteristic of the Local as View integration approach [6].
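The join-and-filter reading of abstraction w0 can be made concrete with a small Python sketch. It is an illustration only: the sample rows are invented, the function populate_work is a hypothetical name, and the sketch materialises the Work instances eagerly, whereas SINTAGMA answers queries through the Mediator.

```python
# Hypothetical rows standing in for the sources behind
# Interface::Product and Interface::Description (all values are invented).
PRODUCTS = [
    {"id": 1, "name": "Sunset", "creator": "A. Painter", "creation_date": 1950},
    {"id": 2, "name": "Desk lamp", "creator": "Some Factory", "creation_date": 2005},
]
DESCRIPTIONS = [
    {"id": 1, "category": "artwork", "subcategory": "painting"},
    {"id": 2, "category": "furniture", "subcategory": "lighting"},
]


def populate_work(products, descriptions):
    """Build virtual Work instances following the constraint of Figure 3."""
    works = []
    for m0 in products:
        for m1 in descriptions:
            # Lines 6-7 of Figure 3: join on id, keep only artworks.
            if m1["id"] == m0["id"] and m1["category"] == "artwork":
                # Lines 9-13 of Figure 3: attribute values of the new instance.
                works.append({
                    "title": m0["name"],
                    "author": m0["creator"],
                    "date": m0["creation_date"],
                    "type": m1["subcategory"],
                    "DL_ID": m0["name"],
                })
    return works


WORKS = populate_work(PRODUCTS, DESCRIPTIONS)
```

A second abstraction over another source would contribute further instances, and the content of class Work would be the union of the two result lists.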

³ Attribute DL_ID comes from the class DLAny, of which class Work is a descendant. It has a special role, as explained in Section 4.3.

3.2 The Wrappers

Wrappers provide a common interface for accessing various information source types, such as relational and object-oriented databases, semi-structured sources (e.g. XML or RDF), as well as Web services.

A wrapper has two main tasks. First, it extracts meta-data from the information source and delivers these to the Model Manager in the form of SILan models. For example, in case of relational sources, databases correspond to models, tables to classes, columns to attributes, as shown in Figure 4.

[Figure 4: the relational database Interface with table Product (columns name: String, id: Integer, creator: String, creation_date: String) shown next to the SILan model below; database maps to model, table to class, column to attribute.]

    model Interface {
      class Product {
        attribute String name;
        attribute Integer id;
        attribute String creator;
        attribute String creation_date;
        primary key id;
      };
    };

Fig. 4. Modelling relational sources in SILan

The other principal task of a wrapper is to transform queries, formulated in terms of this interface model, into the format required by the underlying information source, and thus allow for running queries on the sources.

3.3 The Mediator

The Mediator [2] supports queries on high level model elements by decomposing them into interface model specific questions. This is performed by creating a query plan satisfying the data flow requirements of the sources. During the execution of this query plan the data transformations described in the abstractions are carried out. Whenever we query a model element in SINTAGMA, the Model Manager provides the following two kinds of information to the Mediator:

1. the query goal itself, i.e. a Prolog term representing what to query;
2. a set of mediator rules, using which the Mediator can decompose the complex query into primitive ones (i.e. queries that refer only to interface models).

For example, let us consider the query shown below involving class Work.

    query RecentWork
    select *
    from w: Art::Work
    where w.date > 2000;

This query is looking for recent works, namely those instances of the class Art::Work that were created after 2000⁴. In this case, the query goal is similar to the following simple Prolog expression:

    :- 'Work:class:220'(DT, [A, B, C, D, E], DA), C > 2000.        (2)

Here, the first Prolog goal retrieves an instance of Art::Work. The variables in this term will be instantiated during query execution. The predicate name 'Work:class:220' is a concatenation of three strings: the kind of the model element (class) and its unique internal identifier (220), preceded by the unqualified (and thus non-unique) SILan name (Work), provided for readability. Model elements are often referred to by handles of the form Kind(Id), e.g. [...] the instances queried for, as opposed to the dynamic type which can be different, if the returned object belongs to a descendant class of Work.

⁴ We could have created a class named RecentWork and populated it by an appropriate abstraction. Then, instead of formulating a SILan query, we could have simply asked directly for the instances of this class. The question whether to use a query or an abstraction is a modelling decision.

The dynamic type of the queried instance, i.e. the handle of the most specific class it actually belongs to, is returned in the first argument of the goal. The second argument contains the values of the static attributes; in this case we have five such variables (cf. declaration of class Work in Figure 2). For example, C denotes the value of the attribute date. The third and last argument of the query term carries the values of the dynamic attributes. These represent the additional attributes (not known at query time) of the instance if it happens to belong to a descendant class of Art::Work.

The second part of the query goal corresponds to a simple arithmetic OCL constraint, which uses variable C representing the date attribute of the work in question.

The mediator rules representing the abstraction w0 shown in Figure 3 take the following form:

    'Product:class:190'(_, [Title, Id, Author, Date], _),
    'Description:class:191'(_, ["artwork", Id, Type], _) --->
        'Work:class:220'(class(220), [Title, Title, Author, Date, Type], [])

The specific rule above describes how to create an instance of the class Work whenever we have two appropriate instances of classes Product and Description available. If there were more abstractions, the Mediator would get more rules as there would be more than one possible way to populate the given class.

Note that the mediator rules are also used to describe inheritance between model elements. In such a case the dynamic type of the model element on the right hand side of the rule is a variable (as opposed to the constant class(220) above). This variable is the same as the dynamic type of the model element on the left hand side. The dynamic attributes are propagated similarly.

Finally, let us state that an n-ary association is implemented as an n-ary relation, each argument of which is a ternary structure corresponding to a class instance, similar to the first goal of (2). For example, a query goal for the association hasWork (cf. Figure 2) has the following form:

    :- 'hasWork:association:227'(
           'Artist:class:218'(DT1, [DL_ID1, Name, Birthdate], DA1),        (3)
           'Work:class:220'(DT2, [DL_ID2, Title, Author, Date, Type], DA2)
       ).

4 DL modelling in SINTAGMA

Let us now introduce the new DL modelling capabilities of the SINTAGMA system. First we discuss why we need Description Logic models during the integration process and provide an introductory example. Then we present the DL [...] usage. Finally, we summarise the tasks of the integration expert when using DL elements during integration.

4.1 An introductory example

In the Model Warehouse we handle models of different kinds. We distinguish between application and conceptual models. The application models represent existing or virtual information sources and because of this they are fairly elaborate and precise. Conceptual models, however, represent mental models of user groups, therefore they are vaguer than the application models.

Our experience shows that to construct such models it is more appropriate to use some kind of ontological formalism instead of the relatively rigid object oriented paradigm. Accordingly, we have extended our modelling language to incorporate several description logic constructs, in addition to the UML-like ones described earlier. In the envisioned scenario, the high-level models of the users are formulated in description logic and via appropriate definitions they are connected to lower-level models. Mediation for a conceptual model follows the same idea we use for any other model: the query is decomposed, following the definitions and abstractions, until we reach the interface models (in general, through some further intermediate models) which can be queried directly.

Before going into the details, we show an example to illustrate how DL descriptions are represented in SILan (note that Writer and Painter are both descendants of class Artist, but otherwise they are normal UML classes; we will present more details about these classes in Section 6).

    model Conceptual {
      class WriterAndPainter {};
      constraint equivalent {                                  (4)
        WriterAndPainter,
        Unified::Writer and Unified::Painter};
    };

Here we define the class WriterAndPainter by providing a SILan constraint. This constraint can be placed anywhere in the Model Warehouse: in the example above we simply put it in the very model that declares the class WriterAndPainter itself. The constraint actually corresponds to a DL concept definition axiom: WriterAndPainter ≡ Writer ⊓ Painter. Namely, it states that the instances of class WriterAndPainter are those (and only those) who belong to the unnamed class containing the individuals who are both writers and painters. Thus, DL concepts are defined using the Global as View approach [6], as opposed to the Local as View techniques applied in populating high-level classes using abstractions (cf. Section 3.1).

Note that the class WriterAndPainter could be created without DL support. However, in that case the integration expert would have to go through a much [...] specifying all its attributes and populating it with an appropriate abstraction. This abstraction would have to implement the constraint (4), through an appropriate join-like operation.

Now, with DL support, the expert simply formulates a very short and intuitive DL axiom. We argue that this is easier for the expert to do, and it also makes the content of the Model Warehouse more readable to others.

4.2 DL elements in SILan

From the DL point of view, SINTAGMA supports acyclic Description Logic TBoxes containing only concept definition axioms, which are formulated in an extension of the ALCN(D) language (see more below about the extension). Only single atomic concepts, so called named symbols, can appear on the left hand side of the axioms, such as WriterAndPainter in example (4). The remaining atomic concepts, not appearing on the left hand side, are called base symbols. Such a TBox is definitorial, i.e. the meaning of the base symbols unambiguously defines the meaning of the named symbols. The base symbols, in our case, correspond to normal SINTAGMA classes and associations, e.g. Writer and Painter in the example (4). The ABox is a set of concept and role assertions, as determined by the instances of the classes which correspond to the base symbols participating in the TBox.

The DL concept constructors supported by SINTAGMA and their SILan equivalents are summarised in Table 1. Note that this table actually describes the possible concept formats on the right hand side of a definition axiom, assuming that we have expanded the TBox⁵.
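Expanding an acyclic TBox amounts to replacing every named symbol on a right hand side by its definition until only base symbols remain. The Python sketch below illustrates this under assumptions of our own: concept expressions are encoded as nested tuples, the function name expand is hypothetical, and the axiom for NoviceWriterAndPainter is invented for the example. It does not show how SINTAGMA represents axioms internally.

```python
def expand(expr, tbox):
    """Replace named symbols in expr by their definitions, recursively.

    Terminates only if the TBox is acyclic.
    """
    if isinstance(expr, tuple):
        operator, *arguments = expr
        # Expand the arguments; the operator tag itself is kept as is.
        return (operator, *(expand(arg, tbox) for arg in arguments))
    if expr in tbox:
        # A named symbol: substitute its (expanded) definition.
        return expand(tbox[expr], tbox)
    # A base symbol, a role name or a number: nothing to expand.
    return expr


# Named symbol -> right hand side of its definition axiom.
TBOX = {
    "WriterAndPainter": ("and", "Writer", "Painter"),
    "NovicePainter": ("and", "Painter", ("atmost", 5, "hasPainting")),
    # Invented axiom that refers to two other named symbols.
    "NoviceWriterAndPainter": ("and", "WriterAndPainter", "NovicePainter"),
}

EXPANDED = {name: expand(definition, TBOX) for name, definition in TBOX.items()}
```

After expansion every right hand side mentions only base symbols such as Writer, Painter and hasPainting, which are the model elements that can be queried through mediator rules.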

The only non-classical DL element in Table 1 is the concrete domain restriction (the last line in the table). Such a restriction specifies a subset of instances of the base concept A for which the given OCL constraint holds. This is a generalisation of the idea of concrete domains in the Description Logics world. Below we show an example of a concrete SILan restriction describing those works whose type (i.e. the value of the attribute type) is painting.

    class constraint Art::Work satisfies self.type="painting"

The reason we allow only concept definition axioms is that we aim to use DL concepts to describe executable high-level views of information sources. In this sense a DL concept is actually a syntactic variant of a SILan query or a SILan class populated by an abstraction.

⁵ The expanded version of an acyclic TBox is obtained by repeatedly replacing every named symbol on the right hand side of an axiom by its definition. This process is repeated until no further named symbols are left on the right hand side. The fact that the TBox is acyclic ensures the termination of this process.

    Construct                 DL syntax   SILan equivalent
    Base concept              A           UML class
    Atomic role               R           UML association
    Intersection              C ⊓ D       C and D
    Union                     C ⊔ D       C or D
    Negation                  ¬C          not C
    Value restriction         ∀R.C        slot constraint R all values C
    Existential restriction   ∃R.C        slot constraint R some value C
    Number restriction        ⋈ n R       slot constraint R cardinality i..j
    Top                       ⊤           DLAny
    Bottom                    ⊥           DLEmpty
    Concrete restriction                  class constraint A satisfies OCL

Table 1. DL-related constructs supported in SILan

Allowing only concept definition axioms also implies that we use the Closed World Assumption (CWA) in DL query execution. We argue that this is appropriate for the following three reasons. First, CWA automatically ensures that our DL constructs are semantically compatible with other constructs in the SINTAGMA system. Second, we argue that the Open World Assumption (OWA) is applicable when we have only partial knowledge and would like to determine the consequences of this knowledge, true in every universe in which the axioms of this partial knowledge hold. In contrast with this, in the context of information integration, our users would like to consider a single universe, in which a base concept or a role denotes exactly those individuals (or pairs of individuals) which are present in the corresponding database. To illustrate this issue, let us consider the following example: the concept of novice painter is defined to contain painters having at most 5 paintings (for example, being a novice painter may be a precondition for a government grant). To model this situation, the integration expert creates the DL axiom shown below.

    NovicePainter ≡ Painter ⊓ (≤ 5 hasPainting)

However, querying this concept using OWA will provide no results in general, as an open world reasoner would return an individual only if it is provable that it has no more than 5 paintings. Practically, this is not what the information expert wants.
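Under the CWA, by contrast, the novice painter concept can be evaluated by counting the role assertions that are actually stored. The Python sketch below is an illustration with invented data and a hypothetical function name (novice_painters); it treats the listed hasPainting pairs as the complete set of paintings, which is exactly the closed world reading.

```python
from collections import Counter

# Invented ABox content: the Painter concept and the hasPainting role.
PAINTERS = {"p_ann", "p_bob", "p_eve"}
HAS_PAINTING = [("p_ann", "a1"), ("p_ann", "a2")] + [
    ("p_bob", f"b{i}") for i in range(6)
]


def novice_painters(painters, has_painting, limit=5):
    """Closed world evaluation of Painter ⊓ (≤ limit hasPainting).

    Pairs missing from has_painting are taken not to exist, so a painter
    with no stored paintings counts as having zero of them.
    """
    counts = Counter(painter for painter, _ in has_painting)
    return {p for p in painters if counts[p] <= limit}


NOVICES = novice_painters(PAINTERS, HAS_PAINTING)
```

With these rows the result contains p_ann (two paintings) and p_eve (none stored), but not p_bob (six). An open world reasoner could not return p_ann or p_eve from the same data, since nothing rules out further, unrecorded paintings.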

The third reason why we decided to use the closed world assumption is the fact that we envisage handling huge amounts of data in the underlying databases. Traditional, tableau based DL reasoners do not cope well with large ABoxes [15]. [...] still not fast or not expressive enough [24]. By using CWA we can implement DL queries using the well researched, efficient database technology.

4.3 Modeling methodology and tasks of the integration expert

The integration expert is responsible for creating the DL axioms. Although these are represented in SILan within the SINTAGMA system, the expert can use any available OWL editor to create OWL descriptions. These descriptions can then be loaded by the OWL importer of the SINTAGMA system, which basically realises an OWL-SILan translation (cf. the Model Im(Ex)port box in Figure 1).

One thing the expert should take care of is to match the names of the base symbols and the corresponding SINTAGMA classes and associations. This is often done in two steps: first the integration expert creates concept definition axioms using the widely accepted terminology of the domain, not paying attention to the names of the model elements in the Model Warehouse. Next, the expert provides additional definition axioms for each base symbol, connecting it with the proper model element. For example, we could use names A and B instead of Writer and Painter in (4), provided that we also encode in SILan the equivalents of the following DL axioms:

    A ≡ Writer        B ≡ Painter

A further crucial issue is to decide how to identify the instances of the base concepts, e.g. the instances of the class Writer and class Painter. Without this, it is not possible to determine the instances of class WriterAndPainter.

In a traditional DL ABox, an instance has a name that unambiguously identifies it. In SINTAGMA, similarly to databases, an instance is identified by a subset of its attribute values. For example, two writers could be considered to be the same if their names match, assuming that name is a key in class Writer.

The problem is that such keys are fairly useless when we compare instances of different data sources. This is because, in general, we cannot draw any direct conclusion from the relation of the keys belonging to instances from different classes. For example, databases containing employees often use numeric IDs as keys. Having two employees from different companies with the same ID does not mean that we are talking about the same person. Similarly, if the IDs of the employees do not match, they are not necessarily different persons.

What we need is some kind of shared key that uniquely identifies the instances of the classes participating in DL concept definitions. Luckily, the object-oriented paradigm we use in SINTAGMA provides a nice way to have such identifiers.

We have mentioned earlier that in SINTAGMA the notion of DL concept is a syntactic variant of SINTAGMA class. This also means that the result of a DL query is an ordinary instance that has to belong to some class(es). For example, when we are looking for the instances that are elements of both classes Writer and Painter we are actually interested in an artist instance belonging to these [...] we use to describe a DL concept the result must belong to some class that is a common ancestor (in terms of inheritance) of the classes involved.

Instead of asking the integration expert to define such common ancestor classes in an ad hoc way, we introduce the built-in class DLAny. This class corresponds to the DL concept top (⊤) and it has only one attribute called DL_ID, which is a key. We require that all the classes participating in DL concept definitions are the descendants of DLAny⁶ (cf. lines 2 and 8 of Figure 2). Because of the properties of inheritance, attribute DL_ID will be a key in all of the descendant classes, i.e. it will serve exactly as the global identifier we were looking for.

Now, the task of the integration expert is to assign appropriate values to the DL_ID attributes: she needs to extend the existing abstractions populating the base symbols (classes) to also consider the attribute DL_ID. By appropriate values we mean that the DL_IDs of two instances should match if these instances are the same, and should differ otherwise. An example for this can be seen in Figure 5 populating the class Writer, which is part of a bigger integration scenario to be shown later in Section 6.

 1 abstraction ap (m0: Interface::Member ->
 2                 m1: Unified::Writer) {
 3
 4   constraint let n = m0.fname.concat(" ").concat(m0.lname) in
 5       m1.name = n and
 6       m1.birthDate = m0.date and
 7       m1.member_id = m0.iwa_id and
 8       m1.style = m0.style and
 9       m1.DL_ID = n;
10 };

Fig. 5. Populating the DL_ID attribute of a base concept

This abstraction populates the class Writer from an interface class called Member (lines 1-2), which represents a membership database of an imaginary International Writer Association (IWA). Let us assume that the members of this association have some kind of a unique identifier, such as the membership number, present in the underlying database. It may be worth bringing this key to the class Writer (line 7) as it makes it possible to find writers efficiently if they happen to be IWA members. However, the unique identifier from the DL point of view has to be different: in fact it is the concatenation of the first and last name of the writer, with a space in between (lines 4 and 9).

⁶ Note that this is a necessary condition. As for any concept C, C ⊑ ⊤ holds, any DL instance has to belong to the class corresponding to ⊤, i.e. to DLAny.

[...] (e.g. Person, see Figure 8) where the IWA number makes no sense and so the member_id attribute is set to "n/a". Furthermore, we may want class Writer to be a descendant of class Artist (cf. Figure 8), together with some other classes, such as Painter. This requires a key that can be computed from all the underlying sources, such as the name of the artist⁷.
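The role of DL_ID can be pictured with a short Python sketch. It is not SILan and not part of SINTAGMA: the record layout and the names `dl_id` and `group_by_dl_id` are assumptions made for this illustration. The key is derived as in lines 4 and 9 of Figure 5, and instances populated from different sources are recognised as the same individual when their keys match.

```python
from collections import defaultdict


def dl_id(fname, lname):
    """Mirror lines 4 and 9 of Figure 5: first name, a space, last name."""
    return fname + " " + lname


def group_by_dl_id(instances):
    """Group (class_name, attributes) pairs by their shared DL_ID."""
    groups = defaultdict(list)
    for class_name, attrs in instances:
        groups[attrs["DL_ID"]].append(class_name)
    return dict(groups)


# Hypothetical instances produced by two different abstractions.
writer = ("Writer", {"DL_ID": dl_id("Lisa", "James"), "member_id": 42})
painter = ("Painter", {"DL_ID": dl_id("Lisa", "James"), "favourite_colour": "red"})
other = ("Writer", {"DL_ID": dl_id("John", "Smith"), "member_id": 7})

groups = group_by_dl_id([writer, painter, other])
print(groups)
```

As footnote 7 points out, a name-only key is a simplification; a more realistic key would also include the birth date.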

To summarise, the integration expert has to perform the following tasks when DL modelling is used during the integration process:

1. declare DL classes and for each provide corresponding definition axioms;

2. ensure that each base concept appearing in the definition axioms is:

(a) inherited from class DLAny,

(b) populated properly, i.e. its DL_ID attribute is filled appropriately.

5 Querying DL models in SINTAGMA

Now we turn our attention to querying DL concepts in SINTAGMA. As described in Section 3.3, our task is to create a query goal and a set of mediator rules. When we query a DL class, mediator rules are only generated for the base symbols. As these are ordinary classes and associations, this process is exactly the same as the one we use for cases without any DL construct involved. This means that we can now focus on the construction of the query goal.

Recall that a SINTAGMA instance is characterised by three properties, as exemplified by (2) on page 9: its dynamic type DT, its static attributes SAs and its dynamic attributes DAs. Below we will use the variable name As to denote the full attribute list of an instance, i.e. the concatenation of the static and dynamic attribute values, with the exclusion of DL_ID.

A DL class has only a single static attribute, the DL_ID key. However, in contrast with an object oriented query, a DL query may return an answer that has multiple dynamic types. For example, when we enumerate the class WriterAndPainter we get instances that belong to both classes Writer and Painter (something which is not possible in standard UML modelling). Accordingly, an answer to a DL query takes the form of a pair (ID, DTA), where ID is the DL_ID⁸ containing the unique name of the DL instance (see Section 4.3), while DTA is a Prolog structure containing the dynamic types of the answer, each paired with the corresponding full attribute list. The DTA structure is thus either a single DT-As pair, or recursively, two DTA structures joined using the comma operator: (DTA1, DTA2).
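The recursive shape of DTA can be illustrated with a small Python model. This is only a sketch: the tagged tuples `("pair", DT, As)` and `("join", DTA1, DTA2)` are an assumed encoding standing in for the Prolog terms DT-As and (DTA1, DTA2), and `flatten_dta` is a hypothetical helper.

```python
def flatten_dta(dta):
    """Flatten a DTA structure into a list of (DT, As) pairs.

    A DTA is modelled either as a ("pair", DT, As) leaf or as a
    ("join", DTA1, DTA2) node mirroring the comma operator.
    """
    if dta[0] == "pair":
        return [(dta[1], dta[2])]
    return flatten_dta(dta[1]) + flatten_dta(dta[2])


# A hypothetical answer (ID, DTA) with two dynamic types.
writer = ("pair", "Writer", ["Lisa James", 1965, 42, "fantasy"])
painter = ("pair", "Painter", ["Lisa James", 1965, "red"])
answer = ("Lisa James", ("join", writer, painter))

types = [dt for dt, _ in flatten_dta(answer[1])]
print(types)
```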

Figure 6 describes the mapping from an arbitrary DL concept expression to the corresponding query goal. Here we define a function Φ_C which, given an arbitrary concept expression C, returns the corresponding query goal with two arguments, ID and DTA. We define this function by considering the DL concept constructors, as listed in Table 1.

⁷ This is also a simplification. More realistically, the key could be the name together with the birth date.

⁸ We use the name ID instead of DL_ID for conciseness.

Φ_A(ID, DTA)       =  A^N(DT, [ID|SAs], DAs), DTA = DT-(SAs ⊕ DAs)

Φ_C⊓D(ID, DTA)     =  Φ_C(ID, DTA1), Φ_D(ID, DTA2), DTA = (DTA1, DTA2)

Φ_C⊔D(ID, DTA)     =  (Φ_C(ID, DTA) ; Φ_D(ID, DTA))

Φ_¬C(ID, _)        =  \+ Φ_C(ID, _)

Φ_∃R.C(ID, DTA)    =  R^N(R^N_D(DT, [ID|SAs], DAs), R^N_R(_, [ID2|_], _)),
                      Φ_C(ID2, _), DTA = DT-(SAs ⊕ DAs)

Φ_∀R.C(ID, DTA)    =  Φ_R_D(ID, DTA),
                      \+ (R^N(R^N_D(_, [ID|_], _), R^N_R(_, [ID2|_], _)),
                          Φ_¬C(ID2, _))

Φ_(⋈ n R)(ID, DTA) =  bagof(Y, R^N(X, Y), Ys), length(Ys, S), condition_⋈(n, S),
                      X = R^N_D(DT, [ID|SAs], DAs), DTA = DT-(SAs ⊕ DAs)

Φ_⊤(_, _)          =  true

Φ_⊥(_, _)          =  false

Fig. 6. Transforming DL constructs into query goals

Let us consider the cases one by one. If we have a base class, we simply create a query term representing the instances of the class, similar to the one in goal (2), and then convert the attributes retrieved to the required form (DTA). Here operation ⊕ denotes the compile time concatenation of lists⁹, while A^N stands for the predicate name corresponding to concept A. For example, Work^N = 'Work:class:220', cf. (2) on page 9. Note that in the second argument of the query goal A^N we make use of the fact that the DL_ID attributes are always placed first in the static attribute list of an instance.

If we have the intersection of two concepts C and D, we recursively transform concepts C and D and put them in a Prolog conjunction. The DTA structure is built from the structures recursively obtained from the execution of the transformations of concepts C and D. Note that the resulting structure may contain duplicates, i.e. the same DT-As pair may be found in DTA more than once. These duplicates are only removed at the top level, i.e. when the final result of a query is presented. The transformation of union concepts is similar to the intersection: we create a Prolog disjunction.
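The base, intersection and union cases can be mimicked outside Prolog. The sketch below is a Python model of these three rows of Figure 6, not SINTAGMA code: `EXTENSION`, `phi` and the tuple encoding of concept expressions are assumptions for illustration, and generators stand in for Prolog backtracking.

```python
# Hypothetical extensions of two base classes: DL_ID -> full attribute list As.
EXTENSION = {
    "Writer": {
        "Lisa James": ["Lisa James", 1965, 42, "fantasy"],
        "John Smith": ["John Smith", 1970, 7, "crime"],
    },
    "Painter": {
        "Lisa James": ["Lisa James", 1965, "red"],
    },
}


def phi(concept, bound_id=None):
    """Yield (ID, DTA) answers; DTA is modelled as a list of (DT, As) pairs.

    concept is a base class name or a tuple ("and", C, D) / ("or", C, D).
    bound_id plays the role of an already instantiated ID variable.
    """
    if isinstance(concept, str):                  # base concept A
        for ident, attrs in EXTENSION[concept].items():
            if bound_id is None or ident == bound_id:
                yield ident, [(concept, attrs)]
    elif concept[0] == "and":                     # C ⊓ D: conjunction
        _, c, d = concept
        for ident, dta1 in phi(c, bound_id):
            for _, dta2 in phi(d, ident):         # same ID in both goals
                yield ident, dta1 + dta2
    elif concept[0] == "or":                      # C ⊔ D: disjunction
        _, c, d = concept
        yield from phi(c, bound_id)
        yield from phi(d, bound_id)
    else:
        raise ValueError("unsupported constructor: %r" % (concept[0],))


both = list(phi(("and", "Writer", "Painter")))
either = list(phi(("or", "Writer", "Painter")))
print(both)
print([ident for ident, _ in either])
```

In this model the union yields the instance that is both a writer and a painter twice, once per disjunct, which is the kind of duplicate that is removed only when the final result is presented.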

Negation ¬C is implemented by using the Prolog negation-as-failure. This translation is only capable of checking whether a given instance with ID belongs to concept C or not. As usual in the database context, we restrict the use of negation to cases where negated queries appear only in conjunction with at least

⁹ The ⊕ operator is used only with a static attribute list (SAs). For any given base class, the length of the corresponding SAs is fixed (the number of static attributes excluding the DL_ID). Therefore, the SAs ⊕ DAs concatenation can be carried out at compile time.

[...] negated concepts have to appear either in the scope of a quantifier, or in an intersection together with at least one non-negated concept. It is the task of the Mediator to find an appropriate order in the final query plan where negation appears in a place where ID is instantiated [5]. The Mediator refuses to execute the query if such an order does not exist.
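The ordering requirement can be pictured with a small Python sketch. It models only the idea of placing negated goals after a goal that instantiates ID; it is not the Mediator's planning algorithm described in [5], and the goal encoding and the name `order_goals` are assumptions.

```python
def order_goals(goals):
    """Place negated goals after the positive ones in a conjunction.

    Each goal is a (kind, concept) pair with kind "pos" or "neg".
    A positive goal instantiates ID; a negated goal can only test it.
    """
    positive = [g for g in goals if g[0] == "pos"]
    negative = [g for g in goals if g[0] == "neg"]
    if negative and not positive:
        # Nothing would bind ID before the negation runs: refuse the query.
        raise ValueError("unsafe query: negation without a positive goal")
    return positive + negative


# Writer ⊓ ¬Painter, written with the negated conjunct first.
plan = order_goals([("neg", "Painter"), ("pos", "Writer")])
print(plan)
```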

The next two cases involve associations. On the right hand side of these formulas R^N denotes the predicate name corresponding to the association itself. R_D (R_R) denotes the base class that is the domain (range) of association R. Correspondingly, R^N_D and R^N_R stand for the predicate names of the classes R_D and R_R, respectively¹⁰. Recall that a binary association is represented by a binary relation with ternary structures as arguments, as in (3).

The existential restriction ∃R.C is simply transformed to a query of the association R and the concept C.

The goal corresponding to a value restriction ∀R.C first enumerates the domain of R and then uses double negation to ensure that the given instance has no R-values which do not belong to C. Note that Φ_¬C(ID2, _) is invoked only when ID2 is already instantiated.
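The two quantified cases can be sketched in Python under closed-world data. The sample relations and the names `exists_r` and `forall_r` are assumptions for illustration; the inner `any(...)` test plays the role of the negated goal, and the outer `not` that of the surrounding negation as failure.

```python
# Hypothetical closed-world data.
ARTISTS = ["Ann", "Bob", "Cid"]                       # domain of hasWork
HAS_WORK = [("Ann", "w1"), ("Ann", "w2"), ("Bob", "w3")]
NOVEL = {"w1", "w2"}                                  # extension of Novel


def exists_r(relation, concept_ids):
    """∃R.C: one answer per matching (ID, ID2) pair, as under backtracking."""
    return [x for (x, y) in relation if y in concept_ids]


def forall_r(domain, relation, concept_ids):
    """∀R.C: enumerate the domain, then reject any instance that has an
    R-value outside C."""
    answers = []
    for x in domain:
        has_counterexample = any(
            x == x2 and y not in concept_ids for (x2, y) in relation
        )
        if not has_counterexample:                    # outer negation
            answers.append(x)
    return answers


some_novel = exists_r(HAS_WORK, NOVEL)
only_novels = forall_r(ARTISTS, HAS_WORK, NOVEL)
print(some_novel, only_novels)
```

In this model an artist without any works satisfies the value restriction vacuously, because no counterexample can be found for her.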

A number restriction (⋈ n R) is transformed into a goal which uses the Prolog built-in predicate bagof (cf. Section 2.2, page 4) to enumerate the instances in the domain of R together with the number of R-values connected to them, and then simply applies the appropriate arithmetic comparison.
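A Python model of this counting step is given below. It is a sketch only: `number_restriction`, `COMPARE` and the sample relation are assumptions, and grouping with a dictionary stands in for the way bagof collects the R-values of each domain instance.

```python
from collections import defaultdict
import operator

# The arithmetic comparisons a number restriction may require.
COMPARE = {"<=": operator.le, ">=": operator.ge}


def number_restriction(relation, op, n):
    """(op n R): group R-values by domain instance, count, then compare.

    As with bagof/3 and a free variable, only instances that have at
    least one R-value are enumerated.
    """
    values = defaultdict(list)
    for x, y in relation:
        values[x].append(y)
    return [x for x, ys in values.items() if COMPARE[op](len(ys), n)]


HAS_PAINTING = [("Ann", "p1"), ("Ann", "p2"), ("Bob", "p3")]
at_most_one = number_restriction(HAS_PAINTING, "<=", 1)
at_least_two = number_restriction(HAS_PAINTING, ">=", 2)
print(at_most_one, at_least_two)
```

Since bagof/3 fails when its goal has no solutions, an instance with no R-values at all is never produced by this model, even for an "at most" restriction; whether SINTAGMA handles that case separately is not stated in the text.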

The last two lines of Figure 6 define the transformation of the top and bottom concepts. ⊤ is mapped into true, while ⊥ to false. Querying these concepts on their own does not make sense, but these mappings are useful when transforming DL concepts such as ∃R.⊤ or ∀R.⊥.

Having described the transformation of DL concepts to query goals, we now deal with the only remaining construct: the concrete restriction. A concrete restriction involving a base concept A and an OCL constraint O is transformed in a straightforward way into the query goal as shown below¹¹:

    Φ_A(ID, DTA), DTA = DT-AT, Ψ_O(ID, AT)

To illustrate the general algorithm, two example transformations are presented in Figure 7. The first one shows the translation of the WriterAndPainter class described in (4) on page 11. The query goal is a conjunction that consists of three goals. The first two goals enumerate the instances of classes Writer and Painter with a condition that their ID attributes match. At this point we have identified those instances who are writers and painters at the same time. The last goal constructs the structure DTA, describing the dynamic types and the corresponding attribute values of the given instances.

¹⁰ For example, if R = hasWork, cf. Figure 2, then R^N = 'hasWork:association:227', R^N_D = 'Artist:class:218' and R^N_R = 'Work:class:220'.

¹¹ Ψ_O(ID, AT) denotes the Prolog translation of the OCL constraint O. This is a feature which has already been present in earlier versions of SINTAGMA, before the introduction of the DL extensions, see [3].

[...] modern work. This DL concept involves the association hasWork and a class Modern (representing, say, contemporary pieces of art). The query goal becomes a bit more complex than in the first example: now it consists of four goals. The first goal enumerates the instances of class Writer. The second and the third goals filter out those writers that do not have any modern works. Here we have used the facts that the domain of hasWork is the class Artist and the range is the class Work (cf. Figure 2). Finally, the last goal builds the structure DTA.

Class to query:  WriterAndPainter
DL definition:   Writer ⊓ Painter
Query goal:      'Writer:class:234'(DT1, [ID,Name1,Birth1,IWA,Style], DA1),
                 'Painter:class:236'(DT2, [ID,Name2,Birth2,Colour], DA2),
                 DTA = (DT1-[Name1,Birth1,IWA,Style|DA1],
                        DT2-[Name2,Birth2,Colour|DA2])

Class to query:  ModernWriter
DL definition:   Writer ⊓ ∃hasWork.Modern
Query goal:      'Writer:class:234'(DT1, [ID,Name1,Birth1,IWA,Style], DA1),
                 'hasWork:association:227'(
                     'Artist:class:218'(DT2, [ID,Name2,Birth2], DA2),
                     'Work:class:220'(_, [ID2|_], _)),
                 'Modern:class:237'(_, [ID2|_], _),
                 DTA = (DT1-[Name1,Birth1,IWA,Style|DA1],
                        DT2-[Name2,Birth2|DA2])

Fig. 7. Transformation examples

Note that if a writer has more than one piece of modern work, the transformation in Figure 7 enumerates the writer multiple times. This is because the second goal can succeed more than once, leaving a choice point [26]. In the present version of SINTAGMA these duplicates are removed at the top level only, before the query results are presented to the user. In future, we will consider a more efficient solution, utilising the Prolog pruning operators (conditionals or cuts) to eliminate the unnecessary choices.

Also note that in our example scenario attributes Name1, Name2 and Birth1, Birth2 will be instantiated to the same values, i.e. to the name and birth date of the modern writer. This is the consequence of the data representation we use in SINTAGMA, i.e. if an instance has multiple dynamic types, for each of them we supply all the attribute values.
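The repeated enumeration of a writer with several modern works, and its removal at the top level, can be reproduced with a short Python model of the ModernWriter goal. The data and the names `modern_writer` and `unique` are assumptions; nested loops stand in for the choice points left by the association goal.

```python
# Hypothetical data: one writer with two modern works.
WRITERS = {"Ann": ["Ann", 1960, 1, "fantasy"]}    # DL_ID -> Writer attributes
ARTISTS = {"Ann": ["Ann", 1960]}                  # DL_ID -> Artist attributes
HAS_WORK = [("Ann", "w1"), ("Ann", "w2")]
MODERN = {"w1", "w2"}


def modern_writer():
    """Enumerate answers goal by goal, like the conjunction in Figure 7."""
    for ident, w_attrs in WRITERS.items():
        for artist, work in HAS_WORK:             # may succeed several times
            if artist == ident and work in MODERN:
                yield ident, (("Writer", tuple(w_attrs)),
                              ("Artist", tuple(ARTISTS[ident])))


def unique(answers):
    """Top-level duplicate removal that keeps the first occurrence."""
    seen, result = set(), []
    for answer in answers:
        if answer not in seen:
            seen.add(answer)
            result.append(answer)
    return result


raw = list(modern_writer())
final = unique(raw)
print(len(raw), len(final))
```

The name and birth year appear under both dynamic types of the answer, matching the observation that all attribute values are supplied for each dynamic type.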


In this section we present a simple use case, where we focus on illustrating the DL extension of SINTAGMA. More complex traditional integration problems solved using SINTAGMA are discussed in other papers, for example in [21].

Figure 8 shows the content of our example Model Warehouse. Here we have four models on different abstraction levels.

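[Figure 8: diagram of the Model Warehouse. Labels recoverable from the diagram: information sources MySQL, XML, PostgreSQL and Oracle; model Interface with classes Member, Person, Exhibitor, Product, Description, ...; model Unified with classes Writer and Painter; model Art with classes Artist and Work and associations hasWork and hasPainting; model Conceptual with the Description Logic classes PainterWriter, Novel, NovicePainter, ...; edges labelled abstraction and generalisation.]

Fig. 8. Content of the Model Warehouse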

The lowest one, Interface, contains classes directly corresponding to the information sources we aim to integrate. Class Member corresponds to some database table containing information about writers (members of a certain writers association), Person is the model of an XML source describing people (some of whom are possibly writers). We also have here class Exhibitor containing [...] among other products together with class Description which provides some information on products. These models are constructed automatically by different wrappers of the SINTAGMA system.

The next, more abstract model, called Unified, contains two classes, Writer and Painter; their SILan descriptions are shown in Figure 9 (referring to class Artist introduced in Figure 2 on page 6). These classes provide a unified view of writers and painters over our heterogeneous information sources, i.e. querying Writer and Painter gives us all the known writers and painters respectively. These classes are populated by SILan abstractions: Writer by two, while Painter by only one. We can later extend our Model Warehouse to include more information sources on painters. This way Painter would also be populated by several abstractions. Please note how flexible this approach is: whenever we would like to add a new information source, all we have to do is to provide a new abstraction. This is fundamentally different from the way views are created in traditional database systems.

model Unified {
  class Writer: Art::Artist {
    attribute Integer member_id;
    attribute String style;
  };
  class Painter: Art::Artist {
    attribute String favourite_colour;
  };
};

Fig. 9. SILan description of classes Writer and Painter

The third model, Art, describes an even higher view of the underlying information sources. It contains two classes connected by an association. Class Artist is declared to be the generalisation of classes Writer and Painter, i.e. Artist is a common parent of Writer and Painter, in terms of inheritance. Accordingly, it contains the union of the instances of these classes. Class Work incorporates works (books and paintings). In the example, class Work is populated by only one abstraction. Association hasWork connects instances in class Artist with those in class Work, i.e. it allows us to navigate from an artist to her works. This association is populated by an abstraction (not shown in Figure 8) by creating virtual pairs from those instances of classes Artist and Work where the author of the work matches the name of the artist.

Note that there is one more association in the Model Warehouse, called hasPainting. This association connects painters with their paintings and goes [...] populated by an abstraction, not shown here. Association hasPainting is used in the definition of PainterWriter (see below).

Up until now we have used the traditional features of SINTAGMA: classes, associations, generalisations, abstractions. Now we turn to the most abstract model, named Conceptual, which provides an even higher-level view of the information than the previous model.

The model Conceptual represents the knowledge of our specific example domain, in the form of DL concept definition axioms. These axioms form a simple ontology, a part of which is shown in Figure 10. This ontology talks about special types of artists, painters and writers. It states that a novice painter is a painter who has painted no more than 5 paintings (axiom 1). Somebody is a mostly-writer if she is an artist who has produced at least 3 works, but has at most one painting (axiom 2). A productive writer is a mostly-writer who has created at least 10 works (axiom 3). Somebody is a painter-writer if she is a writer who has some paintings (axiom 4). Finally, a novelist is somebody who is only writing novels (axiom 5).

NovicePainter     ≡  Painter ⊓ (≤ 5 hasPainting.⊤)                      (1)
MostlyWriter      ≡  Artist ⊓ (≥ 3 hasWork.⊤) ⊓ (≤ 1 hasPainting.⊤)     (2)
ProductiveWriter  ≡  MostlyWriter ⊓ (≥ 10 hasWork.⊤)                    (3)
PainterWriter     ≡  Writer ⊓ (∃hasPainting.⊤)                          (4)
Novelist          ≡  ∀hasWork.Novel                                     (5)
...               ≡  ...

Fig. 10. An ontology describing artists, painters and writers.

In practice, such an ontology can be created by the information expert manually or can be imported from an existing ontology using the OWL importer component of the SINTAGMA system. In SINTAGMA this ontology is represented by a model containing classes with no attributes, together with the corresponding SILan constraints as shown below:

model Conceptual {
  class NovicePainter {}; class MostlyWriter {};
  class ProductiveWriter {}; class PainterWriter {};
  class Novelist {}; ...
  constraint equivalent {
    NovicePainter,
    Painter and {slot constraint hasPainting cardinality 0..5}
  };
  ...
};

[...] Most of these (i.e. Painter, Writer and Artist) appear in the underlying UML models. However, there is the concept of Novel, which has no direct UML counterpart. This concept can be defined using a concrete restriction of SILan, as shown below.

constraint equivalent {
  Novel,
  {class constraint Art::Work satisfies self.type="novel"}
};

This concludes the description of our example models. Having encoded our DL axioms in terms of SILan constraints, we can now execute DL queries. For example, we can ask SINTAGMA to enumerate the instances of class ProductiveWriter. This query will produce instances similar to the following:

1 ('Lisa James',
2   [
3     'Writer'-['Lisa James', 1965, 42, 'fantasy'],
4     'Painter'-['Lisa James', 1965, 'red']
5   ]
6 )

Here, the string 'Lisa James', appearing in line 1, corresponds to the ID of Figure 6, i.e. the shared DL identifier. Lines 3–4 contain the list of the dynamic types and corresponding attributes of the instance. This specific instance has two dynamic types: she is a writer and a painter at the same time (lines 3 and 4). As a writer, she has a name, birth date, her membership ID and a style attribute. As a painter we also know her favourite colour.

7 Related work

The two main approaches in information integration are the Local as View (LAV) and the Global as View (GAV) [6]. In the former, sources are defined in terms of the global schema, while in the latter, the global schema is defined in terms of the sources (similarly to the classical views in database systems). Information Manifold [20] is a good example of a LAV system. Examples of the GAV approach include the Stanford-IBM integration system TSIMMIS [8], and the DL based integration system called Observer [23].

In SINTAGMA we apply a hybrid approach, i.e. we use both LAV and GAV. When using abstractions to populate high-level classes we employ the LAV principle, while in case of DL class definitions we use the GAV approach.

There are several completed and ongoing research projects in the area of using description logic-based approaches for both Enterprise Application Integration (EAI) and Enterprise Information Integration (EII).

[...] Architecture, and the provision of new capabilities within the framework of Semantic Web Services. Examples of such research projects include DIP [16] and INFRAWEBS [13]. These projects aim at the semantic integration of Web Services, in most cases using Description Logic based ontologies and Semantic Web technologies. Here, however, DL is used mostly for service discovery and design-time workflow validation, but not during query execution.

On the other hand, several logic-based EII tools use DL and take a similar approach as we did in SINTAGMA. That is, they create a DL model as a view over the information sources to be integrated. The basic framework of this solution is described e.g. in [7,4]. The fundamental difference with our approach is that these applications deal with the classical Open World Assumption, as already discussed in Section 4.2. We argue that existing DL reasoners are not usable when large amounts of data and complex DL queries are involved [15,18,24].

On the theoretical side an interesting description logic is ALCK [11], which adds a non-monotonic K operator to the ALC language to provide the ability to use both the CWA and the OWA, when needed. ALCK has several implementations; the Pellet reasoner [27], for example, supports this logic. However, ALCK lacks the ability to express cardinality constraints, which is a feature frequently used in information integration scenarios.

Finally, we mention that the Description Logic Programming (DLP) approach, first introduced in [14], also employs the idea of translating DL axioms into Prolog goals (cf. the approach summarised in Table 6). In contrast with our approach, DLP uses the Open World Assumption and does not deal with negation and cardinality restrictions.

8 Conclusions

In this paper we have presented the DL extension of the information integration system SINTAGMA. This extension allows the information expert to use Description Logic based ontologies in the development of high abstraction level conceptual models. Querying these models is performed using the Closed World Assumption over the underlying information sources.

We have presented the main components of the SINTAGMA system: the Model Manager, which is responsible for maintaining the Model Warehouse repository, the Wrapper, which provides a uniform view over the heterogeneous information sources, and the Mediator, which decomposes complex high-level queries into primitive ones answerable by the individual information sources.

Next, we have described the newly introduced DL modelling elements the integration expert can use when building conceptual models and we have also discussed the modelling methodology she has to follow. We have defined a transformation of DL queries to Prolog goals, used in the SINTAGMA system for DL query execution. We have also illustrated our approach by providing a use case about artists and their works.

[...] used alone for solving complex modelling problems, some kind of hybrid techniques are necessary. We argue that our solution for combining DL and UML modelling in a unified integration framework provides a viable alternative to existing systems. The usage of DL constructs in building high-level conceptual models has substantial benefits, both in terms of modelling efficiency and maintenance.

Acknowledgements

The authors acknowledge the support of the Hungarian NKFP programme for the SINTAGMA project under grant no. 2/052/2004. We thank all the people participating in this project, especially Tamás Benkő, the lead architect.

We are also grateful to the anonymous reviewers of a preliminary version of this paper [22], for their insightful comments.

References

1. F. Baader, D. Calvanese, D. McGuinness, D. Nardi, and P. F. Patel-Schneider, editors. The Description Logic Handbook: Theory, Implementation and Applications. Cambridge University Press, 2003.

2. Liviu Badea and Doina Tilivea. Query Planning for Intelligent Information Integration using Constraint Handling Rules, 2001. IJCAI-2001 Workshop on Modeling and Solving Problems with Constraints.

3. T. Benkő, P. Krauth, and P. Szeredi. A logic based system for application integration. In Proceedings of the 18th International Conference on Logic Programming, ICLP 2002. Springer, LNCS, 2002.

4. A. Borgida, M. Lenzerini, and R. Rosati. Description logics for databases. In Description Logic Handbook, pages 462–484, 2003.

5. András G. Békés and Péter Szeredi. Optimizing Queries in a Logic-based Information Integration System. In Wim Vanhoof and Patricia Hill, editors, Proceedings of the 17th Workshop on Logic-based methods in Programming Environments (WLPE 2007), pages 1–15, Porto, Portugal, 2007.

6. D. Calvanese, D. Lembo, and M. Lenzerini. Survey on methods for query rewriting and query answering using views. Tech. report, University of Rome, April 2001.

7. Diego Calvanese, Giuseppe De Giacomo, Maurizio Lenzerini, Daniele Nardi, and Riccardo Rosati. Description logic framework for information integration. In Principles of Knowledge Representation and Reasoning, pages 2–13, 1998.

8. S. Chawathe, H. Garcia-Molina, J. Hammer, K. Ireland, Y. Papakonstantinou, J. Ullman, and J. Widom. The TSIMMIS project: Integration of heterogeneous information sources. In 16th Meeting of the Information Processing Society of Japan, pages 7–18, Tokyo, Japan, 1994.

9. T. Clark and J. Warmer, editors. Object Modeling with the OCL: The Rationale behind the Object Constraint Language, volume 2263 of LNCS. Springer, 2002.

10. William F. Clocksin and C. S. Mellish. Programming in PROLOG. Springer-Verlag New York, Inc., Secaucus, NJ, USA, 1994.

11. [...] operator for description logics. Artif. Intell., 100(1-2):225–274, 1998.

12. Martin Fowler and Kendall Scott. UML Distilled: Applying the Standard Object Modeling Language. Addison-Wesley, 1997.

13. Vladislava Grigorova. Semantic description of web services and possibilities of BPEL4WS. Information Theories and Applications, 13(2):183–187, 2006.

14. Benjamin N. Grosof, Ian Horrocks, Raphael Volz, and Stefan Decker. Description logic programs: Combining logic programs with description logic. In Proc. of the Twelfth International World Wide Web Conference (WWW 2003), pages 48–57. ACM, 2003.

15. V. Haarslev and R. Möller. Optimization techniques for retrieving resources described in OWL/RDF documents: First results. In Ninth International Conference on the Principles of Knowledge Representation and Reasoning, KR 2004, Whistler, BC, Canada, June 2-5, pages 163–173, 2004.

16. M. Hepp, F. Leymann, J. Domingue, A. Wahler, and D. Fensel. Semantic business process management: A vision towards using semantic web services for business process management, 2005.

17. Ian Horrocks. Reasoning with expressive description logics: Theory and practice. In Proc. of the 18th Int. Conf. on Automated Deduction (CADE 2002), number 2392 in Lecture Notes in Artificial Intelligence, pages 1–15. Springer, 2002.

18. U. Hustadt, B. Motik, and U. Sattler. Reasoning for description logics around SHIQ in a resolution framework. Technical Report 3-8-04/04, June 2004.

19. Interface Definition Language. ISO International Standard, number 14750.

20. T. Kirk, A. Y. Levy, Y. Sagiv, and D. Srivastava. The Information Manifold. In C. Knoblock and A. Levy, editors, AAAI Spring Symposium on Information Gathering from Heterogeneous, Distributed Environments, 1995.

21. Gergely Lukácsy, Tamás Benkő, and Péter Szeredi. Towards automatic semantic integration. In Enterprise Interoperability II, New Challenges and Approaches, Proceedings of the I-ESA 2007, pages 795–806, Funchal, Portugal, 2007. Springer.

22. Gergely Lukácsy and Péter Szeredi. Ontology based information integration using logic programming. In Edna Ruckhaus, editor, Proceedings of the 2nd International Workshop on Applications of Logic Programming to the Web, Semantic Web and Semantic Web Services (ALPSWS 2007), pages 39–54, Porto, Portugal, 2007.

23. Eduardo Mena, Vipul Kashyap, Amit P. Sheth, and Arantza Illarramendi. OBSERVER: An approach for query processing in global information systems based on interoperation across pre-existing ontologies. In Conference on Cooperative Information Systems, pages 14–25, 1996.

24. Zsolt Nagy, Gergely Lukácsy, and Péter Szeredi. Translating description logic queries to Prolog. In Proc. of PADL, Springer LNCS 3819, pages 168–182, 2006.

25. Linh Anh Nguyen. A fixpoint semantics and an SLD-resolution calculus for modal logic programs. Fundam. Inf., 55(1):63–100, 2003.

26. ISO Prolog standard, 1995. ISO/IEC 13211-1.

27. Evren Sirin, Bijan Parsia, Bernardo Cuenca Grau, Aditya Kalyanpur, and Yarden Katz. Pellet: A practical OWL-DL reasoner. Web Semant., 5(2):51–53, 2007.

28. Leon Sterling and Ehud Shapiro. The art of Prolog: advanced programming techniques. MIT Press, Cambridge, MA, USA, 1986.
