Oriented Models in an Information
Integration Framework
Gergely Lukácsy and Péter Szeredi
Budapest University of Technology and Economics
Department of Computer Science and Information Theory
1117 Budapest, Magyar tudósok körútja 2., Hungary
Phone: +36 1 463-2585 Fax: +36 1 463-3157
{lukacsy,szeredi}@cs.bme.hu
Keywords: description logic, information integration, logic programming
Abstract. We present an information integration system called SINTAGMA
which supports the semantic integration of heterogeneous information
sources using a meta data driven approach. The main idea of SINTAGMA is
to build a so called Model Warehouse, containing several layers of
integrated models connected by mappings. At the bottom of this hierarchy
there are the models representing the actual information sources. Higher
level models represent virtual databases which can be queried, as the
mappings provide a precise description of how to populate these virtual
sources using the concrete ones.
The implementation of SINTAGMA uses constraints and logic programming;
for example, complex queries are translated into Prolog goals. This paper
focuses on a recent development in SINTAGMA allowing the information
expert to use Description Logic (DL) based ontologies in the development
of high abstraction level conceptual models. Querying these models is
performed using the Closed World Assumption, as we argue that traditional
Open World DL reasoning is less appropriate in the context of database
oriented information integration environments.
1 Introduction
This paper presents the Description Logic modelling capabilities of the
SINTAGMA Enterprise Information Integration system.
SINTAGMA is based on the SILK tool-set, developed within the EU FP5
project SILK (System Integration via Logic & Knowledge) [3]. SILK is a
Prolog based, data centered, monolithic information integration system
supporting semi-automatic integration on relational and semi-structured
sources.
The SINTAGMA system extends the original framework in several directions.
As opposed to the monolithic SILK structure, SINTAGMA is built from
loosely coupled distributed components. The functionality has become
richer as, among others, the system now deals with Web Services as
information sources.
[...] integration expert to use Description Logic models in the
integration process.
This paper is a revised and extended version of the paper presented at
the ALPSWS'07 workshop in Porto [22]. It is structured as follows.
Section 2 introduces description logic and logic programming. In Section 3
we give a general introduction to the SINTAGMA system, describing the main
components, the SILan modelling language, and the query execution
mechanism. In Section 4 we discuss the description logic extension of
SILan: we introduce the syntactic constructs and the modelling
methodology. Section 5 describes the execution mechanism used when
querying Description Logic models. Section 6 presents a fairly complex
example, demonstrating the tools and techniques we have discussed so far.
In Section 7 we examine related work. Finally, we conclude with a summary
of our results.
The examples we use in the upcoming discussions are part of the
integration scenario described in detail in Section 6. This scenario
represents a world where we attempt to integrate various information
sources about writers, painters and their work (i.e. books, paintings,
etc.) and present this information in the form of abstract views.
2 Background
Below we give a brief introduction to Description Logic and logic
programming as these technologies form the basis of our work.
2.1 Description Logic
Description Logics (DL) [17] is a family of simple logic languages used
for knowledge representation. DLs are used for describing the various
kinds of knowledge for a selected field. The terminological system of a
description logic knowledge base consists of concepts, which represent
sets of objects, and roles, describing binary relations between concepts.
Objects are the instances occurring in the modelled application field, and
thus are also called instances or individuals.
A description logic knowledge base consists of two disjoint parts: the
TBox and the ABox. The TBox (terminology box), in its simplest form,
contains terminology axioms of form C ⊑ D (concept C is subsumed by D).
The ABox (assertion box) stores knowledge about the individuals in the
world: a concept assertion of form C(i) denotes that i is an instance of
concept C, while a role assertion R(i, j) means that the objects i and j
are related through role R.
Concepts and roles may either be atomic (referred to by a concept name or
a role name) or composite. A composite concept is built from atomic
concepts using constructors. The expressiveness of a DL language depends
on the constructors allowed for building composite concepts or roles.
Obviously there is a trade-off between expressiveness and inference
complexity.
We use the language ALCN(D) in this paper. ALCN(D) concept expressions
(often simply referred to as concepts) are built from role names, concept
names (atomic concepts) and the top and bottom concepts (⊤ and ⊥) using
the following constructors: intersection (C ⊓ D), union (C ⊔ D), negation
(¬C), value restriction (∀R.C), existential restriction (∃R.C) and number
restrictions (≥ n R and ≤ n R). Here, C and D are concept expressions and
R is a role name. The two kinds of number restrictions are jointly
referred to as (⋈ n R). In ALCN(D) we can also use concrete domains, such
as integers or strings, when building concepts. For a detailed
introduction to description logics we refer the reader to the first two
chapters of [1].
2.2 Logic programming and Prolog
The main idea of Logic Programming is to use mathematical logic as a
programming language. The execution of a logic program can be viewed as a
reasoning process.
Prolog (Programming in Logic) [26] is the first and so far the most widely
used logic programming language. Prolog uses Horn-clauses and SLD
resolution [25] for reasoning. The basic elements of the Prolog execution
process are procedure invocation based on unification and backtracking
[28].
Prolog, and logic programming in general, is successfully used in several
areas of computer science. These include natural language processing,
planning, different kinds of reasoning systems, and information
integration.
The notion of term is a principal concept of the Prolog language. It is
either (a) a simple value (number, string) or (b) a variable or (c) a
structure with a name and an arbitrary number of arguments. These
arguments are Prolog terms themselves. The name and the arity of a term
together are referred to as the functor of the term. A Prolog structure
with three arguments can be seen below:
    'Work:class:220'(DT, [A, B, C, D, E], _)                            (1)
Here the name of the structure is 'Work:class:220'. The first and the
third arguments are variables. These are denoted by identifiers starting
with a capital or an underline. A single underline (_) is an anonymous
variable, the value of which is of no interest. Multiple occurrences of
such anonymous variables are considered different. The second argument of
(1) is a structure in a special list notation. A list is actually a
recursive structure [Head|Tail], consisting of a Head (its first element)
and a Tail, which is a list of the remaining elements. The list in the
second argument contains five variables and is given in a simplified
notation, i.e. [A,B,C,D,E], in fact, corresponds to
[A|[B|[C|[D|[E|[]]]]]]. Here [] represents an empty list (a list with no
elements).
A Prolog program consists of a set of clauses of form Head :- Body,
meaning Head is implied by Body. The Head is a term, while the Body is a
term or a comma-separated sequence of terms. Here the comma denotes a
conjunction. Clauses whose heads have the same functor are grouped
together into predicates. The name of a predicate is the shared structure
name of the heads of its clauses.
A Prolog goal (query) has the same form as a clause body. The execution of
a goal wrt. a Prolog program succeeds if an instance of the goal can be
deduced [...] substitutions as results. For example, let us consider the
goal shown below.
    'Writer:class:234'(ID), 'Painter:class:236'(ID)
This complex goal consists of two goals, separated by a comma. It succeeds
if there is such an instantiation of variable ID under which both goals
can be deduced from the given program (not shown here). The result of the
execution is the enumeration of such IDs. Informally, this query
enumerates those people who are writers and painters at the same time.
Further control constructs such as disjunction (Goal1 ; Goal2) and
negation \+ Goal are also supported by Prolog. The latter is the so called
negation by failure, which is not capable of enumerating solutions, but
just checks if the execution of Goal fails. There is a wide range of
built-in predicates, including ones for collecting all solutions of a goal
(e.g. bagof). For example,
    bagof(ID, ('Writer:class:234'(ID), 'Painter:class:236'(ID)), IDs)
will collect the identifiers of all people who are writers and painters
into the list IDs. An important property of bagof is that it can return
multiple solutions if not all variables in its second argument appear in
the first. For example, consider a predicate edge describing the edges of
a directed graph:
    edge(a,b). edge(a,c). edge(c,d). edge(d,a). edge(c,e).
By invoking the goal bagof(End, edge(Start, End), EndPoints) we collect
the endpoints of the edges. This goal will produce three answers, one for
each possible value of variable Start:
    Start = a, EndPoints = [b,c]
    Start = c, EndPoints = [d,e]
    Start = d, EndPoints = [a]
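The grouping behaviour of bagof on the edge example can be imitated in a few lines of Python. This is a sketch of the idea only, not of how a Prolog system implements bagof; the function name bagof_endpoints is hypothetical, and the results are sorted here merely to make the output order deterministic.

```python
from collections import defaultdict

# The facts edge(Start, End) from the text, as (start, end) pairs.
EDGES = [("a", "b"), ("a", "c"), ("c", "d"), ("d", "a"), ("c", "e")]


def bagof_endpoints(edges):
    """Group end points by start node: one answer per distinct start."""
    groups = defaultdict(list)
    for start, end in edges:
        groups[start].append(end)
    # Sorted by the free variable (Start) for a stable order of answers.
    return sorted(groups.items())


answers = bagof_endpoints(EDGES)
for start, end_points in answers:
    print(f"Start = {start}, EndPoints = {end_points}")
```

Each (start, end_points) pair plays the role of one answer substitution: the variable Start, which does not occur in the first argument of bagof, partitions the solutions.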
More about the Prolog language can be read in the ISO standard for Prolog
[26] and in textbooks, such as [28,10].
3 SINTAGMA System Architecture
The overall architecture of the SINTAGMA system can be seen in Figure 1.
The main idea of the system is to collect and manage meta-information on
the sources to be integrated. These pieces of information are stored in
the Model Warehouse, in the form of UML-like models [12], constraints and
mappings. This way we can represent structural as well as non-structural
information, such as class invariants, etc. The Model Warehouse resides in
and is handled by the Model Manager component.
We use the term mediation to refer to the process of querying SINTAGMA
models. Mediation decomposes complex integrated queries to simple queries
answerable by individual information sources, and, having obtained data
from these, composes the results into an integrated form. Mediation is the
task of the Mediator component.
[Figure 1. Labels in the diagram: Meta Server, Data Server, Model Manager,
Model Warehouse, Comparator, Model Verifier, Unifier, Corr. generator,
Data Verifier, Spec Advisor, Model Im(Ex)port Agent, Configurator,
Mediator, Client Programs (Browser, Shell, ...), User, Modeling Tool
(Protege, Rose), and Wrappers (Relational, XML, RDF, HTML, Web Service).]
Fig. 1. The architecture of the SINTAGMA system
Access to heterogeneous information sources is supported by wrappers.
Wrappers hide the syntactic differences between the sources of different
kinds, by presenting them to upper layers uniformly, as UML models. These
models (called interface models) are entered into the Model Warehouse
automatically. The following subsections give a brief description of the
main SINTAGMA components.
3.1 The Model Manager
The Model Manager is responsible for managing the Model Warehouse (MW) and
providing integration support, such as model comparison and verification
(not covered in this paper). Here we focus on the role of the Model
Warehouse.
The content of the MW is given in the language called SILan which is based
on UML [12] and Description Logics [17]. The syntax of SILan resembles
IDL, the Interface Description Language of CORBA [19]. We demonstrate the
knowledge representation facilities of SINTAGMA by a simple SILan example
showing the relevant features of the meta-data repository (Figure 2).
The example describes the model Art containing two classes, Artist and
Work. It also contains an association hasWork between artists and their
works. We will explain the details of this example below.
 1 model Art {
 2   class Artist: BuiltIns::DLAny {
 3     attribute String name;
 4     attribute Integer birthDate;
 5     constraint self.creation.date > 1900;
 6   };
 7
 8   class Work: BuiltIns::DLAny {
 9     attribute String title;
10     attribute String author;
11     attribute Integer date;
12     attribute String type;
13     primary key title;
14   };
15
16   association hasWork {
17     connection Artist as creator;
18     connection Work as creation;
19   };
20 };
Fig. 2. SILan representation of the model Art
Semantics of SILan models. The central elements of SILan models are
classes and associations, since these are the carriers of information. A
class denotes a set of entities called the instances of the class.
Similarly, an n-ary association denotes a set of n-ary tuples of class
instances called links.
Classes can have attributes which are defined as functions mapping the
class to a subset of values allowed by the type of the attribute. Classes
can inherit from other classes. All instances of the descendant class are
instances of the ancestor class, as well. In our example both Artist and
Work inherit from the built-in class BuiltIns::DLAny (footnote 1; cf.
lines 2 and 8). See Section 4.3 for more details.
Associations have connections, an n-ary association has n connections. In
an association some of the connections can be named, providing intuitive
navigation. For example, the connections of association hasWork,
corresponding to classes Artist and Work, are called creator and creation,
respectively (lines 17-18).
Classes can have a primary key, composed of one or more attributes. This
specifies that the given subset of the attributes uniquely identifies an
instance of the class. In our example, as a gross simplification,
attribute title serves as a key in class Work, i.e. there cannot be two
works (books, for example) with the same title.
Footnote 1: In SILan double colons (::) separate the model name from the
name of its constituent (class, association, etc.).
[...]ject constraint extension of UML, the OCL language [9]. Invariants
give statements about instances of classes (and links of associations)
that hold for each of them. The constraint in the declaration of Artist
(line 5) is an invariant stating that the publication date of each work of
an artist is greater than 1900 (footnote 2). The identifier self refers to
an arbitrary instance of the context, in this case the class Artist. Then
two navigation steps follow. In the first step we navigate through the
association hasWork to an arbitrary piece of work of the artist, while in
the second step we go from the work to its publication date, and finally
state that this date is always greater than 1900.
In addition to the object oriented modelling paradigm, the SILan language
also supports constructs from the Description Logic (DL) world [17]. This
recently added feature of SINTAGMA is discussed in Section 4.
Abstractions. For mediation, we need mappings between the different
sources and the integrated model. These mappings are called abstractions
because they often provide a more abstract view of the notions present in
the lower level models. An example abstraction called w0 can be seen in
Figure 3.
 1 abstraction w0 (m0: Interface::Product,
 2                 m1: Interface::Description
 3                 -> m2: Art::Work) {
 4
 5   constraint
 6     m1.id = m0.id and
 7     m1.category = "artwork"
 8   implies
 9     m2.title = m0.name and
10     m2.author = m0.creator and
11     m2.date = m0.creation_date and
12     m2.type = m1.subcategory and
13     m2.DL_ID = m0.name;
14 };
Fig. 3. SILan representation of the abstraction populating class Work
This abstraction populates the class Work (cf. Figure 2) in the model Art
using classes Product and Description, both from the model Interface
(lines 1-3). This means that the abstraction specifies how to create a
virtual instance of class Work, given that the other two classes are
already populated (e.g. they correspond to real information sources). In
lines 1-3 the identifiers m0, m1 and [...] to denote instances of the
appropriate classes.
Footnote 2: This may be so because the underlying information sources are
known to be dealing with works of art of the 20th century or later.
The abstraction describes that given an instance of class Product called
m0 and an instance of class Description called m1, for which the
conditions in lines 6-7 hold, there exists an instance m2 of class Work
with attribute values specified by lines 9-13 (footnote 3). Note that line
6 specifies that the id attributes of the two instances have to be the
same, and thus corresponds to a relational join operation. In our
integration scenario (see Section 6) Product and Description actually
correspond to real-world Oracle tables containing various products and
their descriptions, including books and paintings.
These two sources share the key id (line 6). While the first one supplies
four fields to Work objects (title, author, date and DL_ID), the
contribution of the second one is a single field (type). However, this
second table has information to ensure that only relevant products (works
of art) are included in class Work, through the condition in line 7.
We note that other abstractions can also populate class Work. In this case
the set of instances of Work will be the union of the instances produced
by the appropriate abstractions. Note that if a new information source is
added, we only have to specify a new abstraction corresponding to this
source, while the existing abstractions do not have to be modified.
Notice that the abstraction in Figure 3 takes the form of an implication
describing how the given sources can contribute to populating the high
level class Art::Work. This is characteristic of the Local as View
integration approach [6].
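To make the join reading of abstraction w0 concrete, the following Python sketch populates Work records from two in-memory tables. It is an illustration under stated assumptions, not the SINTAGMA mediator: the function name populate_work, the dictionary representation of rows, and all sample rows are invented here; only the attribute names and the conditions come from Figure 3.

```python
def populate_work(products, descriptions):
    """Create virtual Work records as prescribed by abstraction w0."""
    # Index descriptions by id so that m1.id = m0.id becomes a lookup.
    by_id = {}
    for d in descriptions:
        by_id.setdefault(d["id"], []).append(d)

    works = []
    for p in products:
        for d in by_id.get(p["id"], []):
            # Condition m1.category = "artwork" filters relevant products.
            if d["category"] == "artwork":
                works.append({
                    "DL_ID": p["name"],
                    "title": p["name"],
                    "author": p["creator"],
                    "date": p["creation_date"],
                    "type": d["subcategory"],
                })
    return works


# Invented sample rows standing in for the Product and Description tables.
products = [
    {"id": 1, "name": "Sample Novel", "creator": "A. Author",
     "creation_date": "1925"},
    {"id": 2, "name": "Sample Kettle", "creator": "K. Maker",
     "creation_date": "2001"},
]
descriptions = [
    {"id": 1, "category": "artwork", "subcategory": "book"},
    {"id": 2, "category": "household", "subcategory": "kitchen"},
]

works = populate_work(products, descriptions)
print(works)
```

A second abstraction populating Work would correspond to another such function, with the instances of Work being the union of the two result lists.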
3.2 The Wrappers
Wrappers provide a common interface for accessing various information
source types, such as relational and object-oriented databases,
semi-structured sources (e.g. XML or RDF), as well as Web-services.
A wrapper has two main tasks. First, it extracts meta-data from the
information source and delivers these to the Model Manager in the form of
SILan models. For example, in case of relational sources, databases
correspond to models, tables to classes, columns to attributes, as shown
in Figure 4.
The other principal task of a wrapper is to transform queries, formulated
in terms of this interface model, into the format required by the
underlying information source, and thus allow for running queries on the
sources.
3.3 The Mediator
The Mediator [2] supports queries on high level model elements by
decomposing them into interface model specific questions. This is
performed by creating a query plan satisfying the data flow requirements
of the sources. During the execution of this query plan the data
transformations described in the abstractions are carried out.
Footnote 3: Attribute DL_ID comes from the class DLAny, of which class
Work is a descendant. It has a special role, as explained in Section 4.3.

    database -> model,  table -> class,  column -> attribute

    Table Product in database Interface:
        name            String
        id              Integer
        creator         String
        creation_date   String
        ...

    model Interface {
      class Product {
        attribute String name;
        attribute Integer id;
        attribute String creator;
        attribute String creation_date;
        primary key id;
      };
    };
Fig. 4. Modelling relational sources in SILan
Whenever we query a model element in SINTAGMA, the Model Manager provides
the following two kinds of information to the Mediator:
1. the query goal itself, i.e. a Prolog term representing what to query;
2. a set of mediator rules, using which the Mediator can decompose the
   complex query into primitive ones (i.e. queries that refer only to
   interface models).
For example, let us consider the query shown below involving class Work.
    query RecentWork
    select *
    from w: Art::Work
    where w.date > 2000;
This query is looking for recent works, namely those instances of the
class Art::Work that were created after 2000 (footnote 4). In this case,
the query goal is similar to the following simple Prolog expression:
    :- 'Work:class:220'(DT, [A, B, C, D, E], DA), C > 2000.             (2)
Here, the first Prolog goal retrieves an instance of Art::Work. The
variables in this term will be instantiated during query execution. The
predicate name 'Work:class:220' is a concatenation of three strings: the
kind of the model element (class) and its unique internal identifier
(220), preceded by the unqualified, and thus non-unique, SILan name
(Work), provided for readability. Model elements are often referred to by
handles of form Kind(Id), e.g. [...] the instances queried for, as opposed
to the dynamic type which can be different, if the returned object belongs
to a descendant class of Work.
Footnote 4: We could have created a class named RecentWork and populated
it by an appropriate abstraction. Then, instead of formulating a SILan
query, we could have simply directly asked for the instances of this
class. The question whether to use a query or an abstraction is a
modelling decision.
The dynamic type of the queried instance, i.e. the handle of the most
specific class it actually belongs to, is returned in the first argument
of the goal. The second argument contains the values of the static
attributes, in this case we have five such variables (cf. declaration of
class Work in Figure 2). For example, C denotes the value of the attribute
date. The third and last argument of the query term carries the values of
the dynamic attributes. These represent the additional attributes (not
known at query time) of the instance if it happens to belong to a
descendant class of Art::Work.
The second part of the query goal corresponds to a simple arithmetic OCL
constraint, which uses variable C representing the date attribute of the
work in question.
The mediator rules representing the abstraction w0 shown in Figure 3 take
the following form:
    'Product:class:190'(_,[Title,Id,Author,Date],_),
    'Description:class:191'(_,["artwork",Id,Type],_) --->
        'Work:class:220'(class(220),[Title,Title,Author,Date,Type],[])
The specific rule above describes how to create an instance of the class
Work whenever we have two appropriate instances of classes Product and
Description available. If there were more abstractions, the Mediator would
get more rules as there would be more than one possible way to populate
the given class.
Note that the mediator rules are also used to describe inheritance between
model elements. In such a case the dynamic type of the model element on
the right hand side of the rule is a variable (as opposed to the constant
class(220) above). This variable is the same as the dynamic type of the
model element on the left hand side. The dynamic attributes are propagated
similarly.
Finally, let us state that an n-ary association is implemented as an n-ary
relation, each argument of which is a ternary structure corresponding to a
class instance, similar to the first goal of (2). For example, a query
goal for the association hasWork (cf. Figure 2) has the following form:
    :- 'hasWork:association:227'(
           'Artist:class:218'(DT1,[DL_ID1,Name,Birthdate],DA1),         (3)
           'Work:class:220'(DT2,[DL_ID2,Title,Author,Date,Type],DA2)
       ).
4 DL modelling in SINTAGMA
Let us now introduce the new DL modelling capabilities of the SINTAGMA
system. First we discuss why we need Description Logic models during the
integration process and provide an introductory example. Then we present
the DL [...] usage. Finally, we summarise the tasks of the integration
expert when using DL elements during integration.
4.1 An introductory example
In the Model Warehouse we handle models of different kinds. We distinguish
between application and conceptual models. The application models
represent existing or virtual information sources and because of this they
are fairly elaborate and precise. Conceptual models, however, represent
mental models of user groups, therefore they are vaguer than the
application models.
Our experience shows that to construct such models it is more appropriate
to use some kind of ontological formalism instead of the relatively rigid
object oriented paradigm. Accordingly, we have extended our modelling
language to incorporate several description logic constructs, in addition
to the UML-like ones described earlier. In the envisioned scenario, the
high-level models of the users are formulated in description logic and via
appropriate definitions they are connected to lower-level models.
Mediation for a conceptual model follows the same idea we use for any
other model: the query is decomposed, following the definitions and
abstractions, until we reach the interface models (in general, through
some further intermediate models) which can be queried directly.
Before going into the details, we show an example to illustrate how DL
descriptions are represented in SILan (note that Writer and Painter are
both descendants of class Artist, but otherwise they are normal UML
classes; we will present more details about these classes in Section 6).
    model Conceptual {
      class WriterAndPainter {};
      constraint equivalent {                                           (4)
        WriterAndPainter,
        Unified::Writer and Unified::Painter};
    };
Here we define the class WriterAndPainter by providing a SILan constraint.
This constraint can be placed anywhere in the Model Warehouse: in the
example above we simply put it in the very model that declares the class
WriterAndPainter itself. The constraint actually corresponds to a DL
concept definition axiom: WriterAndPainter ≡ Writer ⊓ Painter. Namely, it
states that the instances of class WriterAndPainter are those (and only
those) who belong to the unnamed class containing the individuals who are
both writers and painters. Thus, DL concepts are defined using the Global
as View approach [6], as opposed to the Local as View techniques applied
in populating high-level classes using abstractions (cf. Section 3.1).
Note that the class WriterAndPainter could be created without DL support.
However, in that case the integration expert would have to go through a
much [...]ifying all its attributes and populating it with an appropriate
abstraction. This abstraction would have to implement the constraint (4),
through an appropriate join-like operation.
Now, with DL support, the expert simply formulates a very short and
intuitive DL axiom. We argue that this is easier for the expert to do, and
it also makes the content of the Model Warehouse more readable to others.
4.2 DL elements in SILan
From the DL point of view, SINTAGMA supports acyclic Description Logic
TBoxes containing only concept definition axioms, which are formulated in
an extension of the ALCN(D) language (see more below about the extension).
Only single atomic concepts, so called named symbols, can appear on the
left hand side of the axioms, such as WriterAndPainter in example (4). The
remaining atomic concepts, not appearing on the left hand side, are called
base symbols. Such a TBox is definitorial, i.e. the meaning of the base
symbols unambiguously defines the meaning of the named symbols. The base
symbols, in our case, correspond to normal SINTAGMA classes and
associations, e.g. Writer and Painter in the example (4). The ABox is a
set of concept and role assertions, as determined by the instances of the
classes which correspond to the base symbols participating in the TBox.
The DL concept constructors supported by SINTAGMA and their SILan
equivalents are summarised in Table 1. Note that this table actually
describes the possible concept formats on the right hand side of a
definition axiom, assuming that we have expanded the TBox (footnote 5).
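Expanding an acyclic TBox means replacing every named symbol on the right hand side of an axiom by its definition until only base symbols remain. The Python sketch below illustrates this under assumptions made only for the example: concept expressions are nested tuples such as ("and", "Writer", "Painter"), the TBox is a dictionary from named symbols to their definitions, and the second axiom (NoviceWriterAndPainter) is hypothetical.

```python
def expand(expr, tbox):
    """Replace named symbols by their definitions, recursively.

    Terminates because the TBox is assumed to be acyclic.
    """
    if isinstance(expr, tuple):
        # Composite concept: keep the constructor, expand the arguments.
        op, *args = expr
        return (op, *(expand(a, tbox) for a in args))
    if expr in tbox:
        # Named symbol: substitute its definition and keep expanding.
        return expand(tbox[expr], tbox)
    # Base symbol, role name or number: nothing to expand.
    return expr


tbox = {
    "WriterAndPainter": ("and", "Writer", "Painter"),
    # Hypothetical axiom, added only to show a nested expansion.
    "NoviceWriterAndPainter": (
        "and", "WriterAndPainter", ("atmost", 5, "hasPainting")),
}

expanded = expand("NoviceWriterAndPainter", tbox)
print(expanded)
```

After expansion the right hand side mentions only the base symbols Writer, Painter and hasPainting, which correspond to ordinary classes and associations.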
The only non-classical DL element in Table 1 is the concrete domain
restriction (the last line in the table). Such a restriction specifies a
subset of instances of the base concept A for which the given OCL
constraint holds. This is a generalisation of the idea of concrete domains
in the Description Logics world. Below we show an example of a concrete
SILan restriction describing those works whose type (i.e. the value of the
attribute type) is painting.
    class constraint Art::Work satisfies self.type="painting"
The reason we allow only concept definition axioms is that we aim to use
DL concepts to describe executable high-level views of information
sources. In this sense a DL concept is actually a syntactic variant of a
SILan query or a SILan class populated by an abstraction.
    DL construct              DL syntax   SILan equivalent
    Base concept              A           UML class
    Atomic role               R           UML association
    Intersection              C ⊓ D       C and D
    Union                     C ⊔ D       C or D
    Negation                  ¬C          not C
    Value restriction         ∀R.C        slot constraint R all values C
    Existential restriction   ∃R.C        slot constraint R some value C
    Number restriction        (⋈ n R)     slot constraint R cardinality i..j
    Top                       ⊤           DLAny
    Bottom                    ⊥           DLEmpty
    Concrete restriction                  class constraint A satisfies OCL
Table 1. DL-related constructs supported in SILan
Footnote 5: The expanded version of an acyclic TBox is obtained by
repeatedly replacing every named symbol on the right hand side of an axiom
by its definition. This process is repeated until no further named symbols
are left on the right hand side. The fact that the TBox is acyclic ensures
the termination of this process.
Note that this also implies that we use the Closed World Assumption (CWA)
in DL query execution. We argue that this is appropriate because of the
following three reasons. First, CWA automatically ensures that our DL
constructs are semantically compatible with other constructs in the
SINTAGMA system. Second, we argue that the Open World Assumption (OWA) is
applicable when we have only partial knowledge and would like to determine
the consequences of this knowledge, true in every universe in which the
axioms of this partial knowledge hold. In contrast with this, in the
context of information integration, our users would like to consider a
single universe, in which a base concept or a role denotes exactly those
individuals (or pairs of individuals) which are present in the
corresponding database. To illustrate this issue, let us consider the
following example: the concept of novice painter is defined to contain
painters having at most 5 paintings (for example, being a novice painter
may be a precondition for a government grant). To model this situation,
the integration expert creates the DL axiom shown below.
    NovicePainter ≡ Painter ⊓ (≤ 5 hasPainting)
However, querying this concept, using OWA, will provide no results in
general, as an open world reasoner would return an individual only if it
is provable that it has no more than 5 paintings. Practically, this is not
what the information expert wants.
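Under the Closed World Assumption the NovicePainter concept can be evaluated by simply counting the hasPainting links that are present in the data. The Python sketch below shows this reading; it is an illustration with invented identifiers and sample data, not the query goal SINTAGMA generates.

```python
def novice_painters(painters, has_painting, limit=5):
    """Closed-world evaluation of Painter and (at most `limit` hasPainting).

    Only the links listed in has_painting are taken to exist, so a
    painter with no listed links counts as having zero paintings.
    """
    counts = {}
    for painter, _painting in has_painting:
        counts[painter] = counts.get(painter, 0) + 1
    return sorted(p for p in painters if counts.get(p, 0) <= limit)


# Invented sample data: p1 has 3 paintings, p2 has 7, p3 has none.
painters = {"p1", "p2", "p3"}
has_painting = (
    {("p1", f"w{i}") for i in range(3)}
    | {("p2", f"v{i}") for i in range(7)}
)

result = novice_painters(painters, has_painting)
print(result)
```

An open world reasoner could not return p1 or p3 here, because nothing in the data rules out further, unlisted paintings; the closed-world count treats the database as complete.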
The third reason why we decided to use the closed world assumption is the
fact that we envisage handling huge amounts of data in the underlying
databases. Traditional, tableau based DL reasoners do not cope well with
large ABoxes [15]. [...] still not fast or not expressive enough [24]. By
using CWA we can implement DL queries using the well researched, efficient
database technology.
4.3 Modeling methodology and tasks of the integration expert
The integration expert is responsible for creating the DL axioms. Although
these are represented in SILan within the SINTAGMA system, the expert can
use any available OWL editor to create OWL descriptions. These
descriptions then can be loaded by the OWL importer of the SINTAGMA system
that basically realises an OWL-SILan translation (cf. the Model Im(Ex)port
box in Figure 1).
One thing the expert should take care of is to match the names of the base
symbols and the corresponding SINTAGMA classes and associations. This is
often done in two steps: first the integration expert creates concept
definition axioms using the widely accepted terminology of the domain, not
paying attention to the names of the model elements in the Model
Warehouse. Next, the expert provides additional definition axioms for each
base symbol connecting it with the proper model element. For example, we
could use names A and B instead of Writer and Painter in (4), provided
that we also encode in SILan the equivalents of the following DL axioms:
    A ≡ Writer
    B ≡ Painter
A further crucial issue is to decide how to identify the instances of the
base concepts, e.g. the instances of the class Writer and class Painter.
Without this, it is not possible to determine the instances of class
WriterAndPainter.
In a traditional DL ABox, an instance has a name that unambiguously
identifies it. In SINTAGMA, similarly to databases, an instance is
identified by the subset of its attribute values. For example, two writers
could be considered to be the same if their names match, assuming that
name is a key in class Writer.
The problem is that such keys are fairly useless when we compare instances
of different data sources. This is because, in general, we cannot draw any
direct conclusion from the relation of the keys belonging to instances
from different classes. For example, databases containing employees often
use numeric IDs as keys. Having two employees from different companies
with the same ID does not mean that we are talking about the same person.
Similarly, if the IDs of the employees do not match, they are not
necessarily different persons.
What we need is some kind of shared key that uniquely identifies the
instances of the classes participating in DL concept definitions. Luckily,
the object-oriented paradigm we use in SINTAGMA provides a nice way to
have such identifiers.
We have mentioned earlier that in SINTAGMA the notion of DL concept is a
syntactic variant of SINTAGMA class. This also means that the result of a
DL query is an ordinary instance that has to belong to some class(es). For
example, when we are looking for the instances that are elements of both
classes Writer and Painter we are actually interested in an artist
instance belonging to these [...] we use to describe a DL concept the
result must belong to some class that is a common ancestor (in terms of
inheritance) of the classes involved.
Instead of asking the integration expert to define such common ancestor
classes in an ad hoc way, we introduce the built-in class DLAny. This
class corresponds to the DL concept top (⊤) and it has only one attribute
called DL_ID, which is a key. We require that all the classes
participating in DL concept definitions are the descendants of DLAny
(footnote 6; cf. lines 2 and 8 of Figure 2). Because of the properties of
inheritance, attribute DL_ID will be a key in all of the descendant
classes, i.e. it will exactly serve as the global identifier we were
looking for.
Now, the task of the integration expert is to assign appropriate values to
the DL_ID attributes: she needs to extend the existing abstractions
populating the base symbols (classes) to also consider the attribute
DL_ID. By appropriate values we mean that the DL_IDs of two instances
should match if these instances are the same, and should differ otherwise.
An example for this can be seen in Figure 5 populating the class Writer,
which is part of a bigger integration scenario to be shown later in
Section 6.
 1 abstraction ap (m0: Interface::Member ->
 2                 m1: Unified::Writer) {
 3
 4   constraint let n = m0.fname.concat(" ").concat(m0.lname) in
 5     m1.name = n and
 6     m1.birthDate = m0.date and
 7     m1.member_id = m0.iwa_id and
 8     m1.style = m0.style and
 9     m1.DL_ID = n;
10 };
Fig. 5. Populating the DL_ID attribute of a base concept
This abstraction populates the class Writer from an interface class called
Member (lines 1-2), which represents a membership database of an imaginary
International Writer Association (IWA). Let us assume that the members of
this association have some kind of a unique identifier, such as the
membership number, present in the underlying database. It may be worth
bringing this key to the class Writer (line 7) as it makes it possible to
find writers efficiently if they happen to be IWA members. However, the
unique identifier from the DL point of view has to be different: in fact
it is the concatenation of the first and last name of the writer, with a
space in between (lines 4 and 9).
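The role of DL_ID as a shared key can be sketched in Python as follows. The example mirrors the abstraction of Figure 5 for writers and then matches the result against a second source by DL_ID. The function names, the dictionary representation and the sample records are assumptions made for illustration; in particular, the painter source shown here is hypothetical and already carries a DL_ID.

```python
def dl_id(first_name, last_name):
    """DL_ID as in Figure 5: first and last name with a space between."""
    return first_name + " " + last_name


def populate_writers(members):
    """Create Writer records from Member records (cf. Figure 5)."""
    writers = []
    for m in members:
        n = dl_id(m["fname"], m["lname"])
        writers.append({
            "DL_ID": n,
            "name": n,
            "birthDate": m["date"],
            "member_id": m["iwa_id"],  # source-specific key, kept as data
            "style": m["style"],
        })
    return writers


# Invented sample records.
members = [
    {"fname": "Anna", "lname": "Example", "date": 1950,
     "iwa_id": 17, "style": "poetry"},
    {"fname": "Carl", "lname": "Writer", "date": 1962,
     "iwa_id": 18, "style": "prose"},
]
painters = [{"DL_ID": "Anna Example"}, {"DL_ID": "Bob Sample"}]

writers = populate_writers(members)
# Instances from different sources are the same iff their DL_IDs match.
both = sorted({w["DL_ID"] for w in writers} & {p["DL_ID"] for p in painters})
print(both)
```

The membership number iwa_id could not play this role, since the painter source knows nothing about IWA membership.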
Footnote 6: Note that this is a necessary condition. As for any concept C,
C ⊑ ⊤ holds, any DL instance has to belong to the class corresponding to
⊤, i.e. to DLAny.
[...] (e.g. Person, see Figure 8) where the IWA number makes no sense and
so the member_id attribute is set to "n/a". Furthermore, we may want class
Writer to be a descendant of class Artist (cf. Figure 8), together with
some other classes, such as Painter. This requires a key that can be
computed from all the underlying sources, such as the name of the artist
(footnote 7).
To summarise, the integration expert has to perform the following tasks
when DL modelling is used during the integration process:
1. declare DL classes and for each provide corresponding definition
   axioms;
2. ensure that each base concept appearing in the definition axioms is:
   (a) inherited from class DLAny,
   (b) populated properly, i.e. its DL_ID attribute is filled
       appropriately.
5 Querying DL models in SINTAGMA
Now we turn our attention to querying DL concepts in SINTAGMA. As
described in Section 3.3, our task is to create a query goal and a set of mediator
rules. When we query a DL class, mediator rules are only generated for the base
symbols. As these are ordinary classes and associations, this process is exactly
the same as the one we use for cases without any DL construct involved. This
means that we can now focus on the construction of the query goal.
Recall that a SINTAGMA instance is characterised by three properties, as
exemplified by (2) on page 9: its dynamic type DT, its static attributes SAs and
its dynamic attributes DAs. Below we will use the variable name As to denote the
full attribute list of an instance, i.e. the concatenation of the static and
dynamic attribute values, with the exclusion of DL_ID.
A DL class has only a single static attribute, the DL_ID key. However, in
contrast with an object oriented query, a DL query may return an answer
that has multiple dynamic types. For example, when we enumerate the class
WriterAndPainter we get instances that belong to both classes Writer and
Painter (something which is not possible in standard UML modelling).
Accordingly, an answer to a DL query takes the form of a pair (ID, DTA), where ID
is the DL_ID⁸ containing the unique name of the DL instance (see Section 4.3),
while DTA is a Prolog structure containing the dynamic types of the answer, each
paired with the corresponding full attribute list. The DTA structure is thus
either a single DT-As pair or, recursively, two DTA structures joined using the
comma operator: (DTA1, DTA2).
Figure 6 describes the mapping from an arbitrary DL concept expression to
the corresponding query goal. Here we define a function Φ_C which, given an
arbitrary concept expression C, returns the corresponding query goal with two
arguments, ID and DTA. We define this function by considering the DL concept
constructors, as listed in Table 1.
⁷ This is also a simplification. More realistically, the key could be the name
together with the birth date.
⁸ We use the name ID instead of DL_ID for conciseness.
Φ_A(ID, DTA)      = A^N(DT, [ID|SAs], DAs), DTA = DT-(SAs ⊕ DAs)
Φ_{C⊓D}(ID, DTA)  = Φ_C(ID, DTA1), Φ_D(ID, DTA2), DTA = (DTA1, DTA2)
Φ_{C⊔D}(ID, DTA)  = (Φ_C(ID, DTA) ; Φ_D(ID, DTA))
Φ_{¬C}(ID, _)     = \+ Φ_C(ID, _)
Φ_{∃R.C}(ID, DTA) = R^N(R_D^N(DT, [ID|SAs], DAs), R_R^N(_, [ID2|_], _)),
                    Φ_C(ID2, _), DTA = DT-(SAs ⊕ DAs)
Φ_{∀R.C}(ID, DTA) = Φ_{R_D}(ID, DTA),
                    \+ (R^N(R_D^N(_, [ID|_], _), R_R^N(_, [ID2|_], _)),
                        Φ_{¬C}(ID2, _))
Φ_{⋈nR}(ID, DTA)  = bagof(Y, R^N(X, Y), Ys), length(Ys, S), condition_⋈(n, S),
                    X = R_D^N(DT, [ID|SAs], DAs), DTA = DT-(SAs ⊕ DAs)
Φ_⊤(_, _)         = true
Φ_⊥(_, _)         = false
Fig. 6. Transforming DL constructs into query goals
Let us consider the cases one by one. If we have a base class, we simply create
a query term representing the instances of the class, similar to the one in goal
(2), and then convert the attributes retrieved to the required form (DTA). Here
operation ⊕ denotes the compile time concatenation of lists⁹, while A^N stands
for the predicate name corresponding to concept A. For example, Work^N =
'Work:class:220', cf. (2) on page 9. Note that in the second argument of the
query goal A^N we make use of the fact that the DL_ID attributes are always
placed first in the static attribute list of an instance.
If we have the intersection of two concepts C and D, we recursively transform
concepts C and D and put them in a Prolog conjunction. The DTA structure is
built from the structures recursively obtained from the execution of the
transformations of concepts C and D. Note that the resulting structure may contain
duplicates, i.e. the same DT-As pair may be found in DTA more than once. These
duplicates are only removed at the top level, i.e. when the final result of a query
is presented. The transformation of union concepts is similar to the intersection:
we create a Prolog disjunction.
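The top-level clean-up of a DTA structure can be sketched as follows. The predicate names dta_pairs/2, dedup/2 and top_level_answer/3 are hypothetical and the text does not show how SINTAGMA performs this step; the sketch only assumes that DTA is a ground term built from DT-As pairs and the comma operator, as described above.

```prolog
:- use_module(library(lists)).   % append/3, memberchk/2

% dta_pairs(+DTA, -Pairs): flatten a DTA structure into a list of DT-As pairs.
dta_pairs((Left, Right), Pairs) :- !,
    dta_pairs(Left, LeftPairs),
    dta_pairs(Right, RightPairs),
    append(LeftPairs, RightPairs, Pairs).
dta_pairs(DT-As, [DT-As]).

% dedup(+Pairs, -Unique): drop repeated pairs, keeping the first occurrence.
dedup(Pairs, Unique) :- dedup(Pairs, [], Unique).

dedup([], _, []).
dedup([P|Ps], Seen, Us) :- memberchk(P, Seen), !, dedup(Ps, Seen, Us).
dedup([P|Ps], Seen, [P|Us]) :- dedup(Ps, [P|Seen], Us).

% top_level_answer(+ID, +DTA, -Answer): the answer as presented to the user.
top_level_answer(ID, DTA, (ID, Unique)) :-
    dta_pairs(DTA, Pairs),
    dedup(Pairs, Unique).
```

For instance, a DTA of the form (W, (P, W)), where W and P are DT-As pairs, is reduced to the two-element list [W, P].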
⁹ The ⊕ operator is used only with a static attribute list (SAs). For any given
base class, the length of the corresponding SAs is fixed (the number of static
attributes excluding the DL_ID). Therefore, the SAs ⊕ DAs concatenation can be
carried out at compile time.
Negation ¬C is implemented by using the Prolog negation-as-failure. This
translation is only capable of checking whether a given instance with ID belongs to
concept C or not. As usual in the database context, we restrict the use of
negation to cases where negated queries appear only in conjunction with at least
[...] negated concepts have to appear either in the scope of a quantifier, or in an
intersection together with at least one non-negated concept. It is the task of the
Mediator to find an appropriate order in the final query plan where negation
appears in a place where ID is instantiated [5]. The Mediator refuses to execute
the query if such an order does not exist.
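The ordering requirement can be illustrated on a toy example. The facts and predicate names below are hypothetical and stand in for base-class query goals; the sketch only shows why negation-as-failure needs an instantiated ID, not how the Mediator reorders goals.

```prolog
% Hypothetical toy data.
writer('Lisa James').
writer('Tom Reed').
painter('Lisa James').

% Writer and not Painter, positive goal first: ID is bound when \+ is called,
% so the negation acts as a membership check.
writer_not_painter(ID) :-
    writer(ID),
    \+ painter(ID).

% The same conjunction in the wrong order: \+ is called with ID unbound, so it
% merely tests whether any painter exists and then fails for this data.
not_painter_writer(ID) :-
    \+ painter(ID),
    writer(ID).
```

With these facts the first predicate enumerates 'Tom Reed', while the second one produces no answers at all, which is why a plan with the negation in an uninstantiated position has to be rejected.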
The next two cases involve associations. On the right hand side of these
formulas R^N denotes the predicate name corresponding to the association itself.
R_D (R_R) denotes the base class that is the domain (range) of association R.
Correspondingly, R_D^N and R_R^N stand for the predicate names of the classes
R_D and R_R, respectively¹⁰. Recall that a binary association is represented by a
binary relation with ternary structures as arguments, as in (3).
The existential restriction ∃R.C is simply transformed to a query of the
association R and the concept C. The goal corresponding to a value restriction
∀R.C first enumerates the domain of R and then uses double negation to ensure
that the given instance has no R-values which do not belong to C. Note that
Φ_{¬C}(ID2, _) is invoked only when ID2 is already instantiated.
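The double-negation scheme for value restrictions can be tried out on toy data. The predicates artist/1, has_work/2 and novel/1 are hypothetical, simplified stand-ins for the goals R_D^N, R^N and Φ_C; the sketch follows the shape of the ∀R.C line in Figure 6 rather than the actual generated code.

```prolog
% Hypothetical toy data.
artist(ann).
artist(bob).
artist(eve).
has_work(ann, w1).
has_work(ann, w2).
has_work(bob, w3).
novel(w1).
novel(w2).

% forall hasWork.Novel: enumerate the domain, then require that there is no
% hasWork-value which fails to be a novel. The inner \+ novel(W) is reached
% only after has_work/2 has bound W.
novelist(ID) :-
    artist(ID),
    \+ ( has_work(ID, W), \+ novel(W) ).
```

Here ann qualifies because all her works are novels, bob does not, and eve qualifies vacuously because no works are recorded for her.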
A number restriction (⋈ n R) is transformed into a goal which uses the Prolog
built-in predicate bagof (cf. Section 2.2, page 4) to enumerate the instances in
the domain of R together with the number of R-values connected to them, and
then simply applies the appropriate arithmetic comparison.
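A minimal sketch of this counting scheme is given below. The facts, the predicate number_restriction/3 and the at_most/at_least tags of condition/3 are hypothetical; only bagof/3 and length/2 are taken from the description above.

```prolog
% Hypothetical toy association.
has_painting(ann, p1).
has_painting(ann, p2).
has_painting(bob, p3).

% condition(+Op, +N, +S): the arithmetic comparison of the restriction.
condition(at_most, N, S)  :- S =< N.
condition(at_least, N, S) :- S >= N.

% number_restriction(+Op, +N, -X): X has S has_painting-values and S
% satisfies the comparison. With X unbound, bagof/3 backtracks over the
% distinct values of X and collects the Y values for each of them.
number_restriction(Op, N, X) :-
    bagof(Y, has_painting(X, Y), Ys),
    length(Ys, S),
    condition(Op, N, S).
```

Note that bagof/3 fails when there are no solutions, so in this toy version an individual without any recorded painting is never enumerated, not even for an at_most restriction. How SINTAGMA treats that case is not stated in the text.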
The last two lines of Figure 6 define the transformation of the top and bottom
concepts. ⊤ is mapped into true, while ⊥ to false. Querying these concepts on
their own does not make sense, but these mappings are useful when transforming
DL concepts such as ∃R.⊤ or ∀R.⊥.
Having described the transformation of DL concepts to query goals, we now
deal with the only remaining construct: the concrete restriction. A concrete
restriction involving a base concept A and an OCL constraint O is transformed
in a straightforward way into the query goal as shown below¹¹:
Φ_A(ID, DTA), DTA = DT-AT, Ψ_O(ID, AT)
To illustrate the general algorithm, two example transformations are presented
in Figure 7. The first one shows the translation of the WriterAndPainter class
described in (4) on page 11. The query goal is a conjunction that consists of
three goals. The first two goals enumerate the instances of classes Writer and
Painter with the condition that their ID attributes match. At this point we have
identified those instances who are writers and painters at the same time. The
last goal constructs the structure DTA, describing the dynamic types and the
corresponding attribute values of the given instances.
¹⁰ For example, if R = hasWork, cf. Figure 2, then R^N = 'hasWork:association:227',
R_D^N = 'Artist:class:218' and R_R^N = 'Work:class:220'.
¹¹ Ψ_O(ID, AT) denotes the Prolog translation of the OCL constraint O. This is a
feature which has already been present in earlier versions of SINTAGMA, before the
introduction of the DL extensions, see [3].
[...] modern work. This DL concept involves the association hasWork and a class
Modern (representing, say, contemporary pieces of art). The query goal becomes
a bit more complex than in the first example: now it consists of four goals. The
first goal enumerates the instances of class Writer. The second and the third
goals filter out those writers that do not have any modern works. Here we have
used the facts that the domain of hasWork is the class Artist and the range is
the class Work (cf. Figure 2). Finally, the last goal builds the structure DTA.
Class to query: WriterAndPainter
DL definition:  Writer ⊓ Painter
Query goal:     'Writer:class:234'(DT1,[ID,Name1,Birth1,IWA,Style],DA1),
                'Painter:class:236'(DT2,[ID,Name2,Birth2,Colour],DA2),
                DTA = (DT1-[Name1,Birth1,IWA,Style|DA1],
                       DT2-[Name2,Birth2,Colour|DA2])

Class to query: ModernWriter
DL definition:  Writer ⊓ ∃hasWork.Modern
Query goal:     'Writer:class:234'(DT1,[ID,Name1,Birth1,IWA,Style],DA1),
                'hasWork:association:227'(
                    'Artist:class:218'(DT2,[ID,Name2,Birth2],DA2),
                    'Work:class:220'(_,[ID2|_],_)),
                'Modern:class:237'(_,[ID2|_],_),
                DTA = (DT1-[Name1,Birth1,IWA,Style|DA1],
                       DT2-[Name2,Birth2|DA2])
Fig. 7. Transformation examples
Note that if a writer has more than one piece of modern work, the
transformation in Figure 7 enumerates the writer multiple times. This is because the
second goal can succeed more than once, leaving a choice point [26]. In the present
version of SINTAGMA these duplicates are removed at the top level only, before
the query results are presented to the user. In future, we will consider a more
efficient solution, utilising the Prolog pruning operators (conditionals or cuts)
to eliminate the unnecessary choices.
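The effect of such pruning can be illustrated on a simplified version of the ModernWriter query. The facts and predicate names are hypothetical, and the conditional shown is only one possible way of pruning; it is not taken from SINTAGMA.

```prolog
% Hypothetical toy data.
writer('Lisa James').
writer('Tom Reed').
has_work('Lisa James', w1).
has_work('Lisa James', w2).
has_work('Tom Reed', w3).
modern(w1).
modern(w2).

% Plain translation: one answer per modern work of the writer.
modern_writer(ID) :-
    writer(ID),
    has_work(ID, W),
    modern(W).

% Pruned variant: the conditional commits to the first modern work found,
% so each writer is reported at most once.
modern_writer_pruned(ID) :-
    writer(ID),
    (   has_work(ID, W), modern(W)
    ->  true
    ;   fail
    ).
```

The pruned form is only appropriate because the identifier of the work is not part of the answer; if it were needed, committing to the first solution would lose results.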
Also note that in our example scenario attributes Name1, Name2 and Birth1,
Birth2 will be instantiated to the same values, i.e. to the name and birth date
of the modern writer. This is the consequence of the data representation we use
in SINTAGMA, i.e. if an instance has multiple dynamic types, for each of them
we supply all the attribute values.
In this section we present a simple use case, where we focus on illustrating
the DL extension of SINTAGMA. More complex traditional integration problems
solved using SINTAGMA are discussed in other papers, for example in [21].
Figure 8 shows the content of our example Model Warehouse. Here we have
four models on different abstraction levels.
[Figure 8: diagram of the Model Warehouse. Labels recoverable from the diagram:
information sources MySQL, XML, PostgreSQL, Oracle; models Interface, Unified,
Art and Conceptual (Description Logic classes); classes Member, Person, Exhibitor,
Product, Description, Writer, Painter, Artist, Work; DL classes PainterWriter,
Novel, NovicePainter; associations hasWork, hasPainting; links labelled
abstraction and generalisation.]
Fig. 8. Content of the Model Warehouse
The lowest one, Interface, contains classes directly corresponding to the
information sources we aim to integrate. Class Member corresponds to some
database table containing information about writers (members of a certain
writers association); Person is the model of an XML source describing people (some
of whom are possibly writers). We also have here class Exhibitor containing,
among others, products, together with class Description, which provides some
information on products. These models are constructed automatically by different
wrappers of the SINTAGMA system.
The next, more abstract model, called Unified, contains two classes, Writer
and Painter; their SILan descriptions are shown in Figure 9 (referring to class
Artist introduced in Figure 2 on page 6). These classes provide a unified view
of writers and painters over our heterogeneous information sources, i.e.
querying Writer and Painter gives us all the known writers and painters,
respectively. These classes are populated by SILan abstractions: Writer by two, while
Painter by only one. We can later extend our Model Warehouse to include
more information sources on painters. This way Painter would also be
populated by several abstractions. Please note how flexible this approach is: whenever
we would like to add a new information source, all we have to do is to provide a
new abstraction. This is fundamentally different from the way views are created
in traditional database systems.
model Unified {
  class Writer: Art::Artist {
    attribute Integer member_id;
    attribute String style;
  };
  class Painter: Art::Artist {
    attribute String favourite_colour;
  };
};
Fig. 9. SILan description of classes Writer and Painter
The third model, Art, describes an even higher view of the underlying
information sources. It contains two classes connected by an association. Class Artist is
declared to be the generalisation of classes Writer and Painter, i.e. Artist is a
common parent of Writer and Painter, in terms of inheritance. Accordingly,
it contains the union of the instances of these classes. Class Work incorporates
works (books and paintings). In the example, class Work is populated by only
one abstraction. Association hasWork connects instances in class Artist with
those in class Work, i.e. it allows us to navigate from an artist to her works. This
association is populated by an abstraction (not shown in Figure 8) by creating
virtual pairs from those instances of classes Artist and Work where the author
of the work matches the name of the artist.
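The way such an association is populated can be pictured as a join on the author name. The following Prolog sketch uses hypothetical predicates and invented sample data; the actual SILan abstraction is not shown in the text.

```prolog
% Hypothetical instances: artist(Name, BirthYear), work(WorkId, Title, Author).
artist('Lisa James', 1965).
artist('Tom Reed', 1970).
work(w1, 'Dragon Song', 'Lisa James').
work(w2, 'Red Hills', 'Lisa James').

% has_work(Artist, WorkId): a virtual pair exists whenever the author of the
% work matches the name of the artist.
has_work(Name, WorkId) :-
    artist(Name, _),
    work(WorkId, _, Name).
```

No pairs are stored anywhere: they are computed on demand from the two classes, which is what makes the association virtual.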
Note that there is one more association in the Model Warehouse, called
hasPainting. This association connects painters with their paintings and goes
[...] populated by an abstraction, not shown here. Association hasPainting is used
in the definition of PainterWriter (see below).
Up until now we have used the traditional features of SINTAGMA: classes,
associations, generalisations, abstractions. Now we turn to the most abstract
model, named Conceptual, which provides an even higher-level view of the
information than the previous model.
The model Conceptual represents the knowledge of our specific example
domain, in the form of DL concept definition axioms. These axioms form a simple
ontology, a part of which is shown in Figure 10. This ontology talks about special
types of artists, painters and writers. It states that a novice painter is a painter
who has painted no more than 5 paintings (axiom 1). Somebody is a mostly
writer if she is an artist who has produced at least 3 works, but has at most one
painting (axiom 2). A productive writer is a mostly writer who has created at
least 10 works (axiom 3). Somebody is a painter-writer if she is a writer who has
some paintings (axiom 4). Finally, a novelist is somebody who is only writing
novels (axiom 5).
NovicePainter    ≡ Painter ⊓ (≤ 5 hasPainting.⊤)                      (1)
MostlyWriter     ≡ Artist ⊓ (≥ 3 hasWork.⊤) ⊓ (≤ 1 hasPainting.⊤)     (2)
ProductiveWriter ≡ MostlyWriter ⊓ (≥ 10 hasWork.⊤)                    (3)
PainterWriter    ≡ Writer ⊓ (∃ hasPainting.⊤)                         (4)
Novelist         ≡ ∀ hasWork.Novel                                    (5)
. . .            ≡ . . .
Fig. 10. An ontology describing artists, painters and writers.
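To make the closed-world reading of such axioms concrete, axioms (1) and (4) can be evaluated over a toy data set as sketched below. All facts and predicate names are hypothetical, and the clauses are hand-written illustrations rather than the goals SINTAGMA generates.

```prolog
% Hypothetical toy data: bob is a painter with no recorded paintings.
painter(ann).
painter(bob).
writer(ann).
has_painting(ann, p1).

% count_paintings(+X, -N): findall/3 yields [] when nothing is recorded,
% so such an individual gets N = 0.
count_paintings(X, N) :-
    findall(P, has_painting(X, P), Ps),
    length(Ps, N).

% Axiom (1): a novice painter is a painter with at most 5 paintings.
novice_painter(X) :-
    painter(X),
    count_paintings(X, N),
    N =< 5.

% Axiom (4): a painter-writer is a writer with some painting. The double
% negation tests existence without producing one answer per painting.
painter_writer(X) :-
    writer(X),
    \+ \+ has_painting(X, _).
```

Under the Closed World Assumption bob is classified as a novice painter because no paintings of his are known; an open-world reasoner could not draw this conclusion from the same data, since further paintings might exist.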
In practice, such an ontology can be created by the information expert
manually or can be imported from an existing ontology using the OWL importer
component of the SINTAGMA system. In SINTAGMA this ontology is
represented by a model containing classes with no attributes, together with the
corresponding SILan constraints as shown below:
model Conceptual {
  class NovicePainter {}; class MostlyWriter {};
  class ProductiveWriter {}; class PainterWriter {};
  class Novelist {}; ...
  constraint equivalent {
    NovicePainter,
    Painter and {slot constraint hasPainting cardinality 0..5}
  };
  ...
};
[...] Most of these (i.e. Painter, Writer and Artist) appear in the underlying UML
models. However, there is the concept of Novel, which has no direct UML
counterpart. This concept can be defined using a concrete restriction of SILan, as
shown below.
constraint equivalent {
  Novel,
  {class constraint Art::Work satisfies self.type="novel"}
};
This concludes the description of our example models. Having encoded our DL
axioms in terms of SILan constraints, we can now execute DL queries. For
example, we can ask SINTAGMA to enumerate the instances of class
ProductiveWriter. This query will produce instances similar to the following:
1 ('Lisa James',
2   [
3     'Writer'-['Lisa James', 1965, 42, 'fantasy'],
4     'Painter'-['Lisa James', 1965, 'red']
5   ]
6 )
Here, the string 'Lisa James', appearing in line 1, corresponds to the ID of
Figure 6, i.e. the shared DL identifier. Lines 3-4 contain the list of the dynamic
types and corresponding attributes of the instance. This specific instance has
two dynamic types: she is a writer and a painter at the same time (lines 3 and
4). As a writer, she has a name, birth date, her membership ID and a style
attribute. As a painter we also know her favourite colour.
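An answer of this shape is an ordinary Prolog term and can be inspected with a few lines of code. The helper predicates below are hypothetical and assume that the answer is presented exactly as printed above, i.e. as a pair of the identifier and a list of DT-As pairs.

```prolog
:- use_module(library(lists)).   % member/2, memberchk/2

% answer_types(+Answer, -Types): the dynamic types listed in an answer of
% the form (ID, [DT-As, ...]).
answer_types((_ID, Pairs), Types) :-
    findall(DT, member(DT-_, Pairs), Types).

% attributes_as(+Answer, +DT, -As): the attribute list recorded for type DT.
attributes_as((_ID, Pairs), DT, As) :-
    memberchk(DT-As, Pairs).
```

Applied to the answer shown above, answer_types/2 yields the two types 'Writer' and 'Painter', and attributes_as/3 with 'Painter' retrieves the name, the birth date and the favourite colour.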
7 Related work
The two main approaches in information integration are the Local as View
(LAV) and the Global as View (GAV) [6]. In the former, sources are defined
in terms of the global schema, while in the latter, the global schema is defined
in terms of the sources (similarly to the classical views in database systems).
Information Manifold [20] is a good example of a LAV system. Examples of
the GAV approach include the Stanford-IBM integration system TSIMMIS [8],
and the DL based integration system called Observer [23].
In SINTAGMA we apply a hybrid approach, i.e. we use both LAV and GAV.
When using abstractions to populate high-level classes we employ the LAV
principle, while in case of DL class definitions we use the GAV approach.
There are several completed and ongoing research projects in the area of using
description logic-based approaches for both Enterprise Application Integration
(EAI) and Enterprise Information Integration (EII).
[...] Architecture, and the provision of new capabilities within the framework of
Semantic Web Services. Examples of such research projects include DIP [16] and
INFRAWEBS [13]. These projects aim at the semantic integration of Web
Services, in most cases using Description Logic based ontologies and Semantic Web
technologies. Here, however, DL is used mostly for service discovery and
design-time workflow validation, but not during query execution.
On the other hand, several logic-based EII tools use DL and take a similar
approach as we did in SINTAGMA. That is, they create a DL model as a view over
the information sources to be integrated. The basic framework of this solution
is described e.g. in [7,4]. The fundamental difference with our approach is that
these applications deal with the classical Open World Assumption, as already
discussed in Section 4.2. We argue that existing DL reasoners are not usable
when large amounts of data and complex DL queries are involved [15,18,24].
On the theoretical side an interesting description logic is ALCK [11], which
adds a non-monotonic K operator to the ALC language to provide the ability to
use both the CWA and the OWA, when needed. ALCK has several implementations;
the Pellet reasoner [27], for example, supports this logic. However, ALCK
lacks the ability to express cardinality constraints, which is a feature frequently
used in information integration scenarios.
Finally, we mention that the Description Logic Programming (DLP)
approach, first introduced in [14], also employs the idea of translating DL axioms
into Prolog goals (cf. the approach summarised in Figure 6). In contrast with
our approach, DLP uses the Open World Assumption and does not deal with
negation and cardinality restrictions.
8 Conclusions
In this paper we have presented the DL extension of the information
integration system SINTAGMA. This extension allows the information expert to use
Description Logic based ontologies in the development of high abstraction level
conceptual models. Querying these models is performed using the Closed World
Assumption over the underlying information sources.
We have presented the main components of the SINTAGMA system: the
Model Manager, which is responsible for maintaining the Model Warehouse
repository, the Wrapper, which provides a uniform view over the heterogeneous
information sources, and the Mediator, which decomposes complex high-level queries
into primitive ones answerable by the individual information sources.
Next, we have described the newly introduced DL modelling elements the
integration expert can use when building conceptual models and we have also
discussed the modelling methodology she has to follow. We have defined a
transformation of DL queries to Prolog goals, used in the SINTAGMA system for DL
query execution. We have also illustrated our approach by providing a use case
about artists and their works.
[...] used alone for solving complex modelling problems, some kind of hybrid
techniques are necessary. We argue that our solution for combining DL and UML
modelling in a unified integration framework provides a viable alternative to
existing systems. The usage of DL constructs in building high-level conceptual
models has substantial benefits, both in terms of modelling efficiency and
maintenance.
Acknowledgements
The authors acknowledge the support of the Hungarian NKFP programme
for the SINTAGMA project under grant no. 2/052/2004. We thank all the people
participating in this project, especially Tamás Benkő, the lead architect.
We are also grateful to the anonymous reviewers of a preliminary version of
this paper [22], for their insightful comments.
References
1. F. Baader, D. Calvanese, D. McGuinness, D. Nardi, and P. F. Patel-Schneider,
editors. The Description Logic Handbook: Theory, Implementation and Applications.
Cambridge University Press, 2003.
2. Liviu Badea and Doina Tilivea. Query Planning for Intelligent Information
Integration using Constraint Handling Rules, 2001. IJCAI-2001 Workshop on Modeling
and Solving Problems with Constraints.
3. T. Benkő, P. Krauth, and P. Szeredi. A logic based system for application
integration. In Proceedings of the 18th International Conference on Logic Programming,
ICLP 2002. Springer, LNCS, 2002.
4. A. Borgida, M. Lenzerini, and R. Rosati. Description logics for databases. In
Description Logic Handbook, pages 462-484, 2003.
5. András G. Békés and Péter Szeredi. Optimizing Queries in a Logic-based
Information Integration System. In Wim Vanhoof and Patricia Hill, editors, Proceedings of
the 17th Workshop on Logic-based methods in Programming Environments (WLPE
2007), pages 1-15, Porto, Portugal, 2007.
6. D. Calvanese, D. Lembo, and M. Lenzerini. Survey on methods for query rewriting
and query answering using views. Tech. report, University of Rome, April 2001.
7. Diego Calvanese, Giuseppe De Giacomo, Maurizio Lenzerini, Daniele Nardi, and
Riccardo Rosati. Description logic framework for information integration. In
Principles of Knowledge Representation and Reasoning, pages 2-13, 1998.
8. S. Chawathe, H. Garcia-Molina, J. Hammer, K. Ireland, Y. Papakonstantinou,
J. Ullman, and J. Widom. The TSIMMIS project: Integration of heterogeneous
information sources. In 16th Meeting of the Information Processing Society of
Japan, pages 7-18, Tokyo, Japan, 1994.
9. T. Clark and J. Warmer, editors. Object Modeling with the OCL: The Rationale
behind the Object Constraint Language, volume 2263 of LNCS. Springer, 2002.
10. William F. Clocksin and C. S. Mellish. Programming in PROLOG. Springer-Verlag
New York, Inc., Secaucus, NJ, USA, 1994.
11. [...] operator for description logics. Artif. Intell., 100(1-2):225-274, 1998.
12. Martin Fowler and Kendall Scott. UML Distilled: Applying the Standard Object
Modeling Language. Addison-Wesley, 1997.
13. Vladislava Grigorova. Semantic description of web services and possibilities of
BPEL4WS. Information Theories and Applications, 13(2):183-187, 2006.
14. Benjamin N. Grosof, Ian Horrocks, Raphael Volz, and Stefan Decker. Description
logic programs: Combining logic programs with description logic. In Proc. of the
Twelfth International World Wide Web Conference (WWW 2003), pages 48-57.
ACM, 2003.
15. V. Haarslev and R. Möller. Optimization techniques for retrieving resources
described in OWL/RDF documents: First results. In Ninth International Conference
on the Principles of Knowledge Representation and Reasoning, KR 2004, Whistler,
BC, Canada, June 2-5, pages 163-173, 2004.
16. M. Hepp, F. Leymann, J. Domingue, A. Wahler, and D. Fensel. Semantic business
process management: A vision towards using semantic web services for business
process management, 2005.
17. Ian Horrocks. Reasoning with expressive description logics: Theory and practice.
In Proc. of the 18th Int. Conf. on Automated Deduction (CADE 2002), number
2392 in Lecture Notes in Artificial Intelligence, pages 1-15. Springer, 2002.
18. U. Hustadt, B. Motik, and U. Sattler. Reasoning for description logics around
SHIQ in a resolution framework. Technical Report 3-8-04/04, June 2004.
19. Interface Definition Language. ISO International Standard, number 14750.
20. T. Kirk, A. Y. Levy, Y. Sagiv, and D. Srivastava. The Information Manifold.
In C. Knoblock and A. Levy, editors, AAAI Spring Symposium on Information
Gathering from Heterogeneous, Distributed Environments, 1995.
21. Gergely Lukácsy, Tamás Benkő, and Péter Szeredi. Towards automatic semantic
integration. In Enterprise Interoperability II, New Challenges and Approaches,
Proceedings of the I-ESA 2007, pages 795-806, Funchal, Portugal, 2007. Springer.
22. Gergely Lukácsy and Péter Szeredi. Ontology based information integration using
logic programming. In Edna Ruckhaus, editor, Proceedings of the 2nd International
Workshop on Applications of Logic Programming to the Web, Semantic Web and
Semantic Web Services (ALPSWS 2007), pages 39-54, Porto, Portugal, 2007.
23. Eduardo Mena, Vipul Kashyap, Amit P. Sheth, and Arantza Illarramendi.
OBSERVER: An approach for query processing in global information systems based
on interoperation across pre-existing ontologies. In Conference on Cooperative
Information Systems, pages 14-25, 1996.
24. Zsolt Nagy, Gergely Lukácsy, and Péter Szeredi. Translating description logic
queries to Prolog. In Proc. of PADL, Springer LNCS 3819, pages 168-182, 2006.
25. Linh Anh Nguyen. A fixpoint semantics and an SLD-resolution calculus for modal
logic programs. Fundam. Inf., 55(1):63-100, 2003.
26. ISO Prolog standard, 1995. ISO/IEC 13211-1.
27. Evren Sirin, Bijan Parsia, Bernardo Cuenca Grau, Aditya Kalyanpur, and Yarden
Katz. Pellet: A practical OWL-DL reasoner. Web Semant., 5(2):51-53, 2007.
28. Leon Sterling and Ehud Shapiro. The art of Prolog: advanced programming
techniques. MIT Press, Cambridge, MA, USA, 1986.