Theoretical Linguistics Programme, Budapest University (ELTE)

COMPOSITIONAL INTERPRETATION

OF COMPUTER COMMAND LANGUAGES

Gábor Rádai and László Kálmán

Research Institute for Linguistics, Hungarian Academy of Sciences Working Papers in the Theory of Grammar, Vol. 2, No. 2

Received: June 1995


COMPOSITIONAL INTERPRETATION OF COMPUTER COMMAND LANGUAGES

Gábor Rádai* and László Kálmán**

* Department of Symbolic Logic, Budapest University (ELTE)

** Research Institute for Linguistics, HAS, Room 119

** Theoretical Linguistics Programme, Budapest University (ELTE)

E-mail: radai@nytud.hu, kalman@nytud.hu

Working Papers in the Theory of Grammar, Vol. 2, No. 2
Supported by the Hungarian National Research Fund (OTKA)

Theoretical Linguistics Programme, Budapest University (ELTE)
Research Institute for Linguistics, Hungarian Academy of Sciences

Budapest I., P.O. Box 19. H-1250 Hungary

Telephone: (36-1) 175 8285; Fax: (36-1) 212 2050


0. Introduction

The aim of this paper is to examine the traditional concept of compositionality. We will be dealing with a language, namely, the language of commands used in the Unix operating system, the interpretation of which is intuitively far from compositional, although it fits the traditional definition of compositionality. We will outline the reason for this discrepancy, then we will show how to modify the language so that it receives an intuitively compositional interpretation. We show that this will get us closer to a more reasonable definition of the principle of compositionality and its significance for the semantics of natural languages.

The paper is organized as follows. In section 1 we present the Principle of Compositionality and argue that it is to be strengthened, because it is too loose in its original formulation. In particular, we introduce the Principle of Independence, and propose to include it into the Principle of Compositionality. The rest of the paper discusses a language, namely, the language of commands used in the Unix operating system, the interpretation of which is far from compositional in the intuitive sense of the word. However, the traditional Principle of Compositionality does not preclude such an interpretation. First, in section 2, we explain the concept of shells (command interpreters), and show how the Unix command language is non-compositional. Then we present an alternative command language which has a more natural interpretation, based on our version of the concept of compositionality. Section 3 informally presents the way in which such a ‘compositional Unix shell’ should work. Then we develop a language to talk about the semantic domains relevant to our interpretation, i.e., various components of a simplified concept of machine states (section 4). Then we explain the concept of denotational semantics (section 5), a non-procedural view of the interpretation of computer programs, which underlies the particular structure that we attribute to our semantic domains (section 6). The actual syntax and semantics of the language in which we can talk about those objects is given in section 7, and the description of the semantics of command lines (commands followed by parameters) will be explained in section 8. The way in which we produce those meanings from those of the command names and the parameters in a compositional way is explained in section 9. Finally, we offer some conclusions (section 10).

1. Compositionality

Let us first define the concept which will be in the centre of our attention throughout this paper. The interpretation of a language can be said to be compositional if and only if it obeys the Principle of Compositionality, which runs as follows:

1.1. The Principle of Compositionality

The meaning of a complex expression is a function of the meanings of its constituents and their mode of combination.

This definition leaves it open whether ‘the meanings of the constituents’ may depend on each other or on the function that we use to calculate the meaning of the complex expression. However, it seems that the Principle of Compositionality would be rather vacuous if we were to allow for such dependencies. That is, we understand that the intended content of the Principle of Compositionality implies a Principle of Independence:

1.2. The Principle of Independence

The meanings of the constituents of a complex expression are assigned independently of each other and the function that yields the meaning of the complex expression.

The reason why we propose to add this principle is that, as we will see shortly, languages that obey the Principle of Compositionality may still be rather ‘non-compositional’ if they fail to satisfy the Principle of Independence. In such languages, the meaning of an expression may vary depending on what it is a constituent of. As a result, very similar constructions (e.g., containing the same expression in the same syntactic role) may be interpreted in heterogeneous (or even unrelated) ways. We submit that this contradicts the intuition behind the concept of compositionality.

Note that the interpretation of compositionality proposed here implies that the meaning contributions of the constituents of an expression are constant, i.e., they do not vary from one construction to the other. This means a certain context-independence as well, which many would deny. We conceive of this as a price to pay for a reasonable concept of compositionality. In our approach, the context of utterance (and the utterance-internal context of any sub-expression) can only play a role inasmuch as both the meanings and the functions that combine them are underspecified. That is, by virtue of their underspecification, contextual factors (including the internal context, i.e., the presence of the others) may enrich these meanings. This kind of mechanism does not contradict the Principle of Independence, because it is not the meanings assigned that depend on each other, but what they become later on.

It is easy to see that the Principle of Independence is not vacuous at all. The interaction of meanings is by definition contentful, i.e., the Principle of Independence prevents meaning assignments from depending on formal properties of the context (e.g., the shape of a co-occurring constituent). Only genuine homonyms (homophonous expressions with independent meanings) challenge this principle; those have to be considered different expressions which accidentally are of the same shape. So whether an ambiguity is due to an accidental surface coincidence or a systematic semantic phenomenon must be determined independently.
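One way to make the two principles explicit, offered here as a schematic reading rather than as the authors' own formalization, is to write $[\![\cdot]\!]$ for the interpretation, $\sigma$ for a mode of combination and $f_\sigma$ for the semantic operation paired with it (all three symbols are notation assumed for this illustration):

$$[\![\sigma(e_1, \dots, e_n)]\!] = f_\sigma([\![e_1]\!], \dots, [\![e_n]\!])$$

On this reading, the Principle of Compositionality asks only that some such $f_\sigma$ exist. The Principle of Independence adds that each $[\![e_i]\!]$ is fixed without reference to the other $e_j$ or to $f_\sigma$, so that $[\![e_i]\!]$ is the same in every construction in which $e_i$ occurs.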


2. Unix shells

A shell is a program that establishes contact between the operating system of a computer and its user. Its task is to forward the user’s commands to the operating system (after a check of correctness). (A command is also called a command line; we will refer to it as a cml.) Many shells offer additional features to the user (such as abbreviatory mechanisms and ways of referring to commands issued earlier), as well as built-in commands. The shells used with the Unix operating system (especially the C-shell) offer many such features. The commands that do not exploit the extra possibilities offered by the shell may contain a command name (cmn) and various types of parameters that follow it. The command name is simply the name of a computer program; the program processes the parameters, so their interpretation is its ‘internal affair’. (Built-in shell commands do not correspond to programs; the parameters of such commands are processed by the shell itself.) The language also has certain operators (opr), which can be prefixed to any command line. They correspond to programs that run the remaining command line, and perform some uniform computation in the meantime.¹

The informal syntactic and semantic description of command lines is available in the form of manual pages provided with the operating system. A manual page contains the summary of the syntax associated with a command name followed by the description of what the command lines do. Let us take a look at the syntactic description of the command called grep:

2.1. Example

grep [ -bchilnsvy ] [ -f expfile ] [ [ -e ] expression ] [ files ]

First comes the specification of the command name, followed by the list of flags (fl). In the case of grep, these are one-character strings that can be concatenated in any order and their concatenation must be preceded by a minus sign. In general, we can think of a flag as any string containing no blank space and preceded by a minus sign. (Flags are in principle optional; in manual pages, [·] means optionality.) Then come two options, each consisting of an option letter and its argument. (An option letter is like a flag, but it has an argument.) The option letter in the second option is itself optional. Finally, the last item is an optional argument (opt), i.e., a parameter that has a fixed position in the command line which is not preceded by an option letter. In fact, the above syntactic summary is the abbreviation of two different syntactic possibilities:

2.1'. Example

a. grep [ -bchilnsvy ] [ -f expfile ] [ -e expression ] [ files ]

b. grep [ -bchilnsvy ] [ -f expfile ] [ expression ] [ files ]

In 2.1'a, we have nine flags, two options and an optional argument; in 2.1'b, there are nine flags, one option and two optional arguments.²

1 For example, the operator time will return the time the process given as its argument has taken to run.

In general, the syntax of the relevant fragment of the language of Unix command lines ($L_{(cml)}$) in BNF is as follows:

2.2. Definition

1. $cml =_{def} opr\ cml \mid cmn \mid cml\ fl \mid cml\ opt$;
2. $cmn^{n} =_{def} c^{n} \mid \dots \mid cmn^{n+1}\ expr \mid cmn^{n-1}\ opl$;
3. $opt =_{def} expr$;
4. $expr =_{def} n \mid c_0 \mid \dots \mid var_0 \mid \dots$

$c^n$ stands for $n$-argument command name constants, $n$ stands for natural numbers, and $c_n$ stands for a name constant denoting elements of the universe (files, directories, etc.), as we will see. As one can see from the definition, we assume that flags and options come at the end of command lines rather than between the command name and its arguments. This modification does not make any difference except for the fact that the description of the semantics of the relevant constructions will be far simpler. In what follows, we will not discuss the semantics of most of the constructs specific to the shell language; we will concentrate on the semantics of commands.
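The shape of rule 1 of the grammar can be sketched as a small abstract syntax tree. This is only an illustration: the class names `Cmn`, `Flagged`, `WithOpt` and `Prefixed` and the `render` helper are hypothetical, and arities and option letters (rule 2) are not modelled.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Cmn:
    """A command name with its arguments already attached (hypothetical class)."""
    name: str
    args: Tuple[str, ...] = ()


@dataclass(frozen=True)
class Flagged:
    """Rule 'cml fl': a command line followed by a flag."""
    cml: "Cml"
    flag: str


@dataclass(frozen=True)
class WithOpt:
    """Rule 'cml opt': a command line followed by an optional argument."""
    cml: "Cml"
    opt: str


@dataclass(frozen=True)
class Prefixed:
    """Rule 'opr cml': an operator prefixed to a command line."""
    opr: str
    cml: "Cml"


Cml = Union[Cmn, Flagged, WithOpt, Prefixed]


def render(cml: Cml) -> str:
    """Linearize a tree, with flags and options at the end as the text assumes."""
    if isinstance(cml, Cmn):
        return " ".join((cml.name,) + cml.args)
    if isinstance(cml, Flagged):
        return f"{render(cml.cml)} {cml.flag}"
    if isinstance(cml, WithOpt):
        return f"{render(cml.cml)} {cml.opt}"
    if isinstance(cml, Prefixed):
        return f"{cml.opr} {render(cml.cml)}"
    raise TypeError(f"not a command line: {cml!r}")


example = Prefixed("time", Flagged(Cmn("ls", ("docs",)), "-l"))
print(render(example))
```

Because every rule builds a command line from a smaller one, the tree is context free in the sense discussed in the text: nothing in the syntax prevents attaching a flag that a given program does not know.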

The language presented above is an idealisation of the currently available languages, as the construction rules in the given form are context free, whereas in the actual command language as specified in the manual pages construction rules are separately given for every command, as can be seen from the syntax of the command grep above. It is obvious that, for example, the syntactic rule that combines command names with flags is context sensitive in the sense that the program will report a syntax error if a flag is not explicitly listed in the program description. On the one hand, it would be desirable to have a context free language as $L_{(cml)}$ and, on the other hand, it is more in line with our intuition that if a modifier comes from a closed syntactic class, but is not applicable in a certain context, then this is a semantic, rather than a syntactic phenomenon. It should be explained in terms of semantic incompatibility or vacuous semantic operations rather than in syntactic terms. In what follows, we will assume the above language and let our semantic apparatus be such that it accounts for the problems connected with the relevant constructions.

There are also more important problems, related to the compositionality of the interpretation of commands. Besides the fact that command names come with some predefined sets of possible parameters (flags and option letters), the interpretation of these also depends on the command name at hand. For example, the flag -l means roughly ‘long, verbose listing’ in connection with the command name ls,³ whereas as an argument to wc it means something like ‘count lines only’.⁴ Similarly, while the option letter -f (standing for ‘file’) introduces the name of an auxiliary file (containing expressions or commands) with grep and similar commands (make, awk, sed etc.), it is a flag that stands for ‘force’ with the command rm (remove), and has a totally different effect.⁵

2 The above description is not quite correct, since exactly one of the expfile and expression arguments is in fact obligatory.

A second problem is the issue of multiple flags. In general, the order of flags does not make any difference and multiple occurrences of the same flag in one command cause the same change in behaviour as single occurrences, as one would expect. Yet we have to face the problem of dependent flags, i.e., the problem that certain flags can only appear in the presence of some other flag. For example, the flag -u depends on the presence of -t in this sense with the command name ls.⁶ Though even the informal semantics makes this perfectly understandable, currently this is treated as a syntactic constraint, which again clearly does not agree with one’s intuition.

As a matter of course, the idiosyncratic behaviour of flags can be explained away by assuming that flags are functors over command names as arguments.

3 ls -l lists the files specified by its argument in long format, giving mode, number of links, owner, group, size in bytes, and time of last modification for each file. If the file is a symbolic link, the filename is printed followed by ‘->’ and the pathname of the referenced file. If the file is a special file, the size field will contain the major and minor device numbers, rather than a size. A total count of blocks in the directory, including indirect blocks, is printed at the top of long format listings.

4 wc counts lines, words and characters in the named files, or in the standard input if no names appear. It also keeps a total count for all named files. A word is a maximal string of characters delimited by spaces, tabs, or newlines. The flags -l, -w and -c may be used in any combination to specify that a subset of lines, words, and characters are to be reported.

5 rm removes each given file. By default, it does not remove directories. If the -f (‘force’) flag is used, it ignores nonexistent files and does not prompt the user if the file is unwritable.

6 ls -t sorts the files listed by last modification time (latest first) rather than by name. -u uses time of last access instead of time of last modification for sorting; can only be used with the -t flag.

Since there is only a finite number of commands, the meaning of a flag could be a partial function defined pointwise, i.e., one whose action is determined by first looking at its argument.⁷ A similar issue is raised by the ways in which the presence vs. absence of options and optional arguments is significant. For example, if the command set is followed by two arguments (a name and a value), it causes the variable name to be set to value, whereas if it stands without an argument, the corresponding action is to display the currently set variables with their values. This can again be dealt with using several mathematical tricks such as polymorphic functions or empty strings as arguments, defining the function again pointwise.
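The pointwise treatment of flags can be pictured with a lookup table keyed by both the flag and the command name. The table `POINTWISE` and the function `flag_meaning` are hypothetical names, and the glosses are paraphrases of the informal descriptions quoted in the text, not actual program behaviour.

```python
# Hypothetical table: the "meaning" of a flag is found only after
# inspecting the command name it is applied to.
POINTWISE = {
    ("-l", "ls"): "long, verbose listing",
    ("-l", "wc"): "count lines only",
    ("-f", "grep"): "the name of an auxiliary file follows",
    ("-f", "rm"): "force removal",
}


def flag_meaning(flag: str, command: str) -> str:
    """Look the flag up pointwise; undefined pairs yield an error action."""
    return POINTWISE.get((flag, command), "error: flag not applicable")


print(flag_meaning("-l", "ls"))
print(flag_meaning("-l", "wc"))
print(flag_meaning("-u", "rm"))
```

Such a table is a perfectly good function from a mathematical point of view, which is why it satisfies the traditional wording of the principle. What it lacks is any content shared by the two uses of `-l` or of `-f`, and this is the heterogeneity the text objects to.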

Obviously, under the current wording of the Principle of Compositionality, a compositional interpretation of Unix commands can be given that uses only functional application,⁸ although we have the very strong feeling that, under a more appropriate view of compositionality, this should not be possible. In particular, the heterogeneous interpretation of flags (and other option letters) as well as the heterogeneous behaviour of absent optional arguments are incompatible with our Principle of Independence. In what follows, we will specify a semantics that we feel comes closer to the original idea behind compositionality and that will remedy some of the problems mentioned above. We will see that this type of interpretation will satisfy the Principle of Independence.

3. Compositional Unix: An Informal Outline

Anomalies like the homonymy of the -f flag mentioned earlier should not occur in a Unix shell with compositional semantics (and they occur to a very limited extent in natural languages). In a compositional Unix shell, there must be a flag --force to be used with rm (and similar commands),⁹ and a different flag --file to be used with grep (and similar commands). (Needless to say, what name we choose for these flags is immaterial.) The meanings of --force and --file must be assigned uniformly and independently of the context. For example, --force could be interpreted as ‘overwrite the file argument if you own the file, even if you do not have write permission for it’. (Possibly, it can also cover ‘do not check if the file argument exists at all’, although it would be cleaner to separate these two meanings, so that the latter is to be expressed by, say, --ignore.) Similarly, the option letter --file would be interpreted as ‘the name of an auxiliary file (containing commands etc.) follows’.

7 This method would give us a function that is as good as any other mathematically. Even if we assume that the number of commands is infinite and that the function is totally defined, we just have to define the result of the application of a flag to some command for which it is undefined as the action of issuing some error message, again an action that makes exactly as much sense as any other from the mathematical point of view.

8 For example, the meaning of a flagged command is the action it performs. Compositionality in the above sense is not even destroyed by the fact that the flag as a function does not necessarily preserve anything of the original action performed by its argument.

9 As is conventional, we will use -- instead of - to indicate that something is a multiletter flag rather than the concatenation of independent flags.

Assuming that the programs corresponding to rm, grep etc. operate as they usually do in Unix (i.e., that we are not to rewrite them), the shell will interpret these program names independently of their original interpretation (or relying on the original interpretation if needed). To achieve this, we will assume that the shell maintains a lexicon which contains a program specification for each possible command name. Program specifications contain variables corresponding to the possible effects of parameters. For example, the value of the variable WRITECHECK determines whether write permission is to be checked before overwriting a file; the variable EXISTCHECK determines whether the non-existence of a file will trigger a special action; and the value of AUXFILE stores the name of the auxiliary (command) file. If necessary, program specifications assign default values to such variables, which can be overridden by parameters.

The procedure described above corresponds to a certain underspecification of the actual effect of running the programs. The program specifications will ensure that the external context (the so-called environment, a set of variable bindings) and the (obligatory and optional) parameters together specify the exact action to take when invoking a program.
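A minimal sketch of this lexicon idea follows, assuming that a program specification is just a dictionary of variable bindings and that each flag denotes one fixed function on such dictionaries. `LEXICON`, `force`, `ignore`, `file_option` and `interpret` are hypothetical names; only the variables WRITECHECK, EXISTCHECK and AUXFILE come from the text, and the default values are assumptions.

```python
# Hypothetical lexicon: one program specification per command name,
# holding default values for the variables that parameters may override.
LEXICON = {
    "rm": {"WRITECHECK": True, "EXISTCHECK": True},
    "grep": {"AUXFILE": None},
}


def force(spec):
    """--force: switch off the write-permission check, whatever the command."""
    return {**spec, "WRITECHECK": False}


def ignore(spec):
    """--ignore: switch off the existence check, whatever the command."""
    return {**spec, "EXISTCHECK": False}


def file_option(name):
    """--file NAME: record the name of the auxiliary file."""
    return lambda spec: {**spec, "AUXFILE": name}


def interpret(command, *modifiers):
    """Start from the command's defaults and apply each modifier in turn."""
    spec = dict(LEXICON[command])
    for modifier in modifiers:
        spec = modifier(spec)
    return spec


print(interpret("rm", force))
print(interpret("grep", file_option("patterns.txt")))
```

Each modifier is defined once and never inspects the command name, which is the point of the Principle of Independence. Applying `force` to `grep` in this sketch merely sets a variable that the `grep` specification never consults, one possible reading of a vacuous semantic operation.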

4. Machine States

To give a semantics for the language of Unix commands, we assume that the relevant basic domain is that of machine states (MS). For the sake of simplicity, we will represent a machine state with the disjoint union of a typed directed acyclic graph (TDAG), standing for the directory structure and the files stored, and a domain $NC_\bot =_{def} N \oplus Char^*$ for the denotation of the natural numbers and character strings,¹⁰ forming the universe of interpretation, an interpretation function and a valuation corresponding to the environment. In this section we will mainly be concerned with the graphs belonging to a machine state; the other three components will be explained in detail in section 7. A typed directed acyclic graph is defined as follows:

10 The exact meaning of the above notation will be defined later, cf. definition 5.4.


4.1. Definition

1. Given strings $v$ and $u$, $v$ is a prefix of $u$ $\Leftrightarrow_{def}$ $\exists w.\ u = vw$.

2. A tree domain $D$ is a non-empty subset of strings (tree addresses) in $N^*$ such that:
a. for each $u \in D$, every prefix of $u$ is also in $D$;
b. for each $u \in D$, for every $i \in N$, if $ui \in D$ then, for every $j$ such that $1 \le j < i$, $uj$ is also in $D$.

3. Two tree addresses are independent if neither is a prefix of the other.

4. A tree address $u$ is terminal $\Leftrightarrow_{def}$ there is no tree address $v \neq u$ in $D$ such that $u$ is a prefix of $v$.

5. Given a set $T$ of types and $\Sigma = \bigcup_{\tau \in T} \Sigma_\tau$ of labels, a typed tree is a total function $t_T : D \to \Sigma$, where $D$ is a tree domain.

6. A typed directed acyclic graph is an ordered pair $(t_T, R)$, where $t_T$ is a typed tree and $R$ is an equivalence relation on $D$ ($dom(t_T)$) such that for all $u, v \in dom(t_T)$, if $(u, v) \in R$, then:
a. $ui \in dom(t_T) \Leftrightarrow vi \in dom(t_T)$;
b. $ui \in dom(t_T) \Rightarrow (ui, vi) \in R$;
c. $t_T(u) = t_T(v)$.
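Clauses 1, 2 and 4 of the definition can be checked mechanically on finite examples. The sketch below represents a tree address as a tuple of positive integers, with the empty tuple as the root; it assumes that child indices start at 1 and that a terminal address is one with no proper extension in the domain. The function names are hypothetical.

```python
def is_prefix(v, u):
    """Clause 1: v is a prefix of u iff u = v w for some string w."""
    return u[:len(v)] == v


def is_tree_domain(addresses):
    """Clause 2: non-empty, closed under prefixes and under left siblings."""
    domain = set(addresses)
    if not domain:
        return False
    for u in domain:
        # (a) every prefix of u, including the empty address, is present
        if any(u[:k] not in domain for k in range(len(u))):
            return False
        # (b) if u = w i is present, so is w j for every 1 <= j < i
        if u and any(u[:-1] + (j,) not in domain for j in range(1, u[-1])):
            return False
    return True


def is_terminal(u, domain):
    """Clause 4: no other address in the domain extends u."""
    return not any(v != u and is_prefix(u, v) for v in domain)


D = {(), (1,), (2,), (1, 1)}
print(is_tree_domain(D))
print(is_terminal((2,), D), is_terminal((1,), D))
```

The equivalence relation $R$ of clause 6, which turns the tree into a graph by identifying addresses, is left out of this sketch.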

Not all TDAGs are acceptable in machine states. In our case, the TDAG associated with $NC_\bot$, an interpretation function and a valuation has some further special properties, as shown by the following definition.¹¹ We suppose that $T = \{dir, file, Char^*\}$, i.e., the relevant types are directory, file and character string.

4.2. Definition

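$(tdag \oplus NC_\bot, \rho, v) \in MS \Leftrightarrow tdag = (t_T, R)$ is a TDAG, and

1. $\rho : Con \mapsto dom(t_T) \oplus NC_\bot$;
2. $v : Var \mapsto dom(t_T) \oplus NC_\bot$;
3. $t_T(u) \in \Sigma_{dir} \Rightarrow \forall i \in N.\ t_T(ui) \in \Sigma_{dir} \lor t_T(ui) \in \Sigma_{file}$;
4. $t_T(u) \in \Sigma_{file} \Rightarrow t_T(u1) \in \Sigma_{Char^*} \land \neg\exists i \in N \setminus \{1\}.\ ui \in dom(t_T)$;
5. $t_T(u) \in \Sigma_{Char^*} \Rightarrow \neg\exists i \in N.\ ui \in dom(t_T)$;
6. $t_T(0) \in \Sigma_{dir}$;
7. $1, 11, 111 \in dom(t_T)$, $t_T(1) \in \Sigma_{dir}$, $t_T(11) \in \Sigma_{file}$, $t_T(111) \in \Sigma_{Char^*}$, and $\neg\exists i \in N.\ 1i \in dom(t_T) \lor 11i \in dom(t_T) \lor 111i \in dom(t_T)$.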

The above definitions formulate the following constraints on what ordered triples of universe, interpretation function and valuation we accept as machine states proper. The interpretation and the valuation associated with the universe are functions that assign either a numerical value, a character string or a tree address to a constant or a variable of the language to be given in section 7, depending on its type, as we shall see. Furthermore, in an MS labels associated with the terminal addresses of the underlying tree have to be of type ‘dir’ or ‘Char*’,¹² i.e., empty directories or finite lists of characters corresponding to contents of files.¹³ We have to impose some further constraints guaranteeing that character strings are only immediately prefixed¹⁴ by files and the latter are immediately prefixed by directories and that files only immediately prefix one character string which immediately prefixes nothing. As the sorts form domains of their own, additionally, $t_T$ has to contain three special elements, $\bot_{dir}$, $\bot_{file}$ and $\bot_{Char^*}$ (their tree addresses are 1, 11 and 111, respectively), none being the prefix of any other tree address. These will serve as the so-called bottom elements of their respective domains, as required by domain theory (cf. sections 5-6), but they will also be put to special use in our semantics, as will be explained later on.

11 The identity of the labels does not play any role in what follows. $\oplus$ in clauses 1 and 2 means roughly the disjoint union of the two domains. Although the domain consists of the disjoint union of a TDAG and $NC_\bot$, we are only interested in the disjoint union taken with the domain of the TDAG, as the subsequent clauses show. For the exact definition, see definition 5.4.
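The typing constraints on directories, files and character strings (clauses 3 to 5 of definition 4.2) can be illustrated on a finite tree. In this sketch a typed tree is a dictionary from tuple addresses to type names; since the identity of the labels plays no role, each label is replaced by its type. The names `children` and `well_typed` are hypothetical, and the clauses about the root and the three bottom elements are deliberately not checked.

```python
def children(u, tree):
    """All addresses in the tree that extend u by exactly one index."""
    return sorted(a for a in tree if len(a) == len(u) + 1 and a[:len(u)] == u)


def well_typed(tree):
    """Check the directory/file/character-string constraints on every address."""
    for u, typ in tree.items():
        kids = children(u, tree)
        if typ == "dir":
            # a directory may only contain directories and files
            if any(tree[k] not in ("dir", "file") for k in kids):
                return False
        elif typ == "file":
            # a file has exactly one child, at index 1, holding its contents
            if [k[-1] for k in kids] != [1] or tree[kids[0]] != "Char*":
                return False
        elif typ == "Char*":
            # a character string has no children at all
            if kids:
                return False
        else:
            return False
    return True


good = {(): "dir", (1,): "file", (1, 1): "Char*", (2,): "dir"}
print(well_typed(good))
print(well_typed({(): "dir", (1,): "file"}))
```

In the second call the file at address 1 has no contents node, so the check fails; an empty file would instead have a child labelled with the empty string.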

We will provide the compositional Unix command language with a so-called denotational semantics. This makes it necessary to introduce some concepts before specifying what the domains of the semantic values of the various expressions in our language will be.

5. Denotational Semantics

We will use denotational semantics (as worked out and described in Scott and Strachey (1971)) for the description of the relevant fragment of a Unix command language. To illustrate the basic points, let us take a look at the following two programs:

5.1. Example

$F(n) \Leftarrow$ if $n = 0$ then $n$ else $F(n - 1)$
$G(n) \Leftarrow 0$

Obviously, the two programs do quite different things. The program F, on receiving an argument n of type N, will recursively compute a value, namely the value 0. Program G, on the other hand, will immediately produce the same result. Although we see that the two programs produce the same output on appropriate input, i.e., they are equivalent under the standard set theoretic interpretation of functions, computationally they are as different as any two programs can be.¹⁵ The idea behind denotational semantics is exactly this: for many purposes it is better if we can abstract away from accidental properties of programming languages and the realizations of specific programs, so that we can regard programs essentially as realizations of some (set theoretic) functions on domains appropriate for whatever can serve as the input and the output in the language under investigation.

12 We will use the terms file, directory and character string to refer to tree addresses labelled with objects of the appropriate type.

13 As customary, we think of empty files as containing the empty string of characters, i.e., the string of length 0.

14 Let $u, v \in N^*$. $v$ is an immediate prefix of $u$ $\Leftrightarrow_{def}$ $\exists i \in N.\ u = vi$.

But things are more complicated than they seem at first sight. If we interpret the functions to be of type $f : N \mapsto N$, we have no problems. But what happens if we let their type be $f : Z \mapsto Z$? The program G will still produce 0 on every input. But F is in trouble: when it is given some $n < 0$ as an argument, it will go straight into an infinite loop. Why is that a problem for our semantics? Because we have to do something about the infinite loop, and the semantics that we chose forces us to give a denotation to this result, a denotation that can appear as a value of functions. Additionally, it has to be of type Z to meet the constraints. For this purpose we introduce a special constant in every domain, called bottom ($\bot$).
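The contrast between F and G can be tried out directly. Non-termination cannot be observed by running a program, so the sketch below approximates it with a step budget: the `fuel` parameter and the `BOTTOM` marker are assumptions of this illustration, and an argument larger than the budget would also be reported as `BOTTOM` although F terminates on it in principle.

```python
BOTTOM = None  # hypothetical stand-in for the bottom element


def F(n, fuel=1000):
    """F(n) <= if n = 0 then n else F(n - 1), cut off after `fuel` steps."""
    for _ in range(fuel + 1):
        if n == 0:
            return n
        n -= 1
    return BOTTOM  # the budget ran out: treat the computation as divergent


def G(n):
    """G(n) <= 0."""
    return 0


print([F(n) for n in range(5)], [G(n) for n in range(5)])
print(F(-3), G(-3))
```

On the natural numbers within the budget the two functions agree, so they are extensionally equal there. On a negative integer F never reaches 0, which is the case the bottom element is meant to denote, while G still returns 0.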

Furthermore, we will need an ordering which roughly mirrors the relations of information content of the elements of the domain. This gives us an algebraic structure called a Scott domain. The official definition of Scott domains is as follows:¹⁶

5.2. Definition

$sd = (U, \bot_{sd}, \sqsubseteq) \in SD \Leftrightarrow_{def} U \neq \emptyset$, $\bot_{sd} \in U$, $\sqsubseteq$ a cpo, and $\forall x \in U.\ \bot_{sd} \sqsubseteq x$.

Examples are the domains $N_\bot$ and $T_\bot$, i.e., the domains of natural numbers and truth values with their respective bottom elements. These domains are also examples of another important notion, the so-called flat domains, defined as follows:

15 In what follows, we will use the terms extensional equivalence vs. intensional equivalence: F and G are extensionally, but not intensionally, equivalent.

16 $U$ is the universe of the domain containing at least $\bot_{sd}$, the information content of which is minimal according to the complete partial ordering $\sqsubseteq$. A cpo is a po which has limits $\bigsqcup_n x_n$ for all (countable) increasing sequences $x_0 \sqsubseteq x_1 \sqsubseteq \dots \sqsubseteq x_n \sqsubseteq \dots$. Certain further conditions on domains are imposed in Gunter and Scott (1990), but these need not concern us here, as they are meant primarily to ensure that the class of domains is closed under various constructions.


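5.3. Definition

$sd \in \mathrm{FD} \overset{\mathrm{def}}{\iff} \forall x, y \in U.\ x \neq \bot_{sd} \land y \neq \bot_{sd} \Rightarrow x \not\sqsubseteq y$ (where $x$ and $y$ are understood to be distinct, since $\sqsubseteq$ is reflexive).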

It is obvious that if we take the ordering to be about the information content of the elements of the respective domains, then neither $\bot \sqsubseteq \top$ nor $\top \sqsubseteq \bot$ (reading $\bot$ and $\top$ here as the two truth values), i.e., neither truth value carries more information than the other, whereas lack of information about a truth value certainly carries less information than they do. Similarly, no natural number is less informative than any other, except for the bottom element representing the ‘result’ of non-terminating computations.
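The ordering on a flat domain can be illustrated with a few lines of Python. This is only a sketch under stated assumptions: `BOTTOM` and `flat_leq` are hypothetical names introduced here, and `None` is an arbitrary encoding of the bottom element.

```python
# Hypothetical sketch of the information ordering on a flat domain such as N_bot.
BOTTOM = None  # encodes the bottom element (an assumption of this sketch)


def flat_leq(x, y):
    """x is below y iff x is bottom or x and y are the same element."""
    return x is BOTTOM or x == y


# Bottom is below everything; distinct proper elements are incomparable.
bottom_below_three = flat_leq(BOTTOM, 3)
three_below_three = flat_leq(3, 3)
three_below_four = flat_leq(3, 4)
four_below_three = flat_leq(4, 3)
```

Under this encoding no natural number is below any other one, which is the sense in which the domain is flat.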

If we take some previously given domains as basic, all other domains can be defined using certain operations on domains. These other domains include function domains, product domains and sum domains. Some of the relevant operations are defined below:¹⁷

5.4. Definition

$d_1 \to d_2$: the domain of all functions from $d_1$ into $d_2$, where $f \sqsubseteq_{d_1 \to d_2} g \overset{\mathrm{def}}{\iff} \forall x \in d_1.\ f(x) \sqsubseteq_{d_2} g(x)$. Thus $\bot_{d_1 \to d_2}$ is the function that maps every element of $d_1$ into $\bot_{d_2}$;

$d_1 \times d_2$: the Cartesian product domain, where $(x_1, x_2) \sqsubseteq_{d_1 \times d_2} (y_1, y_2) \overset{\mathrm{def}}{\iff} \forall i \in \{1, 2\}.\ x_i \sqsubseteq_{d_i} y_i$;

$d_1 \oplus d_2$: the ‘coalesced’ sum, where elements originating from different $d_i$'s are incomparable and both $\bot_{d_i}$ are identified with $\bot_{d_1 \oplus d_2}$;

$d_\bot$: the lifted domain obtained by adding a new bottom element under $d$;

$d^*$: the lists of finite length (including strings of length 0) with non-$\bot$ components in $d$.

17 $d_1$, $d_2$ denote arbitrary domains. The standard function space is the space of continuous functions. Continuous functions are defined as follows: a function $f$ is continuous iff

$f\left(\bigsqcup_n x_n\right) = \bigsqcup_n f(x_n)$.

This notion is important from a technical point of view, as there are non-trivial domains (the so-called reflexive domains) which satisfy the equation $d = d \to d$ and can serve as the denotation of some special constructs, but this will not concern us further in the paper.

There are two more notions that are important in the theory of domains as well as in what will follow:

5.5. Definition

1. A function $f$ is monotone $\overset{\mathrm{def}}{\iff}$ $x \sqsubseteq y \Rightarrow f(x) \sqsubseteq f(y)$.

2. A function $f$ is strict $\overset{\mathrm{def}}{\iff}$ $f(\bot) = \bot$.

These properties are defined for functions on domains, but there is a very intuitive analogy with computer programs. The first property is one we generally expect computer programs to satisfy, namely that they respect the richness of the input,¹⁸ i.e., an input that is richer (according to some obvious ordering) is never taken into an output that is poorer than the output for some poorer input. The second property is less obvious, but for programs it means that we cannot design a program that saves us if it is given some erroneous input, e.g., if its input is provided by the output of some program that does not terminate, as would be the case if we gave the output of program F in 5.1 on input −7 as the input to itself.¹⁹ If we give the above output as an input to the program G in 5.1, then its behaviour depends on whether we suppose it to operate call-by-value or call-by-name. In the former case, we get the same result as above; in the latter, we get a program that is monotone but not strict, since it assigns the same value to every input (thus satisfying the condition of monotonicity), but it does not respect the bottom element. Similarly, it is easy to define a numerical program that is strict but not monotone: take one that takes every natural number into some $n \in \mathbb{N}$, except for some $k \in \mathbb{N}$, which it takes into $n - 1$ (and $\bot$ into $\bot$). Thus we see that the two properties are independent.
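The contrast between the call-by-value and the call-by-name reading of G can be made concrete with a small Python sketch. It rests on several assumptions that are not in the text: non-termination is modelled by a special value `BOTTOM` rather than by an actual loop, F is assumed to return 0 on non-negative input (as its extensional equivalence with G on the natural numbers suggests), and all function names are hypothetical.

```python
BOTTOM = None  # models the 'result' of a non-terminating computation (assumption)


def flat_leq(x, y):
    """Information ordering of a flat domain: bottom is below everything."""
    return x is BOTTOM or x == y


def F(n):
    """Stand-in for program F: 'diverges' (yields BOTTOM) on negative input."""
    if n is BOTTOM or n < 0:
        return BOTTOM
    return 0


def G_by_value(n):
    """Constant-0 program, call-by-value: the argument is evaluated first."""
    if n is BOTTOM:
        return BOTTOM
    return 0


def G_by_name(n):
    """Constant-0 program, call-by-name: the argument is never inspected."""
    return 0


def is_strict(f):
    return f(BOTTOM) is BOTTOM


def is_monotone(f, sample):
    """Check monotonicity on a finite sample of the flat domain."""
    return all(
        flat_leq(f(x), f(y)) for x in sample for y in sample if flat_leq(x, y)
    )


SAMPLE = [BOTTOM, 0, 1, 2]
by_value_result = G_by_value(F(-7))  # the error propagates
by_name_result = G_by_name(F(-7))    # the error is ignored
```

In this encoding `G_by_name` is monotone on the sample but not strict, while `G_by_value` is strict, which mirrors the distinction drawn in the paragraph above.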

One more remark has to be made at this point. We said before that denotational semantics is used so that we can abstract away from certain accidental properties of programs, i.e., we can see extensionally equivalent programs as having the same denotation. This will pose the problem that certain programs of the Unix command language are extensionally equivalent, but they have different side effects that we may be interested in capturing. For example, a program that simply displays the content of a file does not affect the machine state in any obvious way. So we can either take the decision to drop denotational semantics as our tool or we can simply not take account of these features of programs. But we can also try to mirror certain intensional differences (i.e., differences due to the implementation of programs that do not show under the set-theoretical representation but which we consider relevant) as extensional ones, thus sticking to denotational semantics. In what follows, we take the latter path.

18 In our case, inputs and outputs will be machine states.

19 In our case this means that we can never recover from the error state.


6. The Semantic Domains

To make MS into a Scott domain, we need a bottom element $\bot_{MS}$ and a cpo. The former is the unstructured error state ($\bot_{MS}$); the latter is defined as follows:

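6.1. Definition

1. $v_1 \leq v_2 \overset{\mathrm{def}}{\iff} \forall x.\ (v_1(x) = \bot \land v_2(x) \neq \bot) \lor (v_1(x) = v_2(x))$;

2. $ms_1 \sqsubseteq_{MS} ms_2 \overset{\mathrm{def}}{\iff} ms_1 = \bot_{MS} \lor \big(ms_i = \langle tdag_i \oplus NC_\bot, p_i, v_i \rangle$ (for $i \in \{1, 2\}$) $\land\ tdag_1 = tdag_2 \land p_1 = p_2 \land v_1 \leq v_2\big)$.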

That is, the error state is less ‘informative’ than any other state. All other states with different underlying trees or interpretation functions are incomparable, whereas for comparable states the ordering is simply inherited from the ordering on the valuations, which says that a valuation is more informative than another if and only if it is ‘defined’ in some sense for more values.²⁰
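The two clauses of definition 6.1 can be sketched in Python as follows. The encoding is an assumption made for illustration only: valuations are dictionaries in which a missing or `None` entry plays the role of $\bot$, a machine state is a triple `(tdag, p, v)` whose first two components are compared by plain equality, and `ERROR_STATE` is a hypothetical marker for $\bot_{MS}$.

```python
BOTTOM = None            # undefined value of a variable (assumption)
ERROR_STATE = "error"    # hypothetical marker for the unstructured error state


def val_leq(v1, v2):
    """Clause 1: pointwise, v1 is undefined where v2 is defined, or they agree."""
    variables = set(v1) | set(v2)
    return all(
        (v1.get(x) is BOTTOM and v2.get(x) is not BOTTOM) or v1.get(x) == v2.get(x)
        for x in variables
    )


def ms_leq(ms1, ms2):
    """Clause 2: the error state is below everything; otherwise the tree and the
    interpretation must coincide and the valuations must be ordered."""
    if ms1 == ERROR_STATE:
        return True
    if ms2 == ERROR_STATE:
        return False
    tdag1, p1, v1 = ms1
    tdag2, p2, v2 = ms2
    return tdag1 == tdag2 and p1 == p2 and val_leq(v1, v2)


poorer = ("tree", "interp", {"CWD": BOTTOM})
richer = ("tree", "interp", {"CWD": "0"})
other_tree = ("tree2", "interp", {"CWD": "0"})
```

With these toy states, `poorer` is below `richer`, while `richer` and `other_tree` are incomparable because their underlying trees differ.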

Now we are ready to define the semantic domains for the language of our Unix shell:²¹

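6.2. Definition

1. $\llbracket \mathrm{n} \rrbracket \in \mathbb{N}_\bot$;

2. $\llbracket \mathrm{c} \rrbracket \in \mathrm{dom}(tr)$;

3. $\llbracket \mathrm{var} \rrbracket \in \mathrm{dom}(tr) \oplus \mathbb{N}_\bot$;

4. $\llbracket \mathrm{cm}_n \rrbracket \in U_n \to \ldots \to U_1 \to MS \to MS$;

5. $\llbracket \mathrm{opt} \rrbracket \in (MS \to MS) \to MS \to MS$;

6. $\llbracket \mathrm{fl} \rrbracket \in (MS \to MS) \to MS \to MS$;

7. $\llbracket \mathrm{opl} \rrbracket \in (U_n \to \ldots \to U_1 \to MS \to MS) \to U_{n+1} \to \ldots \to U_1 \to MS \to MS$;

8. $\llbracket \mathrm{cml} \rrbracket \in MS \to MS$;

9. $\llbracket \mathrm{opr} \rrbracket \in (MS \to MS) \to MS \to MS$.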

There is little to say about the domain of integers; constants will evaluate to distinguished nodes of the tree, variables to nodes or natural numbers in accordance with their types. Command lines (commands) will be interpreted as functions from machine states to machine states, whereas n-argument command names yield commands when supplied with the appropriate number of arguments. Options and flags, like operators, are functions from commands to commands; nevertheless, we shall see that there is a difference between operators and options/flags. Option letters create new argument places. By the definition of the domains resulting from coalesced sum, Cartesian product and function formation,²² and the flatness of $NC_\bot$, $\mathbb{T}_\bot$ and $\mathrm{dom}(tr)$, the ordering relations and the bottom elements are given. For example, the least ‘informative’ program ($\bot_{MS \to MS}$) is the one that takes every machine state into the error state.

20 This is justified by the fact that the relevant information is basically stored in the valuation function, whereas the underlying tree and the interpretation function carry little information.

21 Cf. definition 2.2. Furthermore, we use the convention that bracketing is right associative. For example, $X \to Y \to Z = (X \to (Y \to Z))$.
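The curried types of definition 6.2 can be mimicked with higher-order functions. The following Python sketch is purely illustrative and makes several assumptions: a machine state is reduced to a dictionary from file names to contents, and the command name `cat`, the flag `flag_upper` and the option letter `option_o` are made-up examples, not translations given in the paper.

```python
from typing import Callable, Dict

MS = Dict[str, str]            # toy machine state: file name -> content (assumption)
Command = Callable[[MS], MS]   # a command line denotes a function MS -> MS


def cat(name: str) -> Command:
    """One-argument command name (U_1 -> MS -> MS): appends a file to 'tty'."""
    def run(ms: MS) -> MS:
        out = dict(ms)
        out["tty"] = ms.get("tty", "") + ms.get(name, "")
        return out
    return run


def flag_upper(cmd: Command) -> Command:
    """Made-up flag ((MS -> MS) -> MS -> MS): upper-cases the screen content."""
    def run(ms: MS) -> MS:
        out = dict(cmd(ms))
        out["tty"] = out.get("tty", "").upper()
        return out
    return run


def option_o(cmd_name: Callable[[str], Command]) -> Callable[[str], Callable[[str], Command]]:
    """Made-up option letter: turns a one-argument command name into a
    two-argument one; the new argument place (the target file) comes first."""
    def with_target(target: str) -> Callable[[str], Command]:
        def with_arg(name: str) -> Command:
            def run(ms: MS) -> MS:
                result = cmd_name(name)(ms)
                new_text = result["tty"][len(ms.get("tty", "")):]
                out = dict(ms)
                out[target] = new_text  # output goes to the target, not to 'tty'
                return out
            return run
        return with_arg
    return with_target


state = {"a": "hi", "tty": ""}
plain = cat("a")(state)
shouted = flag_upper(cat("a"))(state)
redirected = option_o(cat)("b")("a")(state)
```

The point of the sketch is only the shape of the types: a flag maps commands to commands, whereas an option letter maps a command name to a command name with one more argument place.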

The interpretation of the expressions of the language $L^{(cml)}$ will proceed via a translation function into the language of specifications, the topic of the following section. That is, command lines will be translated into the specification language first, then that language will be interpreted using the semantic domains defined here.

7. A Language for Program Specifications

As we have said above, complex expressions will receive a denotation in two steps. First we define a translation function $\tau : L^{(cml)} \to L^{(spec)}$, i.e., we translate expressions of the shell language into expressions of the language of specifications. These expressions will be given a denotation via an interpretation function and a valuation. As we shall see, these will be the desired denotations of the shell expressions. We will proceed in two steps. We first specify an auxiliary language $L^{(ps)}$ and a function $\eta : L^{(cml)} \to L^{(ps)}$, which will serve as the basis for specifying the language $L^{(spec)}$ and the function $\tau$.

Commands (cml) will be translated into program specifications (PS), which can be interpreted directly in the semantics. The translations of all other expressions (such as flags and option letters) will be given relative to PS. First of all we need a typed dynamic first-order language with equality (TDFOLE)²³ that will be sufficient to specify, i.e., to describe, functions from machine states to machine states. The set of types is defined as follows:

7.1. Definition

1. $t, \mathit{dir}, \mathit{file}, \mathit{natnum}, \mathit{char}^* \in T$;

2. $\alpha, \beta \in T \Rightarrow (\alpha\ \beta) \in T$.

The types dir and file are self-explanatory, t is the type truth value (i.e., the type of formulae), natnum is the type of natural numbers and char* stands for character strings; $(\alpha\ \beta)$ is the type of functions from objects of type $\beta$ to objects of type $\alpha$. The typed first-order language based on the above set $T$ is defined as follows:

22 Cf. definition 5.4.

23 The language and its semantics will be very similar to the one given in Groenendijk and Stokhof (1991) with some modifications required by the typing.

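7.2. Definition

1. $L^{(ps)} \overset{\mathrm{def}}{=} \langle LC_{ps}, \mathrm{Con}, \mathrm{Var}, \mathrm{Expr} \rangle$;

2. $LC_{ps} \overset{\mathrm{def}}{=} \{(, ), =, \neg, \land, \exists\}$;

3. $\mathrm{Con} \overset{\mathrm{def}}{=} \bigcup_{\tau \in T} \mathrm{Con}_\tau$;

   a. $\mathrm{Con}_t \overset{\mathrm{def}}{=} \{\top\}$;

   b. $\mathrm{Con}_{\mathit{dir}} \overset{\mathrm{def}}{=} \{\mathrm{root}, \bot_{\mathit{dir}}\}$;

   c. $\mathrm{Con}_{\mathit{file}} \overset{\mathrm{def}}{=} \{\mathrm{tty}, \mathrm{mail}, \bot_{\mathit{file}}\}$;

   d. $\mathrm{Con}_{\mathit{natnum}} \overset{\mathrm{def}}{=} \mathbb{N}$;

   e. $\mathrm{Con}_{\mathit{char}^*} \overset{\mathrm{def}}{=} \mathrm{Char}^*$;

   f. $\mathrm{Con}_{(\mathit{natnum}\ \mathit{file})} \overset{\mathrm{def}}{=} \{\mathrm{write\_permission}\}$;

   g. $\mathrm{Con}_{(\mathit{char}^*\ \mathit{file})} \overset{\mathrm{def}}{=} \{\mathrm{content}\}$;

   h. $\mathrm{Con}_{((\mathit{char}^*\ \mathit{char}^*)\ \mathit{char}^*)} \overset{\mathrm{def}}{=} \{{}^\frown\}$;

4. $\mathrm{Var} \overset{\mathrm{def}}{=} \bigcup_{\tau \in T} \mathrm{Var}^s_\tau \cup \bigcup_{\tau \in T} \mathrm{Var}^c_\tau$;

   a. $\mathrm{Var}^s_{\mathit{dir}} \overset{\mathrm{def}}{=} \{\mathrm{HOME}, \mathrm{CWD}, \mathit{dir}_1, \ldots\}$;

   b. $\mathrm{Var}^s_{\mathit{file}} \overset{\mathrm{def}}{=} \{\mathrm{KBD}, \mathrm{SCREEN}, \mathit{file}_1, \ldots\}$;

   c. $\mathrm{Var}^s_{\mathit{natnum}} \overset{\mathrm{def}}{=} \{\mathrm{WRITECHECK}, \mathrm{EXISTCHECK}, \ldots\}$;

   d. $\mathrm{Var}^c_\alpha \overset{\mathrm{def}}{=} \{x.c \mid x \in \mathrm{Var}_\beta \land c \in \mathrm{Con}_{(\alpha\ \beta)}\}$;

5. $\mathrm{Expr} \overset{\mathrm{def}}{=} \bigcup_{\tau \in T} \mathrm{Expr}_\tau$;

6. $\mathrm{Con}_\tau \cup \mathrm{Var}_\tau \subseteq \mathrm{Expr}_\tau$;

7. $\Phi \in \mathrm{Expr}_{(\alpha\ \beta)}, \eta \in \mathrm{Expr}_\beta \Rightarrow \Phi(\eta) \in \mathrm{Expr}_\alpha$;

8. $\eta, \zeta \in \mathrm{Expr}_\alpha \Rightarrow \eta = \zeta \in \mathrm{Expr}_t$;

9. $\Phi, \Psi \in \mathrm{Expr}_t \Rightarrow \neg(\Phi), (\Phi \land \Psi) \in \mathrm{Expr}_t$;

10. $\Phi \in \mathrm{Expr}_t, \xi \in \mathrm{Var}^s \Rightarrow \exists \xi.\Phi \in \mathrm{Expr}_t$.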

The constants and simple variables of the language serve to name the elements of the machine states (i.e., files, directories, natural numbers and character strings) in accordance with our requirements. Our examples of special variables are ‘HOME’ for the user's home directory; ‘CWD’ for the current working directory; ‘KBD’ for the current keyboard input file; ‘SCREEN’ for the current screen output file. ‘root’, ‘mail’ and ‘tty’ are special files and directories. The use of the remaining constants and simple variables should be obvious from their semantics, which we specify later on. The functional constants are again self-explanatory, except for ‘$^\frown$’, which is the symbol of concatenation. The denotation of $x^\frown y$ is the concatenation of (the strings) $x$ and $y$. We usually omit it, and indicate concatenation by mere juxtaposition. $\mathrm{Var}^c$ is the set of complex variables. The value of a complex variable depends on its components. The operator ‘.’ is similar to those operators of programming languages which select a particular member of a structure. We can think of unary name functions as selectors of members of such structures. We stipulate that

$x.c \in \mathrm{Var}^c \Rightarrow x.c = c(x)$.

That is, the values of name functions applied to variables can be automatically referred to by complex variables. For example, the content of the file file can be referred to either as ‘content(file)’ or ‘file.content’. The operator ‘.’ associates to the left (i.e., $x.c.d = (x.c).d$). Apart from ‘$^\frown$’ and ‘.’, the language itself is given by the standard construction rules for expressions of type $\tau$ in a TDFOLE. In what follows we will be especially interested in expressions of type $t$.²⁴

We need certain further operators defined in terms of the above:

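7.3. Definition

1. $(\Phi \lor \Psi) \overset{\mathrm{def}}{=} \neg(\neg\Phi \land \neg\Psi)$;

2. $(\Phi \to \Psi) \overset{\mathrm{def}}{=} \neg(\Phi \land \neg\Psi)$;

3. $!(\Phi) \overset{\mathrm{def}}{=} \neg(\neg(\Phi))$.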

The definition of $\lor$ and $\to$ is standard, whereas ‘!’ is a unary logical sentential operator, i.e., it takes formulae into formulae.²⁵

The semantic value of the well-formed expressions of the language in a machine state ms is produced via the function $\llbracket \cdot \rrbracket^{ms}$. First we define a function $D$ that assigns semantic domains to types, i.e., it specifies which kinds of objects serve as the denotation of expressions given the set of machine states:²⁶

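7.4. Definition

1. $D(t) \overset{\mathrm{def}}{=} \mathcal{P}(MS)$;

2. $D(\mathit{file}) \overset{\mathrm{def}}{=} \{u : tr(u) \in E_{\mathit{file}}\}$;

3. $D(\mathit{dir}) \overset{\mathrm{def}}{=} \{u : tr(u) \in E_{\mathit{dir}}\}$;

4. $D(\mathit{natnum}) \overset{\mathrm{def}}{=} \mathbb{N}_\bot$;

5. $D(\mathit{char}^*) \overset{\mathrm{def}}{=} \mathrm{Char}^*$;

6. $D((\alpha\ \beta)) \overset{\mathrm{def}}{=} D(\beta) \to D(\alpha)$.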

That is, the denotation of a formula is a set of machine states, whereas names of files, directories, natural numbers and character strings evaluate to elements of the appropriate type of the universe (e.g., a file name evaluates to a node of type file of the underlying tree of the tdag), and functional expressions evaluate to functions of the appropriate type.

24 In what follows, we will refer to expressions of type $t$ as formulae.

25 This is Groenendijk and Stokhof's closure operator $\Diamond$.

26 Cf. definitions 4.1 and 4.2.

Now we are ready to define the semantics of the well-formed expressions of the language $L^{(ps)}$. First we give the definition of expressions other than formulae:²⁷

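7.5. Definition

1. $c \in \mathrm{Con} \Rightarrow \llbracket c \rrbracket \overset{\mathrm{def}}{=} p(c)$;

2. $x \in \mathrm{Var}^s \Rightarrow \llbracket x \rrbracket \overset{\mathrm{def}}{=} v(x)$;

3. $x.c \in \mathrm{Var}^c \Rightarrow \llbracket x.c \rrbracket \overset{\mathrm{def}}{=} \llbracket c \rrbracket(\llbracket x \rrbracket)$;

4. $\llbracket \mathrm{EXISTCHECK} \rrbracket \overset{\mathrm{def}}{=} n \in 2$;

5. $\llbracket \mathrm{root} \rrbracket \overset{\mathrm{def}}{=} 0 \in \mathrm{dom}(tr)$;

6. $\llbracket \bot_{\mathit{dir}} \rrbracket \overset{\mathrm{def}}{=} 1 \in \mathrm{dom}(tr)$;

7. $\llbracket \bot_{\mathit{file}} \rrbracket \overset{\mathrm{def}}{=} 11 \in \mathrm{dom}(tr)$;

8. $\llbracket \bot_{\mathit{char}^*} \rrbracket \overset{\mathrm{def}}{=} 111 \in \mathrm{dom}(tr)$;

9. $\llbracket \mathrm{write\_permission} \rrbracket \in F \to 2$, where $F \subseteq \mathrm{dom}(tr)$ such that $tr[F] = E_{\mathit{file}}$.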

Thus the semantic values of constants and simple variables are produced by the interpretation and valuation functions, respectively. The values of complex variables are determined as was seen before. The remaining clauses can be regarded as constraints on $v$ and $p$. ‘EXISTCHECK’ is a variable that can only be set to 0 or 1 (the same holds for ‘WRITECHECK’); ‘root’ has to denote the root of the TDAG. The name constants will represent ‘immutable’ objects in the machine. Some of them (especially ‘mail’ and ‘tty’) will help us avoid complications in connection with programs that do not change a machine state under the standard interpretation (since normally we are only interested in their side effects): we conceive of them as files that can grow indefinitely as strings are concatenated to their content (when mail is sent or character strings are displayed, respectively). $\bot_\tau$ denotes the bottom element of type $\tau$; these are ‘degenerate’ objects such as non-existent files; their use will be explained later on. ‘write_permission’ is a function from tree addresses to 0 or 1, thus relating a tree address of type file to its write permission.²⁸

27 We assume that

$\forall x_\alpha.\ \llbracket x \rrbracket^{\bot_{MS}} = \bot_\alpha$, where $\alpha \in T \setminus \{t\}$,

i.e., the denotation of all well-formed expressions except for formulae in the error state is the bottom element of the appropriate type, as this will not influence what follows in any way. The definition below applies to all other cases. We will drop the superscript ‘ms’ and the type subscripts when this gives rise to no misunderstanding. $v[X]$ stands for the range of the function $v$ when constrained to the set $X$.

The semantic value of formulae in a machine state will be the set of machine states that can result after the formula has been processed. Thus we specify the meanings as sets of ordered pairs of machine states. The definition runs as follows:

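7.6. Definition

1. $\forall \Phi \in \mathrm{Expr}_t.\ \langle \bot_{MS}, \bot_{MS} \rangle \in \llbracket \Phi \rrbracket$;

2. $\langle ms_1, ms_2 \rangle \in \llbracket \top \rrbracket \overset{\mathrm{def}}{\iff} ms_1 = ms_2$;

3. $\langle ms_1, ms_2 \rangle \in \llbracket t_1 = t_2 \rrbracket \overset{\mathrm{def}}{\iff} ms_1 = ms_2 \land \llbracket t_1 \rrbracket^{ms_1} = \llbracket t_2 \rrbracket^{ms_1}$;

4. $\langle ms_1, ms_2 \rangle \in \llbracket \neg\Phi \rrbracket \overset{\mathrm{def}}{\iff} ms_1 = ms_2 \land \neg\exists ms_3.\ \langle ms_1, ms_3 \rangle \in \llbracket \Phi \rrbracket$;

5. $\langle ms_1, ms_2 \rangle \in \llbracket \Phi \land \Psi \rrbracket \overset{\mathrm{def}}{\iff} \exists ms_3.\ \langle ms_1, ms_3 \rangle \in \llbracket \Phi \rrbracket \land \langle ms_3, ms_2 \rangle \in \llbracket \Psi \rrbracket$;

6. $\langle ms_1, ms_2 \rangle \in \llbracket \exists x.\Phi \rrbracket \overset{\mathrm{def}}{\iff} tdag_1 = tdag_2 \land p_1 = p_2 \land \exists ms_3.\ (tdag_3 = tdag_1 \land p_3 = p_1 \land v_3[x]v_1 \land \langle ms_3, ms_2 \rangle \in \llbracket \Phi \rrbracket)$.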

Clause 1 states that the error state verifies every formula and no formula can recover from it. The formula $\top$ denotes the diagonal relation on the set MS, i.e., it is always true without any dynamic effects. The remaining clauses are the standard ones for DPL. Clause 6 looks a bit more complicated, but it is the only clause introducing dynamic effects, and it simply says that we are only interested in changes of the valuation function²⁹ if they lead to a valuation that can serve as an input to the embedded formula. This justifies what we said above, namely that the denotation of a formula in a machine state is a set of valuations.
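The relational clauses of definition 7.6 can be tried out on a tiny finite model. The Python sketch below is an illustration under simplifying assumptions that are not in the text: the tree and the interpretation function are held fixed, so a machine state is reduced to a valuation of two variables over the values 0 and 1; the error state is left out; and all function names are hypothetical.

```python
from itertools import product

VARS = ("x", "y")
VALUES = (0, 1)
# A state is a tuple of values, one per variable in VARS (tdag and p are fixed).
STATES = list(product(VALUES, repeat=len(VARS)))


def term_value(state, term):
    """Variables are looked up in the state; anything else denotes itself."""
    return state[VARS.index(term)] if term in VARS else term


def top():
    """Clause 2: the diagonal relation."""
    return {(s, s) for s in STATES}


def eq(a, b):
    """Clause 3: a test without dynamic effects."""
    return {(s, s) for s in STATES if term_value(s, a) == term_value(s, b)}


def neg(phi):
    """Clause 4: holds where phi has no output state."""
    starts = {s for (s, _) in phi}
    return {(s, s) for s in STATES if s not in starts}


def conj(phi, psi):
    """Clause 5: relational composition."""
    return {(s1, s2) for (s1, s3) in phi for (s4, s2) in psi if s3 == s4}


def exists(var, phi):
    """Clause 6: reset the value of var, then process phi."""
    i = VARS.index(var)

    def same_except_var(s, t):
        return all(a == b for k, (a, b) in enumerate(zip(s, t)) if k != i)

    return {(s1, s2) for s1 in STATES for (s3, s2) in phi if same_except_var(s1, s3)}


def closure(phi):
    """The defined operator '!' of definition 7.3: double negation."""
    return neg(neg(phi))


set_x_to_one = exists("x", eq("x", 1))
```

In this model `set_x_to_one` relates the state `(0, 0)` to `(1, 0)`, so it has a dynamic effect, whereas `closure(set_x_to_one)` relates every state only to itself.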

Now it is easy to compute the semantic clauses for the defined operators:

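7.7. Facts

1. $\langle ms_1, ms_2 \rangle \in \llbracket \Phi \lor \Psi \rrbracket \iff ms_1 = ms_2 \land \exists ms_3.\ \langle ms_1, ms_3 \rangle \in \llbracket \Phi \rrbracket \lor \langle ms_1, ms_3 \rangle \in \llbracket \Psi \rrbracket$;

2. $\langle ms_1, ms_2 \rangle \in \llbracket \Phi \to \Psi \rrbracket \iff ms_1 = ms_2 \land \forall ms_3.\ \langle ms_1, ms_3 \rangle \in \llbracket \Phi \rrbracket \Rightarrow \exists ms_4.\ \langle ms_3, ms_4 \rangle \in \llbracket \Psi \rrbracket$;

3. $\langle ms_1, ms_2 \rangle \in \llbracket {!}\Phi \rrbracket \iff ms_1 = ms_2 \land \exists ms_3.\ \langle ms_1, ms_3 \rangle \in \llbracket \Phi \rrbracket$.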

28 We are making unforgivable simplifications here. Among others, we simply ignore the difference between character files and special files (such as character devices); also, we ignore other types of permissions altogether (normally the permissions of a file are encoded in four octal digits in the file system).

29 $v_1[x]v_2$ means that the two valuations are the same except perhaps for the value they assign to $x$.

As for the first two definitions, there is little to say. In the case of clause 3, it should now be obvious why Groenendijk and Stokhof call it the closure operator: it closes off any dynamic effects a formula may have had. Now we have a DFOLE that has enough expressive power to describe relations between machine states.

We will use this language to specify the semantics of programs. But we have to face two further problems. The denotation of a formula is a partial relation, i.e., it is neither functional nor complete. But we think of programs as total functions from machine states to machine states, i.e., programs are defined everywhere, and they are deterministic. This means that not every formula of the above language is appropriate as a translation of a program. To single out the class that we need, we will introduce a representation for the formulae and impose the relevant constraints on this representation, which is basically a shorthand for the formulae of $L^{(ps)}$.

8. Program Specifications

We will take the formulae that represent the translations of our programs apart and give them a representation in terms of their parts. The sentences of this representation will be the ones of $L^{(ps)}$, but we will not use all the power of this language. But now we will think about this language as an ordinary typed first-order language with equality with its standard semantics. Two sentences of this new representation will play a key role in specifying programs. The first one, which we will call the precondition (PC) of the program, will contain the input conditions for the execution of a program; the other, called the maximal change (MC), specifies its output conditions. The intended interpretation is as follows: a formula $\phi$ is applicable to a machine state ms (i.e., $ms \in \mathrm{dom}(\llbracket \phi \rrbracket)$) if and only if the machine state satisfies all sentences in the program's PC,³⁰ and if a program is not applicable to a machine state, we will take it to have no effect.³¹ This is basically the same behaviour as that of standard shells, where an error message is issued in such a situation, but the machine state is not affected. The only way a program can lead to the error state is by leading out of the set of machine states, e.g., by removing one of the objects required by definition 4.1. The maximal change brought about by the program is that sentences in the MC of the program are satisfied by the new machine state, and all other sentences not affected by MC retain their truth value.³²

30 We take this to mean that all formulae in this component are satisfied by the machine state under some appropriate first-order definition, i.e., $ms \models \Gamma \iff \forall \gamma \in \Gamma.\ ms \models \gamma$.

31 We do that in order to get complete functions in accordance with the requirements of definition 6.2. The general idea is that we explicitly list the presuppositions imposed by a program on the input machine states.

In actual fact, program specifications will be more complex. First, the PC will not be checked against the initial machine state directly, but against a modified machine state, in which some variables are assigned local values for the execution of the program. So each program specification will contain a component describing a modification of the valuation of the initial machine state. We will call this component the local environment (LENV) of the program. The role of LENV is that we do not expect the input machine state to verify it, nor do we want it to live on in the output machine state, unless as a consequence of some property of the MC in the program specification. Second, since MC is just a sentence in a FOLE, we have to keep a separate component describing the dynamic aspect of the change of state effected by the program, i.e., the list of those variables the semantic value of which may change from the input state to the output state (through changes in the valuation). We will call this component the environment change (ENVC) that the program can effect.

So program specifications will be quadruples of the form

$\langle \mathrm{LENV}; \mathrm{PC}; \mathrm{MC}; \mathrm{ENVC} \rangle$,

where $\mathrm{LENV} \in \mathrm{Var} \to (\mathrm{Var} \cup \mathrm{Con} \cup \{*\})$ (where ‘$*$’ represents the undefined function value). We will use the notation $ms + \mathrm{LENV}$ to refer to the modified machine state which differs from ‘ms’ in its valuation only, and

$* \neq \xi = \mathrm{LENV}(x) \Rightarrow \llbracket x \rrbracket^{ms + \mathrm{LENV}} = \llbracket \xi \rrbracket^{ms}$.

On the other hand, $\mathrm{ENVC} \subseteq \mathrm{Var}$. As a matter of course, if a variable is in ENVC then, even if LENV assigns it a local value, its old value is not restored after the computation.

The component called MC does not use the full force of our language $L^{(ps)}$. This is due to the fact that the operation of a program is to be deterministic. Therefore, a sentence in MC does not contain negation: there may be several ways of falsifying a formula. (In this way, we also exclude conditionals and disjunctions, which also lead to non-determinism, because they are defined in terms of negation.) Another problematic type of sentence in our FOLE is equality: there are two ways of verifying the equality of two variables, namely, the valuation of either one (or both) can be modified in order to make their values identical. Accordingly, we will stipulate that at most one variable on either side of an equality is in ENVC, and all variables of ENVC appear in some equality; otherwise we could change the machine state arbitrarily with respect to the variables in ENVC but not in MC.
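How the four components of a program specification interact can be sketched in Python. This is a strongly simplified illustration, not the paper's formal definition: a machine state is reduced to its valuation (a dictionary), the PC is an arbitrary predicate on the locally modified valuation, the MC is restricted to equalities of the form "variable in ENVC = term", and the names `run`, `with_lenv` and the `cd_home` example are hypothetical.

```python
STAR = object()  # stands for the undefined value '*' (hypothetical encoding)


def value(term, valuation, interpretation):
    """Value of a variable (via the valuation) or a constant (via the interpretation)."""
    return valuation[term] if term in valuation else interpretation[term]


def with_lenv(valuation, interpretation, lenv):
    """ms + LENV: only the valuation changes; right-hand sides are read in ms."""
    local = dict(valuation)
    for x, xi in lenv.items():
        if xi is not STAR:
            local[x] = value(xi, valuation, interpretation)
    return local


def run(spec, valuation, interpretation):
    """Apply a specification (LENV, PC, MC, ENVC) to a valuation."""
    lenv, pc, mc, envc = spec
    local = with_lenv(valuation, interpretation, lenv)
    if not pc(local):
        return dict(valuation)   # not applicable: the program has no effect
    result = dict(valuation)     # local values are dropped again ...
    for x in envc:               # ... except for the variables in ENVC
        result[x] = value(mc[x], local, interpretation)
    return result


interpretation = {"root": "0"}
ms = {"CWD": "0", "HOME": "01", "dir1": "02"}

# Made-up specification of a 'cd' without argument: dir1 is locally set to HOME,
# the precondition asks that dir1 be defined, and the change is CWD = dir1.
cd_home = ({"dir1": "HOME"}, lambda v: v["dir1"] is not None, {"CWD": "dir1"}, {"CWD"})
never_applicable = ({"dir1": "HOME"}, lambda v: False, {"CWD": "dir1"}, {"CWD"})

after_cd = run(cd_home, ms, interpretation)
after_failure = run(never_applicable, ms, interpretation)
```

Because every variable in ENVC is determined by exactly one equality whose other side is read from the input state, the result is a function of the input, which is the determinism requirement discussed above; the local value of `dir1` does not survive in the output.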

32 Except for those changes that MC entails, of course.
