Gradience - Vector Semantics

For the implicational account, negation causes difficulties, since in negative contexts the direction of implication reverses. Whereas John runs follows from John runs fast, John doesn’t run doesn’t follow from John doesn’t run fast. This is something of an idealized example, in that the primary reading for John runs fast is habitual ‘John is a fast runner’ whereas the primary reading for John runs is episodic ‘John is running now’, so on the most natural readings the implication doesn’t even hold!
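The reversal can be illustrated with a minimal sketch, assuming (purely for illustration) that a predicate is modeled as the set of individuals it holds of and that implication is set inclusion. The domain, the individuals, and the function names are hypothetical, and the habitual/episodic complication noted above is ignored.

```python
# Toy model (an assumption, not the book's formalism): a predicate is the set
# of individuals it holds of, and "P implies Q" is modeled as P being a subset of Q.
DOMAIN = {"john", "mary", "bill", "sue"}
runs = {"john", "mary", "bill"}
runs_fast = {"john", "mary"}  # every fast runner is a runner


def entails(p, q):
    """P entails Q when every individual satisfying P also satisfies Q."""
    return p <= q


def negate(p):
    """Negation as set complement relative to the toy domain."""
    return DOMAIN - p


# Positive context: 'runs fast' implies 'runs', but not conversely.
positive_holds = entails(runs_fast, runs)
positive_converse = entails(runs, runs_fast)

# Negative context: the direction of implication flips.
negative_same_direction = entails(negate(runs_fast), negate(runs))
negative_reversed = entails(negate(runs), negate(runs_fast))
```

In this toy setting `positive_holds` and `negative_reversed` are true while the other two checks fail, which is the reversal described above.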

If we just look at strength as this term is ordinarily understood, at least in subject position the effect is rather clear: from A red car is overtaking us it follows that A car is overtaking us but not conversely. This much, however, is easily obtained from the vectorial account as well: since the polytope corresponding to red car is the intersection of the red and car polytopes, it is contained in the latter. The discussion can be extended to negative polarity contexts along the standard lines (Giannakidou, 1997), but we call attention to another phenomenon whose explanation has hitherto been lacking: it is precisely in the case of non-intersective adjectives that the implication fails: A former president will give the commencement talk ⇏ A president will give the commencement talk.
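The containment argument for intersective adjectives can be sketched as follows. This is a toy illustration under stated assumptions: a polytope is represented as a list of half-spaces (normal, bias), a point x is inside when n·x ≥ b for every half-space, and the two-dimensional "redness" and "car-ness" coordinates are invented for the example.

```python
# Hypothetical representation: a polytope is a list of (normal, bias) pairs,
# and a point x lies inside when dot(normal, x) >= bias for every pair.
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def contains(polytope, x):
    return all(dot(n, x) >= b for n, b in polytope)


def intersect(p, q):
    # Intersecting two polytopes simply pools their half-space constraints.
    return p + q


red = [((1.0, 0.0), 0.5)]  # toy: first coordinate ("redness") at least 0.5
car = [((0.0, 1.0), 0.5)]  # toy: second coordinate ("car-ness") at least 0.5
red_car = intersect(red, car)

points = [(0.9, 0.9), (0.9, 0.1), (0.1, 0.9), (0.1, 0.1)]
red_car_points = [x for x in points if contains(red_car, x)]
car_points = [x for x in points if contains(car, x)]
```

Every point in `red_car_points` also appears in `car_points`, but not conversely, since pooling constraints can only shrink the region. A non-intersective adjective such as former is not modeled by this pooling, so no such containment is guaranteed.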

To see how all this plays out for comparatives, we need to define the comparative morpheme -er, for which 4lang provides er_, =agt has quality, "_-er" mark_ stem_[quality], "than _" mark_ =pat, =pat has quality.

Most of this definition just serves to pin down A, B, and C in the A is B-er than C construction: A is the agent, B is the quality marked by the stem, and C is the patient. The only critical element is the relational er_, which we take to mean ordinary numerical comparison ‘>’ between the stem-ishness of A and C. er_ is a primitive only under the algebraic view: in the geometric view we can replace it by

$$\langle A|P|B\rangle > \langle C|P|B\rangle \tag{7.2}$$

As we have already discussed in Chapter 6, non-intersective adjectives like former actually shift P (the projection that falls on V_n is replaced by the projection to V_b), but the inequality 7.2 will otherwise remain homogeneous in the basis of comparison: cold-er is obtained from cold and -er the same way blue-er is obtained from blue and -er. In other words, the semantics of -er is perfectly compositional, and stays the same for intersective and non-intersective adjectives alike.
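A minimal numerical sketch of the geometric reading of er_ follows, assuming that ⟨A|P|B⟩ is evaluated as the bilinear form AᵀPB. The three-dimensional space, the projection matrix, and the vectors for cold, blue, and the two compared entities are all invented for illustration.

```python
def bilinear(a, p, b):
    """Compute <a|P|b> as the double sum of a[i] * P[i][j] * b[j]."""
    return sum(a[i] * p[i][j] * b[j]
               for i in range(len(a)) for j in range(len(b)))


def er(a, c, p, quality):
    """A is quality-er than C when <A|P|quality> > <C|P|quality>."""
    return bilinear(a, p, quality) > bilinear(c, p, quality)


# Toy 3-dimensional space; P projects onto the first two coordinates.
P = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0],
     [0.0, 0.0, 0.0]]
cold = [1.0, 0.0, 0.0]   # hypothetical vector for the stem 'cold'
blue = [0.0, 1.0, 0.0]   # hypothetical vector for the stem 'blue'
oslo = [0.8, 0.3, 0.5]   # hypothetical entity vectors
cairo = [0.1, 0.6, 0.5]

# The same function serves both stems: only the quality vector changes.
oslo_colder = er(oslo, cairo, P, cold)
cairo_bluer = er(cairo, oslo, P, blue)
```

The point of the sketch is that `er` is written once and reused unchanged for both stems, mirroring the compositionality claim. Swapping `P` for a different projection would be the analogue of what a non-intersective adjective does.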

Turning to superlatives, we can define -est as er_ all (entry -est/3625). Since all is defined as gen, whole, we obtain

$$\langle A|P|B\rangle > \langle C|P|1/n,\ldots,1/n \cap \mathit{whole}\rangle \tag{7.3}$$


where we have used the fact that gen is a fixed vector with 1/n on all coordinates, and that the semantics of conjunction is intersective. This can be further improved by substituting the definition of whole, which is all member, and also that of member, group has, in group. This brings to sharp relief the essence of -est, that we have some implicit comparison group, and A is -est means that for every other group member Eq. 7.2 holds.

Again the analysis is entirely compositional, and leaves implicit exactly what needs to be left implicit, the comparison group. Note that this group is not entirely supplied by the noun that the superlative attaches to: the tallest boy is not the boy who is tallest among all boys, just the one who is tallest among all relevant boys (Moltmann, 1995).

Eq. 7.3 can be faulted for using > instead of ≥. This can be easily fixed by replacing all by other in the definition of -est, but this of course implies a unique maximum.

We have arrived at a situation that is fairly common in formulaic semantics, where the correctness of an analysis must be evaluated based on the felicity of readings in somewhat contrived situations. Suppose there are two twins of the exact same height in a class, Bill and Dave. Can Bill be called the tallest, why or why not? If we think -est does not imply a unique maximum, the definition of -est/3625 is along the right lines (we’d still have to make provisions for the fact that no thing is strictly larger than itself). Since superlative plurals like the strongest boys, the most beautiful paintings are common, this lends strength to the proposal, as long as we assume that the strongest boys are equally strong.
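The twin scenario can be made concrete with a small sketch contrasting the two readings. This is an illustration only: the heights, the third classmate Tom, and the function names are hypothetical, and the non-unique reading is implemented with ≥ over the whole group as a stand-in for the all-based definition.

```python
# Hypothetical class: Bill and Dave are twins of exactly the same height.
heights = {"Bill": 180, "Dave": 180, "Tom": 172}


def est_nonunique(name, degree):
    """No unique maximum required: at least as high as every group member."""
    return all(degree[name] >= degree[m] for m in degree)


def est_unique(name, degree):
    """Unique maximum required: strictly higher than every *other* member."""
    return all(degree[name] > degree[m] for m in degree if m != name)


tallest_nonunique = sorted(n for n in heights if est_nonunique(n, heights))
tallest_unique = sorted(n for n in heights if est_unique(n, heights))
```

Under the non-unique reading both twins qualify, which fits superlative plurals; under the strict reading with other, nobody in this class counts as the tallest.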

However, if we are committed to the idea that extrema are unique, we can use a different definition of -est, er_ other (entry -est/1513). This can again be further analyzed by substituting the definition of other, which is simply different. Recall that other is a procedural keyword that prevents unification, and when we define it as different, we rely on Leibniz’ Principle of Indiscernibles, i.e. we bring in a property that distinguishes the two: different means =pat has quality, =agt lack quality, "from _" mark_ =pat. As unification operates silently, the simplest assumption is that this property must be the one marked by the stem. Other properties could also be invoked, as exemplified by Kornai, 2012 as follows.

In the case of She promised immunity for a confession, we assume the promissor p is in a position to cause some suspect s to have immunity against prosecution q for some misdeed d, and that it is s who needs to confess to d.

Yet the sentence is perfectly compatible with a looser assignment of roles, namely that the actual misdeed was committed by some kingpin k, and s is merely a witness to this, his greatest supposed crime being the withholding of evidence. This d′, being an accessory after the fact, is of course also a misdeed, but the only full-force implication from the lexical content of immunity is that there is some misdeed m that could trigger prosecution against which s needs immunity, not that m = d or m = d′. The hypothesis m = d is merely the most economical one on the part of the hearer (requiring a minimum amount of matters to keep track of) but one that can be defeased as soon as new evidence comes to light.


That said, the most natural (default) assumption is that the distinguishing property is indeed the one supplied by the adjectival stem, which implies that the noun modified by stem-est is indeed the one that has the property signified by stem in the greatest measure among all candidates.

4lang has only two definitions, best ‘optimus’ good, -est; and main ‘primus’ er_ other, rank, lead/2617 that rely on the superlative morpheme. In the former case, we left the choice of superlative between 3625 and 1513 unresolved, since best inherits the ambiguity, but in the latter case we resolve it in favor of 1513, as it is commonly assumed that there can only be one main city, main thoroughfare, etc.

In a weaker form, gradience phenomena are observable not just on adjectives but also on nouns. Many languages have diminutives like English -ette (cigar/cigarette, kitchen/kitchenette, pipe/pipette, ...) and augmentatives like Italian -one (minestra/minestrone, provola/provolone, spilla/spillone, ...) but these are rarely productive, whereas comparatives and superlatives are so productive that the existence of such forms is often taken as diagnostic for the adjectival status of the stem.

Another set of examples comes from syntax: English and many other languages have a fully productive construction with true/real in the noun modifier slot. A true Scotsman is one that enjoys all properties Scotsmen are supposed to have, a real Colt is a revolver actually made by Colt’s Manufacturing Company, and so on. Since this ‘prototypicality’ reading of the construction is non-compositional (all 4lang definitions of true, real, fact revolve around existence and proof) we must supply the semantics based on the word prototypical, which means ‘very typical’ (LDOCE). Typical means ‘having the usual features or qualities of a particular group or thing’, so a true/real X must be something that has the usual features/qualities of X in large measure. This can be implemented using the same idea. If we model a word by a polytope that is the intersection of some half-spaces H_i, we can form the intersection of H_i′ defined by the same normal vectors n_i but higher biases b_i′ > b_i.
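The bias-raising construction can be sketched in a few lines. The assumptions here are illustrative: a half-space H_i is taken to be the set of points x with n_i·x ≥ b_i, all biases are raised by one shared positive margin (the text only requires b_i′ > b_i), and the two-dimensional coordinates and numbers are invented.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def contains(polytope, x):
    """x is inside when dot(n_i, x) >= b_i for every half-space (n_i, b_i)."""
    return all(dot(n, x) >= b for n, b in polytope)


def sharpen(polytope, margin):
    """Keep each normal vector n_i but raise each bias b_i by a positive margin."""
    return [(n, b + margin) for n, b in polytope]


# Toy polytope for X: two usual features, each required at level 0.5.
scotsman = [((1.0, 0.0), 0.5), ((0.0, 1.0), 0.5)]
true_scotsman = sharpen(scotsman, 0.3)

typical = (0.9, 0.9)    # has both features in large measure
marginal = (0.6, 0.6)   # barely qualifies as X

typical_is_true = contains(true_scotsman, typical)
marginal_is_plain = contains(scotsman, marginal)
marginal_is_true = contains(true_scotsman, marginal)
```

Because every constraint only gets tighter, the sharpened polytope is contained in the original one: anything that counts as a true/real X in this sketch also counts as an X, while marginal members drop out.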

Cross-linguistically, intensifier morphemes can apply to all categories, as in Russian pre- (predobryj ‘very kind’ from adjectival dobryj ‘kind’; premnogo ‘very much’ from adverbial mnogo ‘much’; preizbytok ‘large abundance’ from nominal izbytok ‘abundance’; preuspet’ ‘succeed in’ from uspet’ ‘manage’) though not with equal productivity (Endresen, 2013). Importantly, quantifiers are no exception: they can be intensified, as in anything at all, and serve as intensifiers, as in so I don’t work or anything (Labov, 1984).

The ease of creating homogeneous semantics for intensification processes, coupled with the very visible heterogeneity of their productivity across lexical categories, provides yet another argument in favor of Bloomfield’s rejection of ‘class meaning’ we cited in 2.1. It is not that we need to reject the very notion of lexical categories; in fact Lévai and Kornai (2019) demonstrate that word vectors in different syntactic categories show different behavior, a matter we shall return to in 8.2. Rather, we have to draw the line between inflectional and derivational morphology following the dictum of Anderson (1982): inflection is what is relevant for syntax. Syntactic constructions and inflectional regularities can be most economically stated over lexical categories, and when we see
