
Getting the most out of most

Joseph Lochlann Smith


September 21, 2012

Contents

1 Generalised Quantifier Theory
  1.1 Limitations of GQT
2 Three critiques of generalised quantifiers
  2.1 Most and more than half
  2.2 At least three and three
  2.3 At least three and more than two
3 The syntactic distribution of at most
  3.1 Comparison with adverbs
  3.2 Comparison with only
  3.3 Summary
4 The proposal: from most to at most
  4.1 Intensional most
  4.2 Ranking propositions
  4.3 Refining the account
  4.4 At best
5 Semantic or pragmatic modality?
6 Conclusion
References

Introduction
Generalised Quantifier Theory (GQT) is one of the great success stories of
formal semantics. In recent years, though, it has come under increasing
criticism. Much of this criticism is directed at the perceived coarseness
of GQT. Because its aims and successes lie in characterising general
properties of large classes of disparate quantificational expressions, it glosses
over subtle differences of meaning and ignores internal syntactic structure.
What much subsequent research has shown is that the two failings are
intimately linked. Differences in meaning between expressions equated by
GQT often find their explanation in the internal structure which GQT has
ignored. Conversely, more detailed exploration of the structures subsumed
under the Det category of GQT has often led to the discovery of differences
in their behaviour.
In this paper I will attempt to bring together two separate strands of research in this vein. The first is Hackl's (2009) analysis of the differences between most and more than half. This is work in the explicitly compositional tradition of Heim and Kratzer (1998), and its critique of GQT is centred on the latter's insensitivity to the internal structure of multi-word and even single-word quantifiers.

The second is work on inadequacies of GQT with respect to the expressions at most and at least, which I will henceforth (following Geurts and Nouwen (2007)) refer to as superlative scalar modifiers or simply superlative modifiers. The specific puzzles under consideration vary from paper to paper, but all the work from Geurts and Nouwen (2007) through Büring (2008) to Cummins and Katsos (2010) has built on the formal foundations of Krifka (1999).

My goal is to address the question speculatively posed at the end of Geurts and Nouwen (2007): what exactly is the relationship between comparative and superlative modifiers, on the one hand, and comparative and superlative morphology, on the other? I will concern myself only with the superlative side of things, and my specific aim will be to derive the lexical entry for at most hypothesised by Geurts and Nouwen from Hackl's lexical entry for the superlative morpheme -est. In so doing, I hope to explain the relationship between at most and the closely related expression at best, an issue overlooked by all authors that I am aware of. This will, I hope, pave the way for exploration of superlative scalar modifiers as a wider class of expressions than previously assumed, and ultimately the formulation of some cross-linguistic generalisations about them.

1 Generalised Quantifier Theory

The mathematical notion of generalised quantifiers dates back to Mostowski (1957), but their most famous application to linguistic issues is Barwise and Cooper's classic 1981 paper.[1] Their key insight was that there are quantificational expressions in natural language whose meaning cannot be expressed in first-order predicate logic. More than this, first-order logic is inadequate not simply because its inventory of quantifiers ($\forall$ and $\exists$) is too limited but because of fundamental limitations in its formulation.
The key witness to this fact is the word most. Barwise and Cooper proved that no matter how we enrich its inventory of quantifiers, first-order logic cannot express the proportional relation denoted by most. Compare the following (where in (1-d), $\{\wedge, \vee, \neg, \rightarrow\}$ denotes any combination of the logical operators contained in the set):

(1) a. $\llbracket \text{every} \rrbracket(A)(B) = 1$ iff $\forall x[A(x) \rightarrow B(x)]$
    b. $\llbracket \text{some} \rrbracket(A)(B) = 1$ iff $\exists x[A(x) \wedge B(x)]$
    c. $\llbracket \text{no} \rrbracket(A)(B) = 1$ iff $\neg\exists x[A(x) \wedge B(x)]$
    d. *$\llbracket \text{most} \rrbracket(A)(B) = 1$ iff $Mx[A(x) \; \{\wedge, \vee, \neg, \rightarrow\} \; B(x)]$

Every, some, and no can be expressed using the existing resources of predicate logic. For most, by contrast, there is no first-order quantifier $Mx$ which will allow us to represent its meaning.
The solution to the issue is to employ Mostowski's generalisation of the notion of a first-order quantifier, such that relations like $\forall$ and $\exists$ are in fact specific instances of the more general class of relations between sets of entities. On this view, we can give a uniform treatment to the quantifiers listed above, as well as to many more that we find in natural language.

(2) a. $\llbracket \text{every} \rrbracket(A)(B) = 1$ iff $A \subseteq B$
    b. $\llbracket \text{some} \rrbracket(A)(B) = 1$ iff $A \cap B \neq \emptyset$
    c. $\llbracket \text{no} \rrbracket(A)(B) = 1$ iff $A \cap B = \emptyset$
    d. $\llbracket \text{most} \rrbracket(A)(B) = 1$ iff $|A \cap B| > |A - B|$
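The set-relational entries in (2) translate almost directly into code. The following is only a minimal sketch for illustration: the function names and the toy sets `boys` and `runners` are assumptions of the example and are not part of Barwise and Cooper's formalism.

```python
# Determiner denotations from (2) as relations between finite sets.
# All names and the toy model below are illustrative assumptions.

def every(a, b):
    return a <= b                    # A is a subset of B

def some(a, b):
    return len(a & b) > 0            # A and B overlap

def no(a, b):
    return len(a & b) == 0           # A and B are disjoint

def most(a, b):
    return len(a & b) > len(a - b)   # more As are Bs than are not

boys = {"al", "bo", "cy"}            # hypothetical extension of "boy"
runners = {"al", "bo", "di"}         # hypothetical extension of "run"

results = {
    "every": every(boys, runners),
    "some": some(boys, runners),
    "no": no(boys, runners),
    "most": most(boys, runners),
}
```

In this toy model two of the three boys run, so only some and most come out true. Note that most needs nothing beyond set difference and cardinality comparison, which is exactly what the first-order schema in (1-d) cannot supply.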

We must be careful to distinguish early on between various conflicting uses of the terms quantifier and determiner. A quantifier in the GQT sense is not an individual word like every or most; such words are known as quantificational determiners. A generalised quantifier is the denotation of the DP formed by the combination of a quantificational determiner with an NP, for instance every boy. (Note that in the syntactic theory of the early 80s, every boy was considered an NP, not a DP.)

[1] It was not, however, the first: they were employed in Montague (1974). Montague's aims, though, did not coincide with those of Chomskyan linguistics to the extent that Barwise and Cooper's did, as Montague was concerned with delimiting the class of all possible natural and artificial languages, as opposed to just natural languages.
(3) . . . semantically . . . more than half is not acting like a quantifier, but like a determiner. It combines with a set expression to produce a quantifier. On this view, the structure of the quantifier may be represented as below:

              Quantifier
             /          \
      Determiner    Set expression

    . . . we can see that the structure of the logical quantifier corresponds in a precise way to the English noun phrase (NP) as represented in:

              NP
             /  \
          Det    Noun
           |      |
         most   people

    (Barwise and Cooper 1981: 162)

The term determiner is, in Barwise and Cooper's usage, potentially ambiguous between a semantic sense (i.e. the denotation of a word like every), corresponding to Determiner in the first tree diagram above, and a syntactic one, corresponding to Det in the second tree. Here, I will follow Szabolcsi (2010) in referring to the former as semantic determiners or determiner denotations and the latter simply as determiners.

This clarified, we can now say that the relations between sets that constitute the denotations in (2) are not quantifier denotations but the denotations of quantificational determiners. To understand the denotations of generalised quantifiers, let's restate our definitions in (2) using lambda abstraction.[2]
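(4) a. $\llbracket \text{every} \rrbracket = \lambda A \lambda B[A \subseteq B]$
    b. $\llbracket \text{some} \rrbracket = \lambda A \lambda B[A \cap B \neq \emptyset]$
    c. $\llbracket \text{no} \rrbracket = \lambda A \lambda B[A \cap B = \emptyset]$
    d. $\llbracket \text{most} \rrbracket = \lambda A \lambda B[\#(A \cap B) > \#(A - B)]$

The arguments A and B are predicates, i.e. sets of individuals or their characteristic functions, depending on our perspective. Let's combine a predicate with our determiner denotations in (4) via functional application to produce generalised quantifiers:

[2] For clarity in combination with the square brackets, I use the notation $\#(P)$ as a substitute for $|P|$ here and elsewhere.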
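(5) a. $\llbracket \text{every boy} \rrbracket = \lambda A \lambda B[A \subseteq B](\llbracket \text{boy} \rrbracket) = \lambda B[\llbracket \text{boy} \rrbracket \subseteq B]$
    b. $\llbracket \text{some boy} \rrbracket = \lambda A \lambda B[A \cap B \neq \emptyset](\llbracket \text{boy} \rrbracket) = \lambda B[\llbracket \text{boy} \rrbracket \cap B \neq \emptyset]$
    c. $\llbracket \text{no boy} \rrbracket = \lambda A \lambda B[A \cap B = \emptyset](\llbracket \text{boy} \rrbracket) = \lambda B[\llbracket \text{boy} \rrbracket \cap B = \emptyset]$
    d. $\llbracket \text{most boys} \rrbracket = \lambda A \lambda B[\#(A \cap B) > \#(A - B)](\llbracket \text{boy} \rrbracket) = \lambda B[\#(\llbracket \text{boy} \rrbracket \cap B) > \#(\llbracket \text{boy} \rrbracket - B)]$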

The generalised quantifiers thus formed are second-order predicates, that is to say, they denote (the characteristic functions of) sets of sets. The denotation of (5-a) is the set of all properties possessed by every boy, which is the same thing as the set of all sets which contain every boy.
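The step from determiner to generalised quantifier, and the claim that the result is a set of sets, can be made concrete over a small finite domain. This is a sketch under stated assumptions: the three-individual domain, the extension chosen for boy, and the helper `powerset` are all hypothetical and serve only to illustrate (5-a).

```python
from itertools import combinations

def every(a):
    # Curried determiner: takes the restrictor set A and returns
    # a predicate over sets B, as in the lambda terms of (4).
    return lambda b: a <= b

def powerset(domain):
    # All subsets of a finite domain, as frozensets.
    items = sorted(domain)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]

domain = {"al", "bo", "di"}       # hypothetical universe of individuals
boys = frozenset({"al", "bo"})    # hypothetical extension of "boy"

every_boy = every(boys)           # the generalised quantifier in (5-a)

# The quantifier viewed as a set of sets: every subset of the
# domain that contains all the boys.
every_boy_extension = {b for b in powerset(domain) if every_boy(b)}
```

With this domain the extension contains exactly two sets, the set of boys itself and the whole domain, which are the only properties that every boy has.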
Generalised quantifiers have proven to be an immensely powerful tool for
investigating the deep semantic structure of natural language quantification.
They have been particularly useful at uncovering potentially universal semantic properties of determiners, such as conservativity and extension (see,
for instance, Keenan and Stavi (1986)), and exploring interesting implications for the learnability and processing of quantificational expressions. But
they are not without their limitations.

1.1 Limitations of GQT

There are two respects in which GQT can be considered coarse. The first
is syntactic, the second semantic.
The syntactic coarseness is the result of the rather blunt way in which
Barwise and Cooper identify the Det in the lowermost tree diagram of (3).
Essentially, what they choose to identify as the determiner is the result of
removing the noun from the noun phrase. In the phrase at least three boys,
boys is obviously the noun, and therefore the determiner must (according
to Barwise and Cooper) be at least three. Once a determiner has been so
identified, its internal structure is entirely ignored. This extends even to
examples like not more than two people, where the determiner is identified
as not more than two and the matter is left at that. For instance, consider
how the following syntactically and morphologically diverse expressions are
indistinguishable in GQT:
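(6) a. $\llbracket \text{one} \rrbracket(A)(B) = 1$ iff $|A \cap B| \geq 1$
    b. $\llbracket \text{at least one} \rrbracket(A)(B) = 1$ iff $|A \cap B| \geq 1$
    c. $\llbracket \text{less than two} \rrbracket(A)(B) = 1$ iff $|A \cap B| \geq 1$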

The semantic coarseness of GQT, on the other hand, has a number of aspects. GQT is a strictly truth-conditional theory, and as such has nothing to say about implicatures which may come packaged with the meaning of an expression. This is, of course, hardly surprising in view of the aims of formal semantics in the late 70s and early 80s, as Szabolcsi (2010, p. 76) points out. Nonetheless, it leaves the GQT lexical entries unable to account for certain aspects of the semantic behaviour of quantifiers, and puts GQT increasingly out of step with modern developments in the field.

GQT is also strictly extensional. This may seem entirely justified for a theory of quantification. However, we will see that some of the expressions labelled as quantificational determiners by GQT give rise to modal meanings (though whether these modalities arise via semantic or pragmatic means is the subject of ongoing debate).

A number of authors (Hackl (2009), Pietroski et al. (2009), Lidz et al. (2011)) have cited examples such as (7) as evidence of GQT's perceived insensitivity to the form in which truth conditions are stated.

(7) a. $\llbracket \text{no} \rrbracket(A)(B) = 1$ iff $A \cap B = \emptyset$
    b. $\llbracket \text{no} \rrbracket(A)(B) = 1$ iff $|A \cap B| = 0$
    c. $\llbracket \text{no} \rrbracket(A)(B) = 1$ iff $|A \cap B| < 1$
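That the three statements in (7) cannot be told apart by truth value alone can be checked mechanically on a small model. The sketch below is an illustration only: the function names and the three-element domain are assumptions, and an exhaustive check over one finite domain is a sanity check, not a proof.

```python
from itertools import combinations

def subsets(domain):
    # All subsets of a finite domain, as frozensets.
    items = sorted(domain)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]

def no_disjoint(a, b):          # (7-a): the intersection is empty
    return a.isdisjoint(b)

def no_count_zero(a, b):        # (7-b): the intersection has size 0
    return len(a & b) == 0

def no_count_below_one(a, b):   # (7-c): the intersection has size < 1
    return len(a & b) < 1

domain = {"x", "y", "z"}        # hypothetical domain of individuals

# Compare the three formulations on every pair of subsets.
all_agree = all(
    no_disjoint(a, b) == no_count_zero(a, b) == no_count_below_one(a, b)
    for a in subsets(domain)
    for b in subsets(domain)
)
```

The three functions compute their answer in different ways (a disjointness test, an equality test on a count, an inequality on a count) while returning the same value on every input. That difference in procedure with no difference in truth conditions is the contrast the verification literature appeals to.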

They argue that certain statements of truth conditions more accurately convey the intermediate representations that language users make use of in computing truth values, citing experimental data relating to the verification procedures employed by subjects in evaluating statements.
While it seems right that there is more to meaning than truth conditions, I don't see how this can possibly be a criticism of GQT without being a criticism of the whole of truth-conditional semantics. After all, the fact that truth conditions can be stated in multiple forms is true of any truth-conditional lexical entry in formal semantics. One can, for instance, substitute an expression of the form $\neg\exists x\,\varphi$ for one of the form $\forall x\,\neg\varphi$ while preserving truth, and two lexical entries so differentiated cannot be considered substantively different. One may be preferred for reasons of readability, or to make the author's intentions clearer, but such issues cannot be confused with substantive difference in meaning.
As I see it, if one wants to seriously pursue the study of verification procedures, one needs a new (explicitly procedural) representational framework
in addition to truth-conditional semantics. Otherwise, one is simply imbuing
notation with intuited meaning over and above its intended use.
These thorny issues aside, the main point is that much recent work has started from a standpoint of dissatisfaction with GQT and attempted to find ways to differentiate, both syntactically and semantically, expressions which are equated by GQT. I will, therefore, introduce each critique of GQT with reference to the truth-conditional equivalences it objects to, but implying nothing more than that the meaning thus expressed does not fully account for the semantic behaviour of the expressions in question.

2 Three critiques of generalised quantifiers

2.1 Most and more than half

In his 2009 paper, Hackl notes that most and more than half are indistinguishable in GQT. The two formulations of their truth conditions below are equivalent and interchangeable.

(8) a. $\llbracket \text{most} \rrbracket(A)(B) = 1$ iff $|A \cap B| > |A - B|$
    b. $\llbracket \text{more than half} \rrbracket(A)(B) = 1$ iff $|A \cap B| > |A - B|$

(9) a. $\llbracket \text{most} \rrbracket(A)(B) = 1$ iff $|A \cap B| > \frac{1}{2}|A|$
    b. $\llbracket \text{more than half} \rrbracket(A)(B) = 1$ iff $|A \cap B| > \frac{1}{2}|A|$
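The interchangeability of the formulations in (8) and (9) rests on one counting fact: every member of A is either in B or not. A short derivation, assuming A is finite, spells out the step:

```latex
\begin{align*}
|A| &= |A \cap B| + |A - B| \\[4pt]
|A \cap B| > |A - B|
  &\iff |A \cap B| > |A| - |A \cap B| \\
  &\iff 2\,|A \cap B| > |A| \\
  &\iff |A \cap B| > \tfrac{1}{2}\,|A|
\end{align*}
```

Nothing in this derivation refers to the words most or more than half themselves, which is why GQT has no means of telling the two apart.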

Hackl provides persuasive evidence of systematic differences between the two expressions which can be attributed to their compositional structure but which are a mystery from the GQT perspective.

The main evidence is a puzzling asymmetry between most and fewest. While in English more than half has a polar opposite less than half (whose meanings are straightforwardly related as in (10)), the relationship between most and fewest is more complex. (Similar facts hold for German.)

(10) a. $\llbracket \text{more than half} \rrbracket(A)(B) = 1$ iff $|A \cap B| > \frac{1}{2}|A|$
     b. $\llbracket \text{less than half} \rrbracket(A)(B) = 1$ iff $|A \cap B| < \frac{1}{2}|A|$

Most has two readings, associated with slightly different surface syntax in English (in German only one surface pattern is attested and is ambiguous between the two readings).

(11) a. John climbed most of the mountains. (PROPORTIONAL)
     b. John climbed the most mountains. (RELATIVE)

For the statement in (11-a), which gives rise to what Hackl calls the proportional reading, most has a meaning equivalent to more than half. The statement in (11-b), by contrast, means that John climbed more mountains than anyone else in a contextually determined comparison set.

(12) a. *John climbed fewest of the mountains. (PROPORTIONAL)
     b. John climbed the fewest mountains. (RELATIVE)

In (12-a), we see that the proportional reading (less than half the mountains) is unavailable for fewest, though the reading on which John climbed fewer mountains than anyone else is available. In German the surface syntax of the two is identical but fewest still lacks the proportional reading.

Hackl demonstrates that the explanation for these facts lies in the word-internal compositional structure of most and fewest, specifically the presence of the superlative morpheme in both. As noted by Szabolcsi (1986) and Heim (1999), superlatives exhibit a very similar ambiguity to most.
(13) John climbed the tallest mountain.

(13) can mean that John climbed a taller mountain than anyone else, or (in the absence of a context which limits the comparison class) that he climbed Mount Everest. Following Heim (1999), Hackl claims that the superlative morpheme -est moves at LF for interpretability, and thus attributes the ambiguity in (13) and the two forms in (11) to differences in the LF scope of -est.

(14) $\llbracket \text{many} \rrbracket(d)(A) = \lambda x.[A(x) \wedge |x| \geq d]$

(15) For all C of type $\langle e, t \rangle$, D of type $\langle d, \langle e, t \rangle\rangle$ and x of type e,
     $\llbracket \text{-est} \rrbracket(C)(D)(x)$ is defined only if $x \in C \wedge \exists y[y \in C]$.
     When defined, $\llbracket \text{-est} \rrbracket(C)(D)(x) = 1$ iff
     $\forall y \in C[y \neq x \rightarrow \max\{d : D(d)(x) = 1\} > \max\{d : D(d)(y) = 1\}]$.

(16) a. [John climbed [the [-est C]$_i$ [$d_i$-many mountains]]]
     b. [John [-est C]$_i$ [climbed [the $d_i$-many mountains]]]

Hackl assumes that if [-est C] stays inside the DP, the comparison class C is identical to the NP sister of the degree function. In (16-a) this makes it the set of pluralities of mountains. If [-est C] moves into the matrix clause, the comparison class is a set of salient individuals including the subject. In (16-b) this makes it the set of individuals who climbed pluralities of mountains.
Many-est comes out with a proportional reading because of the assumption that the pluralities in the comparison class C have no overlapping atomic parts. We can see why this meaning comes out if we consider an analogy with seats in Parliament. A party which has an overall majority (i.e. more than half the seats) is guaranteed to be in government, because the party with more seats than any other forms a government. If one party has more than half the seats, it is impossible for any other party, even if it controls all the remaining seats, to have more seats. Similarly, Hackl's lexical entry makes $\llbracket \text{many-est} \rrbracket(C)(x)$ true of x if x has more atomic members than any distinct (i.e. non-overlapping) plurality in C. If x contains more than half of all the atomic members under consideration, this condition is guaranteed.
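The reasoning behind the parliamentary analogy can be tested on a toy model. The sketch below is a simplification made for illustration and is not Hackl's own formalisation: pluralities are modelled as non-empty sets of atoms, the definedness condition of -est is omitted, and the names `pluralities`, `many_est` and the five-mountain domain are assumptions of the example.

```python
from itertools import combinations

def pluralities(atoms):
    # All non-empty sets of atoms, a simple stand-in for pluralities.
    items = sorted(atoms)
    return [frozenset(c)
            for r in range(1, len(items) + 1)
            for c in combinations(items, r)]

def many_est(comparison_class, x):
    # True iff x has more atoms than every plurality in the
    # comparison class that shares no atoms with x.
    return all(len(x) > len(y)
               for y in comparison_class
               if x.isdisjoint(y))

mountains = {"m1", "m2", "m3", "m4", "m5"}   # hypothetical atoms
C = pluralities(mountains)

# For every plurality x, the superlative condition should hold
# exactly when x contains more than half of the atoms.
matches_majority = all(
    many_est(C, x) == (len(x) > len(mountains) / 2)
    for x in C
)
```

The check succeeds because the largest plurality that does not overlap with x is the complement of x, just as the strongest possible rival to a party is a coalition holding all the remaining seats.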
Hackl's achievement is to show how a careful consideration of the structure that GQT leaves unanalysed can provide insightful explanations of otherwise puzzling phenomena. Deriving the semantics of most from the semantics of the superlative morpheme is a triumph for a compositional approach, and one which I hope to extend to at most.

2.2 At least three and three

Krifka takes exception to the following equivalence in GQT:

(17) a. $\llbracket \text{at least three} \rrbracket(A)(B) = 1$ iff $|A \cap B| \geq 3$
     b. $\llbracket \text{three} \rrbracket(A)(B) = 1$ iff $|A \cap B| \geq 3$

He claims that the indistinguishable truth conditions fail to explain why three gives rise to a scalar implicature that stronger alternatives on the Horn scale are false, while at least three does not. Krifka argues that part of the reason for this failure of GQT is that it gets the syntax of modified numeral expressions wrong. Barwise and Cooper analyse at least three as a determiner which combines with a noun phrase to give the structure [at least three [boys]]. Krifka disputes this, maintaining that [at least] is an independent constituent which combines with [three [boys]] to give [[at least] [three [boys]]].

The distributional data are discussed in more detail in section 3, but suffice it to say for now that the separability of [at least] from the numeric expressions it sometimes attaches to seems indisputable in light of examples like the following:
(18) a. At least John drank three beers.
     b. You should at least call her.
     c. Mary is at most an associate professor.
     d. Three people came, at most.

The above examples demonstrate not only the syntactic deficiency of the GQT analysis of superlative scalar modifiers but also some of its corresponding semantic blind spots, for nothing in the GQT truth conditions can explain how at most is able to modify an associate professor, or how at least modifies the VP call her.

Krifka's solution to the puzzle crucially relies on another facet of superlative scalar modifiers which GQT overlooks, namely their ability to associate with focus in a way which affects truth conditions.[3] Consider:

(19) a. At most [three]$_F$ boys left.
     b. At most three [boys]$_F$ left.

Sentence (19-a) could still be true if, say, three boys and three girls left. Sentence (19-b) is ambiguous between a reading where the focus is specifically on the noun boys, and one where focus projects to the entire noun phrase three boys. If we assume the latter, the sentence would be false (given an appropriate context) if three boys and three girls left.
As Krifka's formal account of focus will be essential for my own account of at least, it is worth considering in some detail. Krifka follows Rooth (1985) in assuming that focussed expressions have the semantic function of introducing context-determined alternatives which percolate upwards compositionally through the tree until they meet an operator which makes use of them. He then additionally assumes that number words and other expressions which naturally line up along a scale are special, in that they can introduce alternatives without being focussed, and that the set of alternatives they introduce comes with an ordering relation. Superlative scalar modifiers like at most and at least depend upon this set of alternatives and its ordering relation for their meaning.

Let's consider a concrete example.

(20) At most John drank three beers.

According to Krifka, the number word three introduces a set of ordered alternatives which percolate upwards compositionally according to the following rule (this idea goes back to Hamblin (1973)):

(21) If $\llbracket [\alpha\ \beta] \rrbracket = f(\llbracket \alpha \rrbracket, \llbracket \beta \rrbracket)$, then
     $\llbracket [\alpha\ \beta] \rrbracket_a = \{\langle f(X, Y), f(X', Y') \rangle \mid \langle X, X' \rangle \in \llbracket \alpha \rrbracket_a \text{ and } \langle Y, Y' \rangle \in \llbracket \beta \rrbracket_a\}$

Those expressions which do not introduce alternatives (i.e. that are neither focussed nor come with an ordering relation) are assumed for formal reasons to come with a set consisting of both their meaning proper (to allow composition with unordered alternatives) and an ordered pair formed with their meaning proper (to allow composition with ordered alternatives).

[3] This phenomenon has been explored in depth for adverbials such as only and even in work such as Rooth (1985).

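Here's how the first few steps of such composition would play out for our example (20).

(22) a. $\llbracket \text{three} \rrbracket = \lambda P \lambda x[3(x) \wedge P(x)]$
        $\llbracket \text{three} \rrbracket_a = \{\langle \lambda P \lambda x[n(x) \wedge P(x)], \lambda P \lambda x[m(x) \wedge P(x)] \rangle \mid n \leq m\}$

     b. $\llbracket \text{beers} \rrbracket = \text{beers}$
        $\llbracket \text{beers} \rrbracket_a = \{\text{beers}, \langle \text{beers}, \text{beers} \rangle\}$

     c. $\llbracket \text{three beers} \rrbracket = \lambda P \lambda x[3(x) \wedge P(x)](\text{beers}) = \lambda x[3(x) \wedge \text{beers}(x)]$
        $\llbracket \text{three beers} \rrbracket_a = \{\langle \lambda x[n(x) \wedge \text{beers}(x)], \lambda x[m(x) \wedge \text{beers}(x)] \rangle \mid n \leq m\}$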
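The pointwise rule in (21) and its application in (22) can be imitated with ordered pairs in code. This is a deliberately reduced sketch, not Krifka's system: meanings are replaced by simple stand-in values, only the ordered-pair part of each alternative set is modelled, the numerals are capped at a small bound, and every name (`numeral_alternatives`, `trivial_alternatives`, `compose`, `apply_numeral`) is an assumption of the example.

```python
from itertools import product

def numeral_alternatives(max_n):
    # Ordered alternatives for a numeral: pairs (n, m) with n <= m,
    # restricted to 1..max_n to keep the set finite.
    return {(n, m)
            for n in range(1, max_n + 1)
            for m in range(n, max_n + 1)}

def trivial_alternatives(meaning):
    # An expression that introduces no alternatives contributes
    # the ordered pair of its own meaning with itself.
    return {(meaning, meaning)}

def compose(f, alts_left, alts_right):
    # Rule (21): apply f to the first members and to the second
    # members of every pair of ordered alternatives.
    return {(f(x, y), f(x2, y2))
            for (x, x2), (y, y2) in product(alts_left, alts_right)}

def apply_numeral(n, noun):
    # Stand-in for functional application: "n beers" is
    # represented as the pair (n, noun).
    return (n, noun)

three_beers_alts = compose(
    apply_numeral,
    numeral_alternatives(3),
    trivial_alternatives("beers"),
)
```

The ordering on the numerals survives composition: the pair relating one beer to three beers is in the result, while the reverse pair is not. Repeating `compose` at each higher node would carry the same ordering up to the proposition level.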

Since in (20) at most attaches to the S node of the syntactic tree, the alternatives that serve as input to at most are those that percolate all the way to the top, and are thus an ordered set of propositions:

(23) $\llbracket [\text{John } [\text{drank } [\text{three } [\text{beers}]]]] \rrbracket_a =$
     $\{\langle \exists x[n(x) \wedge \text{beers}(x) \wedge \text{drank}(\text{john})(x)], \exists x[m(x) \wedge \text{beers}(x) \wedge \text{drank}(\text{john})(x)] \rangle \mid n \leq m\}$

In Krifka's analysis, at most and at least use the focus-induced alternatives in such a way that leaves no alternatives for scalar implicatures to negate. This is the explanation for the contrast between at least three and three which GQT fails to capture.

While I will make use of Krifka's ideas on projection of ordered alternatives by focus, I will not use his specific analyses of at most and at least, mainly because they fail to account for the phenomena which are the topic of the next section.

2.3 At least three and more than two

While Krifka criticised the equivalence of at least n boys and n boys in GQT, he made no such distinction between at least n boys and more than n-1 boys, and his focus-sensitive definitions for at least and more than in effect equate the two, just as GQT does (likewise for at most n and less than n+1):

(24) a. $\llbracket \text{at least three} \rrbracket(A)(B) = 1$ iff $|A \cap B| \geq 3$
     b. $\llbracket \text{more than two} \rrbracket(A)(B) = 1$ iff $|A \cap B| \geq 3$

(25) a. $\llbracket \text{at least three} \rrbracket(A)(B) = 1$ iff $|A \cap B| > 2$
     b. $\llbracket \text{more than two} \rrbracket(A)(B) = 1$ iff $|A \cap B| > 2$

(26) a. $\llbracket \text{at most two} \rrbracket(A)(B) = 1$ iff $|A \cap B| \leq 2$
     b. $\llbracket \text{less than three} \rrbracket(A)(B) = 1$ iff $|A \cap B| \leq 2$

(27) a. $\llbracket \text{at most two} \rrbracket(A)(B) = 1$ iff $|A \cap B| < 3$
     b. $\llbracket \text{less than three} \rrbracket(A)(B) = 1$ iff $|A \cap B| < 3$

Geurts and Nouwen (2007) take exception to this equivalence of comparative and superlative scalar modifiers. Their key pieces of evidence are the
following.
Firstly, superlative modifiers allow a specific construal that is infelicitous
for comparative modifiers:
(28) a. I will invite at most two people, namely Jack and Jill.
     b. ?I will invite fewer than three people, namely Jack and Jill.

(29) a. I will invite at least two people, namely Jack and Jill.
     b. ?I will invite more than one person, namely Jack and Jill.

Secondly, there is a contrast in the accessibility of inferences to statements involving superlative and comparative modifiers:

(30) a. Beryl had three sherries.
     b. Beryl had more than two sherries.
     c. ?Beryl had at least three sherries.

While (30-b) indisputably follows from (30-a), the inference from (30-a) to (30-c) is more questionable, a result which has been borne out in several experimental studies (Geurts (2007), Geurts, Katsos, Cummins, Moons, and Noordman (2010)).
Thirdly, in combination with modals superlative modifiers give rise to
ambiguities that comparative modifiers do not.
(31) a. You must have at least three beers.
     b. You must have more than two beers.

Example (31-a) has a preferred reading that you are required to drink a minimum of three beers. This is what Büring (2008) calls the authoritative reading. It also has another, less accessible reading that three beers is the minimum number that you are required to drink, but it's possible you are required to drink four beers, or five beers, etc. This Büring calls the speaker insecurity reading.

Example (31-b) is, by contrast, not ambiguous in the same way. It cannot have the speaker insecurity reading, and means simply that you are required to drink a minimum of three beers.
Geurts and Nouwen solve these puzzles by ascribing a modal component
to the meanings of superlative modifiers. Here are their lexical entries:
(32) a. If $\alpha$ is of type $t$, then
        $\llbracket \text{at least } \alpha \rrbracket = \Box\alpha \wedge \exists\beta[\beta \rhd \alpha \wedge \Diamond\beta]$
     b. If $\alpha$ is of type $\langle a, t \rangle$, then
        $\llbracket \text{at least } \alpha \rrbracket = \lambda X[\Box\alpha(X) \wedge \exists\beta[\beta \rhd \alpha \wedge \Diamond\beta(X)]]$

(33) a. If $\alpha$ is of type $t$, then
        $\llbracket \text{at most } \alpha \rrbracket = \Diamond\alpha \wedge \neg\exists\beta[\beta \rhd \alpha \wedge \Diamond\beta]$
     b. If $\alpha$ is of type $\langle a, t \rangle$, then
        $\llbracket \text{at most } \alpha \rrbracket = \lambda X[\Diamond\alpha(X) \wedge \neg\exists\beta[\beta \rhd \alpha \wedge \Diamond\beta(X)]]$

Notice that, like Krifka, Geurts and Nouwen acknowledge the fact that superlative scalar modifiers can combine semantically with entire propositions, a fact not just overlooked in GQT but essentially inexpressible within its framework. They also make use of Krifka's notion of ordered alternatives: the $\rhd$ operator in the lexical entries above indicates an ordering relation between the two entities so joined (namely that the former precedes the latter).

Geurts and Nouwen's introduction of an epistemically modal component to the meaning is, however, novel. On their view, a speaker uttering John drank at most three beers means that, as far as they know, it is possible that John drank three beers, and not possible that he drank more than three. On the other hand, John drank at least three beers means it is certain that he drank three, and possible he drank more than three.
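The paraphrases just given can be sketched with a very simple model of a speaker's knowledge. Everything in the sketch is an assumption made for illustration rather than part of Geurts and Nouwen's proposal: a world is identified with the number of beers John drank, an epistemic state is the set of worlds the speaker cannot rule out, numerals get a lower-bounded reading, and the scale is capped at `MAX_BEERS`.

```python
MAX_BEERS = 10   # hypothetical upper end of the scale

def possible(state, prop):
    # Diamond: prop holds in some world the speaker considers open.
    return any(prop(w) for w in state)

def certain(state, prop):
    # Box: prop holds in every world the speaker considers open.
    return all(prop(w) for w in state)

def drank(n):
    # "John drank n beers" on a lower-bounded reading: w >= n.
    return lambda w: w >= n

def at_least(n, state):
    # Certain that n, and some higher-ranked alternative is possible.
    higher = range(n + 1, MAX_BEERS + 1)
    return certain(state, drank(n)) and any(
        possible(state, drank(m)) for m in higher)

def at_most(n, state):
    # Possible that n, and no higher-ranked alternative is possible.
    higher = range(n + 1, MAX_BEERS + 1)
    return possible(state, drank(n)) and not any(
        possible(state, drank(m)) for m in higher)

unsure_up_to_three = {1, 2, 3}   # speaker allows 1, 2 or 3 beers
three_or_four = {3, 4}           # speaker allows 3 or 4 beers
exactly_three = {3}              # speaker knows it was 3 beers
```

On this model a speaker who knows that exactly three beers were drunk does not satisfy `at_least(3, ...)`, because no higher alternative remains possible. That is one way of seeing why an unmodalised premise like (30-a) does not straightforwardly license the modalised conclusion (30-c).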
The infelicity of the specific construal in (28) and (29) is therefore ascribed to the fact that comparative modifiers can only combine semantically with first-order predicates, whereas, as we can see from the presence of a variable $a$ in the type definitions in (32-b) and (33-b), superlative modifiers can combine with higher-order predicates. This means that the operation of existential closure which Geurts and Nouwen assume must take different scope in the case of superlative and comparative modifiers:

(34) a. [I will invite [at most [two people]]]
     b. [I will invite [fewer than [three people]]]

In (34-a) the closure operation, which results in a type shift from predicate to quantifier (first- to second-order predicate), can apply within the scope of at most because at most is allowed to combine with higher-order predicates. In (34-b), however, it cannot, and because it does not scope directly over the numeral, the numeral cannot have a specific construal.

The questionable inference patterns in (30) are straightforwardly ascribed to the fact that the conclusion contains modal components which are not present in the premise.
The ambiguity in (31-a) when superlative modifiers combine with modals
is ascribed to the interaction of the two modalities according to a rule of
modal concord, for which Geurts and Nouwen provide some independent
evidence in Dutch. This rule causes stacked modals which agree in modal
force (i.e. both are necessity or both are possibility operators) to merge.
This rule is posited to be optional but preferred (hence the inaccessibility of
the speaker insecurity reading).
In summary, Geurts and Nouwen convincingly demonstrate that superlative modifiers exhibit modal behaviour which comparative modifiers do not.
They assume that this modal behaviour is semantically encoded in the lexical
entries of the superlative modifiers, though they note the possibility that it
arises pragmatically (a point which others have taken up and to which we
will later return).

3 The syntactic distribution of at most

Krifka correctly points out that superlative scalar modifiers have distributional properties which are incompatible with their GQT analysis as determiners. Namely, they have a much freer distribution than syntactic determiners, and can in fact form constituents with them, something which is
impossible for true determiners.
(35) a. At least every girl
     b. At most some boy
     c. *Every some girl
     d. *Some every boy

They can also attach to VPs:

(36) You must at least talk to him.

The evidence is thus compelling that superlative scalar modifiers are separable from the numerals they sometimes attach to.[4]

Before providing an explicit compositional account of at most, however, we need to make this observation more precise. Exactly how free is the distribution of at most? To what other expressions is it most similar in its syntactic characteristics?

[4] By contrast, Krifka's further claim that comparative scalar modifiers are also thus separable is more dubious, as pointed out in Szabolcsi (2010), pp. 165-166.

3.1 Comparison with adverbs

First, let's compare at most and a quantificational adverb, sometimes, in a sentence with a numerically modified DP. Here a habitual reading of the present tense must be assumed to make the examples felicitous (imagine a context where we are describing what John does in the evenings after work).

(37) a. (At most/Sometimes) John drinks three beers.
     b. John (at most/sometimes) drinks three beers.
     c. John drinks (at most/*sometimes) three beers.
     d. John drinks three beers (at most/sometimes).

At most can appear everywhere that sometimes can, and in one position not available to sometimes, namely post-verbally as in (37-c).[5] The same pattern appears if we consider a non-quantificational adverb straightforwardly derived from an adjective, e.g. slowly.

(38) a. (At most/Slowly) John drank three beers.
     b. John (at most/slowly) drank three beers.
     c. John drank (at most/*slowly) three beers.
     d. John drank three beers (at most/slowly).

Why is at most, but not slowly or sometimes, allowed in (37-c) and (38-c)?
A comparison with another focus-sensitive adverbial, only, can provide some
insight here.
(39) a. (At most/Only) John drank three beers.
     b. John (at most/only) drank three beers.
     c. John drank (at most/only) three beers.
     d. John drank three beers (at most/only).

Interestingly, only has, on the surface, the same distribution as at most: it can also appear in the post-verbal position which is off-limits to the other adverbial expressions we have considered. How can we account for this?

[5] The ungrammaticality of many adverbs in post-verbal position in English (as in (37-c) and (38-c)) is of course a well-documented phenomenon (see, for instance, Carnie (2012)). It forms part of the key evidence that in French (and many Romance languages) the verb raises from V to I while in (modern) English it does not.

It it well-known in the literature6 that only is a special sort of adverb


in that, unlike the manner or frequency adverbials which we tested above,
it can form a constituent with DP. In fact, it can combine with a variety of
phrase types. Consider the following from Koopman et al. (2003):
(40)

a.
b.
c.
d.

only
only
only
only

John (only DP)


with John (only PP)
happy (only AP)
put pepper on the tomatoes (only VP)

The same possibilities are available to at most, with the caveat that we must
be able to imagine a context which places what follows at most on an ordered
scale of alternatives.
(41) a. at most John (at most DP)
     b. at most with John (at most PP)
     c. at most happy (at most AP)
     d. at most put pepper on the tomatoes (at most VP)

So at most has in common with only both focus-sensitivity and a distributional freedom not shared by other adverbials, in particular the ability to
form a constituent with DP. This explains the apparently post-verbal position of at most in (39-c): it is not so much post-verbal as pre-nominal,
forming a constituent [at most [three beers]].

3.2 Comparison with only

There is, however, a crucial difference between at most and only. Only cannot
combine with a whole sentence, while at most can. We need to look slightly
beyond the surface syntactic distribution to find supporting evidence for this.
Consider the following (again from Koopman et al. (2003)):
(42) a. Only [John]F drinks beer
     b. *Only John [drinks]F beer
     c. *Only John drinks [beer]F

Placing the focus on John is felicitous, but placing it later in the sentence
is not. (Or rather, it is not felicitous on a reading which associates it with
only and thus creates truth-conditional differences.) This is because of the
following rule (as quoted in Koopman et al.), which restricts the scope of
focus-sensitive operators:
6

My source is Koopman, Sportiche, and Stabler (2003)


(43) The focus associated with only must be contained in a constituent
     sister to only.

Clearly (42-b) and (42-c) are unacceptable because only can only form a
constituent DP with John, and therefore cannot associate with focus outside
of this DP.
The same restriction does not apply to at most.
(44) a. At most [John]F drank three beers.
     b. At most John [drank]F three beers.
     c. At most John drank [three]F beers.
     d. At most John drank three [beers]F.

All of these are felicitous, and all have slightly different truth-conditions
(although we need a suitable context to bring out this fact). For instance,
(44-a) would be false if, say, John and Peter both drank three beers, because
the set of alternatives under comparison might be the following:
(45)

{John drank three beers, John and Peter drank three beers, John
and Peter and Mary drank three beers, . . . }

By contrast, (44-c) would be true if John and Peter both drank three beers,
because the alternatives would then be:
(46) {John drank one beer, John drank two beers, John drank three
     beers, John drank four beers, . . . }

3.3 Summary

In summary, at most has a wider distribution than comparative scalar modifiers
and many other adverbials.7 Its closest rival in this respect is only,
but at most can in fact form a constituent with a superset of the syntactic
phrases that only can. Specifically, unlike only, it can combine syntactically
with S and associate with a focussed subconstituent of S. What this means
for the compositional semantic analysis of at most will be the topic of the
next section.
7

Geurts & Nouwen do highlight certain cases where the distribution of superlative
scalar modifiers appears more restricted than that of comparatives. However, as their own
explanation indicates, this is a semantic infelicity to do with the interaction of negation
and modals rather than a syntactic distributional restriction, and the weakness of the
corresponding judgments tallies with this.


The proposal: from most to at most

My aim is to characterise the relationship between the degree quantifier most
and the superlative scalar modifier, at most, and in so doing provide a
compositional analysis of the related expression at best. In what follows I will
restrict my attention to cases where the superlative scalar modifier has
attached to the S node and thus takes a propositional argument, but it should
not be too difficult to refine the account to include those cases where at most
combines with a DP and thus takes a predicative argument.

4.1 Intensional most

The first element of my proposal is the idea that most and its suffixal
counterpart -est each come in two closely related flavours.8 The first of these is
the lexical entry from Hackl (2009) which we discussed above. Here I use a
variant of Hackl's formulation due to Gajewski (2010).9
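(47) For all $C$ of type $\langle e,t\rangle$, $D$ of type $\langle d,\langle e,t\rangle\rangle$ and $x$ of type $e$,
     $[\![\textit{most}]\!](C)(D)(x)$ is defined only if
     $x \in C \wedge \exists d[D(d)(x) = 1] \wedge$
     $\exists y[y \neq x \wedge y \in C \wedge \exists d[D(d)(y) = 1]]$
     When defined,
     $[\![\textit{most}]\!](C)(D)(x) = 1$ iff
     $\exists d[D(d)(x) = 1 \wedge \forall y[(y \neq x \wedge y \in C) \rightarrow D(d)(y) = 0]]$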

Echoing the two variants of Geurts and Nouwen's lexical entries in (32) and
(33), I propose that most is in fact systematically ambiguous between form
(47), which is a generalised degree quantifier over individuals, and an
intensionalised form, most_int, which is a generalised degree quantifier over
propositions (that is, over functions from worlds to truth values). Thus while
most is of type $\langle\langle d,\langle e,t\rangle\rangle,\langle e,t\rangle\rangle$, most_int is of type
$\langle\langle d,\langle\langle s,t\rangle,t\rangle\rangle,\langle\langle s,t\rangle,t\rangle\rangle$.
The truth conditions of most_int are otherwise identical to (47), but as we
shall see, the shift in type allows most_int to interact compositionally with its
surroundings in ways which are impossible for most.
8

Following Hackl (2009), I take it for granted that most is additionally ambiguous
between its compositional form many-est and its primitive form. The latter is equivalent
to -est, and is the only form under consideration here.
9 It may appear that the presuppositions are substantively different from Hackl's here,
given that they require the members of the comparison class to meet D to some degree.
However, as Gajewski points out, Hackl's truth conditions for -est do reference the maximal
degree to which each individual in C has D, and he must therefore assume that every
member of C has D to some degree.


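(48) For all $C$ of type $\langle\langle s,t\rangle,t\rangle$, $D$ of type $\langle d,\langle\langle s,t\rangle,t\rangle\rangle$ and $p$ of type $\langle s,t\rangle$,
     $[\![\textit{most}_{int}]\!](C)(D)(p)$ is defined only if
     $p \in C \wedge \exists d[D(d)(p) = 1] \wedge$
     $\exists q[q \neq p \wedge q \in C \wedge \exists d[D(d)(q) = 1]]$
     When defined,
     $[\![\textit{most}_{int}]\!](C)(D)(p) = 1$ iff
     $\exists d[D(d)(p) = 1 \wedge \forall q[(q \neq p \wedge q \in C) \rightarrow D(d)(q) = 0]]$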

For most_int, the covert contextual argument C is a set of salient alternative
propositions, and D is (the characteristic function of) a set of propositions
possessing some gradable property to a given degree. What gradable property
could a proposition possess?
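As a rough illustration only, the shared shape of the entries in (47) and (48) can be mimicked in a few lines of Python. This is a sketch under stated assumptions rather than part of the formal proposal: the function name superlative, the three-valued return (None for presupposition failure), the finite list of degrees, and the toy height model are all hypothetical, and the gradable predicate is read monotonically ("at least d tall").

```python
from typing import Callable, Hashable, Iterable, Optional


def superlative(
    C: Iterable[Hashable],
    D: Callable[[int, Hashable], bool],
    x: Hashable,
    degrees: Iterable[int],
) -> Optional[bool]:
    """Mimic the superlative entry: None if a presupposition fails, else a truth value."""
    members = set(C)
    degs = list(degrees)

    def has_some_degree(y: Hashable) -> bool:
        return any(D(d, y) for d in degs)

    # Presupposition 1: x is in C and has D to some degree.
    if x not in members or not has_some_degree(x):
        return None
    # Presupposition 2: some other member of C also has D to some degree.
    if not any(y != x and has_some_degree(y) for y in members):
        return None
    # Assertion: there is a degree that x reaches and no other member of C reaches.
    return any(
        D(d, x) and all(not D(d, y) for y in members if y != x)
        for d in degs
    )


# Hypothetical toy model: heights in cm; tall(d, y) means "y is at least d tall".
heights = {"ann": 180, "bob": 175, "cy": 170}


def tall(d: int, y: str) -> bool:
    return heights[y] >= d


degrees = range(150, 201)

print(superlative(heights.keys(), tall, "ann", degrees))  # True
print(superlative(heights.keys(), tall, "bob", degrees))  # False
print(superlative(["ann"], tall, "ann", degrees))         # None
```

Nothing in the function depends on the members of C being individuals; any hashable objects will do, which is the sense in which the propositional variant differs only in type.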

4.2 Ranking propositions

Recall that Krifka (1999) claimed that certain expressions in certain contexts
introduce a partially ordered set of semantic alternatives into the universe
of discourse. These are expressions which either come pre-packaged with a
natural ordering (e.g. number words) or else have an ordering which is in
some sense aided by context (as in John is at least a Texan, which creates a
taxonomic ordering on the alternatives to Texan, e.g. american $\leq_{tax}$ texan
$\leq_{tax}$ austinite). Such alternatives bubble up compositionally through the
sentence until they meet a focus-sensitive operator which makes use of them.
In the example under consideration, this will be at the syntactic level of the
S node and the semantic level of an ordered set of complete propositions.
I propose that this ordered set in fact constitutes the covert contextual
argument, C. Consider again
(49)

At most John drank three beers.

As described in (23) and repeated below, the end result of the percolation
of the ordered alternatives up the tree will be the following set C as an
argument to at most:
(50) $[\![$[John [drank [three [beers]]]]$]\!]_{a}$ =
     $\{\langle \exists x[n(x) \wedge beers(x) \wedge drank(john)(x)],$
     $\exists x[m(x) \wedge beers(x) \wedge drank(john)(x)]\rangle \mid n \leq_{n} m\}$

Because the propositions which make up the comparison class are now embedded
inside ordered pairs, we will need to replace every instance of C in
our lexical entry for most_int with the set of propositions occurring in the
pairs of C to get at them directly. Given that our intensional variant of most
can only be used as part of at most, and C is by assumption an ordered set in
such cases, this will work for our purposes. However, in the long run it
suggests that a refinement of the formalisation may be needed. Firstly,
vindication for the existence of most_int would come from evidence for its
presence in other contexts, perhaps outside of superlative scalar modifiers.
Such contexts, if found, might not provide an ordered set for C, and extracting
propositions from its pairs would then not be a valid operation. Secondly, the
further divergence between most_int and most weakens the underlying argument
that they are essentially the same thing. Ultimately, I believe a solution will
lie in tweaking the formal machinery used.
This noted, I can use the availability of the ordered set C to define the
following function as a mapping from pairs of degrees and propositions to
truth values:
(51) $\mathit{rank}(C)(d)(p) = 1$ iff $|\{q \mid \langle q, p\rangle \in C\}| = d$

That is to say, rank(C)(d)(p) = 1 if there are d lower-ranked propositions
than p in C.
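The counting in (51) can be made concrete with a small sketch. The assumptions here are mine and purely illustrative: propositions are stood in for by string labels, C is a Python set of pairs (q, p) in which q is ranked no higher than p, and the ordering is taken to be reflexive, so the count includes p itself (with a strict ordering every rank would be one lower).

```python
def rank(C, d, p):
    """Return True iff exactly d propositions q are paired with p as <q, p> in C."""
    return len({q for (q, p2) in C if p2 == p}) == d


# Hypothetical labels standing in for "John drank n beers".
alts = ["one", "two", "three", "four"]

# Ordered alternative set: pairs <q, p> where q's numeral is no greater than p's.
C = {
    (alts[i], alts[j])
    for i in range(len(alts))
    for j in range(len(alts))
    if i <= j
}

# The unique degree d for which rank(C)(d)("three") holds.
print([d for d in range(len(alts) + 1) if rank(C, d, "three")])  # [3]
```

On this exact reading each proposition satisfies rank at a single degree only.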
Let's assume, as a naive initial sketch, that [at most] has the following
structure (never mind at this point about the syntactic ramifications of a P
head selecting a DegP complement).
(52) [PP [P at] [DegP most]]

By our assumptions about semantic composition, at must be of type
$\langle d,\langle\langle s,t\rangle,t\rangle\rangle$, and combine with most_int via Functional Application.
Remember our goal is to move compositionally from the semantics for most found
in Hackl (2009), Gajewski (2010), and Heim (1999) to the semantics for at most
found in Geurts and Nouwen (2007). For now, let's place most of the burden for
this on the denotation of at. Note that the possibility operator $\Diamond$
represents epistemic possibility: the worlds implicitly under consideration are
only those that are compatible with what the speaker knows.
(53) $[\![\textit{at}]\!] = \lambda d_{d}.\lambda p_{\langle s,t\rangle}.[\mathit{rank}(C)(d)(p) = 1 \wedge \Diamond p]$

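By Functional Application, then, we have the following.

(54) $[\![\textit{most}_{int}]\!](C)([\![\textit{at}]\!])(p)$ is defined only if
     $p \in C \wedge \exists d[\mathit{rank}(C)(d)(p) = 1 \wedge \Diamond p] \wedge$
     $\exists q[q \neq p \wedge q \in C \wedge \exists d[\mathit{rank}(C)(d)(q) = 1 \wedge \Diamond q]]$
     When defined,
     $[\![\textit{most}_{int}]\!](C)([\![\textit{at}]\!])(p) = 1$ iff
     $\exists d[\mathit{rank}(C)(d)(p) = 1 \wedge \Diamond p \wedge \forall q[(q \neq p \wedge q \in C) \rightarrow \neg[\mathit{rank}(C)(d)(q) = 1 \wedge \Diamond q]]]$
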
The resulting complex expression [at most] still requires a propositional
argument. It presupposes that this proposition has a given rank in the set
of semantic alternatives, and that it is possible (true in at least one world
compatible with the speaker's knowledge). It also presupposes that there
is at least a second possible ranked proposition under consideration. It
asserts that given the rank of the proposition argument, there is no second
distinct proposition that is both higher-ranked and possible. That is to say,
the speaker asserts both that p is possible, and that it is the highest-ranked
proposition that is so. Thus the alternative possible proposition which
satisfies the presupposition must be lower-ranked than p.
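To see the composition at work on a toy model, here is a minimal sketch. Everything in it is an illustrative assumption rather than part of the proposal: worlds are the integers 0 to 5 (in world k John drank exactly k beers), propositions are frozensets of worlds with the numeral read as "at least n", the speaker's knowledge is a set of worlds, and presupposition failure is modelled as None. Note in particular that the sketch reads rank cumulatively ("at least d propositions ranked at or below p"), the usual monotone reading of degree predicates; on a strictly exact reading of rank, no two distinct propositions share a rank and the universal clause would hold trivially.

```python
from itertools import product

WORLDS = range(6)  # hypothetical model: in world k, John drank exactly k beers


def drank_at_least(n):
    """The proposition 'John drank n beers' on an at-least reading, as a set of worlds."""
    return frozenset(w for w in WORLDS if w >= n)


ALTS = {n: drank_at_least(n) for n in range(1, 6)}
# Ordered alternatives: pairs <q, p> with q's numeral no greater than p's.
C = {(ALTS[n], ALTS[m]) for n, m in product(ALTS, repeat=2) if n <= m}
FIELD = {p for pair in C for p in pair}  # the propositions occurring in C
DEGREES = range(len(FIELD) + 1)


def possible(p, epistemic):
    """Epistemic possibility: p is true in some world the speaker cannot rule out."""
    return bool(p & epistemic)


def rank_at_least(d, p):
    """Cumulative reading of rank (an assumption of this sketch)."""
    return sum(1 for (q, p2) in C if p2 == p) >= d


def at(epistemic):
    """Denotation of 'at': a degree predicate of propositions (rank plus possibility)."""
    return lambda d, p: rank_at_least(d, p) and possible(p, epistemic)


def most_int(field, D, p):
    """Propositional superlative: None on presupposition failure, else a truth value."""
    def has_degree(q):
        return any(D(d, q) for d in DEGREES)

    if p not in field or not has_degree(p):
        return None
    if not any(q != p and has_degree(q) for q in field):
        return None
    return any(
        D(d, p) and all(not D(d, q) for q in field if q != p)
        for d in DEGREES
    )


def at_most(p, epistemic):
    return most_int(FIELD, at(epistemic), p)


speaker = frozenset({2, 3})  # the speaker thinks John drank two or three beers
print(at_most(ALTS[3], speaker))  # True
print(at_most(ALTS[2], speaker))  # False
print(at_most(ALTS[4], speaker))  # None
```

With this speaker state, three is the highest-ranked possible alternative, two is possible but outranked by a possible alternative, and four fails the presupposition that the prejacent be possible.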
It is easy to see that, apart from the presuppositions, this amounts to the
same thing as Geurts and Nouwen's definition in (33) for the variant of at
most which combines with a propositional argument, repeated below.
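(55) If $\alpha$ is of type $t$, then
     $[\![\text{at most } \alpha]\!] = \Diamond\alpha \wedge \neg\exists\beta[\beta \rhd \alpha \wedge \Diamond\beta]$

4.3 Refining the account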

In the previous sections we made the dubious syntactic assumption that at
most has the simple structure in (52). This led to a rather dubious conclusion,
namely that the denotation of a particular variant of the word at carries
what seems to be a large amount of construction-specific baggage. This goes
against the motivating principle of our compositional account, which is to
find nuggets of reusable meaning in apparently idiosyncratic expressions.
Fortunately, we can improve on our account. Consider:
(56)

At the most John drank four beers.

If the intervenes between at and most then at cannot in this case be the sister
to most, and cannot supply an argument to most. A plausible structure might
be
(57)

[pp at [dp the most]]

However, if the is the sister to most, how does most combine with a predicate
containing the notions of rank and possibility? Moreover, there does not
seem to be much independent evidence for a determiner selecting a DegP
complement.
Let's consider an alternative well-formed PP structure.
(58)

[pp in [dp the [ [ most probable] [case]]]]

Here, a P head selects a DP complement. Within the DP, D selects an NP.
The N within the NP is modified by an AP which consists of an AP itself
modified by the DegP adjunct, [most]. All very orthodox. What if we assume
that the structure of at the most, and indeed of at most, is identical,
but filled partly with covert elements?
To maximise the structural similarity between the various PPs we have
looked at, I propose that the determiner position can be filled with a covert
definite determiner (for whose general existence there is a wealth of
independent evidence; see in particular Abney (1987)). The only difference
between at most and at the most is in the overtness of the determiner.
(59) a. [at [∅ [ [most RANKED][POSSIBILITY]]]]
     b. [at [the [ [most RANKED][POSSIBILITY]]]]

Slightly more controversially, I assume the existence of a covert gradable
adjective encoding nothing more than the rank function we defined above.10

(60) $[\![\text{RANKED}]\!](C)(d)(x) = 1$ iff $\mathit{rank}(C)(d)(x) = 1$

I also posit a covert abstract noun POSSIBILITY denoting a possible proposition.

(61) $[\![\text{POSSIBILITY}]\!] = \lambda p_{\langle s,t\rangle}.\Diamond p$

Of course, most is now uninterpretable as it does not have a sister of type
$\langle d,\langle\langle s,t\rangle,t\rangle\rangle$. It must move at LF in the manner described in Hackl (2009).

(62) [at [most]_i λd [∅ [[d_i RANKED][POSSIBILITY]]]]

RANKED and POSSIBILITY combine compositionally just as any degree
adjective and noun would, by Predicate Modification. This is how we get
the component $\Diamond p \wedge \mathit{rank}(C)(d)(x)$ which we formerly stuffed awkwardly
inside the denotation of at. Composition up to the parent node of [most] is
achieved exactly as in the account of Hackl and Heim. Because by this point
we already have the correct denotation, we can now view at as semantically
vacuous in this construction. The denotation simply passes up to the top
node of the PP where it is a sister to S, and can receive its propositional
argument.

10 Echoing Kayne's practice, I write covert lexical items in all caps.

4.4 At best

Is there any independent evidence for the structure I have hypothesised
above? We have seen a few examples of parallel PP structures which suggest
that it is possible. Further evidence comes from the expression at best,
hitherto overlooked by other authors on the topic of superlative modifiers.
At best is similar to at most in that it can combine with propositions
and similar to all superlatives in that it depends on a comparison class. It
is also focus-sensitive, as shown in (63), and thus depends on projected
alternatives.
(63) a. At best [John]F drank three beers.
     b. At best John [drank]F three beers.
     c. At best John drank [three]F beers.
     d. At best John drank three [beers]F.

These have truth-conditional differences. (a) would be false if there is a
possible situation in which Bill drank three beers which is considered better
than the one in which John did. (c) could be true in the same conditions,
provided that the situation in which John drank three beers is better than
the one in which he drank an alternative number of beers.
For at best, I propose the surface structure in (64-a) and the LF in (64-b).
(64) a. [at [∅ [ [good-est][POSSIBILITY]]]]
     b. [at [-est]_i λd [∅ [[d_i -good][POSSIBILITY]]]]

Just as is the case for at most, at best appears in two surface syntactic forms,
at best and at the best.
(65) a. [at [∅ [ [good-est][POSSIBILITY]]]]
     b. [at [the [ [good-est][POSSIBILITY]]]]

We still need the covert POSSIBILITY noun to make the semantics come
out right, however. I argue that attributing a degree of desirability to a
proposition without further qualification normally entails the truth of the
proposition, as in (66).
(66) It's good that he's still alive.


Since use of at best always entails uncertainty about the truth of the proposition, I propose that the POSSIBILITY noun is a necessary element of the
semantic representation.
From a semantic perspective, the comparison of at best with at most is
intriguing because it perhaps indicates a connection between the fact that at
most piggy-backs on pre-existing semantic scales and the fact that it lacks
an overt adjective. The alternatives required by at best do not need to
come with a pre-existing semantic ordering, because the adjective good does
not require the comparison class C to be an ordered set.
The evidence for this comes from the fact that at best can seemingly
reverse what would be the default semantic ordering. Consider a context
where John is an alcoholic on the point of death from liver failure. He went
to the bar last night and we are speculating about how many beers he drank.
(Also assume that we want John to live).
(67) a. At best John drank three beers.
     b. At most John drank three beers.

When uttering (67-a), the semantic alternatives are compared by their desirability. Given that we want John to drink as little as possible, the alternatives
increase in desirability as the number of beers decreases. With (67-b), no
such option is available. The default number ordering (or quantity of alcohol,
say, depending on focus) bubbles up through the tree and makes alternatives
in which John drank more beers higher-ranked than those in which he drank
fewer.
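The contrast in (67) can be sketched as the same "highest-ranked possible alternative" test run over two different orderings. The sketch below is deliberately simplified and rests on assumptions of my own: alternatives are the integers n standing for "John drank exactly n beers", the speaker's knowledge is the set of n not ruled out, presuppositions are ignored, and the names holds, numeric and desirability are hypothetical.

```python
def holds(n, key, epistemic, alternatives):
    """True iff alternative n is possible and no possible alternative outranks it."""
    if n not in epistemic:
        return False
    return all(key(m) <= key(n) for m in alternatives if m in epistemic)


def numeric(n):
    """Default scale used by 'at most': more beers means a higher rank."""
    return n


def desirability(n):
    """Scale supplied by 'good' in the liver-failure context: fewer beers is better."""
    return -n


alternatives = range(1, 8)
speaker = {3, 4, 5}  # the speaker thinks John drank between three and five beers

at_best_three = holds(3, desirability, speaker, alternatives)
at_most_three = holds(3, numeric, speaker, alternatives)
at_most_five = holds(5, numeric, speaker, alternatives)

print(at_best_three, at_most_three, at_most_five)  # True False True
```

In the same epistemic state, at best picks out the lower end of the range and at most the upper end, because only the ordering differs.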

Semantic or pragmatic modality?

Since the publication of Geurts and Nouwen (2007) several authors have
directly taken up the issue briefly discussed in its conclusion, namely whether
the modal effects Geurts and Nouwen identify are encoded in the
truth-conditional semantics or arise by pragmatic inference. This is not an issue
I have space to explore in detail, but it merits a brief discussion because
the pragmatic account has some advantages over the modal account I have
assumed herein.
Büring (2008) hypothesises that the truth-conditional component of at
least is simply disjunctive: at least three cars means three cars or more
than three cars. He demonstrates how the modal phenomena can be derived
pragmatically via a Gricean implicature of quantity, namely that when a
speaker utters a disjunction it is because they are not certain of the truth of
either proposition.11 This account has the advantage of parsimony over the
semantically modal account of Geurts and Nouwen because it can also explain
the interaction of modals and superlative modifiers that forces Geurts and
Nouwen to invoke the rather heavy machinery of modal concord. Büring
claims that there is no independent evidence for modal concord, although
he seems to have overlooked some examples from Dutch which are cited in
Geurts and Nouwen's paper.
Cummins and Katsos (2010) (ironically in continuation of their work in
Geurts et al. (2010) which led to the opposite conclusion) provide
experimental evidence in support of a disjunctive view of superlative scalar
modifiers. Some of this is compelling, but I note an instance where my
compositional account provides an explanation where neither Büring's nor
Geurts and Nouwen's analysis does.
One of their experiments involved asking subjects to rate the acceptability
of a second statement as a revision of a first. For example,
(68)

Jean has at most three houses. Specifically, she has three houses.

Revisions from at most n to exactly n-1 and exactly n were found to be
significantly more acceptable than revisions from some to all. As the
authors note, this is unexpected if both revisions are simply pragmatically
self-contradictory. It is also unexpected on the semantically modal account,
since on this view the superlative modifier encodes the possibility of exact
equality, a possibility which is then either denied by the second sentence or
strengthened to a statement of fact.
However, recall that in my semantics for at most, the superlative morpheme
introduces a presupposition that there exist at least two possible
propositions in the comparison class. The first is the proposition denoted
by the S node. The second is one of the focus alternatives. The existence
of a higher-ranked possible proposition is explicitly denied by the semantics,
meaning that this second presupposed alternative must be lower-ranked than
the overt proposition. It is conceivable that when subjects revise from at most
n to exactly n-1 they are simply fixing the rank of the second possibility. This
is evidently less coherent than a straightforward entailment, but equally it is
more coherent than an implicature cancellation.
11

However, for reasons which remain obscure to me (and which seem to have been
ignored in Cummins and Katsos (2010)), he notes that extending the account to at most
would require a crucial refinement of his ideas.


Conclusion

In this paper I have examined a number of critiques of GQT and considered
how their insights might be connected. Specifically, I have attempted to
address the link between superlative morphology and superlative scalar
modifiers by connecting the work of Hackl on the former with the work of Geurts
and Nouwen on the latter. In the process I have explored in depth some of
the syntactic properties of scalar modifiers and proposed that they contain a
number of covert elements which are crucial to their compositional analysis.
I have also opened up the possibility that the class of scalar modifiers is wider
than assumed, including at best.
The pragmatic account of Büring does pose a strong challenge to the
semantically modal account which I assume here. As for the future prospects
of my compositional account, I see two options. The first is to accept that
it is irrevocably tied to the semantic account. If the pragmatic account were
ultimately vindicated, then presumably the compositional account would be
discarded. The alternative is to look for ways to modify it so as to derive the
truth-conditional disjunction of the pragmatic account, and let the modality
arise by implicature. In my view this is probably the wiser option, as the
pragmatic account appears to have a number of decisive advantages, the
strongest being its parsimony.


References
Abney, S. P. (1987). The English Noun Phrase in its Sentential Aspect.
    Unpublished doctoral dissertation, MIT.
Barwise, J., & Cooper, R. (1981). Generalized Quantifiers and Natural
    Language. Linguistics and Philosophy, 4(2), 159-219.
Büring, D. (2008). The least at least can do. Proceedings of WCCFL (2007),
    114-120.
Carnie, A. (2012). Syntax: A generative introduction. John Wiley & Sons.
Cummins, C., & Katsos, N. (2010, March). Comparative and Superlative
    Quantifiers: Pragmatic Effects of Comparison Type. Journal of
    Semantics, 27(3), 271-305.
Gajewski, J. (2010, January). Superlatives, NPIs and Most. Journal of
    Semantics, 27(1), 125-137.
Geurts, B. (2007). Experimental Support for a modal analysis of at least
    and at most. Proceedings of the ESSLLI2007 workshop on . . . (3), 1-6.
Geurts, B., Katsos, N., Cummins, C., Moons, J., & Noordman, L. (2010,
    January). Scalar quantifiers: Logic, acquisition, and processing.
    Language and Cognitive Processes, 25(1), 130-148.
Geurts, B., & Nouwen, R. (2007). At Least et al. Language, 83(3).
Hackl, M. (2009). On the grammar and processing of proportional quantifiers:
    most versus more than half. Natural Language Semantics, 1-32.
Hamblin, C. (1973). Questions in Montague English. Foundations of
    Language, 10(1), 41-53.
Heim, I. (1999). Superlatives (No. 1).
Heim, I., & Kratzer, A. (1998). Semantics in generative grammar. John
    Wiley & Sons.
Keenan, E. L., & Stavi, J. (1986). A semantic characterization of natural
    language determiners. Linguistics and Philosophy, 9(3), 253-326.
Koopman, H., Sportiche, D., & Stabler, E. (2003). An introduction to
    syntactic analysis and theory. Unpublished manuscript.
Krifka, M. (1999). At least some determiners aren't determiners. In K. Turner
    (Ed.), The semantics/pragmatics interface from different points of view
    (pp. 257-91). Oxford: Elsevier.
Lidz, J., Pietroski, P., Halberda, J., & Hunter, T. (2011). Interface
    Transparency and the Psychosemantics of most. Natural Language Semantics.
Montague, R. (1974). The proper treatment of quantification in ordinary
    English. In R. Thomason (Ed.), (pp. 247-271). Yale University Press.
Mostowski, A. (1957). On a generalization of quantifiers. Fundamenta
    Mathematicae, 44, 12-36.
Pietroski, P., Lidz, J., Hunter, T., & Halberda, J. (2009). The meaning
    of most: Semantics, numerosity and psychology. Mind & Language,
    1-46.
Rooth, M. (1985). Association with Focus. Unpublished doctoral dissertation,
    UMass.
Szabolcsi, A. (1986). Comparative superlatives. MIT Working Papers in
    Linguistics, 8, 1-20.
Szabolcsi, A. (2010). Quantification. Cambridge University Press.
