Minimalist Principles & Parameters theories of grammar deal mainly in procedures that generate linguistic expressions from atoms in an incremental fashion. That is, these theories traffic in computational procedures in the sense of Turing, Church, and Post.
As its name suggests, Agree is the operation that causes grammatical agreement—subject-predicate agreement, case marking, etc.—which, I argue in this paper, has yet to be defined precisely enough for its theoretical and empirical properties to be properly analyzed.
Take, for instance, the recent debate regarding Upward vs. Downward Agree.
This lack of theoretical assessment of Agree is troubling, since the operation has been implicated in a wide range of grammatical phenomena beyond the morphological agreement phenomena from which it gets its name. Agree has been argued to be necessary to explain movement/Internal Merge, among other phenomena.
This proliferation of theories of Agree is further exacerbated by the fact that, since its inception, Generative Grammar has expressed its analyses both derivationally and representationally. In the theory of Chomsky (1965), for instance, the sentence in (1) can be expressed derivationally as the rewriting derivation in (2), or representationally as a tree or a labelled bracketing as in (3) and (4).
Sincerity may frighten the boy.
S  (by S → NP^{⌢}Aux^{⌢}VP)
NP^{⌢}Aux^{⌢}VP  (by VP → V^{⌢}NP)
NP^{⌢}Aux^{⌢}V^{⌢}NP  (by NP → Det^{⌢}N)
NP^{⌢}Aux^{⌢}V^{⌢}Det^{⌢}N  (by NP → N)
N^{⌢}Aux^{⌢}V^{⌢}Det^{⌢}N  (by Det → the)
N^{⌢}Aux^{⌢}V^{⌢}the^{⌢}N  (by Aux → M)
N^{⌢}M^{⌢}V^{⌢}the^{⌢}N  (by M → may)
N^{⌢}may^{⌢}V^{⌢}the^{⌢}N  (by N → sincerity)
sincerity^{⌢}may^{⌢}V^{⌢}the^{⌢}N  (by N → boy)
sincerity^{⌢}may^{⌢}V^{⌢}the^{⌢}boy  (by V → frighten)
sincerity^{⌢}may^{⌢}frighten^{⌢}the^{⌢}boy

[_{S} [_{NP} [_{N} sincerity]] [_{Aux} [_{M} may]] [_{VP} [_{V} frighten] [_{NP} [_{Det} the] [_{N} boy]]]]
The formal expressions in (2) to (4) are all roughly equivalent, though each highlights a different aspect of the analysis it represents.
Since Generative Grammar within the P&P tradition is a computational theory, the derivational expression of a given analysis has always been the ultimate expression—a representation is a valid analysis in such a theory only insofar as it can be derived in that theory. The representational expressions, on the other hand, are much more concise and accessible, so they have been overwhelmingly used as shorthands for the derivational expressions; they are useful as shorthands, however, only insofar as all of the information they encode can also be represented derivationally.
These representational expressions become problematic, however, when they are augmented for the sake of clarity. For instance, movement/Internal Merge can be represented without arrows as in (5) but more often arrows will be added for ease of understanding as in (6), though (5) and (6) are assumed to be equivalent.
It is, perhaps, understandable that Agree, commonly represented by arrows similar to movement arrows as in (7), is assumed to have the same level of theoretical underpinning as movement.
To date, though, there has been no proposal for a derivational expression of the arrow in (7). The task of this paper in part, then, is to remedy this oversight.
To that end, I will be expanding the formalization of minimalist syntax developed by Collins and Stabler (henceforth C&S).
Universal Grammar is a 6-tuple: 〈PHONF, SYNF, SEMF, Select, Merge, Transfer〉
PHONF, SYNF, and SEMF are universal sets of phonetic, syntactic, and semantic features, respectively; Select, Merge, and Transfer are operations. I will begin the outline of the formal grammar with the feature sets, postponing discussion of the operations for now.
A lexical item is a triple: LI = 〈PHON, SYN, SEM〉
where SEM and SYN are finite sets such that SEM ⊂ SEMF, SYN ⊂ SYNF, and PHON ∈ PHONF*.
A lexicon is a finite set of lexical items.
An I-Language is a pair 〈Lex, UG〉, where Lex is a lexicon and UG is Universal Grammar.
In order to capture the Copy/Repetition distinction, C&S introduce lexical item tokens, defined in (12), which are the atoms of syntactic computation. C&S also define several other useful terms using LI tokens.
A lexical item token is a pair LI_{k} = 〈LI, k〉, where LI is a lexical item, and k is an integer.
A lexical array is a finite set of lexical item tokens.
X is a syntactic object iff:
X is a lexical item token, or X is a set of syntactic objects.
Let A and B be syntactic objects, then B immediately contains A iff A∈B.
Let A and B be syntactic objects, then B contains A iff
B immediately contains A, or for some syntactic object C, B immediately contains C and C contains A.
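To keep these definitions concrete, syntactic objects and containment can be sketched in Python. The encoding—lexical item tokens as (label, index) pairs, complex syntactic objects as frozensets—is my own illustrative assumption, not part of C&S's formalism.

```python
def is_token(so):
    """A lexical item token is anything that is not a set of syntactic objects."""
    return not isinstance(so, frozenset)

def immediately_contains(b, a):
    """B immediately contains A iff A is a member of B."""
    return isinstance(b, frozenset) and a in b

def contains(b, a):
    """B contains A iff B immediately contains A, or some member of B contains A."""
    if is_token(b):
        return False
    return a in b or any(contains(c, a) for c in b)

# {G, {H, Z}}: G is immediately contained; H is merely contained
G, H, Z = ("G", 1), ("H", 2), ("Z", 3)
so = frozenset({G, frozenset({H, Z})})
# immediately_contains(so, H) → False; contains(so, H) → True
```

The recursion in `contains` mirrors the recursive clause of the definition directly.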
C&S then define a generative framework, wherein complex syntactic objects are derived in stages.
A stage is a pair S = 〈LA, W〉, where LA is a lexical array [a possibly ordered set of lexical item tokens] and W is a set of syntactic objects. We call W the workspace of S.
The operations Merge, Select, and Transfer operate on stages and derive new stages. Merge is binary set-formation, Select moves lexical item tokens from the lexical array to the workspace, and Transfer maps completed syntactic objects to the interfaces.
Given any two distinct syntactic objects A, B, Merge(A,B) = {A,B}.
Let S be a stage in a derivation S = 〈LA, W〉.
If lexical item token A∈LA, then Select(A, S) = 〈LA – {A}, W∪{A}〉.
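Merge and Select can be sketched in the same style, with a stage as a pair of frozensets; the token labels X and Y are purely illustrative.

```python
def merge(a, b):
    """Merge(A, B) = {A, B}: binary set-formation."""
    assert a != b, "Merge requires two distinct syntactic objects"
    return frozenset({a, b})

def select(a, stage):
    """Select(A, <LA, W>) = <LA - {A}, W ∪ {A}>."""
    la, w = stage
    assert a in la, "Select only applies to tokens in the lexical array"
    return (la - {a}, w | {a})

# A two-token toy derivation:
X, Y = ("X", 1), ("Y", 2)
s1 = (frozenset({X, Y}), frozenset())    # S1: <{X, Y}, {}>
s2 = select(X, s1)                       # S2: <{Y}, {X}>
s3 = select(Y, s2)                       # S3: <{}, {X, Y}>
la, w = s3
s4 = (la, (w - {X, Y}) | {merge(X, Y)})  # derive-by-Merge: <{}, {{X, Y}}>
```

The last line hand-implements the workspace update of a derive-by-Merge step, replacing the two merged objects with their Merge.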
Thus, we can define the central notion of derivation in (20).
A derivation from lexicon L is a finite sequence of stages 〈S_{1}, …, S_{n}〉,
where each S_{i} = 〈LA_{i}, W_{i}〉, such that:
For all LI and k such that 〈LI,k〉∈LA_{1}, LI∈L,
W_{1}={} (the empty set),
for all i<n, either:
(derive-by-Select) for some A∈LA_{i}, S_{i+1} = Select(A, S_{i}),
(derive-by-Transfer) …, or
(derive-by-Merge) LA_{i+1}=LA_{i} and, for some A, B:
A∈W_{i},
Either A contains B [Internal Merge] or W_{i} immediately contains B [External Merge], and
W_{i+1} = (W_{i}−{A, B})∪{Merge(A, B)}.
So, abstracting away from certain representational complexities, a simple sentence consisting of a subject (DP), a verbal predicate (V), and present tense (T_{Pres}) would be derived as follows.

(S_{1})  ⟨{T_{Pres}, V, DP}_{LA1}, {}_{W1}⟩
(by Select(V, S_{1}))
(S_{2})  ⟨{T_{Pres}, DP}_{LA2}, {V}_{W2}⟩
(by Select(DP, S_{2}))
(S_{3})  ⟨{T_{Pres}}_{LA3}, {V, DP}_{W3}⟩
(by Merge(V, DP))
(S_{4})  ⟨{T_{Pres}}_{LA4}, {{V, DP}}_{W4}⟩
(by Select(T_{Pres}, S_{4}))
(S_{5})  ⟨{}_{LA5}, {T_{Pres}, {V, DP}}_{W5}⟩
(by Merge(T_{Pres}, {V, DP}))
(S_{6})  ⟨{}_{LA6}, {{T_{Pres}, {V, DP}}}_{W6}⟩
(by Merge(DP, {T_{Pres}, {V, DP}}))
(S_{7})  ⟨{}_{LA7}, {{DP, {T_{Pres}, {V, DP}}}}_{W7}⟩
C&S’s formalization is open to some refinements, and I will introduce a few such refinements below as they become relevant.
Agree can be very broadly described as an operation that modifies a syntactic object X iff X stands in a particular formal/structural relation and a particular substantive relation with another syntactic object Y. So, in order to define Agree, we must formalize both types of relation.
Two elements X and Y can Agree iff X c-commands Y, Y Matches X, and there is no element H such that H Matches X, X c-commands H, and H c-commands Y.
Long-Distance Downward Valuation Agree (LDDV-Agree) is downward in the sense that it modifies the c-commanded element, and it is valuation-based in the sense that that element is modified by converting one of its unvalued features into a valued one, as specified in (23) and (24).
X Matches Y for feature F iff X has [F:__] and Y has [F:val].
If X and Y Agree for feature F, then [F:__] on Y becomes [F:val].
The first thing we must do is formalize the notion of “feature” as used here. By (8), there are three sets of features in Universal Grammar—PHONF, SYNF, SEMF. Setting aside PHONF as irrelevant to the current paper, our task is to formalize the members of SYNF and SEMF. Generally, a given syntactic or semantic feature is describable with reference to its interpretability, its type, and its value (or lack thereof). Interpretability can be taken care of by simple set membership—interpretable features are members of SEMF, uninterpretable features are members of SYNF—leaving us with type and value.
A feature is a pair 〈F, v〉, where F is a feature type and v is a value.
For all feature types F, 〈F, 0〉 is an unvalued feature.
For lexical item LI = 〈PHON, SYN, SEM〉, feature F_{v} is a feature of LI iff F_{v}∈SYN or F_{v}∈SEM.
For lexical item token LI_{k} = 〈LI, k〉, feature F_{v} is a feature of LI_{k} iff F_{v} is a feature of LI.
So, for instance, English present tense might have roughly the lexical representation in (29).
〈PHON, {…〈φ, 0〉…}, {…〈T, 1〉...}〉
This lexical item has some phonetic features, an unvalued uninterpretable φ-feature, and an interpretable T feature with the value 1, which we can stipulate is interpreted as present tense. The choice to formalize feature values as integers is made only to allow for a perspicuous way of defining unvalued features. We could use any type of discrete symbol to represent values, provided it had a special symbol for “unvalued.”
We can define Match as in (30).
For any two lexical item tokens X, G and feature type F,
Match(X, G, F) = 1 iff for some feature value v ≠ 0, 〈F, v〉 is a feature of X and 〈F, 0〉 is a feature of G.
Under this definition, an English finite T head will match a non-Case-marked pronoun but not a Case-marked one, as demonstrated in (31).
Match(T, 3SgF_{[Case:_]}, Case) = 1
Match(T, 3SgF_{[Case:ACC]}, Case) = 0
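Match can be sketched as a predicate under the assumed encoding of a token as ((PHON, SYN, SEM), k), with SYN and SEM as frozensets of (type, value) pairs; treating finite T as carrying a valued Case feature with the arbitrary value 1 is my illustrative assumption.

```python
def features(token):
    """The features of <LI, k> are the SYN and SEM features of LI."""
    (phon, syn, sem), k = token
    return syn | sem

def match(x, g, f):
    """Match(X, G, F) = 1 iff X has <F, v> with v != 0 and G has <F, 0>."""
    x_valued = any(t == f and v != 0 for (t, v) in features(x))
    g_unvalued = (f, 0) in features(g)
    return x_valued and g_unvalued

# (31): T matches the non-Case-marked pronoun but not the Case-marked one
T = (("", frozenset(), frozenset({("Case", 1)})), 1)            # hypothetical valued Case
pron_nocase = (("her", frozenset({("Case", 0)}), frozenset()), 2)
pron_acc = (("her", frozenset({("Case", 1)}), frozenset()), 3)
# match(T, pron_nocase, "Case") → True; match(T, pron_acc, "Case") → False
```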
Value is essentially a replacement operation—operating on a lexical item token, swapping an unvalued feature with a valued counterpart. This is defined in (32).
For lexical item token LI_{k}=〈〈PHON, SYN, SEM〉, k〉, and feature 〈F, v〉,
Value(LI_{k}, 〈F, v〉) = 〈〈PHON, (SYN–{〈F, 0〉})∪{〈F, v〉}, SEM〉, k〉
So, an instance of Value associated with subjectpredicate agreement, ignoring Case, might look something like (33).
Where T_{Pres} = (29) and 〈φ, 31〉 corresponds to 3rd person singular,
Value(〈T_{Pres}, 4〉, 〈φ, 31〉) = 〈〈PHON, {...〈φ, 31〉…}, {...〈T, 1〉...}〉, 4〉
The resulting lexical item token still has an interpretable tense feature and an uninterpretable φfeature but the latter now has the value 31, which we stipulate corresponds to 3rd person singular.
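Value can be sketched in the same style; the value 31 for 3rd person singular follows the paper's stipulation, while the token encoding remains my assumption.

```python
def value(token, feat):
    """Value(LI_k, <F, v>): replace <F, 0> in SYN with <F, v> (def (32))."""
    (phon, syn, sem), k = token
    f, v = feat
    new_syn = (syn - {(f, 0)}) | {feat}
    return ((phon, new_syn, sem), k)

# (33): T_Pres with an unvalued φ-feature is valued as 3rd person singular
t_pres = (("", frozenset({("phi", 0)}), frozenset({("T", 1)})), 4)
valued = value(t_pres, ("phi", 31))
# valued == (("", frozenset({("phi", 31)}), frozenset({("T", 1)})), 4)
```

Note that the valued φ-feature stays in SYN, which is exactly the point made in the following paragraph: Value produces a valued uninterpretable feature.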
Note that, while I have been tacitly assuming that (un)valuedness and (un)interpretability are correlated in the lexicon—that all and only unvalued features are members of SYNF—the definition of Value in (32) contradicts this assumption, since the result of Value is an element that contains at least one valued uninterpretable feature.
In fact, any attempt to make the assumption that all uninterpretable features are unvalued hold in general runs into issues. We could save it by eliminating Value, but this would contradict one of the core premises of Agree theory—that Agree modifies lexical item tokens mid-derivation. Alternatively, we could save it by redefining Value, say as Value′ in (34), which removes the unvalued feature from SYN and adds a valued feature to SEM.
For lexical item token LI_{k}=〈〈PHON, SYN, SEM〉, k〉, and feature 〈F, v〉,
Value′(LI_{k}, 〈F, v〉) = 〈〈PHON, SYN–{〈F, 0〉}, SEM∪{〈F, v〉}〉, k〉
With this definition, though, our subject-predicate agreement would look like (35), which seems to create a T head whose φ-features are interpretable.
Where T_{Pres} = (29) and 〈φ, 31〉 corresponds to 3rd person singular,
Value(〈T_{Pres}, 4〉, 〈φ, 31〉) = 〈〈PHON, {…}, {...〈T, 1〉, 〈φ, 31〉...}〉, 4〉
The operation defined in (32), then, seems to match the notion of valuation generally assumed in theories of UG with Agree. It does, however, have a problematic prediction that I address below.
The last portion of Agree to be defined is what is often called “Probe,” which is an instance of “Minimal Search.”
The term Minimal Search, as it is usually used in minimalist syntactic theory, refers to an algorithm that retrieves the “highest” object in a structure that meets some particular criterion. While such an algorithm is almost certainly required for Agree, it is not required only for Agree. Indeed, Minimal Search is implicated in at least Internal Merge as well.
The criterion for a given instance of Search, it seems, depends on the purpose of that search. For Internal Merge, following a Free-Merge theory, the Search criterion is more or less identity—Internal Merge of X and Y requires a successful Search of X for Y, or vice versa.
In order to properly define a Minimal Search algorithm we must first consider some test cases. Each case is a complex abstract syntactic object containing two objects—G and H—each of which meets the search criterion, represented as a binary set as constructed by Merge. The first case in (36) is the most straightforward—G asymmetrically c-commands H, so Minimal Search retrieves G and not H.
{X, {G, {H, Y}}}
The second case in (37) is slightly more complicated—G does not c-command H, but Minimal Search should retrieve G because it is immediately contained in an object that asymmetrically c-commands an object that immediately contains H.
{X, {{G, Y}, {W, {{H, Z}, U}}}}
Other cases, though, will give ambiguous results. These are cases in which G and H are equidistant from the root. In (38), for instance G and H are siblings, while in (39) they are immediately contained, respectively, by siblings.
{X, {G, H}}
{X, {{G, Y}, {H, Z}}}
Our goal, then, is to construct an algorithm that has the above-defined results. There are two broad classes of search algorithms appropriate to our task—Depth-First Search (DFS) and Breadth-First Search (BFS). DFS starts at the root of an object and searches down to a terminal node before backtracking, as represented in (40), where the arrows and the numbers indicate the search order.
A DFS algorithm can be made minimal by designing it to stop as soon as it finds a node that meets its criterion. So, a Minimal DFS on Case 1 would proceed as in (41), selecting G.
However, in an ambiguous case, like Case 4, a Minimal DFS will incorrectly retrieve just a single object as shown in (42).
A Minimal DFS algorithm, then, is over-definite—it gives a definite result where we expect an ambiguous one.
There is also a deeper problem with DFS as applied to syntactic objects: its reliance on linear order as well as structure. In the examples above, whenever the algorithm reaches a branching node, it takes the left branch first. If it instead took the right branch first, the result would be different—in both (41) and (42), a right-to-left Minimal DFS would retrieve H rather than G. The problem is made worse by the fact that the structures we are searching are constructed by Merge and, therefore, do not have a linear order. In order for our algorithm to make a decision at a “branch,” then, it would have to make a random decision. Therefore, the result of a DFS for a given syntactic object may be different each time it is run. Given these issues, I will set aside DFS.
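The order dependence of Minimal DFS can be demonstrated with a small sketch; the tuple encoding deliberately imposes the linear order that Merge-built sets lack, which is exactly the problem at issue.

```python
def minimal_dfs(node, crit, left_first=True):
    """Toy Minimal DFS over nested pairs: return the first token meeting crit."""
    if not isinstance(node, tuple):               # a token
        return node if crit(node) else None
    a, b = node if left_first else node[::-1]     # branch order is imposed, not inherent
    return minimal_dfs(a, crit, left_first) or minimal_dfs(b, crit, left_first)

# Case 4: {X, {{G, Y}, {H, Z}}} — G and H are equidistant from the root,
# yet the branch order alone decides which one is retrieved:
case4 = ("X", (("G", "Y"), ("H", "Z")))
crit = lambda t: t in ("G", "H")
# minimal_dfs(case4, crit) → "G";  minimal_dfs(case4, crit, left_first=False) → "H"
```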
Breadth-First Search (BFS) algorithms, on the other hand, search neighbour nodes before proceeding lower in the tree, as represented in (43), where the arrows and the numbers indicate the search order.
Again, this can be made minimal by requiring that the algorithm stop immediately upon finding an object that matches the search criterion. A Minimal BFS on Case 2, then, is represented in (44).
Like the Minimal DFS, the Minimal BFS, as represented in (43) and (44), assumes that nodes are linearly ordered, even if that order is arbitrary. Unlike with the Minimal DFS, though, the order of the neighbour nodes does not matter, at least for definite cases like Case 1 and Case 2. To demonstrate this, consider the reverse version of (44) in (45).
In an ambiguous case, though, Minimal BFS suffers the same fate as Minimal DFS—it is over-definite. So, in Case 3, Minimal BFS will wrongly retrieve either G or H depending on the ordering of nodes, as shown in (46) and (47).
This flaw, however, can be overcome if, instead of traversing each node, we treat the sets of neighbour nodes as tiers, as in (48).
Minimal Tiered BFS, then, would visit each tier, extract the subset of that tier whose members all match the search criterion, and stop as soon as it extracts a non-null subset. Thus we can define a definite search result as in (49), an ambiguous search result as in (50), and a failed search as in (51).
For a syntactic object SO and criterion P, Search(SO,P) is definite iff
|Search(SO,P)| = 1
For a syntactic object SO and criterion P, Search(SO,P) is ambiguous iff
|Search(SO,P)| > 1
For a syntactic object SO and criterion P, Search(SO,P) is failed iff
Search(SO,P) = {}
Minimal Tiered BFS, then, will be our choice of Search algorithm. The next step is to formally define it.
In order to define Search, then, we need to be able to properly generate search tiers. So, for instance, the tiers for (37) are given in (52).
Tier 1 = {X, {{G, Y}, {W, {{H, Z}, U}}}}
Tier 2 = {{G, Y}, {W, {{H, Z}, U}}}
Tier 3 = {G, Y, W, {{H, Z}, U}}
Tier 4 = {{H, Z}, U}
Tier 5 = {H, Z}
Tier 6 = {}
For a given tier T_{i}, the next tier T_{i+1} is generated by taking the generalized union of the non-token members of T_{i}, where generalized union is defined as in (53).
For a set of sets X̅ = {X_{0},…,X_{n}}, ⋃X̅ = X_{0}∪…∪X_{n}.
Therefore we can define a procedure NextTier in (54) and with it, Search in (55).
For T, a set of syntactic objects,
NextTier(T)= ⋃{SO∈T: SO is not a lexical item token}.
For S, a set of syntactic objects, and Crit, a predicate of lexical item tokens,
Search(S, Crit) =
a. {X∈S : Crit(X) = 1}, if {X∈S : Crit(X) = 1} ≠ {},
b. {}, if S = {}, or
c. Search(NextTier(S), Crit), otherwise.
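NextTier and Search can be sketched directly; tokens are plain strings here for readability, and an iterative tier-by-tier loop stands in for the recursive formulation.

```python
def next_tier(tier):
    """NextTier(T): the union of the non-token members of T."""
    return frozenset().union(*(so for so in tier if isinstance(so, frozenset)))

def search(so, crit):
    """Minimal Tiered BFS: the first non-empty set of tokens meeting crit."""
    tier = so if isinstance(so, frozenset) else frozenset({so})
    while tier:
        found = frozenset(x for x in tier
                          if not isinstance(x, frozenset) and crit(x))
        if found:
            return found          # definite if one member, ambiguous if more
        tier = next_tier(tier)
    return frozenset()            # failed search

# Case 2: {X, {{G, Y}, {W, {{H, Z}, U}}}} — Search retrieves G, not H:
case2 = frozenset({"X", frozenset({
    frozenset({"G", "Y"}),
    frozenset({"W", frozenset({frozenset({"H", "Z"}), "U"})})})})
# Case 3: {X, {G, H}} — the ambiguous result {G, H}:
case3 = frozenset({"X", frozenset({"G", "H"})})
```

Because each tier is processed as a set, no linear order among neighbour nodes is ever consulted, which is the point of the tiered reformulation.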
Probe, then, is a special type of Search, where the search criterion is based on Match, as shown in (56).
For F, a feature type, and SO, a syntactic object that immediately contains X, a lexical item token,
Probe(SO,X,F) = Search(SO, Match^{X,F})
where Match^{X,F}(G) = 1 iff Match(X, G, F) = 1—that is, iff X contains a feature 〈F, v〉 with v ≠ 0 and G contains 〈F, 0〉.
With our definition of Probe in place, we can turn to the definition of Agree itself.
If and when an instance of Probe retrieves a lexical item token, that token must be modified—at least according to most versions of Agree.
Note that each copy of G must be replaced in order to maintain copyhood. Take, for example, the pronoun her in (57), which undergoes Internal Merge and therefore occurs as two copies in the structure sketched in (58).
We expect her to be hired.
{...{Voice, {3SgF_{[Case:_]}, {... 3SgF_{[Case:_]} ...}}}...}
If accusative Case marking were performed by a “minimal” Agree—one that valued only the highest copy—then the result would be the syntactic object in (59), in which the two instances of the third person feminine pronoun are distinct from each other and, therefore, no longer copies in any sense.
{...{Voice, {3SgF_{[Case:acc]}, {... 3SgF_{[Case:_]} ...}}}...}
The pronouns 3SgF_{[Case:acc]} and 3SgF_{[Case:_]} are clearly distinct—one is Case-marked, the other isn’t—and, furthermore, they have divergent derivational histories—one has undergone Value, the other hasn’t. What’s more, the lower pronoun is not Case-marked and should therefore cause a crash at the interfaces. In order to maintain the identity between copies, then, Agree must be maximal—it must Value every copy.
Thus we can define Agree as in (60).
Where SO is a syntactic object, F is a feature type, and v is a feature value,
Agree(SO, G, F_{v}) =
a. Value(SO, 〈F, v〉), if SO = G,
b. SO, if SO is a lexical item token distinct from G, or
c. Merge(Agree(A, G, F_{v}), Agree(B, G, F_{v})), if SO = {A, B}.
Agree, according to (60), is defined for three cases. In Case (60a), where SO is an instance of G—the lexical item token to be valued—the output of Agree is the valued version of G: Agree applies non-vacuously. In Case (60b), where SO is a lexical item token but not an instance of G, the output of Agree is SO: Agree applies vacuously. In Case (60c), where SO is a set, Agree is applied to each member of SO, and a new set containing the respective outputs of those Agree operations is constructed: Agree applies recursively. Note also that the result of Case (60c)—binary set-formation—is an instance of Merge, and I will treat it as such below.
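The three cases of (60) can be sketched as a recursive traversal under the assumed encoding; the lexical items (him, hire, Voice) and the accusative value 2 are illustrative.

```python
def value(token, feat):
    """Value (32): swap <F, 0> in SYN for <F, v>."""
    (phon, syn, sem), k = token
    f, v = feat
    return ((phon, (syn - {(f, 0)}) | {feat}, sem), k)

def agree(so, g, feat):
    """Agree per (60): (a) value G, (b) pass other tokens through,
    (c) recurse into sets and re-form them (an instance of Merge)."""
    if so == g:                          # (60a): non-vacuous application
        return value(so, feat)
    if not isinstance(so, frozenset):    # (60b): vacuous application
        return so
    return frozenset(agree(m, g, feat) for m in so)   # (60c): recursive case

# Voice values the object pronoun's Case feature:
pron = (("him", frozenset({("Case", 0)}), frozenset()), 7)
verb = (("hire", frozenset(), frozenset()), 8)
voice = (("", frozenset(), frozenset({("Case", 2)})), 9)   # <Case, 2> = accusative
alpha = frozenset({voice, frozenset({verb, pron})})
result = agree(alpha, pron, ("Case", 2))
```

Because the recursion visits every member of every set, every copy of the goal token is valued, which is the "maximal" behaviour argued for above.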
To see how Agree works, consider accusative Case marking of a third person singular masculine pronoun in object position, given the structure in (61) and the lexical item tokens partially specified below.
{_{α} Voice, {_{β} V, 3SgM_{[Case:_]}}}
Voice = 〈〈PHON_{Voice}, SYN_{Voice}, {…,〈Case, 2〉,...}〉, k〉
(Voice contains an Accusative Case feature in its SEM)
3SgM_{[Case:_]} = 〈〈PHON_{3SgM}, {…,〈Case, 0〉,…}, SEM_{3SgM}〉, k〉
(The 3rd person singular masculine pronoun contains an unvalued Case feature in its SYN)
{_{α} Voice, {_{β} V, 3SgM_{[Case:acc]}}}
The first step of this instance of Agree is to Probe for unvalued Case features, as in (63).
Probe(α, Voice, Case) = {3SgM_{[Case:_]}}
The non-Case-marked pronoun—the sole member of the result of Probe in (63)—is then the goal of the Agree operation, which proceeds as follows.
Agree(α, 3SgM_{[Case:_]}, Case_{acc}) =  (by (60c))
Merge(Agree(Voice, 3SgM_{[Case:_]}, Case_{acc}), Agree(β, 3SgM_{[Case:_]}, Case_{acc}))
Agree(Voice, 3SgM_{[Case:_]}, Case_{acc}) = Voice  (by (60b))
Merge(Agree(Voice, 3SgM_{[Case:_]}, Case_{acc}), Agree(β, 3SgM_{[Case:_]}, Case_{acc})) =  (by (65))
Merge(Voice, Agree(β, 3SgM_{[Case:_]}, Case_{acc}))
We then perform Agree on β which contains the verb and the direct object pronoun.
Agree(β, 3SgM_{[Case:_]}, Case_{acc}) =  (by (60c))
Merge(Agree(V, 3SgM_{[Case:_]}, Case_{acc}), Agree(3SgM_{[Case:_]}, 3SgM_{[Case:_]}, Case_{acc}))
Agree(V, 3SgM_{[Case:_]}, Case_{acc}) = V  (by (60b))
Agree(3SgM_{[Case:_]}, 3SgM_{[Case:_]}, Case_{acc}) =  (by (60a))
Value(3SgM_{[Case:_]}, Case_{acc}) =  (by (32))
3SgM_{[Case:acc]}
Merge(Agree(V, 3SgM_{[Case:_]}, Case_{acc}), Agree(3SgM_{[Case:_]}, 3SgM_{[Case:_]}, Case_{acc})) =  (by (68), (69))
Merge(V, 3SgM_{[Case:acc]})
Then, having reached the “bottom” of our structure, we are left with two simple Merge operations which yield (62) as shown in (71).
Merge(Voice, Agree(β, 3SgM_{[Case:_]}, Case_{acc})) =  (by (70))
Merge(Voice, Merge(V, 3SgM_{[Case:acc]})) =  (by (18))
Merge(Voice, {V, 3SgM_{[Case:acc]}}) =  (by (18))
{Voice, {V, 3SgM_{[Case:acc]}}}
We have arrived at a formal definition of one variety of Agree (LDDV-Agree), which we will use in the following section as a basis for defining other varieties.
In defining a Downward Valuation Agree, we considered syntactic objects such as the one schematized in (72) which immediately contain lexical item tokens bearing a valued feature F_{v} and which contain a lexical item token bearing an unvalued feature F_{0}.
{X_{F:v}, {… G_{F:0}}}
In an Upward Valuation, the relevant features of X and G are swapped, as in (73).
{X_{F:0}, {… G_{F:v}}}
In order to capture Upward Valuation, then, we first need to modify the Match criterion of Probe as in (74), moving X to the second argument position of Match.
For F, a feature type, and SO, a syntactic object that immediately contains X, a lexical item token,
Probe_{UV}(SO, X, F) = Search(SO, Match_{UV}^{X,F}), where Match_{UV}^{X,F}(G) = 1 iff Match(G, X, F) = 1.
Thus, Probe_{UV} gives a definite result {G} only if X contains an unvalued F feature and G contains a valued F feature. Since, by definition, the relevant unvalued feature in Agree_{UV} is at the top of the structure, we might think that no exhaustive search is required. Unfortunately, though, the same concern with valuing copies remains—just because a lexical item token is at the top of a tree doesn’t mean there isn’t a copy of it at the bottom. Therefore, our definition of Agree_{UV} in (75) looks similar to that in (60).
For lexical item token X, syntactic object SO = {X,...}, feature type F, and lexical item token G such that Probe_{UV}(α, X, F) = {G}, where SO = α or SO is contained in α,
Agree_{UV}(SO, X, F_{v}) =
a. Value(SO, 〈F, v〉), if SO = X,
b. SO, if SO is a lexical item token distinct from X, or
c. Merge(Agree_{UV}(A, X, F_{v}), Agree_{UV}(B, X, F_{v})), if SO = {A, B}.
Versions of Agree whose effects are feature checking rather than valuation assume that all formal features—the members of SYNF—enter the derivation fully valued, and differ only as to whether they have been checked. We can encode this by defining formal features as in (76) and (77).
A formal feature is a triple 〈c, F, v〉, where c∈{0, 1} indicates whether the feature has been checked, F is a feature type, and v is a feature value.
For all feature types F and values v, 〈0, F, v〉 is an unchecked feature.
Match_{✓}, then, compares a semantic feature of one lexical item token with a formal feature of another, succeeding if both features have the same type and value and the formal feature is unchecked, as defined in (78).
For any two lexical item tokens X and G, feature type F and value v,
Match_{✓}(X, G, F) = 1 iff 〈F, v〉 is a feature of X and 〈0, F, v〉 is a feature of G.
Finally, Check is a simple matter of flipping a 0 to a 1 or leaving a 1 as a 1 as in (79). Note, though, that Check will never apply to an already checked feature, since Match is a prerequisite for Check and will only succeed if the feature in question is unchecked.
For a lexical item token SO=〈〈PHON, SYN, SEM〉, k〉, and formal feature F_{v}=〈c, F, v〉,
Check(SO, F_{v}) = 〈〈PHON, (SYN–{F_{v}})∪{〈1, F, v〉}, SEM〉, k〉
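Check can be sketched the same way as Value, on the checking-theory encoding of formal features as (c, F, v) triples; the wh feature type and the value 5 are hypothetical labels.

```python
def check(token, formal_feat):
    """Check (79): remove the formal feature and add its checked variant."""
    (phon, syn, sem), k = token
    c, f, v = formal_feat
    new_syn = (syn - {formal_feat}) | {(1, f, v)}
    return ((phon, new_syn, sem), k)

# A token with an unchecked wh-feature:
tok = (("who", frozenset({(0, "wh", 5)}), frozenset()), 3)
checked = check(tok, (0, "wh", 5))
# checked == (("who", frozenset({(1, "wh", 5)}), frozenset()), 3)
```

Unlike Value, Check leaves the feature's type and value intact and flips only the checked bit, so its output trivially stays inside any feature set that contains checked counterparts.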
These newly defined functions can be slotted into our formalized definitions of Agree as in (80) to give a definition of Agree_{✓}, where G is the result of Probing based on Match_{✓}.
Where SO is a syntactic object, F_{v} is a formal feature, and G is a lexical item token such that Probe_{✓}(α, X, F_{v}) = {G}, where SO = α or SO is contained in α,
Agree_{✓}(SO, G, F_{v}) =
a. Check(SO, F_{v}), if SO = G,
b. SO, if SO is a lexical item token distinct from G, or
c. Merge(Agree_{✓}(A, G, F_{v}), Agree_{✓}(B, G, F_{v})), if SO = {A, B}.
Early minimalist theories of agreement took agreement to be a strictly local relation between a specifier and a head.
On its surface, Local Agree, as described above, has the advantage of not requiring an arbitrarily deep search of the entire derived expression. Instead, the search is strictly and specifically limited to the very top of the object. The canonical case of so-called “spec-head” agreement is the finite subject merged with the finite predicate, shown in (81).
TP = {{D, …}, {T, …}}
Restricting our discussion to Case, we can see that the Agree operation is an interaction between the lexical item token immediately contained in one member of TP and the lexical item token contained in the other member of TP. We can define Probe_{Local}, then, as in (82).
For feature type F, lexical item tokens X and Y, and syntactic object SO = {U, W},
Probe_{Local}(SO, X, F) = {Y} iff X is immediately contained in U, Y is immediately contained in W, and Match(X, Y, F) = 1.
It should be noted that Probe_{Local} makes no use of the notions “specifier” or “head.” Indeed, it assumes no structural asymmetry at all, only the valued-unvalued asymmetry.
It should also be noted that, since socalled “spechead” structures, especially those associated with Case and agreement, are often formed by Internal Merge, our final version of Agree_{Local}, much like longdistance Agree, will need to replace every instance of the object being valued/checked. Therefore, our final version of Agree_{Local}, is defined as in (83).
Where SO is a syntactic object, F is a feature type, v is a feature value ≠ 0, and G is a lexical item token such that Probe_{Local}(α, X, F) = {G}, where SO = α or SO is contained in α,
Agree_{Local}(SO, G, F_{v}) =
a. Value(SO, 〈F, v〉), if SO = G,
b. SO, if SO is a lexical item token distinct from G, or
c. Merge(Agree_{Local}(A, G, F_{v}), Agree_{Local}(B, G, F_{v})), if SO = {A, B}.
In this section, I provided a formal definition of one particular conception of Agree—Long-Distance Downward Valuation Agree—by first breaking it into individual pieces—Probe, Match, Value—for which I gave formal definitions, and then assembling those definitions in such a way that they define Agree. I then discussed a few alternative conceptions of Agree, showing how they could be defined by altering the previous definitions as minimally as possible. This description of the definition process might suggest that Agree is modular—that it consists of several independent operations that can be mixed and matched—but this is not the case. Rather, while the discussion of each alternative tended to focus on a single operation, the changes to that operation were such that they necessitated minor modifications to Agree as a whole. Agree, then, does seem to be a real operation, albeit a rather complex one, as I will demonstrate in the next section.
With the Agree operation properly formalized, we are in a position to investigate the operation’s theoretical properties, which have either not been remarked upon in the literature, or been discussed without the precision that formalization allows. Rather than investigating Agree in isolation, and following the premise that Agree is a full-fledged derivational operation like Merge, Select, and Transfer, this section will focus on those properties of Agree that distinguish it from other operations—Merge in particular.
We will first see that Agree differs from Merge and Select in that it is inherently recursively defined, while the latter two are defined non-recursively. Related to this, I will argue that the fact that our definition of Agree includes instances of Merge effectively rules out any general Agree requirement on Merge. Next, I show that, unlike Merge and Select, Agree does not close the set of syntactic objects, and that attempts to rectify this lead to problematic predictions for language acquisition. Finally, I discuss the implications of Agree for the No Tampering Condition (NTC).
In order to do so, though, we must first define UG_{Agree}, in (84), and the corresponding notion of derivation, in (85).
Universal Grammar is a 7-tuple:
〈PHONF, SYNF, SEMF, Select, Merge, Transfer, Agree〉
A derivation from lexicon L is a finite sequence of stages 〈S_{1},…,S_{n}〉,
where each S_{i} = 〈LA_{i}, W_{i}〉, such that:
For all LI and k such that 〈LI,k〉 ∈ LA_{1}, LI ∈ L,
W_{1} = {} (the empty set),
for all i<n, either:
(derive-by-Select) for some A∈LA_{i}, S_{i+1} = Select(A, S_{i}),
(derive-by-Transfer) …,
(derive-by-Merge) LA_{i+1}=LA_{i} and, for some A, B:
A∈W_{i},
Either A contains B or W_{i} immediately contains B, and
W_{i+1} = (W_{i}−{A, B})∪{Merge(A, B)},
(derive-by-Agree) or LA_{i+1}=LA_{i} and, for some SO, X, F_{v}:
SO∈W_{i},
SO immediately contains X,
Probe(SO, X, F_{v}) = {G}, and
W_{i+1} = (W_{i}−{SO})∪{Agree(SO, G, F_{v})}.
This definition of a derivation uses the names of its procedures, but in the case of Merge and Select, one could just as easily expand them to give their full definitions in terms of set theory, because they are non-recursive operations. Agree, however, is recursively defined—that is, it is defined in terms of itself: “Agree” appears on both the left-hand and right-hand side of the equals sign in (60)—so such an expansion is not possible. This is a fundamental difference between Agree and the other generative operations—Merge and Select are non-recursive functions, while Agree is recursive.
Beyond its recursive definition, there are a number of properties that set Agree apart from its fellow operations. First, since performing Agree on a syntactic object entails searching the object, modifying certain constituents, and putting the object back together, and since objects can only be put together by applying Merge, every non-trivial application of Agree includes at least one application of Merge. This is reflected in definitions (60) and (75), in which Merge appears in the intension of Agree.
Early in the minimalist program, it was proposed that instances of Merge are triggered by feature checking—that is, that something like Agree is a prerequisite for Merge.
The principal barrier to making Agree a prerequisite for Merge is that, as defined in (85), the derivation is a computational procedure and, therefore, is strictly incremental. That is, the validity of a given stage S_{n} (n≠1) depends solely on its form and the form of the immediately preceding stage S_{n−1}. Requiring every instance of Merge to be preceded by an instance of Agree, however, would mean that the validity of a stage S_{n} (n≠1) depends on its two preceding stages S_{n−1} and S_{n−2}. That is, S_{n} can be derived from S_{n−1} by Merge only if S_{n−1} is derived from S_{n−2} by Agree. A derivation, then, would need memory, albeit a very small amount of it.
On its face, this does not seem to be an insurmountable barrier but, as we shall see, it will end up ruling out the first instance of Merge in any derivation. To begin with, we reformulate our definition of derivation by adding a new condition to the derive-by-Merge clause in (86).
A derivation from lexicon L is a finite sequence of stages 〈S_{1},…,S_{n}〉,
where each S_{i} = 〈LA_{i}, W_{i}〉, such that:
For all LI and k such that 〈LI,k〉 ∈ LA_{1}, LI ∈ L,
W_{1} = {} (the empty set),
for all i<n, either:
(derive-by-Select) for some A∈LA_{i}, S_{i+1} = Select(A, S_{i}),
(derive-by-Transfer) …,
(derive-by-Merge) LA_{i+1}=LA_{i} and, for some A, B:
A∈W_{i},
Either A contains B or W_{i} immediately contains B,
S_{i} is derived from S_{i−1} by derive-by-Agree, and
W_{i+1} = (W_{i}−{A, B})∪{Merge(A, B)},
(derive-by-Agree) or LA_{i+1}=LA_{i} and, for some SO, X, F_{v}:
SO∈W_{i},
SO immediately contains X,
Probe(SO, X, F_{v}) = {G}, and
W_{i+1} = (W_{i}−{SO})∪{Agree(SO, G, F_{v})}.
Now, let’s consider an abstract subderivation of the syntactic object {X, Y} where X and Y are lexical item tokens. We start in S_{1}, given in (87) with an empty workspace and a lexical array containing at least X and Y.
S_{1}  =  〈LA_{1}, W_{1}〉
=  ⟨{X, Y, Z, …}, {}⟩
Next we perform Select twice, to bring X and Y into the workspace.
S_{2}  =  Select(X, S_{1})
=  ⟨{Y, Z, …}, {X}⟩
S_{3}  =  Select(Y, S_{2})
=  ⟨{Z, …}, {X, Y}⟩
Under a free Merge grammar, we would, at this point, simply Merge X and Y, but this option is not available to us, since derive-by-Merge in (86) requires a preceding Agree step. A Select step is possible here, but that would only postpone our dilemma. We need to perform Agree next.
Assuming that X could value Y for feature F—i.e., Match(X, Y, F) = 1—let’s consider the structural prerequisites. As stated in (86), X and Y must be contained in the same syntactic object SO, which, in turn, must be a member of the workspace. In S_{3}, however, both X and Y are members of the workspace, and there is no SO to speak of. No stage S_{4}, then, can be derived by Agree.
We’ve arrived, then, at an instance of circularity—every instance of Merge requires a preceding instance of Agree, and every instance of Agree requires a preceding instance of Merge. First Merge, then, is impossible if the definition of a derivation in (86) holds.
Since a computational procedure is essentially the repeated application of an operation, or set of operations, with each application providing the input for the following application, the domain of a given computational operation must be closed under that operation, as defined in (90).
Domain D is closed under operation O iff
for all d∈D, O(d)∈D.
In the case of our syntactic derivations, our domain is the set of stages, which C&S demonstrate are closed under derivebySelect and derivebyMerge. I have thus far been assuming that it is also closed under derivebyAgree, but that assumption is perhaps not strictly true, under our present definitions.
As defined, derivebyAgree is a function from stages to stages that modifies a stage’s workspace, by performing Agree on a syntactic object in that workspace. Therefore, the set of stages is closed under derivebyAgree iff the set of syntactic objects is closed under Agree. For its part, Agree operates on a given syntactic object SO by applying Value to SO if SO is an appropriate lexical item token, or to the appropriate lexical item tokens contained in SO otherwise. Therefore the set of syntactic objects is closed under Agree iff the set of lexical item tokens is closed under Value. We need only consider a simple instance of Value to see that this is not obviously the case.
Consider the lexical item token X_{k} in (91).
X_{k} = 〈〈PHON_{X}, SYN_{X}∪{〈F, 0〉}, SEM_{X}〉, k〉
where PHON_{X} ∈ PHONF*, SEM_{X} ⊂ SEMF, and SYN_{X}∪{〈F, 0〉} ⊂ SYNF.
What about the result of applying Value to X_{k}, given in (92)?
Value(X_{k}, 〈F, v〉) = 〈〈PHON_{X}, SYN_{X}∪{〈F, v〉}, SEM_{X}〉, k〉
where v is a nonzero integer.
Since PHON_{X} and SEM_{X} are drawn from PHONF* and SEMF respectively, the result in (92) is a lexical item token only if SYN_{X}∪{〈F, v〉} ⊂ SYNF—that is, only if SYNF contains the valued feature 〈F, v〉.
While there is no strictly formal reason against modifying our theory of features by hypothesizing that SYNF contains both valued and unvalued features, such a hypothesis would put us in something of a theoretical quandary. In the grammar assumed by this paper, language acquisition is at least partially a process of constructing lexical items from universal feature sets so that they match tokens in the primary linguistic data. The basic premise of Agree theory, though, is that unvalued features cannot surface and therefore must be valued during the derivation. If this is the case, then there are effectively no tokens of unvalued features in the primary linguistic data. Why, then, would a language acquirer ever construct a lexical item with an unvalued feature?
To take a concrete example, consider the case of French adjectives which show gender and number agreement as demonstrated in (93).
Gender | Sg | Pl
Fem    |    |
Masc   |    |
This situation is consistent with two sorts of lexicons if we assume lexically valued SYN features: lexicons with multiple fully valued entries for each adjective, as in (94), and lexicons with a single entry whose gender and number features are unvalued, as in (95).
LEX = {… ⟨PHON
Since the lexicon in (94) represents a surface analysis of adjective morphology, it would be more straightforward to acquire than (95), which requires an additional step of abstraction from the data. All else being equal, then, allowing SYNF to contain valued features would seem to predict the sort of lexicon in (94) for French. This, of course, would be consistent with a checking-based Agree, but not a valuation-based Agree.
Alternatively, we could assume that all and only unvalued features are members of SYNF—stated formally as an axiom in (96).
For all features ⟨F,v⟩, v=0 ↔ ⟨F,v⟩ ∈ SYNF.
This would remove the acquisition issue, since (94) would be an impossible lexicon, and would be consistent with the basic premise of Agree theory. It would still require theoretical explanation, though of the more general sort suggested in Footnote 10. It would also mean, however, that the set of lexical item tokens is not closed under Value, and therefore that the set of stages is not closed under a Value-based Agree. A Value-based Agree, then, could not be a computational operation in a version of UG_{Agree} with (96) as an axiom.
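The closure failure under axiom (96) can be made concrete with a small sketch (the feature inventory and function names are hypothetical; features are modeled as type-value pairs with 0 standing for "unvalued"):

```python
# With (96) as an axiom, SYNF contains all and only unvalued features.
SYNF = {("case", 0), ("phi", 0)}  # hypothetical inventory

def is_lexical_item_token(syn_features):
    """A token is well-formed only if its SYN features are all drawn from SYNF."""
    return syn_features <= SYNF

def value(syn_features, ftype, v):
    """Replace the unvalued feature <F,0> with the valued <F,v>, v nonzero."""
    return (syn_features - {(ftype, 0)}) | {(ftype, v)}

X = frozenset({("case", 0)})
assert is_lexical_item_token(X)
# Value's output contains ("case", 3), which is not in SYNF, so the result
# is not a lexical item token: the set is not closed under Value.
assert not is_lexical_item_token(value(X, "case", 3))
```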
In sum, in order for a valuation-based Agree such as the one defined in (60) to be viable as a computational procedure, we must expand the domain of possible lexical items in a theoretically questionable way.
One of the theorems of C&S’s formal grammar is the No Tampering Condition defined by
For any two consecutive stages in a derivation S_{1} = 〈LA_{1}, W_{1}〉 and S_{2} = 〈LA_{2}, W_{2}〉,
for all A contained in W_{1}, A is contained in W_{2}.
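A minimal sketch of (97) as a check on consecutive stages may be useful here. The encoding is hypothetical: complex syntactic objects are modeled as frozensets of their parts, and "contained in" as membership among subterms.

```python
def subterms(obj):
    """obj together with everything contained in it."""
    result = {obj}
    if isinstance(obj, frozenset):  # complex objects are frozensets of parts
        for part in obj:
            result |= subterms(part)
    return result

def respects_ntc(stage1, stage2):
    """Every object contained in W1 must still be contained in W2."""
    _, w1 = stage1
    _, w2 = stage2
    contained2 = set().union(*(subterms(x) for x in w2)) if w2 else set()
    return all(subterms(x) <= contained2 for x in w1)

a, b = "a", "b"
# Merge does not tamper: a and b survive inside the merged object {a, b}.
assert respects_ntc(((), {a, b}), ((), {frozenset({a, b})}))
# Replacing a token with a distinct valued token, as Agree does, tampers.
assert not respects_ntc(((), {a}), ((), {"a'"}))
```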
Since the effect of every form of Agree defined in this paper is to replace all instances of some lexical item token G in a workspace with a distinct token G′, Agree violates the NTC by design. The issues with UG_{Agree} discussed above, then, may be predicted by Chomsky’s conjecture that UG operations conform to the NTC. There are essentially two ways of dealing with this result: either we take the approach that C&S take with Transfer and modify Agree so that it does not violate the NTC, or we argue that “empirical evidence requires a departure from” the NTC. I will discuss each of these options in turn below.
A straightforward way of constructing an Agree operation that respects the NTC is to formally separate the content of a derived expression from its structure in some way, with Merge manipulating the structure and Agree manipulating the content. A stage of the derivation, then, would consist of a lexical array, a workspace, and a ledger, as in the definition in (98).
A stage is a triple S = 〈LA, W, L〉, where LA is a lexical array, W is a set of syntactic objects, and L is a set of pairs of lexical item tokens. We call W the workspace of S and L the ledger of S.
Rather than modifying lexical item tokens in place, Agree would add a pair 〈LI_{X}, LI_{G}〉 of agreeing lexical item tokens to the ledger.
This sort of move also fixes a number of issues already discussed regarding Agree. A version of Agree that respects NTC does not alter the workspace—it merely constructs an ordered pair and adds it to the ledger. It does not take apart and put back together an already constructed syntactic object, as standard Agree as defined in (60) does. Therefore, it does not need to be recursively defined, and it does not need to refer to Merge in its definition.
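The ledger-based move can be sketched as follows, as a toy illustration under the definition in (98) (all names hypothetical). Agree here only extends the ledger, leaving the workspace untouched:

```python
def agree_ledger(stage, li_x, li_g):
    """NTC-respecting Agree: record the agreeing pair in the ledger."""
    la, workspace, ledger = stage
    # The workspace is passed through unchanged; only the ledger grows.
    return (la, workspace, ledger | {(li_x, li_g)})

s1 = (("T", "DP"), frozenset({"SO"}), frozenset())
s2 = agree_ledger(s1, "T", "DP")
assert s2[1] == s1[1]            # workspace untouched: NTC respected
assert ("T", "DP") in s2[2]      # agreement recorded in the ledger
```

Morphological realization could then read agreeing pairs off the ledger at the interfaces, rather than off altered tokens in the workspace.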
This improvement aside, however, it also lays bare the fact that Agree as a syntactic-derivational operation is fundamentally redundant. The prerequisites for Agree are a structural relation (Search) and a content relation (Match) between two lexical item tokens. So, suppose X and G are lexical item tokens and, for some feature F, Match(X,G,F)=1. Further suppose that stage S
Even as stated by
The relative ubiquity of morphological agreement, for instance, might seem to be the sort of evidence we need, but it is not sufficient to invalidate the NTC. Consider, as a parallel, linear order. It is a plain fact that external linguistic expressions have linear order, but that linear order is still assumed to be absent in the grammar, at least in standard Merge-based grammars. Yet, as
Beth and Sara met Hanako and Máire respectively.
= Beth met Hanako and Sara met Máire.
≠ Beth met Máire and Sara met Hanako.
What we need, then, is evidence that standard Agree is occurring in a derivation interspersed with Merge.
Morphological case feeds φ-agreement in quirky-subject languages.
φ-agreement feeds movement to canonical subject position in non-quirky-subject languages.
The functioning of the grammar is uniform across languages (the Uniformity Principle).
The argument is logically sound, but it depends on an analysis of the evidence that is plausible, but not the only possible analysis. That is, it depends on the truth of the first two premises, which are empirical statements. Despite being empirical statements, though, they depend on two theoretical notions—“quirky subjects” and “canonical subject position”—to even be coherent. I will take for granted that the term “quirky subject” is coherent, and focus on “canonical subject position.”
Furthermore, it is worth noting that Preminger frames his premises in terms of “feeding” rather than “driving” or “triggering.” An operation X feeds another operation Y if X creates the necessary conditions for Y and X precedes Y. “Feeding,” then, speaks to the order of operations more than to causation.
One property of canonical subject position that Preminger is clear about is that it is syntactic—he says of movement to canonical subject position that it is “clearly syntactic (since it creates new binding configurations, for example)” (p. 177) and that it “is a syntactic process par excellence” (p. 184). We further know, based on the second premise of (100), which Preminger claims as an empirical result, that movement to canonical subject position in nonquirkysubject languages should always cooccur with φagreement. Since this latter requirement is an empirical claim, though, it should not be too directly tied to our definition lest our reasoning be circular. We can construct our definition by applying these two desiderata to some representative data.
Our representative data are given in (101), where the underlined subexpression is, could be, or has been considered to be in subject position in English.
There seem to be
They seemed
I expect
We believed
I believe that it is quite safe to label
This leaves us with non-finite subject position in (101c) to (101f). In each of these cases, the underlined expression could reasonably be said to be in a subject position, and to have moved there, yet there is no apparent φ-agreement associated with that move. We could reasonably reject
Canonical subject position, then, seems to refer to the specifier of finite T, at least in English. Assuming such a position can be defined well enough to support generalizations such as Preminger’s premises,
We could try to rescue (100) by restating (100b) as (100b’), which is coherent in UG_{Agree}, assuming “quirky subject” can be defined and the problems outlined above can be overcome.
(100b’) φ-agreement feeds Internal Merge in non-quirky-subject languages.
In order for this new premise to be true, though, movement to non-canonical subject position must also require
To recap, Preminger’s argument as given in (100), while seemingly logically sound, rests on the assumption that movement to canonical subject position is a bona fide syntactic operation, distinct from other types of movement. This assumption would be a departure from the theory assumed here, which takes all movement operations to be instances of Merge. Preminger’s conclusion that agreement takes place in the syntax, taken together with my argument above that Agree violates the NTC, implies that the NTC should be at least weakened
Throughout the exercise in formalization, many choices were made that could have been made differently, with various levels of consequence for the overall system. For instance, the choice, adopted from C&S, to include PHON as part of the LI was essentially the choice of an “early-insertion” theory of morphology. This choice, however, was of little consequence for the formalization of Agree, since it dealt exclusively with the SYN and SEM features of LIs. The choice to formalize features as type-value pairs, though, does have relevant consequences.
Suppose, for instance, I had adopted a geometric feature theory such as the one developed by
One formal definition of feature that would capture this is given in (103), with the 2nd person feature formalized as in (104).
X is a feature iff X is a pair 〈T, D〉, where T is a feature type and D is a feature or ∅ (its dependent).
〈π, 〈participant, 〈addressee, ∅〉〉〉
Where {π, participant, addressee, …} is the set of feature types.
Quite obviously, this would require us to redefine or replace our auxiliary notions like
On the other hand, I see no reason to expect that we must alter our Minimal Search algorithm in (55) or our final definition of Agree in (60) to account for alternative theories of features. Minimal Search is a general-purpose algorithm, as it does not depend on the particular search criterion, and Agree searches a structure and replaces Matching lexical item tokens with the result of Value, so long as Match is a predicate that compares lexical item tokens relative to features, and Value is a function from somehow-defective lexical item tokens to less-defective lexical item tokens.
Likewise, were one able to adequately define a minimal DFS algorithm, or if one adopted a ledger-based model of Agree, there would not necessarily be any reason to abandon either the type-value or the geometric theory of features. Agree, Search, and Match/Value, then, are to a certain extent modular with respect to each other and, while the limits of that modularity are a purely theoretical question, the final choice of individual theories will depend on a combination of theoretical and empirical concerns.
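The modularity point can be sketched by parameterizing a general-purpose top-down search with an arbitrary match predicate, so that swapping feature theories means swapping only the predicate (a toy illustration; all names are hypothetical):

```python
from collections import deque

def minimal_search(root, match):
    """Breadth-first search: return the shallowest subobject satisfying match."""
    queue = deque([root])
    while queue:
        node = queue.popleft()
        if match(node):
            return node
        if isinstance(node, tuple):  # complex objects are tuples of parts
            queue.extend(node)
    return None

tree = (("T", ("v", "DP")),)
# A type-value theory might match on an exact token...
assert minimal_search(tree, lambda n: n == "DP") == "DP"
# ...while a geometric theory supplies a different predicate; the search
# algorithm itself is unchanged.
assert minimal_search(tree, lambda n: isinstance(n, str) and n.endswith("P")) == "DP"
```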
The task of formalizing a theoretical conjecture occupies an odd place in the sciences. While it generally does not bring anything new to the table, it does give us the opportunity to objectively assess the validity and theoretical prospects of various informal proposals. By formalizing various proposals for Agree as a syntactic operation, we can see that what is often shown as a simple curved arrow on tree diagrams is actually a rather complicated computational operation. Not only is this complexity apparent simply from the size of the formal definition compared, say, to that of Merge, but it is reflected in the theoretical complexities identified in Section 4.
In its current state, then, Agree should not be taken for granted, even with what seems to be overwhelming evidence of its existence. This, however, leaves the theory in an awkward position—the phenomena that Agree is supposed to explain appear to be real and rather ubiquitous, but our tool for explaining them is not yet ready. If we are engaged in rational inquiry (
This has been true of generative grammars since their inception. Chomsky’s early major publications, for instance, are replete with references to “technical devices for expressing a system of recursive processes” (
See
Chomsky calls the operation Attract in this work.
The grammar C&S formalize seems to assume an “early-insertion” theory of morphology. Under a “late-insertion” theory of morphology (
See
The operation Select is not to be confused with (c/s)-selection. The former, indicated by capitalization, is a purely formal/theoretical construct, while the latter is an empirical generalization about categorial/semantic restrictions on phrase structure.
The two elements participating in Agree are commonly referred to as the probe and the goal respectively. The term “probe” is also often used to refer to the search process associated with Agree. To avoid this confusing ambiguity, I do not use “probe” and “goal” to refer to elements.
Multiple commentators have noted that a more intuitive and simple definition of Match would allow an X with [F:
The fact that SYNF and SEMF seem to be disjoint subsets of a natural class of features suggests that they are not independent of each other. Indeed, the Strong Minimalist Thesis (SMT) would say that there is only one set of features in UG: SEMF. This, of course, raises a number of fascinating questions which are beyond the scope of this paper.
An anonymous reviewer points out that, although
While
Match^{X,F} can be formally defined using the technique of λ-abstraction as (λ
If we wished to define Agree purely as a relation—
Interestingly, C&S also define Transfer recursively. It follows, then, that Transfer should also be considered a different kind of operation, a conclusion also predicted by the fact that Transfer is generally considered an operation of the interfaces rather than of Narrow Syntax.
See
This is not to say that tying Agree to Merge in some way will always be a dead end. On the contrary, one of, for instance,
See theorems 2 and 3 in
An anonymous reviewer points out that the evidence that
It should be noted that the modifiers “quirky” and “canonical” are both subjective in nature: they denote degrees of conformity to some norm, suggesting that the phenomena they refer to have not yet been given a theoretical explanation, just as the terms “
We might call it the canonical canonical subject.
See
The expletive
It might be argued that the theory assumed here cannot account for the range of data that Preminger discusses and should, therefore, be rejected. Such an objection, I would argue, entirely mistakes the nature of scientific, and more broadly rational, inquiry. While a full airing of this argument is beyond the scope of this paper, I will merely ask the reader to consider two points:
1. No scientific theory enjoys or has ever enjoyed complete empirical coverage, even within its own domain.
2. Despite common narratives to the contrary, progress in the sciences is almost always led by theoretical progress rather than by the collection of novel data.
The author has no funding to report.
The author has declared that no competing interests exist.
The author has no additional (i.e., nonfinancial) support to report.