1 Introduction
Minimalist Principles & Parameters theories of grammar deal mainly in procedures which generate linguistic expressions from atoms in an incremental fashion. That is, these theories traffic in computational procedures, in the sense of Turing, Church, Post, et al.,^{1} that relate stage n of a derivation to stage n+1 of that same derivation in a regular, well-defined way. From this perspective, Merge is the crown jewel of these theories. It has been developed with the two main goals of formal explicitness and descriptive adequacy. Much of the current literature in minimalist P&P grammar, however, assumes the existence of a second core procedure, Agree.
As its name suggests, Agree is the operation responsible for grammatical agreement (subject-predicate agreement, case marking, etc.), and I argue in this paper that it has yet to be defined with sufficient precision for its theoretical and empirical properties to be properly analyzed.^{2} The correct characterization of Agree, as with any theoretical proposal, ultimately depends on empirical and theoretical considerations. Virtually the entire contemporary Agree literature, however, focuses on empirical concerns to the exclusion of theoretical questions.^{3} This paper seeks to remedy that gap somewhat. This assertion may seem to be contradicted by the sheer number of theories of Agree that have been proposed: Chomsky (2000) begins with what might be called Classical Agree, and scholars later propose Cyclic Agree (Béjar & Rezac, 2009), Local Agree (Hornstein, 2009), Fallible Agree (Preminger, 2014), and Upward Agree (Zeijlstra, 2012; Bjorkman & Zeijlstra, 2014), to name only those theories of Agree which have names. In fact, the proliferation of such theories is to be expected when inquiry is guided by the empirical rather than the theoretical, just as the proliferation of empirical predictions is to be expected when inquiry is guided by the theoretical.
Take, for instance, the recent debate regarding Upward vs. Downward Agree (Preminger, 2013; Zeijlstra, 2012). This debate turns entirely on whether one version of Agree can capture a certain set of data while the other cannot. The debate tacitly assumes that both versions are definable given shared theoretical assumptions, and makes no real effort to investigate what, if any, implications either might have for the broader grammatical theory in which it is embedded. Indeed, the contrast between the two types of Agree seems to be an unquestioned theoretical assumption, one which perhaps need not be made.
This lack of theoretical assessment of Agree is troubling, since the operation has been implicated in a wide range of grammatical phenomena beyond the morphological agreement phenomena from which it takes its name. Agree has been argued to be necessary to explain movement/Internal Merge (Chomsky, 1995, p. 274 ff.),^{4} binding (Rooryck & Wyngaerd, 2011), and External Merge (Wurmbrand, 2014), among many other phenomena. Indeed, it is difficult to find a single phenomenon that falls under the umbrella of syntax which has not been given an Agree-based analysis.
This proliferation of theories of Agree is further exacerbated by the fact that, since its inception, analyses in Generative Grammar have always had both derivational and representational expressions. In the theory used in Aspects (Chomsky, 1965), for instance, (1) can be given three formal expressions—one derivational expression in (2), and two representational expressions in (3) and (4).
1
Sincerity may frighten the boy.
2
S  (by S → NP^{⌢}Aux^{⌢}VP)  (cf. Chomsky, 1965, p. 68) 
NP^{⌢}Aux^{⌢}VP  (by VP→V^{⌢}NP)  
NP^{⌢}Aux^{⌢}V^{⌢}NP  (by NP→Det^{⌢}N)  
NP^{⌢}Aux^{⌢}V^{⌢}Det^{⌢}N  (by NP→N)  
N^{⌢}Aux^{⌢}V^{⌢}Det^{⌢}N  (by Det→the)  
N^{⌢}Aux^{⌢}V^{⌢}the^{⌢}N  (by Aux→M)  
N^{⌢}M^{⌢}V^{⌢}the^{⌢}N  (by M→may)  
N^{⌢}may^{⌢}V^{⌢}the^{⌢}N  (by N→sincerity)  
sincerity^{⌢}may^{⌢}V^{⌢}the^{⌢}N  (by N→boy)  
sincerity^{⌢}may^{⌢}V^{⌢}the^{⌢}boy  (by V→frighten)  
sincerity^{⌢}may^{⌢}frighten^{⌢}the^{⌢}boy 
3
[_{S} [_{NP} Sincerity_{N} ] [_{Aux} may_{M} ] [_{VP} frighten_{V} [_{NP} the_{Det} boy_{N} ]]]
The formal expressions in (2) to (4) are all roughly equivalent, though each highlights a different aspect of the analysis it represents.
Since Generative Grammar within the P&P tradition is a computational theory, the derivational expression of a given analysis has always been the ultimate expression—a representation is a valid analysis in such a theory only insofar as it can be derived in that theory. Representational expressions, on the other hand, are much more concise and accessible, so they have been overwhelmingly used as shorthands for derivational expressions; but they are useful as shorthands only insofar as all of the information they encode can also be represented derivationally.
These representational expressions become problematic, however, when they are augmented for the sake of clarity. For instance, movement/Internal Merge can be represented without arrows, as in (5), but more often arrows are added for ease of understanding, as in (6), though (5) and (6) are assumed to be equivalent.
It is, perhaps, understandable that Agree, commonly represented by arrows similar to movement arrows as in (7), is assumed to have the same level of theoretical underpinning as movement.
To date, though, there has been no proposal for a derivational expression of the arrow in (7). Part of the task of this paper, then, is to remedy this oversight.
To that end, I will expand the formalization of minimalist syntax developed by Collins and Stabler (2016). I sketch this formalization, which is based on a more-or-less contemporary theory within the minimalist program, and extend it to include Agree. While I focus on what I call Long-Distance Downward Valuation (LDDV) Agree, I also discuss how my definitions could be adjusted to reflect other theories, such as those that assume feature checking or upward valuation, as well as local varieties of Agree. I then consider the theoretical implications of my definition of Agree, including its relation to Merge, its implications for the Lexicon, and its relation to the No Tampering Condition. Finally, I give some concluding remarks.
2 What Does a Definition Look Like?
Collins and Stabler (2016) provide a framework for formal definition. Their formalism uses sets and their basic predicates, relations, and operations (membership, subset, set difference, etc.) and finite sequences, referred to as “pairs,” “triples,” and so on depending on their length. Using these formal notions, the grammar they define is such that a number of organizing principles of minimalist theories are provable as theorems of this system. I will be defining Agree in this framework, and in order to understand what it means to define a derivational operation, I must first lay out some basic definitions, starting with Universal Grammar (UG) in (8).
8
Universal Grammar is a 6-tuple: 〈PHONF, SYNF, SEMF, Select, Merge, Transfer〉
PHONF, SYNF, and SEMF are universal sets of phonetic, syntactic, and semantic features, respectively; Select, Merge, and Transfer are operations. I will begin the outline of the formal grammar with the feature sets, postponing discussion of the operations for now. Collins and Stabler (2016) (hereafter C&S) also define the set PHONF* as the set of all possible phonetic strings. Features from these sets are grouped together to form lexical items, which are grouped into a lexicon, which effectively defines individual grammars, as in (9) to (11).^{5}
9
A lexical item is a triple: LI = 〈PHON, SYN, SEM〉
where SEM and SYN are finite sets such that SEM ⊂ SEMF, SYN ⊂ SYNF, and PHON ∈ PHONF*.
10
A lexicon is a finite set of lexical items.
11
An ILanguage is a pair 〈Lex, UG〉, where Lex is a lexicon and UG is Universal Grammar.
In order to capture the Copy/Repetition distinction, C&S introduce lexical item tokens, defined in (12), which are the atoms of syntactic computation. C&S also define several other useful terms using LI tokens.^{6}
12
A lexical item token is a pair LI_{k} = 〈LI, k〉, where LI is a lexical item, and k is an integer.
13
A lexical array is a finite set of lexical item tokens.
14
X is a syntactic object iff:
X is a lexical item token, or X is a set of syntactic objects.
15
Let A and B be syntactic objects, then B immediately contains A iff A∈B.
16
Let A and B be syntactic objects, then B contains A iff
B immediately contains A, or for some syntactic object C, B immediately contains C and C contains A.
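Definitions (12)–(16) lend themselves to a direct computational rendering. The following is a minimal sketch in Python under my own encoding, not C&S's: a lexical item token is a pair (LI, k), a complex syntactic object is a frozenset, and the helper names `is_token`, `immediately_contains`, and `contains` are mine.

```python
def is_token(so):
    """A lexical item token (12) is a pair (LI, k); complex objects are frozensets."""
    return isinstance(so, tuple)

def immediately_contains(b, a):
    """(15): B immediately contains A iff A is a member of B."""
    return not is_token(b) and a in b

def contains(b, a):
    """(16): B contains A iff B immediately contains A, or some member of B contains A."""
    return not is_token(b) and (a in b or any(contains(c, a) for c in b))

# Example object {X, {G, {Y, H}}}; the one-letter lexical items are placeholders
X, G, Y, H = (("X",), 1), (("G",), 2), (("Y",), 3), (("H",), 4)
so = frozenset({X, frozenset({G, frozenset({Y, H})})})
contains(so, H)               # True: H is properly contained in the object
immediately_contains(so, H)   # False: H is not a member of the root set
```

The recursion in `contains` mirrors the recursive clause of (16) directly.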
C&S then define a generative framework, wherein complex syntactic objects are derived in stages.
17
A stage is a pair S = 〈LA, W〉, where LA is a lexical array [a possibly ordered set of lexical item tokens] and W is a set of syntactic objects. We call W the workspace of S.
The operations Merge, Select, and Transfer operate on stages and derive new stages. Merge is binary set-formation, Select moves lexical item tokens from the lexical array to the workspace,^{7} and Transfer converts syntactic objects into interface objects. Merge and Select are rather simple, as shown in (18) and (19). Transfer, on the other hand, is more complicated—so much so that C&S devote five sections of their paper to developing its definition. Since Transfer is not strictly relevant to this paper, I will omit its definition.
18
Given any two distinct syntactic objects A, B, Merge(A,B) = {A,B}.
19
Let S be a stage in a derivation S = 〈LA, W〉.
If lexical item token A∈LA, then Select(A, S) = 〈LA – {A}, W∪{A}〉.
Thus, we can define the central notion of derivation in (20).
20
A derivation from lexicon L is a finite sequence of stages 〈S_{1}, …, S_{n}〉, for n≥1, where each S_{i} = 〈LA_{i}, W_{i}〉, such that
– for all LI and k such that 〈LI, k〉∈LA_{1}, LI∈L,
– W_{1} = {} (the empty set),
– for all i such that 1≤i<n, either
(derive-by-Select) for some A∈LA_{i}, 〈LA_{i+1}, W_{i+1}〉 = Select(A, 〈LA_{i}, W_{i}〉), or
(derive-by-Transfer) …, or
(derive-by-Merge) LA_{i} = LA_{i+1}, and the following conditions hold for some A, B:
– A∈W_{i},
– either A contains B [Internal Merge] or W_{i} immediately contains B [External Merge], and
– W_{i+1} = (W_{i} – {A, B}) ∪ {Merge(A, B)}.
So, abstracting away from certain representational complexities, the sentence Brian smiles would be derived as in (21).
21
(S_{1})  ⟨{T_{Pres}, smile, Brian}_{LA1}, {}_{W1}⟩  (by Select(Brian, S_{1})) 
(S_{2})  ⟨{T_{Pres}, smile}_{LA2}, {Brian}_{W2}⟩  (by Select(smile, S_{2})) 
(S_{3})  ⟨{T_{Pres}}_{LA3}, {Brian, smile}_{W3}⟩  (by Merge(smile, Brian)) 
(S_{4})  ⟨{T_{Pres}}_{LA4}, {{Brian, smile}}_{W4}⟩  (by Select(T_{Pres}, S_{4})) 
(S_{5})  ⟨{}_{LA5}, {T_{Pres}, {Brian, smile}}_{W5}⟩  (by Merge(T_{Pres}, {Brian, smile})) 
(S_{6})  ⟨{}_{LA6}, {{T_{Pres}, {Brian, smile}}}_{W6}⟩  (by Merge(Brian, {T_{Pres}, {Brian, smile}})) 
(S_{7})  ⟨{}_{LA7}, {{Brian, {T_{Pres}, {Brian, smile}}}}_{W7}⟩ 
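The derivation in (21) can be replayed step by step under the same kind of encoding. This is a sketch under my own assumptions: stages are ordinary Python pairs, lexical arrays and workspaces are frozensets, and the token representations are simplified placeholders.

```python
def merge(a, b):
    """(18): Merge(A, B) = {A, B}, for distinct A and B, encoded as a frozenset."""
    assert a != b
    return frozenset({a, b})

def select(a, stage):
    """(19): move token A from the lexical array to the workspace."""
    la, w = stage
    assert a in la
    return (la - {a}, w | {a})

# Placeholder tokens for T_Pres, smile, Brian (pair-of-(LI, k) encoding)
T = (("T_Pres",), 1); smile = (("smile",), 2); brian = (("Brian",), 3)

s1 = (frozenset({T, smile, brian}), frozenset())       # (S_1)
s2 = select(brian, s1)                                 # (S_2)
s3 = select(smile, s2)                                 # (S_3)
vp = merge(smile, brian)                               # derive-by-Merge update:
s4 = (s3[0], (s3[1] - {smile, brian}) | {vp})          # (S_4)
s5 = select(T, s4)                                     # (S_5)
tp = merge(T, vp)
s6 = (s5[0], (s5[1] - {T, vp}) | {tp})                 # (S_6)
s7 = (s6[0], (s6[1] - {tp}) | {merge(brian, tp)})      # (S_7): Internal Merge of Brian
```

Each derive-by-Merge step applies the workspace update W_{i+1} = (W_{i} – {A, B}) ∪ {Merge(A, B)} from (20) inline.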
C&S’s formalization is open to refinements, such as those that Chomsky (2020) suggests, and to extensions, and it provides us with a framework for both. In order to add Agree to the formal grammar, for instance, we would need to define it as a function from stages to stages, to be added as a derive-by-Agree clause to (20), and in order to define such a function, as we shall see, we will need a formal definition of features.
3 Defining Agree
Agree can be very broadly described as an operation that modifies a syntactic object X iff X stands in a particular formal/structural relation and a particular substantive relation with another syntactic object Y. So, in order to define Agree, we must formalize (a) the formal/structural prerequisite—Probe, a species of Search—(b) the substantive prerequisite—Match—and (c) the process of modifying the syntactic object in question—Value or Check—each of which has, in a sense, been the focus of its own debate in the literature. As a starting point, I will formalize Long-Distance Downward Valuation Agree (LDDV-Agree), which is more or less the version of Agree put forth by Wurmbrand (2014) and which has the following properties. LDDV-Agree is long-distance in that it does not require a strictly local relation between the Agreeing syntactic objects; rather, the two elements stand in a c-command-plus-relativized-minimality relation, as specified in (22).^{8}
22
Two elements X and Y can Agree iff X c-commands Y, Y Matches X, and there is no element H such that H Matches X, X c-commands H, and H c-commands Y.
LDDV-Agree is downward in the sense that it modifies the c-commanded element, and it is valuation-based in the sense that the element is modified by converting one of its unvalued features into a valued one, as specified in (23) and (24).
23
X Matches Y for feature F iff X has [F:val] and Y has [F:__].^{9}
24
If X and Y Agree for feature F then [F:__] on Y becomes [F:val].
The first thing we must do is formalize the notion of “feature” as used here. By (8), there are three sets of features in Universal Grammar—PHONF, SYNF, SEMF. Setting aside PHONF as irrelevant to the current paper, our task is to formalize the members of SYNF and SEMF. Generally, a given syntactic or semantic feature is describable with reference to its interpretability, its type, and its value (or lack thereof). Interpretability can be taken care of by simple set membership—interpretable features are members of SEMF, uninterpretable features are members of SYNF—leaving us with type and value.^{10} Keeping with Wurmbrand (2014) as our basis, then, we can define features as in (25), along with a few auxiliary notions defined in (26)–(28).^{11}
25
A feature is a pair 〈F, v〉—hereafter abbreviated F_{v}—where v is an integer. F is called the feature type, v is the feature value.
26
For all feature types F, 〈F, 0〉 is an unvalued F feature.
27
For lexical item LI = 〈PHON, SYN, SEM〉, feature F_{v} is a feature of LI, iff F_{v}∈SYN or F_{v}∈SEM.
28
For lexical item token LI_{k} = 〈LI, k〉, feature F_{v} is a feature of LI_{k}, iff F_{v} is a feature of LI.
So, for instance, English present tense might have roughly the lexical representation in (29).
29
〈PHON, {…〈φ, 0〉…}, {…〈T, 1〉...}〉
This lexical item has some phonetic features, an unvalued uninterpretable φ-feature, and an interpretable T feature with the value 1, which we can stipulate is interpreted as present tense. The choice to formalize feature values as integers is made only to allow for a perspicuous way of defining unvalued features. We could use any type of discrete symbol to represent values, provided it has a special symbol for “unvalued.”
We can define Match as in (30).
30
For any two lexical item tokens X, G and feature type F,
Match(X, G, F) = 1 iff for some feature value v ≠ 0, 〈F, v〉 is a feature of X and 〈F, 0〉 is a feature of G.
Under this definition, an English finite T head will match a non-Case-marked pronoun but not a Case-marked one, as demonstrated in (31).
31

Match(T, 3SgF_{[Case:_]}, Case) = $Match\left(\begin{array}{c}\u27e8\u27e8{PHON}_{T},{SYN}_{T},\left\{...\u27e8Case,1\u27e9...\right\}\u27e9,k\u27e9,\\ \u27e8\u27e8{PHON}_{3SgF},\left\{...\u27e8Case,0\u27e9...\right\},{SEM}_{3SgF}\u27e9,k\mathrm{\text{'}}\u27e9...\end{array},Case\right)=1$ 
Match(T, 3SgF_{[Case:ACC]}, Case) = $Match\left(\begin{array}{c}\u27e8\u27e8{PHON}_{T},{SYN}_{T},\left\{...\u27e8Case,1\u27e9...\right\}\u27e9,k\u27e9,\\ \u27e8\u27e8{PHON}_{3SgF},\left\{...\u27e8Case,2\u27e9...\right\},{SEM}_{3SgF}\u27e9,k\mathrm{\text{'}}\u27e9...\end{array},Case\right)=0$
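The two computations in (31) can be checked mechanically. The following is a sketch under my own encoding of (25)–(30), with illustrative placeholder feature bundles rather than claims about the actual feature content of T or the pronoun.

```python
def features(token):
    """(27)/(28): the features of a token are those of its LI's SYN and SEM."""
    (_phon, syn, sem), _k = token
    return syn | sem

def match(x, g, f):
    """(30): X bears a valued F feature and G bears an unvalued one."""
    has_valued = any(ft == f and v != 0 for (ft, v) in features(x))
    has_unvalued = (f, 0) in features(g)
    return has_valued and has_unvalued

# (31), schematically: finite T (valued Case) vs. two pronouns
T = (("phon_T", frozenset({("phi", 0)}), frozenset({("Case", 1), ("T", 1)})), 1)
her_nocase = (("phon_her", frozenset({("Case", 0)}), frozenset({("phi", 31)})), 2)
her_acc = (("phon_her", frozenset({("Case", 2)}), frozenset({("phi", 31)})), 3)
match(T, her_nocase, "Case")   # True: T can value the Case-less pronoun
match(T, her_acc, "Case")      # False: the pronoun is already Case-marked
```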
Value is essentially a replacement operation: it operates on a lexical item token, swapping an unvalued feature for its valued counterpart. This is defined in (32).
32
For lexical item token LI_{k}=〈〈PHON, SYN, SEM〉, k〉, and feature 〈F, v〉,
Value(LI_{k}, 〈F, v〉) = 〈〈PHON, (SYN–{〈F, 0〉})∪{〈F, v〉}, SEM〉, k〉
So, an instance of Value associated with subjectpredicate agreement, ignoring Case, might look something like (33).
33
Where T_{Pres} = (29) and 〈φ, 31〉 corresponds to 3rd person singular,
Value(〈T_{Pres}, 4〉, 〈φ, 31〉) = 〈〈PHON, {...〈φ, 31〉…}, {...〈T, 1〉...}〉, 4〉
The resulting lexical item token still has an interpretable tense feature and an uninterpretable φ-feature, but the latter now has the value 31, which we stipulate corresponds to 3rd person singular.
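Value as defined in (32) is a one-line set update under this encoding. A sketch, with my own placeholder representation of (29):

```python
def value(token, feat):
    """(32): swap the unvalued F feature out of SYN for the valued one."""
    (phon, syn, sem), k = token
    f, v = feat
    return ((phon, (syn - {(f, 0)}) | {(f, v)}, sem), k)

# (33), schematically: T_Pres with an unvalued phi-feature, valued as 31
# ("phon_T" and the feature bundles are illustrative placeholders)
t = (("phon_T", frozenset({("phi", 0)}), frozenset({("T", 1)})), 4)
t31 = value(t, ("phi", 31))   # SYN becomes {("phi", 31)}; SEM is untouched
```

Note that the valued feature lands back in SYN, which is exactly the point made in the next paragraph.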
Note that, while I have been tacitly assuming that (un)valuedness and (un)interpretability are correlated in the lexicon—that all and only unvalued features are members of SYNF—the definition of Value in (32) contradicts this assumption, since the result of Value is an element that contains at least one valued uninterpretable feature.
In fact, any attempt to maintain the general assumption that all uninterpretable features are unvalued runs into issues. We could save it by eliminating Value, but this would contradict one of the core premises of Agree theory—that Agree modifies lexical item tokens mid-derivation. Alternatively, we could save it by redefining Value, say as Value′ in (34), which removes the unvalued feature from SYN and adds a valued feature to SEM.
34
For lexical item token LI_{k}=〈〈PHON, SYN, SEM〉, k〉, and feature 〈F, v〉,
Value′(LI_{k}, 〈F, v〉) = 〈〈PHON, SYN–{〈F, 0〉}, SEM∪{〈F, v〉}〉, k〉
With this definition, though, our subject-predicate agreement would look like (35), which seems to create a T head which is semantically 3rd person singular—something that does not exist, at least in English.
35
Where T_{Pres} = (29) and 〈φ, 31〉 corresponds to 3rd person singular,
Value′(〈T_{Pres}, 4〉, 〈φ, 31〉) = 〈〈PHON, {…}, {...〈T, 1〉, 〈φ, 31〉...}〉, 4〉
The operation defined in (32), then, seems to match the notion of valuation generally assumed in theories of UG with Agree. It does, however, make a problematic prediction, which I address below.
The last portion of Agree to be defined is what is often called “Probe,” an instance of “Minimal Search” (Chomsky, 2004), an algorithm that requires some discussion.
3.1 Minimal Search
The term Minimal Search, as it is usually used in minimalist syntactic theory, refers to an algorithm that retrieves the “highest” object in a structure that meets some particular criterion. While such an algorithm is almost certainly required for Agree, it is not required only for Agree. Indeed, Minimal Search is implicated in at least Internal Merge (Chomsky, 2020) and labelling (Chomsky, 2013).
The criterion for a given instance of Search, it seems, depends on the purpose of that search. For Internal Merge, following a Free-Merge theory, the Search criterion is more or less identity—Internal Merge of X and Y requires a successful Search of X for Y or vice versa. Chomsky’s (2013) Labelling Algorithm Searches for any lexical item token, and a Search in service of Agree will use Match as defined in (30) as its criterion. Thus, our definition of Search, while guided by the present goal of formalizing Agree, must be general.
In order to properly define a Minimal Search algorithm we must first consider some test cases. Each case is a complex abstract syntactic object containing two objects—G and H—each of which meets the search criterion. Each case is represented as a binary set as constructed by Merge (equivalently, a binary tree). The first case in (36) is the most straightforward—G asymmetrically c-commands H, so Minimal Search retrieves G and not H.
36
Case 1: G is retrieved.

{X, {G, {Y, H}}}

The second case in (37) is slightly more complicated—G does not c-command H, but Minimal Search should retrieve G because it is immediately contained in an object that asymmetrically c-commands an object that immediately contains H.
37
Case 2: G is retrieved.

{X {{G, Y}, {W, {{H, Z}, U}}}}

Other cases, though, will give ambiguous results. These are cases in which G and H are equidistant from the root. In (38), for instance, G and H are siblings, while in (39) they are immediately contained, respectively, by siblings.
38
Case 3: Both G and H are retrieved.

{X, {G, H}}

39
Case 4: Both G and H are retrieved.

{X, {{G, Y}. {H, Z}}}

Our goal, then, is to construct an algorithm that produces the results defined above. There are two broad classes of search algorithms appropriate to our task—Depth-First Search (DFS) and Breadth-First Search (BFS). DFS starts at the root of an object and searches down to a terminal node before backtracking, as represented in (40), where the arrows and numbers indicate the search order.
A DFS algorithm can be made minimal by designing it to stop as soon as it finds a node that meets its criterion. So, a Minimal DFS on Case 1 would proceed as in (41), selecting G.
However, in an ambiguous case, like Case 4, a Minimal DFS will incorrectly retrieve just a single object as shown in (42).
A Minimal DFS algorithm, then, is over-definite—it gives a definite result where we expect an ambiguous one.
There is also a deeper problem with DFS as applied to syntactic objects: its reliance on linear order as well as structure. In the examples above, whenever the algorithm reaches a branching node, it takes the left branch first. If it instead took the right branch first, the result would be different—in both (41) and (42), a right-to-left Minimal DFS would retrieve H rather than G. The problem is made worse by the fact that the structures we are searching are constructed by Merge and, therefore, do not have a linear order. In order for our algorithm to make a decision at a “branch,” then, it would have to make a random decision. Therefore, the result of a DFS for a given syntactic object may be different each time it is run. Given these issues, I will set aside DFS.^{12}
Breadth-First Search (BFS) algorithms, on the other hand, search neighbour nodes before proceeding lower in the tree, as represented in (43), where the arrows and numbers indicate the search order.
Again, this can be made minimal by requiring that the algorithm stop immediately upon finding an object that matches the search criterion. A Minimal BFS on Case 2, then, is represented in (44).
Like the Minimal DFS, the Minimal BFS, as represented in (43) and (44), assumes that nodes are linearly ordered, even if that order is arbitrary. Unlike with Minimal DFS, though, the order of the neighbour nodes does not matter, at least for definite cases like Case 1 and Case 2. To demonstrate this, consider the reverse version of (44) in (45).
In an ambiguous case, though, Minimal BFS suffers the same fate as Minimal DFS—it is over-definite. So, in Case 3, Minimal BFS will wrongly retrieve either G or H, depending on the ordering of nodes, as shown in (46) and (47).
This flaw, however, can be overcome if, instead of traversing each node, we treat the sets of neighbour nodes as tiers, as in (48).
Minimal Tiered BFS, then, would visit each tier, extract the subset of that tier whose members all match the search criterion, and stop as soon as it extracts a non-null subset. Thus we can define a definite search result as in (49), an ambiguous search result as in (50), and a failed search as in (51).
49
For a syntactic object SO and criterion P, Search(SO, P) is definite iff
|Search(SO, P)| = 1
50
For a syntactic object SO and criterion P, Search(SO, P) is ambiguous iff
|Search(SO, P)| > 1
51
For a syntactic object SO and criterion P, Search(SO, P) is failed iff
Search(SO, P) = {}
Minimal Tiered BFS, then, will be our choice of Search algorithm. The next step is to formally define it.
In order to define Search, then, we need to be able to properly generate search tiers. So, for instance, the tiers for (37) are given in (52).
52
Tier 1 = {X, {{G, Y}, {W, {{H, Z}, U}}}}
Tier 2 = {{G, Y}, {W, {{H, Z}, U}}}
Tier 3 = {G, Y, W, {{H, Z}, U}}
Tier 4 = {{H, Z}, U}
Tier 5 = {H, Z}
Tier 6 = {}
For a given tier T_{i}, we can generate T_{i+1} by first removing all the terminal nodes from T_{i} and then performing what is called an arbitrary union, defined in (53).
53
For a set of sets X̅ = {X_{1},…,X_{n}}, the arbitrary union of X̅, ⋃X̅ = X_{1}∪...∪X_{n}.
Therefore we can define a procedure NextTier in (54) and with it, Search in (55).
54
For T, a set of syntactic objects,
NextTier(T)= ⋃{SO∈T: SO is not a lexical item token}.
55
For S, a set of syntactic objects, and Crit, a predicate of lexical item tokens,
Search(S, Crit) =
– {} if S = {}
– {SO∈S : Crit(SO) = 1} if {SO∈S : Crit(SO) = 1} ≠ {}
– Search(NextTier(S), Crit) otherwise
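The definitions in (53)–(55) translate almost line for line into code. A sketch, again under my frozenset encoding; the criterion is passed as a Python predicate, and `is_token` and `next_tier` are my names for the token test and NextTier.

```python
def is_token(so):
    """Pair-encoded lexical item tokens; complex objects are frozensets."""
    return isinstance(so, tuple)

def next_tier(tier):
    """(54): drop the tokens, then take the arbitrary union (53) of the rest."""
    out = set()
    for so in tier:
        if not is_token(so):
            out |= so
    return frozenset(out)

def search(tier, crit):
    """(55): return the matching subset of the current tier, else recurse."""
    if not tier:
        return frozenset()
    found = frozenset(so for so in tier if is_token(so) and crit(so))
    return found if found else search(next_tier(tier), crit)

# Case 2, i.e. (37)/(52): G is found on Tier 3, before H on Tier 5
X, G, Y, W, H, Z, U = [(n, i) for i, n in enumerate("XGYWHZU")]
case2 = frozenset({X, frozenset({frozenset({G, Y}),
                                 frozenset({W, frozenset({frozenset({H, Z}), U})})})})
search(case2, lambda so: so[0] in ("G", "H"))   # → frozenset({G})

# Case 3, i.e. (38): an ambiguous result; both G and H are retrieved
case3 = frozenset({X, frozenset({G, H})})
search(case3, lambda so: so[0] in ("G", "H"))   # → frozenset({G, H})
```

Because each tier is an unordered set, no arbitrary left-to-right choice ever arises, which is precisely the advantage of the tiered algorithm over node-by-node BFS.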
Probe, then, is a special type of Search, where the search criterion is based on Match, as shown in (56).
56
For F, a feature type, and SO, a syntactic object that immediately contains X, a lexical item token,
Probe(SO,X,F) = Search(SO, Match^{X,F})
where Match^{X,F}(G) = 1 iff Match(X, G, F) = 1.^{13}
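Putting Search together with Match yields Probe. The following self-contained sketch repeats the earlier helper definitions under the same assumed encoding; the tokens `voice`, `kiss`, and `him` and their feature bundles are illustrative placeholders, not claims about the actual feature content of these items.

```python
def features(token):
    """(27)/(28): the features of a token are those of its LI's SYN and SEM."""
    (_phon, syn, sem), _k = token
    return syn | sem

def match(x, g, f):
    """(30): X bears a valued F feature and G bears an unvalued one."""
    return any(ft == f and v != 0 for (ft, v) in features(x)) \
        and (f, 0) in features(g)

def is_token(so):
    return isinstance(so, tuple)

def next_tier(tier):
    """(54)"""
    out = set()
    for so in tier:
        if not is_token(so):
            out |= so
    return frozenset(out)

def search(tier, crit):
    """(55)"""
    if not tier:
        return frozenset()
    found = frozenset(so for so in tier if is_token(so) and crit(so))
    return found if found else search(next_tier(tier), crit)

def probe(so, x, f):
    """(56): Search with the Match-based criterion fixed by the probing head X."""
    return search(so, lambda g: match(x, g, f))

# Schematically, (63): Voice probes for an unvalued Case feature
voice = (("phon_Voice", frozenset(), frozenset({("Case", 2)})), 1)
kiss = (("phon_kiss", frozenset(), frozenset()), 2)
him = (("phon_him", frozenset({("Case", 0)}), frozenset()), 3)
alpha = frozenset({voice, frozenset({kiss, him})})
probe(alpha, voice, "Case")   # → frozenset({him})
```

Note that Voice itself never matches its own criterion, since Match requires a valued feature on one token and an unvalued one on the other.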
With our definition of Probe in place, we can turn to the definition of Agree itself.
3.2 A Formal Definition of Agree
If and when an instance of Probe retrieves a lexical item token, that token must be modified—at least according to most versions of Agree.^{14} More precisely, the token must be modified in place. That is, if token G is in position Q in stage S_{i}, then the modified token G′ must be in position Q in stage S_{i+1}. Furthermore, if copies of G are in multiple positions (Q, Q′, Q′′, …) in S_{i}, then copies of G′ must be in those same positions in S_{i+1}. In order to do this we must traverse the syntactic object in question and replace every instance of G with G′, the result of Value.
Note that each copy of G must be replaced to maintain their copyhood. Take, for example, the pronoun her in (57), which has at least two copies, as indicated in (58).
57
We expect her to be hired.
58
{...{Voice, {expect, {3SgF_{[Case:_]}, {...{hire, 3SgF_{[Case:_]}}…}}}}...}
If accusative Case marking were performed by a “minimal” Agree—one that only valued the highest copy—the result would be the syntactic object in (59), in which the two instances of the third person feminine pronoun are distinct from each other and, therefore, no longer copies in any sense.
59
{...{Voice, {expect, {3SgF_{[Case:acc]}, {...{hire, 3SgF_{[Case:_]}}…}}}}...}
The pronouns 3SgF_{[Case:acc]} and 3SgF_{[Case:_]} are clearly distinct—one is Case-marked, the other is not—and, furthermore, they have divergent derivational histories—one has undergone Value, the other has not. What is more, the lower pronoun is not Case-marked and should therefore cause a crash at the interfaces. In order to maintain the identity between copies, then, Agree must be maximal—it must Value every copy.
Thus we can define Agree as in (60).
60
Where SO is a syntactic object, F is a feature type, v is a feature value ≠ 0, and G is a lexical item token such that Probe(α, X, F) = {G} for some lexical item token X, where SO = α or SO is contained in α,
Agree(SO, G, F_{v}) =
– Value(SO, 〈F, v〉) if SO = G (a)
– SO if SO is a lexical item token (b)
– {Agree(A, G, F_{v}), Agree(B, G, F_{v})} if SO = {A, B} (c)
Agree, according to (60), is defined for three cases. In Case (60a), where SO is an instance of G—the lexical item token to be valued—the output of Agree is the valued version of G: Agree applies non-vacuously. In Case (60b), where SO is a lexical item token but not an instance of G, the output of Agree is SO: Agree applies vacuously. In Case (60c), where SO is a set, Agree is applied to each member of SO, and a new set containing the respective outputs of those Agree operations is constructed: Agree applies recursively. Note also that the result of Case (60c)—binary set-formation—is an instance of Merge, and I will treat it as such below.
To see how Agree works, consider accusative Case marking in the sentence Brian kisses him as an instance of Agree operating on the structure in (61) yielding the structure in (62).
61
{_{α} Voice, {_{β} kiss, 3SgM_{[Case:_]}}}

Voice = 〈〈PHON_{Voice}, SYN_{Voice}, {…,〈Case, 2〉,...}〉, k〉
(Voice contains an Accusative Case feature in its SEM)

3SgM_{[Case:_]} = 〈〈PHON_{3SgM}, {…,〈Case, 0〉,…}, SEM_{3SgM}〉, k′〉
(The 3rd person singular masculine pronoun contains an unvalued Case feature in its SYN)
62
{_{α} Voice, {_{β} kiss, 3SgM_{[Case:acc]}}}
The first step of this instance of Agree is to Probe for unvalued Case features, as in (63).
63
Probe(α, Voice, Case) = {3SgM_{[Case:_]}}
The non-Case-marked pronoun—i.e., the sole member of the result of Probe—stands in for G in (60) in our instance of Agree. Since α is a complex SO, the first instance of Agree, as shown in (64), proceeds by recursively performing Agree on α’s constituent parts—Voice and β—and Merging the results. Since Voice is a lexical item token but not our target for Agree, Agree does not change it, as shown in (65), and we can simplify our first iteration of Agree as in (66).
64
Agree(α, 3SgM_{[Case:_]}, Case_{acc}) =  (by (60c)) 
Merge(Agree(Voice, 3SgM_{[Case:_]}, Case_{acc}), Agree(β, 3SgM_{[Case:_]}, Case_{acc})) 
65
Agree(Voice, 3SgM_{[Case:_]}, Case_{acc}) = Voice  (by (60b)) 
66
Merge(Agree(Voice, 3SgM_{[Case:_]}, Case_{acc}), Agree(β, 3SgM_{[Case:_]}, Case_{acc})) =  (by (65)) 
Merge(Voice, Agree(β, 3SgM_{[Case:_]}, Case_{acc})) 
We then perform Agree on β which contains the verb and the direct object pronoun.
67
Agree(β, 3SgM_{[Case∶_]}, Case_{acc}) =  (by (60c)) 
Merge(Agree(kiss, 3SgM_{[Case∶_]}, Case_{acc}), Agree(3SgM_{[Case∶_]}, 3SgM_{[Case∶_]}, Case_{acc})) 
68
Agree(kiss, 3SgM_{[Case∶_]}, Case_{acc}) = kiss  (by (60b)) 
69
Agree(3SgM_{[Case∶_]}, 3SgM_{[Case∶_]}, Case_{acc}) =  (by (60a)) 
Value(3SgM_{[Case∶_]}, Case_{acc}) =  (by (32)) 
3SgM_{[Case∶acc]} 
70
Merge(Agree(kiss, 3SgM_{[Case∶_]}, Case_{acc}), Agree(3SgM_{[Case∶_]}, 3SgM_{[Case∶_]}, Case_{acc})) =  (by (68), (69)) 
Merge(kiss, 3SgM_{[Case∶acc]}) 
Then, having reached the “bottom” of our structure, we are left with two simple Merge operations which yield (62) as shown in (71).
71
Merge(Voice, Agree(β, 3SgM_{[Case∶_]}, Case_{acc})) =  (by (70)) 
Merge(Voice, Merge(kiss, 3SgM_{[Case∶acc]})) =  (by (18)) 
Merge(Voice, {kiss, 3SgM_{[Case∶acc]}}) =  (by (18)) 
{Voice, {kiss, 3SgM_{[Case∶acc]}}} = (62) 
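The recursion just traced by hand in (64)–(71) can be compressed into a few lines. A sketch under my encoding; case (c) is implemented with a set comprehension standing in for binary Merge/set-formation, and the feature bundles are illustrative placeholders.

```python
def is_token(so):
    return isinstance(so, tuple)

def value(token, feat):
    """(32)"""
    (phon, syn, sem), k = token
    f, v = feat
    return ((phon, (syn - {(f, 0)}) | {(f, v)}, sem), k)

def agree(so, g, feat):
    """(60): rebuild SO, valuing every copy of G along the way."""
    if so == g:                   # (a): the goal itself is valued
        return value(so, feat)
    if is_token(so):              # (b): other tokens pass through unchanged
        return so
    return frozenset(agree(m, g, feat) for m in so)   # (c): recurse/set-formation

# (61) → (62): Voice values the pronoun's Case feature (accusative = 2 here)
voice = (("phon_Voice", frozenset(), frozenset({("Case", 2)})), 1)
kiss = (("phon_kiss", frozenset(), frozenset()), 2)
him = (("phon_him", frozenset({("Case", 0)}), frozenset()), 3)
alpha = frozenset({voice, frozenset({kiss, him})})
result = agree(alpha, him, ("Case", 2))
```

Because the recursion visits every position, a structure with multiple copies of the goal would have every copy valued, which is the maximality property argued for above.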
We have arrived at a formal definition of one variety of Agree (LDDV-Agree), which we will use in the following section as a basis for defining other varieties.
3.3 Upward Valuation
In defining Downward Valuation Agree, we considered syntactic objects such as the one schematized in (72), which immediately contain a lexical item token bearing a valued feature F_{v} and which contain a lexical item token bearing an unvalued feature F_{0}.
72
{X_{F:v}, {… G_{F:0}}}
In Upward Valuation, the relevant features of X and G are swapped, as in (73).
73
{X_{F:0}, {… G_{F:v}}}
In order to capture Upward Valuation, then, we need first to modify the Match criterion of Probe, as in (74), moving X to the second argument position.
74
For F, a feature type, and SO, a syntactic object that immediately contains X, a lexical item token,
Probe_{UV}(SO, X, F) = Search(SO, Match_{UV}^{X,F}),
where Match_{UV}^{X,F}(G) = 1 iff Match(G, X, F) = 1.
Thus, Probe_{UV} gives a definite result {G} only if X contains an unvalued F feature and G contains a valued F feature. Since, by definition, the relevant unvalued feature in Agree_{UV} is at the top of the structure, we might think that no exhaustive traversal is required. Unfortunately, though, the same concern with valuing copies remains—just because a lexical item token is at the top of a tree does not mean there is not a copy of it at the bottom. Therefore, our definition of Agree_{UV} in (75) looks similar to that in (60).
75
For lexical item token X, syntactic object SO = {X, …}, feature type F, and lexical item token G such that Probe_{UV}(α, X, F_{v}) = G, where SO = α or SO is contained in α,
Agree_{UV}(SO, X, F_{v}) = $\left\{\begin{array}{lll}\text{Value}(\text{SO}, \langle\text{F}, \text{v}\rangle) & \text{if SO} = \text{X} & (a)\\ \text{SO} & \text{if SO is a lexical item token} & (b)\\ \text{Merge}(\text{Agree}_{\text{UV}}(\text{A}, \text{X}, \text{F}_{\text{v}}), \text{Agree}_{\text{UV}}(\text{B}, \text{X}, \text{F}_{\text{v}})) & \text{if SO} = \{\text{A}, \text{B}\} & (c)\end{array}\right.$
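The three-way case structure of (75) can be sketched computationally. The following is a minimal illustration, not the formal definition itself: lexical item tokens are modeled as tuples (phon, syn, sem, k) with syn a frozenset of ⟨type, value⟩ pairs, syntactic objects as two-membered frozensets, and the helper names (`is_token`, `value`, `agree_uv`) are my own.

```python
def is_token(so):
    # Lexical item tokens are tuples; derived objects are frozensets.
    return isinstance(so, tuple)

def value(token, ftype, v):
    # Replace the F-typed feature with its valued counterpart.
    phon, syn, sem, k = token
    new_syn = frozenset(f for f in syn if f[0] != ftype) | {(ftype, v)}
    return (phon, new_syn, sem, k)

def agree_uv(so, x, ftype, v):
    if so == x:                  # clause (a): value the token (and all copies)
        return value(so, ftype, v)
    if is_token(so):             # clause (b): an unrelated token is unchanged
        return so
    a, b = so                    # clause (c): rebuild the object by Merge
    return frozenset({agree_uv(a, x, ftype, v), agree_uv(b, x, ftype, v)})
```

Note that the recursion walks the entire object, so any copy of X at any depth is replaced, which is exactly the concern about copies raised above.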
3.4 Feature Checking
Versions of Agree whose effects are feature checking rather than valuation assume that all formal features—i.e., members of SYNF—are valued, but must be checked by Agree (Chomsky, 1995). In order to formalize such a feature checking operation, Agree_{✓}, we must reformulate our notion of features and our Match predicate, and replace Value with Check. Formal features and their related notions, then, are defined as in (76) and (77), with semantic features retaining their definition in (25).
76
A formal feature is a triple 〈c?, F, v〉, where c? is 1 or 0 and v is an integer. F is called the feature type, and v the feature value.
77
For all feature types F and values v, 〈0, F, v〉 is an unchecked F_{v} feature, and 〈1, F, v〉 is a checked F_{v} feature.
Match_{✓}, then, compares a semantic feature of one lexical item token with a formal feature of another, succeeding if both features have the same type and value and the formal feature is unchecked, as defined in (78).
78
For any two lexical item tokens X and G, feature type F and value v,
Match_{✓}(X, G, F) = 1 iff 〈F, v〉 is a feature of X and 〈0, F, v〉 is a feature of G.
Finally, Check is a simple matter of flipping a 0 to a 1 (or leaving a 1 as a 1), as in (79). Note, though, that Check will never apply to an already checked feature, since Match is a prerequisite for Check and will only succeed if the feature in question is unchecked.
79
For a lexical item token SO=〈〈PHON, SYN, SEM〉, k〉, and formal feature F_{v}=〈c?, F, v〉,
Check(SO, F_{v}) =〈〈PHON, (SYN – {F_{v}}) ∪ {〈1, F, v〉}, SEM〉, k〉
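To make the checking machinery concrete, here is an illustrative sketch of (78) and (79), assuming tokens of the form ((phon, syn, sem), k) with syn a frozenset of ⟨c?, F, v⟩ triples and sem a frozenset of ⟨F, v⟩ pairs; the function names are my own, not part of the formal definitions.

```python
def match_check(x, g, ftype):
    # (78): a semantic ⟨F, v⟩ on X must pair with an unchecked ⟨0, F, v⟩ on G.
    (_, _, sem_x), _ = x
    (_, syn_g, _), _ = g
    return any(c == 0 and ft == ftype and (ft, v) in sem_x
               for (c, ft, v) in syn_g)

def check(token, ftype, v):
    # (79): remove the F-typed formal feature and add its checked counterpart.
    (phon, syn, sem), k = token
    new_syn = frozenset(f for f in syn if f[1] != ftype) | {(1, ftype, v)}
    return ((phon, new_syn, sem), k)
```

As the prose above observes, once `check` has applied, `match_check` fails for the same feature, so Check never reapplies to an already checked feature.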
These newly defined functions can be slotted into our formalized definitions of Agree as in (80) to give a definition of Agree_{✓}, where G is the result of Probing based on Match_{✓}.
80
Where SO is a syntactic object, F_{v} is a feature, and G is a lexical item token such that Probe_{✓}(α, X, F_{v}) = G, where SO = α or SO is contained in α,
Agree_{✓}(SO, G, F_{v}) = $\left\{\begin{array}{lll}\text{Check}(\text{SO}, \text{F}_{\text{v}}) & \text{if SO} = \text{G} & (a)\\ \text{SO} & \text{if SO is a lexical item token} & (b)\\ \text{Merge}(\text{Agree}_{✓}(\text{A}, \text{G}, \text{F}_{\text{v}}), \text{Agree}_{✓}(\text{B}, \text{G}, \text{F}_{\text{v}})) & \text{if SO} = \{\text{A}, \text{B}\} & (c)\end{array}\right.$
3.5 Local Agree
Early minimalist theories of agreement (e.g., Chomsky, 1993) continued the GB assumption that agreement was limited to what was called a "spec-head" relation. So, for example, subject-predicate agreement was assumed to occur because, in the terminology of the day, the subject moves to the specifier of the predicate head (T or I), in contrast to later theories in which subjects move because they agree. Similarly, Case licensing, in these theories, is usually taken to occur under a "spec-head" relation. In this section, I will formalize this conception of Agree.
On its surface, Local Agree, as described above, has the advantage of not requiring an arbitrary search of the entire derived expression. Instead, the search is strictly and specifically limited to the very top of the object. The canonical case of so-called "spec-head" agreement is the finite subject merged with the finite predicate, shown in (81).
81
TP = {{D, …}, {T, …}}
Restricting our discussion to Case, we can see that the Agree operation is an interaction between the lexical item token immediately contained in one member of TP and the lexical item token contained in the other member of TP. We can define Probe_{Local}, then, as in (82).
82
For feature type F, lexical item tokens X and Y, and syntactic object SO={U, W},
Probe_{Local}(SO, X, F) = $\left\{\begin{array}{ll}\text{Y} & \text{if X} \in \text{U}, \text{Y} \in \text{W}, \text{and Match}(\text{X}, \text{Y}, \text{F})\\ \text{undefined} & \text{otherwise}\end{array}\right.$
It should be noted that Probe_{Local} makes no use of the notions "specifier" or "head." Indeed, it assumes no structural asymmetry at all, only the valued-unvalued asymmetry.
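A toy sketch may help make this symmetry point concrete. The following illustration of (82) assumes tokens are tuples (phon, syn, sem, k) with syn a frozenset of ⟨type, value⟩ pairs, a valuation-style Match (X bears a valued F, Y an unvalued F), and `None` standing in for "undefined"; all names are my own.

```python
def is_token(so):
    return isinstance(so, tuple)

def members(so):
    # A token "contains" just itself; a set contains its members.
    return [so] if is_token(so) else list(so)

def match(x, y, ftype):
    # Assumed valuation-style Match: X valued for F, Y unvalued for F.
    return (any(ft == ftype and v != 0 for (ft, v) in x[1])
            and (ftype, 0) in y[1])

def probe_local(so, x, ftype):
    u, w = so                       # SO = {U, W}; sets are unordered,
    if x not in members(u):         # so find which member holds X
        u, w = w, u
    if x not in members(u):
        return None                 # "undefined"
    for y in members(w):
        if is_token(y) and match(x, y, ftype):
            return y
    return None
```

Note that `probe_local` never asks which member is the "specifier" and which the "head": only Match's valued-unvalued asymmetry distinguishes X from Y.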
It should also be noted that, since so-called "spec-head" structures, especially those associated with Case and agreement, are often formed by Internal Merge, our final version of Agree_{Local}, much like long-distance Agree, will need to replace every instance of the object being valued/checked. Therefore, our final version of Agree_{Local} is defined as in (83).
83
Where SO is a syntactic object, F is a feature type, v is a feature value ≠ 0, and G is a lexical item token such that Probe_{Local}(α, X, F) = G, where SO = α or SO is contained in α,
Agree_{Local}(SO, G, F_{v}) = $\left\{\begin{array}{lll}\text{Value}(\text{SO}, \langle\text{F}, \text{v}\rangle) & \text{if SO} = \text{G} & (a)\\ \text{SO} & \text{if SO is a lexical item token} & (b)\\ \text{Merge}(\text{Agree}_{\text{Local}}(\text{A}, \text{G}, \text{F}_{\text{v}}), \text{Agree}_{\text{Local}}(\text{B}, \text{G}, \text{F}_{\text{v}})) & \text{if SO} = \{\text{A}, \text{B}\} & (c)\end{array}\right.$
3.6 Summary
In this section, I provided a formal definition of one particular conception of Agree (Long-Distance Downward-Valuation Agree) by first breaking it into individual pieces (Probe, Match, Value), to which I gave formal definitions, and then assembling those definitions in such a way that they define Agree. I then discussed a few alternative conceptions of Agree, showing how they could be defined by altering the previous definitions as minimally as possible. This description of the definition process might suggest that Agree is modular, that is, that it consists of several independent operations that can be mixed and matched, but this is not the case. Rather, while the discussion of each alternative tended to focus on a single operation, the changes to that operation were such that they necessitated minor modifications to Agree as a whole. Agree, then, does seem to be a real operation, albeit a rather complex one, as I will demonstrate in the next section.
4 Properties of Agree
With the Agree operation properly formalized, we are in a position to investigate the operation's theoretical properties, which have either not been remarked upon in the literature or been discussed without the precision that formalization allows. This section will discuss some of those properties. Rather than investigating Agree in isolation, and following the premise that Agree is a full-fledged derivational operation like Merge, Select, and Transfer, this section will focus on those properties of Agree that distinguish it from other operations, Merge in particular.
We will first see that Agree differs from Merge and Select in that it is inherently recursively defined, while the latter two are defined non-recursively. Related to this, I will argue that the fact that our definition of Agree includes instances of Merge effectively rules out any general Agree requirement for Merge. Next, I show that, unlike Merge and Select, Agree does not close the set of syntactic objects, and that attempts to rectify this lead to problematic predictions for language acquisition. Finally, I discuss the implications of Agree for the NTC.
4.1 UG_{Agree}
In order to do so, though, we must first give a definition of UG_{Agree}, in (84), and of a derivation in UG_{Agree}, in (85).
84
Universal Grammar is a 7tuple:
〈PHONF, SYNF, SEMF, Select, Merge, Transfer, Agree〉
85
A derivation from lexicon L is a finite sequence of stages 〈S_{1},…,S_{n}〉 for n≥1, where each S_{i} = 〈LA_{i}, W_{i}〉, such that
for all LI and k such that 〈LI,k〉 ∈ LA_{1}, LI ∈ L,
W_{1} = {} (the empty set), and
for all i such that 1≤i<n, either
(derive-by-Select) for some A ∈ LA_{i}, 〈LA_{i+1}, W_{i+1}〉 = Select(A, 〈LA_{i}, W_{i}〉), or
(derive-by-Transfer) …, or
(derive-by-Merge) LA_{i} = LA_{i+1}, and the following conditions hold for some A, B:
A ∈ W_{i},
either A contains B or W_{i} immediately contains B, and
W_{i+1} = (W_{i} – {A, B}) ∪ {Merge(A, B)}, or
(derive-by-Agree) LA_{i} = LA_{i+1}, and the following conditions hold for some SO, X, G, and F_{v}:
SO ∈ W_{i},
SO immediately contains X,
Probe(SO, X, F_{v}) = {G}, and
W_{i+1} = (W_{i} – {SO}) ∪ {Agree(SO, G, F_{v})}.
This definition of a derivation uses the names of its procedures, but in the case of Merge and Select, one could just as easily expand them to give the definition fully in terms of set theory, because they are non-recursive operations. Agree, however, is recursively defined; that is, it is defined in terms of itself ("Agree" appears on both the left-hand and right-hand side of the equals sign in (60)), so such an expansion is not possible. This is a fundamental difference between Agree and the other generative operations: Merge and Select are non-recursive functions, while Agree is recursive.^{15}
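The contrast can be made vivid with a toy sketch (the names and representations are mine, not part of the formal grammar): Merge is a single set-formation step, while Agree must call itself on both daughters, and must call Merge to rebuild the object it has taken apart.

```python
def is_token(so):
    return isinstance(so, tuple)

def merge(a, b):
    # Non-recursive: one set-formation step, expandable into pure set theory.
    return frozenset({a, b})

def agree(so, g, g_new):
    # Recursive: "agree" appears on both sides of its own definition.
    if so == g:
        return g_new            # replace the goal token (and all its copies)
    if is_token(so):
        return so
    a, b = so
    # Agree calls itself on each daughter, then calls Merge to rebuild:
    return merge(agree(a, g, g_new), agree(b, g, g_new))
```

The appearance of `merge` inside `agree` is exactly the point made in the next paragraph: every non-trivial application of Agree includes at least one application of Merge.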
Beyond its recursive definition, there are a number of properties that set Agree apart from its fellow operations. First, since performing Agree on a syntactic object entails searching the object, modifying certain constituents, and putting the object back together, and since objects can only be put together by applying Merge, every non-trivial application of Agree includes at least one application of Merge. This is reflected in definitions (60) and (75), in which Merge appears in the intension of Agree, and concurs with Hornstein (2009, pp. 126–154), who notes that the minimal c-command relation required by Agree (specifically non-local Agree, or AGREE in his terminology) is exactly the same as the one that is assumed to hold in all cases of Internal Merge (which he calls "Move"). Hornstein's critique, that Agree and Internal Merge are redundant, is actually complementary to the fact that Agree as defined entails Merge. The former suggests that either Agree or Internal Merge should be eliminated, while the latter rules out eliminating Internal Merge.
4.2 Agree as a Prerequisite for Merge
Early in the minimalist program, Chomsky (2000) proposed that Agree was a prerequisite for Move—that Move was a reflex of Agree. Merge—what we now call External Merge—on the other hand, was free to apply without Agree. Once Internal Merge was discovered, though, theorists were faced with a dilemma—if Merge and Move were truly a single operation, they couldn’t very well have different prerequisites. There are two ways out of this dilemma—either all instances of Merge are free, or all instances of Merge require Agree.^{16} Although C&S’s formalization and my extension of it assume that all operations, except perhaps Transfer, are free, there are Agree theorists—for instance Wurmbrand (2014)—who take Agree to be a prerequisite to Merge. Therefore, in this section, I will discuss the barriers to modifying the formal grammar to make Agree a prerequisite for Merge.
The principal barrier to making Agree a prerequisite for Merge is that, as defined in (85), the derivation is a computational procedure and, therefore, is strictly incremental. That is, the validity of a given stage S_{n} (n≠1) depends solely on its form and the form of the immediately preceding stage S_{n−1}. Requiring every instance of Merge to be preceded by an instance of Agree, however, would mean that the validity of a stage S_{n} (n≠1) depends on its two preceding stages S_{n−1} and S_{n−2}. That is, S_{n} can be derived from S_{n−1} by Merge only if S_{n−1} is derived from S_{n−2} by Agree. A derivation, then, would need memory, albeit a very small amount of it.
On its face, this does not seem to be an insurmountable barrier, but as we shall see, it will end up ruling out the first instance of Merge in any derivation. To begin with, we reformulate our definition of derivation by adding an Agree-precedence condition to the derive-by-Merge clause, as in (86).
86
A derivation from lexicon L is a finite sequence of stages 〈S_{1},…,S_{n}〉 for n≥1, where each S_{i} = 〈LA_{i}, W_{i}〉, such that
for all LI and k such that 〈LI,k〉 ∈ LA_{1}, LI ∈ L,
W_{1} = {} (the empty set), and
for all i such that 1≤i<n, either
(derive-by-Select) for some A ∈ LA_{i}, 〈LA_{i+1}, W_{i+1}〉 = Select(A, 〈LA_{i}, W_{i}〉), or
(derive-by-Transfer) …, or
(derive-by-Merge) LA_{i} = LA_{i+1}, and the following conditions hold for some A, B:
A ∈ W_{i},
〈LA_{i}, W_{i}〉 is derived by Agree from 〈LA_{i−1}, W_{i−1}〉,
either A contains B or W_{i} immediately contains B, and
W_{i+1} = (W_{i} – {A, B}) ∪ {Merge(A, B)}, or
(derive-by-Agree) LA_{i} = LA_{i+1}, and the following conditions hold for some SO, X, G, and F_{v}:
SO ∈ W_{i},
SO immediately contains X,
Probe(SO, X, F_{v}) = {G}, and
W_{i+1} = (W_{i} – {SO}) ∪ {Agree(SO, G, F_{v})}.
Now, let’s consider an abstract subderivation of the syntactic object {X, Y} where X and Y are lexical item tokens. We start in S_{1}, given in (87) with an empty workspace and a lexical array containing at least X and Y.
87
S_{1}  =  〈LA_{1}, W_{1}〉 
=  ⟨{ X, Y, Z … }, {}⟩ 
Next we perform Select twice, to bring X and Y into the workspace.
88
S_{2}  =  Select(X, S_{1}) 
=  ⟨{Y, Z … }, {X}⟩ 
89
S_{3}  =  Select(Y, S_{2}) 
=  ⟨{Z … }, {X, Y}⟩ 
Under a free Merge grammar, we would, at this point, simply Merge X and Y, but this option is not available to us, since derive-by-Merge in (86) requires a preceding Agree step. A Select step is possible here, but that would only postpone our dilemma. We need to perform Agree next.
Assuming that X could value Y for feature F—i.e., Match(X, Y, F) = 1—let’s consider the structural prerequisites. As stated in (86), X and Y must be contained in the same syntactic object SO, which, in turn, must be a member of the workspace. In S_{3}, however, both X and Y are members of the workspace, and there is no SO to speak of. No stage S_{4}, then, can be derived by Agree.
We’ve arrived then at an instance of circularity—every instance of Merge requires a preceding instance of Agree, and every instance of Agree requires a preceding instance of Merge. First Merge, then, is impossible if the definition of a derivation in (86) holds.^{17}
4.3 The NonClosure of Agree
Since a computational procedure is essentially the repeated application of an operation, or set of operations, with each application providing the input for the following application, the domain of a given computational operation must be closed under that operation, as defined in (90).
90
Domain D is closed under nplace operation f iff
for all x_{1}, x_{2},…,x_{n} ∈ D, f(x_{1}, x_{2},…,x_{n})∈ D.
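As a toy illustration of (90) for the binary case (the code and names are mine), closure of a finite domain under an operation can be checked directly:

```python
def is_closed(domain, f):
    # (90), for n = 2: D is closed under f iff f(x, y) ∈ D for all x, y ∈ D.
    return all(f(x, y) in domain for x in domain for y in domain)

# {0, 1} is closed under min, but not under addition (1 + 1 = 2 ∉ {0, 1}).
```

The question for the rest of this section is whether the set of stages plays the role of `domain` with respect to derive-by-Agree.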
In the case of our syntactic derivations, our domain is the set of stages, which C&S demonstrate are closed under derivebySelect and derivebyMerge. I have thus far been assuming that it is also closed under derivebyAgree, but that assumption is perhaps not strictly true, under our present definitions.
As defined, derivebyAgree is a function from stages to stages that modifies a stage’s workspace, by performing Agree on a syntactic object in that workspace. Therefore, the set of stages is closed under derivebyAgree iff the set of syntactic objects is closed under Agree. For its part, Agree operates on a given syntactic object SO by applying Value to SO if SO is an appropriate lexical item token, or to the appropriate lexical item tokens contained in SO otherwise. Therefore the set of syntactic objects is closed under Agree iff the set of lexical item tokens is closed under Value. We need only consider a simple instance of Value to see that this is not obviously the case.
Consider the lexical item token X_{k}, defined in (91), which has only one syntactic feature, [F:0].
91
X_{k} =⟨⟨PHON_{X}, {⟨F,0⟩}, SEM_{X} ⟩, k⟩
where PHON_{X} ∈ PHONF*, SEM_{X} ⊂ SEMF, k is an integer, and ⟨F,0⟩ ∈ SYNF.
What about the result of applying Value to X_{k}, given in (92)?
92
Value(X_{k}, 〈F, v〉) = ⟨⟨PHON_{X}, {⟨F,v⟩}, SEM_{X} ⟩, k⟩
where v is a nonzero integer.
Since PHON_{X}, SEM_{X}, and k are unchanged, the new object is a lexical item token iff ⟨F,v⟩ ∈ SYNF. That is, the set of lexical item tokens is closed under Value only if the universal set of syntactic features in UG_{Agree} contains both valued and unvalued features.
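The point can be sketched concretely. In the following toy model, the universal feature set contains only the unvalued feature (as the axiom in (96) below would have it), so Value's output fails the membership test for lexical item tokens; the representations and names are illustrative assumptions of mine.

```python
# Toy universal feature set: only the unvalued ⟨F, 0⟩, per (96).
SYNF = {("F", 0)}

def is_lexical_item_token(obj):
    phon, syn, sem, k = obj
    return syn <= SYNF          # every SYN feature must be drawn from SYNF

def value(token, ftype, v):
    phon, syn, sem, k = token
    new_syn = frozenset(f for f in syn if f[0] != ftype) | {(ftype, v)}
    return (phon, new_syn, sem, k)
```

With this SYNF, applying `value` takes a lexical item token to an object that is no longer a lexical item token, which is precisely the non-closure at issue. Adding ⟨F, v⟩ to SYNF would restore closure, at the acquisition cost discussed below.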
While there is no strictly formal reason against modifying our theory of features by hypothesizing that SYNF contains both valued and unvalued features, such a hypothesis would put us in something of a theoretical quandary. In the grammar assumed by this paper, language acquisition is at least partially a process of constructing lexical items from universal feature sets so that they match tokens in the primary linguistic data. The basic premise of Agree theory, though, is that unvalued features cannot surface and therefore must be valued during the derivation. If this is the case, then there are effectively no tokens of unvalued features in the primary linguistic data. Why, then, would a language acquirer ever construct a lexical item with an unvalued feature?
To take a concrete example, consider the case of French adjectives which show gender and number agreement as demonstrated in (93).
93
Gender   Sg       Pl
Fem      grande   grandes
Masc     grand    grands
This situation is consistent with two sorts of lexicons if we assume lexically valued SYN features—lexicons with multiple adj LIs, each with valued φfeatures as in (94) and lexicons with a single adj LI with unvalued φfeatures as in (95).
94
LEX = $\left\{\begin{array}{l}\ldots\\ \langle\text{/e/}, \{\langle\text{ɣ},1\rangle, \langle\#,1\rangle\}, \text{SEM}_{adj}\rangle\\ \langle\text{/es/}, \{\langle\text{ɣ},1\rangle, \langle\#,2\rangle\}, \text{SEM}_{adj}\rangle\\ \langle\varnothing, \{\langle\text{ɣ},2\rangle, \langle\#,1\rangle\}, \text{SEM}_{adj}\rangle\\ \langle\text{/s/}, \{\langle\text{ɣ},2\rangle, \langle\#,2\rangle\}, \text{SEM}_{adj}\rangle\\ \ldots\end{array}\right\}$
95
LEX = {… ⟨PHON_{adj}, {⟨ɣ,0⟩, ⟨#,0⟩}, SEM_{adj} ⟩ ...}
Since the lexicon in (94) represents a surface analysis of adjective morphology, it would be more straightforward to acquire than (95), which requires an additional step of abstraction from the data. All else being equal, then, allowing SYNF to contain valued features would seem to predict the sort of lexicon in (94) for French. This, of course, would be consistent with a checking-based Agree, but not a valuation-based Agree.
Alternatively, we could assume that all and only unvalued features are members of SYNF—stated formally as an axiom in (96).
96
For all features ⟨F,v⟩, v=0 ↔ ⟨F,v⟩ ∈ SYNF.
This would remove the acquisition issue, since (94) would be an impossible lexicon, and would be consistent with the basic premise of Agree theory. It would still require theoretical explanation, but of the more general sort suggested in Footnote 10. It would mean, however, that the set of lexical item tokens is not closed under Value, and therefore that the set of stages is not closed under a Value-based Agree. A Value-based Agree, then, could not be a computational operation in a version of UG_{Agree} with (96) as an axiom.
In sum, in order for a valuation-based Agree such as the one defined in (60) to be viable as a computational procedure, we must expand the domain of possible lexical items in a theoretically questionable way.
4.4 Agree and the NTC
One of the theorems of C&S's formal grammar is the No-Tampering Condition, defined by Chomsky (2007, p. 8) as follows: "Suppose X and Y are merged. Evidently, efficient computation will leave X and Y unchanged (the No-Tampering Condition NTC). We therefore assume that NTC holds unless empirical evidence requires a departure from [the strong minimalist thesis] in this regard, hence increasing the complexity of UG." C&S's formulation of the NTC, which they prove as a theorem of UG, is given in (97).
97
For any two consecutive stages in a derivation S_{1} = 〈LA_{1}, W_{1}〉 and S_{2} = 〈LA_{2}, W_{2}〉,
for all A contained in W_{1}, A is contained in W_{2}.
Since the effect of every form of Agree defined in this paper is to replace all instances of some lexical item token G in a workspace with a distinct token G′, Agree violates the NTC by design. The issues with UG_{Agree} discussed above, then, may be predicted by Chomsky's conjecture that UG operations conform to the NTC. There are essentially two ways of dealing with this result: either we take the approach that C&S take with Transfer and modify Agree so that it does not violate the NTC, or we argue that "empirical evidence requires a departure from" the NTC. I will discuss each of these options in turn below.
4.4.1 NTCRespecting Agree
A straightforward way of constructing an Agree operation that respects the NTC is to formally separate the content of a derived expression from its structure in some way, with Merge manipulating the structure and Agree manipulating the content. A stage of the derivation, then, would consist of a lexical array, a workspace, and a ledger, as in the definition in (98).
98
A stage is a triple S = 〈LA, W, L〉, where LA is a lexical array, W is a set of syntactic objects, and L is a set of pairs of lexical item tokens. We call W the workspace of S and L the ledger of S.
Rather than modifying lexical item tokens in place, Agree would add a pair 〈LI_{k}, LI_{k}′〉 to the ledger, where LI_{k} is a lexical item token contained in the workspace and LI_{k}′ is the result of Valuing LI_{k} for some feature. The ledger, then, postpones the tampering of Agree, either until Transfer or until the SM and/or CI systems, and thereby rescues the NTC.
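This ledger-based Agree can be sketched very simply, with stages as triples per (98); the representations and the function name are illustrative assumptions of mine.

```python
def agree_ledger(stage, li_token, li_token_valued):
    # Stage = ⟨LA, W, L⟩ per (98); Agree only records ⟨LI_k, LI_k′⟩
    # in the ledger L. The workspace W is returned untouched, so no
    # tampering occurs mid-derivation.
    la, w, ledger = stage
    return (la, w, ledger | {(li_token, li_token_valued)})
```

Because the workspace passes through unchanged, this version needs neither recursion nor an internal call to Merge.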
This sort of move also fixes a number of issues already discussed regarding Agree. A version of Agree that respects NTC does not alter the workspace—it merely constructs an ordered pair and adds it to the ledger. It does not take apart and put back together an already constructed syntactic object, as standard Agree as defined in (60) does. Therefore, it does not need to be recursively defined, and it does not need to refer to Merge in its definition.
This improvement aside, however, it also lays bare the fact that Agree as a syntactic-derivational operation is fundamentally redundant. The prerequisites for Agree are a structural relation (Search) and a content relation (Match) between two lexical item tokens. So, suppose X and G are lexical item tokens and, for some feature F, Match(X, G, F) = 1. Further suppose that stage S_{n} in derivation D is derived by Merge(X, Y), where Y contains G and no lexical item token H such that Match(X, H, F) = 1. At this point, our prerequisites are met and we can perform Agree, but suppose instead that we derive stages S_{n+1} and S_{n+2} by Selecting and Merging another lexical item token. By the NTC, the object {X, Y} is contained in the root object of S_{n+2}, and therefore all of the structural and content relations that held at S_{n} still hold at S_{n+2}, including the prerequisites for X to Agree with G for F.^{18} By extension, we can continue to postpone Agree at least until the next instance of Transfer without losing the prerequisites for Agree. It seems, then, that, while we can certainly define Agree so that it respects the NTC, if we have the NTC, we can define Agree as an interface operation, perhaps as part of Transfer. This formalization, then, represents a sharp departure from the various theories of Agree whose formalization is the task at hand, and which share the assumption that Agree modifies already constructed SOs mid-derivation.
4.4.2 Agree Instead of the NTC?
Even as stated by Chomsky (2007), the NTC is not an absolute law akin, say, to the law of non-contradiction. Rather, he proposes that we assume the NTC "unless empirical evidence requires a departure from [the strong minimalist thesis] in this regard." In one sense, this is a very low bar, since the NTC is a universal statement, which requires only a single counterexample to invalidate. In practice, though, it is far from obvious what sort of evidence would count as such a counterexample.
The relative ubiquity of morphological agreement, for instance, might seem to be the sort of evidence we need, but it is not sufficient to invalidate the NTC. Consider, as a parallel, linear order. It is a plain fact that external linguistic expressions have linear order, but linear order is still assumed to be absent from the grammar, at least in standard Merge-based grammars. Yet, as Chomsky (2020), citing McCawley (1968), points out, adverbs like respectively, which depend on linear order for their interpretation, provide evidence that conjunction structures have inherent linear order.
99
Beth and Sara met Hanako and Máire respectively.

= Beth met Hanako and Sara met Máire.

≠ Beth met Máire and Sara met Hanako.
What we need, then, is evidence that standard Agree is occurring in a derivation interspersed with Merge. Preminger (2014) argues that we have exactly such evidence in the interrelation of morphological case, φagreement, and subject position.^{19} The form of the argument is given in (100).
100
a. Morphological case feeds φ-agreement in quirky-subject languages.
b. φ-agreement feeds movement to canonical subject position in non-quirky-subject languages.
c. The functioning of the grammar is uniform across languages (the Uniformity Principle).
d. Therefore, morphological case and φ-agreement precede movement to subject position.
e. Therefore, morphological case and φ-agreement are part of the narrow syntax.
The argument is logically valid, but it depends on an analysis of the evidence that is plausible, but not the only possible one. That is, it depends on the truth of the first two premises, which are empirical statements. Despite being empirical statements, though, they depend on two theoretical notions ("quirky subjects" and "canonical subject position") to even be coherent. I will take for granted that the term "quirky subject" is coherent, and focus on "canonical subject position."^{20}
Furthermore, it is worth noting that Preminger frames his premises in terms of "feeding" rather than "driving" or "triggering." An operation X feeds another operation Y if X creates the necessary conditions for Y and X precedes Y. "Feeding," then, speaks to the order of operations more than to causation.
One property of canonical subject position that Preminger is clear about is that it is syntactic; he says of movement to canonical subject position that it is "clearly syntactic (since it creates new binding configurations, for example)" (p. 177) and that it "is a syntactic process par excellence" (p. 184). We further know, based on the second premise of (100), which Preminger claims as an empirical result, that movement to canonical subject position in non-quirky-subject languages should always co-occur with φ-agreement. Since this latter requirement is an empirical claim, though, it should not be too directly tied to our definition, lest our reasoning be circular. We can construct our definition by applying these two desiderata to some representative data.
Our representative data are given in (101), where the underlined subexpression is, could be, or has been considered to be in subject position in English.
101
a. The city is bustling.
b. There seem to be unicorns in my house.
c. The dog running down the street was quite a sight.
d. They seemed t to leave.
e. I expect t/PRO to leave shortly.
f. We believed them to be a capable team.
I believe that it is quite safe to label the city in (101a) as being in canonical subject position^{21}: it is the specifier of TP and it triggers φ-agreement on the finite auxiliary. On the other hand, the existential associate unicorns in (101b) is likely not in a canonical subject position.^{22} In fact, existential associates not being in canonical subject position gives force to the second premise of (100): in order for φ-agreement to feed movement to canonical subject position, agreement must be necessary but not sufficient for movement, and existential clauses show this only if we assume that their associates are not (possibly covertly) in canonical subject position.^{23}
This leaves us with the non-finite subject positions in (101c) to (101f). In each of these cases, the underlined expression could reasonably be said to be in a subject position, and to have moved there, yet there is no apparent φ-agreement associated with that movement. We could reasonably reject the dog in (101c) as being in canonical subject position, since it is not the specifier of a TP, leaving us with the null subjects in (101d) and (101e) and the ECM subject in (101f). In a summarizing table, though, Preminger (2014, p. 164) seems to assert that, in English, only nominatives are candidates for movement to canonical subject position. This would rule out traces/PRO and ECM subjects as canonical subjects.
Canonical subject position, then, seems to refer to the specifier of finite T, at least in English. Assuming such a position can be defined well enough to support generalizations such as Preminger's premises,^{24} the Uniformity Principle, Preminger's third premise, demands that we treat movement to the specifier of finite T either as a special case of Merge, distinct from external or ordinary internal Merge, or as a derivational operation of its own, distinct from Select, Merge, and Agree. So, if we keep strictly to the theory assumed in this paper, in which UG_{Agree} has Merge, Select, Agree, and Transfer, Preminger's argument does not go through, because premise (100b) would not be well-defined.^{25} Put another way, (100) might be coherent in some theory of grammar, but it is not coherent in a theory that assumes UG as defined in (8) or UG_{Agree} as defined in (84).
We could try to rescue (100) by restating (100b) as (100b'), which is coherent in UG_{Agree}, assuming "quirky subject" can be defined and the problems outlined above can be overcome.
(100b’) ϕagreement feeds Internal Merge in nonquirkysubject languages.
In order for this new premise to be true, though, movement to non-canonical subject position must also require φ-agreement, which implies some sort of abstract or covert φ-agreement on non-finite predicates such as those in (101c) to (101f). In light of this implication, it is difficult to see how this new premise could be justified empirically, and therefore it should be rejected, or at best treated as a hypothesis. Since any argument is only as strong as its premises, this would weaken Preminger's argument a great deal.
To recap, Preminger's argument as given in (100), while logically valid, rests on the assumption that movement to canonical subject position is a bona fide syntactic operation, distinct from other types of movement. This assumption would be a departure from the theory assumed here, which takes all movement operations to be instances of Merge. Preminger's conclusion that agreement takes place in the syntax, taken together with my argument above that Agree violates the NTC, implies that the NTC should be at least weakened,^{26} another departure from the theory. It would seem, then, that one departure from the theory begets others, a result that is far from surprising and that, in fact, indicates the internal unity of the theory of grammar assumed here. More importantly, Preminger's argument, the most explicitly fleshed-out empirical argument in favour of Agree as a syntactic operation, should not be taken as a falsification of the NTC or the SMT.
5 Modularity and the Paths Not Taken
Throughout this exercise in formalization, many choices were made that could have been made differently, with varying levels of consequence for the overall system. For instance, the choice, adopted from C&S, to include PHON as part of the LI was essentially the choice of an "early-insertion" theory of morphology. This choice, however, was of little consequence for the formalization of Agree, since that formalization dealt exclusively with the SYN and SEM features of LIs. The choice to formalize features as type-value pairs, though, does have relevant consequences.
Suppose, for instance, I had adopted a geometric feature theory such as the one developed by Béjar (2003), where a 2nd-person feature is represented as in (102).
(102)
[π [participant [addressee]]]
One formal definition of feature that would capture this is given in (103), with the 2nd-person feature formalized as in (104).
(103)
$\text{X is a feature iff }\begin{cases}\text{X} \in \text{SEM} & \text{(an atomic feature)}\\ \text{or X is a pair of features.} & \text{(a complex feature)}\end{cases}$
(104)
〈π, 〈participant, addressee〉〉
where {π, participant, addressee} ⊂ SEM
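The recursive definition in (103) and the instance in (104) can be rendered as a small program. The following is a minimal sketch, not the paper's formalism: the SEM inventory and the spellings of the feature names are illustrative assumptions.

```python
# Sketch of (103): a feature is either an atomic member of SEM or a
# pair of features. The SEM inventory below is an assumed fragment.
SEM = {"pi", "participant", "addressee", "speaker"}

def is_feature(x):
    """True iff x is a feature in the sense of (103)."""
    if isinstance(x, str):
        return x in SEM                                  # an atomic feature
    if isinstance(x, tuple) and len(x) == 2:
        return is_feature(x[0]) and is_feature(x[1])     # a complex feature
    return False

# (104): 2nd person as the pair <pi, <participant, addressee>>
second_person = ("pi", ("participant", "addressee"))
```

On this encoding, `is_feature(second_person)` holds, while a label outside SEM, such as `"number"`, is rejected; the recursion mirrors the two-clause case split in (103) directly.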
Quite obviously, this would require us to redefine or replace our auxiliary notions like feature-of or unvalued feature and to define new ones like depends-on or entails, but most importantly it would require new definitions of Match and Value. Béjar (2003) discusses various parameters that would determine these definitions—for instance, whether unvalued features should be fully specified or underspecified—so I will direct readers to that discussion should they wish to formalize Match and Value under this theory of features.
On the other hand, I see no reason to expect that we must alter our Minimal Search algorithm in (55) or our final definition of Agree in (60) to account for alternative theories of features. Minimal Search is a general-purpose algorithm—it does not depend on the particular search criterion—and Agree simply searches a structure and replaces Matching lexical item tokens with the result of Value. This holds as long as Match is a predicate that compares lexical item tokens relative to features, and Value is a function from somehow-defective lexical item tokens to less-defective lexical item tokens.
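The claim that Minimal Search is independent of its search criterion can be sketched as a breadth-first traversal parameterized by an arbitrary predicate. The tuple encoding of syntactic objects and the sample structure below are assumptions for illustration, not the definition in (55).

```python
from collections import deque

def minimal_search(tree, criterion):
    """Return the least-embedded node satisfying criterion, or None.

    Breadth-first traversal guarantees that a shallower match is
    always found before a deeper one."""
    queue = deque([tree])
    while queue:
        node = queue.popleft()
        if criterion(node):
            return node
        if isinstance(node, tuple):      # a phrase: enqueue its daughters
            queue.extend(node)
    return None

# The same algorithm serves different purposes by swapping the criterion:
phrase = (("T", "v"), ("D", "N"))
assert minimal_search(phrase, lambda n: n == "D") == "D"
assert minimal_search(phrase, lambda n: n == "X") is None
```

Because the criterion is passed in as an argument, switching from a type-value theory of features to a geometric one would change only the Match predicate supplied here, not the search procedure itself.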
Likewise, were one able to adequately define a minimal DFS algorithm, or if one adopted a ledger-based model of Agree, there would not necessarily be any reason to abandon either the type-value or the geometric theory of features. Agree, Search, and Match/Value, then, are to a certain extent modular with respect to each other and, while the limits of that modularity are a purely theoretical question, the final choice of individual theories will depend on a combination of theoretical and empirical concerns.
6 Concluding Remarks
The task of formalizing a theoretical conjecture occupies an odd place in the sciences. While it generally does not bring anything new to the table, it does give us the opportunity to objectively assess the validity and theoretical prospects of various informal proposals. By formalizing various proposals for Agree as a syntactic operation, we can see that what is often shown as a simple curved arrow on tree diagrams is actually a rather complicated computational operation. Not only is this complexity apparent from the sheer size of the formal definition compared, say, to that of Merge, but it is also reflected in the theoretical complexities identified in Section 4.
In its current state, then, Agree should not be taken for granted, even with what seems to be overwhelming evidence of its existence. This, however, leaves the theory in an awkward position—the phenomena that Agree is supposed to explain appear to be real and rather ubiquitous, but our tool for explaining them is not yet ready. If we are engaged in rational inquiry (i.e., science), then we should not be surprised to find ourselves in such a position. It does not mean that it’s time to throw up our hands and discard our current theory. It means that we have plenty of work left—an enviable position to be in.