1 Introduction
If recent work in linguistics is correct, then Merge, the process of combining linguistic objects, is a core property of language that is utilized by the language faculty to construct syntactic objects (SOs). Chomsky (2010, p. 52) writes that “unbounded Merge is the sole recursive operation within UG” and that it is “part of the genetic component of the language faculty.” If this is correct, human language makes use of recursive Merge. Berwick (2011) suggests that non-human primates have lexical items but no Merge, whereas birds have something like Merge (used in songs) but no lexical items. Human language, crucially, makes use of both lexical items and Merge.
Chomsky (2001, 2013, 2015) takes the position that Merge is free. Chomsky (2015, p. 14) writes that “[t]he simplest conclusion … would be that Merge applies freely” and “[o]perations can be free, with the outcome evaluated at the phase level for transfer and interpretation at the interfaces.” I take this to mean that both External Merge and Internal Merge are free. Crucially, Free (Internal) Merge would result in an infinite number of possible structures generated for every possible utterance. This is untenable. Thus, Free Merge must be constrained by the language faculty.
The question then arises of how Free Merge is constrained. In this paper, I demonstrate how, given a Merge-based model of language generation (based on recent work in linguistic theory), Merge can be constrained. Crucially, I argue that arguments can undergo Free Merge, within the constraints of the language module, but that Labeling in general is sufficient to eliminate most impossible derivations.
In the following sections, I discuss my core assumptions regarding syntactic structure, which I implemented in a computer model that automatically generates sentences. Notably, in this model, I attempt to remove many of the problematic and overly complex assumptions in recent work in the Minimalist Program (Chomsky, 1995), with the goal of keeping language simple, in accord with the Strong Minimalist Thesis (Chomsky, 2000, 2001, 2010; Chomsky et al., 2023), the notion that “language keeps to the simplest recursive operation, Merge, and is perfectly designed to satisfy interface conditions” (Chomsky, 2010, p. 52). Then I explain how Labeling is generally sufficient to constrain Free Merge. This is followed by discussion of issues that arise with respect to overgeneration.
2 Computer Model of Language
For this work, I created a computer model that implements the theory that is presented in this paper. This model was created in the Python programming language, and the output is generated with HTML and JavaScript. This model is fed an input stream of lexical items, which it Merges together to form SOs. Selection and Merge of a lexical item from the input stream is External Merge. The model also implements Internal Merge (displacement) of elements from within an SO. This Internal Merge (IM) is the main focus of this paper. The model can compute multiple derivations for a single input stream, which is crucial for implementing a version of Free Merge. This model is a language generator (not a parser) because it generates phrases and sentences from a given input list of lexical items; it is not fed complete sentences as input.1
Portions of a derivation produced by the model are shown in Figure 1. An initial list of lexical items is fed into the model. The model consecutively selects and Merges together the lexical items, in accord with the theory that is developed in this paper. After each Merge step, the model checks for the possibility of agreement relations and for the possibility of Labeling (see Section 3). When a derivation is complete, it is transferred to Spell-Out, where a particular pronunciation is determined. For any particular example, there can be multiple successful derivations (that converge), as well as multiple crashed derivations.
Figure 1
Main Components of the Computer Model
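The selection-and-Merge loop described above can be sketched in a few lines of Python. This is a minimal illustration only, not the model's actual code; all names are hypothetical, and SOs are represented here as unordered frozensets to reflect the set-formation view of Merge.

```python
# Minimal sketch of a generator's External-Merge loop (hypothetical names;
# not the actual model's code). Each SO is an unordered set {X, Y}.

def merge(x, y):
    """Merge two syntactic objects into the unordered set {X, Y}."""
    return frozenset([x, y])

def derive(lexical_items):
    """Consecutively select items from the input stream and External-Merge
    each with the current SO, always extending the root of the structure."""
    so = lexical_items[0]
    for item in lexical_items[1:]:
        so = merge(item, so)
    return so
```

For the stream `["books", "n", "read"]`, this yields {read, {n, books}}. The real model additionally checks for Agree and Labeling after each Merge step and can pursue multiple derivations for a single input stream.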
I utilized this computer model to test the theory that is developed in the following sections. The model produces complete step-by-step derivations for all target constructions, making it possible to find problems with, and verify the accuracy of, the target theory. The complete derivations for all target constructions presented in this paper can be found in the Supplementary Appendix (see Ginsburg, 2024),2 and these can be of use to researchers who are interested in verifying the proposals in this paper. The main focus of this paper, however, is on linguistic theory; the model serves to test the accuracy of that theory.
3 Basic Assumptions About Language
In this section, I review the basic Labeling-based proposals of Chomsky (2013, 2015) and then I describe the basic properties of the language faculty that I assume to be at work in the language model that I created.
3.1 Labeling-Based Derivations According to Chomsky (2013, 2015)
Following Chomsky (2013, 2015), I assume that a form of Labeling is at work with respect to language generation. Labeling is necessary for interpreting phrases. Labeling refers to a process of finding a prominent feature of an SO via the search process involved in language, Minimal Search (Chomsky 2013, 2015).3 Chomsky (2013, p. 43) writes that “[t]he simplest assumption is that LA [Labeling Algorithm] is just minimal search, presumably appropriating a third factor principle, as in Agree and other operations.” In this way, Labeling is really just a form of Minimal Search, which finds prominent features that can function as labels.
Labeling via Minimal Search works as follows. Assume that a head X and a phrase YP Merge, forming {X, YP}. In this case, the label is X, assuming that X has prominent features that are capable of Labeling. If two phrases XP and YP Merge to form {XP, YP}, then shared features can label. For example, assume that XP (specifically the head X) has phi-features and YP (the head Y) has unchecked phi-features. Minimal Search results in XP and YP forming an Agree relation so that the uPhi on YP are checked by the iPhi (interpretable phi-features) on X, and then the shared phi-features on XP and YP can label.

Chomsky also takes the position that the English T is too weak to label on its own—this accounts for the requirement that a clause have a subject (the traditional EPP effect of Chomsky (1981)). Given the structure {T, YP}, T alone cannot label. However, given {XP, TP} where TP and XP Agree in terms of phi-features, the shared phi-features label. This is accounted for as follows. T inherits uPhi from C. Given an {XP, TP} structure, the uPhi on T Agree with the phi-features on X and the shared phi-features label. Crucially, in an {XP, YP} structure in which XP and YP do not Agree, Labeling is not possible. Also, given a {T, YP} structure, Labeling is not possible, assuming that T is too weak to label. These cases are summarized in (1).

Furthermore, consider a root that has Merged with a functional head (e.g., a categorizer) to form what is essentially a head-head structure; for example, the root walk Merges with a categorizer n. In this case, the root is too weak to label by itself, but the functional head can label. This is the position that Chomsky (2013, p. 47) takes, following Marantz (1997), Embick and Marantz (2008), and Borer (2005a, 2005b, 2013). I assume that a root can label after it Merges with a categorizer.
1
Labeling Failure
{XP, YP} – X and Y do not Agree
{T, YP} – T is too weak to label
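The Labeling outcomes just summarized can be rendered as a small decision procedure. The sketch below is a deliberate simplification for illustration only: heads are strings, phrases are frozensets, roots are marked with a leading "√", and the feature system is reduced to a single `agree` flag.

```python
def is_weak(head):
    """T and bare roots are too weak to label by themselves (illustrative)."""
    return head == "T" or head.startswith("√")

def label(x, y, agree=False):
    """Minimal-Search labeling sketch for {X, Y}.
    Returns the label, or None for a Labeling failure."""
    x_head, y_head = isinstance(x, str), isinstance(y, str)
    if x_head and y_head:
        # head-head: e.g., a root plus a categorizer; the functional head labels
        return y if is_weak(x) else x
    if x_head and not y_head:
        return x if not is_weak(x) else None   # {X, YP}: X labels unless weak
    if y_head and not x_head:
        return y if not is_weak(y) else None
    # {XP, YP}: shared (agreeing) phi-features label; otherwise failure
    return "<phi, phi>" if agree else None
```

For instance, `label("√walk", "n")` returns `"n"` (the categorizer labels), while `label("T", some_phrase)` returns `None`, reflecting (1).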
The convergent derivation of Tom read books,4 following Chomsky (2013, 2015), proceeds as shown in Figure 2. The lines below each terminal node represent the frontier of the derivation—the portion of the derivation that is sent to Spell-Out to be pronounced. The root books set-Merges with the functional categorizer n, and n labels, as the root books is not capable of Labeling. Chomsky (2015) claims that a verbal root undergoes internal pair-Merge (head movement) with v* to form <v*, read>, resulting in v* being dephased (also see Epstein et al., 2016). The <v*, read> pair-Merged structure is represented with a dotted arc. Dephasing is a process in which an element that would typically function as a phase head, thus being a point of transfer, no longer functions as a phase head.5 In this case, Chomsky proposes that phasehood is passed onto the complement of v*. Thus, the complement of v* will function as a phase and be transferred. A phase head passes uFs to its complement, so the uPhi (uninterpretable Phi) of v* are passed onto the verbal root read, which, being a root, is unable to label by itself. The object books undergoes IM (Internal Merge) with read to form an {XP, YP} structure. In the matrix clause, the subject is initially Merged with the vP, and then it internally Merges with the TP. The uPhi of C are inherited by T. Minimal Search results in phi-feature agreement in the {NP, {Tpast…}} and {NP, {read…}} structures, and these shared phi-features are able to label, where the label is indicated as <ɸ, ɸ>.
Figure 2
Structure of “Tom Read Books”
Note. Adapted from Chomsky (2015, p. 10).
In the following subsections, I explain my assumptions about Labeling Theory. Note that, for reasons discussed in the following sections, I do away with some of the operations utilized in the type of derivation shown in Figure 2.
3.2 Phases
Labeling Theory follows the view that the structures of sentences are constructed hierarchically in a bottom-up fashion, and sentences consist of phases, which are portions of sentences that essentially become inaccessible after construction. The core phases are generally assumed to be a transitive Verb Phrase (v*P) and a Complementizer Phrase (CP), following Chomsky (2000, 2001). Both (2)a and (2)b are well-formed, and crucially both are formed from the same set of lexical items. These examples differ, however, with respect to the ordering of lexical items. The embedded CP in (2)a is a phase that is constructed from a lexical array that does not contain there. As a result, a man raises to subject position of the CP. The expletive there is associated with the higher phase of the matrix clause. In (2)b, on the other hand, the expletive there is available in the embedded CP phase. As a result, there is inserted in subject position of the CP and a man does not need to move.
2
a. There is a possibility [CP that a man will be t in the room].
b. A possibility is [CP that there will be a man in the room]. (Epstein, Kitahara, & Seely, 2014, p. 469)
Once a phase is complete, the complement of the phase head becomes inaccessible to further operations, which is proposed to reduce memory burden—the mind can essentially put a completed phase to the side and compute the next phase. Note that when a phase head is Merged, there are differing views about which portions of the phase become inaccessible in accord with the Phase Impenetrability Condition (Chomsky, 2000, 2001; Müller, 2004; Richards, 2011). Under one version of the Phase Impenetrability Condition, the complement of the phase head becomes inaccessible and is transferred, but in another version, the complement (if present) of the lower phase head becomes inaccessible and is transferred. As noted by Boeckx and Grohmann (2007, p. 206),6 referring to Chomsky (2000), “[c]omputation cost reduction is the prime conceptual advantage and motivation for phases.” This means that all feature checking operations within the phase must be complete, and any elements that need to move out of the phase must have moved to the edge of the phase before completion. Since phases are thought to be complete (in some sense), they ideally should be of some advantage when accounting for island effects, although whether or not this is the case is open to debate (e.g., see Chomsky, 2008; Gallego, 2010). I incorporate the notion of phases into this model, since they are utilized in Labeling theory. I assume that the phases, following Chomsky (2001), are transitive VP (v*P) and CP. Note that, essentially following Chomsky (2021), I will assume that when v* or C is Merged, the v*P/CP is transferred. Thus, the head of the phase is transferred together with its complement but the specifier, if present, remains outside of the transferred phase.
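The transfer convention just adopted (Merge of v* or C transfers the phase head together with its complement, leaving only the specifier accessible) can be sketched as follows. The representation is illustrative; the real model operates on full SOs, not flat lists.

```python
# Sketch of phase-level transfer under the convention assumed in the text
# (essentially following Chomsky, 2021). Illustrative names and structures.
PHASE_HEADS = {"v*", "C"}

def merge_phase_head(head, complement, specifier=None):
    """Merging a phase head transfers the head plus its complement;
    a specifier, if present, remains accessible at the phase edge.
    Returns (accessible_items, transferred_items)."""
    if head not in PHASE_HEADS:
        accessible = [specifier, head, complement] if specifier else [head, complement]
        return accessible, []
    transferred = [head, complement]                      # head + complement go
    accessible = [specifier] if specifier is not None else []  # edge survives
    return accessible, transferred
```

For example, Merging v* above a VP with a subject in its specifier leaves only the subject accessible to further operations.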
3.3 Feature Inheritance and Agreement
Feature inheritance is an operation in which a phase head passes features onto a complement. The notion of feature inheritance was proposed by Chomsky (2008), based on work (to the best of my knowledge) by Carstens (2003) and Miyagawa (2005), among others (also see references in Carstens, 2003, and Miyagawa, 2005). Chomsky (2008, pp. 143–144) writes:
….for T, ϕ-features and Tense appear to be derivative, not inherent: basic tense and also tenselike properties (e.g., irrealis) are determined by C … or by the selecting V (also inherent)…In the lexicon, T lacks these features. T manifests the basic tense features if and only if it is selected by C…if not, it is a raising (or ECM) infinitival, lacking ϕ-features and basic tense. So it makes sense to assume that Agree and Tense features are inherited from C, the phase head.
Feature inheritance can be useful for accounting for Exceptional Case Marking (ECM) constructions. In the ECM construction (3)a, the embedded T, pronounced as to, occurs without C. In this case, T lacks agreement features, and him Agrees with the matrix verb expect. In (3)b, on the other hand, T, pronounced as past tense on the verb win (resulting in won), occurs with C. T has agreement features and Agrees with the subject, resulting in the nominative pronoun he. These simple examples demonstrate how T, in the presence of C, has agreement features, which it lacks in the absence of C. While feature inheritance is useful for accounting for the ECM data, it is not clear that it is required. If non-finite T simply lacks a full set of unchecked/uninterpretable phi-features (uPhi), and tensed T has a full set of uPhi, the same facts can be accounted for without recourse to feature inheritance.
3
a. I expect [T him to win].
b. I think [C that he won].
Feature inheritance notably is a complex operation that involves copying agreement features from C onto T, or the passing of features from C to T. Chomsky notes that this violates the No-Tampering Condition (Chomsky, 2000, pp. 136–137; Chomsky, 2008, p. 138), as it requires altering an already formed syntactic structure. The question then arises of whether or not it is conceptually necessary.
Complementizer agreement is found in a variety of languages, such as Frisian, some Dutch and Germanic dialects, and Bantu languages (Koppen, 2017). Both C and a verb can show agreement with a subject, as in (4)a–b. Assuming that verbal agreement indicates agreement on T, both C and the verb Agree with the subject in these examples.
4
a. datt-e wiej noar ’t park loop-t (Dutch, Hellendoorn dialect)
that-pl we to the park walk-pl
‘that we are walking to the park’ (Ackema & Neeleman, 2001, p. 34; Carstens, 2003, p. 397)
b. dan ik werken (West Flemish)
that-1sg I work-1sg (Ackema & Neeleman, 2001, p. 29)
Although the existence of complementizer agreement as in (4) has been given as evidence for feature inheritance (Chomsky, 2008; Miyagawa, 2005), complementizer agreement tends to be less common and less complete than agreement with T. Matasović (2018, p. 9) writes that “the most common agreement pattern within the domain of the clause is verbal agreement.” Koppen (2017, p. 7) writes, “The CA [Complementizer Agreement] paradigm is usually defective, however, in the sense that not all person/number combinations of the subject lead to an overt agreement reflex on the complementizer.” Koppen (2005, p. 35) points out this defectivity in a variety of Germanic/Dutch languages/dialects. In Frisian, a complementizer shows agreement only with a second person singular embedded subject, whereas a verb shows agreement with all types of subjects, as shown in Table 1. Koppen (2005) discusses similar paradigms in Tegelen Dutch, Bavarian, and Lapscheure Dutch. In all of these languages/dialects, there are variations with respect to the extent of complementizer agreement, but there are fewer complementizer agreement suffixes than verbal suffixes, thus providing further evidence for the notion that complementizer agreement tends to be defective.
Table 1
Agreement in Frisian
| Person.Number | Comp. agreement | Verbal agreement |
|---|---|---|
| 1 Per.Sg | -0 | -n |
| 2 Per.Sg | -st | -st |
| 3 Per.Sg | -0 | -t |
| 1 Per.Pl | -0 | -e |
| 2 Per.Pl | -0 | -e |
| 3 Per.Pl | -0 | -e |
Note. Koppen (2005, p. 35).
If it truly is the case that verbal agreement is more common than complementizer agreement and that verbal agreement tends to be more complete than complementizer agreement, then this may be an indication that a complementizer is not always the origin of agreement features. If C were the locus of agreement features, then one might expect agreement with C to be more common than it is, and for agreement with C to tend to be more, not less, complete than agreement with T.
Feature inheritance also raises technical problems. If uPhi are inherited by T, one possibility is that all of the uPhi of C are passed from C onto T and no longer remain on C. This would be the case when agreement only shows up on T (usually visible on the verb). This does not appear to be the case in the examples in (4) in which there is agreement between a subject and both C and the verb (assuming the verbal agreement is the result of agreement on T). Another possibility is that the uPhi of C are copied onto T, so that they appear on both C and T. This could account for the data in (4). Again, copying of features from one element onto another seriously alters an already formed SO, again violating the No-Tampering Condition. It would be simpler if C and T come with their necessary agreement features.
Richards (2007) provides arguments for feature inheritance, proposing that feature transmission is a “conceptual necessity” in order to avoid transfer of uninterpretable features. Uninterpretable features, by definition, cannot be processed by the semantic component of a derivation. If uninterpretable features are checked but are not transferred immediately, then, according to Richards, they should stay around and cause a derivation to crash, since they can’t be interpreted. Thus, uninterpretable features must be transferred as soon as they are checked. The assumption seems to be that when checked, uninterpretable features are transferred with the phase; they are “deleted” so that they are no longer visible to the semantic component. Assume that T has uPhi that are checked via Agree with a subject, before the phase head C is Merged. When these features are checked, they cannot be transferred until after the phase head C is Merged. Thus, these uPhi cannot be deleted as soon as they are checked. These checked uPhi, according to Richards, then become indistinguishable from interpretable features, and interpretable phi-features on T will, presumably, cause a derivation to crash. The idea seems to be that since T is not an argument, phi-features (which are associated with arguments) cannot be interpreted on T. On the other hand, if uPhi are inherited from C by T, then as soon as they are inherited, they are checked, and since the phase level has been reached, the checked uPhi are instantly transferred, so that they no longer remain for the semantic component. Assuming that uninterpretable features originate on a phase head thus predicts the phase-level operations of feature inheritance, Agree (e.g., checking of uPhi on T by phi-features on a subject), and transfer of the relevant portion of the phase.
However, the idea that uninterpretable features need to be deleted as soon as they are checked is not necessarily a given. Since these features are uninterpretable, by definition, they could cause a derivation to crash if they are transferred, but as long as they are deleted before transfer, it isn’t clear why they need to be deleted immediately – this seems to be a stipulation. Furthermore, some recent work takes the position that the complement of a phase head is not transferred immediately. Chomsky (2015) argues that phasehood can be transferred to the complement of a phase head, based on ECM constructions and the that-trace effect.7 As noted by Goto (2017), the motivation for feature-inheritance based on the need to delete uninterpretable features as soon as they are checked may not necessarily hold.
Another issue with feature inheritance involves probe-goal agreement. Since Chomsky (2001), agreement has typically been assumed to involve a probe-goal relation. For example, assume that T in (5) has uPhi that probe for and Agree with the phi-features on a subject. Similarly, v* has uPhi that probe for and Agree with phi-features on an object. The relations Agree(T[uPhi], Mary[iPhi]) and Agree(v*[uPhi], books[iPhi]) check the uPhi on T and Mary via probe-goal agreement.
5
[T[uPhi] Mary[iPhi] v*[uPhi] bought books[iPhi]]
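The probe-goal relations in (5) can be sketched as a search from the probe through its c-command domain. The feature representations below are illustrative stand-ins (plain dictionaries), not a full implementation of Agree.

```python
def agree(probe, domain):
    """Probe-goal Agree sketch (illustrative). A probe bearing unchecked uPhi
    searches its c-command domain (a list ordered closest-first) and has its
    uPhi checked by the closest goal bearing iPhi. Returns the goal's name."""
    if not probe.get("uPhi"):
        return None                      # nothing to check
    for goal in domain:
        if "iPhi" in goal:
            probe["uPhi"] = False        # uPhi checked against the goal's iPhi
            return goal["name"]
    return None

# Elements of (5): T probes and Agrees with Mary; v* probes and Agrees with books
T      = {"name": "T",  "uPhi": True}
v_star = {"name": "v*", "uPhi": True}
mary   = {"name": "Mary",  "iPhi": ("3", "sg")}
books  = {"name": "books", "iPhi": ("3", "pl")}
```

Here `agree(T, [mary, v_star, books])` checks T's uPhi against Mary's iPhi, and `agree(v_star, [books])` checks v*'s uPhi against books, mirroring Agree(T[uPhi], Mary[iPhi]) and Agree(v*[uPhi], books[iPhi]).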
Now consider how probe-goal agreement of this sort works given feature-inheritance. If T must inherit its uPhi from C, then the uPhi on T cannot probe until after C is Merged. Thus, probing is counter-cyclic, not from a root node, which is contrary to the original notion of probe-goal in which probing occurs from the root node (Richards, 2006).
Another problem, pointed out by Epstein, Kitahara, and Seely (2022), hereafter EKS (2022), is that given feature inheritance, there are cases in which agreement must occur with a goal that is no longer visible. Assuming feature inheritance, in (6), the uPhi on T are inherited from C. Thus, T does not obtain its uPhi until after C is Merged, and also after the subject has internally Merged with the TP (assuming that a subject raises to the specifier of TP). Then, following Chomsky’s (2013) view that only the highest copy of a syntactic object (SO) is visible to probing, the lower copy of the subject is not visible to probing. This means that the probe cannot find the subject in its base position. The higher copy of the subject is in the specifier of the TP, so the past tense T does not c-command it (see Figure 3). EKS (2022) propose a solution based on Minimal Search (agreement occurs between T and the subject in the TP). However, none of this is necessary if there is no feature inheritance. If T simply comes with its relevant set of uPhi, then it can probe as soon as it is Merged. There is no need for counter-cyclic Agree relations, and the problem of agreement with an invisible copy of an SO does not arise.
6
[C Mary[iPhi] [T T[uPhi] Mary[iPhi] v*[uPhi] bought books[iPhi]]]
Figure 3
Agreement Given Feature Inheritance
The facts regarding feature inheritance are far from settled, but I will assume that, from the perspective of the Strong Minimalist Thesis, it is best to do without it.8 Feature inheritance is best eliminated from the current theory for the sake of simplicity; it is a complex operation, and such an operation requires extraordinary justification.
3.4 Agreement and Case
Case is subject to a great deal of cross-linguistic and language-internal variation. As is well known, Case morphology in English is basically “phonologically zero” (Pesetsky & Torrego, 2011, p. 55), except for pronouns. Case shows up on nouns in Latin, as in (7). In Russian, there are declinable and indeclinable nouns (Pesetsky & Torrego, 2011, p. 55), so that whether or not Case appears overtly on a noun can depend on the particular noun, as in (8). In Icelandic, the verb luku ‘finished’ occurs with a dative object and the verb vitjuðum ‘visited’ occurs with a genitive object, as in (9)a–b. Icelandic is also well-known for constructions in which a subject appears with dative Case and an object with nominative Case, as in (9)c. Furthermore, Bobaljik (2008) points out that nominative-accusative case systems and ergative case systems (which typically mark the subject of a transitive verb with ergative case, grouping the subject of an intransitive verb together with the object of a transitive verb) assign Case differently, but arguments seem to be treated syntactically in the same way in both types of systems, suggesting that Case is not truly a syntactic relation.
7
libr-um (Latin)
book-Acc (Pesetsky & Torrego, 2011, p. 53)
8
a. mašin-u b. mašin-y c. mašin-oj (Russian)
car-Acc car-Gen car-Instr
d. kenguru
kangaroo-Acc/Gen/Instr (Pesetsky & Torrego, 2011, p. 55)
9
a. Ðeir luku kirkjunni (Icelandic)
They finished the.church.Dat
b. Við vitjuðum Olafs.
We visited Olaf.Gen (Pesetsky & Torrego, 2011, p. 61)
c. Jóni líkuðu ϸessir sokkar (Icelandic)
Jon.Dat like.pl these socks.Nom
‘Jon likes these socks.’ (Jónsson, 1996, p. 143; per Bobaljik, 2008, p. 298)
These Case facts can be accounted for if Case is primarily a Spell-Out phenomenon. Marantz (2000, p. 20) argues that “case and agreement morphemes are inserted only after SS [S-Structure] at a level we could call ‘MS’ or morphological structure.” Bobaljik (2008), following Marantz, writes that “the proper place of the rules of m-case [morphological case] assignment is thus the Morphological component, a part of the PF interpretation of syntactic structure” (Bobaljik, 2008, p. 300). Chomsky (2021, p. 23) suggests that “Case is part of externalization,” further writing that “there seems to be no general semantic reason” for Case systems and that “[p]erhaps establishing relations among elements facilitates perception/parsing.”
In my model, I take a Spell-Out-based approach to Case. I assume that Case appears at Spell-Out, following Chomsky’s view that Case is a reflex of phi-feature agreement (Chomsky, 2000, 2001). This approach can account, at least to a certain extent, for some of the language-internal and cross-linguistic idiosyncrasies that occur with Case. I assume that unchecked phi-features, uPhi, must be checked for a derivation to converge. The result of phi-feature agreement can lead to an argument being pronounced with overt Case morphology. Case, however, is a Spell-Out phenomenon. The exact form of Case can be subject to language internal and cross-linguistic variation, but the actual form of Case on an argument does not have an influence on syntax. Note that if an argument is unable to be pronounced with Case, a derivation can crash at Spell-Out (see Section 3.6).
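On this approach, the syntax records only which head an argument Agreed with; the case form itself is chosen at Spell-Out by language-particular vocabulary rules. A hypothetical sketch for English pronouns follows; the rule table is invented for illustration and is not the model's actual rule set.

```python
# Hypothetical Spell-Out case rules: the pronounced form depends on which
# head the argument's phi-features Agreed with. English case morphology is
# phonologically zero except on pronouns, so non-pronouns pass through.
PRONOUN_CASE = {
    ("he", "T"):  "he",    # Agree with finite T -> nominative form
    ("he", "v*"): "him",   # Agree with v*       -> accusative form
}

def spell_out_case(item, agreed_with):
    """Return the externalized form of an argument (illustrative)."""
    return PRONOUN_CASE.get((item, agreed_with), item)
```

Language-internal idiosyncrasies (such as the Russian indeclinable kenguru in (8)d) can then be treated as gaps or special entries in the Spell-Out rule table rather than as syntactic differences.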
3.5 Head Movement
Head movement is a controversial topic. It appears to be ubiquitous. However, it isn’t clear how exactly it works. Consider the basic examples in (10) and (11) which show typical head-movement of T to C. Assume that C is selected and Merged with TP. Then assume that T raises and undergoes IM (undergoes head-movement) with C, as shown in (10)b and in (11)b (assume that will is in T). These head-movement operations appear to violate the No Tampering Condition because an already formed CP is altered. Head movement also violates the Extension Condition (Chomsky, 1993, 1995) which is the requirement that operations extend the SO. Movement of a head to a position within the clause does not extend the size of the SO.
10
a. C Mary T bought the book.
b. C+T Mary T buy the book? → Did Mary buy the book?
11
a. C Mary will buy the book.
b. C+will Mary will buy the book? → Will Mary buy the book?
Chomsky has proposed differing views on head movement. Chomsky (1993, p. 23) argues that the Extension Condition does not apply to adjunction operations, which means that it does not apply to head movement, assuming that head movement is adjunction (also see Dékány, 2018, p. 5). However, making head-movement an exception to the Extension Condition is not exactly ideal. From the perspective of Minimalism, the notion that all Merge operations target the root of an SO is optimal. Thus, head-movement seems to violate this requirement, and this alone is enough to make head-movement suspect.
In recent work in Labeling Theory, a version which I assume here, Chomsky (2015) makes use of head movement. Chomsky (2015, p. 12) proposes that a verbal root R raises to v*, resulting in dephasing of v* (see Figure 2 above). This raising operation forms “the amalgam [R-v*]”, which Epstein, Kitahara, and Seely (2016) (hereafter EKS, 2016) specifically describe as being a case of internal pair-Merge. EKS (2016, p. 90) write “pair-Merge internally forms <R, v*> (=R with v* affixed).” Internal pair-Merge, unless it is implemented via sidewards movement, requires R to internally pair-Merge with v* after v* has Merged with the SO. This head-movement again violates the No Tampering Condition and the Extension Condition.
Despite making use of head movement in some works such as those discussed above, a variety of issues, including the violation of the Extension Condition/counter-cyclicity, led Chomsky (2001) to propose that head-movement is a phonological (PF) operation. Chomsky (2001, p. 37) writes “[t]here are some reasons to suspect that a substantial core of head-raising processes, excluding incorporation in the sense of Baker (1988), may fall within the phonological component.” Some other reasons from Chomsky (2001, pp. 37–38) why head movement is problematic are as follows (see Roberts, 2011, for a summary of these arguments). There are no clear interpretation differences in languages such as French and Icelandic, in which the verb appears in a position generally considered to be T (possibly a result of head movement), compared with languages such as English, in which the verb remains below T. If head movement were responsible for verb movement, and if head movement influences interpretation, then the expectation is that semantic differences would arise (Roberts, 2011, p. 99). Another issue is that if a head raises and adjoins to a higher head within an SO, then “the raised head does not c-command its trace” (Chomsky, 2001, p. 38). Also, a phrase can undergo successive-cyclic movement whereby it moves from the specifier of one phrase to the specifier of another phrase, but this doesn’t appear to occur with a head. Rather, “it always involves ‘roll-up’ (i.e. movement of the entire derived constituent) … iterated head movement always forms a successively more complex head” (Roberts, 2011, p. 201). For example, in an English interrogative, an auxiliary (Aux) moves to T (assuming that the auxiliary is not base generated in T), and then Aux-T raises to C. An auxiliary cannot move to C and leave T behind.
There are a variety of proposals related to head movement in the literature which make use of syntactic movement and/or post-syntactic PF operations. Embick and Noyer (2001) propose that there are postsyntactic lowering operations in which a head can lower and combine with another head. Matushansky (2006) argues that typical cases of head movement can result from a syntactic operation of movement of a head to a specifier position followed by a morphological (non-syntactic) operation that creates adjoined heads. Harizanov and Gribanova (2019) argue that some cases of head movement occur in the syntax via IM and other cases involve a morphological process that occurs outside of the syntax via an operation of “postsyntactic amalgamation” which can involve postsyntactic lowering or raising of a head to form a head-head adjunction structure. Harley (2004) argues for a non-syntactic operation of conflation (following Hale & Keyser, 2002) in which the defective phonological features of one head combine with the phonological features of a complement. Platzack (2013) also develops a purely phonological approach to head movement which permits pronunciation of heads to occur in a position that is different from in the syntax. See Roberts (2011) and Dékány (2018) for in-depth discussion of problems with head movement, as well as discussion of potential ways to account for head-movement, both in narrow syntax and at PF. Also see Roberts (2010) for an attempt to account for some components of head-movement in the syntax.
Notably, there are two types of proposals in the literature involving head-movement that occurs in the syntax without violating the Extension Condition.9
In some accounts (e.g., Harizanov & Gribanova, 2019; Matushansky, 2006; among others), a head can undergo IM to a specifier position. Thus, a head essentially functions as a specifier. The head moves to the root of the SO, so this movement is not counter-cyclic and does not violate the Extension Condition. However, the result of movement is an {Y X, {YP…X}} structure in which the X does not label, so X functions as a specifier. This is problematic given the core notions of Labeling Theory, which assume that given an {X, YP} structure, the head X labels.
Another approach is consistent with Labeling Theory. In this approach, a head undergoes IM with the core SO, and then the head relabels. Thus, IM of X with YP forms an {X X, {YP…X}} structure in which X functions as the label. Presumably, X also functions as a label in its base position, and thus, this is sometimes referred to as “relabeling” as well as “reprojection”. For example, this type of head movement might be plausible in relative clauses, if one assumes that a nominal head raises and relabels, and if relabeling by a simplex head does not go against the core assumptions of Labeling Theory. This type of relabeling approach can be found in Georgi and Müller (2010), Donati and Cecchetto (2011), Cecchetto and Donati (2015), and Fong and Ginsburg (2023), among others. Note that this type of head movement might be compatible with my computer model, although it is not utilized for the derivations discussed in this paper (as it is not necessary).10
Although the issues regarding head movement are far from settled, I adopt a model in which there is no counter-cyclic head movement in the narrow syntax. I take an affix hopping approach (Chomsky, 1957) in which a set of Spell-Out rules is applied to the output of the syntax.11 Crucially, if X (an affix) and Y are adjacent at PF, then it is possible for Y to be linearized before X.
The basic rules that I implemented are given in (12) below. Note that only a small set of rules is sufficient to account for the basic English constructions produced by my model. The computer model generates a syntactic structure. If a derivation does not crash, the nodes of the tree are sent to Spell-Out. Then the basic PF (Phonological Form) rules apply when necessary. Examples of rule applications for particular Spell-Out forms are shown in Table 2. Note that for these rules to apply, the PF component of the derivation must have access to some syntactic category information. Thus, if the model finds T adjacent to v*, which is in turn adjacent to a root R or an auxiliary, then T attaches onto the root or auxiliary, and v* is eliminated from PF, since it has no pronunciation. As shown in Table 2, for Tom saw Fred, the adjacent SOs T(Past,3rd,sg) v* see are converted into T(Past,3rd,sg)+see, which is pronounced as saw, with T suffixing onto the adjacent see; with a regular verb, past tense is generally pronounced as -ed. The verbal head v* is not pronounced. For Will Tom read a book, the interrogative C and T combine to form CQ+T(Pres,3rd,sg). Note that this requires T to attach onto CQ by moving over the subject at Spell-Out. Furthermore, the auxiliary will must move over the subject and combine with T to form T(Pres,3rd,sg)+will, which ends up being pronounced as will. In cases in which T combines with C, but there is no overt element in T, the appropriate form of do (depending on Tense and agreement) is pronounced. The appropriate forms of do as well as irregular verb forms are listed in a lexicon, which the model consults.12
12
Basic PF (Phonological Form) Rules (R = Root, Aux = Auxiliary):
Tense: T v* R/Aux → R/Aux+T
Modal: T Modal → T+Modal
Passive: be -en v R → be R+-en
Progressive: be -ing v* R → be R+-ing
Perfective: have -en v* R → have R+-en
Interrogatives: CQ N T → CQ+T N
Irregular verb forms are stored in the lexicon.
Table 2
Spell-Out for Basic Sentences
Spell-Out | PF Rules |
---|---|
C n Tom T(Past,3rd,sg) v* see n Fred | Tom T(Past,3rd,sg)+see → Tom saw Fred |
C n Tom T(Pres,3rd,sg) will v* read a n book | Tom T(Pres,3rd,sg)+will → Tom will read a book |
C the n book T(Past,3rd,sg) be -en v read | the book T(Past,3rd,sg)+be → the book was -en+read → the book was read |
C n she T(Past,3rd,sg) be -ing v* read a n book | she T(Past,3rd,sg)+be → she was -ing+read → she was reading a book |
C n she T(Past,3rd,sg) have -en v* read a n book | she T(Past,3rd,sg)+have → she had -en+read → she had read a book |
CQ n Tom T(Pres,3rd,sg) will v* read a n book | CQ+T(Pres,3rd,sg) → T(Pres,3rd,sg)+will → will Tom read a book |
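The affix-hopping treatment described above can be rendered as a short sketch. The following is a minimal illustration, not the model's actual implementation: it rewrites a frontier of node labels using a few of the rules in (12) and consults a toy lexicon for irregular forms. The token spellings, the lexicon entries, and the reduced rule set are all assumptions made for the example.

```python
# Minimal sketch of applying Spell-Out rules like those in (12) to the
# frontier of a derivation. Token spellings, lexicon entries, and the
# rule set are illustrative assumptions, not the model's actual code.

IRREGULAR = {                       # toy stand-in for the lexicon
    "T(Past,3rd,sg)+see": "saw",
    "T(Past,3rd,sg)+be": "was",
    "read+-ing": "reading",
}
AUX = {"be", "have", "will"}        # auxiliaries/modals that host T
SILENT = {"C", "n", "v*"}           # heads with no pronunciation

def spell_out(tokens):
    out = list(tokens)
    i = 0
    while i < len(out):
        t = out[i]
        nxt = out[i + 1] if i + 1 < len(out) else None
        is_t = t.startswith("T(") and "+" not in t
        # Tense rule: T v* R/Aux -> combined T+R/Aux (v* is deleted;
        # T is realized as a suffix on the root or auxiliary)
        if is_t and nxt == "v*" and i + 2 < len(out):
            out[i:i + 3] = [t + "+" + out[i + 2]]
            continue
        # T combines with an immediately adjacent auxiliary or modal
        if is_t and nxt in AUX:
            out[i:i + 2] = [t + "+" + nxt]
            continue
        # Progressive rule: -ing v* R -> R+-ing (passive/perfective
        # -en would be handled analogously)
        if t == "-ing" and nxt == "v*" and i + 2 < len(out):
            out[i:i + 3] = [out[i + 2] + "+-ing"]
            continue
        i += 1
    # Pronunciation: look up irregular combinations, drop silent heads
    return " ".join(IRREGULAR.get(t, t) for t in out if t not in SILENT)

print(spell_out(["C", "n", "Tom", "T(Past,3rd,sg)", "v*", "see",
                 "n", "Fred"]))   # -> Tom saw Fred
```

Running this on the first and fourth rows of Table 2 yields "Tom saw Fred" and "she was reading a book", mirroring the successive rule applications shown there.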
3.6 FormCopy
Chomsky (2021) proposes a FormCopy operation which accounts for how nominals are interpreted as Copies. Chomsky (2021) describes FormCopy as a rule that assigns “the relation Copy to certain identical inscriptions” (p. 17) that are in a c-command relation, and that presumably are in the same phase. Identical inscriptions refers to arguments that are identical in form. Consider how this accounts for the Control construction (Chomsky, 1981, 1986) in (13)a, as shown in (13)b. Here, I assume that the Control construction is a TP.13 The NP many people1 is externally-Merged in the vP theta-position and undergoes IM to the non-finite TP specifier position. The NP many people2 is separately externally-Merged in the matrix vP theta-position, and undergoes IM with the matrix T. FormCopy applies and all inscriptions of many people, except for the highest many people2 in the matrix TP, are interpreted as Copies, which are not pronounced. If FormCopy were not to apply in (13), the lower many people1 would be treated as a separate NP (a repetition, not a Copy) from the higher many people2. This is fine as far as interpretation is concerned, since it has a separate theta-role from the matrix many people2. However, the derivation would then crash, and why exactly it would crash is an issue. I assume that it would crash at Spell-Out due to Case reasons, not due to issues with the syntax. The relevant argument many people1, not being a Copy, would need to be pronounced, but at the edge of a non-finite TP it is not in a position in which it could obtain Case, and the matrix try is not the kind of verb that assigns accusative Case.
13
Many people tried [many people to win].
[Many people]2 T [many people]2 tried [many people]1 to [many people]1 win. (Adapted from Chomsky, 2021, p. 22)
Note that FormCopy has the advantage of eliminating the need for memory of movement operations. In (14), assuming the standard view of the VP-internal subject hypothesis, the subject John is externally-Merged in the v*P. Then it internally-Merges with T in subject position. With FormCopy, there is no need to retain memory of the movement of John. When the construction is generated, John undergoes IM. Then, at the phase level, FormCopy applies and the lower inscription of John is interpreted as a Copy of the higher John.14
14
[T John T [v* John v* walked to the store]]
A number of issues arise regarding the FormCopy operation and its formulation in the literature. First, the question arises of whether or not FormCopy can apply freely. Chomsky (2021, p. 25) writes:
Let’s return to simple transitive sentences, such as John saw X. Suppose X = John. With the subject inserted by EM [External Merge] in the predicate-internal position, they are in an IM-configuration [Internal-Merge-configuration]. If FC [FormCopy] applies, the expression will crash at CI [Conceptual-Intentional interface] with a θ-Theory violation. We conclude, then, that like other operations, FC is optional, not applying in this case so there is no deletion, just two repetitions of John.15
Chomsky indicates that FormCopy is optional, but I do not take this to mean that it applies freely. Rather, it applies in some, but not all, cases in which there are multiple inscriptions with the same form.16 Specifically, it applies at the phase level. Chomsky et al. (2023, p. 25) write that “[i]n technical terms, the point at which FC [FormCopy] applies is referred to as the phase level.” Limiting FormCopy to the phase level clearly accounts for why FormCopy cannot convert the lower John1 into a Copy in John2 saw John1. As shown in (15), FormCopy should apply when the lower phase head v* is Merged, before John2 is Merged. After John2 is Merged, the lower v* and its complement are no longer accessible to FormCopy. Thus, FormCopy cannot apply between John2 and John1.
15
[ T John2 [v* see John1 ]]
In my model, I adopt this phase-level approach to FormCopy. Once a phase-head is Merged, FormCopy applies if possible. If X and Y are identical inscriptions of arguments in the same phase, then FormCopy converts Y into a Copy of X. Furthermore, FormCopy is beneficial as it enables us to do away with the need for the language faculty to memorize movement operations.17
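This phase-level timing can be sketched as follows. The code below is a simplified illustration under the assumptions just stated; the Inscription class, and its height field as a stand-in for the c-command relation, are assumptions for the example rather than the model's actual representation.

```python
# Sketch of phase-level FormCopy: identical argument inscriptions within
# a single phase are related as Copies, and only the highest remains
# pronounced. The Inscription class is an illustrative assumption, with
# `height` a toy stand-in for the c-command relation.

from dataclasses import dataclass

@dataclass
class Inscription:
    form: str          # e.g. "John"
    height: int        # higher value = c-commands lower inscriptions
    is_copy: bool = False

def form_copy(phase_inscriptions):
    """Apply FormCopy to the argument inscriptions of one phase."""
    highest_seen = {}
    for ins in sorted(phase_inscriptions, key=lambda x: -x.height):
        if ins.form in highest_seen:
            ins.is_copy = True            # lower inscription: unpronounced Copy
        else:
            highest_seen[ins.form] = ins  # highest inscription is retained
    return phase_inscriptions

# The two matrix-phase inscriptions of John in "John tried to win":
spec_tp = Inscription("John", height=3)
spec_vp = Inscription("John", height=2)
form_copy([spec_tp, spec_vp])
print(spec_vp.is_copy, spec_tp.is_copy)   # -> True False
```

Note that the John2 saw John1 case discussed above falls out of the timing, not of this routine: at the v*P phase level only John1 is present, and once John2 is Merged the lower phase is no longer accessible, so the two inscriptions are never in the same FormCopy domain.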
3.7 Strength and Labeling
Chomsky (2013, 2015) utilizes a notion of strength, combined with the need for Labeling, to account for the traditional EPP effects in languages such as English. T is weak so that it must be strengthened. In order to be strengthened it must form an {XP, YP} structure with an argument. Chomsky (2015, p. 7) suggests that a lack of “rich agreement” may be a reason for the weakness of T in English, unlike in null subject languages like Italian that have rich agreement. Chomsky suggests that in a null subject language, T may be able to label by itself.
The Labeling-based analysis of TP in English, in one sense, enables us to do away with the EPP. On the other hand, the notion of weakness is not entirely clear. If richness of agreement is at play, then questions arise regarding languages (e.g., Japanese) that do not appear to show any agreement but allow null subjects.
If certain heads like the English T can be weak, then the question also arises of what to do with non-finite T. Assume that finite T is weak and requires agreement to be strengthened for Labeling. But non-finite T does not show agreement. In (16), non-finite T, pronounced as to, and the embedded subject Mary are adjacent. One possibility is that the embedded TP has the structure in (16)b with Mary and the non-finite T forming an {XP, YP} structure. In (17), there is no overt argument in the embedded clause, but there should be a Copy of John (or PRO) in the embedded clause, as in (17)b. In this case, since the lower Copy of John is not pronounced, it should be invisible to Labeling, and thus non-finite T cannot be strengthened. Presumably, non-finite T can label, and thus, it does not require strengthening.
16
John expects Mary to arrive.
John expects [T Mary to arrive]
17
John tried to finish the work.
John tried [T John to finish the work].
Chomsky (2015) assumes that in an ECM construction such as (16), the embedded subject raises to the matrix object position. This follows work by Postal (1974) and Lasnik and Saito (1991), among others. For example, the embedded subject in an ECM construction can be passivized and it behaves like it is in the matrix clause with respect to binding effects. Chomsky argues that the Root expect is weak and thus must be labeled by an {XP, YP} structure. On this account, the embedded subject raises and forms an {XP, YP} structure with the matrix verbal root. For example, in (16), the structure {Mary, {expect Mary to arrive}} is formed. Problems with this approach are that the root expect must inherit uPhi from the higher v* (if feature inheritance is assumed) and that expect has to undergo some form of head movement over Mary. If head movement is a Spell-Out phenomenon, then this would happen at Spell-Out.
Due to the complexities involved, I take a simpler approach—Mary is in the embedded non-finite TP in (16), where it Agrees with the higher v*, resulting in accusative Case appearing on Mary at Spell-Out. The evidence that the subject of a non-finite clause can behave like the object of the matrix clause is strong, but whether or not this requires the embedded subject to actually undergo IM with the matrix verbal root is not clear. I will simply assume that, due to the lack of an intervening phase boundary, an ECM subject can behave like it is a matrix object even if it is in the embedded clause.18
There is evidence that an overt subject can appear in a non-finite clause. In examples such as (18), the subject him appears to be in the embedded clause. It seems to be a fairly standard assumption that the embedded subject is in the non-finite T (e.g., see Chomsky & Lasnik, 1977). If Labeling Theory is correct, then there must be some type of agreement relation between non-finite T to and him.
18
I want for him to go.
If the embedded subject of a non-finite clause remains in the non-finite T, as in (16) and (18), there is a potential Labeling issue. If non-finite T and the subject do not Agree, then there should be a Labeling failure. To get around this problem, I assume that non-finite T contains uPerson that Agrees with an argument. In (16) and (18), the uPerson of non-finite T is checked by the Person feature of the subject. Since the subject is internally-Merged with the non-finite T, the result is an {XP, YP} structure that is labeled with shared Person features. In (17), uPerson is checked by the Person feature of John. Since John is a Copy here, there is no visible {XP, YP} structure, and to labels by itself. Thus, non-finite T can either label by itself, or in an {XP, YP} structure with an argument that it shares a Person feature with.
The question then arises of whether or not there is evidence for agreement between a subject and non-finite T. Notably, agreement in infinitives can be found in a variety of languages. For example, standard Brazilian Portuguese has infinitives that can be inflected for person and number as in (19).
19
(eu/você/ele/ela) fala-r | I/you/he/she speak-Inf-∅ | (Brazilian Portuguese) |
(nós) fala-r-mos | (we) speak-Inf-1Pl |
(vocês) fala-r-em | (you-Pl) speak-Inf-3Pl |
(eles/elas) fala-r-em | (they) speak-Inf-3Pl |
(Pires, 2006, p. 92) |
Inflected infinitives are also found in European Portuguese (Raposo, 1987), as well as other Romance languages such as Galician and Old Neapolitan (Groothuis, 2015; Scida, 2004). Hungarian also has inflected infinitivals, as shown in (20).
20
Kellemetlen | volt | Jánosnak | az | igazságot | bevalla-ni-a. | (Hungarian) |
unpleasant | was | John-Dat | the | truth-Acc | admit-Inf-3Sg | |
‘It was unpleasant for John to admit the truth.’ |
Péter | nem | hagyta | megnéz-ne-m | a | filmet. |
Peter | not | let-3Sg.Def | watch-Inf-1Sg | the | film-Acc | |
‘Peter did not let me watch the film.’ (Tóth, 2000, p. 1) |
An anonymous reviewer points out that the distribution of subjects with agreeing infinitives and with non-agreeing infinitives is different. According to Pires (2006, p. 93), in Brazilian Portuguese a non-inflected infinitival requires a PRO subject (which has a local antecedent), and an inflected infinitival has a pro subject (which does not require a local antecedent). Furthermore, a non-agreeing infinitive requires a sloppy reading under ellipsis, whereas an agreeing infinitive permits a strict or sloppy reading, and a non-agreeing infinitive does not permit split antecedence whereas an agreeing infinitive does. Although an in-depth analysis is beyond the scope of this work, I suggest that these differences boil down to whether the non-finite T Agrees partially or fully with an argument. In some Portuguese infinitivals, there may be full agreement with an argument (the infinitival property is due to the lack of tense, not phi-features), and pro is permitted. In non-agreeing infinitivals, there can only be partial agreement, which is not sufficient to check Case on an argument, and only PRO is permitted.
Even though there is no clear overt indication of agreement in modern English infinitives, it is possible that there is partial agreement, as found in languages such as Portuguese, Hungarian, etc. Thus, I assume that T can label either by itself or via shared Person features.
Furthermore, I assume that heads can generally label. Mizuguchi (2017, p. 331) suggests that “[h]eads can label only when they are without unvalued features.” If a head has unvalued/unchecked features, it is incomplete, and thus it is reasonable to assume that Labeling is not possible. I adopt this view in my model; heads can label by themselves as long as they lack unchecked features. A root, however, needs to be categorized. Thus, a root cannot label by itself. For example, the root walk can be labeled only after it combines with a categorizer n or v.
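The labeling assumptions adopted here can be summarized algorithmically. The sketch below is a toy rendering, under these assumptions, of the conditions discussed in this section; the Head and Phrase classes are illustrative stand-ins, not the model's actual data structures. A categorized head without unchecked features labels alone or in {X, YP}; an {XP, YP} structure labels only via shared features; unpronounced Copies are invisible to Labeling.

```python
# Toy sketch of the Labeling assumptions adopted in the model. The Head
# and Phrase classes here are illustrative assumptions.

from dataclasses import dataclass, field

@dataclass
class Head:
    cat: str                                     # category, e.g. "v*", "T"
    unchecked: set = field(default_factory=set)  # e.g. {"uPerson"}
    is_root: bool = False                        # bare roots cannot label

@dataclass
class Phrase:
    head: Head
    shared: set = field(default_factory=set)     # checked/shared phi-features
    is_copy: bool = False                        # Copies are invisible

def can_label(h):
    return not h.is_root and not h.unchecked

def label(a, b):
    """Return the label of {a, b}, or None on a Labeling failure."""
    # Unpronounced Copies are invisible: the other element labels alone
    if isinstance(b, Phrase) and b.is_copy:
        b = None
    elif isinstance(a, Phrase) and a.is_copy:
        a, b = b, None
    if b is None:
        h = a.head if isinstance(a, Phrase) else a
        return h.cat if can_label(h) else None
    # {X, YP}: the head labels if it is categorized and complete
    if isinstance(a, Head) and isinstance(b, Phrase):
        return a.cat if can_label(a) else None
    if isinstance(b, Head) and isinstance(a, Phrase):
        return b.cat if can_label(b) else None
    # {XP, YP}: labeling succeeds only via shared features
    common = a.shared & b.shared
    return frozenset(common) if common else None

# Non-finite T whose uPerson has been checked labels alone next to a Copy:
print(label(Head("T"), Phrase(Head("n"), is_copy=True)))   # -> T
```

On this sketch, an uncategorized root paired with a phrase returns a failure, and two phrases label only when their shared feature sets overlap, mirroring the {XP, YP} cases discussed above.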
3.8 Box Theory
Chomsky (2024) proposes Box Theory, in which the traditional A-bar elements (wh-phrases, topicalized phrases) are essentially placed into a box structure that can be accessed by C (and possibly other functional heads associated with topic and focus). The derivation of (21) proceeds along the following lines. When v* is Merged, the lower v*P phase is complete. At this point, the elements within the v*P are no longer accessible. However, the wh-phrase what is in a Box, where the Box could be thought of as a structure that contains focused (A-bar) elements.19 When the matrix interrogative CQ is Merged, it looks into the Box and finds what. The wh-phrase what ends up being pronounced at the position of C, but it still remains in the Box. Crucially, what in its base position must be converted into a Copy, and so FormCopy must apply.
21
CQ John T John [v* buy what]
Timing of insertion into the Box is an issue that arises. Chomsky (2024) suggests that boxing is contingent upon IM, writing that segregation of a boxed element is “established by IM [Internal Merge], which carries the derivation from the propositional to the clausal domain,” and further that “we can think of the element E that is IM-ed to the phase edge as being put in a box, separate from the ongoing derivation D.” Chomsky appears to be proposing that boxing results from IM of a particular SO to the phase edge. Note that it is simpler to put an SO into the Box directly, without boxing being contingent on IM, rather than to apply IM to the SO and then box it. Furthermore, I also assume that IM of arguments is free (see Section 4). If IM to a phase edge results in boxing, then there could be overgeneration of boxed SOs.
Assuming that boxing happens without IM, it could be that a wh-phrase goes into the Box as soon as it is externally Merged with an SO, or it could be that it goes into the Box at the phase level. Also, when a phrase is accessed from the Box, its base position should be treated as a Copy. This means that FormCopy must apply. FormCopy could apply as soon as an SO goes into the Box, or as soon as the SO is accessed from the Box. Also, consider (22). In this case, CQ and the subject who are within the same phase, so it is not clear whether who has to go into the Box, as CQ should be able to access who without looking into the Box.
22
CQ who T who v* buy a house
In my implementation, I had to make decisions about the timing of the Box operation. From an implementational perspective, it is easier to put a wh-phrase into the Box as soon as possible, rather than to wait until a phase is complete, since waiting requires checking an already formed structure for wh-phrases (or other phrases that need to go into the Box). Thus, my model places a wh-phrase into the Box as soon as possible. Since the Box is assumed to exist, this applies to wh-subjects too. The model places wh-phrases in the Box and CQ can only see into the Box. FormCopy applies when CQ accesses a wh-phrase from the Box.20
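These implementational choices can be sketched as follows; the class and method names below are illustrative assumptions, not the model's actual code.

```python
# Sketch of the Box as implemented: a wh-phrase enters the Box as soon
# as it is Merged; only CQ can look inside; and FormCopy applies to the
# base position when the phrase is accessed. Names are assumptions.

class WhPhrase:
    def __init__(self, form):
        self.form = form
        self.base_is_copy = False   # set by FormCopy upon access

class Box:
    def __init__(self):
        self._items = []

    def put(self, wh):
        """Insert a wh-phrase as soon as it is externally Merged."""
        self._items.append(wh)

    def access(self, probe):
        """Only the interrogative C (CQ) may look into the Box."""
        if probe != "CQ" or not self._items:
            return None
        wh = self._items.pop()
        wh.base_is_copy = True      # FormCopy applies at access time
        return wh

box = Box()
what = WhPhrase("what")
box.put(what)                        # boxed at External Merge
print(box.access("C"))               # -> None (a declarative C cannot look in)
found = box.access("CQ")
print(found.form, found.base_is_copy)   # -> what True
```

The design mirrors the two decisions described above: boxing is immediate rather than delayed to the phase level, and FormCopy is triggered by access rather than by insertion.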
Assuming the existence of the Box, successive-cyclic wh-movement is potentially an issue. If an argument is in the Box, there is no reason for it to undergo IM to an intermediary position. However, there is evidence for successive-cyclic wh-movement. Some well-known evidence is the existence of partial wh-movement (McDaniel, 1989). For example, in German and Albanian, a wh-phrase can appear in an intermediary position and a wh-phrasal scope marker (or some type of question element) can appear in the relevant scope position, as shown in (23) and (24). In Malay, as shown in (25), a wh-phrase can move to the edge of a clause in which it does not have scope, and be interpreted with scope in a clause in which there is no overt wh-marker.
23
[Was1 | glaubst | du | [was1 | Hans | meint | [[mit wem]1 | Jakob t1 | gesprochen hat]] | (German) |
Wh | believe | you | Wh | Hans | thinks | with whom | Jakob | talked has | |
‘With whom do you believe that Hans thinks that Jakob talked?’ (Cheng, 2000, pp. 78–79) |
24
A | mendon | se | [kë1 | ka | takuar | Maria t1] | (Albanian) |
Q | think-2s | that | who-ACC | has | met | Mary | |
‘Who do you think that Mary met?’ (Turano, 1998, p. 163) |
25
Ali | memberitahu | kamu | tadi | [apa1 | (yang) | Fatimah | baca t1 ] | (Malay) |
Ali | told | you | just.now | what | that | Fatimah | read | |
‘What did Ali tell you just now that Fatimah was reading?’ (Cole & Hermon, 2000, p. 105) |
The presence of a wh-phrase in an intermediary position has generally been taken to indicate that a wh-phrase undergoes movement through intermediary positions. Further well-known evidence for successive-cyclic movement is the existence of complementizers that agree with wh-phrases. For example, Irish has a particular complementizer that appears in a non-interrogative embedded clause when a particular wh-phrase undergoes long-distance movement (McCloskey, 1979, 2001).
Box Theory can deal with successive-cyclic wh-movement as follows. An intermediary complementizer can access the Box at externalization, but not in the syntax. This means that in some languages and constructions, there is access by an intermediary C, of an element in the Box, and this has an influence on Spell-Out. For example, an element in the Box can be pronounced at an intermediary position without actually being in that position in the syntax. Chomsky (2024) writes that in partial wh-movement constructions, presumably such as (23)–(25) above, there is a Labeling violation in the embedded clause in which a wh-phrase appears, as the wh-phrase does not share features with the non-interrogative C.21 To deal with this issue, Chomsky writes that “the boxed wh-XP is accessed” in the intermediary position “and under Externalization, spelled out, but with no labelling problem since the phrase does not appear in the derivation.” Although a variety of issues regarding when and how a wh-phrase is accessed for externalization require further examination, this analysis can account for partial wh-movement facts. If this approach is correct, then a wh-phrase does not actually undergo successive cyclic movement, but it can be accessed successive-cyclically, subject to language-internal and cross-linguistic variation, at the point of externalization.
The notion of the Box is beneficial in the following ways. First, it becomes possible to transfer a phase as soon as a phase head is externally-Merged. There is no need for an escape hatch at the v*P phase edge, which was previously assumed to exist to account for A-bar movement of a wh-phrase (e.g., see Chomsky, 2001). When v* is Merged, the v* head and its complement can be transferred. Furthermore, when C is Merged, transfer of the CP can occur. Elements at the edge of a CP presumably are accessed via the Box. Under the traditional Phase Theory view, the edge of a phase is transferred separately from the rest of the phase. But the traditional view complicates transfer of a matrix CP. Under the traditional view, when a matrix CP is formed, first the complement of C is transferred and then the CP edge is transferred, thus requiring two transfer operations to occur at the edge of a CP. This is no longer necessary. When C is Merged, the complete CP can be transferred.
While issues remain regarding the exact definition and nature of the proposed box structure, from an implementational perspective, it has some advantages.
3.9 Arguments as NPs
Although the status of determiners is peripheral to this work, it is necessary to explain how they are implemented in this model. I assume that arguments are NPs and not DPs (contra the typical DP hypothesis of Abney (1987), and much following work). The view that arguments are NPs, and not necessarily DPs, has been suggested by Chomsky (2007), as well as by Van Eynde (2006), Bruening (2009), Oishi (2015), Bruening, Dinh, and Kim (2018), and Bruening (2020), among others.
If an argument is really a DP, then problems arise regarding phi-features. As Bruening (2009, p. 28) points out, transitive verbs select nominal arguments; they do not select for determiners. The typical approach in the Minimalist Program incorporates Agree relations between functional heads and arguments, in which unchecked phi-features (uPhi) on a probe such as T or v* form an Agree relation with the phi-features of an argument. If the phi-features are on a nominal head N that is contained within a DP, then there is a potential problem, because Agree(T, DP) would require T to see inside the DP to the NP (or would require the features of N to percolate up to the D head). It is simpler for T to simply Agree with the head of the NP.
Although determiners can show phi-feature agreement in many languages (e.g., in Romance languages, etc.), gender, person, and number are properties of nominals, not determiners. Number shows up on nouns (e.g., cat vs. cats). Person and gender show up on pronouns (e.g., I vs. you, he vs. she). Agreement between determiners and nominals can occur in languages, such as in (26)a–b from Spanish, but as Bruening (2009, p. 30) points out, “every element in the nominal phrase must agree with the head noun in gender and number,” suggesting that the core element in these phrases is the N, rather than a determiner, quantifier, or adjective.22
26
a. | todos | esos | lobos | blancos | b. | todas | esas | jirafas | blancas |
all.Masc | those.Masc | wolves | white.Masc.Pl | all.Fem | those.Fem | giraffes | white.Fem.Pl | ||
all those white wolves | all those white giraffes | ||||||||
(adapted from Bruening 2009, p. 30) |
In my model, I indicate an argument as an NP as shown in Figure 4.23 When there is a D, it is pair-Merged to the NP. Pair-Merge is indicated with a dotted arc. Given two SOs X and Y, when X is pair-Merged with Y, forming <X, Y>, X is less prominent than Y and generally not accessible to syntactic operations (see Chomsky, 2000, 2004). When the NP is Merged with another SO, the pair-Merged D is invisible to Agree relations.
Figure 4
NPs Pair-Merged With D
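The invisibility of the pair-Merged D to Agree can be rendered as a small sketch; the tuple-based encoding of pair-Merge below is an assumption made for illustration, not the model's actual representation.

```python
# Sketch: a pair-Merged determiner <D, NP> is invisible to Agree, so a
# phi-probe finds the features of the N head directly. The tuple
# encoding below is an illustrative assumption.

def agree_target(arg):
    """Return the phi-features a probe sees on an argument.
    arg is either ("pair", D, NP) for <D, NP> or ("np", phi_dict)."""
    if arg[0] == "pair":
        return agree_target(arg[2])   # skip the less prominent, pair-Merged D
    return arg[1]

book = ("np", {"Person": "3rd", "Num": "sg"})
the_book = ("pair", "the", book)      # <the, NP>: D pair-Merged to NP
print(agree_target(the_book))         # -> {'Person': '3rd', 'Num': 'sg'}
```

Because the probe recurses past the pair-Merged element, Agree(T, NP) goes through identically whether or not a D is present, which is the point of treating D as an adjunct to NP.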
4 Free Merge
Assume that IM is completely free, so that elements within an SO can be freely internally set-Merged to the root of the SO. This is untenable as an infinite number of possible SOs can be formed for every phrase and sentence.24, 25
If the discussions in the previous sections are correct, head movement is not a possible syntactic operation, and this greatly limits the Free Merge possibilities. For example, if head movement could apply freely, then illicit derivations such as those in Figure 5 would be possible; heads such as n, book, and will would be able to undergo IM. The resulting structures, however, would have to crash. Although structures of this sort could be ruled out as involving failures of Labeling and/or interpretation, generating them would involve a great deal of unnecessary and wasteful work. We can deal with this issue by simply assuming that head-movement (IM of heads) is not a possibility.
Figure 5
Illicit Derivations Involving Head Movement
Free Merge, if it exists, must apply to IM of arguments at the phase level only. Assuming Box Theory, only topicalized/focused arguments such as wh-phrases can escape from a phase, and escape is via the Box. Not permitting non-focused/topicalized arguments to escape greatly constrains Free Merge. Thus, I assume that Free Merge is limited by phase boundaries.
Given the constraints of the language module, as presented in this paper, it turns out that allowing Merge of arguments (NPs) to apply freely within a phase is not necessarily a problem. Ill-formed constructions can generally be ruled out as Labeling failures.
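The logic of this claim, generate argument placements freely within a phase and let Labeling filter the results, can be sketched as a toy search. The helper names, the encoding of positions, and the labels_ok predicate below are assumptions for illustration, not the model's actual machinery.

```python
# Toy sketch: Free (Internal) Merge of arguments within a phase,
# filtered by Labeling. Positions, names, and the labels_ok predicate
# are illustrative assumptions.

from itertools import product

def convergent_options(nps, sites, labels_ok):
    """nps: argument names; sites: candidate positions (including the
    base position); labels_ok(np, site) -> True iff the structure at
    that site labels. Returns the surviving placement choices."""
    survivors = []
    for choice in product(sites, repeat=len(nps)):
        if all(labels_ok(np, s) for np, s in zip(nps, choice)):
            survivors.append(dict(zip(nps, choice)))
    return survivors

# For "Tom read a book": Tom left in Spec-v*P yields an unlabelable
# {XP, YP} structure (no shared features), while Spec-TP labels via
# shared phi-features (the traditional EPP effect).
ok = lambda np, site: site == "Spec-TP"
print(convergent_options(["Tom"], ["Spec-v*P", "Spec-TP"], ok))
# -> [{'Tom': 'Spec-TP'}]
```

Only the placement in which every argument sits in a labelable position survives, which is the sense in which Labeling, rather than a stipulated constraint on Merge, rules out the ill-formed derivations discussed below.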
Consider the derivation of the simple statement in Figure 6. When v* is Merged, the lower v*P phase is transferred. Then the subject Tom is externally set-Merged. After Tense (past tense Tpast) is Merged, Tom undergoes IM with Tpast. After C is Merged, at the phase-level, FormCopy applies and converts the lower inscription of Tom into a Copy. The Spell-Out is computed as shown, whereby the frontier of the tree structure is converted via pronunciation rules (PF rules) into the correct output. Tpast and read combine to form the past tense read and functional elements such as v* and C are not pronounced.
Figure 6
Tom Read a Book
Note. Chomsky (2015, p. 10).
Next, consider the derivation of the wh-construction in Figure 7. When the v*P phase is completed, what is inside of the Box. The interrogative CQ is Merged, and then it looks into the Box and finds what. Importantly, what does not undergo IM to the CP. When what is accessed by CQ, FormCopy applies to what in its base position. Assume that FormCopy can still access the lower phase via the Box. FormCopy also converts the lower inscription of the subject Mary into a Copy; this FormCopy operation happens at the phase level. The frontier of the derivation is shown in Figure 7b. At Spell-Out, CQ forces Tense to combine with it, forming CQ+T, and T in turn forces the auxiliary will to combine with it, so the result is C-T-Aux. This is not movement in the syntax, but rather displacement in the pronunciation of lexical items, as discussed in Section 3.5 above.
Figure 7
What Will Mary Buy?
Note. Pesetsky and Torrego (2001, p. 369)
I next turn to crashed derivations that result from Free Merge. Two failed derivations (crashed derivations) of Tom read a book are shown in Figure 8. As Free Merge of nominals is permitted at the phase level, it is possible for the object a book to undergo IM with the SO headed by read, and it is also possible for the subject Tom to simply remain in its base position (Tom is free to not undergo IM). In each case, there are Labeling failures due to {XP, YP} structures that lack shared features.
Figure 8
Crashed Derivations of “Tom Read a Book”
In some cases, there can be a large number of crashed derivations. Consider the crashed derivations of (27) shown in Figure 9a–d. These result from IM of an argument to a position in which Labeling cannot occur: the passivized object the book does not undergo IM to the TP. In each derivation, there is a Labeling failure at the position in which the book has undergone IM, due to a lack of shared features; the results are unlabelable {XP, YP} structures.
27
The book will have been being read.
Figure 9
Labeling Failures for “The Book Will Have Been Being Read”
Next, consider What did John say that Mary will buy?, which contains long-distance wh-movement. A successful derivation is shown in Figure 10. This construction contains three phases. The verb say takes a clausal complement, but it is not an ECM verb, so I assume that it occurs with the non-phasal v (Chomsky, 2001), which does not Agree with an argument. After what is initially Merged, it is inserted into the Box. At the embedded CP phase level, after that is Merged, FormCopy applies to the lower inscriptions of Mary. When the matrix CQ is Merged, it looks into the Box and finds what. At Spell-Out, what is pronounced together with CQ, and Tpast is pronounced adjacent to CQ, resulting in the pronunciation of did.
Figure 10
What Did John Say That Mary Will Buy?
Note. Pesetsky and Torrego (2001, p. 370).
Given Free Merge, there are five crashed derivations of What did John say that Mary will buy?. All of these are shown in Figure 11. These crash because of unlabelable {XP, YP} structures. In Figure 11a, what undergoes IM with the SO headed by the root buy to form an unlabelable {XP, YP} structure. In Figure 11b, Mary remains in-situ, resulting in an unlabelable {XP, YP} structure because Mary and v* do not share features. In Figure 11c, Mary undergoes IM with the SO headed by will and remains in this position, resulting in an unlabelable {XP, YP} structure because Mary and will do not share features. In Figure 11d-e, the derivations crash in the matrix clause because John remains in its base position forming an {XP, YP} structure with v, with which it does not share features. These two derivations are almost identical except that Mary has undergone IM from v* to Tpres in the embedded clause in Figure 11d, whereas in Figure 11e, Mary undergoes IM to the SO headed by will before it lands in the TP.
Figure 11
What Did John Say That Mary Will Buy (Crashes)
I next turn to a typical Control construction, such as John tried to win. In this case, there are crucially two separate John arguments (John1 and John2) in the same phase, assuming that the lower non-finite TP is not a phase. A convergent derivation is shown in Figure 12. John1 is externally Merged in theta-position in the embedded clause. John2 is externally Merged with the matrix v in theta-position. Both John1 and John2 undergo IM to their respective TPs. In this case, FormCopy applies three times, explaining how John1 and John2 have the same referent but separate theta-roles.
Figure 12
John Tried to Win
Note. Chomsky (2021, p. 21).
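The effect of FormCopy at Spell-Out can be sketched as follows. This is a deliberate simplification, and the function name is hypothetical: in the theory, Copies are determined structurally at the phase level, not by mere identity of surface form as here. The sketch marks lower inscriptions whose form matches a higher one as Copies, leaving them unpronounced:

```python
# Hedged sketch of FormCopy: lower inscriptions matching a higher one
# (keyed here, as a simplification, by surface form) become unpronounced
# Copies. This is illustrative only, not the model's actual mechanism.

def form_copy(inscriptions):
    """inscriptions: (form, index) pairs, highest structural copy first.

    Returns (form, index, status) triples, where status is "pronounced"
    for the highest inscription of each form and "Copy" otherwise.
    """
    seen = set()
    tagged = []
    for form, idx in inscriptions:
        status = "Copy" if form in seen else "pronounced"
        tagged.append((form, idx, status))
        seen.add(form)
    return tagged
```

For John tried to win, the three lower inscriptions of John would be converted into Copies, so that only the highest is pronounced.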
Given Free Merge, for John tried to win, a number of potentially problematic situations arise involving IM of the “wrong argument” and involving multiple instances of John (multiple specifiers) in the same phrase edge.
In Figure 13, Tpast undergoes an Agree relation with John2 (not John1). The uPhi on Tpast probe and Agree with the phi-features on John2. Then John1 (from the embedded clause) undergoes IM to the matrix TP. This derivation appears strange, since the wrong John moves to TP. However, this is permitted if Merge is truly free.26
Figure 13
John Tried to Win (John1 in TP and John2 in vP)
One possibility is that this derivation in Figure 13 crashes because the phi-features of John1 and John2, although identical in terms of person, number, and gender, are treated differently because they are associated with separate arguments. This can be modeled with what I refer to as a Unique Feature Rule: John1 comes with iPerson:3rd1, iNumber:sg1, and iGender:masc1, where the final 1 is a unique feature identifier.27 John2 comes with person, number, and gender features that are identically valued to those of John1, but the unique feature identifier is 2 instead of 1, so the features are iPerson:3rd2, iNumber:sg2, and iGender:masc2. Utilizing this Unique Feature Rule, this derivation can be ruled out. The uPhi on Tpast Agree with the phi-features of John2. Then after John1 undergoes IM to the TP, Minimal Search finds the phi-features on Tpast and on John1, but they are treated as being different, due to the Unique Feature Rule. This is ruled out as a Labeling failure, shown in Figure 14a.
I also modeled this construction in my computer model without the Unique Feature Rule. When the Unique Feature Rule does not apply, this derivation converges, as shown in Figure 14b. FormCopy converts all lower instances of John into Copies, and only the highest John1 is pronounced. Minimal Search finds equally valued person, number, and gender features on John1 and on Tpast; it does not matter that Tpast has obtained these phi-features via agreement with John2 rather than John1. Crucially, if this derivation is permitted, there is no problem for Spell-Out: the correct John tried to win results.
Figure 14
Completed Derivations of ‘John Tried to Win’ (John1 in TP and John2 in vP)
Although the Unique Feature Rule may sound like an added, and possibly unnecessary, complexity, it is in fact necessary. Consider the derivation in Figure 15. Without the Unique Feature Rule, if Tom remains within the v*P and does not undergo IM to the matrix TP, Labeling would be possible within the v*P: the uPhi of v* are checked by the phi-features of Fred, and since the person, number, and gender features of Tom and Fred are identical, nothing blocks Labeling if features are not treated as unique.
Figure 15
Tom Saw Fred
Note. Labeling possible if features aren’t treated as unique.
In order to rule out superfluous Labeling as in Figure 15, the Unique Feature Rule, defined in (28), is required. Features that are valued the same way, but that are associated with different lexical items, are not treated as being identical by the language module.
28
Unique Feature Rule: Features associated with a particular lexical item are unique from identically valued features associated with a separate lexical item. (For example, iPerson:3rd1 of X is not identical to iPerson:3rd2 of Y.)
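The Unique Feature Rule can be stated computationally as a condition on feature identity. In the following sketch, which uses a hypothetical representation rather than the model's actual code, a feature carries a unique identifier of the lexical item that bears it, and two features match under the rule only if attribute, value, and identifier all coincide:

```python
# Sketch of the Unique Feature Rule (28). The representation is a
# hypothetical simplification, not the model's actual implementation.

from dataclasses import dataclass

@dataclass(frozen=True)
class Feature:
    attr: str      # e.g. "iPerson"
    value: str     # e.g. "3rd"
    item_id: int   # unique identifier of the lexical item bearing it

def features_match(f, g, unique_feature_rule=True):
    """Under the rule, iPerson:3rd of John1 != iPerson:3rd of John2."""
    same_value = f.attr == g.attr and f.value == g.value
    if unique_feature_rule:
        return same_value and f.item_id == g.item_id
    return same_value
```

With the rule switched off, identically valued features on distinct items match, which is what permits the superfluous Labeling in Figure 15.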
The derivations in Figure 16a–b below involve what would traditionally be referred to as multiple specifiers. In Figure 16a, John1 is initially Merged in theta-position in the non-finite clause. Then John1 undergoes IM to the matrix vP theta position, followed by EM of John2. Assume that there are no problems for theta-role assignment, in accord with Theta Theory (Chomsky, 1981), so John2 is able to obtain a theta role.28 Then John1 undergoes IM to the TP. Figure 16b is similar. In this case, John2 is successfully Merged in matrix theta position. Then John1 undergoes IM to the vP. John2 undergoes IM to the TP, but Tpast Agrees with the closest NP that it c-commands, John1. In both of these derivations, Tpast Agrees with a different John from the John that appears in the TP. These derivations are ruled out by the Unique Feature Rule, so that the phi-features of John2 are treated as being different from the phi-features of John1, as shown in Figure 17.
Figure 16
John Tried to Win (Multiple Specifiers)
Figure 17
Crashed Derivations: Unique Feature Rule Applies
Free Merge needs to be constrained to prevent multiple instances of IM of identical arguments within a single phase. For example, if Merge of an argument is completely free within a phase, then derivations such as in Figure 18 will arise in which the same argument undergoes IM multiple times with the root of the SO. Thus, it is necessary to prevent an argument from being successively remerged. Note that given FormCopy, all of these could potentially converge with the correct output.
Figure 18
Derivations With Successive Applications of IM for ‘John Tried to Win’
To block derivations such as these, which can potentially result in infinite loops, there needs to be a rule that bans consecutive applications of IM: after one application of IM of an argument, the next operation cannot be IM. With this ban in place, structures such as those in Figure 18 cannot be generated. Thus, I will assume that the following constraint, No Successive IM, holds.
29
No Successive IM: *IM IM (An IM operation cannot directly follow another IM operation.)
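The constraint in (29) amounts to a simple well-formedness condition on sequences of derivational operations. A minimal sketch follows, assuming operations are recorded as "EM"/"IM" labels (a simplification of the model's actual bookkeeping):

```python
# Sketch of No Successive IM (29): a sequence of derivational operations
# is licit only if no IM immediately follows another IM. The "EM"/"IM"
# labels are a hypothetical simplification of the model's bookkeeping.

def obeys_no_successive_im(ops):
    """ops is a sequence of operation labels, e.g. ["EM", "IM", "EM"]."""
    return all(not (a == "IM" and b == "IM")
               for a, b in zip(ops, ops[1:]))
```

A derivation containing the subsequence IM IM, as in Figure 18, is filtered out; any alternation of EM and IM is permitted.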
Note that if (29) holds and successive applications of IM are not permitted, then constructions in which there are consecutive applications of IM should not appear in language. Whether or not this is truly the case is an open question. Some languages have multiple wh-fronting (e.g., Bulgarian, Serbo-Croatian) that could potentially be formed by multiple applications of IM (e.g., see Boeckx & Grohmann, 2003; Bošković, 2002; Rudin, 1988), as in the following examples.
30
31
Box Theory offers an explanation. If these arguments are actually in the Box, from where they are accessed, they are not treated like typical arguments that are set-Merged with the core SO. Thus, their presence may be permitted at Spell-Out, with language-related idiosyncrasies that are beyond the scope of this work. They are pronounced together, but they do not actually involve consecutive applications of IM.
If the arguments in this paper are correct, Free Merge of arguments can generally be constrained by Labeling, but Free Merge also produces multiple convergent derivations for target constructions, which I turn to next.
5 Overgeneration
The main problem that Free Merge raises is that of overgeneration. Given Free Merge of arguments, a large number of crashed derivations can occur. Furthermore, a single construction can have multiple convergent derivations. As discussed in the previous sections, I used a computer model to implement Free Merge of arguments within a particular phase. The model also incorporates the Unique Feature Rule, which requires features associated with a particular argument to be uniquely identified, and the No Successive IM rule, which blocks consecutive applications of IM. The numbers of crashed and convergent derivations for each target construction generated by the model are shown in Table 3–Table 6. All complete crashed and convergent derivations are available in the Supplementary Appendix (see Ginsburg, 2024).
Table 3
Basic Statements
Example | Sentence | Crash | Converge
---|---|---|---
1 | Tom saw Fred. | 2 | 1
2 | Tom read a book. (Chomsky, 2015, p. 10) | 2 | 1 |
3 | He thinks that John read the book. | 3 | 1 |
4 | Mary arrived. | 3 | 5 |
5 | Tom will read a book. | 3 | 2 |
6 | The book was read. | 7 | 9 |
7 | John expects Mary to arrive. | 16 | 5 |
8 | Mary thinks that Sue will buy the book. (Pesetsky & Torrego, 2001, p. 357, originally from Stowell, 1981) | 5 | 2 |
9 | She was reading a book. | 3 | 2 |
10 | She had read a book. | 3 | 2 |
11 | She has been reading a book. | 5 | 4 |
12 | She will have been reading a book. | 9 | 8 |
13 | The book was being read. | 15 | 17 |
14 | The book had been being read. | 31 | 33 |
15 | The book will have been being read. | 63 | 65 |
Table 4
Control Constructions
Example | Sentence | Crash | Converge |
---|---|---|---|
1 | John tried to win. (Chomsky, 2021, p. 21) | 24 | 12 |
2 | John tried to finish the work. | 25 | 12 |
3 | Emily forgot to do the homework. | 25 | 12 |
4 | Emily will have forgotten to do the homework. | 217 | 108 |
Table 5
Wh-Questions
Example | Sentence | Crash | Converge |
---|---|---|---|
1 | Who do you expect to win? (Chomsky, 2015, p. 10) | 4 | 1 |
2 | What will Mary buy? (Pesetsky & Torrego, 2001, p. 369) | 3 | 2 |
3 | What did Mary buy? (Pesetsky & Torrego, 2001, p. 357) | 2 | 1 |
4 | Who bought the book? (Pesetsky & Torrego, 2001, p. 357) | 2 | 1 |
5 | Bill asked what Mary bought. (Pesetsky & Torrego, 2001, p. 378) | 3 | 1 |
6 | What did John say that Mary will buy? (Pesetsky & Torrego, 2001, p. 370) | 5 | 2 |
7 | *Who do you think that read the book? (Chomsky, 2015, p. 10) | 3 | 1 |
8 | Who do you think read the book? (Chomsky, 2015, p. 10) | 3 | 1 |
9 | What do you think that John read? | 3 | 1 |
10 | What do you think John read? | 3 | 1 |
11 | What did John say Mary will buy? (Pesetsky & Torrego, 2001, p. 370) | 5 | 2 |
12 | *Who did John say that will buy the book? (Pesetsky & Torrego, 2001, p. 371) | 5 | 2 |
13 | Who did John say will buy the book? (Pesetsky & Torrego, 2001, p. 371) | 5 | 2 |
Table 6
Yes/No Questions
Example | Sentence | Crash | Converge |
---|---|---|---|
1 | Will Tom read a book? | 3 | 2 |
2 | Does Tom read a book? | 2 | 1 |
3 | They asked if the mechanics fixed the cars. (Chomsky, 2013, p. 41) | 3 | 1 |
4 | Will the book have been being read (by the students)? | 63 | 65 |
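Counts such as those in Table 3–Table 6 can in principle be tallied by enumerating all operation sequences permitted by the constraints and classifying each completed derivation as convergent or crashed. The following schematic sketch illustrates the bookkeeping; the enumeration over bare "EM"/"IM" labels and the convergence test are hypothetical stand-ins for the model's actual Merge, Agree, and Labeling machinery:

```python
# Schematic sketch of how crash/converge tallies might be gathered.
# The sequence enumeration and the convergence predicate are hypothetical
# stand-ins for the model's actual Merge/Agree/Labeling machinery.

from itertools import product

def op_sequences(length):
    """All EM/IM sequences of a given length that obey No Successive IM."""
    for ops in product(("EM", "IM"), repeat=length):
        if all(not (a == b == "IM") for a, b in zip(ops, ops[1:])):
            yield ops

def tally(derivations, converges):
    """Classify each completed derivation as convergent or crashed."""
    counts = {"converge": 0, "crash": 0}
    for d in derivations:
        counts["converge" if converges(d) else "crash"] += 1
    return counts
```

For length 3, five of the eight possible EM/IM sequences survive No Successive IM; in the real model, of course, convergence is decided by Labeling at the phase level rather than by any property of the operation labels themselves.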
The question arises of whether or not it is reasonable for there to be multiple crashed derivations for a single construction. For example, (32), repeated from (27) above, has 63 crashed derivations.
32
The book will have been being read.
The ideal model would most likely be one that does not generally produce crashed derivations. That said, if Merge is free, then derivations of this sort can be generated; given Labeling, though, they generally crash, which is the desired result.
The second issue related to overgeneration concerns well-formed derivations. Many (although not all) of the constructions in Table 3–Table 5 have more than one convergent derivation. For example, for Tom will read a book there are two possible convergent derivations, which differ with respect to how many times the surface subject Tom has undergone IM. Since Merge within a phase is free, there is no need for successive-cyclic movement, and thus nothing blocks these multiple derivations. In Figure 19a, the subject Tom undergoes IM to Tpres. In Figure 19b, Tom undergoes IM to will and then to Tpres. Both of these options are possible, and the result is the same well-formed output. In the latter case, there is no problem for the structure with respect to Labeling. Since Tom undergoes further IM, it is not visible to Labeling of {Tom, will Tom read a book}, and an unlabelable {XP, YP} structure does not result.
Figure 19
Multiple Derivations of ‘Tom Will Read a Book’
Four possible derivations (out of 65) for The book will have been being read are shown in Figure 20. In Figure 20a, the book undergoes IM to Psv, Prog, and Tpres. In Figure 20b, it undergoes IM to Psv, will, and Tpres. In Figure 20c, the book undergoes IM to Perf and Tpres. In Figure 20d, it undergoes IM to read, will, and Tpres.
Figure 20
Multiple Derivations of ‘The Book Will Have Been Being Read’
Six successful derivations (out of 12) of John tried to win are shown in Figure 21. Note that the lower John1 can remain in-situ in theta-position, or it can undergo IM to a higher position. FormCopy converts John1 into a Copy, so there are no Labeling problems in these positions. Both John1 and John2 are Merged in theta-positions and FormCopy results in only John2 being pronounced. The two derivations in Figure 21e–f involve IM of both John1 and John2 with v. Since Tpast Agrees with the same inscription of John that is present in the TP, Labeling is possible (without violating the Unique Feature Rule). None of these derivations cause problems for Labeling, and all converge successfully.
Figure 21
Multiple Derivations of ‘John Tried to Win’
Given Free Merge, successive-cyclic movement is not required. Evidence from quantifier stranding, however, indicates that an argument can internally set-Merge in intervening positions before arriving in subject position. The quantifier all can be stranded, as in (33)b–c, which can be accounted for if all the children raises through a VP-internal position. Assuming that arrive is unaccusative, all the children should originate as the complement of arrive. In particular, in (33)c, if the adverbial quickly is within the VP, then all must also be in a VP-internal position.29
33
All the children arrived.
The children all arrived.
The children quickly all arrived.
McCloskey (2000) gives the following examples with quantifier stranding from West Ulster English, which support the idea that there is internal set-Merge (successive-cyclic IM) of an argument in intervening positions.30
34
What all did he say that he wanted to buy?
What did he say all that he wanted to buy?
What did he say that he wanted all to buy?
What did he say that he wanted to buy all? (McCloskey, 2000, p. 62)
These examples demonstrate that an argument can undergo IM in intermediary positions. A stranded quantifier is an indication that IM has occurred. However, these examples are compatible with the notion that IM can, but need not, occur in intermediary positions, which is what my model predicts. When there is a stranded quantifier, IM in intervening positions has occurred. When there is no stranded quantifier, IM may or may not have occurred.
A further issue raised by this Free Merge model is that it erroneously predicts a few ill-formed derivations to converge, contrary to fact.
In most cases, Labeling is sufficient to account for the general requirement that subjects appear in the TP in English. Consider the derivation of She was reading a book. In the convergent derivation shown in Figure 22, she undergoes IM to Tpast, and shared phi-features label. If the subject she does not move to the TP, the derivation will crash. For example, in Figure 23a, the subject she remains in-situ in the v*P and in Figure 23b, the subject undergoes IM with Prog. In both cases, the result is an unlabelable {XP, YP} structure which crashes due to a lack of shared features.
Figure 22
She Was Reading a Book
Figure 23
She Was Reading a Book (Crashed Derivations)
There are issues, however, with unaccusative and passive constructions. Given the standard assumption that the surface subject of an unaccusative or passive originates as an object, derivations in which the object remains in situ are not necessarily ruled out. The model incorrectly generates arrived Mary, shown in Figure 24, was read the book in Figure 25, and John expects to arrive Mary in Figure 26. Crucially, all of the other convergent derivations for these examples result in the well-formed output (4 other derivations converge as Mary arrived, 8 as The book was read, and 4 as John expects Mary to arrive). Thus, the model does produce the correct derivations most of the time.
Figure 24
*Arrived Mary (Mary Arrived)
Figure 25
*Was Read the Book (The Book Was Read)
Figure 26
*John Expects to Arrive Mary (John Expects Mary to Arrive)
In Figure 24–Figure 26, T probes and Agrees with the underlying object (which is the closest argument), and T’s uPhi are checked. Then T is not part of an {XP, YP} structure, so if it labels, it must label by itself. As discussed in Section 3.1, Chomsky deals with the EPP requirement for an overt subject in English by relying on strength, with the proposal that English T is too weak to label by itself, and so it requires an {XP, YP} configuration for Labeling. If T is weak and is stipulated to require an {XP, YP} structure for Labeling, then these derivations are correctly ruled out. However, strength is an unclear stipulation, which I do not adopt. See Section 3.7 above. In my model, non-finite T may or may not have an overt specifier. In the derivation for John tried to win, shown in Figure 27a, John1 remains in-situ (although it can also raise to toT, where it will be converted into a Copy). In the derivation of John expects Mary to arrive shown in Figure 27b, Mary appears in the non-finite clause forming an {XP, YP} structure with toT (where shared Person features label). If non-finite T were weak, then toT would always require an overt “specifier”, contrary to fact.
Figure 27
Non-Finite T in Control Constructions
The simplest assumption is that there are no strong or weak heads in the syntax, thus suggesting that Labeling Theory does not provide an explanation for the need for a subject to be in the traditional specifier of TP position (at least not in certain cases). I do not have a clear solution to this issue (which is the long-debated problem of the EPP), but one possibility is that the requirement for an overt subject in languages such as English is primarily a constraint on Spell-Out.
Richards (2016) develops what he calls Contiguity Theory, which takes the position that movement operations can be influenced by phonological structures, so that syntax and phonology are heavily connected. Richards proposes that in English, T is a suffix that must follow a metrical boundary, and a subject in the TP provides this metrical boundary. A metrical boundary is the edge of a metrical foot, where a foot contains one or more syllables, one of which receives more stress than the others. In a language such as Spanish, which does not require an overt subject, the vowel that precedes a tense morpheme is stressed, and thus the tense morpheme itself follows a metrical boundary. In Spanish, a metrical boundary can occur within a word. For example, in (35)a–b, the boldfaced tense morphemes follow a metrical boundary at the end of the verbal root.
35
cantá-is (Spanish)
sing-2pl
‘you (pl.) sing’
canta-rí-a-is (Spanish)
sing-Fut-Past-2pl
‘you (pl.) would sing (conditional)’ (Richards, 2016, p. 12)
Richards (2016, p. 15) writes that in languages such as English, “metrical boundaries occur only on complete words, which are in turn found in specifiers.” For example, in (36), there is supposedly a metrical boundary at the edge of the subject there, which precedes the verb containing the tense morpheme suffix -ed.
36
There arrived a man. (Richards, 2016, p. 22)
Note that Richards argues that phonological constraints have specific effects on a syntactic derivation, and not necessarily only at Spell-Out, writing “the narrow syntax can make reference to, for instance, metrical boundaries” (Richards, 2016, p. 27). Consider how this approach can deal with examples in which T does not overtly follow a specifier at Spell-Out. In (37)a–c, I assume that T raises at Spell-Out. But in the syntactic structure, T still follows the subject, which has a metrical boundary. If the requirement for the affix T to follow a metrical boundary applies at the level of narrow syntax, then these can be accounted for.
37
Will Tom read a book?
Does Tom read a book?
Who did John say will buy the book? (Pesetsky & Torrego, 2001, p. 371)
Contiguity Theory, as developed by Richards (2016), is complex, and further examination of whether or not it can truly account for EPP effects is warranted, but it may be a promising approach.31
To summarize, my model potentially predicts that an argument in an unaccusative or passive construction need not move, contrary to fact. I suggested that there might be a Spell-Out based solution, but if not, then the general problem of the EPP still remains. I leave an in-depth analysis of this for future work. I also note that in the vast majority of cases, my model correctly generates the target Spell-Out.
Another remaining issue with my model is that the interaction between Free Merge and Labeling cannot account for the that-trace effect. Consider the sentence with the that-trace effect in Figure 28. When the embedded CP is completed, the wh-phrase who is in the Box. In this example, who and the Tpast phrase form a labelable {XP, YP} structure. When the matrix CQ is Merged, CQ accesses the Box and the wh-phrase is pronounced in clause-initial position.32 Thus, this is predicted to be well-formed, contrary to fact.
Figure 28
*Who Do You Think That Read the Book?
I suggest that the ill-formedness of that-trace effect constructions may have to do with extra-syntactic factors applying at Spell-Out. First of all, when that is not pronounced, the that-trace effect goes away. Chomsky (2015) relies on dephasing to account for this (in the absence of that, the embedded CP is no longer a phase), but dephasing is an extraordinarily complex process that also violates the No Tampering Condition. A simpler assumption is that the that-trace effect depends simply on whether or not C is pronounced overtly. A promising possibility is that the that-trace effect is not syntactic, but has to do with phonological factors. Sato and Dobashi (2016) propose that the that-trace effect results from constraints on prosodic phrasing. They propose a “PF condition” according to which “[f]unction words cannot form a prosodic phrase on their own” (2016, p. 1); when that is followed by a trace, that ends up forming a prosodic phrase by itself, which is not permitted. When that is followed by a subject, or by another type of phrase such as an adverbial, it does not form a prosodic phrase on its own, and there is no problem. It is also notable that the that-trace effect is absent in certain English dialects as well as in many other languages. For example, Sobin (1987) points out that some English speakers do not find certain that-trace constructions to be ill-formed. This suggests that the cause of the that-trace effect may not be syntactic in nature. If the that-trace effect lacks a syntactic cause, then it needs to be accounted for at Spell-Out. I leave in-depth examination of this issue for future work.
6 Conclusion
In this paper, I have discussed Free Merge as implemented by a computer model. I presented the basic components of this model, which attempts to take a “simple” approach (although not necessarily as simple as possible) to language generation, dispensing with complex mechanisms. I have shown that, in general, Labeling, combined with phase boundaries, is sufficient to constrain Free Merge; notably, Theta Theory and Case Theory play no clear role in ruling out derivations.
A certain amount of overgeneration remains a potential problem. In order to deal with overgeneration, I needed to propose the Unique Feature Rule and No Successive IM. If these are truly principles of language, they require further examination. Overgeneration of ill-formed structures ideally should not occur, or should be limited even more severely than in the model presented in this paper. Overgeneration of well-formed structures is also an issue, but the potentially problematic examples discussed in this paper can possibly be eliminated at Spell-Out. The beauty of Free Merge is that IM requires no trigger, and a variety of attested IM operations fall out from the model. Note, however, that feature-driven IM has the advantage of doing away with this overgeneration problem, although it is complicated by the need for a variety of features to trigger IM. Whether or not feature-driven Merge should truly be eliminated from the theory requires further examination.