Reviews
Chomsky, N., Seely, T. D., Berwick, R. C., Fong, S., Huybregts, M. A. C., Kitahara, H., . . . Sugimoto, Y. (2023). Merge and the Strong Minimalist Thesis. Cambridge University Press. ISBN 9781009343244.

Review of Merge and the Strong Minimalist Thesis

Elly van Gelderen*1

Biolinguistics, 2024, Vol. 18, Article e14525, https://doi.org/10.5964/bioling.14525

Received: 2024-04-30. Accepted: 2024-05-14. Published (VoR): 2024-06-21.

Handling Editor: Patrick C. Trettenbrein, Max Planck Institute for Human Cognitive and Brain Sciences & University of Göttingen, Germany

*Corresponding author at: Ross-Blakley Hall, PO Box 871401, Tempe, AZ 85287-1401, USA. E-mail: ellyvangelderen@asu.edu

This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

This review first provides a summary of the central ideas in Merge and the Strong Minimalist Thesis and then presents a discussion of the more controversial points. The book offers an introduction to the Minimalist Program. The focus is on Merge, which plays a central role in the Faculty of Language because it is “the primary structure-building device of the syntax” (p. 2). The book clarifies the status of Theta Theory, Search, and Workspace, and provides a novel account of passives and obligatory control.

Keywords: Minimalism, Merge, Search, Theta Theory, Form Copy

Merge and the Strong Minimalist Thesis provides an excellent and accessible discussion of current ideas in the Minimalist Program. As the title suggests, it focuses on Merge, which plays a central role in the Faculty of Language because it is “the primary structure-building device of the syntax” (p. 2). The book is largely based on Chomsky (2021) and clarifies earlier issues, e.g. the status of Theta Theory, deletes some concepts, e.g. determinacy, and introduces new terms, e.g. inscription. In this review, I’ll first give a summary of the book and then present a discussion of the more controversial points.

Merge and the Strong Minimalist Thesis (hence MSMT) is divided into eight sections. Section 1 (pp. 1–5) introduces the basic properties of language: From a finite set of words, speakers can form an infinite set of sentences. It also introduces the term ‘inscription,’ where ‘occurrence’ had been used previously: “lexical items are drawn as needed from the Lexicon and inscriptions of them are available for computation” (p. 2), and returns to the notion of Workspace (WS), which it defines as “the set consisting of the material available for computation at a given derivational stage” (p. 2). Although the WS is a set, normally indicated by curly brackets, i.e. {}, square brackets, i.e. [], are used to show the WS, to avoid confusion with regular sets (p. 3). Merge is defined in a more complete way in MSMT: Where earlier work defined it as in (1), the current book suggests (2), among other definitions, and a sample, partial derivation using Merge is shown in (3).

1

Select two lexical items α and β and form the set {α, β} in a workspace.

(Chomsky et al., 2019)

2

Merge
(i) ‘looks inside’ the WS that it is applying to,
(ii) targets material within the WS,
(iii) builds from that targeted-material an object (i.e., it builds a nonatomic structure), which is now
(iv) a new object within the WS, thereby modifying the WS. (p. 3)
3

WS = [the, see, I, child]

WS’ = [see, I, {the, child}]

WS’’ = [I, {see, {the, child}}]

WS’’’ = [{I, {see, {the, child}}}]
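For readers who like to see the mechanics spelled out, the stepwise workspace updates in (3) can be mimicked in a few lines of code. The sketch below is my own illustration, not the authors’ formalism: the WS is modeled as a Python list, syntactic objects as frozensets, and (External) Merge removes its two targets and adds the set built from them.

```python
def merge(ws, p, q):
    """External Merge: remove the targets p and q from the WS and add
    the newly built set {p, q} (modeled here as a frozenset)."""
    new_ws = list(ws)
    new_ws.remove(p)
    new_ws.remove(q)
    new_ws.append(frozenset([p, q]))
    return new_ws

ws = ["the", "see", "I", "child"]     # WS    = [the, see, I, child]
ws = merge(ws, "the", "child")        # WS'   = [see, I, {the, child}]
ws = merge(ws, "see", ws[-1])         # WS''  = [I, {see, {the, child}}]
ws = merge(ws, "I", ws[-1])           # WS''' = [{I, {see, {the, child}}}]
```

Note that after each step the targets are no longer separate members of the WS, which anticipates the discussion of Resource Restriction in Section 4.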

Clarified in this section and later on is the notion that Merge is not free but constrained by “general principles, external and internal to language” (p. 4). Economy and Theta Theory are among these constraints.

Section 2 (pp. 6–13) reiterates the biolinguistic nature of the Generative Enterprise: Language is understood as a part of the brain in the same way that other cognitive components are and functions to build hierarchical structures from lexical items. Merge is core to those structures. Providing an explanation rather than just a description is also crucial to Generative Grammar and the three factors (genetic endowment, experience, and general principles) as well as the Strong Minimalist Thesis (SMT) are helpful to that attempt. The SMT is understood as requiring the simplest operations and as seeing the Faculty of Language as an optimal solution to language-specific conditions, such as Theta Theory. Since Chomsky (1981), Universal Grammar has included Theta Theory—more below.

Section 3 (pp. 13–27) offers more detail on the operation Merge, and an alternative to the definition in (2) is formulated as (4).

4

Merge

  • targets elements, P1, … Pm, within WS

  • puts the targeted-elements into a set, {P1, …, Pm} (expressing that the elements are in a relation with each other by virtue of being members of the same set, traditionally a ‘phrase’), and

  • the newly created set is now a member of the WS, which is then available for further computation. (p. 14)

Merge has to be simple because it had little time to evolve in humans, in keeping with the SMT, and that dictates answers to a number of questions, such as: How many elements are merged, how are the elements to be merged selected, and do the targets stay in the WS? If less computation is better, Merge will be binary. Evidence for this is given using adjectival scope: In ‘the third yellow house’, ‘third’ has scope over ‘yellow house’, thus showing the binarity of Merge. The possible target of Merge is constrained by Minimal Search (p. 19) and, here, the distinction between member and term becomes relevant, as does, as I understand it, the difference between Least Search and Search (p. 19). Let’s say the WS has reached the stage of (5).

5

WS = [a, {b, c}]

Members of the WS are a and {b, c} but b and c are terms. “Merge will first locate P, where P can be any member” (p. 19) by means of Least Search. Locating b or c would require “more computation (a more extended Search)” (p. 19). However, the second step, i.e. searching for the second element of Merge, allows two options (p. 20): (a) Search looks inside P and finds a term of the WS or (b) Least Search locates a member of the WS. That gives us Internal Merge and External Merge, respectively. In my comments later on, I discuss how I understand this description as two different Search options, which is apparently (Daniel Seely, p.c.) not what MSMT argues.
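The member/term distinction, and the extra computation a fuller Search requires, can be made concrete with a small sketch (my own illustration; the helper `terms` is hypothetical, not from the book):

```python
def terms(obj):
    """A syntactic object together with, recursively, the terms of its members."""
    out = [obj]
    if isinstance(obj, frozenset):
        for m in obj:
            out.extend(terms(m))
    return out

bc = frozenset(["b", "c"])
ws = ["a", bc]                                   # (5): WS = [a, {b, c}]

least_search = list(ws)                          # members only: a and {b, c}
full_search = [t for m in ws for t in terms(m)]  # also reaches the terms b and c
```

Least Search stops at the members a and {b, c}; reaching b or c requires the extra recursive step into {b, c}.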

An illustration of the interplay between External Merge (EM) and Internal Merge (IM) then follows in the derivation of the passive, as in (6). In (6a), the inscriptions have been entered into the WS and ‘the’ and ‘apple’ have merged through EM. The next two stages, (6b) and (6c), also involve EM but the last stage, (6d), takes a term in the WS and merges it with the extended verbal phrase, through IM.

6
  a. WS = [eaten, {the, apple}]

  b. WS’ = [{eaten, {the, apple}}] (p. 20)

  c. WS = [{was, {eaten, {the, apple}}}]

  d. WS’ = [{{the, apple}, {was, {eaten, {the, apple}}}}] (p. 21)

The inscriptions of {the, apple} in (6d) are structurally identical and are considered as such by Preservation. Preservation is not a syntactic operation and must therefore “be able to ‘scan’ each derivational step to be sure that an inscription has not changed interpretation” (p. 22), i.e. it can see both the input and output to know that the two inscriptions of {the, apple} are identical and are thus copies of each other.
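If syntactic objects are modeled as sets, the structural identity of the two inscriptions of {the, apple} in (6d) can be checked directly. A sketch of my own (not the book’s formalism):

```python
the_apple = frozenset(["the", "apple"])
vp = frozenset(["eaten", the_apple])       # (6b): {eaten, {the, apple}}
aux_vp = frozenset(["was", vp])            # (6c): {was, {eaten, {the, apple}}}

# (6d) Internal Merge: a term of the sole member, {the, apple}, is merged
# with that member; the result contains two structurally identical inscriptions.
clause = frozenset([the_apple, aux_vp])

assert the_apple in clause                 # the higher inscription
assert the_apple in vp                     # the lower one: identical, hence a copy
```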

Identical inscriptions in the WS can also result in distinct interpretations, as in ‘Many people praised many people’, in which case the inscriptions are repetitions. The difference between copies and repetitions is captured by Form Copy (FC) for copies and a default interpretation for repetitions (i.e. by default, inscriptions are repetitions; p. 24). The application of FC determines the difference at the stage where the interpretation is handed over to the interfaces: If it applies, we have a copy, and, if it doesn’t, we have a repetition. What then causes FC to apply in the case of passives but not in the case of repeated phrases? The answer is Theta Theory: The verb in a passive only ‘wants’ one real NP but the verb in a transitive ‘wants’ two NPs.

Section 4 (pp. 27–30) presents a short discussion of Resource Restriction (RR), i.e. what happens to the targets of Merge: Do the targets of Merge, namely, a and {b, c}, remain as members of the WS after the merge operation? So, is the output WS’ [a, {b, c}, {a, {b, c}}], as in (7), or is the output [{a, {b, c}}], as in (5)?

7

WS’ = [a, {b, c}, {a, {b, c}}]

The answer is that the output is (5) because the “computational system seeks to minimize resources” (p. 28). RR is based on work by Fong (2021) and others showing that brains cannot make use of all sensory input and need to restrict it. Another way of formulating RR is as Minimal Yield (MY): “Merge yields the fewest possible new terms that are accessible to further operations” (p. 29). Thus, (7) violates MY but (5) does not.
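Minimal Yield can be illustrated by counting accessible terms before and after Merge. In this sketch (my own encoding, not the authors’), the output [{a, {b, c}}] adds a single new term, the newly created set itself, whereas keeping the targets as in (7) adds several:

```python
def count_terms(ws):
    """Number of accessible terms in a workspace."""
    def terms(obj):
        n = 1
        if isinstance(obj, frozenset):
            n += sum(terms(m) for m in obj)
        return n
    return sum(terms(m) for m in ws)

bc = frozenset(["b", "c"])
before = ["a", bc]                     # WS = [a, {b, c}]: 4 terms
good = [frozenset(["a", bc])]          # [{a, {b, c}}]: 5 terms, only 1 new
bad = ["a", bc, frozenset(["a", bc])]  # keeps the targets as well: 9 terms

assert count_terms(good) - count_terms(before) == 1   # respects Minimal Yield
assert count_terms(bad) - count_terms(before) > 1     # violates Minimal Yield
```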

Section 5 (pp. 30–36) further explores the language-specific conditions (LSC) on Merge. Like Chomsky (2021, p. 13), MSMT considers Theta Theory as language-specific and therefore as a First Factor principle. Theta Theory requires one argument for one theta role and vice versa. At the conceptual-intentional (CI) interface, sentences such as ‘Juan sleeps the building Tom’ and ‘Juan put’ are not interpretable. This section also returns to the Duality of Semantics, namely that EM builds argument structure and that IM is relevant to scope and discourse-related phenomena.

Finally, this section shows how FC derives obligatory control. The derivation of control has been problematic, with two basic analyses: Either with a PRO subject that receives its theta role independent of that of the subject of the main clause, as in (8a), or as a raising analysis, as in (8b), where one argument can bear more than one set of theta-features. The PRO analysis is pursued in e.g. Chomsky (1981) and the raising one in Hornstein (1999).

8
  a. The man tried PRO to read a book.

  b. The man tried <the man> to read a book.

In MSMT, an alternative is developed for the obligatory control structure in (9a), which starts out as (9b), built entirely by External Merge, followed by (9c). The next step is to externally merge a new inscription of {the, man}, as in (9d).

9
  a. The man tried to read a book.

  b. {{the, man}, {read, {a, book}}}

  c. {tried, {to, {{the, man}, {read, {a, book}}}}}

  d. {{the, man}, {tried, {to, {{the, man}, {read, {a, book}}}}}} (p. 35)

After (9d), FC applies “with no knowledge of how [9] was constructed” (p. 35) and the lower NP is considered a copy and is not pronounced when the sentence is externalized. The claim is that “FC applies in accord with” Theta Theory (p. 36).
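That FC needs “no knowledge of how [9] was constructed” can be illustrated: given only the final object in (9d), a scan of its subterms finds the two structurally identical inscriptions of {the, man}. A hypothetical sketch of my own:

```python
the_man = frozenset(["the", "man"])
embedded = frozenset([the_man, frozenset(["read", frozenset(["a", "book"])])])  # (9b)
tried = frozenset(["tried", frozenset(["to", embedded])])                       # (9c)
clause = frozenset([the_man, tried])                                            # (9d)

def subterms(obj):
    """All subterms of a syntactic object."""
    out = [obj]
    if isinstance(obj, frozenset):
        for m in obj:
            out.extend(subterms(m))
    return out

# FC scans only the output: it finds both inscriptions of {the, man}
# without knowing whether EM or IM produced them.
occurrences = [t for t in subterms(clause) if t == the_man]
assert len(occurrences) == 2
```

Whether the pair is then treated as copies or repetitions is, on the book’s account, decided by Theta Theory, not by this scan itself.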

Section 6 (pp. 36–46) presents practical illustrations in further detail: more on transitives and on passives and more on control. Phases are now added to the discussion. A phase is an intermediate result in a derivation that can be stored and that makes certain parts of the derivation inaccessible. Phases place locality constraints on the syntax and are therefore relevant to RR because they restrict the input. This also leads to a more complex version of (5). I’ll focus on the transitive since the passive and control are similar to what is explained above regarding (6) and (9).

Building up a structure for the transitive in (10a) involves first externally merging the verb ‘eat’—entered as a root (R)—to an NP, the Internal Argument (IA), and then internally merging it, as in (10b). The latter operation is optional. The next steps involve adding a v* and an Agent, the External Argument (EA), and then an I(nflection) head—T in earlier work—and internally moving the EA. Finally, a C is added, resulting in (10c).

10
  a. The fox ate a pear.

  b. WS = [{{a, pear}, {R, {a, pear}}}] (p. 38)

  c. WS = [{C, {EA, {I, {EA, {v*, {IA, {R, IA}}}}}}}] (p. 39)

The sequence {IA, {R, IA}} is in grey because v* is a phase head and, on completion of the v*P, its complement, i.e. {IA, {R, IA}}, will become inaccessible. A difference with earlier work is that, of the two copies of an NP, e.g. EA in (10c), only the most recent will be accessed by Search (the one crossed out is not ‘seen’). This Search is Least Search because it is the simplest. This assumption differs from that in Chomsky et al. (2019), where all copies in the phase are accessible and may lead to indeterminate (and therefore ungrammatical) structures. The Determinacy Principle—basically an anti-locality principle (cf. Abels, 2003; Grohmann, 2003)—is dropped in later work, but that leaves some ungrammatical structures unaccounted for, such as the impossibility of topicalizing a subject and of wh-movement from a subject position. For an instance of the latter, Goto and Ishii (2019, p. 94) show that indeterminacy rules out (11a)—the wh-questioned version of (11b)—because a subject in the specifier of the v*P that moves to the specifier of the TP will have two copies of the wh-element in the complement of C, so this wh-element cannot move to the specifier of the CP. The derivation is shown in (11c).

11
a. *Who did a picture of please you?
b. A picture of Mary pleased you?
c. [CP who [C did [TP [a picture of <who>] [T [v*P [<a picture of who>] [v* [please you]]]]]]]
   (the step moving who to the specifier of the CP is not allowed by determinacy)

If the subject stays in the specifier of the v*P, there is no violation, as in (12a)—the wh-questioned version of (12b)—, because the specifier of TP is filled with an expletive.

12
  a. Who is there a picture of on the wall?

  b. There is a picture of Mary on the wall?

  c. [CP who [C is [TP there [T [v*P [a picture of <who>] [v* …

The difference between (11a) and (12a) shows that two copies in a WS result in an indeterminate structure.

Section 7 (pp. 46–60) discusses the history and development of Merge. It starts with Chomsky’s (1955/1975) generalized transformations and then proceeds to the highly influential Chomsky (1965), aka the “Aspects model”. The latter combines recursive phrase structure rules and transformations that can alter the shape of the phrase structures. That system is top-to-bottom, unlike the current bottom-to-top Merge. A big change comes about in the 1970s with X-bar Theory: All phrases have the same structure, namely they are based around a head that projects to a phrase, which can also include a specifier and a complement. MSMT points out that X-bar Theory makes it possible to see structures hierarchically rather than linearly. Move-alpha, the successor of transformations, is also not constrained by linear order but by c-command, i.e. a hierarchical relation, as is similarly the case for (Set) Merge. The move to Merge maximizes the explanatory effects and simplifies the form of Merge (p. 55). The authors also explain why labels (VP, NP, etc.) are relegated to a Labeling Algorithm, using the Third Factor Minimal Search. The section shows how, all through the history of Generative Grammar, there is a move from language- or construction-specific rules to more general principles (but still specific to the Faculty of Language) to Third Factor Principles.

Section 8 (pp. 60–66) offers some prospects for the future and addresses “remaining empirical issues” (p. 60). It examines Across-the-Board (ATB) phenomena, as in (13a), as cases where the wh-element externally merges twice and one of the two inscriptions moves to the specifier of the CP, as in (13b). FC then reduces all inscriptions of what to the same interpretation. The analysis offered here is preliminary (“space prohibits discussion of the empirical details” p. 61) but similar to that of Obligatory Control.

13
  a. I wonder what Gretel recommended and Hansel read. (p. 61)

  b. I wonder [CP what Gretel recommended <what> and Hansel read <what>].

The section concludes with open questions and prospects for the future. One of these open questions is the nature of successive cyclicity and the inventory of phases: Is it just CP and v*P or are other phrases headed by a phase head as well? Other questions are: Can the distinction between A and A-bar be eliminated, how to account for unbounded coordination, and how to give a unified account for island phenomena?

MSMT provides an exciting, up-to-date account of Merge within the latest version of the Minimalist Program. In the remainder of this review I first point out some areas that have changed, some minor and some major. I then call attention to some matters that could benefit from further elaboration.

Some minor changes from previous work: (a) we are back to I(NFL) rather than T(ense), as shown in (10c), although the TP is mentioned on p. 63 regarding phases; (b) there is no mention of Inheritance from C to T, but this may be because the emphasis of the book is Merge and not Agree; (c) the Lexicon stays open, as in (6c), where the initial choice of elements in the WS gets added to (by the passive auxiliary); and (d) the term inscription comes out of the blue, but the change is presumably meant to avoid the earlier problem of copying from the Lexicon (see e.g. Putnam & Stroik, 2011).

More major changes are: (e) The status of Merge as a First Factor Principle is less clear than before and (f) Theta Theory is regarded as a First Factor phenomenon (p. 30), as it has been since Chomsky (2021).

As for (e), in Chomsky (2005, p. 12), for instance, we read that “It could be that unbounded Merge, and whatever else is involved in UG, is present at once” and “the Great Leap Forward yields Merge. The fundamental question of biology of language mentioned earlier then becomes, What else is specific to the faculty of language?” (p. 12, my emphasis). These quotes suggest that the genetic component of the Faculty of Language is/includes Merge. However, in MSMT, this is no longer clear: “Merge in its unconstrained form, the form it takes outside of any first or third-factor principles” (again my emphasis). So, unconstrained Merge cannot exist within the Faculty of Language but always has to be constrained.

Regarding (f), I had always assumed that Theta Theory was not Language Specific but used outside of the Faculty of Language, e.g. in Moral Grammar. Bickerton (1990, p. 185) writes that the “universality of thematic structure suggests a deep-rooted ancestry, perhaps one lying outside language altogether.” If argument structure is also relevant outside the linguistic system, humans without language could have had it and so could other species. A knowledge of thematic structure is crucial to understanding causation, intentionality, and volition, part of our larger cognitive system and not restricted to the language faculty. It then fits that argument structure is relevant to other parts of our cognitive make-up, moral grammar being one area (see van Gelderen, 2018). In addition, the argument from evolvability, invoked for Least Search and simple Merge, makes it less likely that a relatively complex system such as Theta Theory (theta roles and connections to aspect and definiteness) would have evolved in a relatively short time.

Some areas in MSMT that I have a harder time understanding are (a) the uses and definitions of Search, Least Search, and Minimal Search because, at this stage, it seems arbitrary which kind of Search is used where, (b) the application of Form Copy (FC), which generally relies on Theta Theory, and (c) the difference between EM and IM.

As for the definition of the various Search operations, Daniel Seely (p.c.) assures me that MSMT regards all Search as Least Search. However, reading the book, I saw Least Search looking for members and Search looking for terms (and members). There are at least three areas where sometimes the one is used and sometimes the other. First, EM uses Least Search and IM Search (pp. 19–20). What I have quoted under (5) above makes that clear: There are two steps in IM, first locating the member X and then looking further for something inside X. Second, Least Search will be used for {X, YP} Labeling since both X and YP are members but Search will be used for Labeling {XP, YP} (p. 38) (as phi, phi) since Search first has to access the phrase and then take a second step. Third, Least Search is relevant to selecting an item from the WS (p. 39) and, in this way, avoids indeterminate structures (unlike in Chomsky et al., 2019; Goto & Ishii, 2019; and van Gelderen, 2022).

Regarding (b), how should we decide when FC is relevant? Recall from the discussion around (6) that Preservation constrains the interpretation of an inscription and requires that it does not change (p. 22). FC is introduced on p. 24 to assign copies the same interpretation. “FC provides the information ‘these two identical inscriptions are copies’ and the semantics thus knows to interpret them in the same way” (p. 25). FC applies in the case of a passive but needs to be blocked in the case of a repetition, which happens through access to Theta Theory (p. 26). In Obligatory Control structures, FC applies (cf. (9)) even though the theta roles of the arguments are different and you’d expect an interpretation of repetition. I will quote the passage that seems to weaken Theta Theory: “Crucially, the result [in (9)] does not violate [Theta Theory] since {the, man} and its structurally identical copy, EA (= {the, man}), get a theta role from different theta role assigners, in complete conformity with univocality” (p. 43) and, therefore, FC is only blocked if there “are multiple theta roles from the same verb” (p. 26, my emphasis). Another question I have about FC: if FC depends on Theta Theory, why not just have Theta Theory?

As for (c), it is great to see that Phrase Structure (X-Bar) rules and movement (Move) continue to be unified in EM and IM, respectively. However, is IM now harder because it allows Search rather than Least Search, as in the case of EM? MSMT would say no since they only allow Least Search (Daniel Seely, p.c.), but I am less certain.

Funding

The author has no funding to report.

Acknowledgments

Many thanks to Daniel Seely who provided in-depth comments on the first draft, to Stefanie Bode for further thoughts and questions, to two anonymous reviewers for excellent suggestions, and to the Spring 2024 Arizona State University Syntax Reading Group for great discussions, namely, Abdulrahman Albanawi, Ming Chen, Saki Gejo, Annette Hornung, Robert LaBarge, Derek McCarthy, and John Powell.

Competing Interests

The author has declared that no competing interests exist.

References

  • Abels, K. (2003). Successive cyclicity, anti-locality, and adposition stranding [Doctoral thesis]. University of Connecticut.

  • Bickerton, D. (1990). Language and species. University of Chicago Press.

  • Chomsky, N. (1955/1975). The logical structure of linguistic theory. Plenum Press.

  • Chomsky, N. (1965). Aspects of the theory of syntax. Mouton.

  • Chomsky, N. (1981). Lectures on government and binding. Foris.

  • Chomsky, N. (2005). Three factors in language design. Linguistic Inquiry, 36(1), 1-22. https://doi.org/10.1162/0024389052993655

  • Chomsky, N. (2021). Minimalism: Where are we now, and where can we hope to go. Gengo Kenkyu, 160, 1-41.

  • Chomsky, N., Gallego, Á. J., & Ott, D. (2019). Generative grammar and the faculty of language: Insights, questions, and challenges. Catalan Journal of Linguistics, 2019(Special issue), 229-261. https://doi.org/10.5565/rev/catjl.288

  • Fong, S. (2021). Some third factor limits on Merge [Manuscript]. University of Arizona.

  • van Gelderen, E. (2018). The diachrony of verb meaning: Aspect and argument structure. Routledge.

  • van Gelderen, E. (2022). Third factors in syntactic variation and change. Cambridge University Press.

  • Goto, N., & Ishii, T. (2019). The principle of determinacy and its implications for MERGE. Proceedings of the 12th GLOW in Asia & 21st SICOGG, 91–110. https://drive.google.com/file/d/1Tj86B_hYZ94hpaX274Xi7OewieEJY4Q-/view

  • Grohmann, K. K. (2003). Prolific domains: On the antilocality of movement dependencies. John Benjamins.

  • Hornstein, N. (1999). Movement and control. Linguistic Inquiry, 30(1), 69-96. https://doi.org/10.1162/002438999553968

  • Putnam, M., & Stroik, T. (2011). Syntax at ground zero. Linguistic Analysis, 37(3–4), 389-404.