
Uniformity and Diversity of Language in an Evolutionary Context

Stefanie Bode*1

Biolinguistics, 2024, Vol. 18, Article e12823, https://doi.org/10.5964/bioling.12823

Received: 2023-09-16. Accepted: 2024-01-02. Published (VoR): 2024-02-08.

Handling Editor: Kleanthes K. Grohmann, University of Cyprus, Nicosia, Cyprus

*Corresponding author at: English Department, Kaethe-Hamburger Weg 3, 37073 Goettingen, Germany. E-mail: Stefanie.Bode@phil.uni-goettingen.de

This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

The paper explores a view on language that is in line with the Strong Minimalist Thesis and that derives an evolutionary scenario predicting language variation in time and space. A stable and uniform UG, which makes available recursive Merge and is shaped by laws of nature such as simplicity and efficiency, was integrated by a sudden rewiring of the brain into an existing biological system comparable to the concept of the faculty of language in the broad sense. The basic oppositions, such as symmetry and asymmetry, internal language/thought and externalization, uniformity and diversity, universality and particular languages, are derived as an automatic consequence of the architecture of the grammar as it evolved in the human species in concert with general principles of nature. A stable and simple system can thus be reconciled with a dynamic, complex one.

Keywords: evolution of language, universal grammar, architecture of the grammar, externalization, variation of language in space and time, grammaticalization, minimalism, the strong minimalist thesis

I could be bounded in a nutshell and count myself a king of infinite space... Shakespeare, Hamlet, Act 2, Scene 2

1 Introduction

The goal of this paper is to discuss the architecture of the language system and to develop an evolutionary scenario that is in line with the Strong Minimalist Thesis (SMT), derives the underlying uniformity, and also captures surface variation.

The paper is organized along the following lines. First, the capacity for language is addressed and located in a minimalist setting. The next section presents the puzzle: uniformity must be reconciled with diversity. Next, variation in space and time is considered more closely before a proposal is put forth which derives variation as a consequence of the architecture of the grammar seen under an evolutionary scenario that is in line with SMT. In the final section, we summarize the main results.

2 The Capacity for Language: Infinite Structure and Infinite Thought

Two important lines of research have an impact on the current generative view on language. First, the biolinguistic tradition (see for instance Hauser et al., 2002) considers human language as a biological organ, part of the human endowment, which emerged suddenly in the human species by means of a small rewiring in the brain. Second, the Strong Minimalist Thesis (SMT) is based on the idea that the new system has been integrated into existing systems by following laws of nature including principles of efficient computation (Chomsky, 2005, third factor principles1). A recursive structure-building device, Merge, makes possible infinite creation of thoughts. It is conceived of as a simple and recursive operation creating unordered, unlabeled binary sets of the form {A, B} (see Seely, 2006, p. 191). Merge has most recently been re-conceptualized as applying to a workspace mapping sets to a new workspace (Chomsky, 2019, 2020, 2021, 2023; Chomsky et al., 2019; Seely, 2021, 2023).
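To make the set-theoretic character of the operation concrete, the following minimal Python sketch treats a workspace as a set of syntactic objects and Merge as a mapping from one workspace to the next. It is an illustration under our own simplifying assumptions (atoms rendered as strings, the function name merge chosen for exposition), not a formalization taken from the cited literature.

```python
from typing import Union

# A syntactic object is either an atom (modelled as a string) or an
# unordered, unlabeled binary set of syntactic objects.
SO = Union[str, frozenset]

def merge(workspace: frozenset, p: SO, q: SO) -> frozenset:
    """Map a workspace to a new workspace containing the set {P, Q}.

    No order is imposed and no label is assigned; P and Q are simply
    combined, and the new set replaces them at the top level.
    """
    return (workspace - {p, q}) | {frozenset({p, q})}

ws = frozenset({"the", "man", "saw"})
ws = merge(ws, "the", "man")                      # {{the, man}, saw}
ws = merge(ws, "saw", frozenset({"the", "man"}))  # {{saw, {the, man}}}
print(ws)
```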

The resulting minimized conception of UG effectively increases the explanatory force because positing a small UG captures the sudden emergence of human language. Complexity needs time and cannot evolve suddenly. The simplification strategy of minimalist research (see Epstein et al., 2015, 2022) contributes substantially to our understanding of human language since it makes explicable why humans have the capacity for infinite thought that non-human creatures lack. UG, the linguistic implementation of this capacity, is small, simple and uniform, and could evolve within a short evolutionary window in the human species.

By means of unbounded combination of symbols (recursive set-formation, MERGE), humans can express ideas or things that do not exist in the external world and even point to objects or events that are conceptually impossible, as illustrated in (1).2 Importantly, this is a property of human language that applies to any particular language.

1
  1. a black-haired boy with curly blond locks

  2. The car drives fast in a slow motion around a corner.

  3. Im Auto saßen stehend Leute schweigend ins Gespräch vertieft.

    In the car sat standing people silently engaged in conversation.

    ‘People sat standing in the car silently engaged in conversation’

Both the unboundedness of structure-building and the mind-dependent character of thought are restricted to the human species. Hierarchical order is a property of any particular language. There is abundant evidence for structure in language. The conclusion that UG contains an operation such as Merge is therefore inescapable.

A minimal setting of the grammar entails a recursive structure-building device (SYN) generating an infinite array of hierarchical expressions which may convey complex thought (SEM). Thought can be externalized in different modes including spoken or sign language (PHON). Following Bode (2020, in press), a minimalist architecture should imply clear-cut tasks, which is a consequence of the third factor principles demanding simplicity and non-redundancy. Roughly speaking, creating structure is distinct from interpreting structure and from externalizing it. The distinction should be reflected by the respective set-related tasks.

This view yields the following picture. Syntax creates symmetric sets. Transfer labels the sets and thereby renders them accessible to interpretation and at the same time inaccessible to syntax (cf. Bode, 2020, labeling is Transfer). The labeled (and therefore asymmetric) sets are interpreted by semantic rules, and finally, language particular resources get inserted to externalize the sets according to language particular rules. Notably, syntax creates reversible, symmetric3 relations which are rendered asymmetric by a label. The label4 makes them accessible to SEM and PHON where dependencies are established by means of asymmetric rules.5
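The division of labor just described can be pictured with a small, purely illustrative Python sketch (the names syn_merge, transfer and LabeledSet are our own expository choices, not established formalism): SYN delivers a symmetric, unordered set; Transfer adds a label, turning it into an asymmetric object that SEM and PHON can interpret but that syntax no longer manipulates.

```python
from typing import NamedTuple, Union

SO = Union[str, frozenset]

class LabeledSet(NamedTuple):
    """A transferred object: a label paired with the original symmetric set."""
    label: str
    content: frozenset

def syn_merge(a: SO, b: SO) -> frozenset:
    # SYN: pure structure-building; neither member is made prominent.
    return frozenset({a, b})

def transfer(so: frozenset, label: str) -> LabeledSet:
    # Transfer: labeling renders the set asymmetric and hands it over to
    # interpretation; in this sketch, labeled objects are no longer fed
    # back into syn_merge.
    return LabeledSet(label, so)

vp = syn_merge("see", frozenset({"the", "man"}))  # symmetric {V, DP}
print(transfer(vp, "V"))                          # asymmetric, interpretable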

This architecture suggests an organization as in (2a) which is characterized by a shift of properties as in (2b).

2
  1. SYN→SEM→PHON

  2. Transition of the system

    1. from symmetry to asymmetry

    2. from the internal to the external

    3. from uniformity to diversity6

  4. from universal to language particular

Basically, we suggest that the principles of the third factor not only shape the operation Merge but also have an impact on the architecture of the grammar. SYN, SEM and PHON have different tasks, which avoids redundancies and inefficiency. Syntax establishes hierarchical relations. Such a relation is symmetric and reversible. Merging two items does not assign any prominence to either of the members. The sole function of Merge is structure-building. In contrast to SYN, interpretation of structures involves assigning a direction to the relations and thereby establishing dependencies. Hence, interpretation is necessarily asymmetric in nature. The simplified examples in (3) illustrate the difference between symmetry at SYN and asymmetry at SEM and PHON.

3
  1. The man saw *she/her.

  2. The man ate *the theory/the pizza.

Both examples are syntactically grammatical with either object. Merge creates the symmetric set {V, D}, hence, a relation between two syntactic objects. Dependencies are established post-syntactically when structures are phonologically and semantically interpreted. The phonological shape (the case) of the pronoun in (3a) is determined by the transitive verb in English. The direction is not reversible but asymmetric. The semantic dependency between ‘eat’ and the nominal entity in (3b) is also asymmetric because the verbal content requires a certain nominal content and not vice versa. Under this view, we get a system with distinct set-related tasks. SYN creates symmetric sets, while SEM and PHON interpret the sets in an asymmetric way. Furthermore, SYN and SEM are internal in the sense that they universally enable humans to create infinite thoughts. PHON externalizes sets in the sense that they receive the phonological form that corresponds to the language particular rules of a specific language such as English in the examples above. Consequently, the transition of the system described in (2b) follows from the architecture since we can assign the respective properties to the systems at hand as in Table 1.

Table 1

Properties of SYN-SEM-PHON

System Symmetry Uniformity Locus
SYN Symmetric Uniform Internal
SEM Asymmetric Uniform Internal
PHON Asymmetric Diverse External

A computation based on a simple recursive operation implemented in a simple architecture of the grammar is in line with SMT. Efficiency and simplicity shape the human system, entailing that each domain has a separate task, which prevents overlaps and redundancies. Simplicity furthermore constrains the form of the structure-building operation by principles such as Minimal Yield (MY),7 which allows an increase of sets by one (creative aspect) and a decrease by none (structure-preserving aspect).
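One informal way to picture this constraint, sketched here under our own assumptions about how to count (it is not a definition taken from the cited work), is to compare the accessible terms of the workspace before and after Merge: at most one new accessible object may appear, and none may disappear.

```python
from typing import Set, Union

SO = Union[str, frozenset]

def terms(so: SO) -> Set[SO]:
    """A syntactic object together with all of its accessible subterms."""
    if isinstance(so, frozenset):
        collected: Set[SO] = {so}
        for member in so:
            collected |= terms(member)
        return collected
    return {so}

def ws_terms(ws: frozenset) -> Set[SO]:
    collected: Set[SO] = set()
    for so in ws:
        collected |= terms(so)
    return collected

def merge(ws: frozenset, p: SO, q: SO) -> frozenset:
    new_ws = (ws - {p, q}) | {frozenset({p, q})}
    before, after = ws_terms(ws), ws_terms(new_ws)
    # Minimal Yield as glossed in the text: the accessible items grow by
    # at most one (the new set) and nothing accessible is lost.
    assert len(after - before) <= 1 and before <= after
    return new_ws

ws = frozenset({"X", frozenset({"Y", "Z"})})
print(merge(ws, "X", frozenset({"Y", "Z"})))  # adds exactly one new set
```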

So far, we have observed that UG must contain a structure-building operation. In addition, the operation needs elements to operate on. If there are no atoms, there can be no set-formation. Consequently, UG must also provide elements that can enter structure-building. Since MERGE creates structures, the first elements to operate on cannot be structured themselves and must, in fact, be atomic. UG-atoms must have certain properties besides being unstructured. They have to be simple, uniform and stable like UG, and innate too. They cannot be infinite in number either, because an infinite inventory could not have evolved suddenly through a small change in the human brain. Crucially, they cannot be lexical items in the sense of language particular items.8 Berwick and Chomsky state that the atoms present a deep mystery (Berwick & Chomsky, 2016, p. 90). The mystery lies in the necessity of abstractness, discreteness and finiteness of UG-atoms. The structure-building operation must not change the items it operates on, manipulate them or add properties. The items cannot have the features of words, which are hard to define anyway and are not discrete either, since words, morphemes and clitics are language particular elements that belong to externalization, where the boundaries are blurred.9 A language particular lexicon is an unstable, constantly changing store, a dynamic system that nobody would assume to be part of UG.

Taking stock, the capacity for language is unique to the human species and requires a basic, underlying and simple implementation (UG). Other creatures do not have language/thought, though they may have communication systems of different quality and complexity. In this context, it makes sense to recall the distinction between the faculty of language in the broad sense (FLB) and the faculty of language in the narrow sense (FLN) suggested by Hauser et al. (2002). While human beings have FLN, that is, recursion (UG with atoms + MERGE), non-human creatures lack FLN but may share FLB with us. The authors locate the sensory-motor (SM/PHON) system and the conceptual-intentional (C-I/SEM) system under FLB. Notice that the capacity for language emerged suddenly and was integrated into existing systems following the laws of nature. In this sense, one can combine the conception of FLB/FLN with SMT and capture the fact that the capacity for language is uniquely linked to the human species.

What needs to be explained is the tension between UG as a stable, uniform system underlying the capacity and variation, which is visible in the resources of a particular lexicon and in the externalization of the internal system.

3 The Puzzle: Uniformity and Variation

We have seen that invariant principles underlie the human capacity for language. A small and simple UG, conceived of as laws of language plus general, invariant laws of nature shaping the system, can account for the emergence of human language. So, variation has to be reconciled with invariant principles. In the evolutionary scenario, variation, which is visible in particular languages (2nd factor), must be deducible from the 1st and 3rd factors without being a part of either, which poses an obvious puzzle.

Otto Jespersen, ahead of his time, stressed that there must exist principles underlying the various grammars of existing languages. According to Jespersen, the formatives of languages are diverse but syntax serves as a common basis for human thought viewed as applied logic. There can be no universal morphology (see Jespersen, 1965, pp. 47–52).

Morpho-phonological rules apply to particular languages. They are best located at externalization (PHON). On the external side, variation shows up in the form of the elements used (language particular lexicon), the linear order of elements, inflectional properties (Agree), and the (non-)pronouncement of linguistic material (copies, subject and object pronouns, functional heads etc.). For instance, inflectional rules such as feature-sharing agreement apply at the level of feature-values. SYN forms symmetric sets which may get phonologically interpreted as agreement dependencies, that is, as an asymmetric relation. Under the view sketched above, Agree applies post-syntactically and affects the language particular resources. Bode (2020, pp. 117–118) suggests an evolutionary scenario that entails that the building blocks of SYN-SEM-PHON differ accordingly. Merge operates on abstract categories and roots which are related to semantic features (SEM) and later associated with phonological features and feature values which contribute to variation. It follows that SYN is free of variation. This assumption accords well with Merge being part of a uniform UG.

Furthermore, SMT entails eliminating parameters from UG. In contrast to the conception of UG in the Government and Binding era, parameters, or, generally speaking, variation, have been shifted to the lexicon, which became known as the Borer-Chomsky conjecture (Baker, 2008). Notably, the language particular lexicon contains features and values of items which cannot be equated with UG-atoms, as mentioned above. Borer (1984) convincingly argues that variation is associated with inflectional rules and grammatical formatives, the vocabulary and its idiosyncratic properties that have to be learned by the child (Borer, 1984, p. 29). Similarly, Chomsky (2001, p. 2, 2007, pp. 6–7) locates parametric variation in the lexicon to account for the varieties of particular languages. Strikingly, in later research Chomsky points out that externalization might be the best place for the complexities observed with variation because linear order and inflectional arrangements reflect properties of the sensory-motor system (PHON), which does not belong to human language (SYN-SEM) enabling thought in the first place (compare for instance Chomsky et al., 2015, p. 74). The simplest computational system that emerged suddenly in the human species cannot include complex variation. Furthermore, what has to be learned does not belong to innate UG, which is not subject to acquisition since it is part of the human endowment.

A three-stage evolutionary scenario may follow. First, the emergence of human language by means of a simple and sudden mutation made available UG (1st factor/SYN), which was integrated into the existing systems (obeying the 3rd factor). There can be no variation at the point of language/thought emerging in a member of the human species. The new system is uniform, stable and specific to a human being. What is uniform, the internal system (of thought/SEM), cannot be diverse at the same time. Externalization is an option for later generations that has to follow the emergence of (internal) language. Depending on the mode of externalization (speech or sign/PHON), external language forms. In a third step, involving spreading, separation and external grouping, distinct particular languages (2nd factor) may have formed. Language variation necessarily follows externalization and can thus be considered as multiple and different answers to the same task, namely, mapping the internal to the external side. The logical timing is summarized in (4).

4

Logical Timing

  1. Invariant internal system (UG-based) = language

  2. Externalization (an amalgam of language and the sensory-motor system, as it is called by Chomsky, 2021)

  3. Cross-linguistic variation (diversity of particular languages: distinct marking strategies expressed by inflection, agreement, linear order, pronouncement etc. in the spoken mode) and possible language change

Chomsky (2010, 2017, 2019, 2020, 2021, 2022) has frequently observed that the emergence of language is independent of the SM-system. He has also stressed that externalization as the locus of complexities and variation relates two independent systems as a secondary, ancillary process, which corresponds to the logical timing suggested in (4).

Summing up, what evolved in the human species is a stable, uniform and internal system. Variation is related to the language particular lexicon and to externalization (PHON). While SYN-SEM is uniform, the linking to PHON is a pre-condition for variation. This view also entails assumptions on what can be universal because what is missing in one particular language cannot be assumed to play a universal role. For instance, there are languages that lack phi-features (Japanese). Hence, it cannot be concluded that phi-features10 are universally relevant. Different functional features that contribute to cross-linguistic variation (such as case features) but not to universal meaning aspects relating to thought should occur on the externalization side (PHON) only. Here we encounter different external marking strategies such as case, phi-agreement, and also specific particles. If variation is expressed by means of the presence or absence of specific feature values, it makes sense to assume that these entities emerged much later than human language itself (which is purely internal and uniform).

The tension between uniformity (of UG) and diversity (of LEX and externalization) is also reflected in the emergence of both. Whereas UG, being small and simple, could have emerged suddenly (4i), yielding thought by means of a simple computation (SYN: SEM), variation needs time to develop because of its complexity, and it presupposes the possibility of externalization (SYN: SEM: PHON). Consequently, the puzzle to be solved consists in reconciling the uniformity of our internal UG with the diversity of the external data.

4 Variation in Space and Time

In this section, we address the issue of variation less from an empirical than from a theoretical angle. First of all, variation is two-fold because there is variation in space and variation in time, as illustrated in Table 2.

Table 2

Language Variation

Dimension Language Variation
Space cross-linguistic variation
Time diachronic change

Similar observations apply to language acquisition as shown in Table 3.

Table 3

Language Acquisition

Dimension Language acquisition
Space unstable, parallel grammars
Time stages of growth

Crucially, language variation and language acquisition apply at the level of particular languages, and this entails language particular resources (lexicon) and externalization (PHON). Children grow the language of their environment, and particular languages differ in, and change, their resources and the morpho-phonological patterns exhibited at externalization. Interestingly, scholars have invoked 3rd factor principles to account for variation.

Roberts (2019) suggests that UG contains a list of underspecified formal features (FF) lacking parametric values. According to him, these values emerge in the process of language acquisition, which is constrained by 3rd factor principles such as those in (5).

5
  1. Feature Economy (FE) (Roberts & Roussou, 2003, p. 201)

    Postulate as few features as possible11

  2. Input Generalization (IG) (Roberts, 2007, p. 201)

    Maximize available features

The learner assumes (by default) that no head bears a given feature (FE and IG). On encountering F in the primary linguistic data (PLD), IG dictates that the feature is generalized to all relevant heads (though this violates FE). If the learner then detects a head without F, the generalization is revised and only some heads are assumed to bear F. This strategy provides a learning path from no head, to all heads, to some heads, to finer-grained distinctions, which in turn yields the taxonomy of parameters shown in (6).

6
  1. Macro-parameter: all heads of the relevant type share the feature value

  2. Meso-parameters: all heads of a given natural class (e.g. +V or a core functional category) share the feature value

  3. Micro-parameters: a small but lexically definable sub-class of functional heads (e.g. modal auxiliaries, subject clitics) shares the feature value

  4. Nano-parameters: one or more individual lexical item/s is/are specified for the feature value

Roberts' approach can account for macro-differences among particular languages, for example the phi-parameter, where a learner starts to set phi-related options on exposure to exponents of phi-features in the PLD, and also for micro-differences, which are related to specific properties of individual functional heads within language families such as Romance. Referring to 3rd factor principles reduces the search space and unburdens the learner. Furthermore, general (typological) patterns (macro) and detailed variation (micro) in the data (second factor) can be accounted for, and the first factor, UG, can be kept uniform.
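The learning path just outlined (no head, then all heads, then only some heads) can be rendered as a short sketch; the data structures and names below (acquire_feature, the toy PLD) are hypothetical illustrations of the FE/IG logic, not Roberts' own implementation.

```python
from typing import Iterable, Set, Tuple

# Each datum pairs a head with the information whether it shows an
# exponent of feature F (True) or demonstrably lacks one (False).
Datum = Tuple[str, bool]

def acquire_feature(pld: Iterable[Datum], all_heads: Set[str]) -> Set[str]:
    """Return the set of heads the learner ends up specifying for F."""
    bearers: Set[str] = set()            # Feature Economy: start with no head
    generalized = False
    for head, has_f in pld:
        if has_f and not generalized:
            bearers = set(all_heads)     # Input Generalization: all heads
            generalized = True
        if not has_f:
            bearers.discard(head)        # retreat: only some heads bear F
    return bearers

heads = {"C", "T", "v", "D"}
pld = [("T", True), ("D", False), ("v", True)]
print(acquire_feature(pld, heads))       # {'C', 'T', 'v'}: a narrower setting
```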

Roberts’ empirical results and the system of emergent variation are impressive. Yet, we have to raise a few issues.

We have seen in the first section that UG must minimally contain a recursive structure-building device and atomic elements to be operated on. If UG contains underspecified formal features, they either have to be added to the atoms, which requires structure (i.e. FF [attribute: value]), or they would have to form the atoms that enter set-formation, which is inconceivable too because Merge does not operate on features. A further complication arises with the (format of the) uFs and iFs provided by UG. Roberts argues that the language learner assembles these features into the lexical items of his or her particular language. The decisions that would guide the learner are then of the following kind: Is F unvalued or valued? Does it trigger Agree or Movement (internal Merge)? Decisions like these require a triggered syntax with Merge being parasitic on Agree, which stands in opposition to the free syntax adopted under an SMT-view on language. Basically, we argue that different values on functional categories are reflexes of PHON and must have emerged from the later association of the computational system (human language) with the SM-system. They cannot be part of UG but are properties of language particular elements at externalization. Roberts states that formal features must have a direct phonological exponence. They must be morpho-phonologically visible to be detectable by the learner. This makes good sense because variation shows up at PHON, but including FF with values in UG is problematic since it locates the option of externalization, and thus variation, in UG.

Considering cross-linguistic variation and diachronic change in the context of 3rd factor principles is, however, a promising route that has also been taken by van Gelderen (2009, 2022a, 2022b, 2024). She argues that 3rd factor principles of economy lead to grammaticalization and help to explain linguistic cycles. She considers the emergence of language in the context of biolinguistics too and suggests that the innovation of UG, Merge, added to a pre-linguistic conceptual system, made it possible to organize a thematic layer before a full-fledged functional system developed. Under her view, (external) Merge provided thematic structure, followed by a grammaticalization process guided by 3rd factor principles to enlarge the operational options (internal Merge). Van Gelderen's intensive research leads to many insightful empirical and theoretical conclusions. Since uninterpretable features are parametrized, she further assumes that they could not have been present when UG emerged. The main idea is that grammaticalization adds the morphology, the second layer of semantic information. This corresponds to our discussion insofar as morpho-phonological properties are purely external and therefore do not belong to language proper.

The 3rd factor principles of economy that van Gelderen (2009, pp. 232–234) proposes are given in (7).

7
  1. Head Preference Principle (HPP)

    Be a head, rather than a phrase.

  2. Late Merge Principle (LMP)

    Merge as late as possible.

The HPP can explain the diachronic re-analyses of pronouns from emphatic full phrases to clitic pronouns to agreement markers, and of negatives from full DPs to negative adverb phrases to heads.

The LMP captures the change whereby lexical phrases, for instance PPs, come to be base-generated higher, in the specifiers of functional heads, and then recycle as the heads of the functional phrases.12 In later research, van Gelderen (2022a, 2024) further generalizes by subsuming both principles under the principle of determinacy. A structural configuration of two larger sets, {XP, YP}, creates a labeling conflict (see Chomsky, 2013) that needs to be resolved. Minimal Search (MS, a further 3rd factor principle, see Chomsky, 2023) inspects the sets and finds two heads. In contrast, a set consisting of a head and an XP can be labeled easily since MS finds the single head X in {X, YP}. Van Gelderen revisits the HPP in the context of labeling. A 'spec' is reanalyzed as a head diachronically because the result is a configuration that is not indeterminate in terms of labeling. She furthermore derives the LMP from 3rd factor determinacy applied to the workspace. Externally merging a phrase in a higher position, instead of letting it enter lower and internally merging it by means of copies, can reduce the indeterminacy that multiple copies raise.
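The labeling contrast between {X, YP} and {XP, YP} can be illustrated with a simplified sketch of Minimal Search (our own toy rendering, in which a head is an atom and a phrase a set): a single head labels the set, whereas two phrases yield an indeterminate configuration that diachronic reanalysis may resolve.

```python
from typing import Optional, Union

SO = Union[str, frozenset]   # a head is a string, a phrase a frozenset

def minimal_search_label(so: frozenset) -> Optional[str]:
    """Return the label found by Minimal Search, or None if indeterminate."""
    heads = [m for m in so if isinstance(m, str)]
    phrases = [m for m in so if isinstance(m, frozenset)]
    if len(heads) == 1 and len(phrases) == 1:
        return heads[0]   # {X, YP}: the single head X labels the set
    return None           # {XP, YP}: Minimal Search finds two heads, conflict

x_yp = frozenset({"X", frozenset({"Y", "Z"})})
xp_yp = frozenset({frozenset({"W", "X"}), frozenset({"Y", "Z"})})
print(minimal_search_label(x_yp))    # 'X'
print(minimal_search_label(xp_yp))   # None: indeterminate
```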

Whether one stage of a particular language gives rise to more indeterminate configurations than other stages, so that the general course of diachronic change would be simplification, should be discussed further. Alternatively, we rather assume that the principles of computational efficiency (MS, determinacy, MY etc.) may act on diachronic changes insofar as, for instance, different strategies of resolving indeterminacy are available, so that diachronic change can be seen as shifting from one option to another. A general principle of efficiency favors determinate, unambiguous states, but there may be different ways to achieve them and to decide which element is accessible (to labeling or to operations in the workspace). The stable UG-system, implemented in larger systems in concert with the laws of nature (3rd factor principles), yields a dynamic system visible due to externalization.

We follow van Gelderen in that grammaticalization of functional categories occurred later. We further assume that language particular functional elements, including the respective unvalued features, could have emerged no sooner than the link to PHON, to externalization, became available. One might go further and assume that it is explicable why the higher levels of the sentence are basically functionally determined and involve internal Merge, which produces discourse relevant configurations. Discourse entails the use of language and includes non-linguistic entities such as the speaker and the hearer. Human language (thought) is distinct from communication, but it can be used for it, which is due to the externalization option having emerged in a later step.

Notice that we do not claim that internal Merge evolved later. External and internal Merge are the same operation. Internal Merge is an efficient means for a syntactic object to enter multiple relations, and there is basically only one operation Merge creating sets. Merge combines atoms (finite, discrete, uniform) into sets. The recursive operation does not access any content, neither semantic nor phonological. We only get abstract structures such as those in (8).

8
  1. Merge {X, root}

  2. Merge {Y, {X, root}}

  3. Merge {X, {Y, {X, root}}}

  4. Merge {X, {Y, {X, root}}}, {Y, {X, root}}

Pure structure-building requires two distinct elements (3rd factor binarity) to be creative. Recursive application of set-formation makes it basically impossible to distinguish between external and internal applications.
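To make this point concrete, the derivation in (8) can be replayed with the same toy set-formation used above (again a hypothetical sketch, not the author's formalism): the identical function applies whether its arguments are new atoms or terms of an already-built object, so external and internal applications are indistinguishable at the level of the operation itself.

```python
from typing import Union

SO = Union[str, frozenset]

def merge(p: SO, q: SO) -> frozenset:
    # One and the same operation: form the unordered set {P, Q}, no matter
    # where P and Q come from.
    return frozenset({p, q})

step1 = merge("X", "root")   # (8a) {X, root}
step2 = merge("Y", step1)    # (8b) {Y, {X, root}}
# (8c) {X, {Y, {X, root}}}: set-theoretically it does not matter whether this
# X is a freshly selected atom (external Merge) or the X already contained in
# step2 (internal Merge); the operation and the resulting set are the same.
step3 = merge("X", step2)
# On one reading of (8d), {Y, {X, root}} re-merges with the larger object:
# the second argument is a term of the first, i.e. internal Merge.
step4 = merge(step3, step2)
print(step3)
print(step4)
```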

Structure-building is uniform and universal. Prior to externalization, which provides the connection to PHON, there are no concrete lexical or functional inventories, which are language particular and needed for the various forms of externalization.

In her research, van Gelderen provides massive evidence for the ways in which particular languages may differ. For instance (see van Gelderen, 2009, p. 241), some languages mark thematic roles by case (Old English, and also German) and others by position (Modern English). Chinese uses time adverbials, while other languages use grammatical elements. So, we observe that there is a flexibility that allows particular languages to express meaning by using different morpho-phonological strategies. The language particular resources distribute differently across the SEM-PHON axis, which should be reflected in the architecture of the grammar and should be explicable in terms of the evolution of language too.

5 An Evolutionary Scenario: From Uniformity to Diversity

In this section, we try to put the pieces together to reconcile the sudden emergence of UG (human language; the 1st factor) shaped by the laws of nature (the 3rd factor) with the slow and ongoing development of particular languages (2nd factor) based on the options provided by the 1st and the 3rd factor.

We started the discussion with some basic insights made in the generative framework. Language crucially differs from communication, which is not restricted to the human context, and humans can use means distinct from language to communicate (e.g. smiling, crying, waving, nodding). Furthermore, language must not be conflated with speech either. Sign languages do not require articulation via sound waves but display the infinite combinatory possibilities characteristic of human language/thought like any other particular language. Language can be used for communication, and it may be externalized by means of sounds or signs.

What emerged in the human species is UG, entailing a maximally efficient system generating hierarchical structure enabling thought. This internal and uniform system does not include any variation, which we observed to be an external effect.

A simple communication system necessarily pairs external forms with a meaning. This can be illustrated as in Figure 1. There must be a combination of SEM and PHON.

Figure 1

Simple Communication

Simple organisms interact with their environment, which is clearly communication but not language.

A more complex communication system, which is still not language, entails direct referencing. The external world is referred to by signals. The creatures may use different channels (for instance, the auditory channel, cf. monkey calls). Mind-independent signs are used, by which meaning units may be listed, but no new meanings can be generated because a combinatory device is missing. SEM might have the quality of a C-I system (including a theory of mind, intentions etc. shared by humans and other creatures, for instance the great apes). Figure 2 illustrates the idea.

Figure 2

More Complex Communication

The atoms are signals which combine with a meaning that directly relates to the external system. This entails a one-to-one correspondence between the distinct layers, Atom: SEM: PHON, which is strikingly different from the human system.

The system of 'atoms' makes available the storing of unstructured items which have a meaning that may get externalized, but without a creative system SYN there can be neither structure-building nor complex thought, only direct reference.

For language, a further system has to be added: SYN. The most deeply embedded, newest system is uniquely human. It contributes the capacity for infinite structure-building (by means of recursive Merge). When UG-atoms, which are abstract and discrete by nature (categories and roots), enter structure-building, so that set-formation by internal and external application of Merge forms hierarchical structures, mind-dependent meaning and complex thought come into play and may get externalized. The overall system is indicated in Figure 3.

Figure 3

Language

Notice that by means of a generative procedure, the relation between Atom: SEM: PHON must drastically change because the unstructured atoms enter hierarchical structure-building and form recursive sets. This change entails that we do not deal with one-to-one correspondence here since syntax establishes relations that get interpreted. Furthermore, we assume that humans have the capacity for categorization. Consequently, the elements undergoing Merge, the atoms, must be different from the simple atoms of the non-human system. Therefore, we called them UG-atoms in the final figure pointing to their abstract and universal character, and to the fact that they can enter set formation by means of Merge.

It is, however, not reasonable to assume that a human being suddenly endowed with UG (4 in Figure 3) by a mutation in the brain and thereby with the capacity for thought emerging from the SYN-LEX-SEM connection started externalization at once.

It is also not reasonable to assume that this very human being endowed with the capacity for thought stopped communicating with others on the LEX-SEM-PHON-axis.

Hence, one has to assume that for this human being there was a separation between communication with his species on the one hand (using 3-2-1 of Figure 3) and thought on the other (using 4-3-2 of Figure 3). Since modern humans communicate by other means too (we do not only use language when we communicate), as we have argued before, it seems conceivable for humans today as well to make use of a non-linguistic LEX-SEM-PHON-axis besides the linguistic SYN-LEX-SEM-(PHON)-axis. What is crucial is that connecting the SYN-LEX-SEM-system with PHON, the second ('ancillary') step, as Chomsky calls it, happened with later generations. The offspring of the first thinker began to externalize their thoughts.

The interaction of the systems can be seen as follows. Connecting 3 (the atoms) with 4 (Merge) corresponds to Select. Items enter the set-formation operation taking place at SYN. Recall that we assumed that categories label sets (Transfer). Hence, Transfer connects 4 (SYN) and 3 (containing categories and roots). Connecting with 2 (SEM) is equivalent to interpreting sets, and the final step (when externalization began to be available) links the system to PHON to externalize the sets as in Figure 4.

Figure 4

Connecting SYN-LEX-SEM-PHON

The dashed arrow signifies that externalization is optional. Furthermore, the connection to this system, the oldest system, was added last (as we described in the previous section). Notice that the system is built on the set-related tasks we suggested on the basis of an architecture shaped by laws of nature such as efficiency and simplicity.

Atoms (abstract categories and roots) form the first input to SYN (from LEX to SYN). SYN creates sets. Sets need to be labeled (transferred) to be interpretable by the rules of SEM. Since categories can label sets, the arrow points to LEX again which is also an inventory of potential labels (only categories can label sets). Labeled sets enter SEM-interpretation which can finally form the input to externalization.

Further connections, as in Figure 5, have to be added to make the system of interactions complete.

Figure 5

Interactions

Importantly, we can now derive variation. From the interaction indicated by the additional arrows it follows that variation and acquisition include the language particular lexicon, because it is not an inventory or fixed store but an unstable, growing and changing concept, since the (development of a) language particular lexicon spans 1–3. System 3 (atoms) is discrete in the sense that it provides categories and roots (for human language) that enter set-formation, but the language particular instantiations that distribute over 1 (PHON), 2 (SEM) and 3 (atoms) are fuzzy, reflecting the difficulty of defining clear boundaries (compare affix, clitic, word) and the floating boundaries between content ('lexical') categories and functional categories (for instance, adpositions) on the SEM-PHON axis. Language particular lexicons develop and vary by means of the interaction that necessarily includes PHON. The PHON-system is diverse and variable in nature. Roughly speaking, there are different sounds, different gestures and different (linear) orderings. Targeting the external, humans automatically impose structure on it (by having UG). For instance, a single linear order may have different structures. So, one might say that externalized language (in this case linear order) is ambiguous, open to different analyses13. Suppose that the same applies to any other PHON-property: PHON-properties are basically variable and may be open to more than one analysis. In this sense, variation (cross-linguistic variation and diachronic change too) is rooted in externalization. Diversity enters language from the outside and is analyzed by the child learning the language according to the invariant principles of human language and nature, which may lead to competing analyses, all in accordance with the invariant principles, but which open the floor to variation. The interaction between PHON-SEM-LEX-SYN yields variation in time and space. The final connection to the oldest system is crucial for the explanation of diversity, while the newest system, SYN, accounts for the uniformity of human language.

Let us consider grammaticalization again. Grammaticalization concerns the language particular system too. According to van Gelderen (2009), grammaticalization is defined as a process whereby lexical items lose phonological weight and semantic specificity and gain grammatical function. It seems obvious that before a lexical item can lose phonological weight and semantic specificity, there must be a lexical item that has both properties. As we argued, with abstract atoms (category, root) undergoing set-formation first linked to SEM, we can derive thought, but at the very beginning there could not have been lexical items. Lexical items are language particular units. They are neither universal, nor uniform, nor stable, nor discrete. They have to be learned and they are subject to variation. Through the interaction suggested above, atoms (cat/root) may gain semantic features (SEM) and phonological features and values (PHON), and also lose or change features. The interaction is a general sign of creativity in humans because re-use, retention and renewal derive variation. Van Gelderen (2024) also provides evidence for the theoretical concept of renewal in a linguistic cycle. She explains that material that is lost in cycles provides insights into the semantic features available. In our terms, (new) lexical items arise through the interaction of the LEX-SEM-PHON-axis. Both directions are predicted. A simplified picture is given in (9).

9
  1. Direction of grammaticalization (weakening): PHON – SEM – LEX

  2. Direction of renewal (strengthening): LEX – SEM – PHON

In terms of the proposal we made, the concept of a ‘cycle’ would also be a consequence of the architecture of the grammar.

Van Gelderen (2024) elaborates on micro- and macro-cycles in depth. She convincingly shows that micro-cycles target subparts of the language particular grammar while macro-cycles affect the whole grammar of a particular language.14

Since there is necessarily interaction between 3 (LEX) and 4 (SYN), variation applies to smaller units and also to larger ones. For variation in time, we might therefore argue that targeting smaller units results in micro-cyclic change, while targeting larger units, affecting the full scale of sets formed in the derivations of a particular language, may result in macro-cyclic change. Meaning arises in structure (SEM interprets the sets generated by SYN) and sets can be externalized at PHON. It is an automatic side-effect of the architecture suggested above that changes may have an impact on smaller parts, but also on larger ones, then causing typological shifts.

Strikingly, variation in space and time are interdependent. You cannot have one without the other. Furthermore, one can only speak of variation in any reasonable sense if there is an underlying sameness (UG). This means that particular languages depend on UG (human language) but not the other way around. UG does not depend on particular languages. The fact that thought does not have to be externalized is an obvious consequence.

Particular languages (re-)organize along the (SYN-)LEX-SEM-PHON axis within the limits of a universal grammar which, embedded in human biology, also entails a universal architecture of the grammar.

A reviewer drew attention to the research of Mendívil-Giró, who makes a similar proposal. Focusing on the biolinguistic perspective, Mendívil-Giró (2014) works out an interesting analogy between natural evolution and linguistic evolution. According to him, the evolution of an organism is analogous to the evolution of an I-language, and the evolution of different species can be compared to E-languages. Importantly, this analogy implies that the variation of particular languages belongs to the external system, as we have argued too. Furthermore, Mendívil-Giró (2014, 2019a, 2019b) elaborates on an internal lexicon as a lexical interface externalizing the internal computation. In order to account for variation, the author argues that the I-lexicon develops in the course of the internalization of environmental stimuli. Consequently, particular languages are susceptible to change and variation. While syntax belongs to the internal biological system, the external systems of morphology and phonology are part of history. Mendívil-Giró (2019a) states that the lexical interface, which is culturally determined and internalized from the environment, can be compared to the externalist view on language, while the innate and universal syntax corresponds to the internalist view on language. The combination of both, under his view, provides room for the universal part that is internal to the mind on the one hand and for variation and change located at the lexical interface on the other hand. Interestingly, the interface is externalized from the perspective of the internal system, the language of thought, and internalized when viewed from the outside, accounting for language acquisition. So, this is similar to our understanding of uniformity on the one hand and variation and change on the other. The internal system can be externalized (PHON-related), with diversity entering the system from the outside. Furthermore, Mendívil-Giró (2019b) distinguishes the input to Merge and the output of Merge (syntactic words) from the output of the lexical interface (phonological words). Mendívil-Giró defines the p-word as a categorized fragment of a syntactic derivation (s-word) associated with a phonological form. We also stressed that the UG-atoms must not be equated with words and morphemes since the latter belong to the language particular resources and therefore do not qualify as universal atoms. It therefore follows under both views that structures are not projected from the lexicon.15

Mendívil-Giró (2019b, p. 1181) argues further that categorization converts a concept into a computable unit. This raises questions, though, because the operation that combines concepts and categories is Merge, and this must then already be part of the computation. Hence, concepts must be computable from the very beginning of the computation, which means that the idea of a conversion into a 'computable' unit is a bit counter-intuitive. Under our view, labels (categorization of sets) render symmetric sets visible to interpretation, which means that they get transferred to SEM. We agree with the author that syntax does not operate on morphemes, but on syntactic categories. Yet the idea of incorporating concepts through Merge, with categorization conceived of as a syntactic operation on concepts (Mendívil-Giró, 2019b, p. 1207), seems to suggest that Merge includes labeling. Categorization is also called lexicalization by the author (Mendívil-Giró, 2019b, p. 1209), although the two should actually be distinguished if categorization relates to the s-word and lexicalization to the p-word. Furthermore, Mendívil-Giró (2019b, p. 1205) speaks of Merge as being endocentric, which entails that the syntactic operation not only combines but also designates heads as labels. In contrast, we assume that Merge is a binary operation creating symmetric, unlabeled sets with no member being more prominent than the other. Assuming that Merge includes labeling goes against SMT since the simplest possible operation creating hierarchy should be favored from an evolutionary perspective. What goes beyond hierarchical structure-building should be eliminated from the syntactic operation Merge.

Another point to clarify further concerns Mendívil-Giró’s (2019b) idea of the selection (of a concept) (p. 1181). How shall selection be implemented? In addition, he refers to interpretable categorial features, extended projections and agreement in a footnote (Mendívil-Giró, 2019b, p. 1181) so that one might wonder in which part of the grammar these elements and processes should be located. Apart from these questions, Mendívil-Giró’s approach is intriguing and the reader is referred to his work for details. What is important to our discussion is that Mendívil-Giró comes to the same conclusion, namely that the architecture of the grammar must be investigated in order to resolve the tension between language as a universal human capacity and the diversity of particular languages.

We argued that variation and change enter the internal system through externalization. The internal system creating infinite thoughts is universal. Syntax, conceived of as a simple and free system constrained only by the laws of nature, is necessarily immune to variation. Externalization is an option following the emergence of (internal) language. Depending on the mode of externalization (speech or sign/PHON), external languages form. In the course of the spreading, separation and external grouping of humans, distinct particular languages (2nd factor) could have developed and still do. Language variation can be considered as diverse solutions to the task of externalizing the internal system. The impact of the three factors corresponds with the SMT-perspective we adopted in the paper and can be presented as in Table 4.

Table 4

Three Factors and Language

3rd factor = laws of nature: shaping function | 1st factor = laws of language: internal | 2nd factor = dependent on mode and environment: external
biological setting | basic property of language: hierarchy | particular languages
general learning strategies | evolution of language | language acquisition
principles of efficiency | Universal Grammar: uniform and stable | language variation and change

Notice that the third factor, conceived of as including biological conditions (human biology, vocal tract, maturation processes) and general learning strategies, acts on the external system (mode of externalization, language acquisition) too. Computational efficiency minimizes UG, which reveals the shaping function of the 3rd factor on the internal system.

Externalization requires an internal language system shaped by the laws of nature. Furthermore, the properties of external systems depend on the mode of externalization. Spoken languages require linearization and involve prosody and stress patterns. All aspects of pronunciation (linear order, stress and prosody, the (non-)pronouncing of elements such as PRO, pro and copies, contractions, language particular morphological rules such as case and agreement) and the various external patterns and systematic marking strategies found in different particular languages reflect uniform internal principles (dependencies are structure-based) and derive from interactions along the SYN: LEX: SEM: PHON axis.

6 Conclusion

The goal of the paper was to explain the general tension between symmetry and asymmetry, the internal and the external side, uniformity and diversity, universal properties and language particular properties as resulting from the architecture of the grammar. The system of human language emerged suddenly. SYN (recursive Merge), the most recent innovation, has been integrated into a system of atoms and SEM (C-I). Since Merge interacts with a system of abstract atoms (Select and Transfer) associated with SEM, the beginning of human thought does not imply externalization. Interactions with PHON, the oldest system, which can be conceived of as the latest step, have three side effects, summarized in (10).

10
  1. The internal system has been connected with the external side, thereby establishing more categories (grammaticalization and lexical resources) forming a particular lexicon.

  2. The arising system is a dynamic one because of the interactions between SYN-LEX-SEM-PHON which predict variation in space (particular languages with different lexical resources and different externalizations) and in time (diachronic change of particular languages in micro and macro cycles along the SYN-LEX-SEM-PHON-axis).

  3. Acquisition of a particular language targets the externalization (the input for the learner). The internal system (Merge using atoms to generate thought, and the general architecture) does not have to be learned because it is made available by endowment. The ambiguities in the external data are resolved under structural analyses made available by UG.

The proposal combines an SMT-setting with the idea of FLB and FLN. A simple UG (recursive system) could have emerged by means of a sudden mutation and was integrated into the biological system according to the laws of nature (3rd factor). The resulting system is based on the simplest possible operation creating an infinite number of sets and on a simple architecture entailing systems with clear-cut, set-related tasks. The overall interactions along the SYN-LEX-SEM-PHON-axis predict variation of particular languages in space and time.

Notes

1) 1st factor = UG, 2nd factor = external data, 3rd factor = principles of nature that are independent of the language system but necessarily act on language as on any other natural system.

2) Compare a German poem by an unknown author from around 1850 with the title 'Dunkel war's, der Mond schien helle' ('dark it was, the moon shone bright'), which contains many oxymora such as those in (1).

3) Notice that a set {X, Y} establishes a symmetric relation without a direction. The recursive application of MERGE creates hierarchies and thereby explains the underlying principle of human language which is hierarchical order. We do not adopt Merge as being parasitic on Agree creating directed, asymmetric and labeled sets like {x X [+F], Y [-F]}.

4) Seely (2006) conclusively shows that labels cannot be syntactic objects since they are not terms and do not enter c-command relations. Furthermore, their elimination simplifies the operation Merge. The reader is referred to Seely's insightful and elegant discussion. Chomsky (2013) argues that labels identify an object, which is necessary for its interpretation. Hence, it makes sense to put labels outside of the syntax. In the context of labeling = Transfer, labeled sets are inaccessible to SYN but accessible to SEM.

5) See also Bode (in press) for a detailed discussion of this point. Suffice it to say here that a dependency is an irreversible relation since X depends on Y (versus Y depends on X). A relation is interpreted by receiving a direction. This direction can be either phonological (i.e. case) or semantic (i.e. arguments/theta-predicate) in nature.

6) The transitions (iii-iv) will be the subject of the following sections too.

7) Compare with Chomsky (2021, 2023).

8) See Bode (in press) for a detailed discussion of the problematic and mysterious status of the UG-atoms of computation.

9) This is evident with synthetic languages but also obvious in analytic languages such as English where syntactic categories occur attached to lexical categories/roots (see i-ii) when externalized. SYN-PHON mismatches are an architectural consequence.

  1. The man work-ed hard. (T-v-root)

  2. They ran yesterday. (T-v-root)

10) A reviewer raised the question of how to define phase heads without (u)phi features. In Chomsky’s system, phase heads are points of spell-out and interpretation. The definition in terms of uninterpretable features is problematic insofar as these features refer to a triggered system again. We argue not only for a free syntax but also for free Transfer. Labeling (conceived of as Transfer) prepares the output of SYN, namely unlabeled, symmetric sets for interpretation. In an efficient system, creation of sets should not apply vacuously. Hence, applying labeling (categorization of sets) as a prerequisite for entering interpretation can be viewed in terms of 3rd factor efficiency in line with SMT.

11) A reviewer correctly stresses that FE refers to linguistic features, which do not belong to third factor principles. Consequently, one must assume that the economy part of the conditions mentioned in the text can be derived from the third factor but then needs to be considered in combination with language specific material.

12) Van Gelderen provides in-depth analyses of the changes and presents convincing empirical evidence from various particular languages and stages of particular languages. The reader is referred to van Gelderen's excellent discussion (2022b, and 2024 in particular).

13) An example such as the one in i. illustrates that the linear sequence is subject to the two structural analyses in ii. and iii.

  1. I know how happy linguists feel.

  2. I know [[how happy] C [linguists feel <how happy>]].

  3. I know [[how] C [[happy linguists] feel <how>]].

14) Particular languages may change from analytic to synthetic and back. Van Gelderen (2024) shows that macro cycles relating to agreement, namely cycles concerning head-marking shifting to dependent-marking (and back), are more precisely definable than the synthetic/analytic cycle. She furthermore shows that linear word order is subject to diachronic change too. The reader is referred to her excellent discussion.

15) The reader is also referred to the tradition of Distributed Morphology (Halle & Marantz, 1993). Basically, syntax in DM operates on abstract categories and roots, and concrete material is inserted late (post-syntactically). In DM, there is no traditional lexicon but three lists that get accessed at different points of the derivation. Importantly, morphology and phonology are realizational (late vocabulary insertion) and syntax is the sole generative component.

Funding

The author has no funding to report.

Acknowledgments

My thanks go to Kleanthes K. Grohmann for supporting the publication of this article. I am also grateful to two reviewers for important questions and comments that I hopefully could address properly. I especially appreciate the mention of related work by Mendívil-Giró that had completely escaped my attention until now.

Competing Interests

The author has declared that no competing interests exist.

References