Pragmatic Grammar in Genus Homo

Dieter G. Hillert*1, Koji Fujita2

Biolinguistics, 2023, Vol. 17, Article e11911

Received: 2023-05-08. Accepted: 2023-08-13. Published (VoR): 2023-09-06.

Handling Editor: Patrick C. Trettenbrein, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany

*Corresponding author at: SDSU School of Speech, Language, and Hearing Sciences, SLHS Building room 221, 5500 Campanile Drive, San Diego, CA 92182-1518, USA. E-mail:

This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


The question of how humans got language is crucial for understanding the uniqueness of the human mind and the cognitive resources and processes shared with nonhuman species. We discuss the origin of symbolic elements in hominins and how a pragmatic grammar emerged from action-based event-structures. In the context of comparative neurobiological findings, we report support for the global workspace hypothesis and social brain hypothesis. In addition, reverse linguistic analysis informs us about the particular role of a pragmatic grammar stage. We assume that this stage was associated with changes to the hominin genotype. Homo erectus may have used a pragmatic grammar which consisted of two or three symbolic elements. Extended syntax and morphology, including hierarchical branching, are not based on genotype changes but may reflect cultural accumulations related to socioecological adaptations. We conclude that the biological capacity for language may have emerged already 1.8 million years ago with the appearance of genus Homo.

Keywords: comparative neurology, evolution of language, Homo erectus, pragmatic grammar, extended syntax

1 Introduction

Children acquire at an early age the ability to organize syntactic and morphological relations between and within words. These relations include various rule-governed structures, from flat structures to non-adjacent dependencies, and syntactic frames, (semi-)fixed expressions, morphological case markers, etc. We call these elaborated syntactic and morphological structures extended syntax or extended morphology. Here, we are motivated to enhance our understanding of how these extended structures emerged in the hominin lineage. According to the minimalist account of generative syntax, the syntactic capacity emerged from a single macro-mutation as early as 100–200 kya in modern humans (e.g., Berwick & Chomsky, 2016, 2019; Chomsky, 2017). This mutation brought about the Merge operation. It is said that the single operation Merge is critical for syntax, defines human language per se, and, applied recursively, generates infinite structures from finite means in a binary fashion (Chomsky, 1995).
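To make the notion of a binary, recursively applicable operation concrete, the following Python sketch treats Merge as unordered set formation over two syntactic objects. It is a toy illustration under our own assumptions, not an implementation drawn from the minimalist literature: the names merge and depth, the use of frozenset, and the restriction to two distinct objects are choices made here for the example.

```python
from typing import FrozenSet, Union

# A syntactic object is either a lexical item (a string) or a set built by merge.
SyntacticObject = Union[str, FrozenSet["SyntacticObject"]]


def merge(a: SyntacticObject, b: SyntacticObject) -> SyntacticObject:
    """Combine two syntactic objects into one unordered two-member set."""
    if a == b:
        # Assumption of this toy version: only two distinct objects are merged.
        raise ValueError("this sketch only merges two distinct objects")
    return frozenset({a, b})


def depth(obj: SyntacticObject) -> int:
    """Count the nested merge applications on the deepest path."""
    if isinstance(obj, str):
        return 0
    return 1 + max(depth(member) for member in obj)


# Recursive application: the output of one merge is an input to the next.
phrase = merge("eat", "chicken")   # {eat, chicken}
clause = merge("we", phrase)       # {we, {eat, chicken}}
```

Because the output of merge is itself a valid input, the same operation can be reapplied without limit, which is the sense in which finite means yield unbounded structure. The sets are unordered, so merge("eat", "chicken") and merge("chicken", "eat") denote the same object.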

Indirect anthropogenetic findings, however, favor a Darwinian scenario (e.g., Christiansen & Kirby, 2003; Christiansen et al., 2009; de Boer et al., 2020; Foley, 2001; Fujita & Fujita, 2022; Hillert, 2015, 2021; Pinker & Bloom, 1990; Zuberbühler, 2019). Nevertheless, the apparent debate is less controversial than is often stipulated since the Merge- or recursion-only hypothesis reserves adaptive evolutionary processes for the broad language faculty (Hauser et al., 2002; Pinker & Jackendoff, 2005).

Our approach considers neurobiological changes in the hominin lineage and evidence from reverse linguistic analysis. We discuss various pre-syntactic conditions and view the biological Merge capacity as a property that may have emerged early in our hominin ancestors, such as Homo erectus. The actual cognitive implementation and use of this capacity in language may be a late cultural byproduct of complex syntactic branching, particularly in the context of the development of writing and reading. The signal system of extant nonhuman primates has been fairly well assessed. Monkeys use referential vocal signals and respond accordingly to different call types (Fitch, 2000; Gifford et al., 2005; Seyfarth et al., 1980). The signal patterns are fixed in monkeys but less constrained and more open in nonhuman apes (e.g., Botha, 2003; Coudé et al., 2011; Deacon, 1989; Ferrari et al., 2017). One of the earliest forms of hominin communication may have consisted of a signal exchange system based on the perception of events. The signals that resembled our ancestors’ environment were presumably mainly iconic and holistically stored and retrieved from memory. Then, a transition took place from discrete references to sense, a distinction already introduced by Frege (1892). The development of discrete concepts and nonverbal action-based event-structures is one of the most critical steps for creating words, that is, conventionalized vocal or gestural signs. Pantomimes certainly played an important role in referring not only to perceived events but also to imagined ones (e.g., Arbib, 2012; Gärdenfors, 2017; Zywiczynski et al., 2018). Specifically, pantomimes include the ability to imagine past and future events. The process of imagination is a precondition for the invention of conventionalized symbols. We discuss a more precise possible scenario further below, but non-referential concepts seem not to be verifiable in nonhuman primates.

Here, we consider extended syntax and morphology as an optional late cultural product. In contrast to the minimalist account, we assume that the biological Merge-capacity is not necessarily human-specific. One reason is that the level of cortical development in Homo erectus was virtually comparable to early modern humans, that is, quite advanced. Another reason is that we do not consider the iterative application of Merge to be critical for the initial stages of language. Instead, we suggest a pragmatic grammar without function words or inflections that exclusively relies on contextual information. The emergence of basic lexical grouping is semantically motivated (see Goldberg, 2005; Hillert, 2023; Jackendoff, 1997). Others propose an intermediate stage of proto-Merge (Progovac, 2010, 2015; Progovac & Locke, 2009) or an initial stage of core-Merge (Suzuki & Matsumoto, 2022).

We agree here with the idea of one (or more) intermediate precursor stages of modern language, such as pragmatic grammar. The critical stages involve the development of symbolic elements and their grouping within a perceived event-context. These lexical groupings can be considered as semantic frames. Syntax may not have played a cardinal role at this stage. We find evidence for this argument in contact languages (signed or spoken) and in the analysis of modern languages (e.g., Bickerton, 1990; Gil, 1994, 2005, 2013; Gil & Shen, 2019; Jackendoff, 1999; Jackendoff & Wittenberg, 2014, 2017; Willer Gold et al., 2018).

A pragmatic grammar may have been the semantic foundation for organizing words hierarchically, including binary branching, such as Merge, or n-ary tree structures. We believe that the ability to use a pragmatic grammar may be associated with properties of a new genotype. This new genotype can be associated with late Homo erectus if we consider neurobiological factors and archeological reports about their cognitive behaviors. Since we find structures of a pragmatic grammar in first language acquisition (e.g., telegraphic speech) and language breakdowns (e.g., agrammatic Broca’s aphasia) but also in different types of contact languages (e.g., second language acquisition or home-signing), we assume that there are no genotype differences in the ability to use a pragmatic or an extended syntax.

Extended syntax and morphology are exclusively a cultural product and took thousands of years to develop. However, as mentioned before, the biological capacity for a pragmatic grammar may have been deeply rooted in the survival strategy of an early hominin group refining the dominant social group structures which can be studied even today in extant nonhuman primates. We agree that binary branching such as Merge is an elegant computational mechanism to generate complex structure with a single operation. According to our account, however, the human brain is more flexible and therefore more efficient in generating lexical groupings in multiple ways rather than relying on a rigid single operation. Cognitive and neurobiological factors must be considered to understand how syntax and other properties of language emerged. In particular, rehearsal of specific sound patterns associated with discrete concepts expanded the workspace capacity at the cortical level.

In sum, the shift from a pragmatic grammar to non-binary or binary branching served the purpose of internalizing rules for expressing complex thoughts. Morphosyntactic rules replaced complex storytelling at the pragmatic grammar level. The internalization of morphosyntactic rules may have been related to population growth and to the urge for complex social bonding and coordination (Dunbar, 1996). It took modern humans about 16,000 generations to develop a modern language (assuming that a generation lasts, on average, 18 years). However, the timeline may be much longer if we consider different precursor stages. In addition, we assume that the genotype, here the innate capacity supporting a pragmatic grammar and its modern derivatives, is not unique to early modern humans but shared with their closest extinct relatives, such as Neanderthals, Denisovans and late Homo erectus, if we adopt the classical species taxonomy and do not reclassify them as variations of the same species, that is, Homo sapiens sensu lato (Bräuer, 2008).
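The time span implied by the generation estimate above follows from a simple product. The short Python sketch below spells out the arithmetic; the function name elapsed_years is ours, and the 25-year generation length is only an illustrative alternative assumed here to show how sensitive the figure is to that parameter.

```python
def elapsed_years(generations: int, years_per_generation: int) -> int:
    """Total time span covered by a number of successive generations."""
    return generations * years_per_generation


# Figures stated in the text: about 16,000 generations of 18 years each.
span_in_text = elapsed_years(16_000, 18)       # 288,000 years

# Hypothetical alternative generation length, for comparison only.
span_alternative = elapsed_years(16_000, 25)   # 400,000 years
```

With the stated values, the estimate corresponds to roughly 288,000 years.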

2 Cortical Changes in the Hominin Lineage

Human language is strikingly different from animal communication because a grammar system organizes words in sentences and discourse to convey complex meanings. One reason is that language is a cultural product that is ready-made and available to the child in its surrounding world. Another reason is that the child's brain is language-ready. The brain is ready to be selectively sculptured according to the perceived input. After birth, synaptic density significantly increases and peaks at 1–2 years of age. It drops sharply during adolescence and stabilizes during adulthood. Evolution selected neural excess and pruning in the lineage of hominins as an efficient and robust mechanism to shape distributed networks for cognition (e.g., Navlakha et al., 2015). Typically developing children acquire the basic properties of language at the latest by the age of four or five. However, the acquisition of extended syntax involving non-canonical structures and long-distance dependencies takes much longer, until late childhood or early adolescence (e.g., Dabrowska et al., 2009; Skeide et al., 2016; Skeide & Friederici, 2016). Moreover, a prolonged acquisition process applies also to certain non-literal expressions, including sarcasm and novel metaphors (e.g., Glenwright & Pexman, 2010; Van Herwegen et al., 2013). In contrast to macaques or chimpanzees, cortical synaptogenesis is significantly delayed in humans (e.g., Huttenlocher & Dabholkar, 1997; Liu et al., 2012). In-born acquisition abilities enable the child to use finite means to produce infinite structures. To what extent these innate abilities are language-specific syntactic parameters (Chomsky, 1986) or properties of general cognition, such as mind reading and perceptual and cognitive strategies, remains to be seen (Tomasello, 2003).

We find qualitative and quantitative differences in comparing the neural properties of the human brain against the brain of (nonhuman) great apes. In humans, the hubs of the language circuit are Broca’s area with the Brodmann areas (BAs) 44 & 45 (respectively pars opercularis and pars triangularis) and Wernicke’s area with the posterior sections of BAs 21 & 22 of the superior and middle temporal gyrus (S/MTG). Moreover, prosodic information and metaphoric expressions are primarily processed in the right hemisphere (e.g., Bottini et al., 1994), but idiomatic strings are not (Hillert & Buračas, 2009), and auditory and fine-grained articulatory processes involve subcortico-cortical structures, particularly the basal ganglia. Furthermore, neuropsychological data reveal an ambiguous picture. Some studies report that the language circuit is engaged not only during language processing but also in the context of actions, music, or calculations (e.g., Bookheimer, 2002; Fadiga et al., 2009; Nishitani et al., 2005; Ruck, 2014; Wakita, 2014). Other studies show language-specific activations in the left inferior frontal gyrus (e.g., anterior BA 44) which do not overlap with non-linguistic processes (e.g., Campbell & Tyler, 2018; Fedorenko et al., 2011; Jouravlev et al., 2019; Papitto et al., 2020). Some methodological issues are associated with this debate: any two different tasks will recruit different cortical activations. The question is how specifically we define the relevant cortical region. The narrower the definition of a cortical region, the higher the probability of finding activations exclusively for a specific task within the predefined region of interest. It is an empirical question whether specific cortical regions are recruited by the type of computation rather than by domain specificity. 
Furthermore, it has been argued that activation differences may be related to differences in workspace demands and integrative control functions associated with a syntactic structure (e.g., Hillert, 2014; Kaan & Swaab, 2002; Novick et al., 2010; Saur et al., 2008). We use the term workspace here to refer to a buffer that holds memory traces for about two seconds, but rehearsal operations prevent fading (Baddeley & Hitch, 1974; Cowan, 2001). This approach is consistent with the global workspace hypothesis. Dynamic global workspace functions are supported by prefrontal and posterior regions (e.g., Baars et al., 2013; Dehaene & Changeux, 2011). Most interesting is the finding that the white-matter fiber streams of the language circuits seem to engage different types of computations. The dorsal streams, which connect Broca’s area and the prefrontal motor cortex with the parietotemporal junction (PTJ) and posterior STG, mainly consist of the arcuate fasciculus (AF) and the superior longitudinal fasciculus (SLF). In principle, SLF connects frontal and parietal regions, and AF is a frontotemporal tract extending towards the parietal under SLF. The streams differ in their endpoints: AF terminates (directly or indirectly) in BA 44, SLF in BA 6 of the premotor cortex (e.g., Bernal & Ardila, 2009; Friederici & Gierhan, 2013). AF is particularly implicated in complex, hierarchical syntactic processing but also in phonological processing. SLF is involved in speech processing, including rehearsal operations (Catani & Mesulam, 2008; Hickok & Poeppel, 2007).

The ventral streams connect the posterior temporal lobe (STG and MTG) to Broca’s area via the extreme capsule (EC) and uncinate fasciculus (UF). The primary function of the ventral streams is to transfer lexical information (form and meaning), local phrase structures, and treelets to the inferior frontal gyrus (e.g., Bajada et al., 2015; DeWitt & Rauschecker, 2012; Hillert, 2014; Hodgson et al., 2021; Matchin & Hickok, 2020; Pillay et al., 2017; Ralph et al., 2017; van der Lely & Pinker, 2014). Finally, two different streams, the inferior longitudinal fasciculus (ILF) and inferior fronto-occipital fasciculus (IFOF), connect the occipital lobe with the frontal lobe via temporal regions. Little is known about their precise functions, but it has been suggested that they are involved in processes associated with lexical semantics, goal-orientation and possibly theory of mind (e.g., Almairac et al., 2015; Catani & Thiebaut de Schotten, 2008; Glasser & Rilling, 2008). Compared to nonhuman great apes, the anterior prefrontal and precentral regions of the human cortex increased in size (Semendeferi et al., 2001; Schoenemann et al., 2005).

Again, left-sided asymmetric regions homologous to Broca’s and Wernicke’s areas have been identified in nonhuman primates (monkeys, chimpanzees, bonobos, gorillas, orangutans). However, these homologs differ in size, degree of laterality, cortical connectivity, and microstructure. In general, Broca’s area (BAs 44 & 45) and the anterior (Heschl’s gyrus) and posterior portions of Wernicke’s area (BA 22 corresponds to Tpt) show a more pronounced left-sided asymmetry consisting of larger cortical mini-column spacing for better connectivity (e.g., Buxhoeveden et al., 2001; Golestani et al., 2007; Tzourio-Mazoyer & Mazoyer, 2017). AF of the dorsal stream projects further into the middle and inferior temporal cortex. Wider-spaced mini-columns enable a higher resolution of phonological processing, while a denser structure with a lower resolution is associated with holistic-like processes (e.g., Hopkins et al., 2009; Palomero-Gallagher & Zilles, 2019; Schenker et al., 2008, 2010; Spocter et al., 2010; Wilson & Petkov, 2011). Furthermore, it has been reported that BA 44, but not BA 45, is left-over-right asymmetric in individual adult human brains (n = 10; Amunts et al., 1999), but asymmetry of BAs 44 & 45 seems to change throughout the lifespan, and BA 44 is later maturing (Amunts et al., 2003). Left-sided asymmetry of neuropil spacing (space between neurons and glial cells) in the gray matter has, however, also been found in other cortical regions, such as the visual and primary motor cortex (Amunts et al., 1996, 2007; Seldon, 1981a, 1981b). It is, therefore, plausible to assume that some cortical changes are a direct outcome of environmental factors and relate to different behavioral-cognitive activities that are not necessarily language-specific. Moreover, MTG and ITG expanded in the hominin lineage. In macaques (Macaca) and chimpanzees (Pan troglodytes), AF reaches posterior STG, whereas in modern humans it also projects massively into MTG and ITG (e.g., de Schotten et al., 2012; Rilling, 2014; Sousa et al., 2017).

The precise neuroanatomical changes associated with extinct ancestral hominins are difficult to reconstruct as the only evidence relies on endocasts (Holloway, 1978). Australopithecus (A.) species mainly lived between 4.4 and 1.4 mya in eastern and southern Africa during the Pliocene and Pleistocene cooling periods. The fossil remains of the bipedal A. afarensis show hybrid anatomical features (such as dentition and shape of skeletal structure) between Homo species and nonhuman great apes. Paleoneurological evidence points to an expansion of the superior and inferior parietal region at about 3 mya, which may have caused rewiring of the temporoparietal junction, including a region homologous to Wernicke’s area (Bruner et al., 2023). Endocasts of A. afarensis show that the lunate sulcus, which separates area V1 of the occipital lobe from the angular gyrus of the parietal lobe, is placed more posterior (Armstrong et al., 1991; Dart, 1925; Holloway et al., 2004). Endocasts of Homo erectus, from which anatomically modern humans are descended, show significant brain expansion up to 1,000 cc and a pronounced Broca’s cap. This bulge can be seen in an endocast at the level of the temporal pole. A more recent endocranial morphology study supports the view that the frontoparietal areas expanded in concert rather than separately (Ponce de León et al., 2021).

In sum, we assume that with expansion and interconnectivity of the neural networks, signals became more discrete at the low end of the iconic-symbolic spectrum. Semantic relations between symbolic signals may have initially referred to perceptual criteria without syntactic constraints. Not only did experiences become internalized in symbolic representations, so did the relations between these concepts in terms of action-based event-structures. A critical role was certainly played by the increasing workspace and rehearsal capacity, but also by cortical control of signing and vocalization. We assume that abstract semantic categories are required for high-ordered branching in general, whereas the n-ary Merge operation may be a byproduct of those properties. Before we discuss in more detail how extended syntax emerged from symbolic representations, let us briefly review the concept of a pragmatic grammar.

3 Reverse Linguistic Analysis

We find evidence that a precursor stage of extended syntax is rooted in simple syntax (Culicover & Jackendoff, 2005). We introduce the term pragmatic grammar here as the semantic or syntactic relations between lexical elements that are implicitly provided by pragmatics rather than by syntactic markers or word order. Asymmetric semantic relations may be based on non-verbal strategies, such as agent-first, and preference attributes may be mentally stored along with a symbolic unit. In general, the interpretation relies mainly on contextual information, prosody, or default strategies. A pragmatic grammar can be found in certain stages of first and second language acquisition, in agrammatic aphasia, in grammar acquisition of feral children, in contact languages and emerging sign languages (e.g., Bickerton, 1981, 1990; Jackendoff, 1997, 1999; Jackendoff & Wittenberg, 2014; Klein & Perdue, 1997; Progovac & Locke, 2009; Sebba, 1997; Tallerman, 2014).

An often-quoted example is the Malayan dialect of Riau Indonesian, which served in its history as a lingua franca. It is considered to be mono-categorial: it has virtually no syntactic categories, and the word order is based on pragmatic or prosodic strategies provided by an association operator (Gil, 2005, 2013, 2014). Depending on the context, listeners interpret ayam makan (chicken eating) or makan ayam (eating chicken) as we eat chicken, someone is eating chicken, someone eats chicken because of the chicken, the chicken is eating, etc. Presented out of context, the default associative strategy may be to understand chicken as the theme and not as the agent. Otherwise, the speaker has the option to use a grammatical marker. The existential marker ada in ada makan can be understood as there is an eating, someone’s eating, or he did eat although context is still required for a more precise interpretation.

Moreover, an interesting lexical pattern can be found at around 18 to 24 months during first-language acquisition. Children produce two-word utterances with a mean length of utterance (MLU) of two morphemes (range 1.75–2.25), such as give toy or Daddy go, whereas inflections and function words are rarely produced. In general, MLU gradually increases during acquisition, but different grammar stages can be differentiated (Brown, 1973). One study showed that workspace span abilities in 3-year-olds are a better predictor of MLU than age is (Blake et al., 1994). Children also go through these stages when the acquisition is delayed, as in the case of two deaf children who were not exposed to a first sign language until the age of six years (Berk & Lillo-Martin, 2012). A study with “post-childhood” first language learners of American Sign Language (ASL) with at least 9 years of language experience shows that pragmatic grammar (event knowledge) overrides word order, independent of the subject’s animacy. In contrast to the control groups, deaf native ASL signers and hearing second-language ASL signers consistently relied on word order (Cheng & Mayberry, 2019, 2021). In the case of restricted language experience in early childhood, a structural magnetic resonance imaging (MRI) study reveals negative changes in adjusted grey matter volume and cortical thickness in bilateral frontotemporal regions. However, no anatomical changes are reported when deaf infant signers are compared to hearing infant speakers (Cheng et al., 2019, 2023).
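The MLU measure mentioned above is simply the total number of morphemes divided by the number of utterances. The Python sketch below illustrates the computation under a simplifying assumption of ours: morpheme boundaries inside a word are pre-marked with a hyphen (e.g., dog-s), whereas real transcripts require manual or tool-assisted morphological coding. The sample utterances other than give toy and Daddy go are invented for illustration, and the function names are hypothetical.

```python
from typing import List


def count_morphemes(utterance: str) -> int:
    """Count morphemes, assuming hyphens mark word-internal boundaries."""
    return sum(len(word.split("-")) for word in utterance.split())


def mean_length_of_utterance(utterances: List[str]) -> float:
    """Total morphemes divided by the number of utterances."""
    if not utterances:
        raise ValueError("at least one utterance is required")
    total = sum(count_morphemes(u) for u in utterances)
    return total / len(utterances)


def in_two_word_range(mlu: float) -> bool:
    """Check the range given in the text for the two-word period."""
    return 1.75 <= mlu <= 2.25


# 'give toy' and 'Daddy go' come from the text; the rest are invented.
sample = ["give toy", "Daddy go", "more", "dog-s run"]
mlu = mean_length_of_utterance(sample)   # (2 + 2 + 1 + 3) / 4
```

For this invented sample, the four utterances contain eight morphemes in total, so the measure falls inside the 1.75–2.25 range cited for two-word speech.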

Again, deaf children, who create home signs to communicate with their hearing parents, rely on a relatively fixed word order by distinguishing the agent role and placing the action in the final position of a sequence. Similar to spoken language, deaf children go through two gestural stages, and their developed home sign systems are more complex than the gestures used to support speech (Feldman et al., 1978; Goldin-Meadow, 2003; Goldin-Meadow & Yang, 2017). The well-known case of the Nicaraguan Sign Language also illustrates a gradual process from a basic to a more extended grammar. Initially, the deaf children used a word order based on pragmatic principles. The younger deaf children elaborated on these basic structures acquired from the older children and developed grammatical markers to express syntactic relations or verb agreement (Senghas et al., 2004, 2005). Further examples are the emerging Al-Sayyid Bedouin Sign Language (Sandler et al., 2005) and the isolated village sign language Central Taurus Sign Language (Caselli et al., 2014) which indicate similar basic-to-extended grammar patterns.

Again, adults who learn a second language without explicit instructions show a canonical linguistic competence called the basic variety across all examined pairs of first and second language (Jackendoff, 1999; Klein & Perdue, 1997). Initially, second-language speakers tend to acquire words without inflections and rely on a word order based on pragmatic strategies. For example, the agent-first strategy, which often applies together with the focus-last strategy, is efficient in interpreting trigrams such as hit girl boy as The girl hit the boy rather than The boy hit the girl. Focus-last often represents the result or significance caused by the agent. However, pragmatics typically tells us the intended meaning. For example, the strings drink milk Bob and drink Bob milk will always be understood as Bob drinks milk.
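The interplay between the agent-first default and pragmatic override can be sketched as a small decision rule. The Python example below is a toy model under our own assumptions: animacy stands in for the much richer world knowledge that pragmatics supplies, and the ANIMATE lexicon and the function interpret are hypothetical names introduced only for this illustration.

```python
from typing import Tuple

# Hypothetical toy lexicon; animacy is a stand-in for world knowledge.
ANIMATE = {"girl", "boy", "Bob"}


def interpret(action: str, first: str, second: str) -> Tuple[str, str, str]:
    """Return (agent, action, patient) for an action followed by two nouns."""
    first_animate = first in ANIMATE
    second_animate = second in ANIMATE
    if second_animate and not first_animate:
        # Pragmatic override: only the second noun is a plausible agent.
        return (second, action, first)
    # Default: agent-first, with the remaining noun in focus-last position.
    return (first, action, second)


reversible = interpret("hit", "girl", "boy")     # word order decides
milk_first = interpret("drink", "milk", "Bob")   # plausibility decides
bob_first = interpret("drink", "Bob", "milk")    # both cues agree
```

In this sketch, word order matters only when both nouns are equally plausible agents, as in hit girl boy. When only one noun can plausibly act, both orderings of the drink example resolve to the same reading.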

Individuals who suffer from brain lesions show systematic linguistic deficits. In the case of agrammatic aphasia, patients often fall back on the agent-first strategy since they have particular difficulties with function words and assigning thematic roles. Accordingly, they have a high error rate in understanding reversible passive sentences or object-relative clauses in which the patient is mentioned first (e.g., Caplan et al., 1985; Caramazza & Zurif, 1976). Another example is feral children who have difficulties acquiring the grammatical competence of native speakers. Genie, a well-known victim of severe child abuse, was not exposed to language until the age of 13 years. She quickly acquired words after her rescue, but her grammar remained far behind despite many years of intensive training (Curtiss, 1977).

A pragmatic grammar also resurfaces in standard fully-fledged languages. These structures include the agent-first strategy, minimal attachment of modifiers, literal and figurative lexical collocations, and syntactically freely placed adverbial expressions. If pragmatic grammar was a precursor stage in evolution, its interpretative processes relied on contextual information, world knowledge, theory of mind about subjective intentions or social conventions, and on additional gestural, vocal, facial, or postural cues. All these aspects are still part of spontaneous speech today. We assume that the refinement of grammatical structure, including extended syntax, is closely related to the implications of the social brain hypothesis. The social brain hypothesis implies a correlation between social group size and neocortex size in primates. In modern humans at least, this correlation is mediated by mentalizing skills and associated with the theory of mind network that links the prefrontal cortex with the temporal lobe (e.g., Dor, 2015; Dunbar, 1996, 2005, 2009; Dunbar et al., 2015; Roberts et al., 2022). The social brain hypothesis is consistent with the previously mentioned concept of global workspace functions. They are considered here to be significant for the development of extended syntax in languages.

4 The Emergence of Semantics and Syntax

An answer to how the capacity for extended syntax and morphology emerged remains speculative. However, indirect evidence from various disciplines, particularly paleoanthropology, lets us sketch a plausible scenario. Our starting point is the signal exchanges of our closest extant relatives, monkeys and genus Pan. Monkeys combine no more than two vocal signals, and the meanings seem to be idiomatic-like or combinatory rather than compositional (e.g., Arnold & Zuberbühler, 2008; Cheney & Seyfarth, 1990; Seyfarth & Cheney, 2003; Zuberbühler, 2019; Zuberbühler & Bickel, 2022). Again, trained or enculturated chimpanzees occasionally produce flexible bigrams to express immediate needs (e.g., Crockford & Boesch, 2005; Girard-Buttoz et al., 2022; Goodall, 1986; Savage-Rumbaugh et al., 1986).

Apart from fossilized bones, the most striking clues about cognitive behavior in the hominin lineage are the development of the lithic tool industry, from basic pounding tools to flint knapping. A. afarensis already engaged in habitual tool manufacture as early as 3.4 mya (Skinner et al., 2015), while flint-knapping as part of the Acheulean assemblage was a domain of Homo erectus. But what kind of abilities do these tools indicate concerning the evolution of language? The oldest hominin tool users were individuals of the species A. afarensis. This species lived about 3 million years ago and applied Oldowan techniques.

These techniques require only basic goal-oriented behavior, consisting of a few percussions, and indicate sequential steps. In contrast, the Acheulean techniques are associated with Homo erectus, a species with a significant increase in cortical mass and connectivity (up to 1,000 cc) compared to Australopithecus (450 cc). In particular, manufacturing a hand-axe at around 1.6 mya required more than 50 percussions, from which several goal-oriented steps can be inferred (e.g., Gowlett, 2006; Holloway, 2008, 2012). The manufacturing steps of the Acheulean techniques were removing the core's surface layer, detaching large flakes for bifacial thinning, finer thinning and shaping, and preparing the edge. Finishing work was done with wooden or bone hammers to control the flaking process better. These techniques imply visual affordance and manual actions to be planned and sequentially combined. Moreover, it is also argued that the late Acheulean techniques (< 800k years ago) imply hierarchical steps and nested part-whole structures (Stout, 2011; Stout et al., 2008). More recently it has been argued, however, that action grammar is sequential in nature and shows weak compositionality (Coopmans et al., 2023).

Two aspects are of particular interest here. First, we find a significant increase in the complexity of toolmaking from Oldowan to late Acheulean. Second, functional MRI studies simulating late Acheulean toolmaking steps and language production both activate the inferior frontal gyrus, including Broca’s area (e.g., Molenberghs et al., 2009; Stout et al., 2021; Uomini & Meyer, 2013). This finding supports the thesis that Broca’s area is involved in processing more complex intentional actions (Fedorenko et al., 2012; Koechlin & Jubault, 2006). To what extent BA 44 or BA 45 or further subdivisions thereof are specifically involved in action grammar, much like for symbolic computations, requires further research. Furthermore, as mentioned before, we can find dominant social group structures or structured representations in great apes’ cognition (Planer & Sterelny, 2021). Since action grammar developed during a period of more than 2 million years, it is a plausible assumption that behavioral changes had an incremental impact on early hominins’ cognition and brain structure and circuits. Thus, it is possible that initially action grammar provided the neurophysiological foundation for symbolic grammar. We argue here that it is not only Broca’s area and its subdivisions, which may have gradually emerged, but the complete frontotemporal circuit providing a substantial increase in workspace.

Another plausible link between action and symbolic grammar is suggested by the technology hypothesis, which states that skills of stone-tool making were culturally transmitted by gestural language (e.g., Corballis, 2003; Fazio et al., 2009; Fitch, 2014; Fujita, 2009; Fujita & Fujita, 2022; Lombao et al., 2017; Morgan et al., 2015; Stout & Chaminade, 2012). Thus, imitation and pantomimes were cardinal for teaching tool manufacturing and informing about weather conditions, predators, or locations of food resources (e.g., Arbib, 2011, 2012; Gärdenfors, 2017, 2021). In particular, the teacher-student relationship may have played a significant role. The partial transfer to vocal instructions was a success story. Although the following evolutionary steps of pragmatic grammar lack direct empirical evidence, they seem plausible, and most are debated in the literature.

Initially, the signing was iconic and holistic (including pantomimes) in both the gestural and vocal domains, and imitated sounds and shapes of the perceived habitat. Onomatopoeia, in which a word resembles the sound it describes, is perhaps one residue. Another type is sound-shape congruency, such as the bouba-kiki effect, which shows that sounds may be linked to shapes across cultures in a way that is non-arbitrary. For example, speakers associate the nonce word bouba with a round shape and kiki with a spiky shape (e.g., Ćwiek et al., 2022). The development of discrete concepts includes a gradual dissociation from iconicity towards symbolism. The role of iconicity in ASL indicates that the development towards sensory-independent meanings might also be motivated by easing processing demands. In one study, only new, hearing ASL-learners benefited from sign iconicity, in contrast to proficient ASL-English bilinguals. Different factors might be related to this outcome. One explanation is that bilinguals’ iconic sign computations are conceptually mediated, slowing down processing time. One possible conclusion is that ASL-English bilinguals process symbolic (non-iconic) signs more efficiently than iconic signs since concepts can be directly accessed (Baus et al., 2013; Emmorey, 2014). The evolution of a non-iconic semantic network may therefore be motivated not only by the increasing number of lexical options but also by having direct access points to discrete concepts. Other important factors may have contributed to this emerging process, such as gossiping, grooming, motherese, or pair bonding (e.g., Számadó & Szathmáry, 2006).

However, the most challenging question is how vocalizations associated with emotional arousal became a phonetic, speech-like format. As a first step, vocalizations may have been used intentionally before the sound patterns became arbitrary and were applied to speech. Thus, the segmentation process was also gradually implemented at the sound level before a speech-like format was developed. One idea is that the segmentation of holistic chunks of sound patterns produced distinct syllables (MacNeilage, 2008). Moreover, the increasing demand for more (content) words called for affixation and hierarchically organized structures of the sound patterns (e.g., Carstairs-McCarthy, 1999; Jackendoff, 1999; Wray, 1998). The duality of patterning was born (Hockett, 1959).

We suggest that semantics, along with phonology, was born before basic or extended syntax. Collocations of two or three words may have been the standard pattern of early pragmatic grammar. Further gradual and incremental developments can be assumed at the phrasal level to create asymmetric relations between words. The relations are conceptually grounded, such that the action follows the entity causing the action. Along with the increasing population growth in the hominin lineage, social bonding and cooperation in all aspects of life were mutual, reciprocal processes (e.g., Scott-Phillips, 2007). At all linguistic levels, compositional structures became eminent. Concepts and their semantic relations mentally consolidated through argument structures, thematic roles, and phrase structures.

The timeline of when extended syntax, including Merge, emerged in the hominin lineage is a matter of controversy. We agree with the generative model that it emerged more recently and may coincide with the appearance of behavioral modernity. However, we assume that the extended syntax capacity was already in place in Homo erectus but not used because pragmatic grammar was sufficient for their socioecological needs. Restricting any form of language-readiness to archaic or modern humans seems anthropocentric considering the long history of hominin evolution. Our assumptions are based on the following.

According to conservative estimates, the species Homo erectus was around for about 1.8 my, but within its lineage, there are significant anatomical variations. Late Homo erectus’ brain volume increased to 1,000 cc and had human-like prefrontal and temporoparietal regions (Wynn, 1998). Fossil records show, moreover, a Broca’s cap morphology. Again, Homo erectus not only developed Acheulean tools (e.g., Shea, 2016) but also traveled long distances to the south and north of Africa and out of Africa to the Middle East and China. Moreover, they built water-transport crafts to reach the island of Java (Dubois, 1894). Since this species had the social and technological skills to build boats, these large-scale social group activities imply that individuals had the ability to plan for the future and to make predictions about new habitats. Most of all, it implies that they presumably developed a language-like system, such as pragmatic grammar, to share knowledge (Everett, 2017; Gil, 2008). Further support for this assumption is the discovery of the 700–230 ky old Berekhat Ram figurine, which appears to demonstrate symbolism. This figurine has been associated with Homo erectus (d’Errico & Nowell, 2000).

These technological and aesthetic skills point to a more sophisticated social culture quite distinguishable from any cultural activities seen before in the hominin lineage. At the same time, it is obvious that extended syntax, argument structures, and rich morphology were not needed in the context of the socioecological conditions in which Homo erectus was living. However, this species may have had the innate capacity to generate those extended linguistic structures on the basis of a pragmatic grammar. We assume, furthermore, that binary branching implied by Merge does not play a crucial role in modern languages and does not exclusively define syntax or language (Pinker & Jackendoff, 2005). Finally, since Homo erectus increasingly used manual skills, we believe that vocalizations successively replaced gestures while the latter kept their supplemental function. Human language consists of multiple components that evolved separately or in concert. It is, therefore, difficult to single out a particular hominin species equipped in a single step with these various basic and extended cognitive and language-related components.

5 Conclusion

We suggest different evolutionary milestones in the evolution of syntax. Nonhuman primates including monkeys and nonhuman apes primarily produce vocal signals to express states of emotional arousal, occasionally combining two signals. Although their brains share homologous structures with the brains of modern humans due to common evolutionary ancestry, neural mass and connectivity at the synaptic level and between cortical and subcortical regions are not specifically designed for elaborated symbolic mentalizing. In contrast, the human brain supports cortical control of conceptualizations, whereas rehearsal operations provide maintenance and updates of these representations and increase workspace capacities. We, furthermore, assume that ventral streams mainly support pragmatic grammar while extended hierarchical branching is associated with dorsal streams. Here, Broca’s area seems to work like a buffer in which information is unified and linearized for output. In contrast, semantic and syntactic structures are generated in posterior regions, including Wernicke’s area and PTJ (e.g., Boeckx et al., 2014; van der Lely & Pinker, 2014).

The evolution of language in functional terms implies several milestones. Although various scenarios are possible, the general picture we suggest is as follows: Early hominins may have mainly relied on iconic and holistic signals, including pantomimes, which resembled emotional arousal states and information perceived in the environment. In turn, segmentation took place at different levels. Concepts became discrete, and sound patterns symbolic. Two or three words were combined according to action-based event structures. This pragmatic grammar stage, also indicated by reverse linguistic analysis, presumably can be associated with critical genotype changes in Homo erectus that provided the foundation of extended symbolic computations.

The externalization of thoughts was an overwhelming benefit for our ancestors. Along with population growth and the increasing demand for social collaboration, semantic roles as found in non-verbal event structures of action grammar became internalized. The externalization of these semantic structures in the fashion of symbolic representations brought about pragmatic grammar. We do not believe that genotype differences between Homo erectus and Homo sapiens sensu lato (anatomically modern humans, (pre-)archaic Homo sapiens, Homo heidelbergensis, Neanderthals, and Denisovans) were critical for the development of extended syntactic and morphological structures. They can be considered as cultural accumulations.

The development of extended syntax may have started with the generation of treelets that are small templates of syntactic nodes typically underspecified in some respects of sentential tree structure. These treelets can be readily accessed and integrated into larger structures (J. D. Fodor, 1998; Sakas & J. D. Fodor, 2012). The path was set for basic and extended syntactic branching which includes hierarchical structures. They were also implemented at the phonological or morphological level. Binary branching of Merge and its iterative application is only one form of possible syntactic branching. Other strategies to organize phrases and sentences are equally important, including idiomatic collocations, metaphoric expressions, treelets, and n-ary branching. After all, the beauty of language is its diversity.


Funding

K. F. was supported by JSPS Grants-in-Aid for Scientific Research (C) (22K00552) and (B) (22H00600) and D. H. by JSPS Invitational Fellowship (L20512).


Acknowledgments

We are grateful to the reviewers for their constructive and insightful comments and to Bridget Samuels for editorial notes.

Competing Interests

The authors have declared that no competing interests exist.


References

  • Almairac, F., Herbet, G., Moritz-Gasser, S., de Champfleur, N. M., & Duffau, H. (2015). The left inferior fronto-occipital fasciculus subserves language semantics: A multilevel lesion study. Brain Structure & Function, 220, 1983-1995.

  • Amunts, K., Armstrong, E., Malikovic, A., Hömke, L., Mohlberg, H., Schleicher, A., & Zilles, K. (2007). Gender-specific left-right asymmetries in human visual cortex. The Journal of Neuroscience: The Official Journal of the Society for Neuroscience, 27(6), 1356-1364.

  • Amunts, K., Schlaug, G., Schleicher, A., Steinmetz, H., Dabringhaus, A., Roland, P. E., & Zilles, K. (1996). Asymmetry in the human motor cortex and handedness. NeuroImage, 4(3), 216-222.

  • Amunts, K., Schleicher, A., Ditterich, A., & Zilles, K. (2003). Broca’s region: Cytoarchitectonic asymmetry and developmental changes. The Journal of Comparative Neurology, 465(1), 72-89.

  • Amunts, K., Schleicher, A., Mohlberg, H., Uylings, H. B., & Zilles, K. (1999). Broca’s region revisited: Cytoarchitecture and intersubject variability. The Journal of Comparative Neurology, 412(2), 319-341.

  • Arbib, M. A. (2011). From mirror neurons to complex imitation in the evolution of language and tool use. Annual Review of Anthropology, 40, 257-273.

  • Arbib, M. A. (2012). How the brain got language: The mirror system hypothesis. Oxford University Press.

  • Armstrong, E., Zilles, K., Curtis, M., & Schleicher, A. (1991). Cortical folding, the lunate sulcus and the evolution of the human brain. Journal of Human Evolution, 20(4), 341-348.

  • Arnold, K., & Zuberbühler, K. (2008). Meaningful call combinations in a nonhuman primate. Current Biology, 18(5), R202-R203.

  • Baars, B. J., Franklin, S., & Ramsøy, T. (2013). Global workspace dynamics: Cortical “binding and propagation” enables conscious contents. Frontiers in Psychology, 4(200), Article 200.

  • Baddeley, A. D., & Hitch, G. (1974). Working memory. In G. H. Bower (Ed.), The psychology of learning and motivation (Vol. 8, pp. 47–89). Academic Press.

  • Bajada, C. J., Lambon Ralph, M. A., & Cloutman, L. L. (2015). Transport for language south of the Sylvian fissure: The routes and history of the main tracts and stations in the ventral language network. Cortex, 69, 141-151.

  • Baus, C., Carreiras, M., & Emmorey, K. (2013). When does iconicity in sign language matter? Language and Cognitive Processes, 28(3), 261-271.

  • Berk, S., & Lillo-Martin, D. (2012). The two-word stage: Motivated by linguistic or cognitive constraints? Cognitive Psychology, 65(1), 118-140.

  • Bernal, B., & Ardila, A. (2009). The role of the arcuate fasciculus in conduction aphasia. Brain, 132(9), 2309-2316.

  • Berwick, R. C., & Chomsky, N. (2016). Why only us: Language and evolution. MIT Press.

  • Berwick, R. C., & Chomsky, N. (2019). All or nothing: No half-Merge and the evolution of syntax. PLoS Biology, 17(11), Article e3000539.

  • Bickerton, D. (1981). Roots of language. Karoma Publishers.

  • Bickerton, D. (1990). Language and species. University of Chicago Press.

  • Blake, J., Austin, W., Cannon, M., Lisus, A., & Vaughan, A. (1994). The relationship between memory span and measures of imitative and spontaneous language complexity in preschool children. International Journal of Behavioral Development, 17(1), 91-107.

  • Boeckx, C., Martinez-Alvarez, A., & Leivada, E. (2014). The functional neuroanatomy of serial order in language. Journal of Neurolinguistics, 32, 1-15.

  • Bookheimer, S. (2002). Functional MRI of language: New approaches to understanding the cortical organization of semantic processing. Annual Review of Neuroscience, 25, 151-188.

  • Botha, R. P. (2003). Unravelling the evolution of language. Elsevier.

  • Bottini, G., Corcoran, R., Sterzi, R., Paulesu, E., Schenone, P., Scarpa, P., Frackowiak, R. S. J., & Frith, D. (1994). The role of the right hemisphere in the interpretation of figurative aspects of language: A positron emission tomography activation study. Brain, 117(6), 1241-1253.

  • Bräuer, G. (2008). The origin of modern anatomy: By speciation or intraspecific evolution? Evolutionary Anthropology, 17, 22-37.

  • Brown, R. (1973). Development of the first language in the human species. The American Psychologist, 28(2), 97-106.

  • Bruner, E., Battaglia-Mayer, A., & Caminiti, R. (2023). The parietal lobe evolution and the emergence of material culture in the human genus. Brain Structure & Function, 228(1), 145-167.

  • Buxhoeveden, D. P., Switala, A. E., Roy, E., Litaker, M., & Casanova, M. F. (2001). Morphological differences between minicolumns in human and nonhuman primate cortex. American Journal of Physical Anthropology, 115(4), 361-371.

  • Campbell, K. L., & Tyler, L. K. (2018). Language-related domain-specific and domain-general systems in the human brain. Current Opinion in Behavioral Sciences, 21, 132-137.

  • Caplan, D., Baker, C., & Dehaut, F. (1985). Syntactic determinants of sentence comprehension in aphasia. Cognition, 21(2), 117-175.

  • Caramazza, A., & Zurif, E. B. (1976). Dissociation of algorithmic and heuristic processes in language comprehension: Evidence from aphasia. Brain and Language, 3(4), 572-582.

  • Carstairs-McCarthy, A. (1999). The origins of complex language. Oxford University Press.

  • Caselli, N. K., Ergin, R., Jackendoff, R., & Cohen-Goldberg, A. M. (2014). The emergence of phonological structure in Central Taurus Sign Language [Paper presentation]. Conference proceedings: From Sound to Gesture, Padua, Italy.

  • Catani, M., & Mesulam, M. (2008). The arcuate fasciculus and the disconnection theme in language and aphasia: History and current state. Cortex, 44(8), 953-961.

  • Catani, M., & Thiebaut de Schotten, M. (2008). A diffusion tensor imaging tractography atlas for virtual in vivo dissections. Cortex, 44(8), 1105-1132.

  • Cheney, D. L., & Seyfarth, R. M. (1990). How monkeys see the world: Inside the mind of another species. University of Chicago Press.

  • Cheng, Q., & Mayberry, R. I. (2019). Acquiring a first language in adolescence: The case of basic word order in American Sign Language. Journal of Child Language, 46(2), 214-240.

  • Cheng, Q., & Mayberry, R. I. (2021). When event knowledge overrides word order in sentence comprehension: Learning a first language after childhood. Developmental Science, 24(5), Article e13073.

  • Cheng, Q., Roth, A., Halgren, E., Klein, D., Chen, J.-K., & Mayberry, R. I. (2023). Restricted language access during childhood affects adult brain structure in selective language regions. Proceedings of the National Academy of Sciences of the United States of America, 120(7), Article e2215423120.

  • Cheng, Q., Roth, A., Halgren, E., & Mayberry, R. I. (2019). Effects of early language deprivation on brain connectivity: Language pathways in deaf native and late first-language learners of American Sign Language. Frontiers in Human Neuroscience, 13(320), Article 320.

  • Chomsky, N. (1986). Knowledge of language: Its nature, origin and use. Praeger.

  • Chomsky, N. (1995). The minimalist program. MIT Press.

  • Chomsky, N. (2017). The language capacity: Architecture and evolution. Psychonomic Bulletin & Review, 24(1), 200-203.

  • Christiansen, M. H., Chater, N., & Reali, F. (2009). The biological and cultural foundations of language. Communicative & Integrative Biology, 2(3), 221-222.

  • Christiansen, M. H., & Kirby, S. (2003). Language evolution: Consensus and controversies. Trends in Cognitive Sciences, 7(7), 300-307.

  • Coopmans, C. W., Kaushik, K., & Martin, A. E. (2023). Hierarchical structure in language and action: A formal comparison. Psychological Review, 130(4), 935-952.

  • Corballis, M. C. (2003). From mouth to hand: Gesture, speech, and the evolution of right-handedness. Behavioral and Brain Sciences, 26(2), 199-208.

  • Coudé, G., Ferrari, P. F., Rodà, F., Maranesi, M., Borelli, E., Veroni, V., Monti, F., Rozzi, S., & Fogassi, L. (2011). Neurons controlling voluntary vocalization in the macaque ventral premotor cortex. PLoS One, 6(11), Article e26822.

  • Cowan, N. (2001). The magical number 4 in short-term memory: A reconsideration of mental storage capacity. Behavioral and Brain Sciences, 24(1), 87-185.

  • Crockford, C., & Boesch, C. (2005). Call combinations in wild chimpanzees. Behaviour, 142(4), 397-421.

  • Culicover, P. W., & Jackendoff, R. (2005). Simpler syntax. Oxford University Press.

  • Curtiss, S. (1977). Genie: A psycholinguistic study of a modern-day "wild child". Academic Press.

  • Ćwiek, A., Fuchs, S., Draxler, C., Asu, E. L., Dediu, D., Hiovain, K., Kawahara, S., Koutalidis, S., Krifka, M., Lippus, P., Lupyan, G., Oh, G. E., Paul, J., Petrone, C., Ridouane, R., Reiter, S., Schümchen, N., Szalontai, Á., Ünal-Logacev, Ö., ...Winter, B. (2022). The bouba/kiki effect is robust across cultures and writing systems. Philosophical Transactions of the Royal Society B: Biological Sciences, 377(1841), Article 20200390.

  • d’Errico, F., & Nowell, A. (2000). A new look at the Berekhat Ram figurine: Implications for the origins of symbolism. Cambridge Archaeological Journal, 10(1), 123-167.

  • Dabrowska, E., Rowland, C. F., & Theakston, A. (2009). The acquisition of questions with long-distance dependencies. Cognitive Linguistics, 20(3), 571-597.

  • Dart, R. (1925). Australopithecus africanus: The man-ape of South Africa. Nature, 115, 195-199.

  • de Boer, B., Thompson, B., Ravignani, A., & Boeckx, C. (2020). Evolutionary dynamics do not motivate a single-mutant theory of human language. Scientific Reports, 10, Article 451.

  • de Schotten, M. T., Dell’Acqua, F., Valabregue, R., & Catani, M. (2012). Monkey to human comparative anatomy of the frontal lobe association tracts. Cortex, 48(1), 82-96.

  • Deacon, T. W. (1989). The neural circuitry underlying primate calls and human language. Human Evolution, 4, 367-401.

  • Dehaene, S., & Changeux, J.-P. (2011). Experimental and theoretical approaches to conscious processing. Neuron, 70(2), 200-227.

  • DeWitt, I., & Rauschecker, J. P. (2012). Phoneme and word recognition in the auditory ventral stream. Proceedings of the National Academy of Sciences USA, 109(8), 505-514.

  • Dor, D. (2015). The instruction of imagination: Language as a social communication technology. Oxford University Press.

  • Dubois, E. (1894). Pithecanthropus Erectus: Eine menschenaehnliche Uebergangsform aus Java. G.E. Stechert.

  • Dunbar, R. I. (1996). Grooming, gossip and the evolution of language. Faber and Faber.

  • Dunbar, R. I. (2005). The human story: A new history of mankind’s evolution. Faber and Faber.

  • Dunbar, R. I. (2009). The social brain hypothesis and its implications for social evolution. Annals of Human Biology, 36(5), 562-572.

  • Dunbar, R. I. M., Arnaboldi, V., Conti, M., & Passarella, A. (2015). The structure of online social networks mirrors those in the offline world. Social Networks, 43, 39-47.

  • Emmorey, K. (2014). Iconicity as structure mapping. Philosophical Transactions of the Royal Society of London. Series B, Biological Sciences, 369(1651), Article 20130301.

  • Everett, D. (2017). How language began: The story of humanity's greatest invention. W. W. Norton.

  • Fadiga, L., Craighero, L., & D’Ausilio, A. (2009). Broca's area in language, action, and music. Annals of the New York Academy of Sciences, 1169, 448-458.

  • Fazio, P., Cantagallo, A., Craighero, L., D’Ausilio, A., Roy, A. C., Pozzo, T., Calzolari, F., Granieri, E., & Fadiga, L. (2009). Encoding of human action in Broca's area. Brain, 132(Pt 7), 1980-1988.

  • Fedorenko, E., Behr, M. K., & Kanwisher, N. (2011). Functional specificity for high-level linguistic processing in the human brain. Proceedings of the National Academy of Sciences of the United States of America, 108(39), 16428-16433.

  • Fedorenko, E., Duncan, J., & Kanwisher, N. (2012). Language-selective and domain-general regions lie side by side within Broca’s area. Current Biology, 22(21), 2059-2062.

  • Feldman, H., Goldin-Meadow, S., & Gleitman, L. (1978). Beyond Herodotus: The creation of language by linguistically deprived deaf children. In A. Lock (Ed.), Action, symbol, and gesture: The emergence of language (pp. 351–414). Academic Press.

  • Ferrari, P. F., Gerbella, M., Coudé, G., & Rozzi, S. (2017). Two different mirror neuron networks: The sensorimotor (hand) and limbic (face) pathways. Neuroscience, 358, 300-315.

  • Fitch, W. T. (2000). The evolution of speech: A comparative review. Trends in Cognitive Sciences, 4(7), 258-267.

  • Fitch, W. T. (2014). Toward a computational framework for cognitive biology: Unifying approaches from cognitive neuroscience and comparative cognition. Physics of Life Reviews, 11(3), 329-364.

  • Fodor, J. D. (1998). Unambiguous triggers. Linguistic Inquiry, 29(1), 1-36.

  • Foley, R. (2001). In the shadow of the modern synthesis? Alternative perspectives on the last fifty years of paleoanthropology. Evolutionary Anthropology, 10(1), 5-14.

  • Frege, G. (1892). Über Sinn und Bedeutung. Zeitschrift für Philosophie und philosophische Kritik, 100(1), 25-50.

  • Friederici, A. D., & Gierhan, S. M. (2013). The language network. Current Opinion in Neurobiology, 23(2), 250-254.

  • Fujita, H., & Fujita, K. (2022). Human language evolution: A view from theoretical linguistics on how syntax and the lexicon first came into being. Primates, 63(5), 403-415.

  • Fujita, K. (2009). A prospect for evolutionary adequacy: Merge and the evolution and development of human language. Biolinguistics, 3(2-3), 128-153.

  • Gärdenfors, P. (2017). Demonstration and pantomime in the evolution of teaching. Frontiers in Psychology, 8, Article 415.

  • Gärdenfors, P. (2021). Demonstration and pantomime in the evolution of teaching and communication. Language & Communication, 80, 71-79.

  • Gifford, G. W., III, MacLean, K. A., Hauser, M. D., & Cohen, Y. E. (2005). The neurophysiology of functionally meaningful categories: Macaque ventrolateral prefrontal cortex plays a critical role in spontaneous categorization of species-specific vocalizations. Journal of Cognitive Neuroscience, 17(9), 1471-1482.

  • Gil, D. (1994). The structure of Riau Indonesian. Nordic Journal of Linguistics, 17(2), 179-200.

  • Gil, D. (2005). Word order without syntactic categories: How Riau Indonesian does it. In A. Carnie, H. Harley, & S. A. Dooley (Eds.), Verb first: On the syntax of verb-initial languages (pp. 243–263). John Benjamins.

  • Gil, D. (2008). How much grammar does it take to sail a boat? (Or, what can material artefacts tell us about the evolution of language?). In A. D. M. Smith, K. Smith, & R. Ferrer i Cancho (Eds.), The evolution of language (pp. 123–130).

  • Gil, D. (2013). Riau Indonesian: A language without nouns and verbs. In J. Rijkhoff & E. van Lier (Eds.), Flexible word classes: Typological studies of underspecified parts of speech. Oxford University Press.

  • Gil, D. (2014). Sign languages, creoles, and the development of predication. In F. Newmeyer, & L. Preston (Eds.), Measuring grammatical complexity (pp. 37–64). Oxford University Press.

  • Gil, D., & Shen, Y. (2019). How grammar introduces asymmetry into cognitive structures: Compositional semantics, metaphors and schematological hybrids. Frontiers in Psychology, 10, Article 2275.

  • Girard-Buttoz, C., Zaccarella, E., Bortolato, T., Friederici, A. D., Wittig, R. M., & Crockford, C. (2022). Chimpanzees produce diverse vocal sequences with ordered and recombinatorial properties. Communications Biology, 5, Article 410.

  • Glasser, M. F., & Rilling, J. K. (2008). DTI tractography of the human brain’s language pathways. Cerebral Cortex, 18(11), 2471-2482.

  • Glenwright, M., & Pexman, P. M. (2010). Development of children's ability to distinguish sarcasm and verbal irony. Journal of Child Language, 37(2), 429-451.

  • Goldberg, A. (2005). Constructions at work: The nature of generalization in language. Oxford University Press.

  • Goldin-Meadow, S. (2003). The resilience of language: What gesture creation in deaf children can tell us about language-learning in general. Psychology Press.

  • Goldin-Meadow, S., & Yang, C. (2017). Statistical evidence that a child can create a combinatorial linguistic system without external linguistic input: Implications for language evolution. Neuroscience and Biobehavioral Reviews, 81, 150-157.

  • Golestani, N., Molko, N., Dehaene, S., LeBihan, D., & Pallier, C. (2007). Brain structure predicts the learning of foreign speech sounds. Cerebral Cortex, 17(3), 575-582.

  • Goodall, J. (1986). The chimpanzees of Gombe: Patterns of behavior. Harvard University Press.

  • Gowlett, J. A. J. (2006). The elements of design form in Acheulian bifaces: Modes, modalities, rules and language. In N. Goren-Inbar & G. Sharon (Eds.), Axe age: Acheulian tool-making from quarry to discard (pp. 203–221). Equinox.

  • Hauser, M. D., Chomsky, N., & Fitch, W. T. (2002). The faculty of language: What is it, who has it, and how did it evolve? Science, 298(5598), 1569-1579.

  • Hickok, G., & Poeppel, D. (2007). The cortical organization of speech processing. Nature Reviews. Neuroscience, 8, 393-402.

  • Hillert, D. (2014). The nature of language: Evolution, paradigms and circuits. Springer.

  • Hillert, D. (2021). How did language evolve in the lineage of higher primates? Lingua, 264, Article 103158.

  • Hillert, D. G. (2015). On the evolving biology of language. Frontiers in Psychology, 6, Article 1796.

  • Hillert, D. G. (2023). On how “early syntax” came about. Frontiers in Language Sciences, 2, Article 1251498.

  • Hillert, D. G., & Buračas, G. T. (2009). The neural substrates of spoken idiom comprehension. Language and Cognitive Processes, 24(9), 1370-1391.

  • Hockett, C. F. (1959). Animal “languages” and human language. Human Biology, 31(1), 32-39.

  • Hodgson, V. J., Ralph, M. A. L., & Jackson, R. L. (2021). Multiple dimensions underlying the functional organisation of the language network. NeuroImage, 241, Article 118444.

  • Holloway, R. L. (1978). Evolution of brain and behavior. Evolution, 32(1), 223-224.

  • Holloway, R. L. (2008). The human brain evolving: A personal retrospective. Annual Review of Anthropology, 37, 1-37.

  • Holloway, R. L. (2012). Language and tool making are similar cognitive processes. Behavioral and Brain Sciences, 35(4), 226.

  • Holloway, R. L., Broadfield, D., Yuan, M., Schwartz, J. H., & Tattersall, I. (2004). The human fossil record, brain endocasts: The paleoneurological evidence (Vol. 3). Wiley.

  • Hopkins, W. D., Lyn, H., & Cantalupo, C. (2009). Volumetric and lateralized differences in selected brain regions of chimpanzees (Pan troglodytes) and bonobos (Pan paniscus). American Journal of Primatology, 71(12), 988-997.

  • Huttenlocher, P. R., & Dabholkar, A. S. (1997). Regional differences in synaptogenesis in human cerebral cortex. The Journal of Comparative Neurology, 387(2), 167-178.

  • Jackendoff, R. (1997). The architecture of the language faculty. MIT Press.

  • Jackendoff, R. (1999). Possible stages in the evolution of language. Trends in Cognitive Sciences, 3(7), 272-279.

  • Jackendoff, R., & Wittenberg, E. (2014). What you can say without syntax: A hierarchy of grammatical complexity. In F. J. Newmeyer & L. B. Preston (Eds.), Measuring grammatical complexity (pp. 65–82). Oxford University Press.

  • Jackendoff, R., & Wittenberg, E. (2017). Linear grammar as a possible stepping-stone in the evolution of language. Psychonomic Bulletin & Review, 24(1), 219-224.

  • Jouravlev, O., Zheng, D., Balewski, Z., Le Arnz Pongos, A., Levan, Z., Goldin-Meadow, S., & Fedorenko, E. (2019). Speech-accompanying gestures are not processed by the language-processing mechanisms. Neuropsychologia, 132, Article 107132.

  • Kaan, E., & Swaab, T. Y. (2002). The brain circuitry of syntactic comprehension. Trends in Cognitive Sciences, 6(8), 350-356.

  • Klein, W., & Perdue, C. (1997). The basic variety, or: Couldn’t language be much simpler? Second Language Research, 13(4), 301-347.

  • Koechlin, E., & Jubault, T. (2006). Broca’s area and the hierarchical organization of human behavior. Neuron, 50(6), 963-974.

  • Liu, X., Somel, M., Tang, L., Yan, Z., Jiang, X., Guo, S., Yuan, Y., He, L., Oleksiak, A., Zhang, Y., Li, N., Hu, Y., Chen, W., Qiu, Z., Pääbo, S., & Khaitovich, P. (2012). Extension of cortical synaptic development distinguishes humans from chimpanzees and macaques. Genome Research, 22(4), 611-622.

  • Lombao, D., Guardiola, M., & Mosquera, M. (2017). Teaching to make stone tools: New experimental evidence supporting a technological hypothesis for the origins of language. Scientific Reports, 7, Article 14394.

  • MacNeilage, P. F. (2008). The origin of speech. Oxford University Press.

  • Matchin, W., & Hickok, G. (2020). The cortical organization of syntax. Cerebral Cortex, 30(3), 1481-1498.

  • Molenberghs, P., Cunnington, R., & Mattingley, J. B. (2009). Is the mirror neuron system involved in imitation? A short review and meta-analysis. Neuroscience and Biobehavioral Reviews, 33(7), 975-980.

  • Morgan, T. J. H., Uomini, N., Rendell, L., Chouinard-Thuly, L., Street, S. E., Lewis, H. M., Cross, C. P., Evans, C., Kearney, R., de la Torre, I., Whiten, A., & Laland, K. N. (2015). Experimental evidence for the co-evolution of hominin toolmaking teaching and language. Nature Communications, 6, Article 6029.

  • Navlakha, S., Barth, A. L., & Bar-Joseph, Z. (2015). Decreasing-rate pruning optimizes the construction of efficient and robust distributed networks. PLoS Computational Biology, 11, Article e1004347.

  • Nishitani, N., Schürmann, M., Amunts, K., & Hari, R. (2005). Broca’s region: From action to language. Physiology, 20(1), 60-69.

  • Novick, J. M., Trueswell, J. C., & Thompson-Schill, S. L. (2010). Broca’s area and language processing: Evidence for the cognitive control connection. Language and Linguistics Compass, 4(10), 906-924.

  • Palomero-Gallagher, N., & Zilles, K. (2019). Cortical layers: Cyto-, myelo-, receptor- and synaptic architecture in human cortical areas. NeuroImage, 197, 716-741.

  • Papitto, G., Friederici, A. D., & Zaccarella, E. (2020). The topographical organization of motor processing: An ALE meta-analysis on six action domains and the relevance of Broca’s region. NeuroImage, 206, Article 116321.

  • Pillay, S. B., Binder, J. R., Humphries, C. J., Gross, W. L., & Book, D. S. (2017). Lesion localization of speech comprehension deficits in chronic aphasia. Neurology, 88(10), 970-975.

  • Pinker, S., & Bloom, P. (1990). Natural language and natural selection. Behavioral and Brain Sciences, 13(4), 707-727.

  • Pinker, S., & Jackendoff, R. (2005). The faculty of language: What’s special about it? Cognition, 95(2), 201-236.

  • Planer, R. J., & Sterelny, K. (2021). From signal to symbol: The evolution of language. MIT Press.

  • Ponce de León, M. S., Bienvenu, T., Marom, A., Engel, S., Tafforeau, P., Alatorre Warren, J. L., Lordkipanidze, D., Kurniawan, I., Murti, D. B., Suriyanto, R. A., Koesbardiati, T., & Zollikofer, C. P. E. (2021). The primitive brain of early Homo. Science, 372(6538), 165-171.

  • Progovac, L. (2010). Syntax: Its evolution and its representation in the brain. Biolinguistics, 4(2-3), 234-254.

  • Progovac, L. (2015). Evolutionary syntax. Oxford University Press.

  • Progovac, L., & Locke, J. (2009). The urge to merge: Insult and the evolution of syntax. Biolinguistics, 3(2-3), 337-354.

  • Ralph, M. A. L., Jefferies, E., Patterson, K., & Rogers, T. T. (2017). The neural and computational bases of semantic cognition. Nature Reviews. Neuroscience, 18(1), 42-55.

  • Rilling, J. K. (2014). Comparative primate neuroimaging: Insights into human brain evolution. Trends in Cognitive Sciences, 18(1), 46-55.

  • Roberts, S. G. B., Dunbar, R. I. M., & Roberts, A. I. (2022). Communicative roots of complex sociality and cognition: Neuropsychological mechanisms underpinning the processing of social information. Philosophical Transactions of the Royal Society of London. Series B, Biological Sciences, 377(1860), Article 20210295.

  • Ruck, L. (2014). Manual praxis in stone tool manufacture: Implications for language evolution. Brain and Language, 139, 68-83.

  • Sakas, W. G., & Fodor, J. D. (2012). Disambiguating syntactic triggers. Language Acquisition, 19(2), 83-143.

  • Sandler, W., Meir, I., Padden, C., & Aronoff, M. (2005). The emergence of grammar in a new sign language. Proceedings of the National Academy of Sciences of the United States of America, 102(7), 2661-2665.

  • Saur, D., Kreher, B. W., Schnell, S., Kümmerer, D., Kellmeyer, P., Vry, M. S., Umarova, R., Musso, M., Glauche, V., Abel, S., Huber, W., Rijntjes, M., Hennig, J., & Weiller, C. (2008). Ventral and dorsal pathways for language. Proceedings of the National Academy of Sciences of the United States of America, 105(46), 18035-18040.

  • Savage-Rumbaugh, S., McDonald, K., Sevcik, R. A., Hopkins, W. D., & Rubert, E. (1986). Spontaneous symbol acquisition and communicative use by pygmy chimpanzees (Pan paniscus). Journal of Experimental Psychology. General, 115(3), 211-235.

  • Schenker, N. M., Buxhoeveden, D., Blackmon, W. L., Amunts, K., Zilles, K., & Semendeferi, K. (2008). A comparative quantitative analysis of cytoarchitecture and minicolumnar organization in Broca’s area in humans and great apes. The Journal of Comparative Neurology, 510(1), 117-128.

  • Schenker, N. M., Hopkins, W. D., Spocter, M. A., Garrison, A. R., Stimpson, C. D., Erwin, J. M., Hof, P. R., & Sherwood, C. C. (2010). Broca’s area homologue in chimpanzees (Pan troglodytes): Probabilistic mapping, asymmetry, and comparison to humans. Cerebral Cortex, 20(3), 730-742.

  • Schoenemann, P. T., Sheehan, M. J., & Glotzer, L. D. (2005). Prefrontal white matter volume is disproportionately larger in humans than in other primates. Nature Neuroscience, 8(2), 242-252.

  • Scott-Phillips, T. C. (2007). The social evolution of language, and the language of social evolution. Evolutionary Psychology, 5(4), 740-753.

  • Sebba, M. (1997). Contact languages: Pidgins and creoles. Macmillan Press.

  • Seldon, H. L. (1981a). Structure of human auditory cortex. I. Cytoarchitectonics and dendritic distributions. Brain Research, 229(2), 277-294.

  • Seldon, H. L. (1981b). Structure of human auditory cortex. II. Axon distributions and morphological correlates of speech perception. Brain Research, 229(2), 295-310.

  • Semendeferi, K., Armstrong, E., Schleicher, A., Zilles, K., & Van Hoesen, G. W. (2001). Prefrontal cortex in humans and apes: A comparative study of area 10. American Journal of Physical Anthropology, 114(3), 224-241.

  • Senghas, A., Kita, S., & Özyürek, A. (2004). Children creating core properties of language: Evidence from an emerging sign language in Nicaragua. Science, 305(5691), 1779-1782.

  • Senghas, R. J., Senghas, A., & Pyers, J. E. (2005). The emergence of Nicaraguan Sign Language: Questions of development, acquisition, and evolution. In S. T. Parker, J. Langer, & C. Milbrath (Eds.), Biology and knowledge revisited: From neurogenesis to psychogenesis (pp. 287–306). Psychology Press.

  • Seyfarth, R. M., & Cheney, D. L. (2003). Signalers and receivers in animal communication. Annual Review of Psychology, 54, 145-173.

  • Seyfarth, R. M., Cheney, D. L., & Marler, P. (1980). Monkey responses to three different alarm calls: Evidence of predator classification and semantic communication. Science, 210(4471), 801-803.

  • Shea, J. J. (2016). Stone tools in human evolution: Behavioral differences among technological primates. Cambridge University Press.

  • Skeide, M. A., & Friederici, A. D. (2016). The ontogeny of the cortical language network. Nature Reviews. Neuroscience, 17(5), 323-332.

  • Skeide, M. A., Brauer, J., & Friederici, A. D. (2016). Brain functional and structural predictors of language performance. Cerebral Cortex, 26(5), 2127-2139.

  • Skinner, M. M., Stephens, N. B., Tsegai, Z. J., Foote, A. C., Nguyen, N. H., Gross, T., Pahr, D. H., Hublin, J. J., & Kivell, T. L. (2015). Human evolution: Human-like hand use in Australopithecus africanus. Science, 347(6220), 395-399.

  • Sousa, A. M. M., Meyer, K. A., Santpere, G., Gulden, F. O., & Sestan, N. (2017). Evolution of the human nervous system function, structure, and development. Cell, 170(2), 226-247.

  • Spocter, M. A., Hopkins, W. D., Garrison, A. R., Bauernfeind, A. L., Stimpson, C. D., Hof, P. R., & Sherwood, C. C. (2010). Wernicke’s area homologue in chimpanzees (Pan troglodytes) and its relation to the appearance of modern human language. Proceedings of the Royal Society B, 277(1691), 2165-2174.

  • Stout, D. (2011). Stone toolmaking and the evolution of human culture and cognition. Philosophical Transactions of the Royal Society of London. Series B, Biological Sciences, 366(1567), 1050-1059.

  • Stout, D., & Chaminade, T. (2012). Stone tools, language and the brain in human evolution. Philosophical Transactions of the Royal Society of London. Series B, Biological Sciences, 367(1585), 75-87.

  • Stout, D., Chaminade, T., Apel, J., Shafti, A., & Faisal, A. A. (2021). The measurement, evolution, and neural representation of action grammars of human behavior. Scientific Reports, 11(1), Article 13720.

  • Stout, D., Toth, N., Schick, K., & Chaminade, T. (2008). Neural correlates of early stone age toolmaking: Technology, language and cognition in human evolution. Philosophical Transactions of the Royal Society of London. Series B, Biological Sciences, 363(1499), 1939-1949.

  • Suzuki, T. N., & Matsumoto, Y. K. (2022). Experimental evidence for core-Merge in the vocal communication system of a wild passerine. Nature Communications, 13, Article 5605.

  • Számadó, S., & Szathmáry, E. (2006). Selective scenarios for the emergence of natural language. Trends in Ecology & Evolution, 21(10), 555-561.

  • Tallerman, M. (2014). No syntax saltation in language evolution. Language Sciences, 46(B), 207-219.

  • Tomasello, M. (2003). Constructing a language: A usage-based theory of language acquisition. Harvard University Press.

  • Tzourio-Mazoyer, N., & Mazoyer, B. (2017). Variations of planum temporale asymmetries with Heschl’s Gyri duplications and association with cognitive abilities: MRI investigation of 428 healthy volunteers. Brain Structure & Function, 222(6), 2711-2726.

  • Uomini, N. T., & Meyer, G. F. (2013). Shared brain lateralization patterns in language and Acheulean stone tool production: A functional transcranial Doppler ultrasound study. PLoS One, 8(8), Article e72693.

  • van der Lely, H. K. J., & Pinker, S. (2014). The biological basis of language: Insight from developmental grammatical impairments. Trends in Cognitive Sciences, 18(11), 586-595.

  • Van Herwegen, J., Dimitriou, D., & Rundblad, G. (2013). Development of novel metaphor and metonymy comprehension in typically developing children and Williams syndrome. Research in Developmental Disabilities, 34(4), 1300-1311.

  • Wakita, M. (2014). Broca’s area processes the hierarchical organization of observed action. Frontiers in Human Neuroscience, 7, Article 937.

  • Willer Gold, J., Arsenijević, B., Batinić, M., Becker, M., Čordalija, N., Kresić, M., Leko, N., Marušič, F. L., Milićev, T., Milićević, N., Mitić, I., Peti-Stantić, A., Stanković, B., Šuligoj, T., Tušek, J., & Nevins, A. (2018). When linearity prevails over hierarchy in syntax. Proceedings of the National Academy of Sciences of the United States of America, 115(3), 495-500.

  • Wilson, B., & Petkov, C. (2011). Communication and the primate brain: Insights from neuroimaging studies in humans, chimpanzees and macaques. Human Biology, 83(2), 175-189.

  • Wray, A. (1998). Protolanguage as a holistic system for social interaction. Language & Communication, 18(1), 47-67.

  • Wynn, T. (1998). Did Homo erectus speak? Cambridge Archaeological Journal, 8(1), 78-81.

  • Zuberbühler, K. (2019). Syntax and compositionality in animal communication. Philosophical Transactions of the Royal Society of London. Series B, Biological Sciences, 375(1789), Article 20190062.

  • Zuberbühler, K., & Bickel, B. (2022). Transition to language: From agent perception to event representation. Wiley Interdisciplinary Reviews: Cognitive Science, 13(6), Article e1594.

  • Żywiczyński, P., Wacewicz, S., & Sibierska, M. (2018). Defining pantomime for language evolution research. Topoi, 37, 307-318.