Some Problems for Biolinguistics

Biolinguistics will have to face and resolve several problems before it can achieve a pivotal position in the human sciences. Its relationship to the Minimalist Program is ambiguous, creating doubts as to whether it is a genuine subdiscipline or merely another name for a particular linguistic theory. Equally ambiguous is the relationship it assumes between ‘knowledge of language’ and the neural mechanisms that actually construct sentences. The latter issue raises serious questions about the validity of covert syntactic operations. Further problems arise from the attitudes of many biolinguists towards natural selection and evo-devo: The first they misunderstand, the second they both misunderstand and overestimate. One consequence is a one-sided approach to language evolution crucially involving linguistic ‘precursors’ and the protolanguage hypothesis. Most of these problems arise through the identification of biolinguistics with internalist and essentialist approaches to language, thereby simultaneously narrowing its scope and hindering its acceptance by biologists.


Introduction
In this article I try to deal with some issues that seem to me to be crucial if biolinguistics is to achieve the centrality in the human sciences to which its subject-matter surely entitles it. One or two of these may be issues that involve image and perception, but most are much more substantive, involving brain-grammar relations, understanding of old and new aspects of evolutionary biology, the process of language evolution, and fundamental issues in the philosophy of biology. Some issues result from still unclarified aspects of the relationship between biolinguistics and generative grammar, but all of them, to a greater or lesser extent, prejudice the unification of biolinguistics with other biological fields.
Fears widespread among both linguists and non-linguists that 'biolinguistics' may turn out to be merely a more scientific-sounding term for generative minimalism are reinforced by the way the distinction is made between 'strong' and 'weak' senses of biolinguistics by Boeckx & Grohmann (2007: 2). They define "the strong sense" of the term as "provid[ing] explicit answers to questions that necessarily require the combination of linguistic insights and insights from related disciplines (evolutionary biology, genetics, neurology, psychology". They define the "weak sense" as "refer[ing] to 'business as usual' for linguists, so to speak, to the extent they are seriously engaged in discovering the properties of grammar, in effect carrying out the research program Chomsky initiated in Syntactic Structures". Emphasizing that by their use of the words 'strong' and 'weak' they are not proposing some two-tier system of 'superior' and 'inferior' biolinguists, they point out that work "focusing narrowly on properties of the grammar… has very often proven to be the basis for more interdisciplinary studies" (loc. cit.).
It is difficult to avoid the conclusion that adhering to the latest version of generative grammar is indeed a prerequisite, not perhaps for simply attempting to engage in biolinguistics, but certainly for being taken seriously by serious biolinguists. Granted, the authors try to forestall this conclusion by claiming that "minimalism is an approach to language that is largely independent of theoretical persuasion" (Boeckx & Grohmann 2007: 3). But I suspect that even a long-time Yakuza member could number the minimalist works by non-generativists on the fingers of one hand. In fact, too many biolinguists have taken over without questioning a number of assumptions made within generative grammar at one time or another, many of which pre-date the Minimalist Program (MP), none of which have any necessary connection with it, and some of which are orthogonal, even prejudicial, to the achievement of MP goals.

'Knowledge of Language'
Let's begin by examining some versions of the famous 'Five Chomsky Questions' (Chomsky 1986, 1988, Jenkins 2000, Boeckx & Grohmann 2007, Di Sciullo et al. 2010). As indicated by the dates of these citations, the questions precede the efflorescence of biolinguistics but are now routinely repeated in one form or another by authors of programmatic statements about the field. It is interesting (and very relevant) to compare the wording of Question 4 in three versions of the questions. That of Boeckx & Grohmann (2007) adheres most closely to Chomsky's 'knowledge of language' formula:
(1) How is that knowledge [of language; DB] implemented in the brain?
'Knowledge' in this context has long provoked the ire of empiricist philosophers, but my objection is quite different: use of the term gives a highly misleading picture of the nature of syntax. Although syntax is often regarded as part of cognition, its operations are automatic and out of reach of conscious awareness. We are no more aware of how our brains construct sentences than we are of how our stomachs digest food or our hearts circulate blood. No-one who proposed to study our 'knowledge of digestion' or 'knowledge of circulation' could hope to be taken seriously. Granted, one says informally things like "Does he know Russian?", whereas nobody ever said "Does he know digestion?", but there are many languages, and only one digestion. The problem here arises, I think, simply from the ambiguity of the term 'language', as opposed to the French distinction between langage (the faculty) and langue (an individual language). I know English and I have (hopefully reliable) intuitions about what are, or are not, grammatical sentences in (some variety of) English, as do all other speakers of that language. But if I did not have years of professional training and experience I would be as unable to explain the basis for those intuitions as is any naïve speaker, and I have no intuitions whatsoever about what is grammatical in Russian. It is surely significant that the anonymous reviewer who queried my treatment of 'knowledge of language' admitted that "speakers' internalized linguistic capabilities are 'about' one or another particular grammar" (my emphasis) and not about langage at all. But it is surely langage and not langue that we must be talking about if we are asking the "five questions".
Whether guided by some awareness of this or for other reasons, three years later Question 4, like the other four questions, was rephrased to excise 'knowledge' (Di Sciullo et al. 2010):
(2) How is language implemented in the brain?
But it is perhaps even more revealing to see how Jenkins (2000) produces yet a third variant of the question:
(3) What are the relevant brain mechanisms?
All of these formulations, fortuitously or otherwise, avoid one of the most crucial issues that biolinguistics should be resolving: the relationship between grammars and how the brain actually produces sentences.1 Consider Chomsky's (1988) version of Question 4.
(4) What are the physical mechanisms that serve as the material basis for this system of knowledge and for the use of this knowledge?
Chomsky's formulation presupposes two distinct and separate mental objects: a system of knowledge and a system for executing that knowledge. It is an astonishingly dualist claim from someone who has consistently adhered to monism, but let that pass. Taking (4) at its face value, there could obviously be two ways of describing syntax. One would provide maximal coverage of the empirical data while simultaneously achieving maximal levels of elegance, simplicity, and explanatory power. The other would adhere, as far as possible, to a literal description of what the brain actually does in order to produce sentences. Would those two descriptions be isomorphic? Not necessarily. The first, constrained solely by the linguistic data, could legitimately use whatever devices might help it achieve its goals of simplicity, elegance, and comprehensiveness, regardless of how its solutions related to what brains actually do. Should those two descriptions be isomorphic? Obviously yes. To the extent that they differed, one would simply be wrong, and if they prove instead to be isomorphic, one is redundant. But which is redundant, the knowledge model or the mechanistic model? There can be no question that the former is redundant, since without the latter, there would be nothing to describe.
1 I am well aware of work by Embick & Poeppel (2005a, b) and associates on the relationship between neurobiology and linguistics. Though the issues may seem the same, I approach them from a different direction with different assumptions and different goals. Consequently we see different problems and different solutions. To discuss these differences would take us too far from present topics, but some flavor of them may be found in this quotation from Poeppel et al. (2012: 14130): "By connecting the brain science of language to formal models of linguistic representation, the work decomposes the various computations that underlie the brain's multifaceted combinatory capacity." Poeppel and his associates seem to believe that, given the right granularity level, current analyses in linguistics and neurobiology can be matched without substantive change to either. I do not.
The standard objection to this sort of argument is to say, "We don't yet know enough about the brain to let it influence the construction of grammatical theories". If that is still true (and it is much less obviously true than it was a decade or so ago), we are not yet ready for biolinguistics, and 'the weak sense' is the only one that might be applicable. However, we can't afford to sit on our hands and wait for neurobiologists to do our work for us. We might find ourselves waiting a long time. The only course is to kick-start the procedure by beginning to think about and discuss what, given all we already know or can reasonably surmise, the brain might be expected to do. What the brain seemed likeliest to do in order to meet its own goals of economy would then become a default hypothesis for the grammar, to be maintained unless or until valid reasons (linguistic or neurological) for abandoning it became manifest.

Covert Movement
There are, of course, serious obstacles to any rapprochement between linguistics and neurobiology, of which the 'granularity mismatch' discussed by Embick & Poeppel (2005a, b) is perhaps the best known. Here I will suggest that differences in the levels at which the analytic units of linguistics and neurobiology respectively apply may be far from all that is involved. It may be that there are also serious mismatches between the types of process envisaged by linguistic analysis and the processes the brain actually uses when it forms sentences. This is not a pressing problem yet, since few if any proposals specific enough to evoke it have so far appeared, but it will surely become one, and very soon, if biolinguistics is to go on developing.
As noted above, most biolinguists subscribe to the MP, which increases the likelihood that at least the basic assumptions of the MP, and likely also the kinds of syntactic process that it employs, will serve as a basis for any attempts to achieve the desired rapprochement. One of the most ubiquitous features of MP analyses is covert movement.
Covert movement differs from overt movement in the following respect. In overt movement, the same syntactic unit is associated with (at least) two positions in the same sentence, but is pronounced in only one of them. The reality of the unpronounced unit can be linked to empirical findings such as the blocking of want-to contraction where the unit is subject of the embedded clause. Given the copy theory of movement, overt movement is unproblematic for the brain; two instantiations of the same item are present during the brain's assembly of sentence materials, but only one is actually uttered. Covert movement is another matter altogether.
Covert movement has been invoked to explain a variety of phenomena, from differences in the positions of verbs and adverbs in French and English to variability in quantifier scope. In contrast to overt movement, covert movement does not link with any empirical finding; its motivation is theory-internal. Processes that involve covert movement include verb raising (Emonds 1985, Pollock 1989), quantifier raising (May 1977), VP shells (Larson 1988), subject-raising (Koopman & Sportiche 1985), and more. The theory-internal nature of these can be readily demonstrated. For instance, VP shells, which involve initially merging direct and indirect objects into positions where the former will c-command the latter and then re-merging them to yield the English surface order, were motivated by a desire to preserve the relation of c-command and thus avoid the apparent violations of Principle A of the Binding Theory first pointed out by Barss & Lasnik (1986). In the case of subject-raising, subjects are supposed to end up in SpecIP, but the latter being a functional projection, they cannot be theta-marked there, and must consequently be assumed to have acquired their theta-marking within the maximal projection of V (Burton & Grimshaw 1992) before being raised.
Readers will have noticed that all the citations for covert movement given above come from pre-MP versions of the grammar, and that the analyses provided therein presuppose versions of X-bar theory that according to Chomsky (1995) should not be included in the strong version of the MP. Yet covert movement lives on and is if anything more frequently invoked than ever, as constituents are required to move covertly for purposes of feature-checking, and, as in the case that follows, also to satisfy the requirements of the Linear Correspondence Axiom (Kayne 1994), among which is that preceding constituents asymmetrically c-command following ones. Take the following derivation of a simple sentence from Hornstein (2009: 31, ex. 22).
(5) [six-step Merge derivation from Hornstein (2009: 31, ex. 22)]
The first bracketed segment in (b) is given as "likes her", although "her likes" is given as the consequence of the merge: an obvious error, though natural enough in view of (6). The use of italicization in the original is also inconsistent; I have repaired this by using italics for unmerged items and normal lower-case for merged items throughout. On a different level entirely, one might question the status of T⁰ and TP, given Chomsky's proposal that "any structure formed by the computation" should be "constituted of elements already present in the lexical items selected for N; no new objects are added in the course of computation apart from rearrangements of lexical properties (in particular, no indices, bar levels in the sense of X-bar theory, etc." (Chomsky 1995: 228, emphasis added).
Copies are retained and originals deleted throughout. Thus a three-word sentence requires six operations, whereas on a naïve view of Merge, two would suffice:
(6) a. Merge her and likes: [likes her]
b. Merge John and [likes her]: [John [likes her]]
Why should not (6), rather than (5), be the way the brain does things? How likely is it that in the course of constructing sentences, the brain should have to move constituents repeatedly into new configurations? (6) is simpler, shorter, and requires less energy, and one of the things we do know about the brain is that it consumes an enormous amount of energy, maybe as much as a quarter of human energy, despite the fact that it forms only a small fraction of body mass. Indeed, the take-home message for biolinguists from the work of Cherniak (1994, 2005; Cherniak et al. 2002) and his associates should be not so much "non-genomic nativism" (interesting and reassuring though that may sound) as the fact that the brain's optimization of wiring patterns is driven precisely and exclusively by its own need for energetic economy.
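The naïve derivation in (6) can be made concrete with a small sketch. What follows is an illustration only, not a claim about how Merge is formalized in the MP literature or about what the brain does: the function names (merge, bracket, derive_naive) are hypothetical, and the sketch assumes a purely right-branching structure built bottom-up with no movement, copying, or feature-checking.

```python
from typing import Tuple, Union

# Assumption for illustration: a syntactic object is either a lexical item
# (a string) or an ordered pair produced by a single Merge operation.
SyntacticObject = Union[str, Tuple["SyntacticObject", "SyntacticObject"]]


def merge(left: SyntacticObject, right: SyntacticObject) -> SyntacticObject:
    """Combine two syntactic objects into one binary constituent."""
    return (left, right)


def bracket(obj: SyntacticObject) -> str:
    """Render a syntactic object in the bracket notation used in (6)."""
    if isinstance(obj, str):
        return obj
    left, right = obj
    return f"[{bracket(left)} {bracket(right)}]"


def derive_naive(words):
    """Build a right-branching structure bottom-up, one Merge per added word.

    Returns the final structure and the number of Merge operations used.
    """
    if not words:
        raise ValueError("need at least one lexical item")
    structure: SyntacticObject = words[-1]
    operations = 0
    # Work from the last word back to the first, as in (6a) then (6b).
    for word in reversed(words[:-1]):
        structure = merge(word, structure)
        operations += 1
    return structure, operations


tree, count = derive_naive(["John", "likes", "her"])
print(bracket(tree))  # [John [likes her]]
print(count)          # 2
```

On these assumptions a sentence of n words needs n − 1 applications of Merge, which is the sense in which two operations suffice for the three-word sentence in (6).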
At this stage, biolinguists are likely to respond, "But it's ridiculous to change tried and trusted analyses just because of vague intuitions about what the brain can and can't do." I agree, it would be, but that's not what I'm saying. All I'm saying is that if we are serious about biolinguistics we should start asking ourselves (and one another) whether it's okay to unquestioningly accept analyses whose motivation is mainly if not wholly theory-internal and which in many cases originated before anyone had started thinking about evolution or brain mechanisms and before there was even a hint of the MP. While some recent works such as Balari & Lorenzo (2013) show a commendable effort to explore physiological and computational foundations for the language faculty, such work is still at a fairly abstract level as compared with a nuts-and-bolts, neurology-friendly description of what core grammatical computations of the specificity of (5) and (6) above really look like. Providing such a description must, sooner or later (preferably sooner), be a task for any adequate biolinguistic theory. A good place to start might be to determine which of the formal proposals of the MP would best fit such a theory; covert movement does not look like a promising candidate.

3. Problems with Biology (Old and New)

Natural Selection
A more immediately pressing issue concerns ways in which biolinguists understand (or misunderstand) biology. In the first place, they have problems with the notion of natural selection, up to and including a total failure to comprehend what it is and how it works. Typical is the following statement from the abstract for Longa (2001): "Natural selection is claimed to be the only way to explain complex design. The same assumption has also been held for language. However, sciences of complexity have shown, from a wide range of domains, the existence of a clear alternative: self-organisation, spontaneous patterns of order arising from chaos." Natural selection could not 'explain' complex design, even if Pinker & Bloom (1990), Dennett (1995), and others who are not biologists think it does. In fact, natural selection does not provide a single one of the factors that go into creating design. As its name suggests, it selects, and that's all it does. The only sense in which it contributes to complex design is by (a) selecting certain alleles to fix in a population, (b) narrowing the search space by its successive choices, and (c) not undoing its own work, so that a ratchet effect preserves each step towards a better adaptation, forming a secure base for subsequent steps. What natural selection selects from (that is, where design and everything else come from) is variation, and many different factors generate that variation: the consequences of assortative mating, genetic mutations of several kinds, variation in gene expression, interactions between genes and genes and between genes and environment, and more. Self-organization is simply one of those factors, albeit a very potent one where the brain is concerned (Bickerton 2014).
Thus Longa is attacking a straw man, and his claim that any process is an 'alternative' to natural selection is simply a category mistake. All the processes that he and others treat as alternatives to natural selection are in fact suppliers of the materials without which natural selection could not even exist. Natural selection simply preserves whichever of these materials works best for a particular species in a particular situation. Whether the result of such preservation increases or reduces complexity depends entirely on the species and the situation concerned: the same force that resulted in eyes for formerly eyeless lineages may lead to blindness in others that formerly had eyes. Natural selection may best be conceptualized not as a designer (of complexity or anything else) but simply as a test that every biological development (not excluding the 'Promethean' mutation of Chomsky 2010: 59) has to pass. How else does Chomsky suppose that his mutation was "transmitted to offspring, coming to predominate"?
Even while they reject the 'creative' role so often attributed to natural selection by non-biologists such as Pinker and Dennett, evo-devo specialists, unlike their linguist aficionados, continue to recognize the centrality and ubiquity of natural selection. In an article specifically claiming that evo-devo represents not a mere addition but an alternative paradigm to neo-Darwinism, Laubichler (2010: 207) asserts that "[t]he developmental system determines whether or not a new phenotype is produced in the first place. Natural selection, of course, then decides its future fate" (emphasis added). The filtering (but exceptionless) role of natural selection is clearly expressed by de Robertis (2008: 194): "In sum, several types of mutations, some acting on the function of conserved developmental gene networks, provide the variation on which natural selection acts." The overall position taken by most, if not all, specialists in evo-devo is well expressed by Arthur (2011: x): "However, [the process of development] is seen as being important as well as, not instead of, changes in gene frequency caused by Darwinian natural selection. This is a crucial point, because some previous approaches to evolution advocated a dismissal of population genetics and a denial that microevolutionary changes within species form the basis of most long-term evolution; this denial is now seen to be mistaken."
Before we leave natural selection we should note a striking irony. Longa's 'alternative' to natural selection, "spontaneous patterns of order arising from chaos", is virtually identical with the claims of computational linguists who oppose the whole idea of an innate universal grammar and promote iterated learning models in its place (Batali 1998, Kirby 2001, Brighton 2002, Christiansen & Ellefson 2002, etc.). If self-organization can single-handedly produce from chaos a brain capable of constructing language, why couldn't it take a short cut and directly produce language itself?

Evo-Devo
Biolinguistic problems extend to more recent developments in biology. Evo-devo (the union of evolutionary and developmental biology) is routinely invoked by biolinguists in any discussion of evolutionary issues (Chomsky 2005, 2007, 2010, Berwick & Chomsky 2011, Boeckx 2006, Balari & Lorenzo 2009, Uriagereka 2011, etc.). Most of these works are long on programmatic statements and short on detailed proposals with empirical support. For instance, Berwick & Chomsky (2011: 27) claim that in development "very slight changes can yield great differences in observed outcomes", but the sole example they offer involves pelvic spines in sticklebacks, hardly on a par with the emergence of what Maynard Smith & Szathmáry (1995) classified as one of only eight major transitions in the whole of evolution. Berwick & Chomsky (2011: 26) seem to suppose that such developmental changes can occur in an ecological vacuum, without any prompting from environmental factors; hence they see language as resulting from some purely organism-internal factor, perhaps "absolute brain size" or "some minor chance mutation". But this runs counter to a large consensus among evo-devo specialists, who repeatedly indicate that developmental changes, even if not directly provoked by external factors, can only take place if there is intensive interaction between genetic or epigenetic events and the environment and ecology of the organisms concerned. Nowhere is this better understood than in the field of ecological and evolutionary developmental biology ('eco-evo-devo'). For example, Ledón-Rettig & Pfennig (2011: 391) recommend taking the spadefoot toad, a species whose tadpoles show extensive phenotypic variation in response to "diverse environmental stimuli", as "a model system for addressing fundamental questions in ecological and evolutionary developmental biology (eco-evo-devo)." The authors go on to declare that "By characterizing and understanding the interconnectedness between an organism's environment, its development responses, and its ecological interactions in natural populations, such research promises to clarify further the role of the environment in not only selecting among diverse phenotypes, but also creating such phenotypes in the first place" (emphasis added; see also Blute 2008, Gilbert & Epel 2008, etc.).
But what is perhaps the most authoritative statement on the true relationship between internal and external forces is made in one of the most influential and most frequently cited treatises in the evo-devo paradigm (West-Eberhard 2003: 20), which deserves citation at some length. "First, environmental induction is a major initiator of adaptive evolutionary change. The origin and evolution of adaptive novelty do not await mutation; on the contrary, genes are followers, not leaders, in evolution. Second, evolutionary novelties result from the reorganization of existing phenotypes and the incorporation of environmental elements. Novel traits are not de novo constructions that depend on a series of genetic mutations."
Where does all this leave stickleback pelvic spines? According to Berwick & Chomsky (2011: 27), "[t]here are two kinds of stickleback fish, with or without spiky spines on the pelvis. About 10,000 years ago, a mutation in a genetic 'switch' near a gene involved in spine production differentiated the two varieties, one adapted to oceans and one adapted to lakes." Not only does this claim (like the associated suggestion that language evolution could have been triggered by a 'minor chance mutation') run directly counter to West-Eberhard's formulation, it is based on a serious distortion of the very papers that the authors cite as primary sources.
The primary sources the authors cite (Colosimo et al. 2004, 2005) have nothing at all to say about the presence or absence of spines in sticklebacks; both papers concern differing quantities of armored plates on oceanic and lacustrine varieties of the species in question (known as the "three-spined stickleback"), and in Colosimo et al. (2005) there are 126 references to these plates as against one mention of spines. The authors can only have derived the notion that the varietal differences involve spines rather than armor from a popular account of evo-devo in the New Yorker (Orr 2005) that they also cite. Furthermore, the notion that the change was due to a single mutation, unrelated to environment or ecology, is not supported by either of the primary sources. The very first sentence of one of these reads: "Particular phenotypic traits often evolve repeatedly when independent populations are exposed to similar ecological conditions" (Colosimo et al. 2005: 1928, emphasis added). Indeed, while mutation could have contributed to the physiological changes, Colosimo et al. note the occurrence of "repeated selection on the standing genetic variation already present in marine ancestors" and conclude that "the presence of a shared haplotype in most low-plated populations suggests that selection on standing variation is the predominant mechanism underlying the recent rapid evolution of changes in lateral plate patterns in wild sticklebacks" (Colosimo et al. 2005: 1932, emphasis added).
In other words, evo-devo factors are constrained by the resources of preexisting phenotypes, and internal developments are typically not stochastic processes but responses triggered by external (ecological, environmental) events. This is a hard pill for dedicated internalists to swallow (and a devotion to exclusively internal processes is key to most of biolinguists' problems with biology, as we will see), but swallow it they must if they want to engage in substantive dialog with biologists. The pill would be swallowed more easily if biolinguists were as cognizant of the other radical innovation in twenty-first century biology, niche construction theory (Odling-Smee et al. 2003, Laland & Sterelny 2006), as they are of evo-devo. Indeed, the two areas complement one another (Laland et al. 2008) by showing precisely how developmental factors interact with environmental ones to bring about evolutionary innovations. The central thesis of niche construction theory is that animals whose livelihood is threatened by some environmental change may respond by trying to carve out a new ecological niche for which they are not genetically pre-adapted, whereupon the target of selection shifts to any traits that support exploitation of that niche, and both genetic and epigenetic factors combine to produce phenotypes that are progressively better adapted to the new niche. But niche construction theory is mentioned once in the biolinguistic literature for every ten or even hundred times that evo-devo is mentioned.
Why this difference in the treatment of two equally radical and equally influential revisions of the 'Modern Synthesis' (MS) that is so often a target for biolinguistic disapproval? The answer perhaps lies in the dreaded word 'environment'. Generative grammar has been virtually from its beginning an internalist theory, allowing only endogenous factors to play a role in the development of the language faculty. More will be said on this score when we come to deal with biolinguistic assumptions about language evolution. Here, I would merely note the disproportion in the amount of attention given to evo-devo and niche construction as illustrating a tendency among biolinguists to cherry-pick biology for researchers whose work supports, or may be presented as supporting, traditional generative positions. This leads them to exaggerate both the extent and the significance of the changes biology is currently undergoing (e.g., "a multiplicity of stunning advances in biology and in evolutionary theory in the last several years have… completely reshaped the standard neo-Darwinian picture"; Piattelli-Palmarini 2008: 185). There is little doubt that within the next decade or two the MS of neo-Darwinism will undergo a substantial revision; the first shots have already been fired (Pigliucci & Müller 2010). There is equally little doubt that this revision will not amount to the kind of gross paradigm shift that many biolinguists hope for and expect: one that would sideline and demote, if it did not banish entirely, the specter of natural selection. For generativists, the title of the Pigliucci & Müller volume ("Evolution: The Extended Synthesis") will recall the Extended Standard Theory (EST) of Chomsky (1973). More than a mere similarity of names is involved. They should find it helpful to note that the relationship between the Extended Synthesis and the MS is very similar to that between the EST and the Standard Theory, in that in both cases the former is an extension rather than a replacement of the latter.
While appeals to evo-devo are ubiquitous in biolinguistic work on language evolution, I know of only two works by evo-devo biologists that directly and substantively address this topic. One is Scharff & Petri (2011), but this paper offers cold comfort for biolinguists. In the first place, it makes no reference to anything in the biolinguistic literature except for the Hauser et al. (2002) program of seeking precursors of language components in other species (see below). In the second place, its focus is on "discussing the evolution of language in the context of animal vocalizations" (emphasis added), thereby ruling out any consideration of the syntactic (recursion, etc.) or the semantic (mind-dependent concepts that Merge computes over) aspects of language, as well as invoking a notion of communicative continuity that is anathema to most biolinguists. The main body of the paper devotes itself to summarizing the present situation with regard to comparative animal studies and discussing the possible functions of FoxP2. In the third place, it culminates with the depressing finding that while FoxP2 obviously has some connection with language, it is still far from clear what that connection is. One of the few things the authors are sure of is that only two amino acids distinguish the human version of FoxP2 from the chimpanzee version and that these acids were very likely not the target of the selective sweep that affected human FoxP2 in the last few hundred thousand years. This means, of course, that the (so far) most plausible candidate for a recent recursion-enabling mutation looks likely to turn out a non-starter.
The second paper, Dor & Jablonka (2010), offers even colder comfort. This paper presents "a social-developmental, innovation-based theory of the evolution of language", at the core of which lies "the understanding that language itself, the socially constructed tool of communication, culturally evolved before its speakers were specifically prepared for it on the genetic level" (Dor & Jablonka 2010: 136). Jablonka is, of course, also co-author of one of the major treatises of the evo-devo paradigm (Jablonka & Lamb 2005), and in this context it is revealing to consider the reaction of a review of this book in the journal Biolinguistics (Piattelli-Palmarini 2008). Piattelli-Palmarini highly praises the overall evo-devo approach of the volume, but is deeply shocked when its authors seemingly abandon this approach in the case of language evolution, substituting a gradualist, culturally-driven account. He does not consider an alternative explanation: that biolinguists in general may have misunderstood evo-devo, distorting and exaggerating its emphasis on organismal-internal development, and that in consequence, when it comes to language evolution, evo-devo is no more friendly to orthodox biolinguistic accounts than it is to gradualist-externalist ones.

'Design Features' and 'Precursors'
One might have hoped that when real biologists came on board, so to speak, biolinguists might have acquired a better understanding of modern biology. Unfortunately, this was not to be the case. Chomsky's collaboration with two biologists, Marc Hauser and Tecumseh Fitch, gave rise to a paper (Hauser et al. 2002) that most biolinguists treat with reverence as a classic example of "Science's Compass" (the section of Science in which the article originally appeared), pointing the way to all subsequent investigators of language evolution. Unfortunately, discussion of this paper has focused almost exclusively on quibbles about what is, and what is not, to be included in FLN (the faculty of language, narrowly conceived) as opposed to FLB (the totality of mechanisms involved in language; see Pinker & Jackendoff 2005, Fitch et al. 2005, etc.). Commentators failed to notice much more important and troubling aspects of the paper that related to biology rather than to linguistics. The evolution of language must have taken place during the evolution of humans, as a part of that evolution, and indeed, given its importance in their subsequent development, as arguably the most important part of that evolution. In fact, surprisingly little of the literature, biolinguistic or other, makes any serious attempt to place language evolution in the context of human evolution. But even in that company, Hauser et al. (2002) stands out as being perhaps the only work on the evolution of language that includes not a single word about how humans evolved. (Imagine a paper about the evolution of dam-building without a word about how beavers evolved.)
The resultant space is filled with abuses of the comparative method. These involve decomposing language into component features or functions and then seeking other species where these components can allegedly be found. Ironically, this approach was pioneered by Hockett (1960) and Hockett & Altman (1968), while Hockett was developing what his Wikipedia entry describes as his "stinging criticisms of Chomskyan linguistics". Hockett's work was praised by Hauser (1996) and gave rise to the methods pursued by Hauser et al. (2002), which differed from Hockett's only in that "design features" such as 'semanticity' and 'duality of patterning' were replaced by more functional-sounding components such as "vocal imitation and invention", "capacity to acquire non-linguistic conceptual representation" and "imitation as a rational, intentional system". Such components were to be sought among species as diverse as whales, macaques, and starlings. It is assumed without argument throughout the paper that once these 'precursors of language' have been found and analyzed, language evolution has been definitively explained (except perhaps for recursion, unless this too can be found somewhere else in the animal kingdom). Subsequently, biolinguists have accepted, still without argument, that "building blocks of language" (Lorenzo 2012: 289) lie scattered across a wide range of species, just waiting to be assembled in the human brain.
I know of no species other than humans for which such a procedure has even been suggested, let alone put into practice. Standard texts in comparative evolutionary biology such as Harvey & Pagel (1991: 1) give the following examples of the kinds of question comparative biologists might try to answer: "How much molecular evolution is neutral? Do large genomes slow down development? Is sperm competition important in the evolution of animal mating systems? What lifestyles select for large brains? Are extinction rates related to body size?" Nowhere is it suggested that any complex trait in a given species can be explained by breaking it into components and studying those components regardless of phylogenetic distance or ecological context. 3 For example, studies of the evolution of echolocation in bats (Zentali 2003, Neuweiler 2003, Jones & Holderied 2007, Li et al. 2007) never look outside bats for explanations, even though a number of other species (whales, dolphins, oilbirds, swiftlets, shrews, tenrecs) have developed different types of echolocation. Yet one of these sources, Li et al. (2007), suggests bat echolocation as a precursor of human language! This approach has been sharply criticized by some comparative psychologists. In reviewing studies that claim similarities between non-human traits and components of human language, Rendell et al. (2009: 238) state that "the loosely defined linguistic and informational constructs […] are problematic when elevated beyond metaphor and pressed into service as substantive explanation for the broad sweep of animal-signaling phenomena". According to Owren et al. (2010: 762) the procedure becomes abusive when "characteristics of signaling in an array of species are routinely tested for possible language-like properties, thereby turning the normal evolutionary approach on its head", and incidentally taking an approach to the comparative method that is not only "more a distraction than a boon to serious scientific inquiry" (Owren et al. 2010: 763) but also "both teleological and circular" (Rendell et al. 2009, loc. cit.). Such an approach presupposes that humans are somehow special and should therefore be treated differently from other species. It is almost as if human language constituted the goal towards which animals were constantly striving but were as constantly falling short.
3 It should be noted, however, that any set of components must be arbitrary and subjective, since the Hockett and Hauser et al. lists differ in every particular. I do not doubt that a third and equally disjoint set could be easily assembled. An anonymous reviewer sees the study of FLB as licensing the kind of comparative studies that I criticize "if one assumes the dichotomy" of FLN/FLB. But regardless of whether one assumes it or not, this isn't part of the solution; it's part of the problem! Such comparative studies are legitimate if the dichotomy is legitimate and the dichotomy is legitimate if the comparative studies are legitimate, but both are assumptions, and assumptions, moreover, that entail one another: if you think language divides in this way you must think that most language components are spread across other species, and conversely. This, though a blatant circularity, might be excusable if there weren't any other possible assumptions. But in fact at least one assumption is more plausible: niche construction theory strongly suggests that a novel trait with all its essential components (as distinct from mere pre-requisites) evolves in place, as a structured whole rather than a collection of mostly pre-existing attributes.
However, treating humans as special is far from the only failing of Hauser et al. (2002). Let us give the article the benefit of the doubt and assume that a novel and highly complex trait could have emerged in a single species through the accumulation of component parts from a large number of different species. 4 Would this explain how and why language evolved in humans and only in humans? Not really; in fact, not at all. Even if we make an additional leap of faith, assuming that all the components of the language faculty stem from "deep homologies" (Shubin et al. 2009), so that the same genetic and developmental mechanisms underlie vocal imitation in humans and whales, constraints on rule learning in humans and macaques, and discrimination of sound patterns in humans and starlings, the real problems remain. How did all these components come together in a single species? Why did this happen in humans but not in any other species, some of which must have shared many, if not all, of the same components? When they came together, how and why did they form a single module devoted to language? Why didn't they simply go on doing what they had done in other species, which by definition (since language is unique to humans) must have been things that had nothing to do with language?
4 Two anonymous reviewers found fault with my claim that a componential approach to the evolution of language is illegitimate and that biolinguists who adopt it are thereby misusing the comparative method. One provided me with a list of biology textbooks showing that "reusing and recombining pre-existing resources" is the default explanation for evolutionary novelties. This, as I was well aware, is indeed the case, where physiological form is concerned! But form is not behavior, and the texts I was referred to deal exclusively with form; not a single behavior is analyzed in this way. If and when biologists successfully decompose orb-web spinning, echolocation, bowerbird nest construction, or, to bring things closer to home, hymenopteran communication systems, I will be happy to reconsider my position. As things stand, to extrapolate from form to behavior is simply wishful thinking. The same reviewer also cited the work of Lynn Margulis as an indication that complex biological traits could derive from separate components, but this is again comparing apples with oranges. Margulis's work is concerned exclusively with whole prokaryotes that absorbed one another to form eukaryotes. Nobody is (I hope) claiming that humans emerged when a whale swallowed a macaque and a starling (three species Hauser et al. 2002 mention as possessing language precursors), but that is the only kind of process that might be analogous in the present context.
Even stating all these questions does not exhaust the problems. In biology generally, it is assumed that novel evolutionary developments can be driven only by a particular set of circumstances that changes the selective pressures operating on the species in question. Hauser himself, when not associating with Chomsky, fully recognizes this, indeed takes it for granted. For instance, he asks: "What special problems do bats confront in their environment that might have selected for echolocation?" (Hauser 1996: 154, emphasis added). Similarly, he points out that "[t]he goal [in dealing with possible analogies; DB] isn't to mindlessly test every species under the sun, but rather, to think about the ways in which even distantly related species might share common ecological or social problems, thereby generating common selective pressures and ultimately, solutions given a set of constraints" (Hauser et al. 2007: 108; emphasis added). Most biologists would unquestioningly agree with this, but Hauser et al. (2002), like most of the biolinguistic literature, simply ignore any connection between novel traits and special external problems.

Protolanguage
If language didn't evolve to solve any special problem but emerged as a result of organism-internal developments, there need not be anything you could call proto-language. I can think of nothing more likely to create a barrier between biolinguists and a majority of biologists than the former's insistence that language emerged ready-made, "pretty much as we know it today" (Boeckx 2012: 495). For most biologists it is axiomatic that any complex evolutionary trait has real precursors, that is to say not separate alleged components in other species but immature versions of the complete trait, in the species concerned or its immediate ancestors, that would have similar functions but lack some of the mature trait's features, or have them only in some partially developed form, or both. Among biolinguists, however, protolanguage denial is the norm, 5 and possible real precursors, as distinct from the illegitimate ones described in previous paragraphs, are often explicitly dismissed.

5 An anonymous reviewer complained that s/he had found no protolanguage deniers among biolinguists apart from Berwick and Chomsky. I find this remark extraordinary in light of the fact that two more are cited in this section of this paper: Boeckx (see his remark that language emerged ready-made, "pretty much as we know it today" cited earlier in this paragraph) and Piattelli-Palmarini (see below), who in the article there cited, without explicitly denying the possibility of a protolanguage, renders one effectively impossible by denying the possibility of a medium with words but without syntax and rejecting the belief that language could have evolved in a series of steps. The assumption of a sudden and rapid evolution of language some 50,000 to 100,000 years ago, shared by Hornstein (2009) and numerous other biolinguists, also entails that there cannot have been a protolanguage, regardless of whether this is explicitly claimed or not. Note also the absence of any discussion of protolanguage (how it might have been constituted, or how it might relate to language) from virtually all biolinguistic accounts of language evolution apart from Fitch (2010). Biolinguists for the most part do not even go to the trouble of denying protolanguage. Despite the number of authors that have discussed it, they simply assume it doesn't exist and is therefore not even worth talking about.
Consider the following, from Berwick & Chomsky (2011: 31): "Notice that there is no room in this picture for any precursors to language, say a language-like system with only short sentences. There is no rationale for postulation of such a system: to go from seven-word sentences to the discrete infinity of human language requires emergence of the same recursive procedure as to go from zero to infinity, and there is of course no evidence for such protolanguages." This echoes in slightly different words Chomsky's (2010: 53) claim that "There are many proposals involving precursors with a stipulated bound on Merge: for example, an operation to form two-word expressions from single words, perhaps to reduce memory load for the lexicon; then another operation to form three-word expressions, etc. Clearly there is no evidence from the historical or archaeological record for such stipulations…" It is surely significant that though there have been many coherent arguments for the necessary existence of a protolanguage (Bickerton 1990, Jackendoff 1999, Fitch 2010, among others), none of them are answered or even mentioned here. In place of rational answers we find straw men or even outright falsehoods. Chomsky cites not a single example of the "many proposals" for protolanguages with stipulated sentence lengths, for the simple reason that there are none. No one has suggested even a language with "short sentences", because utterances in protolanguage have never been claimed to be sentences. Sentences of natural language (and I know of no other kind) are propositions with syntax; protolinguistic utterances are propositions without syntax. As for absence of "evidence from the historical or archaeological record", protolanguage had disappeared tens if not hundreds of thousands of years before there was any 'historical record', while the 'archeological record', throughout history and prehistory alike, remains stubbornly silent on length and complexity of any utterance, whether in protolanguage or language.
Part of the problem is that Chomsky does not accept the existence of any way to put words together except through Merge, which is nothing if not a full-fledged syntactic process. His position here seems to me entirely irrational. Its full flavor cannot be grasped without quoting from a correspondence we had on this precise issue. When I wrote, "[p]rotolanguage consists of A + B + C…, i.e. there is no Merge," Chomsky replied, "[t]hat's commonly believed, but it's an error. A sequence a, b, c… that goes on indefinitely is formed by Merge: a, {a, b}, {{a, b}, c}, etc. (or some other notation, it doesn't matter). If we complicate the operation Merge by adding the principle of associativity, then we suppress {, } and look at it as a, b, c…. So a sequence is a special case of Merge, with added complications" (Noam Chomsky, p.c., 16 March 2006).
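The set-theoretic step in the reply just quoted can be made concrete. What follows is a minimal sketch, not Chomsky's own formalization: it assumes that Merge can be modeled as unordered two-member set formation, and the names merge, left_nested and flatten are hypothetical helpers introduced here purely for illustration.

```python
from functools import reduce


def merge(x, y):
    """Toy model of binary Merge as unordered set formation: Merge(x, y) = {x, y}."""
    return frozenset([x, y])


def left_nested(words):
    """Build {{a, b}, c}... by repeatedly merging the result so far with the next word."""
    return reduce(merge, words)


def flatten(obj):
    """Discard all bracketing and keep only the words (the 'suppress {, }' step)."""
    if isinstance(obj, frozenset):
        return frozenset().union(*(flatten(member) for member in obj))
    return frozenset([obj])


words = ["a", "b", "c"]
left = left_nested(words)             # {{a, b}, c}
right = merge("a", merge("b", "c"))   # {a, {b, c}}

print(left == right)                    # the two bracketings are distinct objects
print(flatten(left) == flatten(right))  # identical once bracketing is discarded
```

In this toy model the two bracketings of a, b, c differ until the braces are suppressed. Note that the flattened result is an unordered collection, so under these assumptions any linear order would have to be supplied by something outside the sketch.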
The principle (more frequently described as 'property') of associativity is what makes processes like addition and multiplication apply to sequences like 1 + 3 + 6 regardless of the order in which the operations are carried out (in other words [1 + 3] + 6 yields an identical result to 1 + [3 + 6]). It follows that the order in which integers are arranged (1, 3, 6; 6, 3, 1; 3, 1, 6…) is equally immaterial. This is precisely true of the examples of types of protolanguage for which we do have historical records. For example, we have Nim Chimpsky's utterance (Terrace 1979):
(7) Give orange me give eat orange me eat orange give me eat orange give me you.
The propositional meaning of (7) is clear (Nim wants someone to give him an orange to eat), but the same meaning is conveyed by any arrangement of the constituents. This contrasts sharply with the English equivalent, where of the sentences in (8), only (8a) arranges its constituents in an order acceptable to English speakers.
Similarly we have utterances of pidgin speakers, many of which are semantically more opaque than Nim's utterances (note that for the sake of comprehensibility I have adjusted the phonology, interesting but irrelevant here, to fit English spelling conventions):
(9) a. And then, white meat tuna, three hundred seventy-five dollar, one ton - that's why, white meat kind, us go get 'em, no? (Japanese pidgin speaker, Hawaii)
b. And too much children, small children, house money pay, very hard time, no more money - poor. School children, my children go school, take house money pay, everything poor, too hard, that's why Korea Kim name one more time me marry. (Korean pidgin speaker, Hawaii)
c. Inside lepo (dirt) and hanapa (to cover) and blanket. (Filipino pidgin speaker, Hawaii)
Clearly, norms of constituent structure found in any natural language do not hold in (early-stage) pidgins such as that used in Hawaii from 1788 to the emergence of creole around 1900 (Roberts 1998), and subsequently by any adult immigrants who had arrived before the first creole speakers reached adulthood and began to influence the rest of the population (Bickerton & Odo 1976). If a medium lacks any consistent constituent structure, the most likely (perhaps the only possible) reason is that the medium lacks syntax: no principle or rule-governed process, certainly not Merge, determines the order in which words are strung together. Yet if Chomsky is correct, both pidgin speakers and Nim the chimp must first be applying Merge, then the 'principle of associativity' to undo any combinatorial properties peculiar to Merge, and then presumably some 'principle of distributivity' to arrive at the variable orderings shown in (9). Why anyone would have to go to such lengths when they have the obvious alternative of just stringing the words together anyhow is something only Chomsky, if anyone, can explain.
The impossibility of protolanguage is supported, from a different albeit complementary position, by Piattelli-Palmarini, who claims that there cannot be any form of language that has words but no syntax: "Words are fully syntactic entities and it's illusory to pretend that we can strip them of all syntactic valence to reconstruct an aboriginal non-compositional protolanguage made of words only, without syntax" (Piattelli-Palmarini 2010: 160). But stripping words of some or all of their syntactic valence is exactly what both Nim and the pidgin speakers do, in their rather different ways, in (7) and (9) respectively. Words to them, and presumably to the original pre-human protolanguage speakers, simply were not the same as words today. But if one is committed to essentialism, that assumption is impermissible.
The first words can't have had syntactic valences because there was no syntax to provide those valences. They were mere lexical shells, vocal or gestural forms that could carry a meaning of some sort (perhaps vaguer and more general than that carried by natural-language words) but little else. I call such things 'words' because what else could you call them: proto-words? They are not 'calls' or 'signals' in the animal-communication sense of those terms. They are symbolic, but they are more than mere symbols; a cross on a map may be a symbol for 'church', but you can't insert that into a conversation, however crude and simplified. All one has to do is accept that words, like language itself, evolved over time. To claim otherwise commits one to essentialism. And it is as a result of the intersection of essentialism and internalism that the most serious problems for biolinguistics arise.

Essentialism + Internalism = Anti-Biologism
Essentialism is anathema to most biologists for a variety of reasons, the "population thinking" of Mayr (1963) being the most frequently mentioned and the issue of speciation being perhaps the most relevant here. Regardless of whether changes occur in a species over long periods or in a rapid cascade (the likelier procedure under niche construction theory) there comes a time when some individuals descended from species X can no longer be regarded as members of species X but must be assigned to a new species, species Y. If it were possible to draw a hard and fast line anywhere in the process (if for instance each differentiating feature, from a new means of exploiting food sources to sterility and ultimately impossibility of hybridization, occurred simultaneously and instantaneously on the flipping of a set of developmental switches), essentialism might make some sense. But things don't happen that way. What is misleadingly characterized as a 'speciation event' may take hundreds of thousands if not millions of years to complete (Foley & Lahr 2005). In light of these facts, a sudden birth of words with all their current properties seems far less likely than a developmental process, giving time for a variety of influences, both external and internal, that would have progressively added to and refined properties of the original lexical shells.
Internalism runs equally counter to most biological thinking. Even those biologists who join with biolinguists in rejecting the MS (see, e.g., Dor & Jablonka 2010, Laubichler 2010 as cited above) concur with supporters of the MS in conceding that external forces and events are almost always instrumental in triggering evolutionary developments, as shown by the numerous citations of evo-devo authors in preceding sections. The consensus is most forcibly stated by Müller (2010: 314), who gives a fully explicit statement of the sequence of events as perceived by evo-devo: "…the 'behavioral change comes first' position also gained new support from developmental psychology. Behavioral flexibility based on developmental plasticity is argued to result in behavioral neophenotypes, which in turn cause morphological innovation followed by genetic integration."
But internalism seems to be entailed by essentialism. Essentialism paints biolinguists into a corner by imposing a strict time limit: if language is deprived of true precursors, forms intermediate between animal communication and full language that arose in the two-million-year history of the genus Homo, it cannot be older than the species that possesses it (~200 kya, at the most) and any universal grammar cannot be younger than the start of the human diaspora (90-60 kya). This gives insufficient time for any prolonged interaction with the environment or for any complex new traits to develop. In the words of Boeckx (2012: 495), "[t]he recent emergence of the language faculty is most compatible with the idea that at most one or two evolutionary innovations, combined with the cognitive resources available before the emergence of language, delivers our linguistic capacity pretty much as we know it today." Logically, only internal developments could bring this about in the narrow time-window available. Logically, but not biologically: the notion that a single mutation, or even a rapid cascade of mutations, could precipitate one of the eight major transitions in evolution is something that the geneticist Rebecca Cann has dismissed as "magical thinking" (Diller & Cann 2010).
Painting oneself into a corner always has negative consequences, and the essentialist-internalist corner is no exception. Chomsky (2010: 57) was perhaps the first to clearly spell out one of the most crucial differences between language and animal communication. The latter refers directly to what Chomsky called "mind-independent entities" (things out there in the world), whereas language does so only indirectly, having as its primary reference the "mind dependent entities" of categorical concepts. It does this by a process of lexicalization: by providing each of these concepts with an associated word. Words form a common currency that "mixes conceptual apples and oranges in virtue of them all being word-like things", as Boeckx (2012: 498) insightfully observes.
But where do words come from? The inability of biolinguistics (so far) to deal with this question is clearly shown by the fact that one leading biolinguist has published in the same year two contradictory explanations. Berwick, as second author in Miyagawa et al. (2013), commits himself to the opinion that the alarm-calls of vervet monkeys constitute "the simplest lexically based system", suggesting that "non-human primate calls may be construed as lexical" and thus formed precursors of "lexical structure" that only required to be joined with a computational component for full human language to emerge. But as lead author in Berwick et al. (2013) he takes a much more pessimistic (and realistic) view, noting that primate calls "lack key properties of human words" and that consequently "there is scant evidence on which to ground an evolutionary account of words" (p. 93).
Other biolinguists are equally baffled. While emphasizing that only words, not animal signals, are accessible to the operation Merge, Chomsky (2010) has nothing to say on their origin. Boeckx (2012: 499) does try to grapple with the issue, but he can only offer three possibilities: "random mutation", "an inevitable spandrel", or "we will never know". Another serendipitous mutation on top of the one for recursion is too much to swallow. A spandrel immediately prompts the question "spandrel of what?", for which no immediate answer is forthcoming. "We will never know" is a counsel of despair that has been uttered countless times in human history and almost as frequently refuted by the advances of the natural sciences.
What is, from a scientific perspective, totally unacceptable about this treatment of word origins is that there already exists, and has been in print for the last four years, a fully-developed explanation of how words could have originated (Bickerton 2009; see now the much fuller exposition in Bickerton 2014) that is nowhere discussed or even mentioned in the sources cited above. This is, moreover, an explanation explicitly licensed by Hauser et al.'s (2002: 1572) proposal of "the extension of the comparative method to all vertebrates (and perhaps beyond)" (emphasis added) as well as by the already-cited adjuration of Hauser et al. (2007: 108) to "think about the ways in which even distantly related species might share common ecological or social problems, thereby generating common selective pressures…" This explanation may be, as a reviewer remarked, "plain radical externalism", but so what? For any unbiased inquirer, this would no more exclude it from consideration than its being 'plain radical internalism', especially if internalist accounts had failed to supply any explanation at all. It is precisely this tendency among biolinguists to prejudge issues along ideological lines that I am objecting to.
More than a decade ago, Lyle Jenkins (2000) stated that the major goal of biolinguistics was to become integrated into the natural sciences. Alas, practices like those described here are taking it not nearer but further from that goal. Some biolinguists may react defensively to what I have written here. I think that would be a mistake, because this paper is not an attack and was never intended as an attack. I have merely tried to take an objective view of beliefs and practices that may have been held and carried out without full realization of their consequences. Unless biolinguists really wish to become isolated from other biological sciences, they should as a minimum think much more carefully than they have done to date about the issues raised here.