Some Questions about Determining Causal Inference and Criteria for Evidence: Response to Ladd, Dediu & Kinsella (2008)



Introduction
Ladd, Dediu & Kinsella (2008; LDK from now on) and Dediu & Ladd (2007; DL from now on) are excellent examples of contemporary biolinguistic research and convey what Chomsky (2000: 27) calls the primary goal of bringing "the bodies of doctrine concerning language into closer relation with those emerging from the brain sciences and other perspectives". The articles also shed light on what Chomsky (2005, 2007) calls the "three factors" of language design: (i) genetic factors and UG, (ii) experience and variation within narrow parameters, and (iii) principles not specific to Language such as efficient computation. Additionally, they point out what Boeckx & Grohmann (2007: 2) define as the sense of 'weak' and 'strong' biolinguistics. In particular, LDK's investigation of correlations between populations exhibiting a low frequency of certain allele combinations and populations exhibiting a specific type of language feature (tone systems)¹ concretely puts into practice observational analysis of possible genetic factors related to UG principles and parameters and the notion that variations from UG principles must be narrow in range. Here, the narrow variations (parameters) are part of both the 'physical' and 'abstract' properties of linguistic inquiry and implicate consequences for both the physical brain and the abstract-theoretical structure of grammatical systems. Of course, LDK and DL only present an observed correlation that could be the result of chance, as any correlation between X and Y may be the result of chance with no underlying causation between X and Y. The goal of my response is to highlight some questions that could be useful for deriving inferences from the observed correlations to a degree of causation (see Clark 2000, Shipley 2000, and Thagard 1998 for discussion of causality and correlation). I also ask some questions about what kinds of evidence and/or counter-examples are needed to potentially support an inference from gene-tone correlation to causation.

Thank you to the editors. I especially want to thank an anonymous reviewer for very helpful comments, though she/he may not agree with the direction in which I have taken them.

¹ To be entirely accurate about LDK and DL's idea about the direction of bias (whether the muted allele pairs bias toward tone systems or non-muted allele pairs bias toward non-tone systems), I quote DL (2007: 4): "Finally, note that this bias could be either for or against tone, but the fact that nontonality is associated with the derived haplogroups […] suggests that tone is phylogenetically older and that the bias favors nontonality".

The Problem
Despite the pioneering new ground that biolinguistic inquiry is starting to cover, it still faces classic problems related to the issues of correlation and causality, evidence, counter-example, and refutation. The LDK and DL papers are no exception to these problems. Here I ask questions about general problems of adequate evidence/counter-example and causal inference from correlation in gene-language studies. My questions originate from what I perceive to be a possible problem of simultaneity in correlating genetic and linguistic features (see below). A related problem that arises in making causal inferences from observed correlations between genes and languages has to do with the fact that populations of speakers can change languages or language features, consciously or not, quite rapidly when compared to the time it takes for genetic change (see Campbell 2006 for this basic idea applied to gene populations and language families, as well as criticisms of many of the gene-language identification approaches). In other words, a homogeneous genetic population can, over time, come to represent a heterogeneous linguistic population by "random chance" of history, culture, and demography. The gene-language relations in these cases are coincidental and no complex causal chain of inference can be established. LDK are vigilant in responding to these 'spurious' relations, which are most likely the result of chance and not causality, and thus their own correlational observation seems not to be the result of chance. It is worth quoting LDK (2008: 117) in full.
The statistical analysis showed that the distribution of the correlations between genetic and linguistic features strongly supports the hypothesized connection between ASPM-D/MCPH-D and tone. To rule out the likelihood that this correlation is of the spurious type discussed above, i.e. due entirely to underlying demographic and linguistic processes, Dediu & Ladd computed the correlation between tone and the two derived haplogroups while simultaneously controlling for geographic distances between populations (a proxy for population contact and dispersal) and historical linguistic affiliation between languages (a proxy for similarity through common descent); the proportion explained by these factors turned out to be minimal (again, details are to be found in Dediu & Ladd 2007 and Dediu 2007). It seems, therefore, that the relationship between tone and the derived haplogroups is not due to these standard factors; instead, it could reflect a causal relationship between the inter-population genetic and linguistic diversities.
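As a purely illustrative aside, the general idea of computing a correlation "while controlling for" another factor, as described in the quoted passage, can be sketched with a simple partial correlation. This is not Dediu & Ladd's actual procedure (their details are in Dediu & Ladd 2007 and Dediu 2007). The data are synthetic, and the names `partial_corr`, `geography`, `allele_freq`, and `tone_score` are hypothetical, introduced only for this sketch. The toy data are built so that two variables correlate only because both depend on a shared third factor.

```python
import numpy as np

def partial_corr(x, y, controls):
    """Correlation between x and y after regressing out the control variable(s)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Design matrix: an intercept column plus one column per control variable.
    Z = np.column_stack([np.ones(len(x)), np.asarray(controls, dtype=float)])
    # Residuals of x and y after least-squares regression on the controls.
    rx = x - Z @ np.linalg.lstsq(Z, x, rcond=None)[0]
    ry = y - Z @ np.linalg.lstsq(Z, y, rcond=None)[0]
    return float(np.corrcoef(rx, ry)[0, 1])

# Synthetic toy data (an assumption for illustration, not real measurements):
# both variables are driven by a shared factor and are otherwise unrelated.
rng = np.random.default_rng(0)
n = 500
geography = rng.normal(size=n)                        # hypothetical shared factor
allele_freq = geography + 0.5 * rng.normal(size=n)    # hypothetical genetic variable
tone_score = geography + 0.5 * rng.normal(size=n)     # hypothetical linguistic variable

raw_r = float(np.corrcoef(allele_freq, tone_score)[0, 1])
partial_r = partial_corr(allele_freq, tone_score, geography)
print(f"raw correlation:     {raw_r:.2f}")
print(f"partial correlation: {partial_r:.2f}")
```

In this toy case the raw correlation is strong while the partial correlation is close to zero, which is the signature of a spurious relation. LDK report the opposite pattern for tone and the derived haplogroups: the proportion explained by the control factors was minimal.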
But I have a question. Does correlating a typological feature, such as tone, with a (muted) genetic feature in a population assume that the typological feature has been around approximately as long as the genetic feature it correlates with? That is, if we observe a correlation between tone and a low frequency of alleles in specific populations, and we want to infer some complex causal chain wherein the genetic features are part of a complex network of causal factors for the emergence of tone, then the genetic features and the typological features should be simultaneously existent at some point in time. However, the correlated typological feature need not be active, nor does it need to be fully developed. But this leads to a problem for LDK and DL's observations. It is true in general that the absence of evidence (for a feature or property) is not evidence of absence (for that feature or property). But let us assume, as LDK and DL do, that a human being does not need the proposed genetic feature in order to acquire or use a natural language tone system. There seems to be no clear way to distinguish the natural development of tonogenesis in languages whose speakers do not have the muted allele pairs from the development of tonogenesis in languages whose speakers do have the muted allele pairs, assuming also that tonogenesis proceeds in the same way in both population groups (which, if LDK and DL are right, it should not). Additionally, if one assumes that tone languages arise only in populations with the muted allele pairs, then that is begging the question.

Furthermore, LDK state that the particular focus of their discussion "is the recent claim (Dediu & Ladd 2007) that there is a causal relationship between genetic and linguistic diversities at the population level, involving brain growth-related genes and linguistic tone" (p. 114). If one is going to draw a 'causal' implication between gene populations and a typological feature, then is one also arguing that the language feature has stayed somewhat actively constant in the target population? LDK allow for the masking of the typological feature by other features or factors, but if the correlation is viewed as somehow 'causal' in any degree, then such a typological feature should have the capacity to resurface systematically in at least some of the populations that have a predisposition for it. But given the rapidity and frequency with which language populations can, and usually do, alter typological features (due to reanalysis, borrowing, and extension, and to intergenerational parameter shifts; see Harris & Campbell 1995, Lightfoot 1979, 1991, and Roberts 2007 for these two partly conflicting views on language change), it would be rare for any population to retain such features consistently for a substantial period of time, say 6,000 years.² Precisely because such long-term retention is rare, finding long-term retention of tone systems in at least some of the distinct populations exhibiting the gene-typology correlation would add credence to the possibility of a neuro-genetic bias. Additionally, long-term retention seems to point out a possible direction for dismissing the gene-typology correlation: Show that the typological feature has not stayed constant in the target population it is supposed to be correlated with. Of course, there are mitigating circumstances, and one instance of a counter-example would not be enough to dismiss LDK's suggestions. It has also been pointed out by a reviewer that there is no problem in showing that factor X contributes to the prevalence of phenomenon Y, and then observing that in given populations Y has disappeared while X is still existent; in this case other factors W, A, Z suppress the effect of X. As the reviewer points out, this notion is essential to what a correlation is. But I argue that a systematic instability or non-continuity of the supposed typological feature in all the target populations would be adequate evidence for questioning the validity of LDK and DL's correlation leading to a causal explanation.
Imagine a scenario where speakers of language T are transplanted to a population of speakers of language I. In this scenario the children of subsequent T generations will acquire perfect I, becoming I_T. But if the ethnically T descendants who speak I (= I_T) were subsequently isolated from the original I population for an adequate amount of time, the LDK view would seem to predict that a possible tonal neuro-genetic bias could again trigger some kind of tonogenesis in the newly acquired and now geographically isolated I_T. This kind of 'natural experiment' could surely provide some evidence, but it would not be easy to observe and is unlikely ever to be observed. Instead it serves its purpose as an appropriate scenario, or 'thought experiment', and brings to light another question I have. Exactly what kind of counter-example (or what kind of systematic instability) of the possible neuro-genetically biased typological feature would count as dismissive?³ If a (muted) genetic feature in specific populations can be correlated with a typological feature of those same populations, then should we not expect that typological feature to be prevalent in the languages of those populations? And if it is not always prevalent or stable across a sufficient period of time (a very likely outcome, as the 'thought experiment' informally shows), then what kind of instability of the predicted typological feature would count as a genuine counter-example to a possible LDK-type hypothesis? In other words, if there existed a population of speakers with the muted allele pairs that had never acquired or used a tone system, then how would one explain this? If the explanation was the systematic suppression of the typological feature of tone by other factors W, Z, A, then what evidence would count as showing the systematic non-expression of a feature that is correlated with a genetic predisposition for it? To put it another way, how does one measure the suppression of a typological feature?

One might argue that (i) a causal inference from the correlation of a genetic feature with a typological feature does not imply that the typological feature, whether active or not, should be approximately as old as the genetic feature. In this case, an external or internal stimulus could 'trigger' the rapid development of the typological feature throughout the target genetic population through the mechanism of intergenerational transmission. One might also argue that (ii), assuming a causal inference from the correlation, the development of the typological feature took a very long time to reach its present state, and thus there is no need to say that the typological feature had been around for 6,000 years. Instead, it is a more recent innovation with a long historical development now facilitated by the mechanism of intergenerational transmission. But (i) and (ii) are both complications that need to be verified empirically. In the first case of the 'trigger' (i), what would serve as an adequate stimulus? In the second case of the development scenario (ii), it seems one would be hard pressed to show how this development differs from any other kind of language structure development that occurs over time: What makes it so unique that it can be causally linked with a correlated neuro-genetic feature? I argue that if LDK's correlational observation is going to yield a causal explanation, then it cannot escape the implication that the 'life' of the typological feature (whether active or not) should be roughly simultaneous with the appearance or 'activation' of the genetic feature. Providing evidence for this simultaneity is another issue.

³ I admit that this is simplistic, as counter-examples need not dismiss, destroy, or falsify a hypothesis, or a theory built from hypotheses, based on empirical observations (assuming a theory can develop from the gene-tone correlation). I merely intend here to ask what would constitute a genuine counter-example.
Lastly, if a typological feature could be proved to be stable for a specific population of speakers, and this population appeared to have some unique genetic feature that could be shown to correlate with the language feature in question, then there would be a discrepancy in the time-depth of reliable information between linguistic and genetic data, since historical-descriptive linguistics generally has a reliable time-depth of only 6,000 years, while genetic information can exceed this limit by a substantial amount.⁴ It is not clear if this poses any real problems for causal inferences about genetic and typological features, but it is surely a factor in considering the kinds of evidence used for establishing genetic and typological relations.
In (1) I repeat the questions asked above, though I accept the risk that they may not be coherent out of the context in which they were asked and that there may be some redundancy. Following (1) are a few more questions, in (2), that might be relevant to both the general method of gene-language correlation and the specific observations of LDK and DL. I could not possibly begin to sketch answers to all of these questions in a short response, but I will try to give very short answers to those in (2).
(1) a. Does correlating a typological feature, such as tone, with a (muted) genetic feature in a population assume that the typological feature has been around approximately as long as the genetic feature it correlates with?
    b. If one is going to draw a "causal" implication between gene populations and a typological feature, then is one also arguing that the language feature has stayed somewhat actively constant in the target population?
    c. Exactly what kind of counter-example, or systematic instability, of the possible neuro-genetically biased typological feature would count as dismissive?
    d. If a (muted) genetic feature in specific populations can be correlated with a typological feature of those same populations, then should we not expect that typological feature to be prevalent in the languages of those populations?
    e. What kind of instability of the predicted typological feature would count as a genuine counter-example to a possible LDK-type hypothesis?
    f. If there existed a population of speakers with the muted allele pairs that had never acquired or used a tone system, then how would one explain this?
    g. What evidence would count as showing the systematic non-expression of a feature that is correlated with a genetic predisposition for it?
    h. How does one measure the suppression of a typological feature?

(2) a. Are tone systems really prone to regular historical change, or are they somehow more resilient to change (in the sense that if tone systems are very stable in themselves, then long-term retention of them may not be due to a possible genetic bias but to the abstract nature of the typological feature itself)?
    b. Can we confidently show that populations suggested to exhibit genetic factors that increase the likelihood of having tone systems based on certain muted allele pairs have historically stable tone systems, and what are the linguistic factors contributing to loss or gain of tone systems in these populations?
    c. What kind of unstable, or discontinuous, appearance of the typological feature in the target population counts as a genuine counter-example, or does the criterion of "appearance of the feature" even qualify as relevant to establishing the parameters for counter-examples to LDK's research?
    d. What kinds of assumptions about simultaneity of genetic properties/features and typological properties/features are operative when discussing issues of the neuro-genetic bases of natural human languages?

⁴ Where ASPM-D is about 5.8 thousand years old and MCPH-D is 37 thousand years old. Perhaps a moot point here, because the correlation crucially involves the pair of haplogroups, and so any typological feature correlated with the pair can only be as old as the earliest instance of the pair.
I think the answer in (2a) is fairly straightforward: Tone systems are prone to regular change and do not show any more stability than other structures (Gussenhoven 2004, Yip 2002). But with this answer come more questions about certain facts of tone. For example, if the target population has a predisposition or bias to acquiring and using tone systems, then do they also have a bias for what are commonly recognized as the phonemic/phonetic precursors to tone (Fromkin 1978, Hombert, Ohala & Ewan 1979, Matisoff 1973)? (Of course, see footnote 2.) The answer to (2b) would take some time, but I believe that it is a productive direction towards compiling linguistic data sets relevant to LDK's research. Of course, it has its strict limits, namely that even with written records going back 6,000 years the evidence of a tone system in a language that old is difficult (or impossible) to substantiate. As for (2c), and also (1c) and (1e), I have no adequate answer, but it seems to be an important and relevant question for specific issues in LDK and DL if one assumes that the goal is to derive causal inferences from the observed correlations and that the problem of simultaneity is a real problem. As for (2d), it is a general question relevant to the methodological aims and practices of biolinguistic research specifically aimed at deriving causal inferences from correlational observations about genes and typology; it can only be answered through the process of research, investigation, and critical inquiry and can, I believe, potentially have what Chomsky (1995: 232) attributes to the Minimalist Program: "a certain therapeutic value".

Conclusion
Unless a causal link between gene and language, or between genetic feature and typological feature, can be established, an observed correlation does not seem to be very useful. LDK and DL are clearly committed to a research strategy that seeks to discover a causal link, although it is overwhelmingly clear that this link should not be direct or deterministic and is likely not to be. Any degree of causality here, I think, is generally expected to be of a complex, multifactorial nature. In fact, Paul Thagard's (1998) Causal Network Instantiation (CNI) model for making causal inferences from observed correlations in medical scientific explanations for diseases seems like a good fit with the LDK and DL research. As Thagard (1998: 76) himself says,

I expect, however, that there are many fields such as evolutionary biology, ecology, genetics, psychology, and sociology in which explanatory practice fits the CNI model. For example, the possession of a feature or behavior by members of a particular species can be explained in terms of a causal network involving mechanisms of genetics and natural selection. Similarly, the possession of a trait or behavior by a human can be understood in terms of a causal network of hereditary, environmental, and psychological factors. In psychology as in medicine, explanation is complex and multifactorial in ways well characterized as causal network instantiation.
In pushing any research to reveal potentially useful inferences from correlation to causation, one almost heuristically demands that there is a causal link once chance has been somewhat ruled out (and while trying to rule out other causes), and then one works to establish the most likely complex causal path. This should also be true in the search for causal paths, webs, or networks from genes to languages, or from populations with specific muted allele pairs to populations who are predisposed to acquire, use, or generate tone languages. Shipley (2000) argues that in most cases correlation implies an unresolved causal structure, unresolved in that we have not yet discovered cause, effect, and/or other variables. Shipley (2000: 3) says that "[i]n fact, with few exceptions, correlation does imply causation. If we observe a systematic relationship between two variables, and we have ruled out the likelihood that this is simply due to random coincidence, then something must be causing this relationship." Precariously, the assumptions needed for discovering inferences from correlation to causation may turn out to be as complex as the phenomenon under investigation. As Chomsky (1995: 233) notes, "[i]t is all too easy to succumb to the temptation to offer a purported explanation for some phenomenon on the basis of assumptions that are roughly the order of complexity of what is to be explained". LDK and DL seem to me to be careful not to succumb to this 'temptation'. And even though the assumptions needed to discover inferences from correlation to causation may be complex, and the criteria of evidence for measuring the neuro-genetic bias or predisposition that a person or population may have for exhibiting some linguistic trait may not seem clear (whether that trait is ever expressed or not), it is well to remember what Boeckx (2006: 91) points out about rigor and maturation in research programs: "Programs take time to mature, and rigor cannot be required in the beginning". The expectation of solid evidence of some causal link between muted allele pairs and tone systems is premature and stifles the hard-won creativity in research that the Minimalist Program, and by extension Biolinguistics, has achieved. New areas of scientific research are messy, and this messiness should not cloud our vision of what kind of order may reveal itself over time. But this does not mean we should not ask a variety of questions and expect some answers, or at least a direction towards answers. Whether or not an inference from correlation to causation in LDK and DL is ultimately found, and whether or not the questions asked here prove useful or relevant, the lesson is that there is at least a "therapeutic" value to biolinguistic research through eliminating questions and trying to establish causal inferences.