Brain-Language Research: Where is the Progress?

Recent cognitive neuroscience research improved our understanding of where, when, how, and why language circuits emerge and activate in the human brain. Where: Regions crucial for very specific linguistic processes were delineated; phonetic features and fine semantic categories could be mapped onto specific sets of cortical areas. When: Brain correlates of phonological, syntactic and semantic processes were documented early on, suggesting language understanding in an instant (within 250 ms). How: New mechanistic network models mimicking structure and function of left-perisylvian language areas suggest that multimodal action-perception circuits — rather than separate modules for action and perception — carry the processing resources for language use and understanding. Why language circuits emerge in specific areas, become active at specific early time points and are connected in specific ways is best addressed in light of neuroscience principles governing neuronal activation, correlation learning, and, critically, partly predetermined structural information wired into connections between cortical neurons and areas.


Introduction: Questions in Focus
The aim of the neuroscience of language is to find the brain correlates of linguistic processes and representations. Correlates of linguistic representations are sought in neuronal structures, that is, nerve cell circuits, and correlates of linguistic processes are sought in patterns of neuronal activation. These aims have as yet not been reached. In many cases, conclusions are still at the level of 'areas' 'performing' certain functions, a state not untypical for cognitive neuroscience in general. However, such 'arealogy' can be understood as an intermediate step on the journey towards neuroscientific explanation. To keep the ultimate destination in sight and in focus, it may be relevant to pause and check. Ultimately, the clarification of the brain correlates of a given cognitive representation R and process P implies answers to (at least) four critical questions:
1. Where-question: Which brain parts, areas, and, eventually, neurons are active during, and are critical for, process P and the representation(s) R that P relies on?
2. When-question: At which point in time in the usage or understanding of language does process P occur; when is representation R activated and processed?
3. How-question: Which neuronal circuit, which nerve cells linked in which way, is the brain basis for representation R; which spatiotemporal pattern of neuronal activation in this circuit underpins the process P?
4. Why-question: For what reason are R and P located in these specific brain parts and activated at these specific points in time, and why is R laid down in this specific neuronal circuit, P being expressed by these specific activation patterns?
The present contribution will briefly review research addressing critical facets of these four questions. One focus will be on recent progress in mapping specific linguistic representations and processes onto brain space and time, and a second focus will be on circuit structure and function.

Where-Question: Meaning
Once, the name of the game in the cognitive neuroscience of language was to find a place in the brain for the major modules of linguistic processing. For example, when I wrote a paper for the journal Behavioral and Brain Sciences in 1999, a number of colleagues, well-known leaders in the field, commented on my review of the cortical basis of semantic processing, most of them by communicating that they had good empirical evidence to believe that a specific brain part is particularly relevant for word meaning processing (see comments in Pulvermüller 1999). This 'meaning centre', as one may want to dub it, was placed in different parts of the left hemisphere, so that a large part of the left hemisphere was covered with semantic areas, and reconciling the different views with each other appeared difficult. A more recent update of the literature shows a similar picture, especially in the temporal lobe, on which many studies focus. For example, Hickok & Poeppel (2007) put their 'lexical interface', assumed to connect phonological and semantic representations, in the middle-temporal cortex, Scott & Johnsrude (2003) suggest the anterior part of the superior-temporal gyrus as the meaning interface, and Patterson et al. (2007) propose, based on a wealth of evidence from degenerative brain disease, that the temporal pole is the key area for semantic processing. Figure 1 illustrates the variability of positions (Epstein 1999, Posner & DiGirolamo 1999, Pulvermüller 1999, Salmelin et al. 1999, Skrandies 1999, Tranel & Damasio 1999, Scott & Johnsrude 2003, Hickok & Poeppel 2007, Hodges & Patterson 2007). This paper attempts to develop an integrated perspective.
The principal problem of the debate about such unitary meaning centers, or centers whose function it is to bind any meaning to any word/symbol, is the following: There is solid evidence for the importance of various cortical areas, at least in temporal and frontal lobes, in semantic processing and, by accumulating more evidence in favor of the importance of any one area, one cannot, evidently, disprove the role of the other ones. This would only be the case if there was an exclusive either-or, that is, if only one cortical area was allowed to include a major meaning switchboard. Although semantic processing implies the integration of information from different sensory modalities, such integration can be computed locally between adjacent neurons as well as by distributed populations of interacting neurons; hence, again, there is no need for a unitary semantic area.
A second major problem for a unitary meaning approach is semantic category-specificity: Lesions in many cases do not affect all word kinds (and symbol types) to the same degree. Depending on where the lesion is situated, specific categories of knowledge are affected more or less. Significant differences between semantic kinds, such as animal vs. tool names, have been documented with lesions in frontal and temporal cortex (Warrington & McCarthy 1983, Gainotti 2006) and processing differences between fine-grained semantic categories have even been reported in patients with semantic dementia (Pulvermüller et al., in press). Here, the solution lies in the integration of general lexicosemantic and category-specific semantic processes, as may be manifest in the interaction between a range of cortical areas (Patterson et al. 2007, Pulvermüller et al., in press). The precise location (or, perhaps better, distribution) of general and category-specific semantic circuits is one of the hottest topics in current neuroscience research. Figure 2 shows recent data indicating the approximate locations of category-specific semantic circuits, as they can be inferred today from neuroimaging data, and contrasts them with brain activations generally seen for meaningful written word stimuli (Pulvermüller, Kherif et al. 2009). Areas found active generally to all kinds of words may indicate the distribution of circuits for processing of general lexical-semantic information, whereas the widely distributed area sets found active for specific semantic types may index the distribution of category-specific semantic circuits.
So where is the progress? It still lies in the mapping of meaning on brain matter. Not just in the mapping of any kind of meaning to brain structure, or the delineation of a unitary meaning centre, global semantic binding site or the like, but in the brain mapping of sometimes fine-grained semantic categories and subtypes of knowledge. Most words indeed activate middle and inferior-temporal areas mainly involved in the processing of visual information about objects. This is not surprising because most words in languages like English are nouns referring to objects known through the visual modality. Animal and tool words, and similarly their related concepts, activate different inferior-temporal and middle-temporal areas in both hemispheres (Damasio et al. 1996, Chao et al. 1999, Martin 2007), and words referring to objects with characteristic form or color features (square vs. coal) elicit activity in overlapping but distinct areas in fusiform, parahippocampal and middle temporal gyri (Moscoso Del Prado et al. 2006, Pulvermüller & Hauk 2006, Simmons et al. 2007). The inferior-temporal cortex, from pole to temporo-occipital junction, reflects a range of semantic distinctions, and lesions in this region also appear to lead to specific degradation of particular semantic categories, that is, to category-specific semantic deficits (Warrington & Shallice 1984, Damasio et al. 1996, Miceli et al. 2001, Neininger & Pulvermüller 2003). As one example, lesion of rostro-mesial temporal cortex in the left hemisphere, a subpart of which (anterior parahippocampal gyrus) was found active specifically during color word processing, impairs object color knowledge specifically (Miceli et al. 2001).
The temporal lobes are not the only key areas for semantic processing. Words loaded with affective-emotional meaning can activate the amygdala, insular structures, and the posterior cingulate cortex (Straube et al. 2004, de Araujo et al. 2005). Odor words, as compared with matched control words, activate olfactory cortex along with limbic structures (Gonzalez et al. 2006), sound-related words activate the superior temporal lobes more strongly than matched control words (Kiefer et al. 2008) and, critically, action-related verbs spark the motor and premotor cortex in such a specific manner that the body part relatedness of the action indexed by the words becomes manifest in somatotopic activation in the motor strip (Pulvermüller et al. 2000, Hauk et al. 2004, Shtyrov et al. 2004, Pulvermüller, Shtyrov & Ilmoniemi 2005). Words such as 'pick' and 'kick' would therefore specifically activate areas also active when subjects move their finger or foot. The overlap between areas active during motor performance and during congruous word processing is not complete; notably, normal motor performance creates somatosensory input, leading to somatosensory postcentral activation which, when overlaid with motor cortex activation, shifts the centre of gravity of activation backward, towards the parietal lobe. However, the somatotopic line-up of premotor activity reflecting aspects of action semantics could be replicated by a range of studies (Tettamanti et al. 2005, Aziz-Zadeh et al. 2006, Tomasino et al. 2007, Kemmerer & Gonzalez-Castillo 2010, Boulenger et al. 2009, Raposo et al. 2009), with occasional failure to replicate activity in specific regions of interest (Postle et al. 2008). In one study, the semantic somatotopy could even be documented in abstract idiom processing ('grasp the idea', 'kick the habit'; cf. Boulenger et al. 2009), consistent with an embodied, partly compositional view on abstract sentence meaning construction, to which lexical meaning contributes (Lakoff 1987, Barsalou 1999).
Importantly, these motor activations seem to index critical parts of the cortical semantic processor. Lesions in the motor system impair the processing of action-related words, especially that of action verbs (Damasio & Tranel 1993, Daniele et al. 1994, Neininger & Pulvermüller 2003, Tranel et al. 2003, Gainotti 2008) and, in addition to these, of the related action concepts (Bak et al. 2001, Bak et al. 2006). In healthy individuals, magnetic stimulation below the motor threshold to hand and foot areas in the left motor cortex could be shown to facilitate the processing of hand and foot related words specifically (Pulvermüller, Hauk et al. 2005). These results document a causal role of the motor system in processing action concepts and words semantically related to actions.
What we have learned is, therefore, that the level of specificity of brain-meaning mapping is much greater than previously thought. This is exciting from a linguistic perspective, as some semantic features of words seem to be apparent from the brain response they elicit. Of theoretical importance here is the fact that semantic areas could be predicted a priori on the basis of brain theory, lending strong support to the underlying explanatory model (see the why-section, section 5 below). Very specific action and perception features of referential semantic information linked to words can be mapped onto cortex. The search for the unitary meaning centre has, however, led to much disagreement, although it is possible that meaning integration at highly abstract levels draws upon only one area. The meaning centre seems to be best described as the union of brain areas critically involved in category-specific processing, and the bias towards temporal cortex may relate to the habit of researchers to test object nouns and their related concepts. Note that important knowledge about most objects comes through the visual modality and the involvement of the inferior-temporal stream of object processing is therefore not surprising. Even for abstract words and sentences, different areas were found active by different researchers (e.g., Noppeney & Price 2004, Binder et al. 2005, Boulenger et al. 2009), raising the question whether category-specificity might hold even at abstract semantic levels (Pulvermüller & Hauk 2006). In one view, gradually more abstract semantic representations develop in progressively anterior areas in temporal and frontal cortex as a consequence of sensorimotor activity (Pulvermüller 2008).
An integrated view proposes category-specific semantic circuits whose precise distribution depends on meaning type (cf. Fig. 2). Areas most important for meaning emerge close to left-perisylvian language cortex, especially the inferior-frontal and superior-temporal gyri and sulci along with the underlying insula. All linguistic functions depend on this perisylvian region, whereas the category-specific meaning circuits extend throughout the cortex, the extrasylvian space. Action and object related meaning circuits draw upon motor and sensory areas and abstract semantic circuits develop in the vicinity of these sensorimotor sites, in anterior temporal and prefrontal cortex. There is differential laterality of linguistic and semantic processes and representations. Due to some property of the left perisylvian cortex (see the why-section 5), linguistic circuits are generally lateralized, although semantic circuits are spread out more symmetrically throughout both hemispheres (Fig. 2; Pulvermüller & Mohr 1996, Pulvermüller, Kherif et al. 2009).
Although the recent support for category-specific semantic circuits appears as a milestone in understanding the brain basis of meaning, it should not be ignored that some colleagues expressed criticisms. Caramazza's group suggested that motor activity during the processing of action verbs may not be related to semantic processes but may instead be an epiphenomenon related to mental images being retrieved, if not entirely irrelevant 'overflow' activation (Oliveri et al. 2004). In the face of more recent neuropsychological evidence supporting a crucial role of motor systems for processing words of specific action-related semantic categories (for review, see Pulvermüller & Fadiga 2010), a new proposal now acknowledges a (possible) semantic function of the motor system, but complements it with an abstract symbol processor (Mahon & Caramazza 2008), a view similar to Patterson et al.'s (2007) suggestion that a 'semantic hub' (according to their data, in the temporal pole) complements widely distributed category-specific semantic circuits (see also Pulvermüller et al., in press).
A common misunderstanding about the role of sensorimotor circuits in semantic processing is that they provide the only source of meaning knowledge. However, this position does not appear very plausible. Combinatorial knowledge about words regularly occurring in sentence and discourse contexts implies semantic knowledge, for example about the most frequent color word the item 'strawberry' would co-occur with (Landauer & Dumais 1997). Combinatorial word properties allow not only the classification of words into syntactic classes, they also lead to distinctions along semantic boundaries, separating types of objects and types of actions (Pulvermüller & Knoblauch 2009). A mechanistic neurobiological approach captures the storage of the underlying word-word correlations by way of the very same mechanisms it also uses for storing word-world correlations in neuronal links between sensorimotor and perisylvian language cortices (Pulvermüller 2010). Furthermore, correlation learning is not restricted to the single word level, but can, in principle, occur for larger constructions, especially if they are being used stereotypically in specific contexts (Goldberg 2003). Current neuroimaging results seem consistent with a contribution of semantic representations of both constituent words and whole constructions when the meaning of abstract idiomatic sentences is being processed (Boulenger et al. 2009).
Some issues in the cortical localization of semantic processes are still open. The idea that access to movement knowledge tied to words is reflected in lateral temporal activation just anterior to a movement sensitive visual processing area (Martin et al. 1995) was recently questioned based on a lack of activation differences between nouns with more or less semantic relationship to movement (Bedny et al. 2008). While this finding argues against a role of middle temporal cortex in kinematic semantics, there is still solid evidence that the action-relatedness of word meaning is reflected in the activation of the left middle temporal area (MNI coordinates -62/-52/4; Hauk et al. 2008). The fact that the area activates more strongly to verbs than to nouns (e.g., Bedny et al. 2008, Hauk et al. 2008) is consistent with the action relatedness of most verbs, even verbs used to speak about so-called 'internal states'. States such as thinking and feeling have characteristic behavioural expressions, thus intrinsically linking the semantics of the respective terms to action (Wittgenstein 1953). Therefore, any noun-verb difference is hopelessly confounded with semantic differences (Pulvermüller et al. 1999). Furthermore, a recent study suggested that in the left middle temporal area, there are, side by side, different subareas that respond to words generally (-53/-49/-1), thus possibly contributing to general lexico-semantic processes, and to very specific semantic subcategories of action verbs (e.g., hand-related action verbs, -49/-51/-9) (Pulvermüller, Kherif et al. 2009). Such fine subcategorization may be a consequence of recurrent connections with the motor system, where semantic somatotopic activation is established. If the middle-temporal activation to action-related words is due to links with the motor system (rather than to knowledge about moving visual input), it becomes explainable why such activation persists in visually deprived individuals (Mahon et al. 2009) who are not principally limited in their action repertoire and typically learn words, even visually-related ones, in action contexts (see Landau & Gleitman 1985). These data are consistent with a differential role of temporal and frontal areas in semantic processing, although more research may indeed help to clarify the various linguistic roles of middle temporal gyrus activation in word and sentence processing.

Where-Question: Speech Sounds
Phonological processes are located in perisylvian cortex. In one view, speech analysis is attributed to systems in the anterior-lateral (antero-ventral processing stream) and/or posterior part of the superior-temporal cortex (postero-dorsal stream, including planum temporale and lateral superior-temporal gyrus) (Rauschecker & Scott 2009). The planum temporale and other posterior superior-temporal areas have long been viewed as critical for language perception and understanding, based on evidence from clinical language deficits (see, e.g., Geschwind 1970). Recent neuroimaging experiments showed that speech yields stronger activation in antero-ventral superior-temporal areas compared with matched noise patterns (Scott et al. 2000, Uppenkamp et al. 2006), and this evidence is also consistent with data from macaques showing that anterior superior-temporal activity indexes species-specific calls (Romanski et al. 1999). Similar responses in posterior superior-temporal cortex to speech and other acoustic stimuli still allow for a role of this region in speech-language processing. This observation is compatible with a view of postero-dorsal areas, especially planum temporale but possibly also temporo-parietal junction, as a 'computational hub' for processing spectrotemporally rich acoustic patterns (Griffiths & Warren 2002). In addition to superior-temporal cortex, inferior-frontal cortex is active during listening to speech, as could be demonstrated using TMS (Fadiga et al. 2002), and inferior-frontal activation even persists during passive exposure to speech, as could be shown using MEG (Pulvermüller 2003, Pulvermüller, Shtyrov & Ilmoniemi 2003). Critical inferior-frontal areas include posterior Broca's (pars opercularis) and premotor cortex (Wilson et al. 2004). Similar to the posterior superior-temporal cortex, the motor system's role is not confined to speech processing. The sounds of actions activate different sections of the fronto-central sensorimotor cortex in a very similar manner as linguistic sounds do (Hauk, Shtyrov & Pulvermüller 2006, Lahav et al. 2007). These results suggest that the computational hub for sound processing extends from posterior-temporal cortex to inferior-frontal and premotor regions. Precisely timed spatio-temporal patterns of cortical activation spreading in this distributed cortical system may signify the processing of speech and other action sounds (Pulvermüller & Shtyrov 2009).
Similar to the semantic domain, recent advances in our knowledge about phonological representations and processes in the brain relate to specificity. Different areas in superior-temporal cortex were found active when subjects listened to different kinds of speech sounds (Diesch et al. 1996, Obleser et al. 2003, Obleser et al. 2006, Pulvermüller et al. 2006, Obleser et al. 2007). Typical examples of the phonemes [p] and [t], for instance, were mapped to adjacent areas in superior-temporal gyrus, anterior to primary auditory cortex and Heschl's gyrus. Interestingly, a similar phonological mapping was evident in the motor system, where the production of [p] and [t] activated different precentral areas in a somatotopic fashion. The articulatory mapping of phonemes to the motor system corresponded to the localization of the articulators mainly involved in the production of the respective speech sounds: the lips for [p] and the tongue for [t] (Lotze et al. 2000, Hesselmann et al. 2004).
Notably, these different precentral motor/premotor areas were also found active during listening to speech. Listening to [t] activated the precentral focus also excited when producing a [t] or moving the tongue tip, and when hearing [p], a slightly dorsal area also active when producing this phoneme or when moving the lips lit up (Pulvermüller et al. 2006). The critical role of these motor systems in speech perception is evident from TMS work stimulating the motor regions of the lips and the tongue: Such stimulation biases the speech comprehension system in favor of congruent sounds. Therefore, when the tongue (or lips) area was stimulated, subjects tended to perceive [t] (or [p]) sounds more quickly, or even to misperceive [p] sounds as [t] (or the reverse) (D'Ausilio et al. 2009). This observation demonstrates that motor systems critically contribute to the speech perception process.
In the phonological domain, progress seems two-fold. First, the perisylvian cortex, which is well-known to be critical for phonological processing and representation, can be further subdivided according to phonological properties. Phonetic distinctive features, DFs, and speech sounds discriminated by these DFs can be mapped on different brain substrates in inferior-frontal and (antero-lateral) superior-temporal cortex. Second, the temporal and frontal neuronal ensembles appear to interact with each other and to be functionally interdependent in phonological processing. The summarized data argue against proposals that seem to play down the role of frontal cortex in speech perception (see the how-section 4 below; Hickok & Poeppel 2007, Lotto et al. 2009, Scott et al. 2009). Note again that left inferior-frontal cortex activates in speech perception even when subjects try to ignore incoming speech sounds. Frontal activation therefore does not depend on attention being focused on speech (Pulvermüller et al. 2003, Pulvermüller & Shtyrov 2006, 2009), although attention certainly exerts a modulatory function on language-elicited brain activity (Garagnani, Shtyrov & Pulvermüller 2009, Shtyrov et al., in press). In the language domain, the posterior-dorsal vs. anterior-ventral stream debate seems, at present, not fully conclusive, as both parts of the superior-temporal cortex are apparently involved in speech processing and the absence of phoneme specificity in the posterior superior-temporal cortex appears as a null result without strong implications. Clear evidence exists for anterior-lateral superior-temporal activation discriminating phonemes from noise and phonemes from each other, but a contribution of posterior parts of superior-temporal cortex to phonological processing also receives support.

Where-Question: Syntax
It would seem exciting to delineate cortical maps for rules of syntax, similar to the mapping of semantic categories and that of phonetic DFs reviewed earlier in this section. However, such syntactic mapping has so far not been fully successful and major reasons for this lie in the tremendous difficulties the grammar domain creates for the experimental scientist. When comparing grammatical sentences to word strings with syntactic errors, the latter elicit stronger brain activation in left perisylvian cortex, especially in inferior-frontal and in superior-temporal cortex (e.g., Friederici et al. 2000, Indefrey et al. 2001, Pulvermüller & Shtyrov 2006, and Friederici 2009). When directly comparing sentences with different grammatical structure, for example active and passive, subject and object relative, and coordinated and subordinated sentences, the grammatically more demanding sentences tended to elicit stronger activation; again some of the activation differences were located in left perisylvian cortex (Just et al. 1996, Caplan et al. 2000, Caplan et al. 2008). Although these results suggest that processing of grammaticality and of the complexity of grammatical structure relates to inferior-frontal and superior-temporal circuits, they do not unambiguously prove this. Ungrammatical sentences are rare and therefore exceptional, whereas grammatical ones are normally more common, and among the grammatical ones, the sentences considered to be more complex (e.g., object relatives) are rarer than the ones considered to be simpler (e.g., subject relatives). Heroic attempts have been made to control sequential probabilities while, at the same time, varying grammatical structure of well-formed sentences (e.g., Bornkessel et al. 2002). However, as has been argued by linguists, such control has not been perfect (Kempen & Harbusch 2003), making it seem impossible to exclude the probability confound.
Whether syntax depends on discrete combinatorial rules or is best described in terms of sequential probabilities constitutes a major debate in cognitive science (McClelland & Patterson 2002, Pinker & Ullman 2002). The linguistic position that rules of syntax and universal underlying principles govern grammar is in contrast with approaches using systematic probability mapping in neural networks or statistical procedures lacking any rule-like symbolic representations. These, too, are capable of modeling linguistic processes and have the additional advantage of explaining aspects of the learning of grammar (Rumelhart & McClelland 1987, Hare et al. 1995). That certain types of syntactic (and, likewise, phonological, semantic) structures a priori require a system of discrete combinatorial rules and representations may, therefore, appear too strong a statement, although this assumption figures as a firm cornerstone of much cognitive theorizing in the second half of the 20th century. Whether discrete representations and especially rule-like entities exist turns out to be an empirical issue but also one addressable by brain theory and brain-based modeling (see how-section 4). The term 'discrete', an expression with many facets that is used in different areas of cognitive science with rather different meanings, is used here to refer to a mechanism that either is being engaged in a given condition or is not, with little if any room for gradual intermediate steps. The sentence 'Build a sentence from (at least) a noun and a verb' describes a discrete combinatorial mechanism at an abstract linguistic level. Can we expect that such discrete rule-like processes, rather than probability mapping, are effective at the neurobiological level?
Empirical testing of the existence of rule-like mechanisms is possible if probability mapping and rule-applicability dissociate. Examples are sentences that are grammatically correct but extremely rare in language use. These can be contrasted with grammatical sentences that are common, but also with ungrammatical strings that are rare to a similar degree as the rare grammatical items. As mentioned, ungrammatical strings elicit stronger brain activity than common grammatical sentences. As this neurophysiological difference is even observable if identical recordings of spoken word strings are presented many times and even when subjects do not pay attention to the speech stimuli, some grammatical brain processes appear to be automatic (Shtyrov et al. 2003, Hasting & Kotz 2008, Pulvermüller et al. 2008). But, critically, would a rare grammatical string produce a brain response indistinguishable from that of a common grammatical string, as a discrete all-or-nothing approach might suggest, or would the gradual probability differences between the strings be reflected in the neurophysiological brain response, as a probability mapping theory would predict? The former was found: The brain response to rare ungrammatical strings was enhanced, whereas that to grammatical strings was attenuated, regardless of the sequential probability of their constituent words (Pulvermüller & Assadollahi 2007). This result pattern is consistent with, and therefore supports, the rule theory. It should however be noted that a neural approach without discrete representations can be modified to fit these data if appropriate non-linearities are built into the network.
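The logic of this dissociation design can be made concrete with a minimal sketch. It is only an illustration under stated assumptions: the condition names follow the text, but the probability values, the response scale, the tolerance and all function names (rule_prediction, probability_prediction, classify) are hypothetical and are not taken from any of the cited studies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Condition:
    name: str
    grammatical: bool
    sequential_probability: float  # made-up value in [0, 1], for illustration only


# The three string types contrasted in the design described above.
CONDITIONS = [
    Condition("common grammatical", True, 0.9),
    Condition("rare grammatical", True, 0.1),
    Condition("rare ungrammatical", False, 0.1),
]


def rule_prediction(c: Condition) -> float:
    """Discrete account: the response is enhanced only if a rule is violated."""
    return 0.0 if c.grammatical else 1.0


def probability_prediction(c: Condition) -> float:
    """Graded account: the response grows with the unexpectedness of the string."""
    return 1.0 - c.sequential_probability


def classify(responses: dict, tolerance: float = 0.1) -> str:
    """Decide which account a response pattern resembles.

    The critical condition is the rare grammatical string: it should pattern
    with common grammatical strings under the rule account, and with rare
    ungrammatical strings under the probability account.
    """
    common, rare_g, rare_u = (responses[c.name] for c in CONDITIONS)
    like_common = abs(rare_g - common) <= tolerance
    like_ungrammatical = abs(rare_g - rare_u) <= tolerance
    if like_common and not like_ungrammatical:
        return "rule-like"
    if like_ungrammatical and not like_common:
        return "probability-like"
    return "undecided"


if __name__ == "__main__":
    for label, predict in [("rule", rule_prediction),
                           ("probability", probability_prediction)]:
        pattern = {c.name: predict(c) for c in CONDITIONS}
        print(label, pattern, classify(pattern))
```

The sketch deliberately leaves out the caveat raised at the end of the paragraph above: a graded network with suitable non-linearities could produce a step-like pattern that this simple classifier would also label as rule-like.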
So, again, where is the progress? In the delineation of category-specific semantic circuits distributed over specific sets of cortical areas, in the mapping of phonetic features onto brain systems that encompass superior-temporal and inferior-frontal (including premotor) areas, and in new evidence in favor of discrete combinatorial rules supported by left perisylvian circuits.

When? The Rapid Time Course of Language Understanding
A mainstream view held that language understanding is a relatively late process (for review, see Barber & Kutas 2007 and Pulvermüller et al. 2009). Semantic processes, along with lexical ones, were assumed to be first indexed by the N400 component of the event-related brain potential and field. Syntactic processing was assumed to be indexed by an even later component, called P600. Both components peak around half a second after information necessary for identifying critical stimulus words is present, suggesting that at least this amount of time elapses between presence of a word in the input (say, 'bear' in a warning context), and the initiation of an appropriate response (for example, running away). Such long-delay comprehension systems may have advantages under certain conditions; however, from a Darwinian perspective, a faster system minimizing the comprehension latency would certainly have constituted an evolutionary advantage.
In fact, some research indicated early brain reflections of syntactic and semantic processing. In the semantic domain, the meaning of action words becomes manifest in somatotopic motor systems activation already 100-250 ms after availability of information about the identity of spoken (Shtyrov et al. 2004, Pulvermüller, Shtyrov & Ilmoniemi 2005) or written (Pulvermüller et al. 2000, Hauk & Pulvermüller 2004) stimulus words. At the same latencies, the neurophysiological responses dissociated between word kinds semantically linked to visual information, for example, between color- and form-related words (Sim & Kiefer 2005, Moscoso Del Prado Martin et al. 2006, Kiefer et al. 2007). Likewise, large word categories differing in both their grammatical function and their semantic characteristics (for example, grammatical function words and referential content words, or object nouns and action verbs) dissociate neurophysiologically within 250 ms (for an overview, see Pulvermüller, Shtyrov & Hauk 2009). A range of other psychological and linguistic factors, including frequency of occurrence of words and their parts and general semantic properties, could also be mapped onto the first 250 ms, in this case calculated from the onset of written word presentation. The early effects (<250 ms) seem to be more variable and less robust than the later (~500 ms) ones (Barber & Kutas 2007). Importantly, they do depend strongly on stimulus properties, especially the length and luminance of written words, the loudness of spoken words, and the point in time when they can first be recognized with confidence (Pulvermüller & Shtyrov 2006). Early and late indexes of semantics may reflect different processes in the analysis of word meaning, automatic semantic access and semantic re-analysis (Pulvermüller, Shtyrov & Hauk 2009).
Syntactic processing has long been known to have an early brain correlate. Violations of phrase structure rules were found to lead to enhanced negativities already at 100-250 ms, in the early left-anterior negativity (Neville et al. 1991, Friederici et al. 1993). More recently, similar early violation responses, in the syntactic mismatch negativity component, have been reported to violations of the rules of phrase structure and agreement (Shtyrov et al. 2003, Hasting & Kotz 2008, Pulvermüller et al. 2008). It is these early responses that are automatic (see where-section above), whereas the late ones (P600) depend on attention to stimulus sentences: Brain correlates of syntactic mismatches in the 'syntactic mismatch negativity' can be recorded in subjects who do not attend to speech that includes grammatical violations; the early responses (up to 150 ms) remain unchanged even if subjects are heavily distracted from speech and syntax by a continuously applied attentional streaming task. This proves that at least some early mechanism of the brain's grammar machinery operates automatically or, as once claimed, 'like a reflex' (Fodor 1983).
In sum, syntactic and semantic processes are reflected by relatively late brain responses that depend on task and attention to stimuli. In addition, early (<250 ms) brain indexes of syntax and semantics also exist, and these seem to be less dependent on attention being paid to stimuli, in some cases entirely attention-independent. For phonological and pragmatic processes, there are also reports about early as well as late brain correlates. For example, an early brain correlate of phoneme processing is present in the mismatch negativity (Dehaene-Lambertz 1997, Näätänen et al. 1997) and early indexes of pragmatic deviance have been reported in studies of text processing (Brown et al. 2000). Phonological expectancy violations (Rugg 1984, Praamstra & Stegeman 1993) as well as pragmatic and discourse-related ones (van Berkum, Brown et al. 2003, van Berkum, Zwitserlood et al. 2003) were also found to produce late effects at ~400 ms. An interpretation of early vs. late effects is possible along the lines of dual-stage models, such as Friederici's (2002) influential model. The early process would accordingly be an automatic comprehension or matching process, whereas the late process could either imply an in-depth extension of the process or a revision and re-analysis. Friederici proposes this for the syntactic domain but, in light of the early and late components indexing essentially all kinds of psycholinguistic information, the same general concept can be applied to other psycholinguistic levels of processing, too.
A different and complementary approach relates the latency of cognitive neurophysiological responses to stimulus properties, especially the variability of physical, form-related properties. Larger variance of these variables, including word length and acoustic properties of spoken materials, increases the variance of brain responses especially at early latencies. Therefore, such variance may mask early brain responses reflecting cognitive and psycholinguistic processing. Late responses survive in attention-demanding tasks because they are large, long-lasting and widespread. Evidence for this view has recently been reported. Minimal variability of physical and psycholinguistic stimulus properties is critical for obtaining early effects in lexical, semantic, and syntactic processing (Assadollahi & Pulvermüller 2001, Pulvermüller & Shtyrov 2006, Penolazzi et al. 2007).
Important time aspects are immanent to the orchestration of cortical sources. It has been suggested that the brain uses a precise temporal code to transmit information at the neural level. Serial models had posited that phonological, lexical, syntactic, and semantic processes follow each other in a given order (which varied between models). However, the early near-simultaneous neurophysiological responses mentioned above suggest that these processes run largely in parallel with little if any offset between them (Hauk, Davis et al. 2006). Furthermore, small timing differences have been reported between subtypes of semantic processes. Leg-related action words tend to spark the leg region slightly later than words with arm or face reference activate their corresponding inferior motor areas (Pulvermüller, Shtyrov & Ilmoniemi 2005). Here, timing differences seem to indicate semantic differences. A recent study comparing the timing of left superior-temporal, latero-central, and inferior-frontal area activations found delays between regions that depended on stimulus type. Phoneme sequences and word stimuli led to a delayed activation of inferior-frontal cortex, 10-25 ms after superior-temporal cortex, whereas noise stimuli failed to elicit a comparable activation delay between regions (Pulvermüller & Shtyrov 2009). Therefore, the delay between activations of the regions of interest coded the phonological status of acoustic stimuli.
Interestingly, reliable time delays only emerged between superior-temporal and inferior-frontal cortex. The latero-central region including motor and premotor areas activated together with superior-temporal cortex. This suggests that the postero-dorsal stream activates more quickly than the antero-ventral stream, but that the latter conveys important information about the phonological status of sounds.
In summary, early near-simultaneous brain responses (latency <250 ms) index different facets of the comprehension process, including word form analysis, semantic access along with syntactic and semantic context integration, suggesting near-simultaneity (or short-delay seriality) in psycholinguistic information access. The short delays can potentially be accounted for in terms of cortical conduction times (Pulvermüller et al. 2009).
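How conduction times could produce delays of this order can be checked with simple arithmetic. The sketch below is only illustrative: the fibre length of 100 mm and the conduction velocities of 5 and 10 m/s are assumed round numbers chosen for the example and are not values reported in the text.

```python
def conduction_delay_ms(distance_mm: float, velocity_m_per_s: float) -> float:
    """Travel time of an action potential along an axon, in milliseconds.

    1 m/s equals 1 mm/ms, so millimetres divided by metres per second
    already gives milliseconds.
    """
    return distance_mm / velocity_m_per_s

# Assumed, illustrative values (not taken from the text).
assumed_fibre_length_mm = 100.0
for assumed_velocity in (5.0, 10.0):
    delay = conduction_delay_ms(assumed_fibre_length_mm, assumed_velocity)
    print(f"{assumed_velocity:4.1f} m/s -> {delay:.0f} ms")
```

Under these assumptions the delay falls between 10 and 20 ms, which is the same order of magnitude as the 10-25 ms lag reported between superior-temporal and inferior-frontal activation.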

How? Brain-Based Models of Circuits, Their Activations and Delays
A general conclusion suggested by much recent research is that action and perception are not stand-alone processes but are functionally interwoven at the mechanistic level of neuronal circuits. This insight has gained momentum in basic and cognitive neuroscience, including research into perception and action in animals and humans (Rizzolatti & Craighero 2004, Rizzolatti & Sinigaglia 2010), visuomotor integration (Bruce & Goldberg 1985), and language processing (Pulvermüller 2005, Pulvermüller & Fadiga 2010). Consistent with such interdependence are both behavioural and neurofunctional observations. Action does not solely relate to motor systems activation, but likewise draws on perceptual mechanisms, as in the co-activation of superior-temporal cortex during silent speaking (Paus et al. 1996). The Lee effect of delayed auditory feedback on speech output (Lee 1950) demonstrates a profound automatic influence of acoustic-phonetic processes on ongoing speech output, a conclusion strengthened by the neurofunctional studies demonstrating motor systems activation by perception of movements and speech (Fadiga et al. 1995, Fadiga et al. 2002).
Speech perception does not just 'take place' in the superior-temporal cortex alone. The motor system is co-activated and assists, modulates, and sharpens the speech perception process. One may describe this interaction in classic terms. Earlier proposals had suggested that perception is, in part, an active process in which action hypotheses play a role (see also Bever & Poeppel, this volume). Accordingly, 'bottom-up' perceptual analysis triggers a hypothesis about the input, followed by an action-related 'top-down' synthesis, the product of which is finally compared with and eventually matched to further input information, thus confirming or rejecting the perceptual hypothesis (Halle & Stevens 1959, 1962). When recognizing a naturally spoken syllable such as [pIk], already the vowel includes coarticulatory information about the subsequent consonant, which may give rise to the (still premature) hypothesis that [pIk] is coming up (Warren & Marslen-Wilson 1987, 1988). This hypothesis can be compared with further input, especially the plosion of the final [k], until a match is reached. In the syntactic domain, a noun may generate the hypothesis that a sentence including a verb will emerge, and the N-V hypothesis can be compared, and eventually matched, with the input, a process compatible with the state sequence of a left-corner parser (Aho et al. 1987). This hypothesis generation and testing, as suggested by classic cognitive theories of phonological analysis-by-synthesis and left-corner parsing, may capture aspects of the functional significance of the co-activation of frontal and posterior areas in cognitive processing in general and in language processing in particular.
While it is possible that similar descriptions in terms of perception-by-synthesis may capture aspects of the real mechanisms, they are not very precise and certainly not spelt out in terms of neurons. A vague formulation of action-perception interaction still leaves open questions including the following: How many concurrent hypotheses can be entertained at a time, and how many simultaneous top-down predictions are allowed? Can analysis, hypothesizing, synthesis, and matching run in parallel, with constant functional interaction between them, or are they serial modular processes? Are controlled conscious intentional decisions required in the literal sense or can the 'decision' process also be construed as automatic? There are many degrees of freedom here. One kind of description, in terms of 'hypotheses' and 'synthesis', suggests attention-demanding modular processes that are being entertained sequentially, one by one. However, much psycholinguistic research supports parallel processing of competing hypotheses. Gating experiments, for example, indicate that several competing hypotheses about possibly upcoming words are built, maintained and tested in parallel until one of them 'wins', a position immanent to models in the tradition of the cohort theory (Marslen-Wilson 1987, Gaskell & Marslen-Wilson 2002). The motor theory of speech perception, as one variant of an analysis-by-synthesis approach, made the additional assumption that speech perception as a modular process is separate from other acoustic perceptual processes, a view difficult to reconcile with current neuroimaging data showing that the same cortical foci are active when producing articulator movements and speech sounds (Pulvermüller et al. 2006).
An obvious deficit of analysis-by-synthesis approaches is the lack of a time scale. Perceptual analysis, hypothesis generation, and action synthesis and matching could each last for seconds or take place near-simultaneously. This general approach seems to be in need of additional detail to provide mechanistic explanations.
In my view, progress in clarifying the mechanisms underlying action-perception interaction requires brain theory. Correlational information between the speech signal and articulatory gestures together with neuroanatomical and neurophysiological knowledge and principles provide a firm basis for postulating action-perception circuits that (i) span inferior-frontal and superior-temporal areas, (ii) become active near-simultaneously with minimal inter-area delays determined by axonal conduction times, (iii) play a role in speech production and in speech perception too, and (iv) provide continuous facilitatory interaction between the inferior-frontal and superior-temporal parts of each circuit while at the same time (v) competing with other action-perception circuits.
The mechanisms of and processes in such circuits can be explored in computational work using networks with neuroanatomically realistic structure and plausible neurophysiological function (see Fig. 3; Wennekers et al. 2006, Garagnani et al. 2008).
Figure 3: Areas of the left-perisylvian language cortex, connections between them and implementation in the model of the language cortex (MLC; Garagnani et al. 2007, Garagnani et al. 2008). Explicit neuromechanistic models grounded in neuroanatomy and neurophysiology can be used to simulate language processes in the brain and, eventually, to explain them. The areas shown on the brain diagram at the top and implemented in the MLC at the bottom are: Primary auditory cortex (A1), auditory belt (AB), auditory parabelt (PB), inferior prefrontal (PF), premotor (PM) and primary motor (M1) cortex. AB and PB together are sometimes called the 'auditory language area' or 'Wernicke's region' and PF and PM the 'motor language area' or 'Broca's region'.
These simulations show that during the recognition of a word such as [pIk] the following processes take place:
1. The auditory signal leads to stimulation of neuronal populations in superior-temporal cortex where activity spreads from primary auditory cortex, A1, to the surrounding auditory belt, AB, and parabelt, PB. This activation is mainly carried by the best-stimulated circuit(s), the target word, but partly also by its cohort members' and neighbors' circuits ([pIp], [kIk]). At the cognitive level, one may say that the system entertains several perceptual hypotheses.
2. With a slight delay (realistically, 10-25 ms), activation also spreads to inferior-frontal cortex, to prefrontal, PF, premotor, PM, and (to a lesser degree) primary motor, M1, areas. This activation spreading is mainly carried by the phonological and lexical circuits best stimulated, which impose a fixed spatio-temporal pattern of activation. The advantage of such action links may lie in the separation of circuits whose perceptual parts overlap to a large degree. The syllable-initial phonemes [p] and [t] sound similar, but are based on motor programs for different articulators controlled by motor neurons at different locations in the motor system, in PM and M1, which are ~2 cm apart (Pulvermüller et al. 2006). Although their perceptual circuits (in A1, AB, and PB) overlap substantially, their action circuits (in PM, M1) do not overlap to a similar degree. If circuits overlap, they cannot easily inhibit each other, a requirement for a decision and functional distinction between them. Therefore, the separation of circuit parts in the motor system enables between-circuit inhibition, and thus facilitates a discrimination and decision process between partly activated overlapping circuits.
3. Activation from the most active motor circuit is fed back to superior-temporal circuit parts. The superior-temporal part of the circuit organizing the critical word [pIk] receives strong feedback activation from the action system, whereas those of competitor words receive comparably little (due to the competition process in the action system). The word-related cell assembly fully ignites; the correct word is being recognized. At the cognitive level, a perceptual decision has emerged.
Critically, as more and more auditory activation and information comes in, the activity hierarchy among the word-related cell assemblies shifts in favor of one; competitors are suppressed by an inhibition mechanism. Processes 1-3 involve a range of perceptual, phonological and lexical circuits, which accumulate excitation and compete simultaneously until, ultimately, activation entropy in the system decreases and one circuit ignites. Many factors, including noise, circuit overlap (cf. lexical neighborhood structure) and connection strength (cf. word frequency), can influence the temporal dynamics of the processes. Although the physiological word recognition process, including activation spreading, competition and ignition, may normally be very rapid (200-250 ms, see when-section, Hauk, Davis et al. 2006, Pulvermüller & Shtyrov 2006, and Pulvermüller, Shtyrov & Hauk 2009), a range of factors may lead to a delay in word recognition. Under high entropy conditions (e.g., high noise, very strong neighbors) these processes may be delayed both in superior-temporal and inferior-frontal areas. Note, however, that the inhibition mechanism which, as argued above, is most efficient in inferior-frontal cortex, implies entropy reduction with time. Processes of generating and deciding between hypotheses analogous to the ones spelt out here in detail for speech perception are envisaged to underpin meaning comprehension and speech production as well.
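The interplay of accumulation, competition, ignition and entropy reduction described in processes 1-3 can be illustrated with a deliberately minimal rate model. This is a toy sketch, not the MLC: each word circuit is collapsed into a single unit, and the update rule, all parameter values, the input strengths and the names `recognise` and `entropy_bits` are assumptions made for illustration only.

```python
import math

def entropy_bits(activations):
    """Shannon entropy (bits) of activations normalised to sum to 1."""
    total = sum(activations)
    if total <= 0:
        return 0.0
    probs = [a / total for a in activations if a > 0]
    return -sum(p * math.log2(p) for p in probs)

def recognise(inputs, steps=200, dt=0.1, self_excitation=1.5,
              leak=1.0, inhibition=1.0, ignition_threshold=0.9):
    """Let word circuits accumulate input and inhibit each other.

    inputs maps a word to its (assumed) bottom-up evidence strength.
    Returns ((winner, step_of_ignition) or None, entropy per step).
    """
    names = list(inputs)
    act = {name: 0.0 for name in names}
    entropy_trace = []
    winner = None
    for step in range(steps):
        updated = {}
        for name in names:
            rivals = sum(act[other] for other in names if other != name)
            drive = (inputs[name] + self_excitation * act[name]
                     - leak * act[name] - inhibition * rivals)
            # Activation is kept between 0 (silent) and 1 (fully ignited).
            updated[name] = min(1.0, max(0.0, act[name] + dt * drive))
        act = updated
        entropy_trace.append(entropy_bits(list(act.values())))
        if winner is None:
            for name in names:
                if act[name] >= ignition_threshold:
                    winner = (name, step)
                    break
    return winner, entropy_trace

# Assumed evidence strengths: a clear token of [pIk] versus an ambiguous one.
clear = {"pIk": 0.5, "pIp": 0.3, "kIk": 0.2}
ambiguous = {"pIk": 0.42, "pIp": 0.38, "kIk": 0.2}

winner_clear, trace_clear = recognise(clear)
winner_ambiguous, trace_ambiguous = recognise(ambiguous)
print("clear input:    ", winner_clear)
print("ambiguous input:", winner_ambiguous)
print("entropy, first and last step:", trace_clear[0], trace_clear[-1])
```

In this sketch the target circuit ignites in both cases, entropy ends lower than it starts, and the stronger competitor in the ambiguous case postpones ignition, which mirrors the delay expected under high entropy conditions.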
The neurobiological basis of cognitive processes such as perceptual hypothesis generation and decision can be traced with explicit neurocomputational studies (Wennekers et al. 2006, Garagnani et al. 2008, Garagnani, Wennekers & Pulvermüller 2009). As mentioned, these models build strong functional links between frontal action circuits and posterior perception circuits. Therefore, a lesion in the action part of the distributed circuits does not only impair actions, it may also impact on perception and understanding, and, vice versa, lesions in the perceptual network part can reduce motor output functions in addition to causing perceptual deficits (Pulvermüller & Fadiga 2010). This is consistent, for example, with well-known reports about speech perception deficits and abnormalities in patients with different types of aphasia (Basso et al. 1977, Blumstein et al. 1994). Under taxing conditions, aphasic patients with Broca aphasia, which typically relates to frontal lesion, have difficulty understanding single words (Moineau et al. 2005) and even under optimal perceptual conditions, comprehension is delayed and activation of phonological cohort members reduced (Utman et al. 2001, Yee et al. 2008). Inferior-frontal lesion has a similar effect on gesture discrimination (Pazzaglia et al. 2008). Some colleagues chose to ignore these and similar reports of inferior-frontal lesions and their related speech perception deficits, or play them down as 'not dramatic' (Hickok 2009, Bever & Poeppel, this volume; see Pulvermüller & Fadiga 2010 for a review of this discussion). In this context, the reader should also be reminded of the TMS evidence showing that precentral stimulation alters speech perception (see where-section, sounds, and, for example, D'Ausilio et al. 2009), and of the fact that neural degeneration in the motor cortex and inferior part of the frontal lobes leads to language and conceptual deficits, especially for words and concepts related to actions (see where-section 2.1 and, for example, Bak et al. 2001).
The instant flow of activation from superior-temporal 'perceptual' neurons to inferior-frontal 'action' neurons bound together in a distributed action-perception circuit is an automatic process under standard conditions (case A: unambiguous input, good signal-to-noise ratio), but is modified under specific circumstances. In case competitor circuits are stimulated by a perceptually ambiguous stimulus ([#Ik] with # in between [p] and [t]), two lexical circuits compete and, for reaching a decision, the inferior-frontal activations and competitive mechanisms are of greatest importance. Competitor circuit activations, the associated increased entropy of the activation landscape and subsequent regulation take time so that the ignition of the winning (target) circuit will be delayed (case B). In the worst case (C), the wrong action-perception circuit may ignite first; further perceptual evidence arriving later builds up activation in the 'correct' circuit so that its ignition is much delayed and a revision of the perception process results. In all three cases, straight perception (A), high entropy perception (B), and corrected perception (C), inferior-frontal 'action circuits' may be critical in the perception process, although the inferior-frontal activation may differ as a function of the effort and attention necessary. Consistent with this view, a recent experiment on attention modulation in speech perception showed a modulatory effect of attention on inferior-frontal activation, which was stronger than the one seen in superior-temporal cortex (Shtyrov et al., in press). In order to allow for similarly specific predictions, an action-perception approach, or likewise an analysis-by-synthesis approach, needs to be stated in terms of a mechanistic brain-based circuit model.
Cognitive scientists entertain a major debate about the existence of symbol representations that behave in a discrete manner, becoming active in an all-or-none fashion rather than gradually. Linguistic and symbolic theories build on such discrete representations, whereas many neural processing approaches postulate distributed representations and processes that are gradual. In the latter case, the gradual distance of an activation pattern to the closest perceptual target vector determines the percept. Using networks fashioned according to the neuroanatomical structure and connection pattern of the left-perisylvian language cortex along with neurophysiologically realistic synaptic learning, we found neuronal correlates of discrete symbols in distributed action-perception circuits. These circuits indeed behave in a discrete fashion, showing an explosion-like ignition process to above-threshold stimulation. The circuits overlap to a degree and a specific type of realistic learning rule has a network effect of reducing this overlap (Garagnani, Wennekers & Pulvermüller 2009). The finding that distributed and discrete circuits develop in realistic, neuroscience-grounded networks may entail a better understanding of the mechanisms of symbol processing.

Why: From Brain Mechanisms to Explanation
A range of why questions target the causal origin of brain mechanisms of language as we can infer them: Why are phonetic distinctive features mapped on specific loci of cortex? Why in the intact language-competent brain do acoustic phonological representations in superior-temporal cortex functionally depend on and interact with articulatory-phonological representations in inferior-frontal cortex? Why do phonological circuits link together into lexical ones, and why do these perisylvian cell assemblies underpinning spoken words link up with semantic networks elsewhere in the brain? Why do these semantic networks encompass so many other sets of areas, such that color words connect to parahippocampal and fusiform, arm action words to precentral, and odor words to pyriform cortex? Why do certain neuronal networks support the emergence of discrete neuronal circuits for linguistic representations? Why do some networks, dependent on their internal structure, function and learning algorithms, either support the buildup of discrete combinatorial representations with a function similar to syntactic rules or suggest that such rule-like representations do not exist? Exciting questions like these can be added almost ad infinitum, but it is difficult to say how far we actually are from an ultimate answer to them. Let me make an attempt to outline components of such an explanatory answer using two examples.
There are known properties of the brain and of brain function that can form a basis of neuroscientific explanation. Our brain's most important structure for cognition, the cortex, is, structurally and functionally, an associative memory (Braitenberg & Schüz 1998). Its neurons are linked by way of their synapses, and links strengthen depending on their use, or correlation of activation (Tsumoto 1992). The cortex is not, however, a tabula rasa learning structure. It is equipped with a wealth of information. Some of this information is manifest structurally, in the anatomy of the cortex, in the structure and microstructure of areas, connections between areas and even the microstructure of neurons and their biochemical properties.
Some of the explanations and answers to 'Why X?' questions may therefore draw on such established knowledge. As one example, explanations can be of the form 'Because X is a necessary consequence of functional correlation of neuronal processes a and b and the structural connections between the neuronal structures (neurons, neuronal assemblies) A and B'. Due to such functional correlation of structurally connected units, acoustic and articulatory phonological representations connect with each other to form action-perception circuits for integration of phonological information in speech perception and production. Statistical learning of co-occurrence patterns of phonemes in words and morphemes accounts for the formation of lexical representations (Saffran et al. 1996, Pelucchi et al. 2009), which are realized as neuronal ensembles distributed over perisylvian cortex. Correlation between word form and activity in sensory and motor systems of the brain also explains the binding between a sign and its referential meaning, given that the relevant connections necessary for such learning are available in the first place. Information about the referents is available in different brain systems (motor, visual, auditory, olfactory, etc.) for different kinds of words; therefore the differential distribution of category-specific semantic circuits results. As soon as a stock of signs with referential meaning is available, indirect, contextual semantic learning is possible due to the correlation of new word forms with familiar ones for which referential semantic information is already available. Co-activation of the new words' neuronal circuits with semantic circuits bound to familiar words, which appear in the context of the new ones, leads to the binding of semantic neurons to the new words' circuits, thus offering a neuronal basis for in-context semantic learning (Pulvermüller 2002).
An important component of this account is correlation learning and consequent binding (i) between action and perception circuits in phonological learning, (ii) between phonological circuits in lexical learning, (iii) between word form and referential action and perception circuits in semantic learning, and (iv) between new word form circuits and previously established semantic circuits that are co-activated in contextual semantic learning. Note again that such learning is only possible if the necessary neuronal substrates and connections storing the critical correlations are available in the first place. These substrates and connections are determined to a great extent by the genetic code (Vargha-Khadem 2005, Fisher & Scharff 2009), although some influence of neuronal activity on the formation of connections cannot be denied (Rauschecker & Singer 1979, Hubel 1988, Engert & Bonhoeffer 1999, Nagerl et al. 2007).
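The two ingredients of this account, correlation of activity and pre-existing connections, can be sketched with a plain Hebbian update that is gated by a connectivity mask. This is a minimal illustration, not a model from the cited work: the unit labels, the learning rate, the number of episodes and the function name `hebbian_learning` are hypothetical.

```python
def hebbian_learning(pre_patterns, post_patterns, connected, rate=0.1):
    """Strengthen links between co-active units, but only where a
    structural connection exists (connected[i][j] is True)."""
    n_pre, n_post = len(pre_patterns[0]), len(post_patterns[0])
    weights = [[0.0] * n_post for _ in range(n_pre)]
    for pre, post in zip(pre_patterns, post_patterns):
        for i in range(n_pre):
            for j in range(n_post):
                if connected[i][j]:
                    # Hebb-type rule: change is proportional to co-activation.
                    weights[i][j] += rate * pre[i] * post[j]
    return weights

# Hypothetical units: articulatory [p], [t] (pre) and acoustic [p], [t] (post).
# Articulating a phoneme is always accompanied by hearing it.
episodes = 10
articulatory = [[1, 0]] * episodes + [[0, 1]] * episodes
acoustic = [[1, 0]] * episodes + [[0, 1]] * episodes

# Assumed anatomy: the link from articulatory [t] to acoustic [t] is missing.
connected = [[True, True],
             [True, False]]

weights = hebbian_learning(articulatory, acoustic, connected)
for row in weights:
    print(row)
```

Only the articulatory-acoustic [p] pair ends up bound. The [t] pair is just as strongly correlated but has no connection to store the correlation, and the connected but uncorrelated pairs stay at zero.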
For some time, modern cognitive scientists were slightly skeptical regarding a major role of correlation-driven learning in cognition. This is understandable, because, historically, modern cognitive science set itself apart from behaviorism, which had once emphasized the role of some forms of associative learning in a tabula rasa system called a 'black box'. Evidently, such an approach is unable to explain area-specific activation patterns in the grey matter or the functional relevance of specific white matter tracts and thus cannot succeed in the neurobiological explanation of cognitive functions. However, a dismissal of the relevance of correlation learning at the neurobiological level would seem an undue over-reaction to behaviorism. The empirical support for learning based on correlation of neuronal activity seems too strong and its implications for brain language theory too clear (see Pulvermüller 1999). Importantly, and in sharp contrast with behaviorist approaches, the effect of correlation of neuronal activation needs to be considered in the context of existing neuroanatomical connectivity. Neuroanatomical connections of linguistic importance have been documented in the healthy undeprived human brain. In particular, there are multiple connections between superior- and middle-temporal gyrus and inferior-frontal prefrontal, premotor, and opercular cortex via the capsula extrema (Saur et al. 2008, Petrides & Pandya 2009) and further inferior-frontal to superior-temporal connections also including inferior-prefrontal and superior-temporal areas via the fasciculus arcuatus (Glasser & Rilling 2008, Petrides & Pandya 2009). Some of these latter connections seem to stop over in the inferior-parietal cortex and in general the parietal lobe seems strongly linked into the fronto-temporal network (Catani et al. 2005).
Some fronto-temporal connections were already evident in non-human primates (Pandya & Yeterian 1985, Petrides & Pandya 2009), although direct comparison showed that especially the arcuate fascicle is most strongly developed in humans (Rilling et al. 2008). This congruency and gradual difference suggests a pathway by which the genetic code did influence the emergence of human language in evolution. Strong direct fronto-temporal connections enabled the build-up of a large number of action-perception circuits for phonological and lexical processing in humans. As the rich fronto-temporal connections could for the first time support a great variety of fronto-temporal action-perception circuits, this proposal suggests a neurobiological explanation for the large vocabularies of human languages. Human languages include large vocabularies of 10,000s of spoken words (Pinker 1994), whereas our closest relatives, great apes, use only ~20-40 different signs (Pika et al. 2003, Tomasello & Call 2007), and even under massive training show a limit of ~200-300 symbols (Savage-Rumbaugh et al. 1985). As the documented fronto-temporal links seem to be mainly left-lateralized (Catani et al. 2005), the neuroanatomical-neurofunctional approach also provides a natural explanation for why language is lateralized to the left hemisphere in most human subjects. Because the left hemisphere houses most of the fronto-temporal connections in perisylvian language cortex (Catani et al. 2005), action-perception correlation can best be stored there, so that the phonological and lexical action-perception circuits are lateralized to the left.
Whether aspects of syntax can also be learned in an associative manner, given some neuroanatomically manifest genetic information is available, remains a topic of debate. Investigations into statistical language learning demonstrate that much syntactic information is immanent to the correlations and conditional probabilities between words in sentences; this information can be used, for example, to automatically classify words into lexical classes (Briscoe & Carroll 1997, Lin 1998). Neural network research demonstrates that such combinatorial information can also be extracted with neuron-like devices and neurofunctionally-inspired algorithms (Elman 1990, Honkela et al. 1995, Christiansen & Chater 1999, Hanson & Negishi 2002). A dispute can still occur about the degree to which the neural processes and structures can be likened to constructs postulated by linguists and cognitive scientists. In one view, neural networks processing syntactic information are probability mapping devices entirely dissimilar to the rule systems proposed in linguistics (Elman et al. 1996, McClelland & Patterson 2003).
We recently explored sequence probability mapping in neuronal networks incorporating important features of cortical connectivity frequently omitted by neural approaches (Knoblauch & Pulvermüller 2005, Pulvermüller & Knoblauch 2009). In these networks, we found formation of aggregates of neurons that, after learning, responded in a discrete 'all-or-nothing' fashion to similar contexts. These neuronal assemblies were primed by a range of past contexts and, in turn, primed a range of possible successor contexts. As an example, a range of different nouns primed the neuronal aggregate, which, in turn, activated a range of verbs. The neuronal aggregates were connected to word representations in a discrete fashion, i.e. either strongly or very weakly. Such neuronal grouping is similar to the discrete grouping of words into lexical categories (noun or verb) and the linkage of such discrete combinatorial categories bears similarity to syntactic rules linking together lexical and larger syntactic categories in a sequential fashion. A rule such as 'S → N V' or 'S → NP VP' (along with other syntactic and lexicon rules) would equally connect a wide range of utterances it covers. Interestingly, the combinatorial neuronal assemblies connect constituent pairs not previously learned together, thus documenting a degree of generalization along with functional discreteness.
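The generalization described above can be illustrated with a deliberately simplified, symbolic sketch. It is not the network of Knoblauch & Pulvermüller (2005): neurons, sparse coding and inhibition are all omitted, and the training pairs and the names `build_assemblies` and `licensed` are hypothetical. The sketch assumes one sequence detector per learned word pair and assumes that detectors sharing a word are co-activated and therefore bound together, which stands in for Hebb-type learning.

```python
from itertools import combinations

# Hypothetical training sequences (first word, second word); illustrative only.
training_pairs = [
    ("dog", "runs"), ("dog", "sleeps"), ("cat", "runs"),
    ("cat", "sings"), ("bird", "sings"),
    ("very", "big"), ("quite", "big"),
]

def build_assemblies(pairs):
    """Bind sequence detectors that share a word into one discrete assembly."""
    detectors = sorted(set(pairs))
    parent = {d: d for d in detectors}  # union-find forest over detectors

    def find(d):
        while parent[d] != d:
            parent[d] = parent[parent[d]]  # path compression
            d = parent[d]
        return d

    # Detectors sharing their first or their second word are co-activated,
    # so they are merged into the same assembly.
    for a, b in combinations(detectors, 2):
        if a[0] == b[0] or a[1] == b[1]:
            parent[find(a)] = find(b)

    # Discrete word-to-assembly links, kept separately per sequence position.
    first, second = {}, {}
    for d in detectors:
        root = find(d)
        first[d[0]] = root
        second[d[1]] = root
    return first, second

def licensed(pair, first, second):
    """True if both words connect to the same assembly in the given order."""
    w1, w2 = pair
    return w1 in first and w2 in second and first[w1] == second[w2]

first, second = build_assemblies(training_pairs)
print(licensed(("bird", "runs"), first, second))  # pair never trained together
print(licensed(("dog", "big"), first, second))    # words from different assemblies
```

Under these assumptions, the untrained sequence 'bird runs' is accepted because both words attach to the same assembly, whereas 'dog big' and the reversed order 'runs dog' are not. The all-or-nothing membership mirrors, in a very coarse way, the discrete connections reported for the simulated networks.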
Features of the grammar network setting it apart from other models used in cognitive science to approach aspects of the serial-order problem include the following: massive auto-associative connections within an area, neurons sensitive to sequential activation of input units, sparse activation and input coding, Hebb-type unsupervised learning, and activity control mechanisms using inhibition. These features, all of which also characterize the cortex, may contribute to the formation of combinatorial neuronal assemblies and may be important for understanding why brains build rules, assuming, as some evidence suggests, that they indeed do so. The combinatorial neuronal assemblies may play a crucial role in the neuronal grammar machinery, although additional mechanisms are necessary for such a device to process a range of sentence structures (Pulvermüller 2003).
In the why-section of this paper, some still incomplete explanation attempts were explored. These covered the laterality of language functions and its relationship to cortical connectivity; the structural and functional basis of action-perception circuits in phonological, lexical and semantic processing; and the formation of functionally discrete circuits, especially in the combinatorial domain, together with the still tentative relationship of such discrete circuit emergence to network structure.

Linguistic Summary and Synopsis
From the standpoint of linguistic theory, what is gained from recent neuroscience research? In the semantic domain, we have learned that there is a very real sense in which semantic categories exist. Meaning is not merely the mental representation of objects; action aspects are relevant to it as well. At the semantic level, language is 'woven into action' and this insight from the analytical theory of language is backed by brain research. Activation of motor and sensory systems demonstrates semantic categories along brain dimensions. Additional areas, in the vicinity of sensorimotor domains, may play a role in abstract semantic processing and in general meaning access.
In a similar vein, phonological distinctions can be objectified based on brain correlates. Phonetic distinctive features have their correlates in local cortical activation in the auditory and motor systems. This addresses questions about the nature of phonological representation: Should phoneme features be construed as articulatory or as acoustic? In brain terms, they are both, as the phonological circuits appear to link motor and auditory circuits with each other.
An intensive debate about the nature of mental computation can be addressed based on the results from the neuroscience of language. Neuronal ensemble theory along with empirical neurophysiological evidence supports the existence of discrete cortical representations and mechanistic underpinnings for rules of grammar. The position once backed by neural network simulations that rules do not exist at the neuronal level may be in need of revision.
The idea that it takes about half a second to understand a word or sentence (counted from the point in time when the last word critical for sentence understanding is first unambiguously present in the input) might imply a substantial delay in the comprehension process and, as discussed above, one may wonder whether such a delay could represent a substantial biological disadvantage. Support for rapid, almost instantaneous understanding comes from recent neurophysiological studies suggesting latencies of <250 ms for the earliest brain correlates of semantic word and sentence understanding and syntactic parsing. These neurophysiological results support rapid and parallel psycholinguistic models and argue against slow serial or cascaded theories assuming sequential steps of hundreds of milliseconds from phonological to syntactic and semantic modules. Relevant time delays seem to range around 10-50 ms only, thus indicating near-simultaneous activation and information access.
Looking back at the review, progress in the where- and when-domains is certainly most impressive. In my view, however, the maturity of the field, its stage of development, will be evaluated in light of plausible approaches to how and why issues. Collecting wisdom about new plants, stars, and brain activation loci can advance a field in a hunter-gatherer sense. In order for it to transform into an explanatory science, explanations need to be offered (Hempel & Oppenheim 1948, von Wright 1971). In the neuroscience of language, these explanations use neuroscience facts and established principles of brain structure and function as explanans. It is in this explanatory domain where, in my view, further progress is most desperately needed. Some little progress has been made, which, however, lacks the flashy aspect of newly discovered neurocognitive hotspots. An important achievement, now and in the future, may therefore be neuromechanistic explanations detailing why specific brain areas are necessary for, or light up and index, specific facets of language processing, how neuronal ensembles and distributed areas become activated with precisely timed millisecond delays, and which precise neuronal wirings can potentially account for neurometabolic activation of specific cortical clusters in semantic understanding.

I would like to thank Rachel Moseley, David Poeppel, Dietmar Zäfferer, and two anonymous referees for their comments on earlier versions of this paper. This work was supported by the Medical Research Council (UK) (U1055.04.003.00001.01) and by the European Community under the 'New and Emerging Science and Technologies' Programme (NEST-2005-PATH-HUM contract 043374, NESTCOM).

Figure 2: Brain activation patterns during passive word reading: Cortical areas activated by all words alike (left side) are contrasted with areas specifically activated by fine-grained semantic word categories (right side), action words related to the face (lick), arms (pick) or legs (kick) and visually-related form words (square) (modified from Pulvermüller, Kherif et al. 2009). Areas found active generally to all kinds of words may indicate the distribution of circuits for processing of general lexical-semantic information, whereas the widely distributed area sets found active for specific semantic types may index the distribution of category-specific semantic circuits.