1 Introduction
A central research question in the history of generative grammar, from Chomsky (1956) until the eighties, was whether natural languages fall within the class of context-free languages or occupy a higher position in Chomsky’s hierarchy of formal grammars. As late as 1982, Pullum and Gazdar (1982) claimed that the prevailing answer at the time, namely that context-free grammars are inadequate for the description of natural languages, did not have a solid basis. However, shortly after that paper, three independent studies1 (Culy, 1985; Huybregts, 1984; Shieber, 1985), each focusing on a different language (Bambara, Dutch and Swiss German, respectively), gave a more solid foundation to the claim that natural languages are not context-free but mildly context-sensitive, as they came to be called (Joshi, 1985).
In formal linguistics, interest in Chomsky’s hierarchy somewhat decreased after the eighties, partly because the main research question (does the grammar of natural language require context sensitivity?) had been answered, and partly because efforts were concentrated on developing a theory that reduces the number of grammars accessible given fixed data, a goal that is at least in part independent of Chomsky’s hierarchy (cf. Lowe, 2021 for a recent discussion that relativizes the importance of Chomsky’s hierarchy as a measure of complexity). The radical simplification of the apparatus proposed in the Minimalist Program, including the development of Bare Phrase Structure (cf. Chomsky 1995 and much following work), contributed to this trend.
However, interest has revived in more recent years, because the grammatical abilities of non-human species have been investigated in order to evaluate the (alleged) uniqueness of human language (cf. ten Cate, 2017 and Berwick et al., 2011 for an assessment). More specifically, the position of non-human grammatical systems in Chomsky’s hierarchy has raised significant controversy. It is debated whether some avian species can recognize the serial nested dependencies characteristic of context-free grammars (cf. Gentner et al., 2006 and Ravignani et al., 2015 on this issue and Everaert & Huybregts, 2013 for a discussion of similarities between birdsong and language). However, to the best of our knowledge, no one has ever attempted to show that non-human species can recognize sequences produced by mildly context-sensitive grammars. So, based on current knowledge, it is safe to say that mild context sensitivity is uniquely human.
In this context, identifying the locus of complex grammatical abilities is particularly important. Out of the four papers that take human language to be context-sensitive, three built their demonstration on clearly syntactic constructions, namely cross-serial dependencies in Dutch, Swiss German and Swedish. This is in line with the tradition that sought to establish that the internal complexity of the clause (as opposed to the internal complexity of the word) is the reason why natural languages are not context-free. These ‘syntactic’ attempts, building on the presence of cross-serial dependencies, followed other syntactic attempts, involving comparatives (cf. Chomsky, 1963) and sentences with respectively (cf. Bar-Hillel & Shamir, 1960).2
The Bambara paper is different, though. As already stated in its title, Culy (1985) posits that the non-context-free component lies in what he calls the vocabulary (in more current terminology, morphology). However, this specific claim is based on an observation that is developed very little (see below). Our goal in this note is to revisit the claim that what makes Bambara more complex than context-free is its morphology as opposed to its syntax. We believe that revisiting this question is interesting in light of the current debate on where to draw a dividing line (if any) between syntax and morphology. Many current approaches reject the lexicalist view that there is a neat division of labor between morphology and syntax, whereby morphology comes first and assembles morphemes into words, while syntax comes later and assembles the output of morphology (words) into phrases and sentences. Non-lexicalist approaches assume that such a line cannot really be drawn because there is only one computational component, which is responsible for both word formation and clause formation (cf. Borer, 2013; Embick & Noyer, 2007; and Baunaz & Lander, 2018 for a presentation of different approaches that are nonetheless unified by their rejection of lexicalism). Under the assumption that there is a unique computational system, it is not expected for morphology to be context-free and for syntax not to be so (or the other way around). This expectation does not arise from lexicalist approaches, which remain neutral on this point.
A preliminary note: we assume for the sake of the argument a minimal notion of ‘word’, understood as a special kind of phrase that, unlike other phrases, respects a condition of lexical integrity (cf. Section 3 below for tests of lexical integrity and Fábregas, 2014 and Cecchetto & Donati, 2015, Chapter 1, for general discussion on the notion of word). Building on this, we develop our argument as follows. We introduce the structure that, according to Culy, makes Bambara not context-free. This structure results from the interaction of the two constructions that are described in Section 2. In Section 3 we ask whether these two constructions pass two lexical integrity tests, and we conclude that they do, converging with the initial characterization by Culy, who however did not systematically apply lexical integrity tests or other tests for ‘wordhood’. In Section 4 we discuss the Bambara findings in light of the debate between lexicalist and non-lexicalist approaches.
2 The Bambara Constructions
The context-sensitive configuration identified by Culy results from the interplay of two constructions. The first one, illustrated in (1), is what he calls the ‘Noun o Noun’ construction, where o3 is flanked by two occurrences of the same noun:
1
| wulu | o | wulu |
| dog | o | dog |
| ‘Whichever dog’. | ||
The second construction is an agentive construction which has the form ‘Noun(N) + Transitive Verb(TV) + la’ and translates as "one who TVs Ns".
2
| wulu + | nyini + | la | = wulunyinila |
| dog | search | for | |
| ‘One who searches for dogs’ (dog searcher). | |||
This agentive construction is recursive, that is, the noun phrase in the agentive construction can result from a previous application of the same construction, as in (3).
3
| wulunyinila + | nyini + | la | = wulunyinilanyinila |
| dog searcher | search | for | |
| ‘One who searches for dog searchers’ | |||
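The recursion in (2)–(3) can be made concrete with a minimal sketch in Python. It treats the construction as plain orthographic concatenation, which is an assumption made only for illustration: tones and any morphophonological adjustments are ignored, and the function names `agentive` and `iterate_agentive` are hypothetical.

```python
def agentive(noun: str, verb: str) -> str:
    """Form 'Noun + Transitive Verb + la' by simple concatenation.

    Simplifying assumption: orthographic strings only, tones ignored.
    """
    return noun + verb + "la"


def iterate_agentive(noun: str, verb: str, depth: int) -> str:
    """Apply the agentive construction `depth` times to its own output."""
    result = noun
    for _ in range(depth):
        result = agentive(result, verb)
    return result


print(iterate_agentive("wulu", "nyini", 1))  # wulunyinila, as in (2)
print(iterate_agentive("wulu", "nyini", 2))  # wulunyinilanyinila, as in (3)
```

Since the output of `agentive` is again a possible first argument, nothing in the sketch bounds the length of the resulting noun.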
Importantly, the noun phrase having the ‘Noun(N) + Transitive Verb (TV) + la’ form can be used in the ‘Noun o Noun’ construction.
4
| wulunyinila | o | wulunyinila |
| dog searcher | o | dog searcher |
| ‘Whichever dog searcher’. | ||
Culy shows that the combination of the ‘Noun(N) + Transitive Verb (TV) + la’ construction and the ‘N o N’ construction in (4) causes ‘the vocabulary of Bambara’ to be context-sensitive. We refer to his paper for a formal demonstration, which we take for granted. Our goal in this paper is not to reconsider Culy’s demonstration, which, to the best of our knowledge, has never been questioned. Our goal is to double-check the claim that the structure that results from the interaction of the two constructions indeed belongs to what Culy calls vocabulary and can be called morphology in more current terms. Since his description of the relevant constructions is very short, we start by describing them in more detail before turning to the main research question. The grammaticality judgments reported in this paper are those of one of the authors, who is a native speaker of Bambara. When necessary, he consulted other Bambara speakers. As the basic facts involve sharp grammaticality judgments, we did not deem an experimental investigation necessary.
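As an informal illustration of the configuration in (4), and not a reconstruction of Culy’s formal demonstration, the following Python sketch builds a ‘Noun o Noun’ form and checks the identity requirement between its two halves. The function names are hypothetical, and treating the forms as orthographic strings separated by spaces is a simplifying assumption that ignores tone.

```python
def agentive(noun: str, verb: str) -> str:
    """'Noun + Transitive Verb + la', as in (2); simplified to concatenation."""
    return noun + verb + "la"


def noun_o_noun(noun: str) -> str:
    """'Noun o Noun', as in (1) and (4): the same noun on both sides of 'o'."""
    return f"{noun} o {noun}"


def is_noun_o_noun(form: str) -> bool:
    """True only if the material before and after ' o ' is identical and non-empty."""
    left, sep, right = form.partition(" o ")
    return sep != "" and left != "" and left == right


searcher = agentive("wulu", "nyini")              # wulunyinila, as in (2)
searcher_searcher = agentive(searcher, "nyini")   # wulunyinilanyinila, as in (3)

print(noun_o_noun(searcher))                                # wulunyinila o wulunyinila
print(is_noun_o_noun(noun_o_noun(searcher_searcher)))       # True
print(is_noun_o_noun(f"{searcher} o {searcher_searcher}"))  # False: the halves differ
```

Because the agentive construction can apply to its own output, the strings on either side of o have no upper bound on their length, and the check has to compare them in full.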
2.1 Additional Information on the ‘N o N’ Construction
The N label in the ‘N o N’ construction is slightly deceptive because the N does not need to be a noun but can be an NP.
5
| Wulu | ba | o | wulu | ba |
| Dog | big | o | dog | big |
| ‘Whichever big dog’. | ||||
In fact, the N does not need to be overtly present, as an adjective suffices to license the construction, provided that the content of the implicit noun it qualifies can be retrieved:
6
| (Jiri) | fitini | o | (Jiri) | fitini |
| tree | small | o | tree | small |
| ‘Whichever small tree’. | ||||
More significantly, the N in the ‘N o N’ construction can result from a nominalization process out of a verb. The base verb can be intransitive (both unergative cf. 7 and unaccusative cf. 8) or passive (cf. 9).
7
| taamala | o | taamala |
| walking-person | o | walking-person |
| ‘Whoever walks’. | ||
8
| Sábagatɔ | o | Sábagatɔ |
| dying-person | o | dying-person |
| ‘Whoever dies’. | ||
9
| bugɔ+len | o | bugɔ+len |
| beaten person | o | beaten person |
| ‘Whoever is beaten’ | ||
2.2 Additional Information on the ‘Noun(N) + Transitive Verb (TV) + la’ Construction
The N in the agentive ‘Noun(N) + Transitive Verb (TV) + la’ construction does not need to be a N, but can be an NP:
10
| Wulu | jugu | nyinila |
| dog | wicked | search+la |
| ‘One who searches for wicked dogs’. | ||
Confirming the productivity of the agentive construction, the NP in (10) can be further modified by an adjective as in (11).
11
| Wulu | jugu | nyinila | teli |
| dog | wicked | searcher | fast |
| ‘A fast wicked dog searcher’. | |||
However, the verb in the ‘Noun(N) + Transitive Verb (TV) + la’ construction is subject to several restrictions. It cannot be inflected for tense, aspect or mood. This is shown in (12)4, whose ungrammaticality is induced by the addition of the perfective marker, and in (13), whose ungrammaticality is induced by the addition of a subjunctive marker implying obligation.
12
| *yé | wulu | nyinila |
| ASP | dog | searcher |
| Intended: ‘One who searched for dogs’. | ||
13
| *ká | wulu | nyinila |
| SUBJ | dog | searcher |
| Intended: ‘One who must search for dog normally’. | ||
Furthermore, in the ‘Noun(N) + Transitive Verb (TV) + la’ construction, the verb cannot be modified by an adverb (cf. 14) and cannot take an indirect object (cf. 15).
14
| *wulu | nyinila | títiti |
| dog | searcher | quickly |
| Intended: ‘One who searches for dog quickly.’ | ||
15
| *gafe | dila | John | ma |
| book | giver | John | to |
| Intended: ‘One who gives book to John’. | |||
2.3 Additional Information About the Combination of the Two Constructions
Culy claims that the combination of the two constructions illustrated in (4) above is a productive phenomenon. In order to double-check this, we built more complex cases in which the N in the input ‘Noun(N) + Transitive Verb (TV) + la’ construction is a modified noun (an NP). The result remains fully grammatical:
Input ‘Noun(N) + Transitive Verb (TV) + la’ construction:
16a
| Malo | fin | nyinila |
| rice | black | search+la |
| ‘Black rice searcher’. | ||
Resulting ‘NP o NP’ construction:
16b
| Malo | fin | nyinilaw | o | malo | fin | nyinilaw |
| rice | black | searcher.PL | o | rice | black | searcher.PL |
| ‘Whichever black rice searcher’. | ||||||
We also embedded the NP resulting from the combination of the two constructions in actual sentences to check its distribution. In (17) the NP is in the subject position of an active transitive sentence and in (18) it is in the subject position of a passive sentence.
17
| Mobilikola o Mobilikola | bɛ | mobili | ko, | a | tɛ | minan | ko |
| whichever carwasher | IMP | car | wash, | 3.SG | NEG | dishes | wash |
| ‘A carwasher washes car, not dishes’ | |||||||
18
| Sàma | di.la | wulunyinila o wulunyinila | fisamanci | ma | José | bolo |
| present | give-ASP | whichever dog searcher | skillful | to | José | by |
| ‘A present has been offered to any skillful dog searcher by José’. | ||||||
In (19) we even have a complex case in which the subject of the sentence is the coordination of two NPs that both result from the combination of the ‘Noun(N) + Transitive Verb (TV) + la’ construction and the ‘N o N’ construction.
19
| Wulunyinila o wulunyinila | ani | sògofeerela o sògofeerela | tɛ | bɛ̀n | yɔ́rɔ | kelen | na |
| whichever dogsearcher | and | whichever meatseller | NEG | keep | place | one | in |
| ‘Whichever dogsearcher and meatseller cannot live in the same place’. | |||||||
All in all, we can conclude that Culy is right when he claims that the configuration that makes Bambara (mildly) context-sensitive is a productive one.
3 Syntax or Morphology?
Having established that the linguistic constructions reported by Culy are attested and productive, we can turn to our main goal, namely establishing the locus of the non-context-free nature of Bambara. This is how Culy motivates the claim that the complexity of Bambara lies in what he calls “vocabulary”:
“There is evidence that the Noun o Noun construction belongs in the vocabulary rather than in the syntax. Bambara is a tone language, and as such it has two types of rules governing the interaction of tones: rules dealing with the interaction of adjacent lexical items, and rules dealing with the interaction of components of a compound, be it nominal, verbal, or whatever. Internally, the Noun o Noun construction does not follow the rules for adjacent lexical items, but rather has its own peculiar rule (Cf. Bird et al., pp. 8-9, 166, for a description of the first sort of rules and for the Noun o Noun construction). Thus, tonal evidence indicates that the Noun o Noun construction does indeed belong in the vocabulary rather than the syntax.”
As he does not develop this point further but simply refers to an introductory textbook description, which in turn introduces a special rule for the ‘Noun o Noun’ construction, the evidence is not compelling. Therefore, we decided to explore this issue in more depth. As we mentioned, the tests that most researchers adopt to tell words apart from phrases involve the Lexical Integrity Hypothesis. Two of these tests are applied to the ‘Noun o Noun’ construction and to the ‘Noun(N) + Transitive Verb (TV) + la’ construction in the next sections.
3.1 Wh Extraction out of the Two Bambara Constructions
A standard way to distinguish words and phrases is based on wh-extraction, possible from phrases but not from complex words, including compounds:
20
What did you buy _? A dishwasher
21
*What did you buy a _washer?
Bambara is a wh-in-situ language, but there is a wh-word that naturally undergoes movement to the left periphery of the clause, namely the interrogative pronoun ‘mun’. When this happens, the resumptive pronoun ‘a’ occurs in the base position. This is illustrated in (23), the interrogative counterpart of (22).
22
| Bavié | yé | mobili | di | Malado | ma | sàma | yé |
| Bavié | ASP | car | give | Malado | to | present | as |
| ‘Bavié gave a car to Malado as present’. | |||||||
23
| Mun | Bavié | y’ | a | di | Malado | ma | sàma | yé ? |
| what | Bavié | ASP | 3.SG | give | Malado | to | present | as |
| ‘What did Bavié give to Malado as present?’ | ||||||||
In this section we apply the wh-extraction diagnostics to the two constructions that are the center of the demonstration that Bambara is not context-free.
Let us first consider the ‘Noun(N) + Transitive Verb (TV) + la’ construction in isolation. Importantly, the N in this construction can be replaced by ‘mun’ if ‘mun’ stays in situ, as shown in (24):
24
| I | yé | mun | nyinila | bɛn? |
| 2.SG | ASP | what | search+la | meet? |
| ‘You met a searcher of what?’ | ||||
The sentence remains grammatical if the entire ‘Noun(N) + Transitive Verb (TV) + la’ constituent is moved to the left periphery, provided that the resumptive pronoun ‘a’ is found in the base position. This case is somewhat reminiscent of English expressions like ‘what the hell’, which undergo wh-movement as a consequence of the fact that they are interrogative (cf. ‘what the hell did you buy?’): here ‘mun’ causes the NP it is embedded into to move with it (the parallelism with ‘what the hell’ is partial though, because the Bambara construction has a compositional meaning).
25
| Mun | nyinila | I | ye | a | bɛn? |
| what | search+la | 2.SG | ASP | 3.SG | meet? |
| ‘You met a searcher of what?’ | |||||
The crucial example is (26). Here ‘mun’ moves alone out of the ‘Noun(N) + Transitive Verb (TV) + la’ construction, thereby violating lexical integrity. Whether or not the resumptive pronoun is present, the sentence is sharply ungrammatical.
26
| *Mun | I | ye | a | nyinila | bɛn? |
| what | 2.SG | ASP | 3.SG | search+la | meet? |
| ‘Intended: You met a searcher of what?’ | |||||
The ungrammaticality of (26) is reminiscent of what happens with wh-extraction out of compounds (cf. 21).
Let us now consider a case of wh-extraction out of the combination of the two constructions. (27) is the grammatical sentence on which we attempt wh-extraction. In (28), ‘mun’ replaces mobili (‘car’) and the result is ungrammatical. In (29) it replaces kola (‘washer’) and the structure is equally deviant. Finally, in (30) ‘mun’ is extracted across the board from the ‘Noun o Noun’ structure. The result is again sharply ungrammatical. The same holds if the resumptive pronoun ‘a’ occurs in the extraction position.
27
| Chaka | na | dayɛlɛlan | di | Mobilikola o Mobilikola | ma |
| Chaka | COND | key | give | whichever car-washer | to |
| ‘Chaka would give key to any car-washer’. | |||||
28
*mun Chaka na dayɛlɛlan di [_kola o _kola] ma ?
Intended: Of what(ever) Chaka would give a key to a washer of it?
29
*mun Chaka na dayɛlɛlan di [mobili_ o mobili_ ] ma ?
Intended: What(ever) Chaka would give a key to of that car?
30
*mun Chaka na dayɛlɛlan di [ _ o _ ]
Intended: ‘Who(ever) Chaka would give a key to?’
These effects are reminiscent of what happens in clear cases of compounds like (21). All in all, the complex construction that has been used to show that Bambara grammar is context-sensitive obeys lexical integrity. It behaves as a complex word rather than as a phrase.
3.2 Anaphoric Relations in the ‘Noun(N) + Transitive Verb (TV) + la’ Construction
Another standard way to distinguish words and phrases is based on the fact that a pronoun cannot refer to a part of a complex word but only to the entire word, as illustrated in (31)–(32).
31
*I have a dishi-washer, but it doesn’t clean themi properly
32
I have a dishwasheri, but iti doesn’t work properly
We applied the anaphora test to the internal parts of the ‘Noun(N) + Transitive Verb(TV) + la’ complex. A pronoun can take as an antecedent the entire ‘Noun(N) + Transitive Verb (TV) + la’ NP but never its parts, not even in cases like (35), where world knowledge strongly invites the relevant interpretation (normally dogs bite, not dog searchers).
33
| N’ | ye | wulu | jugu | nyinila | bɛn | nga | n’ | bɛka | siran | a | nyɛ / | *u nyɛ |
| 1SG | ASP | dog | wicked | searcher | meet | but | 1SG | PROG | fear | 3SG | of | 3PL of |
| ‘I met a wicked dog searcher but I am getting afraid of him / *of them’. | ||||||||||||
34
| N’ | ye | wulunyinila o wulunyinila | bɛ̀n | a | má | foli | laminɛ |
| 1.SG | ASP | whichever dogsearcher | meet | 3.SG | NEG | greeting | answer |
| ‘Any dogsearcher I met, he didn’t answer to my greeting’. | |||||||
| √ a = dogsearcher | |||||||
| * a = dog | |||||||
35
| N’ | ye | wulunyinila o wulunyinila | bɛ̀n | a | ye | n’ | cín |
| 1.SG | ASP | whichever dogsearcher | meet | 3.SG | ASP | 1.SG | bite |
| ‘Any dogsearcher I met, it bit me’. | |||||||
| √ a = dogsearcher | |||||||
| * a = dog | |||||||
The anaphora test shows that the ‘Noun(N) + Transitive Verb(TV) + la’ unit obeys lexical integrity, therefore it behaves as a complex word.5
4 General Considerations and Conclusion
In this paper, we revisited the question of the locus of the structures that have been shown to make natural languages context-sensitive. While for Dutch, Swedish and Swiss German there is no doubt that the relevant structures are created in the syntactic component, the Bambara facts were less clear. After studying the properties of the relevant structures in more detail and after applying two classical lexical integrity tests, we concluded that, assuming for the sake of the argument that a dividing line between syntax and morphology can be drawn, the initial characterization by Culy is indeed correct: Bambara is context-sensitive word-internally.
This result is relevant for the debate around lexicalism. That context sensitivity is found in structures that lexicalists would characterize as morphological is fully consistent with the non-lexicalist view that there is just one computational system for structure-building operations and that this unique computational device creates words as well as phrases and clauses. On the other hand, the Bambara findings do not directly falsify lexicalist approaches, as lexicalists might argue that syntax and morphology, although different modules, share some fundamental properties, including context sensitivity.
Although non-lexicalists seem overall to be in a better position, a caveat is in order: for a non-lexicalist approach to be fully convincing, it should be able to explain in a different way the facts that we reported in Sections 3.1 and 3.2 as violations of lexical integrity, since by definition lexical integrity is not an analytical category available to non-lexicalists. Related to this, a reviewer asks whether the ungrammatical wh-extractions discussed in Section 3.1 can be explained as island effects. This explanation, although very attractive and fully in line with non-lexicalist approaches, is not without problems. Take (25) and (26) above: the extraction site in both sentences is the complement position of the verb ‘meet’. This position does not normally create an island and, furthermore, (25) is grammatical while (26) is not. What differs between the two sentences is that in (26) there is wh sub-extraction out of a category that a lexicalist would define as a (complex) word. Of course it is possible to say that such a category is an island, but without independent evidence this is a stipulation. Furthermore, the island explanation does not immediately extend to the anaphora facts discussed in Section 3.2.
To summarize, we can confidently conclude that the context sensitivity of Bambara, unlike the context sensitivity of other languages, which is clearly syntactic, lies in the domain of what would traditionally be called ‘morphology’. While the fact that context sensitivity holds across ‘syntax’ and ‘morphology’ is very much in line with non-lexicalist approaches, questions remain about how these approaches can explain the Bambara findings, which prima facie call for the use of the lexical integrity hypothesis.
This is an open access article distributed under the terms of the Creative Commons Attribution License (