The Human-Fostered Gorilla Koko Shows Breath Control in Play with Wind Instruments

Breath control is critical to the production of spoken language and commonly postulated as a unique human adaptation specifically for this function. In contrast, non-human primates are often assumed to lack volitional control over their vocalizations, and implicitly, their breath. This paper takes an embodied perspective on the development of breath control in a human-fostered gorilla, examining her sound play with musical wind instruments. The subject Koko was video recorded in her play with plastic recorders, harmonicas and whistles. The results show that Koko exercises volitional control over her breath during instrument play. More generally, the findings suggest that all great apes share the potential to develop breath control, and that the original adaptive value of breath control was its flexible development for the service of behaviors that happened to be useful within particular sociocultural and physical environments.


We are grateful to Joanne Tanner, Raymond Gibbs, Nathaniel Clark, and two anonymous reviewers for their helpful comments on previous versions of this article. We would also like to express our appreciation to the Gorilla Foundation caregiving staff, present and past. And finally, we are forever indebted to the gorillas Koko, Ndume, and Michael.

Introduction
The notion of embodiment in cognitive science refers generally to the theory that, in order to understand the cognition of a given organism (whether human, gorilla, or octopus), we must take into account the nature of the organism's body as it is situated within its everyday physical and social environment (Gibbs 2006). Cognition, according to this idea, does not happen in a computational vacuum, and neither does it evolve or develop in one, but rather it is distributed across the boundaries of brain, body and environment, and emerges through situated activity. Thus, cognition is understood to be ecological, and its science requires that we seek to understand the adaptive interactions that are created between the biological organism of interest and its particular environment. Stemming from this idea, Johnson (2010: 588) suggests that, "[e]mbodied models […] take cognition as developing through engagement with the world, and so motivate ontogenetic analyses of what bodies do under varying environmental constraints".
The purpose of this paper is to apply such an embodied perspective to a case examination of breath control by a western lowland gorilla. The ability to voluntarily and skillfully employ one's breath is critical to human speech, and is often considered to be an innate biological adaptation that humans acquired specifically for functions related to language and possibly song (MacLarnon & Hewitt 1999; Fitch 2010). Yet, when humans engage with their typical world, they immerse themselves in a variety of activities related to the instrumental control of their breath and vocal apparatus that extend well beyond the nonetheless pervasive activities of talking and singing. In many North American families, for example, children grow up blowing out birthday candles, learning to whistle, learning breathing patterns for swimming, karate, or relaxation, holding their breath underwater, spitting, blowing spit balls, blowing up balloons and paper bags, imitating animals, trains, and flatulence, and so on.
In comparison, outside of their vocal behavior, we have little direct empirical knowledge of the breath-related abilities of great apes and other non-human primates. What we do know about their ability to control their breath is largely inferred from statements about their vocal behavior, which is often believed to be subcortical, involuntary and inflexible, with little capacity for learning (Pinker 1994; MacLarnon & Hewitt 1999; Corballis 2003; Premack 2004; Call & Tomasello 2007; Pollick & De Waal 2007; Arbib et al. 2008; Tomasello 2008). Often implicit in this perspective is the idea that the ability of an animal to control its vocalization and breathing is innate, either present or absent in a species, independent of environmental conditions. Humans have cortical, voluntary, and flexible control over these functions; great apes do not. However, as an empirical claim, this position does not account for regular environmental differences in their respectively typical developmental circumstances. This qualification is especially true with respect to breath control. Humans, in their usual physical and social environments, are afforded a variety of motivated opportunities to learn and flexibly use novel breathing-related behaviors. Yet we know little about the ability of a great ape to flexibly control its breath in comparable environments when such behavior is similarly motivated and reinforced.
One of the few empirical studies to directly investigate breath control in non-human apes focused primarily on the morphology of skeletons, rather than the behavior of living animals. MacLarnon & Hewitt (1999) compared the sizes of thoracic vertebral canals of humans and other modern apes, along with fossils of extinct hominids. The thoracic canal is of particular interest to the evolution of breath control, as its size poses a constraint on innervation to the intercostal and abdominal muscles that humans use to control breathing during speech. The authors conclude from their comparison that 'enhanced' breath control was not possible for ancestral hominids until at least 1.6 million years ago, around the time of Homo ergaster or early Homo erectus. They suggest that, "[i]t is not known whether non-human primates could be trained to take on a more human pattern of exhalations, but there is no evidence from their habitual vocalizations that they would be capable of doing so" (MacLarnon & Hewitt 1999: 358). So what do ape vocalizations reveal about their ability to control their breath?
Historically it was believed that ape vocal behavior, and hence the breathing that underlies it, was emotional and stimulus-bound, and not under volitional control. Even Jane Goodall, after many years observing the chimpanzees at the Gombe Reserve, came to the conclusion that, "the production of sound in the absence of the appropriate emotional state seems to be an almost impossible task for a chimpanzee" (Goodall 1986: 125). At one time, in fact, ape vocal behavior was believed to be so intractable that it was not subject to contingent control by operant conditioning (Skinner 1957). Today, however, Skinner's claim is recognized to be false, with counterevidence from a wide variety of primate taxa (see Pierce 1985 and Owren et al. 2011 for reviews). Increasing evidence also shows that apes are naturally flexible in their deployment of vocalizations, for example, suppressing vocalizations in certain situations when it is advantageous (Goodall 1986) or producing them only around particular audiences (Laporte & Zuberbühler 2010). In addition, a growing number of reports describe the use of novel and learned vocalizations and sounds by apes in different environmental circumstances, both in natural environments and in captive environments with and without human enculturation.
One example of learned vocal behavior in natural environments is reported in orangutans, which sometimes produce a lip sputter during an evening nest building routine (van Schaik et al. 2006). Critically, only certain populations produce the sound, and those that do produce it differently, using it at different phases of the nest building procedure. Thus the orangutan lip sputter appears to be acquired through social transmission and may be considered a cultural tradition. Another oral sound tradition documented in free-ranging orangutans relates to a sound called a kiss-squeak, which is created by a sharp intake of air through pursed lips (although the extent to which this sound reveals control over breathing per se, as opposed to the oral manipulation of air, is not clear). Although this particular sound appears to be a nearly universal part of the orangutan repertoire, in some populations it is modified in particular ways, such as covering the mouth with either hands or leaves, that are maintained as cultural traditions. Owren et al. (2011) assess novel behaviors like the orangutan's lip sputter and kiss-squeak as indicating "rudimentary volitional control over airflow".
Great apes in captivity have also been observed to produce learned vocalizations, even in cases without human enculturation. One study observed the spread of a lip sputter (also known as a 'raspberry' or 'Bronx cheer') element into the pant hoot sequence of a group of captive chimpanzees, originally produced by one member of the group, and eventually spreading to six (Marshall et al. 1999). Another set of studies found that chimpanzees in captivity sometimes use novel vocalizations and sounds as attention-getters with human caregivers, including a lip sputter and an extended modification of a voiced food grunt (Hopkins et al. 2007; Leavens et al. 2010).
Of particular relevance to the ontogenetic perspective taken here, there is also evidence that in environments with extensive human contact, apes can learn new vocal and breath-related behaviors from their human caregivers and companions. One such case concerns the zoo-born orangutan Bonnie, who acquired the ability to whistle from observing human caregivers (Wich et al. 2009). Bonnie, who was 30 years old at the time of the report, has been documented from the age of 13 years to whistle regularly, often apparently for her own amusement. Formal study showed that Bonnie would whistle readily on request, as well as spontaneously on her own, and was also able to imitate the whistle duration and number of repetitions produced by a human model. There are also several anecdotal reports of apes learning to smoke cigarettes. The apes in these cases appear to enjoy the activity and show a tendency to develop a habit (Witmer 1909; Kearton 1925; and see a recently publicized example of a smoking orangutan in a zoo in Malaysia at http://www.youtube.com/watch?v=B-h_JlpdRJQ).
In a few cases, great apes have developed notable breath control as a result of efforts to teach them to speak. A prominent example is the young human-fostered chimpanzee Viki, who underwent an intensive training regimen conducted by her 'foster parents', the psychologists C. and K. Hayes (Hayes 1951). From a traditional point of view, the project is remarkable for what is considered to be its abject failure. As critics like Pinker (1994) point out, Viki never succeeded in pronouncing more than four words: mama, papa, cup, and up, spoken with an unvoiced vowel described as a harsh stage whisper. Nevertheless, Viki did succeed in achieving some degree of flexible control over her breath and vocal tract. Beginning at the age of 4 months, Viki was subject to an operant conditioning procedure, which usually took place at the dinner table and used food as reinforcement. By 15 months, Viki had learned to articulate the word mama on her own, and having acquired this initial skill to vocalize at will, she was able to learn new speech sounds and words more easily by imitation. In addition to the speech sounds that comprised her 'words', /m/, /p/, /k/ and her vowel, Viki also learned to produce various other word-like sounds (e.g. blook, bloo) and a Bronx cheer, and could blow a whistle as part of an imitation task.
Somewhat less accomplished but comparable results were achieved with a chimpanzee trained by Garner (1900), a chimpanzee described by Witmer (1909), an orangutan trained by Furness (1916), and, more recently, an orangutan taught by Laidler (1980). And although not involving explicit training to speak, the learned vocal repertoire of Kanzi, a bonobo raised from infancy with extensive human enculturation and immersion in symbolic communication, offers a more recent example (Hopkins & Savage-Rumbaugh 1991; Taglialatela et al. 2003). Contextual and acoustic analysis shows that Kanzi regularly made use of four novel peep vocalizations, each used to communicate in distinct semantic contexts.
Finally, the subject of the present study, Koko, a human-fostered gorilla, has been video-recorded in the performance of a variety of behaviors that appear to involve voluntary breath control (Perlman et al. in preparation). Video records show her huffing and grunting into a telephone, huffing on the lenses of eyeglasses, performing a fake cough, blowing her nose, performing her version of a 'raspberry' (folding her tongue lengthwise and blowing air through it), and blowing into her hand as a communicative gesture. When agitated at a visitor or caregiver, Koko sometimes performs a breath-related gesture known as 'you blew it', in which she expels air forcefully towards the face of the transgressor. More frequently, Koko performs a more welcoming routine with visitors and caregivers called the 'blow test'. In an interactive exchange, Koko blows gently towards her interlocutor and smells their breath as they blow back.
Altogether these various reports suggest that under certain circumstances apes are able to develop flexible control over their breathing, and that they can acquire this control towards the performance of instrumental behaviors such as producing attention-getting vocalizations and socially transmitted, learned behaviors like whistling. Although Viki in particular was subjected to intensive operant conditioning without substantial progress towards actually learning to speak, the point is often neglected that she did acquire more general volitional control over her breathing and sound production.
These previous reports reveal the development of breath control in chimpanzees, a bonobo, and orangutans, yet studies have not addressed whether gorillas might acquire a similar level of volitional control. However, given their place in the great ape family, it is reasonable to expect gorillas to exhibit comparable potential for breath control, which would imply that the behavioral and neural basis to develop breath control is a general trait of the great apes. In support of this reasoning, the present paper reports on a series of video-recorded observations of the human-fostered gorilla Koko as she exercises breath control during play with musical wind instruments.

Subject
The subject of the study is Koko, a female western lowland gorilla (Gorilla gorilla gorilla) who was 37 to 39 years of age during the reported observations. Koko was born in 1971 at the San Francisco Zoo, but became ill at six months and was moved from the zoo's gorilla enclosure to be cared for by humans and nursed back to health (Patterson & Linden 1981). At the age of one year, she came under the care of the second and third authors (FGP, RHC). At this time, Koko began lifelong tutelage in a sign system derived from American Sign Language, as well as immersion in spoken English. Over the course of her life, play with musical instruments has been a common interactive activity between Koko and her caregivers, and as such, one that has been encouraged and rewarded. In general, Koko's novel breath-related and vocal behaviors have been subject to demonstration, molding and various forms of reinforcement including food and verbal praise, but have not been specifically trained by operant conditioning procedures.

Data Recording
The data come from video recordings made during regular daily sessions involving Koko, FGP and RHC. In most of the recordings, Koko interacted most directly with FGP while RHC operated the camera. On a few occasions in the corpus, family members or visitors were present, although none of these occasions are part of the present report. Similar video recorded sessions have been a more-or-less regular event since the project's beginning in 1972 and are an ordinary part of Koko's daily routine. In general, the video was taken without a specific research project in mind, but does reflect a broader effort to document Koko's notable abilities, communications, and life landmarks. The present study focuses on video recordings from July 2007 through December 2010.

Analysis
The first author (MP) viewed the video corpus and identified all instances of Koko interacting with a musical wind instrument. It was known in advance of this search that Koko commonly participated in play with wind instruments, and that numerous instances had been recorded in the corpus. Audio .wav files were extracted from the positively identified video clips, and Praat phonetic analysis software (Boersma 2001) was used to determine the onset and offset of each audible manipulation with an instrument. These times were used to compute the duration of each sound and the intervals between them. Koko was observed to perform three relevant types of behaviors during her instrument play, and each instance was classified accordingly. The majority of the time she blew into the instrument with the result that it produced a tone, an event referred to as a toot. On some other occasions, she produced blows, instances in which she blew into or on the instrument, but without producing a tone. And finally, she sometimes produced markedly short tones that appeared to be produced by oral manipulation (e.g., sucking) but without any clear breath control. These cases were operationalized as any tone with a duration of less than 0.26 seconds, and were excluded from further analysis. This operational definition was supported by a bimodal distribution of toot durations, with none occurring within the inclusive range of 0.26 to 0.30 seconds (i.e. orally performed toots were generally shorter than 0.26 s while true toots were longer than 0.30 s; Table 1). We generally did not see evidence that Koko selected particular notes in her play.
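The duration criterion described above can be illustrated with a minimal Python sketch. Only the 0.26 s cutoff is taken from the text; the function name, the label strings, and the sample durations are hypothetical and are not part of the original analysis, which was carried out in Praat.

```python
# Hypothetical sketch: only the 0.26 s cutoff comes from the text.
ORAL_CUTOFF_S = 0.26  # tones shorter than this were treated as oral manipulations


def classify_tone(duration_s: float) -> str:
    """Label a tone as 'oral' (excluded) or 'toot' (analyzed) by its duration."""
    return "oral" if duration_s < ORAL_CUTOFF_S else "toot"


# Made-up durations in seconds, for illustration only.
example_durations = [0.12, 0.25, 0.31, 0.72, 1.05]
labels = [classify_tone(d) for d in example_durations]
print(labels)
```

Because no observed tone fell between 0.26 and 0.30 seconds, any cutoff inside that gap would sort the reported data identically.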
Koko's instrument play was divided into bouts, which were defined as a series of toots that were separated by no more than 30 minutes from another toot. Bouts were further divided into sequences, defined as a series of toots separated by less than 4.23 seconds from another toot. This cutoff point was selected as what appeared to be a natural dividing point for Koko: a local minimum in a bimodal distribution of inter-toot intervals (see Table 1).
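The two-level segmentation just described can be sketched as follows. This is an illustrative reconstruction, not the authors' procedure: it assumes each toot is stored as an (onset, offset) pair in seconds on a common clock, and that the interval between toots is measured from one toot's offset to the next toot's onset. The function names and the sample data are hypothetical; only the two cutoffs come from the text.

```python
from typing import List, Tuple

Event = Tuple[float, float]  # (onset_s, offset_s) of one toot; assumed layout

SEQUENCE_GAP_S = 4.23   # sequences: toots separated by less than 4.23 s
BOUT_GAP_S = 30 * 60    # bouts: toots separated by no more than 30 minutes


def group_by_gap(events: List[Event], max_gap_s: float, inclusive: bool) -> List[List[Event]]:
    """Split time-ordered events wherever the silent gap exceeds the cutoff."""
    groups: List[List[Event]] = []
    for event in sorted(events):
        if groups:
            # Gap from the previous event's offset to this event's onset (assumption).
            gap = event[0] - groups[-1][-1][1]
            joins = gap <= max_gap_s if inclusive else gap < max_gap_s
            if joins:
                groups[-1].append(event)
                continue
        groups.append([event])
    return groups


def segment(events: List[Event]) -> List[List[List[Event]]]:
    """Return bouts, each of which is a list of sequences of toots."""
    bouts = group_by_gap(events, BOUT_GAP_S, inclusive=True)
    return [group_by_gap(bout, SEQUENCE_GAP_S, inclusive=False) for bout in bouts]


# Made-up toot times, for illustration only.
toots = [(0.0, 0.7), (1.8, 2.5), (9.0, 9.6), (10.5, 11.2), (4000.0, 4000.8)]
structure = segment(toots)
sizes = [[len(sequence) for sequence in bout] for bout in structure]
print(sizes)  # [[2, 2], [1]]
```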
Each bout was coded for the circumstances of its initiation (either self-initiated by Koko or encouraged by FGP and RHC) and its consequences, resulting in a verbal response, food, or no interactive response at all. During each bout, RHC was generally operating the video camera and directing it at Koko, but this act in itself was not considered in coding initiation or consequence. Initiations were determined by examining the clip for 30 seconds immediately preceding the bout, and consequences were assessed during the play bout and extending for 30 seconds immediately following it.

Results
In total, Koko was observed to perform 137 individual toots (84% of total manipulations) over the course of 38 sequences and 17 different bouts, with only two bouts occurring on the same day (see Table 2). In addition, she produced 12 (7%) blows and 14 (9%) excluded instances of oral manipulations. Koko most commonly played with one of several plastic recorders, but also used other instruments including harmonicas and party favor whistles. Table 3 shows the counts for how each bout was initiated and its consequence. Of the 17 bouts, only 11 provided the 30 seconds of preceding video footage necessary to determine its initiation (specifically to verify that it was self-initiated). Of these 11, 6 were self-initiated compared to 5 that were encouraged. Thirteen of the 17 bouts were videotaped with the necessary 30 seconds of subsequent footage, and of these, 8 resulted in some kind of verbal response, 2 in food, and 3 received no apparent acknowledgment. Table 4 shows the characteristics of Koko's musical bouts, sequences, and toots. On average, a bout lasted for 26.7 seconds (SD = 40.0), and individual sequences lasted for 6.3 seconds (SD = 5.64), with an average of 3.61 toots per sequence (SD = 2.68). Toot durations lasted on average 0.72 seconds (SD = 0.21) with a mean inter-toot interval of 1.06 seconds (SD = 0.18).
In addition, we were interested in Koko's rate of toots and blows for each sequence. Since sequence durations were calculated from the onset of Koko's initial toot, they failed to include the initial inhalation phase. Thus an adjusted sequence duration was calculated by subtracting the duration of the initial toot from the total duration of the sequence, thereby including only complete inhalation-exhalation breath cycles. (Consequently, the six single-blow sequences were not included in this calculation.) In sequences interrupted by oral manipulations, the longest continuous portion of toots and blows from the sequence was used. From this adjusted duration, an extrapolated rate of blow cycles per minute was calculated for each sequence.
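One way to read the rate calculation above is sketched below in Python. The text does not state how cycles were counted, so the sketch assumes that a sequence of n toots or blows contains n - 1 complete inhalation-exhalation cycles once the initial toot is removed. The function name, the (onset, offset) data layout, and the sample times are hypothetical.

```python
from typing import List, Optional, Tuple

Event = Tuple[float, float]  # (onset_s, offset_s) of one toot or blow; assumed layout


def blow_cycles_per_minute(events: List[Event]) -> Optional[float]:
    """Extrapolated blow cycles per minute for one sequence.

    Returns None for single-event sequences, which contain no complete cycle.
    """
    if len(events) < 2:
        return None
    sequence_duration = events[-1][1] - events[0][0]   # first onset to last offset
    initial_duration = events[0][1] - events[0][0]     # duration of the initial toot
    adjusted_duration = sequence_duration - initial_duration
    complete_cycles = len(events) - 1                  # assumption: one cycle per later event
    return complete_cycles / adjusted_duration * 60.0


# Made-up sequence of three toots, for illustration only.
example_sequence = [(0.0, 0.7), (1.7, 2.4), (3.4, 4.1)]
rate = blow_cycles_per_minute(example_sequence)
print(round(rate, 1))  # 35.3
```

In this made-up example, two complete cycles span 3.4 seconds, which extrapolates to roughly 35 cycles per minute.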
On average, Koko blew into the instrument at a rate of 36.14 cycles per minute (SD = 8.71). A critical question is whether this rate differs from Koko's usual breathing rate, as this would indicate that her breathing was under voluntary control. In general, Koko was stationary during her instrument play and did not exceed a mild level of physical activity. For comparison, we estimated Koko's breathing rate just after mild to moderate activity on eight occasions in September and October 2011, and found the range to be from 16 to 20 breaths per minute (BPM). A rate of 25 BPM was selected as a conservatively high estimate, and a one-sample t-test showed that the rate of blow cycles per minute was significantly faster than this baseline breathing rate, t(31) = 7.24, p < 0.0001.
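The reported test statistic can be checked from the summary values given above. The Python sketch below assumes a sample size of 32 sequences, inferred from the 31 degrees of freedom (38 sequences minus the six excluded single-blow sequences); the variable names are illustrative.

```python
import math

# Summary values reported in the text.
mean_rate = 36.14      # mean blow cycles per minute
sd_rate = 8.71         # standard deviation of the rate
baseline_bpm = 25.0    # conservatively high baseline breathing rate
n_sequences = 32       # assumption: inferred from df = 31

# One-sample t statistic: (sample mean - reference value) / standard error.
standard_error = sd_rate / math.sqrt(n_sequences)
t_statistic = (mean_rate - baseline_bpm) / standard_error
degrees_of_freedom = n_sequences - 1

print(f"t({degrees_of_freedom}) = {t_statistic:.3f}")
```

The result is approximately 7.2, which agrees with the reported t(31) = 7.24 up to rounding of the published mean and standard deviation.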
Last, we examined the intensity of Koko's 12 blows with the question of whether they showed evidence of increased forcefulness over her baseline breathing intensity. In general, the energy of the blows was clearly audible during audio-only playback, as well as visible in a spectrogram, indicating a clear elevation in sound intensity from background noise. Koko's normal breathing, in contrast, was generally not detectable under the recording conditions, either audibly or by spectrogram. Intensity measurements were taken at the onset and offset of the blow, and these were averaged together as a baseline intensity for comparison to the blow's mean and maximum intensity. The results showed an average increase in mean intensity of 5.12 dB (SD = 3.99) and an average maximum intensity of 9.79 dB over baseline (SD = 4.67).
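The baseline comparison described above amounts to simple arithmetic on four intensity readings per blow. The Python sketch below illustrates it under the assumption that the onset and offset readings are averaged directly in dB, as the text suggests. The function name and the sample values are hypothetical and are not measurements from the study.

```python
from typing import Tuple


def intensity_over_baseline(onset_db: float, offset_db: float,
                            mean_db: float, max_db: float) -> Tuple[float, float]:
    """Return (mean increase, maximum increase) of a blow over its baseline, in dB."""
    baseline_db = (onset_db + offset_db) / 2.0  # average of onset and offset readings
    return mean_db - baseline_db, max_db - baseline_db


# Made-up readings for one blow, for illustration only.
mean_increase, max_increase = intensity_over_baseline(
    onset_db=50.0, offset_db=52.0, mean_db=56.0, max_db=61.0
)
print(mean_increase, max_increase)  # 5.0 10.0
```

Averaging these per-blow increases across the 12 blows would give summary figures of the kind reported above.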

Discussion
We analyzed video recordings of the human-fostered gorilla Koko performing 38 sequences of play with musical wind instruments from 17 different bouts. Presumably because her environment is more comparable to that of human children learning to play a musical instrument, Koko has developed the volitional ability to use her breath to produce tones with a variety of instruments, including plastic recorders, harmonicas, and party-favor style whistles. The analysis shows that when Koko blows on an instrument, she tends to adopt a pattern of breathing that is significantly faster than her normal rate. Moreover, although difficult to verify, it would seem that the air pressure necessary to sound a tone would also require Koko to produce extra forceful breaths while playing. This point is supported by the few occasions when she directed blows at the instrument that were not in the proper place or manner to produce a tone. The breaths she exerted in these instances showed a clear increase in sound energy over the background noise, whereas her normal breathing was not generally detectable. Koko thus shows the ability to exercise volitional control over the frequency and forcefulness of her breath.
It is true that instrument play is commonly modeled for Koko, and her performance is often encouraged, engendering positive social response, at times including explicit rewards of praise and food. We note, however, that this scenario of positive engagement and reinforcement is roughly similar to many environments in which human children learn to play musical instruments and perform other breathing-related behaviors. This point is critical from the ontogenetic perspective of embodiment, according to which breath control is understood to develop ecologically, in contexts in which it is useful and relevant. Furthermore, several aspects of Koko's instrument play point to an intrinsic interest she has gained in the activity, showing it to be more than a reward-contingent trick. Koko is often inclined to play when the opportunity arises (e.g., when she is handed an instrument), and the video recorded bouts show that she often sounded her instrument repeatedly without any visible expectation of reward, sometimes for more than a minute at a time. Additionally, Koko herself initiated 6 of the 11 bouts for which initiation could be determined, received food rewards in only 2 of the 13 bouts for which consequences could be assessed, and in 3 of those bouts appeared to blow on the instrument without receiving explicit social acknowledgment (aside from video recording), possibly for her own amusement. In any case, it is clear that Koko understands the different instruments' sound making affordances, and is fully capable of taking advantage of them when she wishes.
Fitch (2006) raises the question of whether increased breath control evolved specifically in the adaptive service of speech, or whether it may have evolved first for other reasons (he offers the examples of prolonged running or swimming) and thereby was available as a pre-adaptation for spoken language. Koko's instrument play is the first documented case of breath control in a gorilla, but it contributes to a growing literature that now spans reports of voluntary breath and vocal control across the great apes. The converging consensus is that the great apes, as a family, possess the potential to control their breath and that this ability develops flexibly in contexts when it is motivated and useful. Thus breath and vocal control does not arise as a monolithic skill that an organism either does or does not possess, but instead it is acquired and practiced within an instrumental, purposeful context.
From the present report and the literature reviewed above, we see a few different contexts that appear to motivate the development of breath and vocal control. With captive chimpanzees (Hopkins et al. 2007; Leavens et al. 2010), we see the instrumental use of innovative airflow and vocal sound as an attention getter with human caregivers. In a few cases with intensive human interaction, novel, more semantically specific communicative signals are developed and learned, as with the human-fostered bonobo Kanzi's untrained modifications of peep calls (Hopkins & Savage-Rumbaugh 1991; Taglialatela et al. 2003) and the chimpanzee Viki's similarly sized, trained vocabulary of 'words' (Hayes 1951). And a more vague, culturally determined relevance motivates the spread of several behaviors that are seemingly more arbitrary in function. These behaviors include, for example, the nest building lip sputters of wild orangutans (van Schaik et al. 2003) and the raspberries incorporated into chimpanzee pant hoots (Marshall et al. 1999). These behaviors are meaningful in their particular sociocultural context, but otherwise do not appear to serve any clear instrumental function. The case study of Bonnie the whistling orangutan (Wich et al. 2009) and the present study of Koko's wind instrument play may also fall into this category. However, these activities also seem to provide intrinsic entertainment for Bonnie and Koko (perhaps similar to music in humans), and are performed more independently of any specific cultural routine.
Taken together, these various research findings suggest that the adaptive value of breath control among the great apes lies in its flexible development in the service of behaviors that are relevant and motivated within a particular environment. While one may argue that many observed instances take place within an 'artificial' human context, there are at least a few observations of flexible breath-related behavior in free-ranging apes. Additionally, it is important to note that the current cross section of observational time is extremely narrow in comparison to the millions of years of adaptive contexts and social traditions that may have cycled in and out of ape populations over their histories. Over the course of great ape cultures, it seems probable that a multitude of breath-related behaviors have come in and out of fashion.
Thus, from the present embodied, ecological perspective, Fitch's question of whether breath control evolved specifically to support speech versus some other specific behavior neglects the possibility that breath control is adaptive for its flexibility, rather than for any function (or set of functions) in particular. Indeed, this point is supported by the wide variety of behaviors involving breath control that are pervasive in human cultures, spanning children's play, smoking, spitting, whistling, vocal imitation, breathing patterns in activities like swimming, diving, running, and spiritual rituals, and of course, talking and making music. Clearly, humans show an extraordinary ability to hone their breath control into a dexterous and finely tuned instrument, serviceable for a number of culturally determined functions. Mounting evidence suggests that we are not fully unique in this respect, and that our great ape relatives share with us at least a rudimentary basis for this flexibility in their breathing and vocal behavior.

Table 1: Toot Durations and Inter-toot Intervals. Note that the times are in seconds. The final column shows the range in which the local minimum was established: 0 for the specified range, along with the count for the preceding and subsequent ranges of the same size (used to distinguish oral manipulations and sequences).

Table 2: Number of Bouts, Sequences, and Toots by Instrument Type.

Table 3: Number of Bouts by Initiation and Consequence. Note that food also implies a verbal consequence. For verbal consequence, the parentheses indicate the number of instances verified for a full 30 seconds after the bout.

Table 4: Characteristics of Bouts, Sequences and Toots.