What is needed is a synergistic approach to the problem of language origins. How should one delimit the phenomenon of language? What would count as the origin of the process? Language is profounder than speech; its realisation may take different forms and what matters is the underlying language-capacity. This capacity is rooted in and forms part of the general perceptual and functional organisation of the human being. The emergence of language has been the result of a process of mosaic evolution, with diverse faculties found in animals (particularly in birds) coming together in the human and, through a radical change in brain connections, giving rise to language integrated with the motor system.
We seek the origin of something we call language. What is it we are seeking the origin of? Say: the expression of some internal process, the internal reception of some externally-perceived structured activity, speaking, hearing, understanding. A relation between what is going on 'in our heads' and a bodily activity of ours perceivable by others and interpretable by them. A relation between a perceivable activity of others and what consequently goes on in our heads. Language more specifically as words, and the patterns into which words can be combined (semantics and syntax, lexicon and grammar). 'Speech' is different from 'language'; similarly gesture is different from language; and written forms are different from language. Or say that speech is a part of a language. Language is the capacity of one individual to alter, through structured sound, gesture or visual emission, the mental organisation of another individual.
What would count as the origin of language? To answer this question we must assume that we can recognise language and distinguish it from other forms of communication. Let us further assume that language is a complex system with many aspects: the articulatory, the serially-organised, the lexically-structured, the expressive, the conceptual and so on. Do we conceive of language as having sprung into existence full-blown or as the result of the accretion of elements gradually coming to constitute something recognisable as language? What, after all, counts as the origin of anything? Darwin's answer on the origin of species was in essence that there was no origin and there were no species. His answer was a lateral one: there is only an unending, infinitely complex, process of change and accommodation, leading to divergence of form and function over immense periods of time, with apparent separation of species in space and time produced by the disappearance of intermediate forms.
If we assume the continuity of nature (and hence of life), and the evolution of man as part of a general evolution, then we will not expect any distinct origin of particular human forms or functions. Language development would be a continuum from the simplest form of inter-individual communication to a more and more complex one. It might be similar to the development of perception from the simplest 'eye' to the most advanced - but perception is not the eye, or not only the eye, any more than language is speech, or only speech. Perception is the capacity to interpret relevantly for the individual the complex modifications of input to the eye. Clearly, in a sense there is evidence of a sharp discontinuity in language: humans speak and animals do not. But the discontinuity is a contemporary one. We have no reason to believe that there was as sharp a discontinuity in the emergence of language. No doubt the development of speech was a very important part of the process of language emergence - but the circumstances surrounding the development of speech cannot be identified as the origin of language, or even the originating situation. Darwin's "Origin of Species" offers a guide in several ways, not only in demonstrating that an absolutist approach to concepts such as origin, species or language is unprofitable but also in suggesting that progress in the enquiry can best be made in phases, of accumulation of possibly relevant evidence, of provisional assessment and interrelation of the evidence, of the tentative formulation of hypotheses and questions as a way of coming to grips with the material - coupling a uniformitarian approach (which assumes that traces of past development can be found in the present) with an unwillingness to accept any prior dogmatic assertions which close out lines of enquiry. 
So, in study of the origins of language, we already have much material drawn from very different disciplines, which may or may not be relevant or may have a relevance quite different from what the authors of the evidence believe it to have. For example, research directed towards a protolanguage and monogenesis may have relevance for the biological rather than the cultural development of language. No objective research is without value and no systematic examination of the subject is to be disregarded. It is not a matter of determining that one particular approach is wrong: we are all casting around and all wrong. Truth even in the most rigorous areas of science can at best be only provisional and will seem wrong in the light of later knowledge. The synergistic approach is to look sympathetically at whatever has been done or suggested and see how it can be fitted into a larger picture, or how far it suggests questions which might be tackled in other disciplines.
The only trap which should be avoided is an over-hasty linguistic analysis of the problem, e.g. by the premature introduction of technical terms or technical uses of terms into a field where confusion about the words we use is so easy. In this category of premature technical terms, I would put words like symbol, symbolicity, icon, iconicity, arbitrary, semiotic, dual articulation. Words or phrases such as these give the impression that we know what we are talking about or agree what we are talking about when we do not know or do not agree.
At the 1975 New York Conference on language origins (Harnad, Steklis and Lancaster 1976), the first section of papers was meant to be directed towards "Formulating the Target". I do not think they were successful. Various definitions of language were presented, some in technical terms. The safest course perhaps is to accept that we know well what we mean by language, human language, spoken language - as we can recognise a camel though we might not be able to define one. There seems no need to over-generalise language at this stage - to deal with homologues of spoken language such as deaf-and-dumb language or gestural languages as such or with animal 'languages' or attempt to bring animals nearer to the communicative patterns of behaviour involved in human language. For this enquiry, whilst we recognise that speech sounds and connected speech are vital components of the total human language behaviour, we should also recognise that language has many other aspects and in talking about the 'origin' of language we have to be concerned with these other aspects as well as the origin of speech.
Similarly for the question of the origin of spoken language. Looked at intently, there of course can be no absolute origin, e.g. on the 1st of April 40,000 BP or 500,000 BP. The whole of vertebrate evolution, of body, brain and behaviour, was the 'material cause' or origin of language development, the evolution of the lungs, pharynx, larynx, tongue, mouth etc. as well as the neural control for these. But what we can profitably look for are the possible distinctive features of human development, the interconnection of elements which led to the successful elaboration of a specifically human communication system with the capacity to grow in power throughout the millennia. In many ways, dating the stages in the development of human language is the least important part of the enquiry. There is an understandable temptation to weave a speculative web round the (highly uncertain) datings of human structural evolution, and the scattered remnants, fossil and artefactual, that have so far been found (always possibly quite unrepresentative and misleading, as the rather chequered history of palaeoanthropological dating shows). But relative dating within the process of the development of language can still be of importance, even if absolute dates must remain uncertain.
To summarise: we need not worry very much about the definition of language. We are concerned with ordinary human spoken language in the first place, though we need not exclude wider aspects of language and communication later in our research. We need not aim for a precise dated origin of language or a single event origin of language. We should recognise that language-capacity was composed of a mosaic of structural, anatomical, neural, behavioural and environmental features and be concerned to propose a plausible sequence of events in the evolutionary history of language. We need not suppose that once language was achieved, it had no further significant development to undergo; it is likely that the same factors which led to the appearance of language have continued to operate to increase its power. The dogma that there is no primitive language rests on the assumption that syntactic simplicity is the proper criterion of primitiveness. The essence of language development is lexical development (in size, in range of reference, in discrimination and in interconnectedness) linked to the possession of the concepts to which the words in the lexicon refer.
Given this preliminary clarification of what might be meant by origin of language, the methodology follows from the assumption that the human language capacity results from the integration of many different elements, anatomical, neural and behavioural. We are concerned with the origin of hearing speech just as much as with the origin of speaking, with the origin of individual words as much as with word-order and grouping, with the ability to distinguish speech-sounds categorically and to combine articulatory patterns into connected speech. The only evidence we can bring to bear on the origin of human language is what exists in the present; if our theory assumes the mosaic evolution of language, then what we should look for is evidence of the mosaic elements in present-day observation, both in animals and in ourselves (perhaps 'behavioural fossils'). Theories of the evolution of language have tended to focus on the major features of human anatomy and neuroanatomy, e.g. laterality. It may be more profitable initially to see how many of the aspects of language we share with animals and then decide what it is we do not share with them.
If one approaches the question from the standpoint of more traditional theory, a number of features should be considered together: Bipedalism, Manipulative skill, Good sound discrimination, Ability to imitate and respond to differing sound patterns, Ability to form concepts, Ability to generalise and solve practical problems, Better than usual vertebrate sight, Bodily agility, Close group bonding, &c &c.
Put these elements together and what do you get? A bird - a pigeon - a parrot - a sparrow - a bluetit - a Superbird! As Thorpe commented (1967:10), it looks indeed as if the birds are the group which ought to have been able to evolve language in the true sense and not the mammals. If one is looking for bits and pieces of behaviour scattered about the animal kingdom which, put together, could have been used to construct human language, one might list:
1. pigeon (Thorpe, 1979) or octopus (Sutherland, 1964): power to form concepts
2. chinchilla (Kuhl and Miller, 1975) or Rhesus monkey (Morse, 1976): ability to discriminate categorically between speech-sounds and to generalise despite differing formant frequencies (Burdick and Miller, 1975)
3. mynah bird (Thorpe, 1967:19-20): ability to exactly produce human speech sounds
4. parrot or mynah bird (Thorpe, 1967:8): ability to imitate sound exactly
5. sparrow (Marler, 1976) or chaffinch (Marler and Peters, 1981): ability to learn vocal patterning from conspecifics
6. gibbon: ability to hear in a categorical way vocal sound and respond with vocal sound - the antiphonal ability also found in birds (Thorpe, 1967:9)
7. bee (von Frisch, 1967): ability to convey environmental information by patterned body activity
8. budgerigar or elephant: memory.
This is in no sense a frivolous list. In examining aspects of animal behaviour in relation to human behaviour, one should pay attention to what strikes us as odd, surprising or divergent. Pigeons can recognise the presence of human beings in photographs (Thorpe, 1979:738), Rhesus monkeys appear to perceive speech sound categorically in a way similar to that in which human infants do (Eimas and Miller, 1971), a mynah bird can imitate human speech with astonishing exactness (even to reproducing a recognisably foreign accent in spoken English) (Thorpe, 1967:19), sparrows form song-dialects passed down from generation to generation (Marler, 1976), a bee can convey the correct direction of a honey-source allowing for wind drift (Russell and Russell, 1973), a budgerigar was recorded as having learnt 8 nursery rhymes with a great deal of occasional poetry, telephone numbers and other items (Milner, 1973:244). That animals of various kinds have these behavioural achievements means that they have the neural patterning required to support the motor programs for the different types of performance. In the human being, mosaic evolution of the neural system, bringing together brain connections and brain programs which had appeared quite separately in other animals, could go to form the basis for spoken language capacity. The evidence of what animals can do is directly relevant; brains are very similar in broad structure throughout the range of birds and mammals. The basic neural processes are uniform: the function of the neurone, nerve-fibre transmission, synaptic function and transmitter substances (Welker, 1976). The neural basis of the vocalisation system in birds is anatomically and functionally similar to the corresponding parts of the mammalian brain (Brown, 1974; Phillips and Peck, 1975).
What are the implications of the presence in animals of mosaic elements which might have formed the human capacity for language? First of all, they raise questions about the survival value of each of these elements separately and open up the possibility that the human language ability has been acquired as the result of a series of steps in brain and behavioural development. Secondly, they lead one to consider more closely the exact nature of the abilities these animals have and the importance they may have in the total human language capacity.
Much more has been said about the possible survival value of fully-achieved human language than about that which the different elements going to form language might have had at each stage. Some scholars have emphasized the 'external' selective benefit of language: the exchange of propositional information for hunting, children asking for food, instruction for toolmaking, communication between groups. Others have laid emphasis on the 'internal' benefits of language: the ability to create and retain improved cognitive maps, language as a strengthened organ of perception allowing the extraction of information about the world by a kind of process of triangulation, the word acting as a bridge allowing the integration of information from different modalities and allowing humans to share referents.
The internal and external advantages of modern language seem to go together. Language can be seen as a world-analysing device, with displaced percepts linked to words forming the substance of thought, detaching the inner processes from involvement in the immediate emotional experience. In place of the 'blooming, buzzing confusion' of William James (1890), language facilitated a decomposition of experience, the benefits of which could be extended, also by language, for the advantage of the family and group as well as of the individual. Language can thus be seen as lying at the beginning of the scientific process. Externally, language would have been a powerful force for group survival, in struggling against nature, in planning for the future, and in struggling against animals and other human groups. At its crudest, the selective advantage for the group can be seen as associated with early military technology. Upright stance freed the hands to use as and with weapons (against animals and other hominids). Language increased the effectiveness of group use of weapons. On this view, rather like Dart's (1959), the survival value of language was - survival! Since those early times, improved communications and improved weapons have gone together in successful warfare. The military satellites and ballistic missiles are descendants and continuations of the original communication equipment - spoken language - and the original weapon technology - sticks and stones. The speechless groups were eliminated.
If language capacity was acquired in stages as a process of mosaic evolution, a more interesting question is what selective advantage the mosaic elements of the capacity had in successive stages. One can ask this about the apparent presence in other animals of some parts of the human language capacity. For example, what is the survival advantage for parrots or mynah birds of their ability to imitate a wide range of sounds and, in the case of mynah birds, to produce human speech sounds very accurately? If a parrot's imitative ability is in some way related to the use of sound by other animals - e.g. as a way of strengthening group links, identification or territory-marking - the ability seems to have gone a long way further than is really needed. Possibly one needs to obtain more ethological data on these species or to revise the concept of survival value, at any rate in relation to changes in brain-function; what matters is not survival of the fittest but the disappearance of the least fit.
This would leave room for recognising neutral mutations (Wilson, 1985) or (in hindsight) preadaptations, viz. "forward-looking" changes or changes that go beyond the immediate needs of the situation. Applied to hominids, it might mean that e.g. an imitative power could have developed with no immediate survival value but at the same time no positive survival disadvantage; it might in fact be a side-effect of some other functional reorganisation.
Similarly, one might ask what is the selective benefit for chinchillas or Rhesus monkeys of being able to discriminate between human speech-sounds? Or is their ability the product of some more general organisational features of these animals? Of what use is it to a budgerigar to be able to reproduce 8 nursery rhymes? The existence of these seemingly unprofitable aspects of behaviour in a variety of animals suggests that in human evolution, a number of the elements ultimately required for language-capacity could have been acquired, so to say, incidentally, particularly mutations affecting connections in the associative areas of the brain. Imperfect adaptation would be the key to advance with neutral or even marginally harmful mutations accumulating to the point where they provided the material for mosaic evolution, with fitting together of the pieces required for human spoken language.
Whatever the selective benefit of the features listed may have been to the animals concerned, the fact remains that these abilities exist. They resemble often quite closely human abilities implicated in the capacity for language. It is important therefore to consider more closely the exact nature of the abilities and the significance they may have in human language capacity. Perhaps the first and most important of these abilities is that of imitation. Apart from man, birds are incomparably better imitators than any other living beings. But how is imitation possible, either for birds or for humans? The imitation of sound is only one segment of a much more general power of imitation that we have; some animals can imitate bodily action but not sound and the general problem of imitation is much the same in both cases. Imitation seems to require at least four types of ability: 1. to perceive external patterning (looking or listening); 2. to analyse the perceived external patterning into discrete uniform elements; 3. to transfer the set of elements to another functional system in the brain, possibly transform them there and form them into a production program; 4. to activate the production program through the peripheral devices of the second functional system (produce imitated speech, facial expression, or other bodily action.) Presumably both in the mynah bird and in man, some similar processing to imitate sound, and especially speech-sounds, must occur. The difference, of course, is, as someone said at the New York conference, that no matter what the parrot or the mynah bird says, he is not telling us anything. In man, imitation plays a role different from that of imitation in the parrot. 
Whilst straightforward imitation of spoken language may be of importance to children learning to speak, in the adult the circle of imitation must be broken; imitation by the adult may have been of value in the early spread of language (rather like the spread of song-patterns in a group of birds) but for conversation, heard speech had to be not imitated but responded to. Overt imitation had to be suppressed; covert imitation would be linked to the conceptual structure.
But the power of imitation seen in birds such as the parrot and mynah has other important implications for understanding human speech-capacity. If the mynah bird is able to imitate human speech very exactly, then the production of human speech-sounds, combined in an appropriate way, does not depend on a uniquely human articulatory apparatus or a uniquely human neuromotor system for controlling articulation. Mynah birds can imitate human speech and other primates cannot. Thorpe's response to a questioner who said that primates' inability to imitate speech was due to the defective structure of their larynx is worth quoting: "All one can say is that the vocal organs of birds are apparently much less appropriate for imitating human speech than those of the chimpanzee or gorilla. I think... that if you showed a bird syrinx to a laryngologist who had never seen one and said: 'How is it that an animal with this can talk?' he would say: 'It is utterly impossible'" (1967:11). On similar lines, Nottebohm at the New York conference said that the supposedly unique properties of human language could probably have evolved in many other vertebrate forms with little need if any to change their vocal tracts (1976:645); and Wind, at the same conference, stated that a chimpanzee larynx grafted into a human would enable the latter to produce normal speech (1976:626). The ability of a bird to produce human speech-sound also casts doubt on the proposition that the growth in size of the human brain was due to the heavy demands made on the neural system for control of articulation specifically (apart from other neural requirements for language).
Rather similar questions arise in relation to the ability of a variety of animals to respond categorically to different human speech-sounds, and the infant's ability to respond to a range of phonemes wider than the set found in the ambient language. Why should chinchillas be able to distinguish different vowel-sounds and maintain the categorical perception of them over 27 speakers with very different formant frequencies? The ability of Rhesus monkeys to distinguish different consonantal speech-sounds categorically is equally surprising. Rather like the power of imitation in the parrot or mynah bird, there is no obvious selective advantage that the ability to respond in a discriminating way to different human speech-sounds could have for chinchillas or monkeys (though there is an obvious advantage for the human infant in being able to discriminate a wider range of phonemes before he finds out what the local language is). In both instances, the ability looks more like a by-product of some other important function or a neutral character or an inexplicable one. Even more relevant for the human language ability: how are chinchillas and monkeys able to make these discriminations? Presumably they have not learnt human speech-sounds and they can have no schemas to match them against - or at any rate no vocal schemas. They must have a set of abilities: 1. to attend selectively to speech-sound as such; 2. to analyse the sound in some way to extract uniform elements from it; 3. to transfer the elements to another functional system or transform them in that system; 4. to recognise uniformities there; 5. to activate a discriminating response in the second functional system specific to the categorically distinct sound-element. Quite a performance - which is very similar to what the human needs to do to extract uniform speech-sound patterns from very various acoustic experiences.
In so far as words are linked to concepts, the ability of most animals, perhaps all animals, to form concepts and use them to program action is very relevant. An octopus can distinguish between a triangle and a square, between a vertical rectangle and a horizontal rectangle (Sutherland, 1964). There seems to be nothing uniquely human about the formation of concepts, and the process must be very similar throughout the animal kingdom. Probably it is constituted by the following elements: 1. selective attention to a particular segment of visual or other experience; 2. analysis of perceptual information to produce separable contours &c; 3. transfer or transform for incorporation in an experience record-system (cognitive map, body-map or body-image); 4. abstract and generalise features from repetition of the experience; 5. establish a concept-structure for the percept (eliminating accidental features found in differing circumstances); 6. link the concept-structure to another functional system and to specific patterning there related to the concept; 7. activate the second functional system to produce a uniform response to the concept as instantiated in a particular percept. This crude description of what might be involved in concept-formation bears some obvious similarities to the description of what might be involved in imitating specific speech-sounds or producing discriminating responses to different speech-sounds.
Thus, many of the abilities required for human speech are possessed in some degree by other animals (and particularly birds). If one considers more closely the nature of these abilities, there are certain general features that can be seen. First of all, the very great importance of cross-modal or transfunctional links (on the existence and significance of cross-modal connections, see Davenport, 1976; Ettlinger and Blakemore, 1967; Ettlinger, 1973; Premack, 1976). In imitation, there is transfunctional linking between visual perception and bodily action, between hearing and articulatory activity; in the discrimination of speech sounds, there is transfunctional linking between hearing and the action response (in the case of infants, sucking on a teat). In the case of concept formation, there is transfunctional linking between vision (or other forms of perception) and action, in the case of the bee, a link between vision and bodily action. Secondly, despite very differing peripheral apparatus, syrinx or larynx, the observed behaviour seems to require similar 'programs of the brain' - to use J.Z. Young's phrase. This fits in with Jan Wind's view that cerebral reorganisation was decisive for the origin of speechlike communication with the ability to form cross-modal associations and increased memory (1976:628). The history of the development of human language then becomes a demonstration of E.M. Forster's words: 'Only connect', and one needs to examine the nature and the progress of this increase in the connections between the various parts of the human brain, which has resulted in the supreme cross-modal device, the linking of experience of the real world to the internal structure of language.
Some idea of the richness and potentialities of cross-modal development can be found in aberrant, extreme forms of the phenomenon in man and in forms of cross-modal linking which we do not have but might have. Links can be established between more than two functions, triple or quadruple links, or what might be described as supramodal links. MacDonald Critchley gives examples in his discussion of synaesthesia in relation to music (Critchley and Henson, 1977:217-233): many people experience specific different colours in relation to differences in musical pitch; for some, musical patterning is converted into much more extensive patterned visual experience; one person found that so vivid were the photisms resulting from music that he could sketch them, their contours appearing more important than their colours; in some people, music produces an imagery first of taste and then of colour (minor chords are bitter, major chords sweet). He comments that synaesthesia is not a linguistic matter of metaphor but is the outcome of genuine intersensory attributes: the employment of transmodal metaphors in speech is something more than a turn of phrase, being the product of veritable perceptual attributes at an intersensory level. One musician found that he could recall a particular musical pitch more accurately by matching to the remembered associated colour rather than by sound alone. The significance of such cross-modal linking seems to be in providing a finer distinction of sensory experience. This was certainly the case with the remarkable memory of S - the Mnemonist - described and studied by Luria (1968). He experienced the objects and events he remembered in many different senses: taste, smell, touch, vision - his only real problem was his inability to forget anything.
The lack of some key cross-modal links may explain the ape's inability to speak. Though a bird such as the pigeon can readily learn to transfer its response from patterned sound to patterned light, apes cannot readily do this. The ability to make such a transfer implies the existence of some trans-modal coding of the pattern and some connection between the visual and the auditory apparatus. Lack of the necessary intracerebral connections may also explain the chimpanzee's restricted use of its vocal apparatus. Perhaps one of the chief values of chimpanzee language studies is to force investigators to consider more deeply what the characteristics and necessary requirements for human spoken language are, and why it is that, considering their brain sizes, apes (particularly gorillas and chimpanzees) are so 'stupid' - when Lenneberg's nanocephalic dwarfs, with brains no larger than a gorilla's, were able to learn language (1967:69-71). More generally, why have no other animals developed a human-type language if there are important survival advantages attached to it? Perhaps, for birds, it was a matter of choosing flight as the most profitable use of the freedom given by bipedalism (after all, even though they have no propositional language, there are today more birds than humans and they occupy a greater geographic range; apparently, flight has worked well for them in the survival stakes). Otherwise, it seems that the absence of language in animals is due to lack of the appropriate central connections in their nervous systems, not to other anatomical deficiencies.
If then human language capacity is due to cerebral reorganisation, particularly increments in brain connections, what were the stages, the means and the organising principle of the reorganisation? In the cross-modal abilities described, what is apparent is the involvement of the motor system, viz. the expression of cross-modal linkings in action of some kind or other, whether the result is e.g. imitation of a facial expression or bodily movement, production of sound or speech, or some discriminating action in response to heard sound. This is not really surprising in view of the central role of motor control in behaviour; motor control seems to be the primordial ability of the organism, with sensory input devices developing in larger organisms to provide a refinement by external inputs modifying action and with the nervous system extending to maintain control over distant parts (perhaps the essence of the nervous system indeed is maintaining speed of response as organisms increase in size). It seems reasonable to assume an extensive relation between human spoken language and the motor system and indeed to expect that the language capacity has been built up on the framework provided by the motor system, as Lieberman has suggested (1984).
But where does this conclusion, or suggestion, that language in its development and its functioning has an intimate relation to the organisation of the motor system, lead? What lines of investigation are indicated? There are two main directions: first, investigation of the relation between the motor system and other modalities or functions like vision, hearing, conceptualisation and memory; secondly, examination of the relation between the motor system and the substance of spoken language, viz. word-forms and word-order. We can observe, for example, in dance-forms a direct relation between sound-patterning and body-movement. Similarly, we can observe a direct relation between visual contour and body-position and movements when mouth, face and hand are used to mime some perceived object. We might even suppose that the structures of concepts for visual experience or for auditory experience have a direct relation to motor programs and in this way a straightforward link could be assumed between perception and action. Consider, for example, how the concept of a triangle might be represented: rather than anything like a picture of a triangle, it would be quite sufficient to have a program for the production or scanning of a triangle. On a computer, one might have a general 4-step program: (1) plot any point on the centre-line; (2) move down any distance along the centre-line; (3) move left any distance and plot; (4) reverse to the right for the same distance plus any additional distance and plot. This on a computer would be a completely general triangle-producing program, parallel to a plausible program for scanning a triangle visually. What one would have is a motor program for a triangle functioning as a concept for categorising an unlimited range of different triangles. A motor program itself does not of course involve or require movement, perhaps simply being the prescribing of biases in the muscular system.
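The four-step triangle program can be written out as a short sketch. The version below is only illustrative: the function names, the coordinate convention (centre-line at x = 0, 'down' as decreasing y) and the requirement that each 'any distance' be greater than zero are assumptions introduced here, not part of the original description.

```python
def triangle_program(start_y, drop, left, extra):
    """Hypothetical rendering of the 4-step program; every distance must be > 0."""
    if min(drop, left, extra) <= 0:
        raise ValueError("each 'any distance' is assumed to be positive")
    # Step 1: plot any point on the centre-line (assumed to be x = 0).
    apex = (0.0, float(start_y))
    # Step 2: move down any distance along the centre-line (nothing is plotted).
    base_y = apex[1] - drop
    # Step 3: move left any distance and plot.
    left_point = (-float(left), base_y)
    # Step 4: reverse to the right for the same distance plus an extra distance, and plot.
    right_point = (left_point[0] + left + extra, base_y)
    return apex, left_point, right_point


def area(p, q, r):
    """Area of the triangle p, q, r; a non-zero area means a genuine triangle."""
    return abs((q[0] - p[0]) * (r[1] - p[1]) - (r[0] - p[0]) * (q[1] - p[1])) / 2


# The same program, with different free parameters, yields different triangles.
for params in [(0, 3, 2, 1), (5, 1, 4, 4), (-2, 10, 0.5, 7)]:
    print(params, triangle_program(*params))
```

The point of the sketch is that the program itself is fixed while its parameters are free, so a single procedure covers an unlimited range of triangles with one vertex on the centre-line and a horizontal base crossing it.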
It seems plausible and indeed economically desirable that the motor system can mediate between different modalities, converting visual perception into bodily action, perceived shapes into sounds, heard sound into shapes or contours.
If, as the discussion so far suggests, aspects of animal behaviour resembling features of human language capacity are cross-modal or trans-functional and the major component in them is the motor control system, it is essential to examine more closely how far there is evidence, in current speech, for a special relationship between language and the motor system. At the same time, one can speculate about the nature of the intimate relation of motor control and different aspects of the human speech process. Others have, by different routes, arrived at the idea of a special relationship between language and motor function (see for instance Swinney, 1981; Lomas and Kimura, 1976; Fowler et al., 1980; Ojemann and Mateer, 1979; Kelso and Tuller, 1984; Turvey, 1980; Kinsbourne, 1978; Kimura, 1973; 1976; Kertesz, 1982). Comments can be found such as: language is essentially behavioural muscular processes; action and language are homologous; a formal theory of language and a formal theory of movement control would be qualitatively indistinguishable; human language is primarily a series of actions; language and motor action are intimately connected, ontogenetically, perhaps phylogenetically, and in the continuous daily use of language by adults (McNeill, 1980:240).
The most literally visible evidence of the relation between language and motor function is to be found in gesture, or subgesture. Others have commented that vocal and kinesic behaviour in children develop together, that manual and language skills mature in parallel in very similar ways. McNeill, in particular, has proposed for gesture an important role in the speech production process. He suggests (1981) that gestures can be regarded as externalised traces of the internal speech programming processes: "Many utterances are constructed in terms of concrete models of reality, or sensory-motor representations ... gestures are a fundamental part of speech production and meaning structures."
Gesture bears witness to the external relation, so to say, of language and movement. A motor theory of language origin implies that, internally, every significant feature of language can be approached in motor terms. There should, on this view, be a relation between motor control and bodily movement associated with speech, between motor control and conceptualisation, between motor control and perception, in so far as the motor system is assumed to be the essential intermediary between the different sensory modes and language. The logical next step is to seek to trace (or at least make plausible suggestions for) the relation between motor organisation and the different aspects of human spoken language: speech production, speech perception, syntactic organisation, word formation, phoneme function, concept formation, and the linking of concepts to words, the storage of words and of the related concepts.
The motor basis of speech production is obvious. The motor theory of speech perception was proposed many years ago (Liberman et al., 1967) and, though it has been said that it suffers from a serious logical weakness, the arguments against the theory seem far from compelling. It seems perfectly conceivable that the link between the auditory analysis of speech-sound and speech production can be established by the linking of speech analysis and speech production in the individual as he monitors his own speech. The analysis and the production by the individual of articulated sound proceed together, and an association could readily be established between the currently operating motor programs for speech and the extracted features of that same speech as the speaker listens to himself. Once such an association had been established, the linkage between speech motor programs and auditory features could be used in reverse to decode heard speech produced by other individuals.
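The self-monitoring account just given can be illustrated with a toy sketch. Everything named below (the `SpeechLink` class, the gesture and feature labels, the `toy_articulate` function) is hypothetical and stands in for processes the text describes only in outline; it is not a model of real articulation or audition.

```python
# Hypothetical mapping from articulatory gestures to auditory features.
TOY_ACOUSTICS = {
    "lips_closed_release": "burst_low",
    "tongue_tip_release": "burst_high",
    "open_vowel": "formants_wide",
}


def toy_articulate(motor_program):
    """Stand-in for the vocal tract: turns a motor program into heard features."""
    return tuple(TOY_ACOUSTICS[gesture] for gesture in motor_program)


class SpeechLink:
    """Associates motor programs with the auditory features they produce."""

    def __init__(self):
        self.features_to_program = {}

    def speak_and_listen(self, motor_program):
        # Production and analysis proceed together: the speaker hears his own
        # output and stores the pairing of features with the running program.
        features = toy_articulate(motor_program)
        self.features_to_program[features] = motor_program
        return features

    def decode(self, heard_features):
        # The stored linkage used in reverse: from heard features back to the
        # motor program that would have produced them (None if never learnt).
        return self.features_to_program.get(heard_features)


link = SpeechLink()
link.speak_and_listen(("lips_closed_release", "open_vowel"))
print(link.decode(("burst_low", "formants_wide")))
```

A lookup table is of course far cruder than any plausible neural mechanism; the sketch shows only that an association formed during self-monitoring is enough, in principle, to support decoding of speech produced by others.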
There is no need to comment further on a possible motor basis for the formation of concepts nor on the motor expression of speech in gesture; simply note the evidence of an extremely precise correlation in timing and content between speech and gesture, emphasised by McNeill and investigated more closely by Kendon (1970 and 1980) and Condon and Sander (1974). The novel enquiry is into the possible relation between the motor system and word-formation, the motor character of individual phonemes, and the extent to which a link can be established between word and concept by the mediation of the motor control system. The impact of the motor organisation on the lexicon seems rather harder to tackle than its impact on syntactic ordering - a subject which has received a good deal of attention since Karl Lashley's treatment of it many years ago (1951).
The arguments for rejecting the general assumption that word-forms are arbitrary and have no intrinsic relation to the meaning of the word have been set out at length earlier (Allott, 1973; 1981; 1983). The proposal then was that each speech-sound element was correlated with a particular body-movement (a position or partial movement of the hand and arm, for example); that, somewhat on the lines suggested by McNeill, the gesture formed as a result of the total movement associated with a word-form (derived from the movement-elements linked to each speech-sound in the particular word) bore some relation to the meaning of the word, perhaps a contour, perhaps an imitation of an action, perhaps a deictic gesture. A systematic classification of the phoneme-set was proposed, related to a systematic classification of possible bodily movements.
In the broader view of the central importance of the motor system suggested in this paper, a radical development is now proposed of the earlier theory. It starts with a restatement of Schmidt's account of the main characteristics of the motor control system (at the level of the motor cortex). The motor system, he proposes, functions in the following way:
Movements are controlled by programs, the essential features of which are that they are generalised, containing an abstract code for the order of events, for phasing (or temporal structure) of the events and for the relative force with which the motor events are to be produced. The same motor program can produce movements in entirely different limbs. At the heart of the schema theory of motor control (of which this is a statement) is the idea of the generalised motor program, as part of the hierarchical nature of motor control (Schmidt, 1982).
The radical new proposal, in the light of this account of the motor system and the emphasis on the importance of the motor system in relation to language, is that there is a limited set of elements in the motor system, that is a limited set of motor subroutines available for forming all types of movement programs, whether the movement is general bodily action, facial expression, gesture or articulation. The central feature is this collection of motor subroutines which can be used to produce an open-ended collection of distinct patterns of movement (in much the same way as speech-elements combined in an infinitely large number of different ways go to produce an open-ended collection of distinct words and word strings). On this view of the essential character of the motor system, the (limited) range of phonemes found in human languages is one manifestation of the limited central set of motor routines. To add plausibility to this account, one can observe that in completely different areas of the organism's functioning, one can find a similar pattern to that proposed. The pattern comprises the availability of a discrete set of elements (a limited number), rules for the combination of the elements, rules for sequential ordering of the elements in strings, and for the reading out or decoding of the programs formed from these elements using different effector devices. As in Schmidt's account of motor programs, any one program, in a different context or channelled through a different effector device, may produce many different types of result. Examples of systems organised on these lines are, as Jakobson pointed out (1973:51-56), to be found in the genetic coding system, probably also in the immune system which can generate a virtually unlimited range of antibodies (see Tonegawa, 1985), as well as in the language system itself. 
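The pattern described here (a small discrete set of elements, sequential ordering, and read-out through different effectors) can be sketched as follows. The subroutine labels, the effector tables and the function names are all hypothetical, chosen only to make the structure concrete; the fields for relative timing and relative force follow Schmidt's description of the generalised motor program as summarised above.

```python
from dataclasses import dataclass

# A deliberately tiny, hypothetical set of central motor subroutines.
SUBROUTINES = ("A", "B", "C")


@dataclass(frozen=True)
class Step:
    subroutine: str        # which element of the limited set
    relative_time: float   # phasing, as a fraction of total duration
    relative_force: float  # force, as a fraction of overall force


# Hypothetical read-out tables: the same element realised by different effectors.
EFFECTORS = {
    "arm": {"A": "raise", "B": "sweep", "C": "drop"},
    "vocal_tract": {"A": "/p/", "B": "/a/", "C": "/t/"},
}


def read_out(program, effector, duration=1.0, force=1.0):
    """Decode one abstract program through a chosen effector."""
    table = EFFECTORS[effector]
    return [
        (table[s.subroutine], s.relative_time * duration, s.relative_force * force)
        for s in program
    ]


def count_programs(max_len):
    """Number of distinct orderings of the elements, up to max_len steps long."""
    return sum(len(SUBROUTINES) ** n for n in range(1, max_len + 1))


program = (Step("A", 0.0, 1.0), Step("B", 0.5, 0.5))
print(read_out(program, "arm", duration=2.0))
print(read_out(program, "vocal_tract"))
print(count_programs(3))
```

Even with three elements and programs of at most three steps there are 39 distinct orderings, before timing and force are varied; the count grows without limit as programs lengthen, which is the sense in which a limited set of elements supports an open-ended collection of patterns.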
A parallel can also be found in the operation of the visual system, based on feature-elements, built up hierarchically into more complex patterns, on the lines proposed by Hubel and Wiesel (1962). The biological economy and parsimony of this kind of system is obvious, since from a limited range of elementary processes or forms, it produces an essentially infinite range of genetic, immunological, visual or linguistic possibilities.
If the motor system is organised in this kind of way, one can start to see how it could function as the intermediary between percept and word, between concept and word, between sound-pattern and movement-pattern (gesture) and so on. The theory also suggests some interesting possibilities for the neural storage of language. If phonemes are closely related to or identical with the central motor subroutines, then words would be stored in very much the same way as skilled action programs, formed from a number of movement-elements. Recall of words could make use of phonemic classification (as Hewes, 1983, has suggested) because the phonemic classification is also an action classification. The speed of recall of words would be very similar to the necessarily high speed of recall of action-programs. Continuous speech would thus be very similar to complex skilled action, such as playing the piano, performing gymnastics or driving a car. Language would be a skill to be learnt like any other skill.