from 'Language Myths', eds. L. Bauer and P. Trudgill, Penguin, 1998, pp. 150-8
“Some Languages are Spoken More Quickly Than Others”
We all make judgements about how quickly someone is speaking, but it is not at all easy to work out what we base these judgements on. Speakers of some languages seem to rattle away at high speed like machine-guns, while other languages sound rather slow and plodding. We find the same when we listen to dialects of our own native language - within English, for example, it is a familiar cliché that cowboys in Westerns (usually set in Texas or neighbouring states) speak slowly, with a drawl. English rural accents of East Anglia and the South-West are also thought of as slow-speaking, while urban accents such as those of London or New York are more often thought of as fast-speaking. However, impressionistic judgements about such things are often unreliable. Ilse Lehiste, who has studied very many languages, wrote “Whether there are differences in the rates of speech of speakers with different linguistic backgrounds is not well known” (Lehiste, 1970, p. 52). More recently, Laver (1995) has written “The analysis of phenomena such as rate is dangerously open to subjective bias ... listeners’ judgements rapidly begin to lose objectivity when the utterance concerned comes either from an unfamiliar accent or (even worse) from an unfamiliar language” (p. 542). Can we establish scientifically that there really are characteristic differences in speaking speed? There are, it seems to me, three possibilities:
(1) some languages really are spoken more rapidly, and some more slowly, than others as a natural result of the way their sounds are produced.
(2) we get the impression that some languages are spoken more quickly than others because of some sort of illusion.
(3) in some societies it is socially acceptable or approved to speak rapidly, and in others slow speaking is preferred.
1. Measures of speaking rate in different languages
We need to look for appropriate ways to measure how quickly someone is talking. We are used to measuring the speed at which someone can type, write or take shorthand dictation in terms of how many words per minute are taken down. Some adjustment usually has to be made to penalise someone for going so quickly that they make a lot of mistakes. In measuring speech, we can do the same thing - we can give someone a passage to read, or a speaking task such as describing what they did on their last holiday, and count how many words they speak in a given time. However, in speech it makes a big difference whether or not we include pauses: if I want to work out how long it took me to cycle somewhere, I might make a note of my times both including and excluding rest stops that I made on the way. In a similar way, most studies of speaking have found it necessary to make two different measurements of the rate at which we produce units of speech: the rate including pauses and hesitations, and the rate excluding such things. The terms usually used are speaking rate and articulation rate respectively (Laver, 1995). Both are highly correlated with perceived speech tempo, according to van Bezooyen (1984). Tauroza and Allison (1990) measured words per minute, syllables per minute and syllables per word in different styles of spoken English and found substantial differences. It is quite possible that some languages make more use of pauses and hesitations than others, and our perception of speed of speaking could be influenced by this (Ofuka, 1996). In comparing different languages, however, there is a more serious problem: some languages (e.g. German, Hungarian) have some very long words while others (e.g. Chinese) have very few words of more than one or two syllables. It has been found that Finnish is faster than English if syllables per second are measured, but slower if words are counted, since Finnish words tend to be longer than English words.
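The difference between the two measures described above can be made concrete with a small calculation. The following is only an illustrative sketch in Python: the function names and the figures (120 syllables spoken in 30 seconds, 6 seconds of which are pauses) are invented for the example and do not come from any of the studies cited.

```python
def speaking_rate(n_units, total_time_s):
    """Units (words, syllables or sounds) per second, pauses included."""
    return n_units / total_time_s


def articulation_rate(n_units, total_time_s, pause_time_s):
    """Units per second over the time actually spent speaking, pauses excluded."""
    return n_units / (total_time_s - pause_time_s)


# Hypothetical recording: 120 syllables in 30 s, of which 6 s is pausing.
n_syllables = 120
total_time = 30.0
pause_time = 6.0

print(speaking_rate(n_syllables, total_time))                  # 4.0 syllables per second
print(articulation_rate(n_syllables, total_time, pause_time))  # 5.0 syllables per second
```

Because pause time is taken out of the denominator, the articulation rate for a recording can never be lower than its speaking rate, and a speaker who pauses a great deal may have a low speaking rate despite a high articulation rate.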
Much depends, of course, on how we define what a word is (Palmer, 1984, pp. 41-8). This inter-language difference could have a serious impact on the accuracy of our measurements, and for this reason many investigators have chosen instead to measure the number of syllables spoken in a given amount of time. This usually results in a syllables-per-second measurement, and at this more detailed level of measurement it is usual to exclude pauses. This is not the end of our problems, however: although counting syllables is likely to be a much more reliable way of comparing different languages for speaking rate than counting words, we should bear in mind that different languages have very different syllable structures. Many of the world’s languages do not use syllables with more than three or four sounds, while others allow syllables of many more sounds. In English, for example, the word ‘strengths’ /streŋθs/ contains seven sounds; the six-syllable English sentence “Smith’s strength crunched twelve strong trucks” (containing 32 sounds) would take much longer to say than the six-syllable Japanese phrase “kakashi to risu”, which contains 12 sounds. So if a language with a relatively simple syllable structure like Japanese is able to fit more syllables into a second than a language with a complex syllable structure such as English or Polish, it will probably sound faster as a result. Dauer (personal communication) has found that Greek and Italian are spoken more rapidly than English in terms of syllables per second, but this difference disappears when sounds per second are counted. It seems, then, that we should compare languages’ speaking rate by measuring the number of sounds produced per second, rather than the number of syllables.
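The effect of syllable structure can be illustrated with the two six-syllable examples given above. The sketch below, in Python, assumes purely for illustration that both are spoken at the same rate of 12 sounds per second; that figure and the function name are assumptions made for the example, not measurements.

```python
def syllables_per_second(n_syllables, n_sounds, sounds_per_second):
    """Syllable rate implied by a given sound rate for one utterance."""
    return n_syllables * sounds_per_second / n_sounds


ASSUMED_SOUND_RATE = 12.0  # sounds per second; hypothetical, same for both speakers

# Syllable and sound counts are those quoted in the text.
english = syllables_per_second(6, 32, ASSUMED_SOUND_RATE)   # "Smith's strength crunched ..."
japanese = syllables_per_second(6, 12, ASSUMED_SOUND_RATE)  # "kakashi to risu"

print(english)   # 2.25 syllables per second
print(japanese)  # 6.0 syllables per second
```

On these assumptions the two speakers produce sounds at exactly the same rate, yet the Japanese phrase comes out at well over twice the syllable rate of the English sentence, which is the kind of discrepancy that a syllables-per-second measure would report as a difference in speed.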
Within a particular language, it is clear that speech rate as measured in sounds per second does vary quite widely: Fonagy and Magdics (1960) measured different speaking styles and found rates varying from 9.4 sounds (average) per second for poetry reading to 13.83 per second for sports commentary. But this still leaves us with a problem. The faster we speak, the more sounds we leave out. Speaking slowly, I might pronounce the sentence “She looked particularly interesting” as /ʃi lʊkt pətɪkjələli ɪntərəstɪŋ/, which contains 27 sounds, but speaking rapidly I might say /ʃi lʊk pətɪkli ɪntrstɪŋ/, which only contains 20 sounds. In theory, then, it could happen that in speaking quickly I might produce no more sounds per second than when speaking slowly. In order to get a meaningful measure, it would be necessary to count not the sounds actually observable in the physical signal, but the “underlying phonemes” that I would have produced in careful speech.
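The point about underlying phonemes can be shown with the sound counts just given (27 in the careful version, 20 in the rapid one). In the following Python sketch the two durations are hypothetical values, chosen so that the observable sound rate comes out the same in both versions; only the sound counts are taken from the text.

```python
def sounds_per_second(n_sounds, duration_s):
    """Rate in sounds per second for one utterance."""
    return n_sounds / duration_s


UNDERLYING = 27    # sounds in the careful pronunciation
SURFACE_FAST = 20  # sounds actually produced in the rapid pronunciation

slow_duration = 3.375  # seconds; hypothetical
fast_duration = 2.5    # seconds; hypothetical

slow_rate = sounds_per_second(UNDERLYING, slow_duration)
fast_surface_rate = sounds_per_second(SURFACE_FAST, fast_duration)
fast_underlying_rate = sounds_per_second(UNDERLYING, fast_duration)

print(slow_rate)             # 8.0
print(fast_surface_rate)     # 8.0
print(fast_underlying_rate)  # 10.8
```

Counting only the sounds present in the signal, the rapid version appears no faster than the slow one (8 sounds per second in each case); counting the underlying phonemes gives 10.8 per second for the rapid version, which reflects the fact that the same sentence was completed in less time.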
Osser and Peng (1964) measured sounds per second for speakers of Japanese and of American English, and found no significant difference between them. Den Os (1988) compared Dutch and Italian and found no significant difference in terms of syllables per second, though Italian was somewhat slower in terms of sounds per second. In a review of measurements of a number of different languages, Dankovicova (1994) quotes average figures from various studies: for German, 5.55 and 5.7 syl/sec; for French, 5.29, 5.2 and 5.7 syl/sec; for Dutch, 6.1 syl/sec; and for Italian, 6.4 syl/sec. These are all for “normal” speaking rate - in different circumstances, of course, rates can vary. I have a recording of a friend who left a message on my telephone answering machine and kept up an average speed of over 8 syl/sec over a period of about 20 seconds. Arnfield et al. (1995) showed rates in English varying between 3.3 and 5.9 syl/sec. But overall, it seems that, on the evidence available at present, there is no real difference between different languages in terms of sounds per second in normal speaking styles.
How might we pursue this question further? One possibility would be to make use of some of the carefully assembled speech databases stored on computer which have been phonetically labelled. Databases such as EUROM-1 (Chan et al., 1995), which comprises speech of six Western European languages, and BABEL (Roach et al., 1996), containing five languages of Eastern Europe, will, when complete and available to researchers, give us valuable new material. But the expectation is that these collections of normal, unemotional monologues will give us the same answers as the other surveys - we will find no difference between languages in terms of sounds per second or syllables per second.
2. Speaking rate as an illusion
Our impression of a language being spoken faster or slower may depend to some extent on its characteristic rhythm. More precisely, it is said that we are influenced by whether a language is perceived to be stress-timed or syllable-timed. The distinction was given a detailed exposition by Abercrombie (see for example Abercrombie, 1967), though the idea had been proposed long before by Pike (1945). Pike refers (p.37) to the “pattering” effect of Spanish speakers and their “sharp-cut syllable-by-syllable pronunciation”. Most people feel intuitively that there is a genuine rhythmical difference between languages such as English (classed as stress-timed) and French or Spanish (classed as syllable-timed), and it usually seems that syllable-timed speech sounds faster than stress-timed to speakers of stress-timed languages. So Spanish, French and Italian sound fast to English speakers, but Russian and Arabic don’t. The theory suggests that in syllable-timed languages all syllables tend to be given equal amounts of time, while in stress-timed languages more time is given to stressed syllables and less to unstressed. In addition, it is said that stressed syllables occur at regular intervals of time in stress-timed languages. Unfortunately, many studies based on detailed measurement of time-intervals in different languages (e.g. Roach, 1982; Dauer, 1983) have been unable to confirm these claims, with the result that we are forced to retreat to a weaker claim: that some languages sound stress-timed and others sound syllable-timed. We may be forced to accept something similar in answer to our present question - perhaps languages and dialects just sound faster or slower, without any physically measurable difference. The apparent speed of some languages might simply be an illusion.
One of the questions raised by this possibility is the degree to which listeners can detect differences of speaking rate in their own language and in other languages. If it turns out that we are no good at detecting speed differences in different languages, we will have to conclude that our judgements of speaking rate are unreliable. Vaane (1982) carried out a study using recordings of Dutch (the subjects’ native language), English, French, Spanish and Moroccan Arabic; these were spoken at three different rates. Two groups of listeners, one phonetically trained and the other untrained, had to try to judge the speed of utterance. Vaane tested the hypothesis that we will be less adept at judging the speed of a language we do not know, and an unknown language is likely to sound faster than our own language (presumably because it “sounds harder to do”). Her results suggest that in fact both trained and untrained listeners are quite accurate in judging the rate of speaking for their own language and also for languages with which they are unfamiliar, a finding which compares interestingly with the view quoted from Laver (1995) above. From this we can conclude that the judgements are not based on linguistic knowledge (such as we use in identifying words). We must be using one or more phonetic characteristics of the speech that we are able to detect whether or not we know the language being spoken.
Useful though the above findings are, they do not yet bring us an answer to the question of whether some languages are spoken more rapidly than others (when situational and personal factors have been taken into account). Vaane does quote mean syllables-per-second rates for the test passages in her experiment, but does not tell us if the inter-language differences are statistically significant. Interestingly, Dutch comes out with the highest speaking rate in all three conditions, though this is not a language that most English people would immediately think of as being rapidly spoken.
3. Social and personal factors and speaking rate
Social factors influence the speakers of a language in different ways: a number of anecdotal sources suggest that in some societies it is regarded as acceptable or approved to speak rapidly, while in others slow speech is preferred. There is almost certainly an interaction with gender here, with slow speech usually being preferred for males. This would mean that, while at normal speaking speed the sounds-per-second rate for all languages may be effectively the same, speakers of some languages characteristically use higher or lower speaking rates than speakers of other languages in particular social situations. In a carefully controlled study, Kowal et al. (1983) looked at two very different types of speech (storytelling and taking part in interviews) in English, Finnish, French, German and Spanish. They found significant differences between the two styles of speech (both in terms of the amount of pausing and of the speaking rate) but no significant difference between the languages. They concluded that the influence of the language is negligible compared with the influence of the style of speech. Similarly, Barik (1977) showed that differences in tempo between English and French were due to the style of speech, not to the language. Certainly we are all capable of speaking faster and slower when we want to. There are variations in speed associated with the situation in which the speech is being produced - we speak more rapidly if we are in a hurry, or saying something urgent, or trying not to be interrupted in a conversation. We tend to speak more slowly when we are tired or bored. The emotional state of the speaker at the time of speaking is clearly influential. There seems also to be a personal factor - some people are naturally fast talkers, while others habitually speak slowly, within the same language and dialect and in the same situation.
Research has shown that our opinion of speakers is influenced by their speaking rate: Giles (1992) reports that “a positive linear relationship has repeatedly been found between speech rate and perceived competence”, and Stephen Cowley (personal communication) says that in Zulu society, slow speech tempo is a sign of respect and sincerity. Yet another social factor is the amount of temporal variability, where the alternation between speaking rapidly and speaking slowly may itself have considerable communicative value - this has been pointed out by Cowley (1994), who has found very wide tempo variation from phrase to phrase among Italian speakers in conversational data.
While this idea of social determination of speed seems the most plausible explanation, the only way we are going to be able to test it is by much more research across a wide variety of languages and social situations. Let us hope that this research will be carried out.
My thanks to Bill Barry, Stephen Cowley, Jana Dankovicova and Marianne Jessen for their advice and discussion.
Abercrombie, D. (1967) Elements of General Phonetics, Edinburgh University Press.
Arnfield, S., Roach, P., Setter, J., Greasley, P. and Horton, D. (1995) ‘Emotional stress and speech tempo variation’, Proceedings of the ESCA/NATO Workshop on Speech Under Stress, Lisbon, pp. 13-15.
Barik, H.C. (1977) ‘Cross-linguistic study of temporal characteristics of different types of speech materials’, Language and Speech, 20, 116-126.
Bezooyen, R. van (1984) Characteristics and Recognizability of Vocal Expressions of Emotion, Dordrecht: Foris.
Chan, D. and others (1995) ‘EUROM: a spoken language resource for the EU’ in Proceedings of Eurospeech 95, Madrid, pp.867-870.
Cowley, S. (1994) ‘Conversational functions of rhythmical patterning’, Language and Communication, vol.14.4, pp. 353-376.
Dankovicova, J. (1994) ‘Variability in articulation rate in spontaneous Czech speech’, unpublished M.Phil thesis, University of Oxford.
Dauer, R.M. (1983) ‘Stress-timing and syllable-timing re-analysed’, Journal of Phonetics, vol.11, pp. 51-62.
den Os, E.A. (1988) Rhythm and Tempo of Dutch and Italian, Utrecht: Drukkerij Elinkwijk.
Fonagy, I. and Magdics, K. (1960) ‘Speed of utterance in phrases of different lengths’, Language and Speech 4, 179-92.
Giles, H. (1992) ‘Speech tempo’ in W.Bright (ed.) The Oxford International Encyclopedia of Linguistics, Oxford University Press.
Kowal, S., Wiese, R. and O’Connell, D. (1983) ‘The use of time in storytelling’, Language and Speech, vol. 26.4, pp. 377-392.
Laver, J. (1995) Principles of Phonetics, Cambridge University Press.
Lehiste, I. (1970) Suprasegmentals, MIT.
Ofuka, E. (1996) Acoustic and Perceptual Analyses of Politeness in Japanese Speech, unpublished PhD thesis, University of Leeds.
Osser, H. and Peng, F. (1964) ‘A cross-cultural study of speech rate’, Language and Speech, 7, 120-5.
Palmer, F.R. (1984) Grammar, (Second Edition), Penguin.
Pike, K.L. (1945) The Intonation of American English, University of Michigan Press.
Roach, P. (1982) ‘On the distinction between “stress-timed” and “syllable-timed” languages’, in Crystal, D. (ed.) Linguistic Controversies, Edward Arnold.
Roach, P., Arnfield, S. and Hallum, E. (1996) ‘BABEL: A multi-language database’, Proceedings of the Australian International Conference on Speech Science and Technology (SST-96), pp. 351-4.
Tauroza, S. and Allison, D. (1990) ‘Speech rates in British English’, Applied Linguistics, 11, pp.90-105.
Vaane, E. (1982) ‘Subjective Estimation of Speech Rate’, Phonetica vol.39, pp. 136-149.