A ramble through vowels and consonants

It’s probably unfashionable to say this, and it’s certainly a sign of a thoroughly colonized mind, but English is my favorite language. There are many reasons for this: the massive vocabulary, the puns, the double-streamed Germanic-Romance roots (so that ‘mistake’, ‘wood’ and ‘hue’ mean and evoke differently from ‘error’, ‘forest’ and ‘color’). But a large part of my affection for English lies in the sounds of the language.

This is a complicated thing to say about your first language. It’s much easier to know what a language sounds like when you don’t speak it, before comprehension has made the language transparent. It’s hard to reconstruct the way a language sounded before you learned it, and this is much more so if you grew up speaking it. Still, while some impressions are only available to a non-native speaker and some are irretrievably lost, others never leave, or even wait to be discovered later on.

To me, the most striking thing about English is its diversity of vowels, something I only noticed after many years of speaking the language. English, in many dialects, has about 15 vowels (not counting diphthongs). Listen to the vowels through these words: a, kit, dress, trap, lot, strut, foot, bath, nurse, fleece, thought, goose, goat, north[1]. There are languages that have more (Germanic ones tend to be vowel rich), but there aren’t many of them, and none that I know well enough to frame a sentence in. And compare this vowel list to the relative paucity of vowels in so many other languages. Hindi really has only about 9 or 10 vowels; Bengali, which has lost several long-short distinctions, has slightly fewer (though lots of diphthongs). Some languages (including these two) do include extra vowels formed by nasalizing existing ones; these nasalized vowels often sound lovely, but feel very similar to their base vowels. It’s more a flourish than a genuinely new creation. Japanese and Spanish have about 4 or 5 apiece, and I’m told that Mandarin and Arabic have about 6.

English, then, is capable of exceptionally rich assonance and exuberant plays on vowel sound[2]. Listen to the interplay between the ‘ai’ sound in ‘light’, ‘shines’, ‘tides’ and ‘file’ with the ‘o’ in ‘no’, ‘broken’, ‘ghosts’, ‘glow’ and ‘bones’, and notice the diverse vowel background they’re embedded in:

“Light breaks where no sun shines;

Where no sea runs, the waters of the heart

Push in their tides;

And, broken ghosts with glow-worms in their heads,

The things of light

File through the flesh where no flesh decks the bones.”

(From “Light breaks where no sun shines” – Dylan Thomas)

Or, for a more recent example, here’s the beginning of Eminem’s “Drug Ballad”. Listen to the long ‘a’ in ‘Mark’, ‘party’ and ‘start’ play with the short repeated tight ‘i’ you hear in ‘This is’, ‘mix’, and ‘kicks in’[3]:

“Back when Mark Wahlberg was Marky Mark

This is how we used to make the party start

We used to mix hen’ with Bacardi Dark

And when it kicks in you can hardly talk”

Just to balance things out, though, there are a number of sounds that English lacks. It’s a well-known fact that English has no sound for bilingual[4]. It also has few consonants, and they’re randomly arranged. If you’ve studied a South Asian alphabet (with some exceptions) or even a few of the South-East Asian ones, you’ve probably noticed that there are a large number of consonants, that the alphabet arranges them in rows, and that the sounds on a row are similar. Of course this isn’t accidental, but if you start to pay attention to the rows you’ll probably be surprised by how logically the alphabet is organized (as I was, when I began paying more attention to the sounds of words).

I’m going to use Hindi to illustrate this organization. There are several aspects you control when you make a consonant, and the ordering of the consonants in the Devanagari alphabet reflects a number of these.[5]

First, a little background. Consonants are produced by impeding the flow of air through your vocal tract, and there are a few different ways to do this. One is to constrict some part of your vocal tract completely and then release in a little puff of air. This is what you do when you say ‘t’, ‘d’, ‘p’ or ‘b’. These consonants are called stops or plosives (a lovely word – evocative and friendly). Say them to yourself a few times and hear the stop and release. You constrict at different places for these sounds (with tongue behind the upper teeth for ‘t’ and ‘d’; with both lips for ‘p’ and ‘b’) but the mechanism is the same. If you constrict completely but also allow air to escape through your nose while you’re doing this you’ll make a nasal stop. These are sounds like ‘n’, ‘m’ or the ‘ng’ in ‘sing’.

You can also make a consonant by constricting enough to produce turbulent airflow but not enough to block the flow completely. When you make sounds like ‘s’, ‘z’, ‘sh’, ‘v’ and ‘f’ you’re forcing air through a narrow channel in your mouth. These are the fricatives, a word which sounds more painful than it should.[6] If you constrict even less, so that you’re halfway between a fricative and a vowel, you’ll get an approximant. These are sounds like ‘l’, ‘r’, ‘y’ and ‘w’. There are a few more ways of forming consonants, but these are the main ones.

What’s the difference between ‘p’ and ‘b’ or between ‘t’ and ‘d’ or ‘s’ and ‘z’? If you pay attention, you’ll notice that your mouth does about the same thing for each pair (if you don’t immediately notice this, shut your eyes and pay attention to what your lips and tongue do when you make each sound pair), so this can’t be the difference. Now say ‘ssssss’ and ‘zzzzzz’ to yourself (perhaps not in a public place), with two fingers on your throat. You should feel your vocal cords vibrating for ‘zzzzz’, but not for ‘sssss’. This is the difference between a voiced and unvoiced consonant. Now return to ‘t’ and ‘d’ or ‘p’ and ‘b’ and see if you can feel the same.


Now we can return to the alphabet. There’s a picture alongside to help you out. The first three rows are vowels, and should be ignored. The next five rows are the stops. In all of these, you block the airflow completely and then release.

The first row sounds like ‘k’, ‘kh’, ‘g’, ‘gh’ and ‘ng’ (as in ‘sing’). For each of these you constrict with the back of your mouth and then release. Your vocal cords vibrate to make a ‘k’ into a (hard) ‘g’, and you push air through your nose to turn these into a ‘ng’. The ‘kh’ and ‘gh’ are aspirated versions of ‘k’ and ‘g’. Aspiration is the slight breathy puff of air that distinguishes the ‘p’ in ‘par’ from the one in ‘spar’ (at least in some accents); don’t worry if you can’t immediately hear it.

The same pattern is repeated on rows two through five, with the constriction moving forward through the mouth. The order for each row is unaspirated-unvoiced, aspirated-unvoiced, unaspirated-voiced, aspirated-voiced and nasal. For the second row it’s ‘ch’ (as in ‘chair’), ‘chh’, ‘j’ (as in ‘jay’), ‘jh’ and ‘ny’ (the last being rather like the Spanish ñ). Here your tongue squishes up against the hard palate[7] and then releases. For the third row, you’ll need to create the obstruction by curling your tongue back against the roof of your mouth. These sound like hard versions of the English ‘t’, ‘d’ and ‘n’. Again, the second and fourth sounds are aspirated versions of the first and third. Next, you have a similar row, but with soft “t’s” and “d’s” (think Spanish or French if that helps) and ‘n’, formed with the tongue pressing against the back of the teeth before releasing. And finally you have the bilabials: ‘p’, ‘ph’, ‘b’, ‘bh’ and ‘m’. Press your lips together to form the stoppage and then release, again aspirating, voicing and nasalizing to make different sounds. Notice that you’ve gradually moved the obstruction from the back of your mouth (‘k’) to the front (‘p’).
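If it helps to see the grid laid out, here is a rough sketch in Python of the five rows of stops described above. The place labels (velar, palatal, retroflex, dental, bilabial) are the standard phonetic names for the five constriction points; the names `PLACES`, `ROWS`, `Stop` and `describe` are my own inventions for illustration, and I'm assuming the usual fact that the nasals are voiced. Treat it as a toy model of the layout, not a full account of Hindi phonology.

```python
from typing import NamedTuple

# Rows run from the back of the mouth to the front, as in the alphabet.
PLACES = ["velar", "palatal", "retroflex", "dental", "bilabial"]

# Each row: unaspirated-unvoiced, aspirated-unvoiced,
# unaspirated-voiced, aspirated-voiced, nasal.
ROWS = ["कखगघङ", "चछजझञ", "टठडढण", "तथदधन", "पफबभम"]


class Stop(NamedTuple):
    place: str
    voiced: bool
    aspirated: bool
    nasal: bool


def describe(letter: str) -> Stop:
    """Read a stop's features off its position in the grid."""
    for place, row in zip(PLACES, ROWS):
        col = row.find(letter)
        if col == -1:
            continue
        if col == 4:
            # Last column: the nasal (assumed voiced, unaspirated).
            return Stop(place, voiced=True, aspirated=False, nasal=True)
        # Columns 0-1 are unvoiced, 2-3 voiced; odd columns are aspirated.
        return Stop(place, voiced=col >= 2, aspirated=col % 2 == 1, nasal=False)
    raise ValueError(f"{letter!r} is not one of the 25 stops in the grid")


if __name__ == "__main__":
    for letter in "कघटम":
        print(letter, describe(letter))
```

The point of the sketch is that nothing about a letter needs to be stored except where it sits: the row gives the place of constriction, and the column gives voicing, aspiration and nasality.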

Now that you’re done with the stops, you have a couple of less well-organized rows where the approximants and fricatives live (the absence of severe constriction gives them an anarchic bent). After the approximants ‘y’, ‘r’ and ‘l’, you have the fricatives ‘v’, ‘sh’, ‘sh’ (with tongue curled back), ‘s’ and ‘h’, and then you’re done.

This organization of consonants is common to most of the languages of India, Nepal, Bangladesh and Sri Lanka and, with some modifications, to Khmer, Thai and Lao, and its spread was particularly tied to the diffusion of Buddhism. Its influences reach into Central Asia; then all the way to Japanese kana, though by this point the sound organization is almost lost; and perhaps even to Korean Hangul, where the shapes of the consonants indicate the mouth shape used to produce them. But that’s a story for another time.

So what does poetry in a consonant-rich language look like? Does Hindi show complicated flights of consonance to rival English’s diverse assonance? I actually don’t know. The last time I read Hindi poetry was in high school, and we were too busy confronting poems as puzzles to be deciphered to spend time getting to know the sounds. The same was true of high school English, but I kept reading English poetry and even tried writing some, and it was this that finally led me to discover the sounds.

[1] The illustrative words are taken from John Wells’s lexical sets.

[2] Before you complain: I’m perfectly aware that other languages have interesting sound-landscapes, and can do all sorts of poetic things that I haven’t dreamed about.

[3] The whole piece is a wonderful play of off-rhymes and assonance

[4] This is a joke

[5] Vowels can be classified along a few major dimensions, but there are fewer possible vowels, which makes them feel more unique and makes the classification less interesting.

[6] The ‘th’ in ‘thin’ and the ‘th’ in ‘that’ spoken with a North American or English accent are also fricatives, and rather unusual ones; many English speakers from other places cannot say these and substitute other sounds.

[7] Some of the sounds on this row have a fricative component as well.