Thursday, November 12, 2015

The Art of Communicating - Why is English the First Language?

Juan Osborne doesn't just create portraits using type, the Spanish artist
only chooses words relevant to that particular subject

If we were really serious about communicating, English would be the last language we would choose as the international language, yet that is exactly what we are trying to make it. What is it about English, other than the fact that we learned it to some degree growing up?
Most people consider the root language of all languages to be Latin or Ancient Greek, yet if these are the basis for all languages, how odd that of all languages they have the fewest words.  Here are some of the popular languages and the number of words in each language.
Language          Number of words
Latin                       4,000
Ancient Greek              10,000
Hebrew                     45,000
Spanish                    83,431
French                    100,000
Russian                   150,000
Arabic                    200,000
English                 1,025,109

Now what does that tell us?
I think the Ancient Greeks were the most advanced of all cultures as they laid the foundation for philosophy, religion, mathematics, science, music, medicine, you name it, they did it.  Yet between Latin, which they used, and Ancient Greek, it only took them 14,000 words to lay the foundation for all future languages.
John Bagnall is one of the most viewed writers on the Internet, and he says the following about the number of words used by English-speaking people.
Britain’s Guardian newspaper, in 1986, estimated the size of the average person’s vocabulary as developing from roughly 300 words at two years old, through 5,000 words at five years old, to some 12,000 words at the age of 12 [1].

The Guardian’s research suggested that it stays at around this number of words for the remainder of most (average) people’s lives—adding that this is roughly the same number of words as those drawn on by a popular newspaper in the course of producing its daily editions—while a graduate might have a vocabulary nearly twice as large (23,000 words). Shakespeare, according to Robert McCrum et al (whose estimate of the average vocabulary is 15,000 words), had one of the largest recorded vocabularies of any English writer at around 30,000 words[2].

In point of fact, it’s all but impossible to be sure. Not simply because of the difficulty of estimating the number of words any given individual does use and understand, but because of the difficulty of defining what does or does not represent a discrete “word”. For example, is “hair-dryer” one word or two (“hair dryer”)? Do you include abbreviations and acronyms such as “a.m.” and “p.m.”, “’flu” and “BBC”? Is “haven’t” to be considered the same as “have not”, or is it a separate word? What about proper names, brand names? Do you count slang and regional dialect words? Texting and other online conventions? Different grammatical tenses of the same verb? Are popular idioms and phrases (“see you soon”, “crash out”, “lol”) to be counted singularly? And so on.

There's also a distinction to be drawn between the words that people use (their active vocabulary) and those they never use of their own volition but understand should they encounter the word when used by others (their passive vocabulary). Clearly, a person's passive vocabulary is (much) larger than their active one.

If you want to investigate the size of your own vocabulary (active and passive), David Crystal’s invaluable The English Language (2nd ed, Penguin 2002) describes a method you can use.

[1] The Guardian, 12 August 1986, cited in David Crystal, The English Language, 2002, p46
[2] Robert McCrum et al., The Story of English, 1986, p102

Vocabulary size

Lexical facts

May 29th 2013, 16:02 by R.L.G. | NEW YORK

SEVERAL years ago we mentioned here on the blog. Not long ago, the site reached its two millionth test result, and so the researchers have put together some data:

  • Most adult native test-takers range from 20,000–35,000 words
  • Average native test-takers of age 8 already know 10,000 words
  • Average native test-takers of age 4 already know 5,000 words
  • Adult native test-takers learn almost 1 new word a day until middle age
  • Adult test-taker vocabulary growth basically stops at middle age
  • The most common vocabulary size for foreign test-takers is 4,500 words
  • Foreign test-takers tend to reach over 10,000 words by living abroad
  • Foreign test-takers learn 2.5 new words a day while living in an English-speaking country

In a separate post, though, comes a surprising fact: the reading of fiction specifically is as important as reading generally.  People who read "lots" and fiction "lots" outscore those who read "lots" but fiction only "somewhat" or "not much". This is because a wider range of vocabulary is typically used in fiction than in non-fiction writing. 

And if you're wondering "how accurate can this short test be?" the details of the methodology are quite interesting and clearly explained. So if you haven't tested yourself, do.

Everyone ignored my remark that "bragging in the comments is naff" last time, so go ahead and brag away.

BBC News Magazine 28 April 2009

The words in the mental cupboard

By Caroline Gall

Children are to be offered lessons on how to speak English formally amid fears that many are suffering from "word poverty", it has been reported. But how many words do people tend to know and use?

Do people know more words than they actually use? And is having a large vocabulary something you learn or have a natural ability for?

These are burning issues in the worlds of linguistics and education. On Monday it was reported that children in England will have lessons in formal language amid fears that some are suffering from stunted vocabularies.
US company Global Language Monitor (GLM) believes that the one millionth word will be added to the English language in mid-June.

While there is agreement that a word becomes a word when it is used by one person and understood by another, grammarians and lexicographers stand divided when deciding which to include when calculating a total.

Obamamania, bankster and bloggerati are just some of the "brand new words" GLM has been tracking.

The operation, based in Austin, Texas, says 25,000 citations in the worldwide media, social networking sites and elsewhere are its benchmark for a word to be included in its total.

They estimate a new word is created every 98 minutes.

The English language is likely to contain the most words of all languages, according to the Oxford English Dictionary, and estimates for the number of words range from one to two million.

Agreement will probably never be reached over whether or not to include words used in botany or chemistry, let alone slang, dialects and influences from foreign shores.

Some areas GLM does not include are product names and chemicals and Paul Payack, president and chief word analyst, says the 600,000 species of fungus are not in.

So, can a precise word total ever be known? No, says Professor David Crystal, known chiefly for his research in English language studies and author of around 100 books on the subject.

"It's like asking how many stars are there in the sky. It's impossible to answer," he said.

An easier question to answer, he maintains, is the size of the average person's vocabulary.

He suggests taking a sample of about 20 or 30 pages from a medium-sized dictionary, one which contains about 100,000 entries or 1,000 to 1,500 pages.

Tick off the ones you know and count them. Then multiply that by the number of pages and you will discover how many words you know. Most people vastly underestimate their total.
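The sampling method described above can be sketched in a few lines of Python. This is only an illustration under one reading of the instructions: the known entries are averaged over the sampled pages and that average is scaled up to the whole dictionary. The function name `estimate_vocabulary` and the sample counts are hypothetical, not taken from Prof Crystal.

```python
def estimate_vocabulary(known_per_page, total_pages):
    """Estimate how many dictionary entries a person knows.

    known_per_page: one count per sampled page of the entries ticked as known.
    total_pages: number of pages in the whole dictionary.
    """
    if not known_per_page:
        raise ValueError("need at least one sampled page")
    # Average number of known entries on a sampled page...
    average_known = sum(known_per_page) / len(known_per_page)
    # ...scaled up to every page in the dictionary.
    return round(average_known * total_pages)


# Hypothetical sample: 20 pages ticked off in a 1,200-page dictionary.
sample_counts = [40, 45, 38, 42, 41] * 4
estimate = estimate_vocabulary(sample_counts, total_pages=1200)
print(estimate)
```

Spreading the sampled pages across different letters of the alphabet should make the average less sensitive to any one unusual stretch of the dictionary.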

"Most people know half the words - about 50,000 - easily. A reasonably educated person about 75,000 and a really cool, smart person well, maybe all of them but that is rather unusual.

"An ordinary person, one who has not been to university say, would know about 35,000 quite easily."

The formula can be used to calculate the number of words a person uses, but a person's active language will always be less than their passive, the difference being about a third.
Prof Crystal says exposure to reading will obviously expand a person's vocabulary but the level of a person's education does not necessarily decide things.

"A person with a poor education perhaps may not be able to read or read much, but they will know words and may have a very detailed vocabulary about pop songs or motorbikes.

"I've met children that you could class as having a poor education and they knew hundreds of words about skateboards that you won't find in a dictionary.

"We must avoid cultural elitism."

His research led him to ask people how many different words appeared on average in a copy of The Sun newspaper. All respondents came back with a low figure.

The Sun v The Bible 

After counting the words in a paper picked at random, he found there to be about 8,000.

"That's the same as the King James version of the Bible.

"It is not very varied and names don't count but you see, people see headlines like 'Gotcha!' and make a judgment."

But surely, the perfect outlet for having a vast vocabulary is Scrabble.

Allan Simmons, crowned UK champion last year, says he can recognise around 100,000 of the 160,000 words of nine letters or under included on the Scrabble list.

"I've always liked words, their meanings and dictionaries. Patterns of words are interesting - I see it as an art form.

"I have a good memory and a lot of words I learn just for the game although that is a bit artificial."

And while the language grows, words will fall out of use by being replaced.

Experts predict words like "stab" or "throw" have a language lifetime of about 800 to 1,000 years, whereas the words "three", "five", "I" and "who" may last anything up to 20,000 years.

So as new words are created at such a pace will we ever keep track? Worry not, says Prof Crystal.

"Of course words become obsolete when they are not used in everyday speech. Look at Shakespeare's plays. But words never, ever get forgotten."

Facts regarding English in England

Some children start school knowing 6,000 words, others just 500.


American Ammon Shea spent a year reading the Oxford English Dictionary

He digested 20 volumes, 21,730 pages and 59 million words

'I'm not against big words per se... but I'm opposed to using them for their own sake,' he said

Susanne M. Glasscock School of Continuing Studies

Spring 2003

The Thirty Million Word Gap
A summary from "The Early Catastrophe: The 30 Million Word Gap by Age 3" by University of Kansas researchers Betty Hart and Todd R. Risley. (2003). American Educator. Spring: 4-9, which was excerpted with permission from B. Hart and T.R. Risley (1995). Meaningful Differences in the Everyday Experiences of Young American Children. Baltimore, MD: Brookes Publishing.

In this groundbreaking study, University of Kansas researchers Betty Hart and Todd Risley entered the homes of 42 families from various socio-economic backgrounds to assess the ways in which daily exchanges between a parent and child shape language and vocabulary development. Their findings were unprecedented, revealing extraordinary disparities in both the sheer number of words spoken and the types of messages conveyed. After four years, these differences in parent-child interactions produced significant discrepancies not only in children’s knowledge but also in their skills and experiences, with children from high-income families being exposed to 30 million more words than children from families on welfare. Follow-up studies showed that these differences in language and interaction experiences have lasting effects on a child’s performance later in life.

The Early Catastrophe
Betty Hart & Todd R. Risley

Betty Hart and Todd Risley were at the forefront of educational research during the 1960s War on Poverty. Frustrated after seeing their high-quality early intervention program, aimed at expanding language skills, prove unsuccessful in the long term, they decided to shift their focus. If the proper measures were being taken in the classroom, the only logical conclusion was to take a deeper look at the home. What difference does home life make in a child’s ability to communicate? Why are the alarming vocabulary gaps between high school students from low- and high-income environments seemingly foreshadowed by their performance in preschool? Hart and Risley believed that the home housed some of these answers.

Experimental Method:

Hart and Risley recruited 42 families to participate in the study including 13 high-income families, 10 families of middle socio-economic status, 13 of low socio-economic status, and 6 families who were on welfare. Monthly hour-long observations of each family were conducted from the time the child was seven months until age three. Gender and race were also balanced within the sample.


The results of the study were far more severe than anyone could have anticipated. Observers found that 86% to 98% of the words used by each child by the age of three were derived from their parents’ vocabularies. Furthermore, not only were the words they used nearly identical, but also the average number of words utilized, the duration of their conversations, and the speech patterns were all strikingly similar to those of their caregivers.

After establishing these patterns of learning through imitation, the researchers next analyzed the content of each conversation to gain a better understanding of each child’s experience. They found that the sheer number of words heard varied greatly along socio-economic lines. On average, children from families on welfare were provided half as much experience as children from working-class families, and less than a third of the experience given to children from high-income families. In other words, children from families on welfare heard about 616 words per hour, while those from working-class families heard around 1,251 words per hour, and those from professional families heard roughly 2,153 words per hour. Thus, children from better financial circumstances had far more language exposure to draw from.

In addition to looking at the number of words exchanged, the researchers also looked at what was being said within these conversations. They found that higher-income families gave their children far more words of praise than low-income families did. Conversely, children from low-income families received far more discouragement than their peers from higher-income families. Children from families with professional backgrounds experienced a ratio of six encouragements for every discouragement. For children from working-class families, this ratio was two encouragements to one discouragement. Finally, children from families on welfare received on average two discouragements for every encouragement.

To ensure that these findings had long-term implications, 29 of the 42 families were recruited for a follow-up study when the children were in third grade. Researchers found that measures of accomplishment at age three were highly indicative of performance at the ages of nine and ten on various vocabulary, language development, and reading comprehension measures. Thus, the foundation built at age three had a great bearing on their progress many years to come.


Within a child’s early life, the caregiver is responsible for most, if not all, social stimulation and consequently for language and communication development. As a result, how parents interact with their children is of great consequence, since it lays a critical foundation that shapes how the children process information for many years afterward. This study displays a clear correlation between the conversation styles of parents and the resulting speech of their children. This connection shows just how consequential the results of this study may be.

The finding that children living in poverty hear fewer than a third of the words heard by children from higher-income families has significant implications in the long run. When extrapolated to the words heard by a child within the first four years of their life these results reveal a 30 million word difference. That is, a child from a high-income family will experience 30 million more words within the first four years of life than a child from a low-income family. This gap does nothing but grow as the years progress, ensuring slow growth for children who are economically disadvantaged and accelerated growth for those from more privileged backgrounds.
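The extrapolation behind the 30 million figure can be reproduced roughly from the per-hour counts reported earlier in this summary. The sketch below is illustrative only: the summary does not say how many hours of exposure were assumed, so the figure of 100 waking hours per week is an assumption introduced here, and the names are hypothetical.

```python
# Words heard per hour, as reported in this summary.
WORDS_PER_HOUR = {"welfare": 616, "working class": 1251, "professional": 2153}

# ASSUMPTION (not stated in this summary): about 100 waking hours per week.
ASSUMED_WAKING_HOURS_PER_WEEK = 100
WEEKS_PER_YEAR = 52
YEARS = 4


def words_heard(words_per_hour, years=YEARS):
    """Extrapolate an hourly word count to a cumulative total over `years`."""
    hours = ASSUMED_WAKING_HOURS_PER_WEEK * WEEKS_PER_YEAR * years
    return words_per_hour * hours


totals = {group: words_heard(rate) for group, rate in WORDS_PER_HOUR.items()}
gap = totals["professional"] - totals["welfare"]

for group, total in totals.items():
    print(f"{group}: {total:,} words")
print(f"gap: {gap:,} words")
```

Under that assumption the gap comes out at just under 32 million words, which is consistent with the rounded "30 million" in the title. A different assumed number of hours would scale every total proportionally.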

In addition to a lack of exposure to these 30 million words, the words a child from a low-income family has typically mastered are often negative directives, meaning words of discouragement. The ratios of encouraging versus discouraging feedback found within the study, when extrapolated, indicate that by age four the average child from a family on welfare will hear 125,000 more words of discouragement than encouragement. Compared with the 560,000 more words of praise than discouragement that a child from a high-income family will receive, this disparity is vast.
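To get a feel for what the 125,000 and 560,000 figures mean hour by hour, they can be divided back over the four years. This is a rough sketch only: the summary does not give the number of hours behind the extrapolation, so the 100 waking hours per week used here is an assumption, and the function name is hypothetical.

```python
# ASSUMPTION (not stated in this summary): 100 waking hours per week for four years.
ASSUMED_HOURS = 100 * 52 * 4


def net_feedback_per_hour(net_total, hours=ASSUMED_HOURS):
    """Convert a four-year net feedback total into an average per hour."""
    return net_total / hours


# Net surplus of encouragement for a child from a high-income family.
professional_net = net_feedback_per_hour(560_000)
# Net surplus of discouragement for a child from a family on welfare.
welfare_net = net_feedback_per_hour(125_000)

print(f"high-income: about {professional_net:.1f} more encouragements per hour")
print(f"welfare: about {welfare_net:.1f} more discouragements per hour")
```

On these assumptions, the four-year totals correspond to a difference of a few dozen encouragements per waking hour at one end and a handful of discouragements per hour at the other.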

The established connection between what a parent says and what a child learns has more severe implications than previously anticipated. Though Hart and Risley are quick to indicate that each child received no shortage of love and care, the immense differences in communication styles found along socio-economic lines are of far greater consequence than any parent could have imagined. The resulting disparities in vocabulary growth and language development are of great concern and prove the home does truly hold the key to early childhood success.
Sources Cited:

Hart, B. & Risley, T.R. (2003, Spring). “The Early Catastrophe: The 30 Million Word Gap by Age 3.” American Educator, pp. 4-9.

— Prepared by Ashlin Orr, Kinder Institute Intern, 2011-12.

