[This article reposted from the Washington Post website's Archives.]
By Linton Weeks
June 28, 2001
When the 1,728-page Microsoft Encarta College Dictionary appears in bookstores next month, "it will start the Third World War of Dictionaries," says its top American editor, Anne Soukhanov.
"It will shake things up," says Michael Agnes, editor in chief of Webster's New World dictionaries.
The dueling lexicographers may be right. We could be on the verge of an all-out melee, with a handful of book-publishing bruisers fighting for lucrative desk space among students, teachers and office workers. Among the combatants: Random House, Merriam-Webster and American Heritage.
Then again, these ancestral voices prophesying Armageddon may be wrong.
Some scholars believe a new war of the words is coming, but not the one that Soukhanov and Agnes envision. Rather than battling among themselves, traditional dictionary publishers may have a common enemy: the computer.
With software-based spell-checkers, thesauri, grammar monitors and all manner of lexicographic Web sites available at the tap of a mouse, the dead-tree dictionary as we know it could be history.
In any case, the ponderous world of dictionaries is changing fast. The coming combat, no matter what form it takes, might not alter our everyday lives, but it could change in picayune and profound ways our language and the very ways we communicate with each other.
Just southwest of Charlottesville, the town of Bedford, Va., is an unlikely headquarters for the newest 21st-century college dictionary. Bedford's high points are the just-dedicated National D-Day Memorial and a retirement home for the Benevolent and Protective Order of Elks.
But with a fax machine, a high-octane computer and access to the Internet, Anne Soukhanov is able to oversee the assembly of the Microsoft Encarta College Dictionary. In an upstairs room of the 1865 mansion where she was born, Soukhanov is doing final touch-ups on the book. It's a sunny day. She wears a Chanel-style jacket over a turtleneck. Her puffy hair is brown. Her obviously overworked eyes are blue.
She scours several newspapers a day. She sends e-mail to, and receives e-mail from, 72 editors around the world and a panel of 41 English professors. The academics have told her the common language problems that students have, such as knowing the difference between "hoard" and "horde," using "affect" and "effect" and spelling "a lot" as one word.
Nearby, Soukhanov's competitors -- four tomes -- are stacked on the floor to help her remember what she's up against.
She's quick to point out that her new book is not based on the controversial full-size Encarta World English Dictionary that came out in 1999. Its claim to fame was that it brought together English words from around the globe. The college dictionary will follow suit, with entries such as "bienvenida," a Philippine word meaning "a party held to welcome somebody" and "car-lifting," a South Asian term for auto theft.
The WED is just one item in Microsoft's line of Encarta office reference works. Predicated on Funk & Wagnalls and Compton's encyclopedias, Microsoft's Encarta encyclopedia and dictionary were first produced as a CD-ROM in 1993.
In book form, the WED received some raves: "This hefty, well-produced volume marks a milestone in the history of our language," observed Robert McCrum, co-author of "The Story of English."
And it took some hits: "It seems to have no standards," one lexicographer told the New York Times. "You find inaccurate information, missing information, repetitious information."
Lexicographers can get nasty. "It's a dog-eat-dog world," says Joseph P. Pickett, executive editor of the American Heritage Dictionary.
As she sits by her terminal, Soukhanov explains that the new Microsoft Encarta college edition draws from the same unabridged corpus, or database, of words as the World English Dictionary. However, it will be geared toward students and office workers.
The new dictionary holds 320,000 definitions, more than any other college dictionary. And hundreds of commonly misspelled words. In an easy mid-Virginia drawl, Soukhanov explains, "The misspellings will be printed with a line drawn through them to make it clear that the spelling is wrong." A cross-reference will direct readers to the correct spelling.
Homophones will be listed in "Spellcheck" notes. Soukhanov -- who began using a computer only with her last dictionary -- has taken advantage of certain computer-friendly terms. The dictionary is littered with 200 such "Spellchecks." "Do not confuse allude with elude, which has a similar sound. Beware: your spellchecker will not catch this error."
The book also contains some 500 prescriptive notes on correct usage, such as when to use "blatant" and when to use "flagrant."
"I feel passionate about it," she says. "I'm trying to keep people from butchering the language."
Though the college version will include such up-to-the-nanosecond online words as "ego-surfing" and "dimpled chad," it also reflects Soukhanov's offline conservatism. Her note on "issues": "The euphemistic use of issues to denote intentionally unstated problems, typically emotional or mental problems, should . . . be avoided, as in He's one of those people who always has issues."
(Some grammarians will be quick to point out that the entry should read: "one of those people who always have issues.")
If you've ever thought about dictionaries, you've come to the studied conclusion that dictionaries are: (a) good. They help us spell and pronounce and know what words mean.
Or (b) bad. They freeze-dry words, choking all life from the dynamism of a fluid language.
Or (c) both.
Irregardless . . . many believe that the English language demands a spelling and usage guide that keeps people from abusing the Mother Tongue with words such as . . . irregardless.
Historically, dictionaries have fallen into two broad categories: descriptive and prescriptive.
A descriptive dictionary captures a moment in a language's history. It records the standard spelling and meanings of a word but makes no value judgments about how the word should be used. To descriptivists, writes lexicographer Sidney I. Landau, "no form of English is purer than another. They believe that language change is a normal process that cannot be retarded, and that users should not be stigmatized as ignorant or careless because they use language in a way others disapprove of." Such as playfully ending a sentence with a preposition.
A prescriptive dictionary, on the other hand, argues that there is a correct way to spell and use every word.
Prescriptivists, Landau explains, "agree that correctness should be promoted, instilled, and enforced by vigorously criticizing writers and speakers who use incorrect usages. Implicit in this view is the idea of a 'pure English' that has been corrupted through ignorance and indifference."
If usage advice, such as how to avoid sexist language, is stressed, "you're probably looking at a prescriptive dictionary," explains Webster's New World's Michael Agnes.
But in the land of the free, who's to say how we should say what we say? American English, after all, is wild-frontierish and open-armed inclusive. It's an undammed and leveeless rushing river in which the words "fly" and "dope" and even "bad" can all mean "good." By definition, American dictionaries are less prescriptive, less didactic than, say, dictionaries in France, where the academy strives heroically to preserve an endangered language and keep foreign words out of the lexicon.
But when it comes to suggesting how words should, and shouldn't, be used, there are distinct differences in America's reference books.
Take "ain't." Merriam-Webster defines it as the contraction of "are not" and observes descriptively that "although widely disapproved as nonstandard and more common in the habitual speech of the less educated, ain't . . . is flourishing in American English. It is used in both speech and writing to catch attention and to gain emphasis."
The American Heritage Dictionary is more of a scold. It defines the word as the combination of "am" and "not." Citing language commentators, such as those on its 205-member usage panel -- which includes humorist Garrison Keillor, novelist Jamaica Kincaid and playwright Wendy Wasserstein -- the dictionary points out prescriptively that "the use of ain't is often regarded as a sign of ignorance."
A dictionary's job, says American Heritage editor Pickett, is "to get the warning right."
He says, "That's really important with words that are socially offensive."
In her dictionaries, Soukhanov writes, she hopes to help readers "use the language without incurring criticism."
Random House, which contains more than 1,000 usage notes in its 1,600-page tome, warns readers about offensive words. "We're interested in the impact that a word will have if you use it," says its editor, Wendalyn Nichols.
Word sleuths believe that the first dictionary known to humankind was a word list on a clay tablet that dates back to 2000 B.C. or so. Robert Cawdrey's "A Table Alphabeticall," printed in 1604, was the first all-English dictionary of weird and difficult words. Samuel Johnson produced his amazing "Dictionary of the English Language" in 1755.
In the early 1800s, Noah Webster, an attorney and a teacher, reacted to Johnson's British-slanted dictionary by producing a homegrown word list for the new American language. His 1828 tour de force, "An American Dictionary of the English Language," is arguably the first truly American dictionary and the initial shot fired in Word War I of American dictionaries.
Webster's most vocal detractor was an English traditionalist named Joseph Emerson Worcester. He escalated Word War I by publishing his response to Webster, "A Dictionary of the English Language," in 1860, but Webster with his emphasis on American terms and spellings won out in the United States; through the aggressive marketing campaigns of George and Charles Merriam, who bought the rights to his book after he died, Webster's dictionary became the American standard. Eventually the Merriams lost the exclusive rights to the Webster name; now it's used by many companies.
Word War II broke out in 1961 with the publication of the extremely descriptive Third New International Dictionary by Merriam-Webster, a company directly descended from Old Noah himself. Many critics felt that the descriptive dictionary makers had gone too far. Too many words -- presented in a nonjudgmental fashion -- were acceptable.
The dictionary "became a cultural icon to degenerate permissiveness of the time," according to Pickett.
In direct response, American Heritage published its own dictionary in 1969 as a prescriptive reference designed to preserve pure American English and correct usage. Ironically, the conservative American Heritage was the first widely used dictionary to include certain four-letter vulgarities.
Just after that dictionary came out, Anne Soukhanov went to work for Merriam-Webster as an editorial assistant. She remembers an edited copy of the American Heritage Dictionary on a table in the office, with errors and questionable definitions marked up by the Merriam-Webster staff.
This may have been war, but it was war conducted according to the ethos of the world of dictionaries.
"Talking was discouraged," Soukhanov remembers. "We spoke to each other by pink slips. The editor in chief used white slips. We used cigar boxes as in and out boxes. We invited each other to lunch with pink slips."
Soukhanov worked for Merriam-Webster nine years. Then she was hired by the competition. She eventually became executive editor of the American Heritage Dictionary.
In 1997, Soukhanov -- author of a passel of word books and a word column that ran in the Atlantic Monthly from 1986 to last year -- was hired to be U.S. general editor of the Encarta World English Dictionary, a joint production of London-based Bloomsbury Publishing, Microsoft and St. Martin's Press.
Soukhanov leans toward the prescriptive side of the debate. "Times have changed," she says. "There are folks who don't read, can't spell and can't use the English language." A dictionary, she says, should be a guide, as well as a record of the language.
So begins Word War III with Encarta, billed as "the Dictionary for the Internet Age."
But does the world really need another college dictionary on the desk? Will the generation that "keyboards" instead of writes, that speaks in e-mail jargon and that doesn't seem to care that much about correct spelling buy a new print dictionary?
"Encarta is a newcomer," says Wendalyn Nichols. "We have yet to see whether the quality will measure up."
"My expectation is that the Encarta College will be a flop," says independent lexicographer Frank Abate, formerly an editor with Oxford University Press and now president of his own Connecticut firm, Dictionary and Reference Specialists. "There is certain negativity toward Microsoft in that field. It's one thing to buy their software; it's another to say they know how to do a dictionary."
He adds: "What is happening is that the younger the person, the more likely they are to turn to an online place for reference. If you go into a middle school, for example, you'll see a bunch of dictionaries. But ask kids to look something up, they go to the computer. That doesn't mean they will find good data or better data online. Often they won't. But that's where they turn."
Today's users may not be looking for dictionaries that are descriptive or prescriptive, but e-scriptive -- online, fast and convenient.
Soukhanov says: "People have come to believe that technology is the answer for everything . . . and it's not."
American Heritage (www.bartleby.com/61) and Merriam-Webster (www.m-w.com/dictionary) are taking a different approach to the future. Both dictionaries are online and free. "We felt it was important to keep our name in front of the public in the world of online dictionaries," says Merriam-Webster Editor in Chief Frederick C. Mish. "We were convinced that we could do that without having a deleterious effect on the sale of print dictionaries."
So far, Mish says, sales have remained constant.
Microsoft offers its Encarta World English Dictionary, the Encarta Encyclopedia and a world atlas on its wide-ranging, interactive Learning and Research Web site (http://encarta.msn.com). Users of the site can look up words, read the definitions and usage notes and even hear the word pronounced. The Encarta store on the site advertises the Encarta World English Dictionary in CD-ROM and DVD format but not as a traditional hardback book. Full-reference Web sites have proliferated as more people write on computers and fewer in longhand or with typewriters.
Even the historic Oxford English Dictionary, a scholarly work of art and the granddaddy of all descriptive dictionaries, is in the process of revising its many volumes and hundreds of thousands of words online. Oxford University Press, the publisher, charges users $555 a year to subscribe.
Jesse Sheidlower, head of the dictionary's U.S. office, says that "we don't know when -- or if -- we'll come out with another print edition," which now would take up 40 volumes. He foresees a day when prices will come down and students and office workers will sign up to use the ever-expanding online OED.
Dictionary makers have to make choices. They must leave out certain words. But the Internet makes it possible to include all words and all meanings and all usage notes. An online dictionary can be descriptive, prescriptive and e-scriptive.
According to the essay "Using the Internet as a Research Tool" by Andrew Harnack and Gene Kleppinger, the Internet's "availability through personal computers has brought the realm of information out of library reference rooms to every home and office."
If you want to read the essay about how the Internet is replacing reference books, you can look it up -- in the front of Anne Soukhanov's new Encarta College Dictionary.
[This article originally published on Chicago Manual of Style Shop Talk website.]
Peter Sokolowski is editor at large at Merriam-Webster, where he works on the Word of the Day podcast, Ask the Editor videos, and short articles about word trends and etymologies (which he also presents on Twitter). In addition to attending professional and academic conferences to talk about dictionaries, he conducts workshops for teachers of English as a second language, serves as pronouncer for spelling bees around the world, and is a substitute jazz host for New England Public Radio. (We also hear he plays a mean jazz trumpet.)
Here he talks with Shop Talk editor Carol Fisher Saller.
CFS: Recently I was lucky enough to hear you speak about what lexicographers can deduce from the words people look up at Merriam-Webster.com. You gave examples of how a political or celebrity event can cause certain “lookups” to spike—such as the word emaciated when Michael Jackson died. Later I saw your tweet about the spiking of canonize and homily when Pope Francis visited the US. I believe you said that more than a billion words a year are looked up at the M-W Dictionary website and apps. What I’d like to know is, how do you keep an eye on a billion lookups? What kind of tools do you have and how do you use them?
PS: It is indeed a lot of data to take in all at once. Google Analytics churns all that data with each request, making it a slow way to get information, so we have developed some simpler engines that give us fast answers; one is the list of words being looked up at any given moment. It can be refreshed by the second but stores no archive.
Then we have several colored graphs that show lookups aggregated by the hour, day, week, and back several years to when we started keeping track.
Finally, there’s my favorite, the multiplier, which enables me to look at words that have seen increases by 200 percent, 300 percent, or any factor of 100 in the previous twenty-four hours. Since a dictionary database represents a very long tail of information, a word that jumps to, say, the 5,000th position today from the 150,000th would represent a considerable relative spike—but one that I’d miss by only looking at the first few hundred words on the list. It measures speed rather than volume.
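The "multiplier" idea described above, ranking words by relative increase rather than raw volume, can be sketched roughly as follows. This is only an illustration and not Merriam-Webster's actual tool: the function name `spike_multipliers`, its parameters, and all of the lookup counts are hypothetical, and it assumes that "an increase by 200 percent" means the percent change between two consecutive twenty-four-hour totals.

```python
from collections import Counter


def spike_multipliers(previous, current, min_percent=200, min_previous=1):
    """Return (word, percent_increase) pairs, largest increase first.

    previous / current: lookup counts per word for two consecutive
    24-hour windows (hypothetical data shape, assumed for this sketch).
    """
    spikes = []
    for word, now in current.items():
        before = previous.get(word, 0)
        if before < min_previous:
            continue  # no baseline, so a relative increase is undefined
        increase = (now - before) / before * 100
        if increase >= min_percent:
            spikes.append((word, increase))
    spikes.sort(key=lambda pair: pair[1], reverse=True)
    return spikes


# Made-up counts: two high-volume words that barely move,
# and two low-volume words that jump sharply.
previous = Counter({"love": 9000, "integrity": 5000, "canonize": 40, "homily": 25})
current = Counter({"love": 9500, "integrity": 5200, "canonize": 400, "homily": 100})

spikes = spike_multipliers(previous, current)
top_by_volume = current.most_common(1)[0][0]

print(spikes)          # words ranked by speed of increase
print(top_by_volume)   # the word a volume-only list would show first
```

In this toy data a volume-only list is led by love, while the relative measure surfaces only canonize and homily, which is the sense in which such a tool measures speed rather than volume.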
CFS: Fantastic—I’m picturing you in goggles in front of a giant screen of words. But seriously, what a privilege to have those tools at your command. Are you able to tell where lookups originate geographically? Can you tell whether lookups are coming from computers or smartphones? Do you make use of that information?
PS: We can tell the country of origin for lookups, but since the majority of our traffic is domestic, we feel that we’re reading American curiosity best and most accurately. We have occasionally seen isolated lookups coming from a foreign country when we can’t quite figure out why a given term is spiking. For example, the term physical education spiked when a change in school policy was made in the Philippines, which is a large market for us both digitally and in print. We found that an e-mail to parents included a link to our entry. A while back we looked at the traffic from American university domains, and it showed a big uptick from colleges and universities all across the country at the beginning of the school year in September. This is very encouraging; it tells us that academic traffic is a big part of the mix. We can also tell if the lookups are from a smartphone, which gives us some insight as to how people use the two platforms. Desktop lookups peak during business hours for serious words, and smartphones peak in the evening for less businesslike reasons: words like love and two-letter Scrabble words are looked up more often then.
The convenience of a smartphone is important. In fact, last year we saw lookups from small screens exceed the desktop site in traffic (and Google just reported the same pattern), so we expect to see even faster responses to news events through the dictionary data, since so many people now carry a dictionary on their phones all the time.
CFS: When you think about the ways you collect word-use data compared with a hundred or two hundred years ago when Noah Webster was collecting words, what comes to mind? Just today I heard your colleague Kory Stamper compare language to a river with contributing streams and currents, with people adding words/droplets to it all the time. Do you think technology makes the river run faster? Is language change faster? Are changes less permanent? Has the dictionary’s role changed over time?
PS: Communication is obviously faster these days. But just because we can measure better today doesn’t mean that there was less to measure, relatively speaking, in the past. With more people contributing to the stream, those droplets join an ever-bigger river. I try to guard against cultural myopia and the illusion of omniscience that comes with sitting all day in front of a computer. I’ve seen spikes for words that leave not a single trace on the Internet, later to learn that the word was used on a prime-time TV show watched by twenty million people. It wasn’t exactly a secret, but I couldn’t find it for the simple reason that the show’s script wasn’t indexed and optimized. Our lives aren’t indexed and optimized. The Internet doesn’t always have all the answers.
On the other hand, the biggest change for lexicographers is how fast and comprehensively research can be done today. Whether it’s a new archive of medieval texts, letters from Civil War soldiers, or Google News, we have tools to search with speed that would have been unimaginable even a generation ago. Our office is essentially a library, and we have shelves full of literary concordances. A concordance for Milton or Shakespeare was a lifetime’s work a century ago, and it usually gave the author the status of a major scholar. Today we can search those works in seconds. It seems to me that the increase in volume of text is more than matched by the speed of research. The key is to know where to look.
People had thought that the telegraph, the telephone, and radio would ruin language. Like today’s technologies, they’ve just added to the stream.
CFS: I’m getting the idea that your technology is a fun bonus, but that the real payoff is in the cultural and societal insights from seeing what words people are looking up most.
PS: Exactly. Knowing the most looked-up terms tells us what people really use the dictionary for. People want information about abstract words and ideas, words that function at a higher plane of language than concrete nouns. Neologisms get attention in the media, but most of us aren’t looking up novelties like twerk and LOL; we’re looking up words like integrity, pragmatic, and socialism. In a way, the data isn’t about numbers. It’s about what people want and expect from a dictionary. We’ve only just begun to respond to this feedback in our editing and product development.
A dictionary is a tool in a person’s intellectual toolbox. It’s a utility for grounded and objective information, most often used to answer an immediate question. I believe that the dictionary serves a contemplative purpose even when we aren’t writing or editing: people look up love around Valentine’s Day and surreal following national tragedies. Words bring a sharp focus to thought.
CFS: So from your seat at Merriam-Webster, does language use appear to be healthy or in decline?
PS: I’m not worried. The written word is more important than ever because of digital communication. We are all judged constantly and harshly by the way we write and speak; I’m reminded of the Match.com survey that puts “good grammar” as the second most important quality sought in potential dates, after “good teeth.” (Seth Meyers joked on Saturday Night Live that this is good news for “whomever has both.”) But for professional and academic reasons, not to mention the importance of American English as a lingua franca of Internet business, standard English has become its own reward. Standard English is not necessarily a superior form of the language, but it is a privileged form of the language.
At the same time, language declinism is a waste of time and intellectual energy. Everyone who makes a “kids today” comment about the state of language was once a kid. The only constant is change. Languages certainly do follow rules, but they don’t follow orders. As linguists and lexicographers, we notice changes and novelty—we don’t complain about them. The English language is a great marketplace that will accept some changes and reject others. Our job is to observe and report.
Obviously, copyeditors enforce standards. The straw-man oversimplification that supposedly divides “descriptivists” and “prescriptivists” misses a very important point. A dictionary records two kinds of facts: linguistic facts such as spelling, etymology, pronunciation, and meaning; and cultural facts, which we call usage. Usage is the manners of language. The usage paragraphs are where the descriptive and the prescriptive meet, and they are often the most interesting and useful part of a dictionary. Look at the usage note at irregardless, for example. It gives very clear advice. A good dictionary always gives good advice.
CFS: I’m familiar with this confusion over the role of the dictionary. I see it in the backlash online against news that a respected dictionary has “accepted” a nonstandard word or meaning. Readers think that “accepted” means “declared acceptable in formal usage” instead of “accepted the need to explain a nonstandard word or meaning.” Sometimes I think people forget that we count on dictionaries to tell us what words mean. All words.
OK, I just looked at irregardless at Merriam-Webster.com—it’s even playable in Scrabble! I’m not sure my grandma—who always won at Scrabble—would approve, but she would love knowing that I got to talk with someone from the Official Scrabble Players Dictionary. Last question: can you tell us something fun about Scrabble lookups?
PS: That’s easy! We noticed something remarkable when we saw a difference in words looked up from the mobile app compared to the website: late at night, lookups for qi and za spike, which means that in bars and in beds across the country, people are playing Scrabble. There’s also a big spike in lookups of these two-letter words during the afternoon on Thanksgiving and Christmas. With no competition from business or academic traffic, the Scrabble words soar to the top.
People often say to me that they wouldn’t want to play me in Scrabble, but honestly I don’t play. I’ve asked a bunch of other lexicographers, and we seem to agree: it’s either too much like math or too much like work.