The Limits of Quantification, Part II

Over the past couple of days I've had a chance to see the Oxford English Corpus in action, and I'm really impressed. Covetous. The thing contains 2 billion words of text (and counting), making it by far the largest linguistic corpus in existence. All of the sources are 21st-century, and every passage is meticulously tagged as to whether it's British, American, Canadian, Australian, etc., and whether it's from news, fiction, blogs, online chat rooms, medical journals ... Naturally, the tags make it possible to pick apart usage in the different realms. If you want to see sentences containing the word "balloon" in British fiction or in American medical literature (where it's not as scarce as you might suppose, owing to "balloon angioplasty"), no problem. Click, click, hit "Enter," and the passages line up neatly on the screen.

The developers of the corpus have tried to make the text as representative a sample of contemporary English as possible. Which of course gets me thinking, What does that mean? Certainly, the developers have given a lot more thought to this question than I have. They're obviously smart, experienced, and passionate about their work - I'm not at all skeptical of them. I would love to get my hands on the corpus. But I can't help being skeptical that anything anyone could come up with could be "representative" of contemporary English. Have I zeroed in on a fundamental design problem, a fundamental problem with the nonspecialist's relationship to technology, or a fundamental problem with my state of mind?
