Putting Your Weird Word Choices on the Map

It's been said that Britain and America are two great nations separated by a common tongue. New visualizations storming the Web suggest the same may be true of America by itself.
Is it coke (green), pop (blue), or soda (red)? Let your surroundings decide. (IMAGE: JOSHUA KATZ)

Over the years I’ve had observer status in a number of brushfire grammar wars, like the pop/coke/soda conflict, the spat over whether they’re highways or freeways, and the skirmish on sneakers versus tennis shoes. About a decade ago, linguist Bert Vaux made some waves with his Harvard Dialect Survey, which attempted not to settle these weighty matters but to catalog them.

“I ... realized that none of the existing dialect grammars or dictionaries actually contained forms that were relevant today,” Vaux told the Harvard Crimson in 2002. “They were all based on the speech of old white farmers from the 1920s ... how many students today know what a whiffletree or bonny-clabber are?” So he and Scott A. Golder surveyed students and Web denizens to detail their words for various objects or situations for the new century. And despite the homogenizing effects of TV, there are still differences.

For example, what do you call the long sandwich that contains cold cuts, lettuce, and so on? I call it a sub, and in Vaux’s survey 77 percent of the 10,708 respondents used the same term. But about seven percent of respondents termed it a hoagie, five percent a hero, three percent a grinder, and two percent a poor boy. Just as interesting, perhaps more so, is the cyber-atlas Vaux created, mapping out where responses came from. New Englanders like the word grinder, New Yorkers hoagies, and those on the Gulf Coast poor boys. And in San Francisco, nearly every usage made an appearance, highlighting California’s role as a mecca for Americans and that particular city’s diversity.

An example of linguist Bert Vaux's visualizations looking at the variations in use of "dinner" and "supper."

All told, Vaux asked about 122 separate usages, from how to say “aunt,” “been,” and “caramel” to “What do you call the little gray creature (that looks like an insect but is actually a crustacean) that rolls up into a ball when you touch it?,” “What do you call the wheeled contraption in which you carry groceries at the supermarket?,” and “Can you call coleslaw ‘slaw’?”

Beyond their presence on the Web and their ability to either settle or start arguments, Vaux’s results sat mostly undisturbed until statistician Joshua Katz, a Ph.D. student at North Carolina State, took an interest in re-visualizing this fun dialect data he’d first encountered as an undergrad. He didn’t just lay the same data points on a nicer base map; he smoothed out the data to create what I’d call heat maps of dialect differences, indicating the likelihood of any particular usage dominating at any point in the continental U.S.

And these heat maps have, as the Brits would say, hotted up. (Katz's full project set is available online, as is a Business Insider subset that may load faster.) “The response to these maps has been overwhelming,” Katz said in an email he’s sending to inquiring journalists. “I had no idea that they would catch fire like this. It's all pretty surreal.” In large part, that’s because the project takes something that was fun already (looking at how those other people have the wrong word for everything) and makes it more fun by showing how widely the proper usage extends, by city, region, and state (sans Alaska and Hawaii). You can also see how individual usages fared and what the results were in a number of specific cities.

These maps make explicit something merely divinable from the 2002 maps: “You can see that some areas, particularly those around northeastern cities, have very distinct dialect boundaries. Then as you move south and west, everything starts to spread out.” Other observations confirm things that Americans may have known intuitively but now have proof of: for example, that northwest Wisconsin is a land unto itself, with word choices that differ from both the rest of the state and the nearby Twin Cities, or that the Mason-Dixon Line is as much about language as it is about cartography.

The new visualizations did leave some data on the cutting room floor. For example, Vaux’s data provided a richer sandwich experience. Katz explains in his project’s FAQ that only the four most popular answers appear, which creates a problem when “other” is among the top four. He cites the sandwiches: in the Vaux survey, grinder and poor boy were options, but both scored below “other” in that set.

“I've always found regional variations in dialect really fascinating,” Katz said. “Language says so much about who a person is. To me, dialect is a badge of pride—it's something that says, ‘this is who I am; this is where I come from.’ So, just to take one example, being from South Jersey, what everyone else calls a sub will for me always be a hoagie.” (And his work shows the Philly-South Jersey axis is a linguistic outlier in the style of Green Bay.)

While Katz emphasizes he’s a statistician and not a linguist, there’s more language fun afoot, like an interactive quiz in which an algorithm will guess where you’re from based on your responses. “I feel like people would get a kick out of it, and as a bonus, we could collect a whole bunch more data.”

Or in other words, it'll be a hoot.