This piece contains strong language from the beginning, as they say on the BBC. But only in the name of science – for a new study of how slang expressions spread on Twitter could offer insights into a more general question in linguistics: how language changes and evolves.
You might, like me, have been entirely innocent of what “af” denotes in the Twittersphere, in which case the phrase “I’m bored af” would simply baffle you. It doesn’t, of course, take much thought to realise that it’s simply an abbreviation for a vulgarity – a tamer version of which is “as hell”. What’s less obvious is why this pithy abbreviation should have jumped from its origin in southern California to a cluster of cities around Atlanta before spreading more widely across the east and west US coasts, as computer scientist Jacob Eisenstein of the Georgia Institute of Technology in Atlanta and his co-workers Brendan O’Connor, Noah Smith and Eric Xing of Carnegie Mellon University in Pittsburgh report in an as-yet-unpublished study.
Other neologisms have different life stories. Spelling “bro” (slang for brother, in the sense of a male friend or peer) as “bruh” began in the southeastern US (where it reflects the local pronunciation) before finally jumping to southern California. The emoticon “-__-” (denoting mild annoyance) began in New York and Florida before colonising both coasts and gradually reaching Arizona and Texas.
Who cares? Well, the question of how language changes and evolves has occupied linguistic anthropologists for several decades. What determines whether an innovation will propagate throughout a culture, remain just a local variant, or be stillborn? Such questions decide the grain and texture of all our languages – why we might tweet “I’m bored af” rather than “I’m bored, forsooth”.
There are plenty of ideas about how this happens. One suggestion is that innovations spread by simple diffusion from person to person, like a spreading ink blot. Another idea is that bigger population centres exert a stronger attraction on neologisms, so that they go first to large cities by a kind of gravitational pull. Or maybe culture and demography matter more than geographical proximity: words might spread initially within some minority groups while being invisible to the majority.
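To make the difference between the first two ideas concrete, here is a minimal Python sketch. It is not taken from the study: the three cities, their populations and positions are invented, and the scoring functions are deliberately crude stand-ins (inverse distance for diffusion, population product over squared distance for the gravitational pull). It only shows that the two pictures can predict a different next stop for a new word.

```python
import math

# Hypothetical cities: (population, x in km, y in km). All values are invented
# purely for illustration and do not come from the study.
CITIES = {
    "origin": (1_000_000, 0.0, 0.0),
    "small_nearby": (50_000, 100.0, 0.0),
    "big_faraway": (5_000_000, 500.0, 0.0),
}


def distance_km(a, b):
    """Straight-line distance between two cities on a flat map."""
    _, ax, ay = CITIES[a]
    _, bx, by = CITIES[b]
    return math.hypot(ax - bx, ay - by)


def diffusion_score(source, target):
    """Ink-blot picture: only proximity matters, so nearer places score higher."""
    return 1.0 / distance_km(source, target)


def gravity_score(source, target):
    """Gravity picture: big populations attract, and distance weakens the pull."""
    pop_source = CITIES[source][0]
    pop_target = CITIES[target][0]
    return pop_source * pop_target / distance_km(source, target) ** 2


def rank_targets(source, score):
    """Order the other cities from most to least likely next stop."""
    others = [city for city in CITIES if city != source]
    return sorted(others, key=lambda city: score(source, city), reverse=True)


print(rank_targets("origin", diffusion_score))
print(rank_targets("origin", gravity_score))
```

With these made-up numbers, the diffusion picture sends the word to the small neighbouring town first, while the gravity picture sends it to the distant big city. The third idea, spread along cultural or demographic lines, would need a score based on how similar two populations are rather than on where they sit on the map.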
Sophisticated computer models of interacting “agents” (that is, virtual people having virtual conversations) can be used to examine these processes, but they tell us little unless there are real data to compare them against. Such data have been extremely difficult to obtain, but now social media channels provide an embarrassment of riches – a precise and searchable record of our exchanges, which is being used to explore everything from the cycle of emotions experienced in everyday life to the changing social sentiment during the course of the Arab Spring.
Twitter, FTW! :-)
In this case, Eisenstein and colleagues scoured messages from the public feed on Twitter. They collected around 40 million messages from around 400,000 individuals between June 2009 and May 2011 that could be tied to a particular geographical location in the USA because of the smartphone metadata optionally included with the message.
The researchers then assigned these to their respective “Metropolitan Statistical Areas” (MSAs): urban centres that typically represent a single city. For each MSA, demographic data on ethnicity are available, which, with some effort to correct for the fact that Twitter users are not necessarily representative of the area’s overall population, allow a rough estimate of the ethnic makeup of the message source.
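For readers who like to see the plumbing, the filtering and assignment steps can be sketched in a few lines of Python. This is an illustration under loud assumptions, not the researchers' pipeline: the message format, the three city centres (approximate coordinates) and the 100 km cut-off are all hypothetical, and real MSAs are defined by county boundaries rather than by a radius around a point.

```python
import math
from collections import Counter

# Hypothetical MSA centres as (latitude, longitude). Real MSAs are defined by
# county boundaries; a centre plus a radius is a simplifying assumption.
MSA_CENTRES = {
    "Atlanta": (33.75, -84.39),
    "Los Angeles": (34.05, -118.24),
    "New York": (40.71, -74.01),
}
MAX_KM = 100.0  # assumed cut-off for counting a message as belonging to an MSA


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def assign_msa(lat, lon):
    """Return the nearest MSA name, or None if every centre is too far away."""
    name, centre = min(MSA_CENTRES.items(),
                       key=lambda item: haversine_km((lat, lon), item[1]))
    return name if haversine_km((lat, lon), centre) <= MAX_KM else None


def count_term_by_msa(messages, term):
    """Count geotagged messages per MSA that contain `term` as a whole word."""
    counts = Counter()
    for message in messages:
        geo = message.get("geo")
        if geo is None:  # no location metadata, so the message is skipped
            continue
        msa = assign_msa(*geo)
        if msa is not None and term in message["text"].lower().split():
            counts[msa] += 1
    return counts


# Invented sample messages, for illustration only.
SAMPLE = [
    {"text": "I'm bored af", "geo": (33.80, -84.40)},
    {"text": "so tired af today", "geo": (34.00, -118.30)},
    {"text": "bored af", "geo": None},
    {"text": "after lunch", "geo": (40.70, -74.00)},
    {"text": "bored af", "geo": (39.00, -100.00)},
]
print(count_term_by_msa(SAMPLE, "af"))
```

Repeating such a count week by week for each area would give the kind of time series from which the spread of a term can be traced. The demographic correction described above is a separate step and is not attempted here.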