On January 28 2011, three days into the fierce protests that would eventually oust the Egyptian president Hosni Mubarak, a Twitter user called Farrah posted a link to a picture that supposedly showed an armed man as he ran on a “rooftop during clashes between police and protesters in Suez”. I say supposedly, because both the tweet and the picture it linked to no longer exist. Instead they have been replaced with error messages that claim the message – and its contents – “doesn’t exist”.
Few things are more explicitly ephemeral than a tweet. Yet it’s precisely this kind of ephemeral communication – a comment, a status update, sharing or disseminating a piece of media – that lies at the heart of much of modern history as it unfolds. It’s also a vital contemporary historical record that, unless we’re careful, we risk losing almost before we’ve been able to gauge its importance.
Consider a study published this September by Hany SalahEldeen and Michael L Nelson, two computer scientists at Old Dominion University. Snappily titled “Losing My Revolution: How Many Resources Shared on Social Media Have Been Lost?”, the paper took six seminal news events from the last few years – the H1N1 virus outbreak, Michael Jackson’s death, the Iranian elections and protests, Barack Obama’s Nobel Peace Prize, the Egyptian revolution, and the Syrian uprising – and drew from Twitter’s entire corpus a representative sample of tweets specifically discussing each event.
It then analysed the resources being linked to by these tweets, and whether these resources were still accessible, had been preserved in a digital archive, or had ceased to exist. The findings were striking: one year after an event, on average, about 11% of the online content referenced by social media had been lost and just 20% archived. What’s equally striking, moreover, is the steady continuation of this trend over time. After two and a half years, 27% had been lost and 41% archived.
This is just one investigation, and a preliminary one at that. The figures, though, suggest a clear linear trend: the loss of just over 10% of the resources shared via social media each year, even when archiving is taken into account, or around 0.02% of this content lost every day.
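The arithmetic behind that linear reading can be checked against the two data points quoted above. What follows is a minimal sketch, assuming a straight line between the one-year and two-and-a-half-year figures and a 365-day year; the function and variable names are illustrative and do not come from the study.

```python
# Figures quoted in the article: (years after the event, % of resources lost).
YEAR_ONE = (1.0, 11.0)
YEAR_TWO_AND_A_HALF = (2.5, 27.0)


def linear_loss_rate(first, second):
    """Return the percentage points lost per year between two observations."""
    (t0, lost0), (t1, lost1) = first, second
    return (lost1 - lost0) / (t1 - t0)


per_year = linear_loss_rate(YEAR_ONE, YEAR_TWO_AND_A_HALF)
per_day = per_year / 365  # assumption: a 365-day year

print(f"Lost per year: {per_year:.1f} percentage points")  # 10.7
print(f"Lost per day:  {per_day:.3f} percentage points")   # 0.029
```

On these two points alone, the yearly figure matches the “just over 10%” above, while the daily figure comes out nearer 0.03% than 0.02%. A rate fitted to the study’s full data set, rather than to two averages, may well differ.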
This isn’t the same thing as tweets themselves vanishing. For those wishing to exhaustively analyse trends within social media utterances themselves, services like Gnip – which, for a fee, promises “complete and comprehensive access to every publicly available Tweet dating back to the very first Tweet from March 21, 2006” – offer an unprecedented “fire hose” of data, from which marketing and research firms are already gratefully guzzling.
What’s most vulnerable, rather, is the network of living connections into which social media is a window: the nexus of sources, resources, sounds, images and updates that together constitute the stuff of many millions of people’s daily experience. One commercial firm may well be able to sell you every extant public tweet ever sent – and another may do the same for other social media services. As work like SalahEldeen and Nelson’s study suggests, however, preserving these individual threads does little by itself to stop the tapestry of present history unravelling.
It’s a phenomenon that, in a different form, has been much on my mind recently, thanks to my work on a new book delving into the history of many digital developments since the end of the Second World War.
As you might expect, the internet itself is an endless treasure trove for such research. At the same time, though – and especially when it comes to the pre-web contents of early internet technologies such as Usenet or Bulletin Board Systems – all that remains of many seminal exchanges or ideas is often the copy-paste of a copy-paste of a copy-paste. Less than three decades after many discussions took place, both the “original” source and the technological platform on which it existed are not only impossible to find, but literally non-existent.