There it is in your Facebook timeline or Instagram gallery – a digital footprint of your mental health.
It’s not hidden in the obvious parts: the emojis, hashtags and inspirational quotes. Instead, it lurks in subtler signs that, unbeknownst to you, may provide a diagnosis as accurate as a doctor’s blood pressure cuff or heart rate monitor.
For those who see social media mainly as a place to share the latest cat video or travel snap, this may come as a surprise. It also means these platforms have important, and possibly life-saving, potential. In the US alone, there is one death by suicide every 13 minutes. Despite this, our ability to predict suicidal thoughts and behaviour has not materially improved across 50 years of research. Forecasting an episode of psychosis or emerging depression can be equally challenging.
But data mining and machine learning are transforming this landscape by extracting signals from dizzying amounts of granular data on social media. These methods have already been used to track and predict flu outbreaks. Now, it’s the turn of mental health.
Studies have found that if you have depression, your Instagram feed is more likely to feature bluer, greyer, and darker photos with fewer faces. Your posts will probably receive fewer likes (but more comments). Chances are you’ll prefer the Inkwell filter, which converts colour images to black and white, rather than the Valencia one, which lightens them.
Even then, these patterns are hardly robust enough in isolation to diagnose or predict depression. Still, they could be crucial in constructing models that can. This is where machine learning comes in.
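To give a feel for what such a model might be fed, here is a minimal sketch in Python of turning a photo's pixels into three numbers that loosely match "bluer, greyer, and darker". It is an illustration only, not the researchers' code: the function name `colour_features`, the `blue_share` measure and the hue range treated as "blue" are all assumptions made for this example.

```python
import colorsys


def colour_features(pixels):
    """Summarise an image given as (r, g, b) tuples with values 0-255.

    Returns mean saturation (low = greyer), mean brightness (low = darker)
    and the share of pixels whose hue is roughly blue.
    """
    hsv = [colorsys.rgb_to_hsv(r / 255, g / 255, b / 255) for r, g, b in pixels]
    if not hsv:
        raise ValueError("image has no pixels")
    n = len(hsv)
    return {
        "mean_saturation": sum(s for _, s, _ in hsv) / n,
        "mean_brightness": sum(v for _, _, v in hsv) / n,
        # Assumption: hues between 0.5 and 0.75 (cyan to blue) count as "blue".
        # Pure greys have no hue, so they are excluded via the s > 0 check.
        "blue_share": sum(1 for h, s, _ in hsv if s > 0 and 0.5 <= h <= 0.75) / n,
    }


# A tiny two-pixel "photo": one dark grey pixel and one pure blue pixel.
print(colour_features([(40, 40, 40), (0, 0, 255)]))
```

Numbers like these would be only a few of many inputs to a model, alongside cues such as face counts, filters, likes and comments.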
US President Donald Trump's tweets score highly for upbeat language (Credit: Alamy)
Researchers from Harvard University and the University of Vermont used these techniques in their recent analysis of almost 44,000 Instagram posts. Their resulting models correctly identified 70% of all users with depression, compared to a rate of 42% from general practitioners. They also had fewer false positives (although this figure drew from a separate population, so may be an unfair comparison). Depressive signals were evident in users’ feeds even before a formal diagnosis from psychiatrists – making Instagram an early warning system of sorts.
Meanwhile, psychiatrists have long linked language and mental health, listening for the disjointed and tangential speech of schizophrenia or the increased use of first-person singular pronouns of depression. For an updated take, type your Twitter handle into AnalyzeWords. It’s a free text analysis tool that focuses on junk words (pronouns, articles, prepositions) to assess emotional and thinking styles. From my 1017 most recent words on Twitter, I’m apparently average for being angry and worried but below average on being upbeat – I have been pretty pessimistic about the state of the world recently. Enter @realdonaldtrump into AnalyzeWords and you’ll see he scores highly on having an upbeat emotional style, and is less likely than average to be worried, angry, and depressed.
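Figures such as "70% of all users with depression" and "fewer false positives" measure different things, and a short sketch may help keep them apart. The Python below computes three standard screening rates from a table of outcomes. The counts in the usage line are invented purely for illustration and are not taken from the study, and the function name `screening_rates` is hypothetical.

```python
def screening_rates(true_pos, false_neg, false_pos, true_neg):
    """Return standard rates for a screening test from its four outcome counts."""
    return {
        # Share of people with the condition whom the test catches.
        "recall": true_pos / (true_pos + false_neg),
        # Share of people without the condition who are wrongly flagged.
        "false_positive_rate": false_pos / (false_pos + true_neg),
        # Share of flagged people who really have the condition.
        "precision": true_pos / (true_pos + false_pos),
    }


# Invented example: 100 users with depression, 100 without.
print(screening_rates(true_pos=70, false_neg=30, false_pos=20, true_neg=80))
```

A model can catch most people with depression and still be of limited use if it also flags many who do not have it, which is why both kinds of figure matter.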
But far beyond this quick and sometimes amusing scan of emotional and social styles (AnalyzeWords tells you if you’re more “Spacy/ValleyGirl” than average), researchers are exploring profound questions about mental health.
Telling signals of depression include an increase in negative words (“no”, “never”, “prison”, “murder”) and a decrease in positive ones (“happy”, “beach”, and “photo”), but these are hardly definitive. Taking it a step further, researchers at Harvard University, Stanford University and the University of Vermont extracted a wider range of features (mood, language and context) from almost 280,000 tweets. The resulting computational model scored highly on identifying users with depression; it was also correct in about nine of every 10 PTSD predictions.
The ratio of positive to negative words was a key predictor within the model, says Chris Danforth, one of the researchers and Flint professor of mathematical, natural and technical sciences at the University of Vermont. Other strong predictors included increased tweet word count.
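As a rough illustration of those two predictors, the Python sketch below counts positive and negative words in a tweet and reports their ratio alongside the word count. The word lists are just the handful of examples quoted above, not a real sentiment lexicon, and the add-one smoothing (which avoids dividing by zero) is an assumption made for this example rather than the researchers' method.

```python
import re

# Toy lexicon: only the example words mentioned in the article.
POSITIVE = {"happy", "beach", "photo"}
NEGATIVE = {"no", "never", "prison", "murder"}


def tweet_features(tweet):
    """Return word count and positive/negative word measures for one tweet."""
    words = re.findall(r"[a-z']+", tweet.lower())
    positive = sum(word in POSITIVE for word in words)
    negative = sum(word in NEGATIVE for word in words)
    return {
        "word_count": len(words),
        "positive": positive,
        "negative": negative,
        # Add one to each count so a tweet with no negative words still works.
        "pos_neg_ratio": (positive + 1) / (negative + 1),
    }


print(tweet_features("Happy day at the beach, new photo!"))
print(tweet_features("No, never again"))
```

On its own, a score like this says little about any single tweet; in the study it was one predictor among many, tracked across a user's whole history.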
Danforth emphasises that only a small, specific group of people were assessed, so he sees this study as proof-of-concept. But he’s optimistic. “These and other similar results suggest that the behaviour we exhibit online can be used to inform diagnostic and screening tools,” he says. Incorporate physical information (from FitBits and sleep apps, for example) and those tools could yield even greater power.
There are still linguistic challenges, though. Take these tweets:
“My schizophrenia article got approved for my #Psychopharmacology presentation! #yass #cantstopwontstop”
“Watching True Life: I Have Schizophrenia Yessss... My kinda topic, future Clinical Psychologist right here!”
This is “noisy data” – a computerised model might incorrectly recognise it as belonging to users with schizophrenia. In a 2017 US study, mental health specialists first eliminated this sort of noise from 671 Twitter users. Machine learning then predicted a schizophrenia diagnosis with a mean accuracy of 88%, a level of success only made possible by human-machine collaboration.
What to do with all this information? Empowerment would be a good start. A Microsoft Research team has managed to forecast which new mothers might develop extreme changes in behaviour and mood, all based on pre-natal and early post-natal Twitter usage. Although perinatal depression and anxiety are underdiagnosed, the researchers emphasise they’re not aiming to replace traditional diagnostic and prediction methods. But imagine, they say, if expectant mothers could opt to run this sort of predictive model on their smartphones. This way they could receive a “PPD risk score” via an app, with information about resources or more intensive and immediate help offered if needed.
Reservations persist more broadly in this field, though, especially around privacy. What if digital traces of your mental health become visible to all? You might be targeted by pharmaceutical companies or face discrimination from employers and insurers. In addition, some of these types of projects aren’t subject to the rigorous ethical oversight of clinical trials. Users are frequently unaware their data has been mined. As privacy and internet ethics scholar Michael Zimmer once explained, “just because personal information is made available in some fashion on a social network, does not mean it is fair game for capture and release to all”.
AnalyzeWords looks at the words you use on Twitter to assess your mental state (Credit: AnalyzeWords)
Circumspection about this brave new world is also required. In 2013, Google Flu Trends drastically overestimated peak flu levels. A group of Harvard researchers blamed Big Data Hubris: “the often implicit assumption that big data are a substitute for, rather than a supplement to, traditional data collection and analysis.”
Data mining and machine learning offer the potential for earlier identification of mental health conditions. Currently, the time from onset of depression to contact with a treatment provider is six to eight years; for anxiety, it’s nine to 23 years. In turn, hopefully we’ll see better outcomes. Two billion users engage with social media regularly – these are signals with scalability. As Mark Zuckerberg wrote recently while outlining Facebook’s AI plans, “there have been terribly tragic events – like suicides, some live streamed – that perhaps could have been prevented if someone had realized what was happening and reported them sooner.”
Mental health exists between clinic appointments. It ebbs and flows in real time. It lives in posts and pictures and tweets. Perhaps prediction, diagnosis and healing should live there, too.