Identifying Depression Based on Social Media Posts
Researchers hope algorithms could help detect cases of depression and PTSD.
Posted Jan 20, 2018
By Cameron Evans
We think we choose what we share about ourselves on social media, but according to new research, we may be sharing more than we think, including insights into our mental health.
Using algorithms, a research team recently analyzed the data of Twitter users to see if it could be used to predict depression and PTSD in users—and found that roughly 9 times out of 10, the predictions their algorithms made were correct.
The team of researchers analyzed Twitter data from 204 individuals—105 with known clinical depression and 99 without—to see if differences in language could help them detect depression. They found that depressed people used fewer words that were independently rated as positive, such as “happy,” “beach,” and “photo,” and more negative ones such as “death,” “no,” and “never.”
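To make the word-counting idea concrete, here is a minimal sketch of how positive and negative words in a post might be tallied. It is not the researchers' code. The two word lists are hypothetical mini-lexicons built only from the examples quoted above, and the function name `valence_counts` is invented for illustration. Treating words as simply positive or negative is also a simplification assumed here; the article says only that the words were independently rated.

```python
import re
from collections import Counter

# Hypothetical mini-lexicons: only the example words quoted in the article.
# A real analysis would rely on a much larger set of independently rated words.
POSITIVE = {"happy", "beach", "photo"}
NEGATIVE = {"death", "no", "never"}

def valence_counts(text):
    """Count positive, negative, and total words in one piece of text."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for word in words:
        if word in POSITIVE:
            counts["positive"] += 1
        elif word in NEGATIVE:
            counts["negative"] += 1
    counts["total"] = len(words)
    return counts

# Invented sample sentence, not taken from the study's data.
sample = "Never going back to that beach, no photo could make me happy"
result = valence_counts(sample)
print(result["positive"], result["negative"], result["total"])
```

Dividing each count by the total gives a rate that can be compared across users who post different amounts.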
To obtain the results, the researchers used two different kinds of models. One, called a classifier, learned to separate the tweets of healthy and depressed Twitter users (or, alternatively, those of healthy users and users suffering from PTSD). The second kind learned to identify indicators of these mental health conditions as they emerged over time.
Using the latter type of model, researchers were able to draw timelines of depression onset, course, and recovery based only on the language people used in their tweets, says Andrew Reece, a postdoctoral researcher at Harvard University who co-authored the study. When the team tested the model on a separate group of 174 individuals—63 with PTSD and 111 control participants—it detected a change in Twitter behavior among those with PTSD immediately following the traumatic event.
Notably, the algorithmic approach was often able to detect signs of PTSD and depression many months prior to clinical diagnosis. Although there is still much research to be done, Reece says that someday, computational approaches to mental health screening could be used to assist physicians in making diagnoses. The researchers chose only tweets posted prior to the date of subjects’ first diagnosis to rule out the possible influence that a clinical diagnosis could have on how some individuals portray themselves on social media. The use of pre-diagnosis tweets also supports the idea that Twitter data could be used to detect mental health conditions early.
When looking at the findings, Reece says, it’s important to consider not only precision (the proportion of the models’ predictions that were correct) but also recall (the proportion of the actual evidence of mental illness that the models picked up). While the algorithms’ predictions of depression and PTSD diagnoses were accurate more than 85 percent of the time, they flagged only about half of all posts made by depressed individuals as such, and about two-thirds of those written by PTSD-afflicted people. “While our models clearly have a lot of room to improve, they're working to improve on a task that is clearly very difficult for humans,” Reece says.
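The difference between precision and recall is easier to see with numbers. The sketch below uses toy labels invented for illustration, not data from the study, and the function name `precision_recall` is an assumption. It shows how a model can be right most of the time when it flags a post while still missing half of the posts it should have flagged.

```python
def precision_recall(predicted, actual):
    """Return (precision, recall) for two parallel lists of booleans."""
    pairs = list(zip(predicted, actual))
    true_pos = sum(p and a for p, a in pairs)       # flagged, and correct
    false_pos = sum(p and not a for p, a in pairs)  # flagged, but wrong
    false_neg = sum(a and not p for p, a in pairs)  # missed by the model
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    return precision, recall

# Toy data (hypothetical): 10 posts that should be flagged, then 10 that should not.
actual = [True] * 10 + [False] * 10
# The model flags 5 of the first 10 and wrongly flags 1 of the last 10.
predicted = [True] * 5 + [False] * 5 + [True] * 1 + [False] * 9

precision, recall = precision_recall(predicted, actual)
print(f"precision={precision:.2f}, recall={recall:.2f}")
# prints "precision=0.83, recall=0.50"
```

Here five of the six flags are correct, so precision is high, but only five of the ten relevant posts are caught, so recall is 50 percent.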
Another limitation is that the researchers could include only individuals who used Twitter and were willing to share both their mental health history and their social media data, meaning the findings may not generalize to those who are less open about their personal information. Reece says that many people dropped out of the study when they were asked to share their social media information, leading him to wonder whether the models are only good at picking out depression in extroverts or in very trusting people.
“Future iterations of our work will have to determine whether this algorithmic approach can reliably screen for mental health issues in the general population,” he says. “We're not there yet, but we think this is a good start.”
Reece, A. G., Reagan, A. J., Lix, K. L., Dodds, P. S., Danforth, C. M., & Langer, E. J. (2017). Forecasting the onset and course of mental illness with Twitter data. Scientific Reports, 7(1). doi:10.1038/s41598-017-12961-9