Data Science and the History of Subjective Wellbeing
To put wellbeing on par with GDP, we need to understand the history of emotions.
Posted Jul 16, 2015
Numerous governmental policies are based on improving GDP. Yet, in the words of Robert F. Kennedy, "[GDP] measures everything in short, except that which makes life worthwhile" (Presidential Library and Museum, n.d.).
The 'worthwhileness' of life, unfortunately, hasn't had the same research or evidence-based history as GDP. But that is changing. Calls from David Cameron (Prime Minister of the UK), the UN World Happiness Report, the OECD’s Better Life Index, along with psychologists and economists, all reflect the need to better understand subjective wellbeing (i.e., “happiness”). Though many contemporary economies have tracked crime, education, and economic production since the mid-1900s, subjective wellbeing only began to become a staple of world economic indicators in the 1970s.
Unlike GDP, which national income accounting began tracking in the 1930s, subjective wellbeing is a rather young indicator. Though there have been successful projects to extend GDP estimates back in time, such as the Maddison Historical GDP Project, no such effort has yet been proposed for subjective wellbeing. What is lacking are the longer historical trends, which would allow us to better understand how wellbeing responds to key historical events and changes, such as expansionary monetary policies, education, and longevity.
How can we extend existing subjective wellbeing measures when direct survey evidence was only initiated in the 1970s? The key insight in a recent article by me and the economists Eugenio Proto and Daniel Sgroi, "Historical Analysis of National Subjective Wellbeing Using Millions of Digitized Books," is that language conveys sentiment. The growing availability of digitized text thus provides unprecedented resources to construct a quantitative history of subjective wellbeing based on historical language use.
The foundation of our work involves combining multiple large corpora of natural language going back two centuries with state-of-the-art methods for deriving public mood (i.e., sentiment) from language. The recent digitization of books, newspapers, and other sources of natural language, such as the Google Books Ngram database, provides a historically unprecedented amount of data on what people thought and wrote over the past few centuries (Michel et al., 2011). These databases have already proved fruitful in detecting large-scale changes in language, such as recent work showing the evolution of learnability in American English, which correlates with social and demographic change (Hills & Adelman, 2015).
These data offer the capacity to infer public mood using sentiment analysis. Deriving sentiment from large collections of written text represents a growing scientific endeavour. Examples include recovering large-scale opinions about political candidates, predicting stock market trends, understanding diurnal and seasonal mood variation, detecting the social spread of collective emotions, and gauging the effects of events with large-scale societal impact, such as celebrity deaths, earthquakes, and economic bailouts (e.g., Pang & Lee, 2008).
Applying the same methods to historical text, we can begin to produce more quantitative accounts of national happiness.
In the approach we took, sentiment measures were based on valence norms for thousands of words. These already exist in the literature and were collected from a large group of individuals who were asked to rate a list of words on how those words make them feel (e.g., Warriner et al., 2013). In the present case, valence norms based on the Affective Norms for English Words have already been collected for five languages: English, French, Spanish, Italian, and German. We applied these norms to the Google Books corpus for each of these languages, allowing us to derive proxies for subjective wellbeing going back to 1776 for six nations.
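To make the idea concrete, here is a minimal sketch of a frequency-weighted valence score per year. It illustrates the general technique rather than the exact procedure in the article: the function name `yearly_valence`, the toy ratings, and the word counts are all made up for the example.

```python
from collections import defaultdict

# Toy valence norms on a 1-9 scale (illustrative values, not real ratings).
VALENCE = {"happy": 8.0, "peace": 7.5, "war": 2.0, "hunger": 2.5}


def yearly_valence(counts, norms):
    """Return the frequency-weighted mean valence for each year.

    counts: iterable of (year, word, count) tuples, e.g. rows of an n-gram table.
    norms:  dict mapping lowercase words to valence ratings.
    Words without a rating (such as "the") are ignored.
    """
    weighted = defaultdict(float)  # sum of rating * count per year
    total = defaultdict(int)       # sum of counts of rated words per year
    for year, word, n in counts:
        rating = norms.get(word.lower())
        if rating is None:
            continue
        weighted[year] += rating * n
        total[year] += n
    return {year: weighted[year] / total[year] for year in total}


# Hypothetical counts for two years.
counts = [
    (1925, "happy", 30), (1925, "war", 10), (1925, "the", 500),
    (1942, "happy", 10), (1942, "war", 30),
]
scores = yearly_valence(counts, VALENCE)
print(scores)
```

In this toy input the year with more "war" than "happy" gets the lower score, which is the kind of signal the text-based measure relies on.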
An initial comparison with subjective wellbeing collected with survey data is shown in Figure 1 (below). The data reflect the residuals after controlling for country fixed effects and clearly show a strong and significant correlation with our measure based on historic language. In other words, when people reported having higher subjective wellbeing in surveys, our measure based on the language they produced also indicated they had higher subjective wellbeing.
Figure 1. Comparison between survey measures of life satisfaction and residuals (after controlling for country fixed effects) for our measure based on sentiment from historic text. The grey area represents the 95% confidence interval.
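"Controlling for country fixed effects" can be sketched as subtracting each country's own average before comparing the two measures, so that only within-country variation remains. The example below is an illustration under that assumption, with invented numbers and hypothetical helper names (`demean_by_country`, `pearson`); it is not the article's estimation code.

```python
from collections import defaultdict
from math import sqrt


def demean_by_country(rows, key):
    """Subtract each country's mean of `key` from its observations."""
    sums, ns = defaultdict(float), defaultdict(int)
    for r in rows:
        sums[r["country"]] += r[key]
        ns[r["country"]] += 1
    return [r[key] - sums[r["country"]] / ns[r["country"]] for r in rows]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Invented data: country B reports lower life satisfaction overall but
# scores higher on the text measure, which masks the within-country link.
rows = [
    {"country": "A", "survey": 7.0, "text": 5.9},
    {"country": "A", "survey": 7.4, "text": 6.1},
    {"country": "B", "survey": 5.0, "text": 6.4},
    {"country": "B", "survey": 5.6, "text": 6.8},
]

r_raw = pearson([r["survey"] for r in rows], [r["text"] for r in rows])
r_within = pearson(demean_by_country(rows, "survey"),
                   demean_by_country(rows, "text"))
print(r_raw, r_within)
```

In this made-up data the raw correlation is negative because of level differences between the two countries, while the within-country correlation is strongly positive.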
Rolling the text-derived measures of subjective wellbeing back to 1776 reveals a quantitative picture of how public sentiment has changed across the six countries. Though we make clear in our article that this approach is not useful for long-term trends, it is nonetheless clear in Figure 2 that short-term events, such as the American exuberance of the 1920s, the Depression era, and World Wars I and II, show clear and distinguishable influences on subjective wellbeing. I only provide the data for the USA, the UK, and Germany; data on the other countries can be found in the original article (Hills, Proto, & Sgroi, 2015).
Figure 2. USA
Figure 2. Britain
Figure 2. Germany
We can use the data above to make additional predictions about the relative influence of different historical events. For example, in our article, we report a number of interesting relationships:
- We find a positive short-run effect of GDP and life expectancy on subjective wellbeing.
- An increase of 1% in life expectancy is equivalent to a more than 5% increase in yearly GDP.
- One year of internal conflict costs the equivalent of a 50% drop in GDP per year in terms of subjective wellbeing. Compare this with the 25% drop in Greek GDP following the crisis in 2010.
- Public debt, on the other hand, has a short-run positive effect, as one might hope for expansionist fiscal policies.
- Our estimated index of subjective wellbeing generally does not feature any positive trend, which is consistent with the Easterlin paradox, although we caution against long-term analysis given the historical variation of written texts (which parallels similar issues with historical GDP statistics).
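One way to read the "equivalent to" statements in the list above is as a ratio of regression coefficients. As a sketch, assuming a linear model in log changes (the symbols below are introduced here for illustration and are not taken from the article):

```latex
\Delta W = \beta_{G}\,\Delta \ln(\mathrm{GDP}) + \beta_{L}\,\Delta \ln(\mathrm{LE}) + \dots
```

Under this assumption, a 1% increase in life expectancy (LE) changes wellbeing W by as much as an x% increase in GDP when

```latex
\beta_{G}\, x = \beta_{L} \cdot 1 \quad\Longrightarrow\quad x = \frac{\beta_{L}}{\beta_{G}},
```

so "more than 5%" would correspond to a coefficient on life expectancy more than five times the coefficient on GDP.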
Why is a quantitative history of wellbeing important?
The fledgling state of wellbeing data has limited our collective ability to understand how wellbeing responds to different historic events. This has in turn limited the use of wellbeing in public policy, health initiatives, and financial decision making. In practice, if subjective wellbeing is to become a key factor in guiding our collective behaviour, then we need accounts of wellbeing on par with those of GDP. Using wellbeing as a measure to guide behaviour, however, takes more than the desire to simply improve wellbeing. As noted by Daniel Gilbert in Stumbling on Happiness, people are poor at what is called affective forecasting (predicting how one will feel in the future), and with this comes a limited capacity to understand how prior events and decisions influenced our past happiness. To overcome this, especially at the level of government, we must develop our capacity to predict how wellbeing responds to both deliberate and unexpected events.
To take one real-world example, a recent article by Filipe Campante and David Yanagizawa-Drott in The Quarterly Journal of Economics, "Does Religion Affect Economic Growth and Happiness? Evidence from Ramadan," found that though Ramadan has a negative impact on economic growth in countries practicing Ramadan, subjective wellbeing increases.
Better prediction of economic fortunes was the motivation for the national income accounting that followed the depression of the 1930s and later gave rise to GDP. Of course, numerous decisions are now based on GDP, despite broad acceptance of the words of Robert F. Kennedy. Governments and other agencies increasingly recognize the importance of an additional ‘emotional accounting’ alongside GDP and, by all accounts, they want to understand how better to use it to improve future wellbeing. But to do that, we need historically informed accounts of what this means.
Campante, F. R., & Yanagizawa-Drott, D. H. (2013). Does Religion Affect Economic Growth and Happiness? Evidence from Ramadan (No. w19768). National Bureau of Economic Research.
Hills, T. T., & Adelman, J. S. (2015). Recent evolution of learnability in American English from 1800 to 2000. Cognition, 143, 87-92.
Hills, T. T., Proto, E., & Sgroi, D. (2015). Historical analysis of national subjective wellbeing using millions of digitized books. IZA Discussion Paper No. 9195.
Michel, J.-B., Shen, Y. K., Aiden, A. P., Veres, A., Gray, M. K., Pickett, J. P., et al. (2011). Quantitative analysis of culture using millions of digitized books. Science, 331 (6014), 176-182.
Pang, B., & Lee, L. (2008). Opinion mining and sentiment analysis. Foundations and Trends in Information Retrieval, 2 (1-2), 1-135.
Warriner, A. B., Kuperman, V., & Brysbaert, M. (2013). Norms of valence, arousal, and dominance for 13,915 English lemmas. Behavior Research Methods, 45 (4), 1191-1207.