They also found that the Bible Belt, stretching across the American South and into Texas, was less happy than the West or New England. The saddest town of the 373 urban areas studied was Beaumont in east Texas. The happiest was Napa, California, home of many wine makers. The only town among the 15 saddest that was not in the South or Rust Belt was Waterbury, Connecticut. (Then again, Waterbury has appeared on several "worst places to live" lists, which seem like mean lists to make.)
The researchers coded each tweet for its happiness content, based on the appearance and frequency of words determined by Mechanical Turk workers to be happy (rainbow, love, beauty, hope, wonderful, wine) or sad (damn, boo, ugly, smoke, hate, lied). While the researchers admit their technique ignores context, they say that for large datasets, simply counting the words and averaging their happiness content produces "reliable" results.
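To make the counting-and-averaging idea concrete, here is a minimal sketch in Python. It is not the researchers' code. The word scores below are made-up placeholders on an assumed 1-to-9 scale, and the function name average_happiness is hypothetical. The sketch only shows the general shape of the approach: count the rated words, ignore everything else, and take the frequency-weighted average of their scores.

```python
from collections import Counter

# Placeholder scores on an assumed 1-9 scale. These numbers are invented
# for illustration and are NOT the ratings used in the study.
WORD_SCORES = {
    "love": 8.0,
    "hope": 7.0,
    "wine": 6.0,
    "damn": 3.0,
    "hate": 2.0,
}


def average_happiness(texts, scores=WORD_SCORES):
    """Return the frequency-weighted mean score of rated words in texts.

    Context is ignored on purpose: each rated word simply counts once
    every time it appears. Returns None if no rated word is found.
    """
    counts = Counter()
    for text in texts:
        for token in text.lower().split():
            word = token.strip(".,!?\"'()")
            if word in scores:
                counts[word] += 1

    total = sum(counts.values())
    if total == 0:
        return None
    return sum(scores[word] * n for word, n in counts.items()) / total
```

Because context is thrown away, "I don't love this" counts the same as "I love this," which is exactly the limitation the researchers concede and argue washes out over large datasets.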
Here's a closer look at how they calculated a happiness score for the top and bottom cities. The illustration is a little confusing, so let's walk through it, because it really shows the methodology of the research.
Next to each word are two symbols, a plus or minus (+/-) and up or down arrows. The plus or minus indicates whether that word is considered happy or sad. The up or down arrow indicates whether that word was used more or less than average in that city. So, let's take 'shit' as an example. Shit, a negative word, was used less often in Napa and more often in Beaumont. The size of the bar that you see shows how much that word contributed to the happiness rating for the city. So, the lack of shits in Napa played a substantial role in its high rating, while the prevalence of shits hurt Beaumont's happiness rating. Looking just at Beaumont, one can see why it got a low rating. The only positive words at the top of its ledger are "lol" and "haha," and there were not enough hahas to bring it up to the national average. The rest of the words -- shit, ass, damn, gone, no, bitch, hell -- were negative and used often.
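The logic of the chart can be sketched in a few lines of Python. This is an illustration of the description above, not the paper's exact formula. It assumes a word's contribution is its score relative to a reference average, multiplied by how much more or less often the city uses it than the nation does. The function name word_contributions and all of the inputs are hypothetical.

```python
def word_contributions(scores, city_freq, national_freq, reference_avg):
    """Describe how each word pushes a city's rating up or down.

    For every rated word, returns a tuple of:
      sign  -- "+" if the word is happier than the reference average, else "-"
      arrow -- "up" if the city uses it more than the nation, "down" if less,
               "same" if the frequencies are equal
      value -- assumed contribution: (score - reference_avg) * frequency gap
    """
    result = {}
    for word, score in scores.items():
        gap = city_freq.get(word, 0.0) - national_freq.get(word, 0.0)
        sign = "+" if score > reference_avg else "-"
        if gap > 0:
            arrow = "up"
        elif gap < 0:
            arrow = "down"
        else:
            arrow = "same"
        result[word] = (sign, arrow, (score - reference_avg) * gap)
    return result
```

Under this assumption, a negative word used less often than average yields a positive contribution (a negative times a negative), which matches the Napa case above. A negative word used more often than average yields a negative contribution, as in Beaumont.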
For individual cities, the Vermont researchers note, the amount of swearing contributed substantially to their final scores. They think it's worth investigating this phenomenon, which they call "geoprofanity."
One difficulty I have with the study is that it doesn't take into account that people might just talk about happiness differently in some parts of the country or within some demographic groups. The study identified people with Norwegian ancestry as happier than African Americans. Is that because the Norwegians are actually happier or do they just tweet as if they're happier?
This is not an easy problem to solve, but the authors of the new paper do an admirable job showing that their data correlates with other existing measures of happiness, primarily surveys conducted by Gallup. They also show that their happiness data correlates with income and the prevalence of obesity in an area.
We should also note that many people vacation in Napa (the top city) and Hawaii (the top state), which might throw off the numbers at the very top. But if you look a bit farther down the lists, you see cities (Longmont, Green Bay, Spokane, San Jose) and states (Idaho, Maine, Washington) that are not year-round tourism hot spots, but still score very high on the hedonometer.
Another problem is that the researchers did not look at Twitter in Spanish. If the researchers' contention that income is positively correlated with happiness is true, cities where the poor population is primarily Spanish-speaking would appear happier on this list than warranted. The prevalence of Western states with large Latino populations on the happy list would seem to suggest this bias is worth exploring.
Nonetheless, it's fascinating to see people exploring how to quantify happiness beyond survey data. I'd love to see examples of cities that overperform on happiness relative to their economic factors. Do they just have good weather or has some set of policies had an actual impact?
Update: The list of happiest and saddest states was incorrect in a caption in the original paper. The paper has been corrected, and so we have changed the list here, too.