Hospitals in the United States are again filling up with COVID-19 patients, most of whom have refused COVID-19 vaccinations, likely due in part to medical misinformation circulating online and on television. We are eight months removed from a violent attack on the U.S. Capitol, organized online and carried out by individuals who believe, incorrectly, that the 2020 election was stolen. The majority of Republicans polled believe that Donald Trump is the legitimate American president.
It should come as no surprise that many people, including academic researchers, journalists, and the U.S. surgeon general, are very interested in the effects of misinformation and disinformation on society, and are particularly focused on information shared online. This profound desire to comprehend the relationship between our informational environment and civic behavior is necessary context for this week’s battle between social-media platforms and the scholars who want to understand them better.
On August 3, Facebook shut down the accounts of researchers associated with the Ad Observatory, an NYU-based research project that collected ads displayed on Facebook pages. According to Facebook, the researchers were engaged in unauthorized data collection, and they were stopped, Facebook explained, to “protect people’s privacy in line with our privacy program under the FTC Order.”
The order in question is the settlement Facebook reached after an academic researcher transferred the personal data of 87 million Facebook users to the political-consulting firm Cambridge Analytica. Facebook accepted a $5 billion fine from the Federal Trade Commission and a set of restrictions on how user data could be used. (The FTC has made clear to Facebook, and to the public, that the Ad Observatory project is not prohibited by the consent decree, writing that “the consent decree does not bar Facebook from creating exceptions for good-faith research in the public interest.”)
While Facebook has good reasons—financial and reputational, not to mention ethical—to be worried about violating user privacy, NYU’s project is far from a repeat of the Cambridge Analytica scandal. In that instance, users who took an apparently innocuous “personality quiz” unwittingly gave personal data about themselves and their Facebook friends to a political-marketing firm. With Ad Observatory, 6,500 Facebook users have voluntarily installed a web-browser plug-in that communicates what ads they encounter on Facebook to a central repository. The plug-in site is admirably clear about what information it does and doesn’t collect, and its open-source code has been reviewed by Mozilla, the privacy-focused nonprofit, to ensure that it complies with Ad Observatory’s stated privacy policies.
Unlike the 87 million victims of Cambridge Analytica, the 6,500 users who make Ad Observatory possible have entirely opted in—meaning they know what information is being collected on them and why. One of the major purposes of the Ad Observatory project is to ensure that Facebook’s Ad Library—one of the company’s most meaningful efforts at transparency as mandated under the FTC order—accurately reflects what ads actually run on Facebook. Another goal is learning targeting patterns for political ads, information that Facebook does not make available through its Ad Library. This information is crucial for understanding how political candidates deliver different messages to different voters, a technique that could allow a candidate to try to suppress turnout of one voting group while encouraging the turnout of a different group.
The battle between Facebook and NYU is important not just because we need better information on political advertising online. The method that Ad Observatory uses to understand Facebook is an online research panel, and it’s deeply helpful for asking two types of questions: How common is a particular behavior online? And who participates in this behavior?
Researchers working with the Anti-Defamation League to understand extremist content on YouTube show why panels are such useful research tools. The researchers analyzed data donated by 915 demographically diverse users to understand what videos they encountered online. In the months the study covered, 9.2 percent of YouTube users encountered at least one extremist or nationalist video online, which suggests that these videos are disturbingly common, but far from universally viewed.
What’s perhaps most interesting about the ADL study is that more than 90 percent of the extremist-content views on YouTube came from a small group of users who had answered demographic-survey questions in a way that identified them as high in racial or gender resentment. On the one hand, it is far from surprising to learn that racists watch racist videos. On the other hand, understanding that audiences for extremist content are concentrated, rather than distributed evenly through the overall population of YouTube users, is exactly the sort of information that policy makers would need to fight the spread of extremism and associated violence.
The Markup, a newsroom focused on examining how algorithms and technologies shape society, has started a project called Citizen Browser, in which more than 2,000 Facebook users, recruited as a panel to represent a broad demographic spectrum, are paid to share their Facebook data. The Markup used data from Citizen Browser to demonstrate that Facebook was recommending political groups to users despite Mark Zuckerberg’s promises that Facebook would end these practices. Mozilla has launched Rally, an ambitious data-donation program for users of its Firefox browser. Princeton researchers are already using Rally to track how often users encounter COVID-19 misinformation online and how they share it, important information for countering anti-vaccine misinformation.
Whether Facebook continues to block NYU researchers from its platform or relents under pressure, we are long overdue for a conversation about how journalists, researchers, and citizens can know what’s actually happening on major technology platforms. Last year, I co-led a team that interviewed dozens of researchers about access to data concerning behavior on social-media platforms. Not one of the researchers we spoke with felt they had the data they needed to understand behavior online, including the researchers working with platforms to set up researcher access to those platforms. (Twitter expanded researcher access after our report was finalized in ways that would have met the needs of some of our interviewees.)
Many researchers were optimistic about answering questions using data from projects like the one Facebook has just blocked. Others have turned to unauthorized and adversarial data collection, scraping data from sites such as Gab and Parler to understand the spread of misinformation and disinformation on sites actively hostile toward researchers studying those topics.
The conflict over Ad Observatory scratches the surface of a much more complex question. Given the influence that social networks have on contemporary politics and social behavior, should companies be allowed veto power over research conducted on their platforms? If not, can Congress step in to create a researcher exemption to laws like the Computer Fraud and Abuse Act to ensure that academic and journalistic researchers are not prosecuted as hackers and mischaracterized by companies focused primarily on protecting their profits?
Facebook and other platform companies say they believe that the tools they build are, on balance, benefiting humanity. Until they let projects like Ad Observatory study key civic questions in careful and privacy-preserving ways, no one should believe them.