What Facebook Did for Chauvin’s Trial Should Happen All the Time

If the social-media giant can discourage hate speech and incitements to violence on a special occasion, it can do so all the time.


On Monday, Facebook vowed that its staff was “working around the clock” to identify and restrict posts that could lead to unrest or violence after a verdict was announced in the murder trial of the former Minneapolis police officer Derek Chauvin. In a blog post, the company promised to remove “content that praises, celebrates or mocks” the death of George Floyd. Most of the company’s statement amounted to pinky-swearing to really, really enforce its existing community standards, which have long prohibited bullying, hate speech, and incitements to violence.

Buried in the post was something less humdrum, though: “As we have done in emergency situations in the past,” declared Monika Bickert, the company’s vice president of content policy, “we may also limit the spread of content that our systems predict is likely to violate our Community Standards in the areas of hate speech, graphic violence, and violence and incitement.” Translation: Facebook might turn down the dial on toxic content for a little while. Which raises some questions: Facebook has a toxic-content dial? If so, which level is it set at on a typical day? On a scale of one to 10, is the toxicity level usually a five—or does it go all the way up to 11?

This is not the first time Facebook has talked about reducing the amplification of inflammatory posts to make its platform a better and safer place. In the run-up to and the aftermath of the 2020 presidential election, Facebook talked about the “break glass” measures it was taking to limit the spread of misinformation and incitements to violence in the United States. Such steps had previously been reserved for “at-risk countries” such as Myanmar, Ethiopia, and Sri Lanka. Now these exceptional measures may be deployed in Minneapolis, which Facebook has “temporarily deemed to be a high-risk location” because of the Chauvin trial.

When Facebook detects heightened social tension, it puts in place policies to promote authoritative information and add “friction” so that content on the site moves more slowly and more calmly than usual. In short, the company reins in the virality that online platforms otherwise so spectacularly enable. Facebook largely avoided epic failure on Election Day, the New York Times columnist Kevin Roose asserted, by “dialing back the very features that have powered the platform’s growth for years.”
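Facebook has only gestured at what that “friction” means in engineering terms, so the sketch below is purely hypothetical: every name and number in it, from the BREAK_GLASS flag to the Post fields to the multipliers, is invented for illustration. It shows one way a ranking system could demote fast-spreading reshares and boost authoritative sources during a high-risk period, not how Facebook actually does it:

```python
# A hypothetical sketch of "break glass" friction, not Facebook's actual
# ranking code. Every name and number here (Post, rank_score, BREAK_GLASS,
# the 0.8 and 1.5 multipliers) is invented for illustration.

from dataclasses import dataclass

BREAK_GLASS = True  # flipped on for a high-risk event or location

@dataclass
class Post:
    base_score: float         # ordinary engagement-based ranking score
    reshare_depth: int        # how many hops down a reshare chain this copy is
    from_authoritative: bool  # e.g., a vetted news or civic-information source

def rank_score(post: Post) -> float:
    score = post.base_score
    if BREAK_GLASS:
        # Friction: each hop in a reshare chain costs distribution, so viral
        # cascades decay instead of compounding.
        score *= 0.8 ** post.reshare_depth
        # Promote authoritative information over raw engagement.
        if post.from_authoritative:
            score *= 1.5
    return score

# A fourth-hand reshare of an inflammatory post now ranks below a fresh post
# from an authoritative source with the same baseline engagement.
print(rank_score(Post(10.0, reshare_depth=4, from_authoritative=False)))  # ~4.1
print(rank_score(Post(10.0, reshare_depth=0, from_authoritative=True)))   # 15.0
```

The point of a measure like this is that nothing gets deleted; the same posts simply travel more slowly while the flag is on.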

These kinds of decisions—about what platforms choose to amplify or add friction to—are the most important in content moderation. Discussion about content moderation tends to focus on binary decisions concerning whether individual pieces of content are left up or taken down. But content moderation is much more about knobs and dials that regulate the overall flow of posts. An individual piece of content is a mere drop in the ocean of Facebook content; the underlying systems that move this content around are the tides. The public discussion about content moderation typically fixates on the drops—what should Facebook have done with Donald Trump’s posts?—but when you’re weathering a storm, what matters is the tides.

Mark Zuckerberg agrees. In 2018, the Facebook CEO laid out a “blueprint for content governance” that included what the Harvard law professor Jonathan Zittrain has said should be the “most famous graphs in content moderation.” Stay with me—the graphs are interesting.

A graph of engagement and prohibited content.
Courtesy of Facebook

The graph above shows what usually happens whenever Facebook draws a policy line. Posts that clearly fall beyond that line—overt and credible incitements to violence against police or protesters, to choose a hypothetical example—will be taken down. But posts that don’t cross the line will naturally gain more engagement as they tiptoe closer to it. In this default world, a post that uses hateful language targeted at police might stay up (police are not a protected class) and attract more likes and shares than a post criticizing law enforcement on policy grounds. The more inflammatory post, in other words, would likely land on the upward curve of user engagement. (These examples are speculative, because Facebook does not provide specific details about what “approaching the line” means.)

But platforms can train their systems to recognize this “borderline content” and make engagement look like the graph below:

A graph of engagement and prohibited content.
Courtesy of Facebook

In this scenario, the more inflammatory a post is, the less distribution it gets. Posts describing police in hateful terms might stay up but be shown to fewer people. According to Zuckerberg, this strategy of reducing the “distribution and virality” of harmful content is the most effective way of dealing with it.
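A toy model makes the relationship between the two graphs concrete. The functions and constants below are assumptions made up for this article, not anything Facebook has published; “violation_likelihood” stands in for whatever score a classifier might assign to borderline content, and the penalty grows as that score approaches the policy line:

```python
# A toy model of the two graphs, not Facebook's implementation.
# "violation_likelihood" stands in for a classifier's estimate of how close a
# post comes to the policy line (0.0 = clearly fine, 1.0 = at the line); the
# formulas and constants are assumptions made up for illustration.

def natural_engagement(violation_likelihood: float) -> float:
    """First graph: engagement climbs as content approaches the policy line."""
    return 1.0 + 4.0 * violation_likelihood

def adjusted_distribution(violation_likelihood: float) -> float:
    """Second graph: a demotion penalty that grows near the line overwhelms
    the natural engagement boost."""
    penalty = (1.0 - violation_likelihood) ** 2
    return natural_engagement(violation_likelihood) * penalty

for p in (0.0, 0.5, 0.9):
    print(p, round(natural_engagement(p), 2), round(adjusted_distribution(p), 2))
# 0.0 -> engagement 1.0, distribution 1.0
# 0.5 -> engagement 3.0, distribution 0.75
# 0.9 -> engagement 4.6, distribution 0.05
```

With the penalty applied, the most borderline content, which would otherwise be the most engaging, ends up with the least distribution.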

He’s right: The strategy works! Facebook has recently touted reductions in the amount of hate speech and graphic content that users see on its platform. How did it make these improvements? Not by changing its rules on hate speech. Not by hiring more human content moderators. Not by refining artificial-intelligence tools that seek out rule-breaking content to take down. The progress, the company said, was “mainly due to changes we made to reduce problematic content in News Feed.” The company used dials, not on-off switches.

Facebook’s critics accuse it of spreading hate and violent content because such material increases users’ time on the site and therefore the company’s profits. That trope is probably overblown and too simplistic. Advertisers don’t like their ads running next to divisive content, and in the long term, users won’t keep coming back to a platform that makes them feel disgusted. Still, some leaks from employees have detailed projects to tamp down divisive or harmful content that were killed internally for business reasons. And the top 10 most-engaged-with posts after the election contained many more mainstream press accounts than before the break-glass measures took effect. The list looked so different from the usual fare of right-wing viral content that Facebook released a blog post trying to explain it. (The company conceded that the temporary measures had played a role, but suggested they were not the primary driver for the change.)

Without any independent access to internal data, outsiders can’t know how much of a difference Facebook’s break-glass measures make, or where its dials usually sit. But Facebook has a reason for announcing these steps. (To the company’s credit, it at least announced measures in anticipation of the Chauvin verdict; other platforms seemed simply to keep their heads down.) What the company hasn’t explained is why its anti-toxicity measures need to be exceptional at all. If there’s a reason that turning down the dials on likely hate speech and incitement to violence all the time would be a bad idea, I don’t see it.

Facebook’s old internal motto “Move fast and break things” has become an albatross around its neck, symbolizing how the company prioritized growth and scale, leaving chaos behind it. But when confronted with inflammatory content, the platform should move faster and break more glass.

The Chauvin trial may be a unique event, but racial tension and violence are clearly not. Content on social media leading to offline harm is not confined to Minneapolis or the U.S.; it is a global problem. Toxic online content is not an aberration, but a permanent feature of the internet. Platforms shouldn’t wait until the house is burning down to do something about it.