In the crucial early hours after the Las Vegas mass shooting, it happened again: hoaxes, completely unverified rumors, failed witch hunts, and blatant falsehoods spread across the Internet.

But they did not do so by themselves: they used the infrastructure that Google and Facebook and YouTube have built to achieve wide distribution. These companies are the most powerful information gatekeepers that the world has ever known, and yet they refuse to take responsibility for their active role in damaging the quality of information reaching the public.

BuzzFeed’s Ryan Broderick found that Google’s “top stories” results surfaced 4chan forum posts about a man whom right-wing amateur sleuths had incorrectly identified as the Las Vegas shooter. 4chan is a known source not just of racism, but of hoaxes and deliberate misinformation. In any list a human might make of sites to exclude from being labeled as “news,” 4chan would be near the very top.

Yet there Google was, surfacing 4chan as people desperately searched for information about this wrongly accused man, adding fuel to the fire, amplifying the rumor. This is playing an active role in the spread of bad information, poisoning the news ecosystem. The problem can be traced back to a change Google made in October 2014 to include non-journalistic sites in the “In the News” box instead of pulling from Google News. But one might have imagined that not every forum site could be included. The idea that 4chan would be within the universe that Google might scrape is horrifying.

Worse, when I asked Google about this, and indicated why I thought it was a severe problem, they sent back boilerplate:

“Unfortunately, early this morning we were briefly surfacing an inaccurate 4chan website in our Search results for a small number of queries. Within hours, the 4chan story was algorithmically replaced by relevant results. This should not have appeared for any queries, and we’ll continue to make algorithmic improvements to prevent this from happening in the future.”

It’s no longer good enough to note that something was algorithmically surfaced and then replaced. It’s no longer good enough to shrug off (“briefly,” “for a small number of queries”) the problems in the system simply because it has computers in the decision loop.

After I followed up with Google, they sent a more detailed response, which I cannot directly quote, but can describe. It was primarily an attempt to minimize the mistake Google had made, while acknowledging that they had made a mistake.

4chan results, they said, had not shown up for general searches about Las Vegas, but only for the name of the misidentified shooter. The reason the 4chan forum post showed up was that it was “fresh” and there were relatively few searches for the falsely accused man. Basically, the algorithms controlling what to show didn’t have a lot to go on, and when something new popped up as searches for the name were ramping up, they were happy to slot it in as the first result. The note further explained that what shows up “In the News” derives from the “authoritativeness” of a site as well as the “freshness” of the content on it. And Google acknowledged they’d made a mistake in this case.

The thing is: this is a predictable problem. In fact, there is already a similar example in the extant record. After the Boston bombings, we saw a very similar “misinformation disaster.”

Gabe Rivera, who runs a tech news service called Techmeme that uses humans and algorithms to identify important stories, addressed the problem in a tweet. Google, he said, couldn’t be asked to hand-sift all content but “they do have the resources to moderate the head,” i.e., the most important searches.

The truth is that machines need many examples to learn from. That’s something we know from all the current artificial intelligence research. They’re not good at “one shot” learning. But humans are very good at dealing with new and unexpected situations. Why are there not more humans inside Google who are tasked with basic information filtering tasks? How can this not be part of the system, given that we know the machines will struggle with rare, breaking news situations?

Google is too important, and from what I’ve seen reporting on them for 10 years, the company does care about information quality. Even from a pure corporate trust and brand perspective, wouldn’t it be worth it to have a large enough team to make sure they get these situations right across the globe?

Of course, it is not just Google. On Facebook, a simple search for “Las Vegas” yields a Group called “Las Vegas Shooting /Massacre,” which sprang up after the shooting and already has more than 5,000 members. The group is run by Jonathan Lee Riches, who gained notoriety by filing 3,000 frivolous lawsuits while he was in prison for 10 years for stealing money by impersonating people whose bank credentials had been phished. Now, he calls himself an “investigative journalist” with Infowars, though there is no indication he’s been published on the site, and given that he also lists himself as a former Male Underwear Model at Victoria’s Secret, a former Nuclear Scientist at Chernobyl, and a former bodyguard at Buckingham Palace, his work history may not be reliable.

The problem with surfacing this man’s group to Facebook users is obvious to literally any human. But to Facebook’s algorithms, it’s just a fast-growing group with an engaged community. Most people who joined the group looking for information presumably don’t know that the founder is notorious for legal and informational hijinks.

Meanwhile, Kevin Roose of The New York Times pointed out that Facebook’s Trending Stories page was surfacing stories about the shooting from Sputnik, a known source of Russian propaganda. Facebook’s statement was, like Google’s, designed to minimize what had happened.