Using data scraped from users also raises interesting ethical questions. "You took that photo with your cell phone, so it has GPS embedded into the meta information," Jackson says. "I know the time, date, and who took it and where. There’s no reason I can’t use that in an information map. But I don’t know whether that’s unethical to do without that person’s permission."
Nicholas Evans, who works on research ethics and information technology at the University of Pennsylvania, says that these are questions the field will have to grapple with if scientists want to start using this kind of data in published papers. "Say we’re trawling through Twitter looking to make a map," Evans says, "then we can expect that data will be reasonably anonymized. We’ll just use the location information and nothing else. But if we’re using really small amounts of data, we’re talking about revealing where people live, depending on where they found it, and that’s different. We’ve learned that it’s a really bad idea to broadcast where you live on the Internet."
If you took that photo in your backyard, in other words, anybody who reads that paper knows where your backyard is.
There’s also the issue of crediting amateurs who may have been instrumental in a scientific discovery. In the lacewing case, Winterton and Guek were coauthors on the paper. In other cases, a scientist might name the new species after whoever snapped the photo. I asked Jackson if we’re about to enter an era of species with names like "frogsplaysoccer145" or "h0ckeymom."
"If all I’ve got is a username, and I can’t get in touch with them, personally I would consider it," he says. "It would get me some side-eye, but I wouldn’t have a problem if that’s the proper credit to put there."
Storing all this data will be crucial, too, and it is another place where privacy and science might butt heads. If someone deletes his or her Instagram or Twitter account, making the conscious choice to remove that information from the web, can researchers still hang on to an archived copy?
The solution might be as simple as contacting the photographer and working out what she’s comfortable with, as Winterton did with Guek. But in cases where you’re trying to use thousands of images, that can get unwieldy. And in the case of the ants, where the original footage has been reposted by someone else, finding the person who first captured an image or shot a video can be difficult.
As with almost everything, the field will likely struggle to keep up with technology. "Taxonomists work on the order of decades to almost centuries in some cases," Jackson says. "So to have things like this where you could have a new data set every day to be testing your ideas and hypotheses with—it’s outside the scope of what we’re used to working with."
If they can figure it out, though, researchers could be on the verge of tapping into an incredibly valuable resource. More than 500 million photos are uploaded and shared every day. Some of those photos are bound to depict things nobody has ever seen before. It’s just a matter of finding them.