Throughout the contest, Khan monitored the Kaggle forums, where competitors were sharing ideas and frustrations. After one poster asked, “Do those white thinger-doodles change over time?” Khan posted some tips and basic terminology for whale identification: the white thinger-doodles are callosities; the nose end of the whale is its bonnet.
“Is the whale in w_8026.jpg pooping?” asked another competitor.
“Yes!” Khan responded. “We marine biologists refer to it as defecation.”
The competition’s winning entry, announced in early January, came from a team at the Warsaw office of the data-science company deepsense.io. Their algorithm could identify whales with 87-percent accuracy.
One of the team’s core members, the data scientist Jan Kanty Milczek, says the challenge was more similar to human facial recognition than he’d expected. After cropping a photo around a whale’s head, the next step was to get the computer to align it a certain way, with the whale’s blowhole on one side and its bonnet on the other, a process he likens to “making the passport photo of each whale.”
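The "passport photo" step can be pictured as a small geometry problem: given two landmarks on the head, find the rotation, scale, and shift that moves them to fixed spots in a standard frame. The sketch below is only an illustration under stated assumptions, not the deepsense.io team's code. The function name `alignment_transform`, the 256-pixel frame, the target positions, and the sample coordinates are all hypothetical.

```python
def alignment_transform(blowhole, bonnet, width=256, height=256):
    """Return a function mapping source (x, y) pixels into an aligned frame.

    Assumption: the blowhole and bonnet have already been located in the photo.
    """
    # Treat 2-D points as complex numbers; a rotation + uniform scale + shift
    # is then simply z -> a * z + b.
    p1, p2 = complex(*blowhole), complex(*bonnet)
    if p1 == p2:
        raise ValueError("blowhole and bonnet must be distinct points")

    # Hypothetical target layout: blowhole at left-centre, bonnet at right-centre.
    q1 = complex(0.25 * width, 0.5 * height)
    q2 = complex(0.75 * width, 0.5 * height)

    a = (q2 - q1) / (p2 - p1)  # rotation and uniform scale
    b = q1 - a * p1            # translation

    def apply(point):
        z = a * complex(*point) + b
        return (z.real, z.imag)

    return apply


# Made-up landmark coordinates for one photo.
to_aligned = alignment_transform(blowhole=(120, 340), bonnet=(300, 100))
print(to_aligned((120, 340)))  # lands at about (64, 128)
print(to_aligned((300, 100)))  # lands at about (192, 128)
```

Every whale's head ends up in the same pose, so a later comparison step sees differences in the callosities rather than differences in camera angle.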
The team used a neural network, a kind of computer program that learns by example. The scientists trained the neural network to search for patterns among the photos, first on the scale of a few pixels, then across progressively larger swaths of an image. “I think of it as more like giving hints to the actual algorithm than doing things for it,” Milczek says.
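One way to see how a network can go from a few pixels to larger swaths is to track how much of the original image each layer can "see." The sketch below assumes a convolutional network, which the article does not specify, and the layer stack is hypothetical rather than the team's actual design. It applies the standard receptive-field recurrence for stacked convolution and pooling layers.

```python
def receptive_fields(layers):
    """Return the receptive-field width (in input pixels) after each layer.

    layers: list of (kernel_size, stride) pairs, applied in order.
    """
    size, jump = 1, 1  # one input pixel; neighbouring outputs are 1 pixel apart
    sizes = []
    for kernel, stride in layers:
        size += (kernel - 1) * jump  # each extra kernel tap widens the view
        jump *= stride               # striding spreads outputs further apart
        sizes.append(size)
    return sizes


# Hypothetical stack: a 3x3 convolution followed by 2x2 pooling, three times.
layers = [(3, 1), (2, 2), (3, 1), (2, 2), (3, 1), (2, 2)]
print(receptive_fields(layers))  # [3, 4, 8, 10, 18, 22]
```

In this toy stack the first layer responds to a 3-pixel-wide patch, while the last responds to a patch 22 pixels wide, which is the sense in which deeper layers look at larger regions.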
Khan says her next step will be to talk with other members of the right-whale research community and decide whether they should move ahead with creating software that uses the winning team’s algorithm, or solicit a second algorithm for identifying whales from vessel photos and then package the two together as a single piece of software.
Either way, Khan says, the deepsense.io algorithm could help researchers in several ways: Identifying a whale immediately would be useful to scientists doing biopsies of whales to study their genetics, for example. If they pull up alongside an animal and can identify it right away as one they’ve already tested, they won’t need to bother it a second time.
Then there are entangled whales, which have become caught in bits of fishing gear and drag it behind them as they swim. When researchers spot an entangled whale, they contact a network of on-call disentanglement experts along the coast. Responders will jump in a boat, race to the scene, and decide whether to try to help the animal. (Some entanglements are life-threatening; at other times it’s better to leave the animal alone and keep monitoring it.) If these responders can immediately identify an animal, Khan says, they’ll be able to pull up other photographs to get a better idea of how much gear it’s dragging, or how exactly that gear is attached.
But perhaps the biggest benefit of facial-recognition software for right whales, Khan says, is that it would free up researchers’ time to do actual research. Rather than spending long hours in the office clicking through the right-whale catalog, these chronically underfunded teams could use that time to collect data out in the field or work on papers for publication. Someday, similar software could even help the researchers who study other marine mammals—bottlenose dolphins are identified by their dorsal fins, for example, and humpback whales by their distinctive tail patterns.
In short, the software would buy researchers more time—and help the endangered whales they study have more time on the planet, too.