Massive Volunteer Collective Proofreads 25,000 Public-Domain Books

By Rebecca J. Rosen
Image: sanickels/Flickr

Give these people a spot in the Guinness Book of World Records, because they surely deserve one: As of today, 100,000 people around the world have taken part in a massive proofreading project to correct the electronic texts of 25,000 publicly available books on the Project Gutenberg site.

Project Gutenberg relies on computers to "read" scanned books and convert the print into e-book-ready texts. The problem is that when it comes to reading a scanned text, a computer's "eyes" are inferior to a human's. During the conversion, tons of small errors creep in -- and humans are the only machines we have for ferreting them out. This is where the Distributed Proofreaders project comes in.

Take, for example, this passage from Isle o' Dreams by Frederick F. Moore, as it appears on the Distributed Proofreaders site, and compare the scanned version with the textual output the computer discerned:

[Screenshot: the scanned page of Isle o' Dreams alongside the computer's text output]

The computer's output pretty clearly needs help ("and" vs. "arid," just in case the image isn't clear enough here). It's just the sort of mistake a computer would make. And it's also just the sort of persnickety, thankless task the 100,000-odd volunteers of Distributed Proofreaders have taken care of over the last decade. As a result, 25,000 of Project Gutenberg's e-books aren't just free; they're free of small, computer-induced copy mistakes too.
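For the curious, here is a rough sketch of what comparing a computer's reading against a human-corrected line might look like in code. It is only an illustration, not how Distributed Proofreaders actually works: the function name `word_differences` is invented for this example, and the sample sentence is made up rather than taken from the book.

```python
import difflib


def word_differences(ocr_line, corrected_line):
    """Return (ocr_text, corrected_text) pairs where the two lines disagree."""
    ocr_words = ocr_line.split()
    fixed_words = corrected_line.split()
    # Compare the lines word by word rather than character by character.
    matcher = difflib.SequenceMatcher(a=ocr_words, b=fixed_words, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            changes.append(
                (" ".join(ocr_words[i1:i2]), " ".join(fixed_words[j1:j2]))
            )
    return changes


# Hypothetical sample line, NOT the actual passage from Isle o' Dreams.
ocr = "the sea was calm arid the sky was clear"
fixed = "the sea was calm and the sky was clear"
print(word_differences(ocr, fixed))  # [('arid', 'and')]
```

Note that the comparison only works once a person has supplied the corrected line. "Arid" is a perfectly good English word, so a spell-checker alone would wave it through, which is why the human eye is still needed.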

So let their work not be thankless after all, and let us say thank you to everyone who spent their time taking care of our literary commons.



H/t @scott_bot

This article available online at:

http://www.theatlantic.com/technology/archive/2013/04/massive-volunteer-collective-proofreads-25-000-public-domain-books/274876/