Byliner Sure Is Slick, But Is It Also Stealing?
The age-old question of fair use gets hauled out thanks to site scraping services
Everybody loves the sleek, well-designed, good-for-journalism startup called Byliner. The newly launched content site scrapes the Internet for links to long reads, and a team of informed editors flags the best ones to be featured on Byliner. The site tracks what you read and recommends items you might like. There's even a publishing platform that offers up original content from famous authors for a small fee. From a reader's perspective, the site is a breeze--fast, personalized, social, a Pandora for narrative nonfiction. From a publisher's perspective, the idea is intriguing--in the best of scenarios a source of long-tail traffic that will steer new readers to good content.
But blogger and startup man about town Matt Langer thinks it's a trick. Recalling Google's recent war on content farms and site scrapers, Langer reminds us that, despite all of the positive chatter keeping Byliner in the news this week, we normally hate sites like this. That is, the ones that pull content from other sites, republish it in a different form and then make money from selling ads against it. Langer writes:
I’m having a really hard time wrapping my head around how its ideologically consistent to villainize copycat sites while at the same time celebrating the launch of services that facilitate both the removal of content from an original source as well as the repackaging of that content in another form.
What’s the difference? Is it because the Byliner is well designed? Or because scraper sites are run by machines in foreign countries and the content on Byliner is being hand-picked by Real People Who Care About Reading? Or something else entirely? Or am I just incredibly daft and confusing two completely different things here?
Another feature of Byliner is a Read It Later button that repackages the content again and saves it for future, ad-free enjoyment on a mobile phone or tablet. There are a number of products like this: Instapaper, Readability, Flipboard. Some of them offer publishers a cut of the price users pay for the app or a cut of the advertising revenue, if the service supports ads. When Apple launched an updated version of Safari with a Reader feature that works a lot like these other products, Choire Sicha at The Awl launched into his own tirade about the proliferation of content-repurposing products that have been sucking up shares of publishers' traffic and revenue since the dawn of RSS. Though some services allow publishers to opt out of being scraped, that doesn't entirely neutralize the issue:
If you complained every single time someone decided that an RSS feed was somehow an invitation to republish in full on the web or elsewhere, you'd never have the time to actually help create the material that these people then republish with their own fees and ad. But being put in this situation is particularly cruel, it feels like, to sites that are attempting to prioritize paying writers--something that is a struggle! (And a struggle that will be won, but only after a long time coming.)
The difference between pulling an entire article and offering a limited preview is pretty huge, though. Billed as a "discovery engine," Byliner tackles a different problem than the apps that dump feeds into a clean list that avoids ads at all costs. It highlights content that may otherwise have been forgotten. A large amount of the content featured on Byliner so far is from the pre-Internet days and is likely pretty difficult to discover on its own. In the very worst-case scenario, a reader would check out the preview on Byliner and not click through--and you wouldn't think that would be a trend among the type of users Byliner is targeting. In the best-case scenario, readers discover great content, long lost in the Google grid, that pulls them onto the original content site, where, ideally, they click around and become new customers.
So when Langer asks what the difference is between Byliner and other scraper sites, that's easy: Byliner only takes an excerpt. The U.S. Copyright Office says whether or not the borrowing of content qualifies as fair use depends in part on "the amount and substantiality of the portion used in relation to the copyrighted work as a whole." Byliner pulls about 300 words out of articles that are typically over 2,000 words total. This seems okay, doesn't it?
But is it a slippery slope? Maybe a little. At the end of the day, though, Byliner could get more people to read, which is indisputably and inevitably good for the people who publish content meant to be read.
CORRECTION (2:22 6/23/11) - An earlier version of this post insinuated that The Atavist re-packaged others' content. In fact, The Atavist only publishes original content that they solicit from writers.