Just 2 percent of the British Library's massive archive of print newspapers has been digitized.
That's going to change.
The institution is completing a seven-year effort to upgrade its news archives, a $55 million (£33 million) project that's aimed at expanding the library's definition of "news." In a blog post about the effort, curator Luke McKernan said that "news" can mean "anything of relevance to a particular community at a particular point in time."
Most people by now will acknowledge that news is recorded in newspapers and on Facebook and on Twitter and on blogs, etc., etc., but McKernan told me he's also thinking about "diaries, oral history, recordings, maps, posters, letters," and so on. McKernan wants to establish links between different kinds of resources, a strategy that's becoming increasingly important as institutions like libraries rethink how their resources will fit into a larger network of interconnected data and information online.
"A lot of the thinking is about metadata and data models," he said in an email. "I’m particularly interested in demonstrating how TV and radio news can be made word-searchable alongside traditional text sources, so I’ve been looking at speech-to-text solutions, subtitle capture and entity extraction."
In the next three years, the library plans to have digitized 30 million newspaper pages, 100,000 television news recordings, 100,000 radio broadcasts, and 1.5 million web pages. Today the collection houses some 60 million newspapers dating back to the 1600s, 40,000 television and radio broadcasts, and 25,000 newsy websites. "News does not exist, and probably never has existed, through one medium," McKernan wrote. "It is we, the readers, who construct the news by selecting from the variety of forms on offer."
One note that may disappoint: For readers not in the U.K., access isn't free. Much of the library's digitized content is unavailable unless you're physically in the British Library's London Reading Rooms or willing to pay. But a $16.75-per-month subscription will grant access to the library's rapidly growing collection of digitized newspapers.