Making Haystacks, Finding Needles

New programs let you easily categorize anything you come across on the Web or in your own files—and, more important, let you find it all again

Also see:

"Searches, Backups, Soul of a New Program"
James Fallows recommends some search engines that "cluster" or classify the pages they have found. Also, his tech-literature picks for this month.

It’s time for another look at one of the software world’s longest-running and potentially most important sagas: the attempt to create effective data-management programs.

Computer users are always coming across little bits of information—from e-mails, from conversations, and from blogs, RSS feeds, and other Web sites. Often we’d like to do something with this information, rather than letting it whiz by into oblivion. As the systems that present information have evolved, so have the means for storing and keeping track of data we might like to use later on. Starting twenty-five years ago, when almost the only data that flowed into a computer came via early e-mail or bulletin-board systems, I’ve tried about 200 different data-management programs, including two I mentioned several months ago in this space. These are Microsoft’s OneNote, which looks sleeker and more Mac-like than other Microsoft Office applications and, while predictably well integrated with Microsoft’s Outlook and Word, also works very well with non-Microsoft products like the Firefox Web browser; and Chandler, from Mitch Kapor’s Open Source Applications Foundation, a still-emerging effort to create a free, all-platform information manager.

Good data managers do three things. They let you bring information into the program easily, from textual sources like e-mail or from Web sites. They let you classify, or “tag,” incoming information, if you already know what you’d like to do with it: for example, you might want to save a certain Web clip for information about next summer’s vacation or for a work project you have under way. Other information you might want to dump into a general storage bin without a specific purpose but on the chance you might want to look at it again. And the programs let you later retrieve the information you’ve stored, whether you’ve classified it or not, by means ranging from simple keyword searches to elaborate ways of detecting relationships among data.

All the programs I’ll mention meet three crucial tests. They allow effortless data collection, tagged and untagged storage, and flexible retrieval options. They differ in where they’ve placed their emphasis, and in whether they’re aimed at users who like structure and outlines or those who prefer looser organizational schemes. Compared with OneNote and even Chandler, they are generally less ambitious in their aims and come from smaller (and sometimes shakier) enterprises. In wine terms, these are mainly garagiste offerings rather than products of the main châteaus. But each excels in a certain way, and all are worth at least experimenting with, since they’re available for free trial periods. Unless otherwise noted, each program can be found at a Web site of the same name or via a simple Web search. With a few exceptions, these programs are for PCs only. The two programs for Macs, though, are particularly elegant.

Although not necessarily the most powerful of these programs, EverNote will probably feel the most natural to most users. It’s the one I recommend as a starting point for people first trying this kind of program. EverNote comes from a company in California, most of whose founders were computer experts in the old Soviet Union.

After you install EverNote, it places a stylized “E” logo on your browser’s tool bar. If you click on the E while visiting a Web page, the whole contents of the page go into EverNote’s storage, along with URL information so you can visit the page again. You can also select text or images from the Web page and store those selections with a click of the E—or drag material to the E from Word or most other programs. The basic version, which can do all this, is free. An advanced version, for $34.95, can import handwritten notes from tablet computers and, in many cases, convert them to searchable text.

The program’s most distinctive feature is its “Time Band,” which runs along the right edge of the screen. Each note is assigned a place on the band, based on the instant when it was created. The band serves as an endless reel on which all the notes are stored, from oldest to newest. The concept sounds trivial, but it is surprisingly interesting to use—it reminds you of the huge roll of paper on which Jack Kerouac supposedly wrote On the Road, so he wouldn’t be distracted when coming to the end of a page.
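For readers who think in code, the idea of a band ordered by creation time can be sketched in a few lines of Python. This is only an illustration under my own assumptions, not EverNote's actual design: the TimeBand class and its add and jump_to methods are hypothetical names, and the sample notes are made up.

```python
import bisect
from datetime import datetime


class TimeBand:
    """Hypothetical model: notes kept in creation order, oldest to newest."""

    def __init__(self):
        self._times = []  # creation timestamps, always kept sorted
        self._notes = []  # note texts, parallel to _times

    def add(self, text, created):
        # Insert at the position that keeps the band in chronological order.
        i = bisect.bisect_right(self._times, created)
        self._times.insert(i, created)
        self._notes.insert(i, text)

    def jump_to(self, when):
        """Return the first note created at or after `when`, or None."""
        i = bisect.bisect_left(self._times, when)
        return self._notes[i] if i < len(self._notes) else None


band = TimeBand()
band.add("Tahiti hotel prices", datetime(2006, 5, 2, 9, 30))
band.add("Draft outline", datetime(2006, 3, 14, 16, 0))
band.add("RSS reader review", datetime(2006, 4, 20, 11, 15))

# Jump to the start of April and pick up the first note from that point on.
first_in_april = band.jump_to(datetime(2006, 4, 1))
```

Keeping the timestamps sorted means a jump to any moment is a binary search rather than a scan of every note, which is one plausible reason a time-ordered reel can stay quick as it grows.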

When looking for a note, you can jump to the time when it was created, or quickly scroll through the contents until you find a picture or layout that looks familiar; or, of course, you can use standard keyword-search tools. EverNote auto-categorizes notes according to types (Web clips, to-do items, pure text, pictures, and so on) and lets you manually assign categories of your own. You can also create “keyword categories,” so that every item with the word Tahiti will automatically go into the “Next Year’s Vacation” category, and do Boolean searches involving those categories. (For instance, all items that are assigned to the “Next Year’s Vacation” category but not assigned to the “Prohibitively Expensive” category.) Unlike OneNote, EverNote cannot store audio or video clips. It is also less convenient than OneNote as a way to type in new information (as opposed to clipping or capturing it). Still, many people will find that it does most of what they want.
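The keyword categories and Boolean searches described above can be sketched roughly as follows. This is a minimal Python illustration under my own assumptions, not a description of how EverNote works internally: KEYWORD_RULES, categorize, and search are hypothetical names, and the sample notes are invented.

```python
# Hypothetical keyword rules: a note containing the keyword joins the category.
KEYWORD_RULES = {"tahiti": "Next Year's Vacation"}


def categorize(text, manual=()):
    """Return a note's categories: manually assigned ones plus keyword matches."""
    cats = set(manual)
    lowered = text.lower()
    for keyword, category in KEYWORD_RULES.items():
        if keyword in lowered:
            cats.add(category)
    return cats


def search(notes, include, exclude):
    """Boolean search: notes in category `include` AND NOT in category `exclude`."""
    return [text for text, cats in notes if include in cats and exclude not in cats]


raw = [
    ("Overwater bungalow in Tahiti", ["Prohibitively Expensive"]),
    ("Cheap flights to Tahiti in May", []),
    ("Notes for the budget meeting", []),
]
notes = [(text, categorize(text, manual)) for text, manual in raw]

# Vacation items that have not been marked as too expensive.
affordable = search(notes, "Next Year's Vacation", "Prohibitively Expensive")
```

Treating each note's categories as a set keeps the "in this category but not that one" test to a pair of membership checks.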

Net Snippets, from a small company in Israel, is not as pretty as EverNote but is very quick and effective, especially for text-based research projects. After installation it displays a thumbnail-size box in one corner of the screen. You drag whatever you want to store over to that box, and the program saves it for later retrieval. The basic version is free and easy to learn. More-advanced versions cost up to $129.95 and include features for producing bibliographies or other academic reports based on the captured data.

A relatively new entry, Surfulater, created by a veteran developer in Australia, differs from most of the others in the elaborate ways it allows you to comment on, classify, and even edit the material you have collected. For instance, if you’ve copied and stored a blog entry or a passage from a Web site, you can enter notes of your own (“There he goes again!” “This detail is interesting”) right alongside the clip, and search for those comments later on. It also has a variety of special categorization tools. Surfulater costs $35; its creator, Neville Franks, chronicles the program’s ongoing evolution online.

Until this spring, Onfolio was a strong stand-alone competitor to the likes of EverNote and Net Snippets in allowing the simple capture and categorization of text and images. Its edge was the elegant way it handled RSS feeds, and its unique system for dealing with very long Web postings. Newspapers, magazines, and some other sites often break lengthy articles into a series of separate pages. When you get to the bottom of each page, you have to click “Next” to keep reading. On some sites you can get around this by clicking on a “Print-Friendly Version” button, which usually opens up a separate window with the article’s entire contents. Onfolio can spare you this process by itself amassing and storing material from a multipage site. Last March Microsoft acquired Onfolio and made the program part of its Windows Live Toolbar, which is free but runs only in Microsoft’s Internet Explorer (and Outlook), not in Firefox or any other browser.

Google has a free data manager called Google Notebook, which will run in most browsers. But it is bare-bones and far more limited than anything else mentioned here. Some of Google’s new utilities, like its Calendar and Spreadsheet programs, can do almost everything that a “real” desktop program like Outlook or Excel can. Google’s Notebook has not yet reached that level.

For more than twenty years, the AskSam company, of Perry, Florida, has offered very powerful “free-form database” programs. You enter e-mail, research notes, and other textual or numeric data into AskSam, and then you can retrieve it according to precise, structured queries. (“Show me all notes from June 2001 that belong to the following two categories and contain these three keywords.”) These searches can sometimes reveal relationships that would not otherwise be obvious. The company’s SurfSaver product is slightly more cumbersome than EverNote or Onfolio when it comes to storing information, but it is much more powerful in retrieval.
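A structured query of the kind quoted above (a month, a set of categories, a set of keywords) might look something like this in Python. It is a sketch under my own assumptions and has nothing to do with AskSam's real query language: the Note class, the query function, and the sample data are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Note:
    text: str
    created: date
    categories: set = field(default_factory=set)


def query(notes, year, month, categories, keywords):
    """Notes from the given month that carry every category and contain every keyword."""
    results = []
    for n in notes:
        in_month = n.created.year == year and n.created.month == month
        has_cats = set(categories) <= n.categories  # subset test: all required categories present
        text = n.text.lower()
        has_words = all(k.lower() in text for k in keywords)
        if in_month and has_cats and has_words:
            results.append(n)
    return results


notes = [
    Note("Budget memo on travel and hotel costs", date(2001, 6, 12), {"work", "finance"}),
    Note("Budget memo on travel", date(2001, 7, 3), {"work", "finance"}),
    Note("Hotel and travel budget ideas", date(2001, 6, 20), {"personal"}),
]

# "All notes from June 2001 in these two categories containing these three keywords."
hits = query(notes, 2001, 6, ["work", "finance"], ["budget", "travel", "hotel"])
```

Each condition narrows the result independently, so a note must pass the date test, the category test, and the keyword test to be returned.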

Every month or two since 2001, a small, multinational team of designers based in Beijing has issued successive releases of its Advanced Data Management program, or ADM. (I visited the team in Beijing recently and will have more to say about its work another time.) Like other programs mentioned here, ADM allows users to collect a variety of data in one place. It will be more attractive to those who (like me) naturally think, organize, and even create in the outline form familiar from term-paper days.

Now, the promised Mac programs. One is DevonThink, created in Germany, which was mentioned previously in these pages but can’t be mentioned often enough. It is only so-so in collecting data, but it is superb at organizing and searching what you have amassed. It has an exceptional “semantic search” utility, which, like a good Web search engine, can often find what you’re looking for even if that is not exactly what you typed in. (The only comparable PC program I’m aware of is dtSearch.) The other is Tinderbox, from Eastgate Systems, of Watertown, Massachusetts. Its Web-clipping utilities are primitive at best, but it is a wonderful tool for arranging ideas, seeing and changing the relationships among them, and generally doing creative work.

Each user’s taste will vary. Some PC users will want to move past EverNote, and some Mac users will find Tinderbox too tricky to be worthwhile. But those two programs are good places to start.