It used to be simpler. You woke up, and there was one newspaper you could read. It was printed on pulp and delivered to your driveway. You got in your car, and there was a Sears strategically located on the highway within reasonable distance of your home. It had just about anything you needed, from baby clothes to car insurance, if you knew which part of the store to find it in. The life of an American consumer for so much of the 20th century was defined by this comforting narrowness of choice, and the city paper and the area department store were two hallmarks of this localized scarcity.
But one of the byproducts of the Internet has been the shift from scarcity to abundance for the consumer. Google News, Twitter, and Facebook aren't local newspapers: They're global portals to the local newspapers of every city in the world. Amazon, the everything store, is so vast that it makes mid-20th-century Sears look like a late-19th-century corner grocery. This revolution introduces a new challenge for both people and the companies serving them: What do you offer the customer who has access to everything?
Two of the major consumer portals for news (Facebook) and stuff (Amazon) responded to the problem of abundance with algorithms.
An algorithm is just a piece of code that solves a problem. Facebook's problem, with the News Feed, is that each day there are roughly 1,500 pieces of content that could appear in a given user's feed (news articles, baby photos, engagement updates), and much of it is boring, dumb, or both. Amazon's problem is that it wants you to keep shopping after you buy what you came for, even though you don't need the vast majority of what Amazon's got to sell.
Both organizations narrow the aperture of discovery by using their best, fastest, most scalable formulas to bring to the fore the few things they think you'll want, all with the understanding that, online, you are always half a second away from closing the tab.
Take the News Feed, perhaps the most famous and sophisticated media algorithm ever built. The full recipe of the News Feed is ultimately mysterious, but we have a sense of some of the portions. The most important ingredient is you. When you like something, hide something, click on something, or do nothing, Facebook's machine-learning algorithms consider your activity and bake it into your next News Feed so that you see more of the stuff you've indicated you like. At the same time, Facebook also allows companies and individuals to pay for promotions to appear toward the top of the feed. Finally, the company routinely adjusts its dials, for example to show more news stories from respectable organizations with large digital followings. The News Feed is a little bit of behavioral psychology, a little bit of capitalism, and a little bit of secret sauce.
Amazon's storefront also changes radically for each consumer, showing different pages to a gadget nerd, a romance novel reader, or a new parent. Building a recommendation engine not from 1,500 new stories but from millions of products means processing even more data within the half a second of a page load. This leaves little time to draw a personalized data map for each customer.
No algorithm existed to do what Amazon needed to do at the scale Amazon needed to do it. So the company built a unique patented recommendation formula for itself in the late 1990s, as its chief engineer explained. Rather than match customers to similar customers, Amazon built an index of items that customers tend to purchase together. When you check out a page or make a purchase, the site shows you products with high ratings and similar qualities based on that index. Here, scraped from the patent application, is a diagram of how this system works, at the simplest level.
The key to this formula, which goes by the term "item-to-item collaborative filtering," is that it's fast, it's scalable, and it doesn't need to know much about you. This is a recommendation engine based on products rather than people. At its simplest, that means suggesting a football book to somebody who buys a football video game.
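For the curious, the idea can be sketched in a few lines of Python. This is a toy illustration, not Amazon's patented formula: the customers, the products, and the choice of cosine similarity as the "bought together" score are all assumptions made for the example, and the sketch ignores ratings entirely. The index of item pairs is built once, ahead of time, so a recommendation at page load is just a lookup.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt

# Hypothetical purchase histories (invented for illustration):
# customer -> set of items that customer bought.
purchases = {
    "ann": {"football video game", "football book", "game console"},
    "bob": {"football video game", "football book"},
    "cat": {"football video game", "football book"},
    "dan": {"romance novel", "scented candle"},
    "eve": {"romance novel", "football book"},
}


def build_item_index(purchases):
    """Precompute how strongly each pair of items is bought together."""
    buyers = defaultdict(int)    # item -> number of customers who bought it
    together = defaultdict(int)  # (item_a, item_b) -> customers who bought both
    for basket in purchases.values():
        for item in basket:
            buyers[item] += 1
        for a, b in combinations(sorted(basket), 2):
            together[(a, b)] += 1

    index = defaultdict(dict)
    for (a, b), n in together.items():
        # Assumed scoring rule: cosine similarity between the two items'
        # yes/no purchase vectors.
        score = n / sqrt(buyers[a] * buyers[b])
        index[a][b] = score
        index[b][a] = score
    return index


def recommend(index, item, k=2):
    """Return up to k items most often bought alongside `item`."""
    neighbors = index.get(item, {})
    # Highest score first; ties broken alphabetically for stable output.
    ranked = sorted(neighbors.items(), key=lambda pair: (-pair[1], pair[0]))
    return [name for name, _ in ranked[:k]]


index = build_item_index(purchases)
print(recommend(index, "football video game", k=1))  # ['football book']
```

Notice that recommend never asks who is shopping. It only needs the item on the page, which is why this approach scales, and also why it can feel impersonal.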
The strengths and weaknesses of each algorithm are clear. Facebook knows more about its users; Amazon knows more about its inventory. Each could stand to learn a bit from the other. Facebook is desperately trying to better identify its higher-quality inventory, while it's often obvious that Amazon doesn't know its users. Amazon knows what's good, because it knows (a) what's been bought and (b) what's been highly rated. Facebook has likes, which are similar to ratings, but people might not be reading most of the content that they like, as Chartbeat CEO Tony Haile suggested in Time. In short, Amazon and Facebook are solving the problem of abundance with similar, but conceptually opposite, formulas.
The knock on Amazon I've heard from both friends and software developers is that its recommendation bots reveal themselves to be awfully, well, robotic. This Reddit comment is representative of the feeling:
Why is Amazon's recommendation engine still so ABSURDLY bad?
I've bought hundreds of books from Amazon. My last purchase was a John Sanford novel. It was okay, but I didn't feel strongly enough to write a review or even rate it.
Yet when I was looking for a new book today, and clicked on "Recommended For You," it quickly gave me 20 recommendations. Nineteen of which were Sanford novels.
That's not the recommendation of a friend who knows you. It's the recommendation of a software index that has no idea who you are, unless you've already bought those 19 books from Amazon.
Maybe we like it that way. The equivalent knock on Facebook has often been that it knows us too personally and that its insinuation into our lives is creepy. But that's just the thing. For the age of algorithms to succeed on its own terms, we have to embrace a new version of the intimacy that once felt natural with the local newspaper and the corner-shop clerk who knew our names. The machines have to know us.