The Future is Discovery, not Just Search

by Jesse Farmer on Friday, April 25, 2008

Let's start with a picture from Radar Networks' CEO Nova Spivack: $keyword-search-slide$

Erick Schonfeld, asking "Is Keyword Search About to Hit its Breaking Point?," talks about Spivack's view of the future of the web. According to him, it lies in ever-more-refined search technologies such as semantic search, natural language search, and artificial intelligence. A quote:

Keyword search engines return haystacks, but what we really are looking for are the needles. The problem with keyword search such as Google’s approach is that only highly cited pages make it into the top results. You get a huge pile of results, but the page you want—the “needle” you are looking for—may not be highly cited by other pages and so it does not appear on the first page. This is because keyword search engines don’t understand your question, they just find pages that match the words in your question.

Spivack wants to "do for data what the Web did for documents" and develop a standard, uniform system for semantic metadata. It's the classic "dumb software, smart data" idea. Tagging works to a degree, but it's neither uniform nor standard — the same tag can mean two different things for two different people, and two different tags can mean the same thing.

That said, the premise underpinning Spivack's whole argument is that search is the correct interface when faced with a world of exponentially-increasing information. His version of the future says, "Keyword search will become increasingly inefficient and the solution is to develop semantically-aware systems that search based on meaning, rather than content."

Search and Discovery

Let's take a step back and think of other situations where we are faced with more information than we can handle at once, for example, music. How do you get new music? If you want some new hip hop do you search for it?

In truth, nobody I know searches for new music. How can you search for something you don't know, anyhow? Search doesn't just profit off intent, it requires it. To find new hip hop I'd ask a friend who is into that scene and get his opinion, or browse through the new releases at my local record store or iTunes.

The same pattern exists on TV. People don't search for new shows, they discover them either through friends and advertisements, or by channel surfing.

A Bi-Modal Future

The future of information on the Web does not rest in super-advanced search, but in both search and discovery. This bi-modal existence makes sense because people behave in two ways depending on whether they have intent or not.

If someone knows what they want, say, the average RBI among hitters in the American League, then search is perfect. If, however, you're in a channel surfing mood, then search is worthless because you don't know what you want — but you will when you see it.

Lots of sites straddle this divide. Yelp, for example, helps in discovery by giving you sensible metadata in the form of ratings. This fits into Spivack's hypothesis. I have some level of intent (e.g., "I want Thai food in San Francisco"), but not much.

But sites like YouTube fall clearly on the discovery side of the divide. Who searches YouTube unless they're trying to find a video they've already seen and want to show a friend? Furthermore, who uses the metadata on the site (besides, perhaps, related videos) to find new content? Speaking for myself and my friends, most of the highly-rated and highly-viewed stuff is not what we watch regularly.

Instead I discover videos on YouTube through my social network or by serendipitously finding a great video embedded in a website I happen to be reading. Indeed, there are whole sites, like StumbleUpon, whose main mechanic is serendipity.

I'm still uncovering new information, but I'm sure as heck not searching for it in the search-engine sense of the word.

Summary

In short, search is what we do when we have an idea of what we want and discovery is what we do the rest of the time. When looking for something to watch on TV people don't search, they channel surf. And when people want to find facts people search, they don't stumble around aimlessly.

As information density increases and more pieces of media, information, knowledge, and, in general, data become available online, both mechanics, search and discovery, will have to be developed to accommodate the volume. Why?

In a world with more and more data the percentage of data that we are actively able to query becomes smaller and smaller. That is, if there is more data not only do we know less as a percentage of all the information out there, but we have less knowledge of what we do and do not know.

This is where discovery fits, and it's a mistake to think the only solution is a single, ultra-intelligent search agent, or a single, unifying data structure for the Web.

Human behavior tells us otherwise.