Let's start with a picture from Radar Networks' CEO Nova Spivack:

Erick Schonfeld, asking "Is Keyword Search About to Hit its Breaking Point?," talks about Spivack's view of the future of the web. According to him, it lies in ever-more-refined search technologies such as semantic search, natural language search, and artificial intelligence. A quote:

Keyword search engines return haystacks, but what we really are looking for are the needles . The problem with keyword search such as Google’s approach is that only highly cited pages make it into the top results. You get a huge pile of results, but the page you want—the “needle” you are looking for—may not be highly cited by other pages and so it does not appear on the first page. This is because keyword search engines don’t understand your question, they just find pages that match the words in your question.

Spivack wants to "do for data what the Web did for documents" and develop a standard, uniform system for semantic metadata. It's the classic "dumb software, smart data" idea. Tagging works to a degree, but it's neither uniform nor standard — the same tag can mean two different things for two different people, and two different tags can mean the same thing.

That said, the premise underpinning Spivack's whole argument is that search is the correct interface when faced with a world of exponentially increasing information. His version of the future says, "Keyword search will become increasingly inefficient and the solution is to develop semantically-aware systems that search based on meaning, rather than content."

Search and Discovery

Let's take a step back and think of other situations where we are faced with more information than we can handle at once, for example, music. How do you get new music? If you want some new hip hop do you search for it?

In truth, nobody I know searches for new music. How can you search for something you don't know, anyhow? Search doesn't just profit off intent, it requires it. To find new hip hop I'd ask a friend who is into that scene and get his opinion, or browse through the new releases at my local record store or iTunes.

The same pattern exists on TV. People don't search for new shows, they discover them either through friends and advertisements, or by channel surfing.

A Bi-Modal Future

The future of information on the Web does not rest in super-advanced search, but in both search and discovery. This bi-modal existence makes sense because people behave in two ways depending on whether they have intent or not.

If someone knows what they want, say, the average RBI among hitters in the American League, then search is perfect. If, however, you're in a channel surfing mood, then search is worthless because you don't know what you want — but you will when you see it.

Lots of sites straddle this divide. Yelp, for example, helps in discovery by giving you sensible metadata in the form of ratings. This fits into Spivack's hypothesis. I have some level of intent (e.g., "I want Thai food in San Francisco"), but not much.

But sites like YouTube fall clearly on the discovery side of the gap. Who searches YouTube unless they're trying to find a video they've already seen and want to show a friend? Furthermore, who uses the metadata on the site (besides, perhaps, related video) to find new content? Most of the highly-rated and highly-viewed stuff, speaking for myself and my friends, is not what we watch regularly.

Instead I discover videos on YouTube through my social network or by serendipitously finding a great video embedded in a website I happen to be reading. Indeed, there are whole sites, like StumbleUpon, whose main mechanic is serendipity.

I'm still uncovering new information, but I'm sure as heck not searching for it in the search engine sense of the word.

Summary

In short, search is what we do when we have an idea of what we want and discovery is what we do the rest of the time. When looking for something to watch on TV people don't search, they channel surf. And when people want to find facts people search, they don't stumble around aimlessly.

As information density increases and more media, knowledge, and data in general become available online, both mechanics, search and discovery, will have to be developed to accommodate the volume. Why?

In a world with more and more data the percentage of data that we are actively able to query becomes smaller and smaller. That is, if there is more data not only do we know less as a percentage of all the information out there, but we have less knowledge of what we do and do not know.

This is where discovery fits and it's a mistake to think the only solution is a single, ultra-intelligent search agent, or a single, unifying data structure for the Web.

Human behavior tells us otherwise.

4 Comments

  1. matt mcknight April 25th, 2008 / 1:14 pm

    Interesting thoughts, I have a lot of comments on your post, which was far better than the one on TechCrunch. I definitely agree with you about discovery being important. I think adding things like Guided Navigation really can help with that.

    “Spivack wants to “do for data what the Web did for documents” and develop a standard, uniform system for semantic metadata. It’s the classic “dumb software, smart data” idea.”
    We tried it before. They were called meta tags. A huge spam nightmare. External tagging (such as del.icio.us) is much more valuable if the size of the tagging community outweighs the spam.

    “Furthermore, who uses the metadata on the site (besides, perhaps, related video) to find new content?” I search on YouTube a lot. I look for general categories of stuff (japanese tv show) as well as searching for an artist name to see if anything new comes up. I definitely click on the related links quite frequently. I love to click on things like Google Tech Talks and just browse through on the site. I find your argument strange, because I feel that I get more discovery out of YouTube than most other websites.

    One interesting site you should look at is EveryZing. It indexes a lot of video and audio content by doing speech to text. The text is not good enough to read on its own, but it’s good enough to search.

    Maybe I am weird though, because I never channel surf. I Tivo and use search- albeit, not keyword search, but browsing by category. Guided navigation…that’s what’s next (it’s actually mostly already here). http://www.flickr.com/photos/morville/collections/72157603789246885/

  2. Jesse April 25th, 2008 / 1:40 pm

    Matt,

    Hmm, interesting. Is it just the case that search and discovery are distinct now but will meld over time?

    Keyword search is unnatural, IMO, but natural language search isn’t the solution.

    I guess the root is “I want to find something.” You have varying ideas of what you want to find. The less you know the more likely discovery will proffer a solution, and the more you know the more likely search will.

    As both technologies mature the transition between the two becomes less clear. Is that how it is going?

    The prototypical examples of search and discovery are Google and StumbleUpon. Serendipity is part of discovery: learning about a new band for the first time, finding a movie I can watch ten times in a row, etc. It’s not part of search, where I know by-and-large what I want.

  3. Brandon Wirtz April 25th, 2008 / 4:19 pm

    I don’t entirely agree with the stumble upon reference. Stumble doesn’t really show you “more things like this” it shows you more things in this category.

    This is like saying You like Mythbusters so you like all things on Discovery Channel. When in truth I like all things DIY Science.

    I find new things often while looking for old things. Commercials during my favorite shows often introduce me to new things I would like. Yes they only do so with about 20% accuracy but this is because I could like Mythbusters for a lot of different reasons. I could like all things with a Cute Red Head (Kari Byron), or I could like all things with modifying Junk Cars, or I could like all things related to urban mythology. Each of these would create different cross discovery scenarios, and with only a single point of reference there would be no way to know which was the reason I liked the show.

    If however I had a list of all the shows you liked and a list of all the reasons you could like shows I could create a web of why you might like shows and look for intersections.

    If you like Sex and the City, Myth Busters, Blood Rayne, Buffy the Vampire Slayer, and I know who killed me the movie…. We can determine you have a thing for red heads.

    If you like Mad Max Beyond Thunderdome, Myth Busters, Junk Yard Wars, and Monster Garage we can tell you like modifying cars.

    If you like New Yankee Workshop, Myth Busters, Tooltime, and This Old House, we can tell you like shows with power tools.

    The problem with search is that we typically only have your current search, and we don’t index sites by all of the micro categories, that you would need to make intelligent decisions about the best site for a sliver of a category. Though that is what I’m building….

  4. Jesse April 25th, 2008 / 4:29 pm

    Brandon,

    Discovery isn’t just about “more things like this,” and neither is StumbleUpon. Rather, I go to StumbleUpon with no agenda other than “I want to find something neat.” At least that’s why I use it when I do.

    I don’t know what I want beforehand, though, because if I did I could just go to Google and find it. That’s why I call serendipity a “mechanic” of StumbleUpon.

    What you’re arguing for is a refined type of discovery, not something that isn’t discovery. Just as search was improved from the directory-and-keyword-based search engines of yore (e.g., AltaVista), discovery will be improved from its current model.

    NetFlix provides a good example of this kind of “improved discovery.” There are tons of movies I want to see, but I’m really bad at noting them. By the time they’re out on DVD I’ve often forgotten that I wanted to watch them in the first place. NetFlix knows my preferences well enough that it often recommends them to me. Serendipity!

    It even recommends movies I never heard of but wound up really liking. Just the other day I watched If…., which I doubt I ever would have seen without discovering it through NetFlix. I certainly couldn’t have searched for it and found it.
