Future of Search Archives - Page 6 of 8 - John Battelle's Search Blog

The Anatomy of a Large-Scale Social Search Engine

By - February 02, 2010

The folks at Aardvark have posted an ambitious paper over on the ‘vark blog. Titled after Brin and Page’s original “Anatomy of a Large-Scale Hypertextual Web Search Engine”, the paper presents the Aardvark engine and, in its authors’ words: “describes the fundamental differences between the traditional “Library” paradigm of web search — in which answers are found in existing online content — and the new “Village” paradigm of social search — in which answers arise in conversation with the people in your network.”

I have read most of the paper, which has been accepted at WWW 2010 (it reminded me of all the search papers I read in preparation for writing The Search), and found a lot worthy of interest.

First, the paper’s authors, both of whom have worked at Google, clearly have a sense of potential history here, in that they not only crib Google’s original paper’s title, they also mirror the first line (substituting “Aardvark” for “Google”, of course). Now that’s some b*lls. Of course, when Larry and Sergey first presented Google, they couldn’t even get their paper accepted (it took three tries, if I recall correctly. Someone should write a book about that…).

Second, it’s unusual for a Valley startup to lay out its architecture and technological specs as willingly as Aardvark has. There’s a lot of math in here that I couldn’t parse even if I had the will to try.

Third, we learn some cool things about how Aardvark works. Check this quote out: “…unlike quality scores like PageRank [13], Aardvark’s quality score aims to measure intimacy rather than authority. And unlike the relevance scores in corpus-based search engines, Aardvark’s relevance score aims to measure a user’s potential to answer a query, rather than a document’s existing capability to answer a query.”

Also interesting: “…this involves modeling a user as a content-generator, with probabilities indicating the likelihood she will likely respond to questions about given topics. Each topic in a user profile has an associated score, depending upon the confidence appropriate to the source of the topic. In addition, Aardvark learns over time which topics not to send a user questions about…”
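To make those two quotes a bit more concrete, here is a toy sketch in Python of how such a ranking might hang together. To be clear, this is not Aardvark's code. The `Candidate` class, its field names, the made-up numbers, and the choice to simply multiply a connection score by a topic score are all my own assumptions for illustration. The only ideas taken from the quotes are that each user carries per-topic scores, that some topics get switched off for a user over time, and that "quality" is about closeness to the asker rather than authority.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Candidate:
    """A potential answerer. Hypothetical structure, not Aardvark's schema."""
    name: str
    # Assumed per-topic scores: how likely this user is to answer on a topic.
    topic_scores: dict[str, float] = field(default_factory=dict)
    # Topics this user should no longer be sent questions about.
    muted_topics: set[str] = field(default_factory=set)
    # Assumed "intimacy" with the asker, in [0, 1], independent of the query.
    connection: float = 0.0


def relevance(candidate: Candidate, query_topics: dict[str, float]) -> float:
    """Weight the user's topic scores by how much the query is about each topic."""
    return sum(
        candidate.topic_scores.get(topic, 0.0) * weight
        for topic, weight in query_topics.items()
        if topic not in candidate.muted_topics
    )


def rank(candidates: list[Candidate], query_topics: dict[str, float]) -> list[str]:
    """Order candidates by connection * relevance, dropping anyone who scores zero."""
    scored = [(c.connection * relevance(c, query_topics), c.name) for c in candidates]
    return [name for score, name in sorted(scored, reverse=True) if score > 0]


# Illustrative data only: a question that is mostly about cars, a little about travel.
query = {"cars": 0.8, "travel": 0.2}
people = [
    Candidate("alice", {"cars": 0.9}, connection=0.2),
    Candidate("bob", {"cars": 0.5, "travel": 0.5}, connection=0.8),
    Candidate("carol", {"cars": 1.0}, muted_topics={"cars"}, connection=1.0),
]
ranking = rank(people, query)
```

In this made-up example the close friend with middling car knowledge outranks the distant expert, and the person who has opted out of car questions is never asked at all. That is the "Village" idea in miniature: who you know counts alongside what they know.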

There’s a lot more like this in the paper, and it’s worth reading. The authors even did a test of Aardvark results against Google, with the results being something of a push (see the last page for details). Not bad for an upstart service.

Lastly, we learn a lot about the service, thanks to a number of charts, including something about Aardvark’s growth, which I had not really anticipated. It’s up and to the right, as you can see from the chart.

Google Rolling Out Social Search: But Does It Leverage Facebook?

By - January 27, 2010

Forget the iPad, today Google is taking another step toward its stated goal of “making search more social.” There’s a lot of goodness in here, in terms of features and approach, but it’s just silly to pretend you can do any of this without directly addressing the 400 million-person elephant in the room called Facebook. Put simply: I can’t figure out if this new service uses my Facebook social graph. And to my mind, that’s a problem.

From the blog post announcing the public beta of social search (first announced at Web 2 late last year):

We think there’s tremendous potential for social information to improve search, and we’re just beginning to scratch the surface. We’re leaving a “beta” label on social results because we know there’s a lot more we can do. If you want to get the most out of Social Search right away, get started by creating a Google profile, where you can add links to your other public online social services.

Indeed – a lot more, like make it really easy to use your Facebook social graph, the way tons of other sites and apps do. Why not just use Facebook Connect? Hang on a tick, the video giving us an overview of the service says once you create that Google Profile, you can add connections via Blogger, Twitter, and “any other online networks you might be a part of” (45 seconds in). Might that include Facebook?

OK dear readers, I’m going to do it. I’m gonna make a Google Profile, just to find out…. Well, I’m still a bit perplexed. You can add any URL as a “Link” in your profile, so I added my Facebook pages. However, once I got through the initial form (which was not simple – I had to fill out all the info I already did with Facebook and LinkedIn, and my own name is not available as a profile URL, not /johnbattelle, not jbattelle. Darn! I picked /johnlinwoodbattelle, so now you all know my middle name…) Er, anyway, there *was* a prompt to “Share It On Facebook” after all that…

Aha! Maybe this will get my Facebook social graph goodness into Google Social Search?

Not that I could tell. Just a simple “share on Facebook” implementation, declaring my profile to my FB pals. But no deep integration. As far as I can tell, my Facebook social graph will not inform my social searchin’ on Google. As I understand it from reading previous coverage of the product, Google social search *will* leverage FriendFeed, recently purchased by Facebook. But as far as I can tell, it does not leverage Facebook proper.

And that, to my mind, is just silly. Silly in the main, because as a consumer, clear, direct, and transparent integration with Facebook would be a huge *win* for my understanding of Google’s social searching. Wouldn’t it? Or am I missing something? (Besides the competitive issues, of course…)

I’ve pinged Google and other sources to find out if I’m just deeply in the dark….

Update: Google has provided me an answer to my initial question:

“If someone links to his Facebook account from his Google profile, Social Search may surface that user’s public profile page. These are the same public profile pages already available on a search of Google.com and other search engines today. While we’re interested to continue expanding the comprehensiveness of Social Search, we do not currently use your Facebook connections as part of Google Social Search.”

What I’d like to know then is this: Why not?

The Evolving Search Interface: Mobile Drives Search As App

By - January 15, 2010

I’ve said before that search interfaces, stuck in the command line interface of DOS, will at some point evolve into applications on top of a commodity search index. I further opined that Bing, in particular Bing’s limited but compelling visual search, was just such an example: search as an interactive, rich application, as opposed to search as a list of results.  

The commodity of search results is critical, but as we shift our usage to the mobile web, the use case for a list of results weakens. Instead, as this Bizweek article points out, we’re using apps. On their face, these apps don’t seem like search at all. Except they are.

Take the popular iPhone app Exit Strategy, for example (at left). The app helps folks navigate the NY transit system. In essence, it consolidates a subset of search queries and answers them with a combination of domain-specific structured results and an elegant user interface. The structured dataset is the NY transit map and schedule, the UI is based on the iPhone’s unique ecosystem of interface. The result: No one with this app is Googling “best route Bronx Midtown“. Instead, there’s an app for that.

Google can’t help but see this as a threat. For nearly every structured set of results, there’ll be an app for that, if there isn’t already. To my mind, the question becomes one of using search to find the best apps. I wonder how Google is surfacing iPhone apps as answers to questions pertinent to destroying its own query volume? For it seems to me that a very good result for the query above, if done on Google over an iPhone, would be “Exit Strategy.”

Huh. Yet another reason to lean into Android, no doubt.

Search Getting Worse? What Did I Mean?!

By - January 06, 2010

(Excerpted from a longer post on BingTweets, part of a series I’ve been writing, underwritten by Bing).

In my predictions this week I seemed unusually glum about the state of search, writing: Traditional search results will deteriorate to the point that folks begin to question search’s validity as a service.

This statement did not go unnoticed by folks in the industry, and I received quite a few emails, Tweets, and comments asking what on earth I meant. Well, in the post I tried to explain:

This does not mean people will stop using search – habits do not die that quickly and search will continue to have significant utility. But we are in the midst of a significant transition in search – as I’ve recently written, we are asking far more complicated questions of search, ones that search is simply not set up to answer. This incongruence is not really fair to blame on search, but so it goes. Add to this the problem of an entire ecosystem set up to game AdWords, and the table is set.

Let me use this final BingTweets entry to expand on what I meant.

My statement about how we’re asking “far more complicated questions of search” is a riff on the writings I’ve done here on the BingTweets blog, specifically, my three part series on “Decisions Are Never Easy” (1, 2, 3). In short, I find that all of us are expecting search, a technology built to answer one-dimensional questions like “capital of Yemen”, to answer questions that have more than one semantic meaning (“Yemen al qaeda leadership diplomacy”). As a reader (and search entrepreneur) put it in an email to me: “When people move to complex queries (defined as two or more semantically disjunct terms), search breaks down. All it is really fit to do is deliver all the permutations. Imagine a 5-term query, all semantically disjunct. …. such as … “green tea, life quality, life expectancy, cancer, tumor”. Did you ever try and read 40,000 documents?”

Well no, none of us ever try to read all the documents search brings back – all the “permutations” that search faithfully (and rather unintelligently) renders to us. We all know by now that when we ask a complicated question of search, search will pretty much throw everything and the kitchen sink at us. And we don’t want all that information. We want our answer!

I have no doubt that such an answer is coming, but before it does, we have to go through a period of disappointment. ……. (continued)

Predictions 2010

By - January 03, 2010

Related:

2009 Predictions

2009 How I Did

2008 Predictions

2008 How I Did

2007 Predictions

2007 How I Did

2006 Predictions

2006 How I Did

2005 Predictions

2005 How I Did

2004 Predictions

2004 How I Did

A new decade. I like the sound of that. I’m a bit late on these, but for some reason these predictions refused to be rushed. I haven’t had the contemplative time I usually get over the holidays, and I need a fair amount of that before I can really get my head around attempting something as presumptive as forecasting a year.

So I’ll just start writing and see what comes.

While past predictions have focused on specific companies and industry segments (like Internet marketing), I think I’ll try to stay meta this time. Except for Google, of course, which is still the only company in the Internet economy that can be seen from space. For now. But we’ll get to that.

1. 2010 will mark the beginning of the end of US dominance of the web. I am not predicting the decline of the US Internet market, but rather its eclipse in size and overall influence by other centers of web economies. In essence, this is not an Internet prediction, but an economic one, as the web is simply a reflection of the world, and the world is clearly moving away from a US-dominated model.

2. Google will make a corporate decision to become seen as a software brand rather than as “just a search engine.” I see this as a massive cultural shift that will cause significant rifts inside the company, but I also see it as inevitable. Google, once the “pencil” of the Internet, has become a newer, more open version of Microsoft, and it has to admit as much both to itself and to its public, or it will start to lose credibility with all its constituents. While the company flirted with the title of “media company,” I think “software company” fits it better, and allows it to focus and to lean into its most significant projects, all of which are software-driven: Chrome OS, Android, Search, and Docs (Office/Cloud Apps).

This shift means Google will, by year’s end and with fits and starts, begin to minimize its efforts in media, including social media, seeking to embrace and partner rather than compete directly. This is a significant prediction, as Facebook is clearly Google’s most direct competitor in many areas, but Google will realize, if it has not already, that it cannot out-Facebook Facebook, but it sure can be a better software company.

3. 2010 will see a major privacy brouhaha, not unlike the AOL search debacle but around social and/or advertising related data. Despite the rise of personalized privacy dashboards for most major sites, there is still no industry standard for how marketing data is leveraged, and there is a brewing war for that data between marketers, their agencies, and third parties like ad networks and measurement companies. Add in a querulous legislative environment, and it’s hard to imagine there not being some kind of major flap in the coming year.

4. By year’s end the web will have seen a significant new development in user interface design, one that will have gained rapid adoption amongst many “tier one” sites, in particular those which cover the industry.

Despite nearly ten years of blogging, most publishing sites are still stuck in the mode of “post and push down,” which is, frankly, a terrible UI for anyone other than news hounds. Thanks to the three-headed force of social, gaming, and mobile, I think the PC web is due for a UI overhaul, and we’ll see new approaches to navigation and presentation evolve into a recognizable new standard.

5. Apple’s “iTablet” will disappoint. Sorry Apple fanboys, but the use case is missing, even if the thing is gorgeous and kicks ass for so many other reasons. Until the computing UI includes culturally integrated voice recognition and a new approach to browsing (see #4), the “iTablet” is just Newton 2.0. Of course, the Newton was just the iPhone, ten years early and without the phone bit….and the Mac was just Windows, ten years before Windows really took hold, and Next was just ….oh never mind.

6. 2010 will see the rise of an open gaming platform, much as 2009 was the year of an open phone platform (Android). Imagine what might happen when the hegemony of current game development is questioned – I want open development for Halo and Guitar Hero, damnit!

7. Traditional search results will deteriorate to the point that folks begin to question search’s validity as a service. This does not mean people will stop using search – habits do not die that quickly and search will continue to have significant utility. But we are in the midst of a significant transition in search – as I’ve recently written, we are asking far more complicated questions of search, ones that search is simply not set up to answer. This incongruence is not really fair to blame on search, but so it goes. Add to this the problem of an entire ecosystem set up to game AdWords, and the table is set. Google will take most of the brand blame, but also do the most to address the issue in 2010.

8. Bing will move to a strong but distant second in search, eclipsing Yahoo in share. Of course, with the Yahoo deal, it’s rather hard to understand search share, but I measure it by “where search queries originate.” This is a pretty bold prediction, given the nearly 7-point spread between Bing and Yahoo now, but I think Microsoft will pick up significant share using cash to buy distribution.

9. Internet advertising will see a sharp increase, and not just from increased search and social media platform (PPC/PPA) spending. Brands will spend a lot more online in 2010, and most predictive models are not accounting for this rise.

10. This is probably a layup, but one never knows (layups are sometimes the ones you miss): The tech/Internet industry will see a surge in quality IPOs. However, at least one, if not more, will be withdrawn as public scrutiny proves too costly and/or controversial. A corollary: There will also be a surge in M&A and “weak” IPO filings.

11. I’m out of my depth on this one, but it feels right so I’m going to go with it: We’ll see a major step forward in breaking the man/machine barrier. By this I mean the integration of technology and biology – yes, the same fantasy that fuels the blockbuster movies (Avatar, Matrix, Terminator). I’m not predicting a market product, but rather a paper or lab result that shows extraordinary promise.

12. I’ll figure out what I want to do with my book. SOGOTP, so to speak. Three years of predicting that I’ll start it is getting a bit old, eh? I feel good about branching back out into more contemplative fields, with FM in a strong position and our economy coming out from its defensive crouch.

As always, thanks for reading and responding. I look forward to 2010; it’d be hard to predict anything other than it’ll be a better year, overall, than 2009.

What's Up?

By - December 13, 2009

(This piece was written for the BingTweets blog and is part of an ongoing exploration of search underwritten by Microsoft. See my series on the interplay of search and decisions here, here, and here. I wrote the piece below before today’s web-wide conversation about content farms, but I think it’s related. We need new frameworks for search, and real time points us toward one potential path.)

———

The rise of real time search (just this past week, Google rolled Twitter, Facebook and Myspace data into its results) has everyone buzzing. Of course, BingTweets was the first real time mashup from a major player in search (and Microsoft has already announced its intentions to go further), but we’re just at the start of where real time search might go. What might things look like a few years from now?

In my last BingTweets post (Decisions Are Never Easy) I posited the idea of a real time service that connects us to each other based on expertise. So if I wanted to talk with someone who was an expert in buying classic cars, the service would find that expert and connect me to him or her.

I think real time search is a step toward building an ecosystem that makes such a service possible. But we have to get out of our current modes of understanding search interfaces to really grok how this might work. At present, we still see search as a modal dialog box, where we type in a request, then wait for an answer. As different search interfaces develop, new opportunities arise. We’ve seen a fair amount of innovation in search interfaces lately (here’s more on Pivot, for example), but real time data presents a significant challenge.

We can see the challenge in the companies most directly responsible for feeding data into the real time search index. Twitter recently changed its opening question from “What are you doing?” to “What’s happening?” That subtle shift invited a much more robust set of potential responses to be poured into the service (and subsequently parsed by search services). And Facebook just this week announced it will make all of its members’ status updates part of its universally public feed. Its question? “What’s on your mind?”

I recently heard from a reliable source inside Facebook that there are 40 times more status updates daily on Facebook’s network than on Twitter. That’s a lot of data to parse, whether you are a search service, or a consumer of that service’s product. What might it look like?

Well, start with the use case. Why might we want to query a real time search index? My first answer is simply this: To find out “what’s up.” Now, there are nearly endless refinements of that general concept: What’s up with the smoke I can see in the mountains behind my house? What do people who bought the Palm Pre recently think of their new phone? What bands are playing in Chicago this weekend that I might like? What’s up with Jahvid Best, will he play in Cal’s bowl game? All of these questions are variations on the theme of “What’s up?”

Given the right approach to interface, algorithms and filters, all of these queries can be answered by real time search.

(more at BingTweets….)

More on Facebook Public Data and Google Implications

By - December 11, 2009

You know, I just realized I suggested that Facebook do exactly what it’s doing. Read this post from back in June, deconstructing an article in Wired about the emerging Facebook v. Google battle. In it I say:

I think it’s a major strategic mistake to not offer [as much information on Facebook as possible] to Google (and anyone else that wants to crawl it.) In fact, I’d argue that the right thing to do is to make just about everything possible available to Google to crawl, then sit back and watch while Google struggles with whether or not to “organize it and make it universally available.” A regular damned if you do, damned if you don’t scenario, that….

The angle here is that Facebook, by making everything public, will force Google’s hand in search, and potentially dilute Google’s ability to compete in the social graph game (because Facebook will own the results on Google). If, on the other hand, Google decides to de-prioritize Facebook data in its results, Google’s brand will clearly be tarnished as favoring its own solutions (remember when Google announced its incorporation of Google accounts into search? Yep.)

Interesting. The chess is getting really interesting.

Help Grok Pivot, A Novel Approach to Search Interface

By - December 01, 2009

Microsoft has been kind enough to give me a limited number of invitations for readers of Searchblog to grok Pivot, which I wrote about here last week.  

In that post I promised to grok Pivot, then report back more here. Alas, Pivot is currently Windows only, and – alas – I am currently Mac only. I do have a couple of PCs in my house, but they are owned by my son and my wife, and it’s fair to say I’m not eager to use them for experimental installs. My son in particular will kill me if I touch his machine (though I’m pretty sure he’s going to download Pivot before I ever do).

Anyway, those of you with a PC and a desire to check out a new approach to search, you’re in luck.

Head to the Live Labs Pivot page, and when you hit the download button, enter this code during the install process:

A1C8 7318 57F3 E92C

But hurry. This code expires after a certain number of you use it…..Tell ‘em Searchblog sent ya, and please, let me know what you think. I wish I could play with it…

Update: There are some international use issues, from Gary’s email explaining it:

…we think your readers are encountering another issue which is summarized with a work-around at:

  http://www.getsatisfaction.com/live_labs_pivot/topics/no_setup_internet_connection

Basically, in order to release the Pivot as early as we did, we chose to defer fully internationalizing the code. As a result, Pivot will not cooperate with a system that is non-English and non-US. However, some of our users have reported that by changing the system defaults for language and location, they have been able to successfully install and use Pivot.

Web 2: Help Me Interview Qi Lu

By - October 02, 2009

In the personality-driven world that is our industry, Qi Lu stands out for his relative lack of public profile. Widely respected as a technological leader while heading up search at Yahoo, Qi burst onto the industry scene when he defected to Microsoft last year and took the role of President of the Online Services Division. In short, Qi is the man in charge of Microsoft’s online strategy.

Our interview later this month will mark Qi’s debut on the Web 2 stage. From all accounts, Qi is a very different character from his boss Steve Ballmer (who was a highlight of Web 2 two years ago). I’m looking forward to our interaction. Clearly we have a lot to discuss – the shifting sands of alliances (Facebook, Yahoo, Myspace, etc.), the rise (and fall?) of Bing, the Yahoo search deal, the future of MSN with regard to content, the role of ad exchanges and platforms (the aQuantive deal), and much more.

But I digress. What do *you* want to hear from Qi this year?

Others we’ll be interviewing (and I’ve asked for your help):

Carol Bartz

Evan Williams

Brian Roberts

Jeff Immelt

To come: Aneesh Chopra, Sheryl Sandberg, Jon Miller, Austan Goolsbee, Paul Otellini, Shantanu Narayen, Tim Armstrong, Tim Berners-Lee, and more. Again, an amazing lineup.

If you want to come, I can still get you a Searchblog discount (for about another week). Just ping me here.

Why Are Conversations (With the Right Person) So Much Better Than Search?

By - September 30, 2009

Thanks to the BingTweets program, I’ve been asked to opine on search and decision engines. I’m kind of proud of my third and final post, which riffs on the first two and goes a bit, well, meta. I’d love to know what you guys think of it. I’ll repost the first half here, and link back to the whole post on the original site that commissioned the work.

Over the past two posts I’ve outlined my hopes and frustrations around search and decision making, using my desire to acquire a classic car as an example of both the opportunity and the limitations of web search as it stands today. As an astute commentator noted on my last post – “normally a 30 minute conversation is a whole lot better for any kind of complex question.”

Which leads me to my last post in this series. What is it about a conversation? Why can we, in 30 minutes or less, boil down what otherwise might be a multi-day quest into an answer that addresses nearly all our concerns? And what might that process teach us about what the Web lacks today and might bring us tomorrow?

Well, the answer is at once simple and maddeningly complex. Our ability to communicate using language is the result of millions of years of physical and cultural evolution, capped off by 15-25 years of personal childhood and early adult experience. But it comes so naturally, we forget how extraordinary this simple act really is.

I once asked Larry Page, co-founder of Google, what his dream search engine looked like. His answer: The computer from Star Trek – an omnipresent, all-knowing machine with which you could converse. We’re a long way from that – and when we do get there, we’re bound to arrive with a fair amount of trepidation – after all, every major summer blockbuster seems to burst with the narrative of machines that out-think humans (Matrix, Terminator, Battlestar Galactica, 2001, I Robot…you get the picture).

But I have hope. Given this is my last post in the series, allow me to wax a bit philosophical. While we in the search and Internet industry focus almost exclusively on leveraging technology to get to better answers, perhaps we might take another approach. Perhaps, instead of (or as well as) scaling machines to the point where they can have a “human” conversation with us (a la Turing), we might leverage machines to help connect us to just the right human with whom we might have that conversation?

Continued…