From Matt at VentureBeat. More when I can grok…
Powerset, a San Francisco search engine company, will announce Friday it has won exclusive rights to significant search engine technology it says may help propel it past Google.
The technology, developed at Palo Alto Research Center (PARC) in Silicon Valley, seeks to understand the meanings between words, akin to the way humans understand language — and is thus called “natural language.” It has been thirty years in the works.
The deal is significant because practical use of linguistic technology has eluded Google. The giant search engine has said it wants to implement language-understanding technology one day. However, tests of linguistic approaches haven’t made any difference in Google’s results so far, it says (see VentureBeat’s Thursday Q&A with Google’s director of research Peter Norvig below; also see his speech last year about this at Berkeley). Google has shunned reliance on word meanings, instead focusing on finding the most popular pages that contain the keywords. As for relationships between words, Google relies on statistical relationships, such as how frequently they appear together, but not on linguistic relationships.
The deal with PARC, which is owned by Xerox, is an answer to Powerset critics, such as search expert Danny Sullivan, who all but heaped scorn on Powerset’s ambitions when we first wrote about them. At the time, Sullivan didn’t know the degree to which Powerset has focused on this.
7 thoughts on “Powersets – Breaking News”
man that is a fine piece of VC PR. Does anyone actually fall for this type of journalism ^H^H^H^H^H^H^H^H crap?
It’s hard to know why stuff like this just gets pushed out when a company decides to send out a press release. Very odd.
I really don’t understand how this could possibly work. Natural language processing or no, there’s simply not enough information available to a search engine to properly distinguish our pathologically short queries’ intentions: “bass”, “apple”, “bat”. At this point, to do any sort of meaningful natural language processing you need to have a Google-sized database of personal information — that’s the only way that my friends can tell that I’m interested in music, not fish.
It’s also amusing to look at the perks on their career page — obviously targeting Googlers who commute from SF.
On a side note, I couldn’t leave my blog’s address in the comment information fields because I’m not logged in using TypeKey, but search as I may I can’t figure out how to do that.
As promising as their natural language platform sounds, the greatest threat to Google’s growing hegemony in the search/paid search arenas…given that about 1/2 of all searches are known to be for products and services…may actually spring not from better search, but from patent pending (#11/250,908) paid match, which will target people’s actual demographic and psychographic traits and characteristics (keytraits) instead of just the words we all type into little search boxes.
Though paid match is not yet an operating system, our own US Dept of Labor does run a very popular service (over 500,000 users/month) which provides an enlightening and instructive peek at the potential that such a paid match search/ad platform possesses.
Called GovBenefits (available at govbenefits.gov), it utilizes a personal profile and a match engine to determine what government benefit programs people qualify for.
Were such a system populated with the 100’s of thousands to millions of products and services companies provide nation/worldwide instead of just the 400-odd government programs it includes now, one can only imagine what its public popularity would be…
…and with the world’s advertisers having the ability to pinpoint-target and control exactly who sees their ads, by bidding directly on those keytraits most relevant and applicable to their products and services (goodbye click fraud), one can also only imagine the deleterious effects that such an elegant and superior system/platform would have on a 95% PPC income dependent company like Google…
…and as for the million+ users such a system would ideally initially need to appeal to the largest number of advertisers?
…just one simple invite e-mail to their 100+ million e-mail users…along with one to their 50-100,000+ advertisers by Microsoft and/or Yahoo…and POW!
Instant new billion+ dollar advertising marketplace.
I am someone that builds search technology and has experience with NLP. All I have to say is “show me the relevancy”.
Powerset is a latecomer and far behind others in the NLP search tech space. NLP has always thrown away context to fit SQL database calls. A fundamentally new database architecture is required (patents filed as early as 1994) to use every scrap of context expressed by well-articulated needs (the query). You can experience an award-winning NLP enterprise search offering (activated in 2005) at Boston’s Children’s Hospital’s Center for Media and Child Health – http://www.cmch.tv – go to their “research” page and experience “Smart Search.” This NLP engine encourages (for highest precision) an everyday conversational query of unlimited length and complexity, including “user jargon” of ten social science professional domains.
The next and final (post Google/Powerset) achievement in breakthrough user experience will be Jarg Corporation’s Semantic Knowledge Indexing Platform (SKIP) launch mastering “NOP” Natural Object Parsing that co-populates “well-understood native object content fragments” in the same master index with NLP-graph fragments. This final step – using conversational style requests (over a cell phone or keyboard) – will provide total information awareness associated with the “role” of the user, as derived on the fly from the full context of the request’s information needs. Only relevant knowledge will be considered, and the more contexts in the request, the more highly personalized will be the returns-ranking. These returns will be a “collage,” ranked by fit-to-context, of image segments, fragrances, text, structure segments, music segments and all forms of knowledge with precise contextual relation to your on-the-fly needs – fit to your “user’s role” of the moment. Jarg will be seeking its very first institutional capital starting in March 2007. Jarg has incorporated Semantx Life Science, Inc., Care Commons, Inc. and Preemptive Alert Corporation to become best of breed in their verticals.
I was just thinking that this strategy worked so well with Inxight. I’m sure their venture backers are remembering how easy it was to build a company around that technology and turn it into a return.
I’m wondering if what was licensed by Xerox is the same finite-state transducer technology that is behind their NLP technology. If so, many people have already tried and failed to scale a schema that extracts using that technology into an effective semantic search system (in fact there are probably half a dozen firms in the Valley that are finding out why that is so hard right now).
Irony is that if they had worked with the high end consumers on the Intel/Defense side, they could give them a ton of hints about what not to do that might save the current backers of PowerSet about half of their initial round in development money. But… the Valley tends to take old ideas (from a technical side), rehype them, put on a cute name and see if the resumes can get acquired. At least Google had the good sense to repurpose well validated co-reference/citation analysis that had been used for years for ranking the value of tech transfers/development in S&T analysis in the Intel space and apply it to URLs.
If Google wanted to compete/nullify Powerset out of the box, they could probably stroke a $100M check to Inxight and pick up all of the same stuff in production form.
More power to the Powerset gang, but I get the sense that I’ve seen and heard all of these ideas before from arguably smarter people with about 10X the resources. Then again, I have a hunch that Powerset is mostly marketing around basic extraction and heuristic schemas that will be UI-engineered to seem a lot more intelligent than is actually the case.