28 June 2007

Powerset and Powerlabs, roundup

In previous posts, I liveblogged Powerset's presentation of Powerlabs this evening to a select group of journalists, technologists, and bloggers. The big idea is that Powerlabs (launching in September, ahead of Powerset's search engine) will be a Digg-like site where community members can suggest and vote on Powerset features. Powerset aims to be incredibly open - as Steve Newcomb joked, the only thing stealth about them is that they're in stealth mode.

What Powerset is shooting for is ambitious, and has the potential to greatly improve how we find information online. Google has a lot of talent and smarts, but all the major players are doing variants of the same thing: statistical analysis on top of keyword search, an idea that goes back decades to Salton's work on document indexing. Powerset's approach has deep roots too, in decades of linguistic research and development at Xerox PARC, but turning even the best research platform into an internet search engine requires a lot of work. One example: the core engine they licensed originally took over a minute per sentence to index Wikipedia entries. Now, with optimizations, it's down to a second or less. That's still pretty CPU intensive, but as Steve Newcomb pointed out, indexing costs are small compared to the normal runtime costs of a popular search engine. At Google scale, Powerset would be profitable even with the increased compute needs.

Steve also made a point of saying that Powerset has never called themselves a Google killer. Still, they're trying to do something that's very cool. If they can do what they demoed tonight on a grand scale, I'll switch.
