28 June 2007

Powerset demo

Part two of my liveblogging from Powerset. They're talking about how their indexing differs from conventional search. The difference helps them with both matching documents and ranking them.

In indexing, they parse each sentence on the page. For example:
'Sir Edward Heath died of pneumonia.'

Here's how they index this sentence.

- extract entities and semantic relationships
- expand to find similar entities and abstractions

In this phase, they understand that:
1. Sir Edward Heath was a UK prime minister, and therefore a politician
2. pneumonia is a disease
3. if someone died of something, they were killed by it
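
To make the expansion step concrete, here is a minimal sketch of how one extracted fact could fan out into several indexed variants. This is my own toy illustration, not Powerset's code: the `IS_A` and `INVERSE` tables, the `abstractions` and `expand` functions, and the triple format are all hypothetical, and the knowledge is hand-written where a real system would presumably get it from a parser and an ontology.

```python
from itertools import product

# Hypothetical hand-written knowledge (assumption: a real system derives this).
IS_A = {
    "edward heath": ["prime minister", "politician"],
    "pneumonia": ["disease"],
}

# Hypothetical paraphrase rule: "X died of Y" implies "Y killed X".
INVERSE = {"died of": "killed"}


def abstractions(entity):
    """Return the entity itself plus every more general term known for it."""
    return [entity] + IS_A.get(entity, [])


def expand(subject, relation, obj):
    """Expand one (subject, relation, object) fact into all indexed variants."""
    facts = set()
    for s, o in product(abstractions(subject), abstractions(obj)):
        facts.add((s, relation, o))
        if relation in INVERSE:
            # Store the inverted form too, with subject and object swapped.
            facts.add((o, INVERSE[relation], s))
    return facts


# The fact extracted from 'Sir Edward Heath died of pneumonia.'
index = expand("edward heath", "died of", "pneumonia")
```

With three ways to name the subject and two ways to name the object, the single sentence yields twelve triples, including `("pneumonia", "killed", "edward heath")` and `("politician", "died of", "disease")`.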

This is a big change from the keyword search we know (Google). It lets users phrase the same question in many different ways. For instance, Powerset can answer all of the following:

- 'what killed edward heath'
- 'which prime minister died of pneumonia'
- 'what was sir edward heath killed by'
- 'what politician died from pneumonia'
- 'politician died from disease'
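
Here is a rough sketch of why all five phrasings can hit the same sentence. It is a deliberately crude bag-of-terms match, assuming the sentence was indexed under its original words plus the expanded ones (prime minister, politician, disease, kill). Everything in it (`INDEXED_TERMS`, `STOP_WORDS`, `LEMMAS`, `query_terms`, `matches`) is hypothetical and hand-written for this one example. Unlike what Powerset describes, it ignores who did what to whom.

```python
# Hypothetical expanded terms for 'Sir Edward Heath died of pneumonia.'
INDEXED_TERMS = {
    "edward", "heath", "prime", "minister", "politician",
    "die", "kill", "pneumonia", "disease",
}

# Function words dropped at query time (illustrative, not exhaustive).
STOP_WORDS = {"what", "which", "was", "sir", "of", "from", "by"}

# Tiny lemma table standing in for real morphological analysis.
LEMMAS = {"died": "die", "killed": "kill"}


def query_terms(query):
    """Lowercase the query, drop stop words, and map words to their lemmas."""
    words = query.lower().split()
    return {LEMMAS.get(w, w) for w in words if w not in STOP_WORDS}


def matches(query):
    """True if every content term of the query is among the indexed terms."""
    return query_terms(query) <= INDEXED_TERMS


QUERIES = [
    "what killed edward heath",
    "which prime minister died of pneumonia",
    "what was sir edward heath killed by",
    "what politician died from pneumonia",
    "politician died from disease",
]

results = {q: matches(q) for q in QUERIES}
```

Every entry in `results` comes out `True`, while a query such as `matches("what killed tony blair")` returns `False` because "tony" and "blair" were never indexed for this sentence.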

So far, Powerset has indexed the NY Times corpus and Wikipedia, and is working with Freebase.

