Powerset Gives Microsoft Semantic Search Tools
After its Powerset acquisition, Microsoft is pushing semantic search as the replacement for traditional keyword search.
Call it Web 3.0, the intelligent Web or the semantic Web: the next iteration of the cloud is all about extracting meaning from that vast fogbank of unstructured content. Many argue that existing search technology just won't cut it for that job. Against the backdrop of recent reports asking whether Microsoft will acquire Yahoo!'s search business while also revamping its own Live Search technology, the company has numerous parallel efforts aimed at improving the way users find information.
One such effort involves the technology developed by Powerset, a San Francisco-based provider of semantic search technologies that Microsoft recently acquired. "Current search technology just doesn't leverage all of the affordances of Web 3.0: semantic Web, structured data, interoperability, collaborative filtering," says Scott Prevost, general manager and product director at Powerset. "Keyword search just doesn't address these things."
Prevost gave the day-two keynote at Jupitermedia Corp.'s Web 3.0 conference in Santa Clara, Calif., in October, where he outlined his vision for how search will evolve. "Everything today is about the keywords," Prevost told attendees. "And there are a lot of casualties in this keyword economy."
The way he sees it, semantic search replaces keyword search's core orientation around content, consumers and actions with one built on "concepts," "communities" and "behaviors." That shift supports vastly improved searches, he says. "By some measures, 40 percent of queries are unsatisfactory, and 50 percent need some kind of refinement," he says.
Roots from PARC
Powerset relies on deep natural language processing (NLP), a technology that has been in the lab for 30 years but has only recently become computationally feasible thanks to high-performance computing. Powerset's semantic search app uses this technology, which the company licenses from Palo Alto Research Center (PARC), to extract meaning from documents one at a time and encode that meaning into its index. Meaning and intent are extracted from queries at runtime. Matching the meaning of those queries with the meaning in the index yields better search results, Prevost says.
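To make the index-time and query-time split concrete, here is a minimal toy sketch in Python. It is not Powerset's or PARC's technology, which relies on far deeper linguistic analysis. The names `extract_facts`, `SemanticIndex`, `VERB_LEMMAS` and `WILDCARDS` are hypothetical, and the "meaning" extracted is just a (subject, relation, object) triple pulled from very simple sentences. The same extraction step runs on documents when they are indexed and on queries when they are asked.

```python
import re
from collections import defaultdict

# Hypothetical, hand-written verb table: maps surface forms to one relation.
# A real NLP system would derive this from linguistic analysis.
VERB_LEMMAS = {
    "acquired": "acquire", "acquires": "acquire",
    "bought": "acquire", "buys": "acquire",
    "licenses": "license", "licensed": "license",
}

# Question words are treated as "match anything" slots in a query.
WILDCARDS = {"who", "what"}


def extract_facts(text):
    """Return (subject, relation, object) triples from simple sentences."""
    facts = []
    for sentence in re.split(r"[.?!]\s*", text):
        words = sentence.split()
        for i, word in enumerate(words):
            relation = VERB_LEMMAS.get(word.lower())
            # Require at least one word on each side of the verb.
            if relation and 0 < i < len(words) - 1:
                subject = " ".join(words[:i]).lower()
                obj = " ".join(words[i + 1:]).lower()
                facts.append((subject, relation, obj))
                break
    return facts


class SemanticIndex:
    """Toy index that stores extracted facts rather than raw keywords."""

    def __init__(self):
        self.facts = defaultdict(set)  # relation -> {(subject, object, doc_id)}

    def add_document(self, doc_id, text):
        # Index time: extract meaning from one document and encode it.
        for subject, relation, obj in extract_facts(text):
            self.facts[relation].add((subject, obj, doc_id))

    def search(self, query):
        # Query time: extract meaning from the query, then match on meaning.
        results = []
        for q_subj, relation, q_obj in extract_facts(query):
            for subject, obj, doc_id in self.facts[relation]:
                subj_ok = q_subj in WILDCARDS or q_subj == subject
                obj_ok = q_obj in WILDCARDS or q_obj == obj
                if subj_ok and obj_ok:
                    results.append((doc_id, subject, relation, obj))
        return sorted(results)


index = SemanticIndex()
index.add_document("doc1", "Microsoft bought Powerset.")
index.add_document("doc2", "Powerset licenses technology from PARC.")
print(index.search("Who acquired Powerset?"))
# prints [('doc1', 'microsoft', 'acquire', 'powerset')]
```

Note that the query word "acquired" never appears in the indexed document, which says "bought." A plain keyword match on "acquired" would miss it, while the toy index matches because both sides were reduced to the same relation before comparison.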
Gartner Inc. analyst Rita Knox defines semantic searches as those that use clues about the information users are seeking based on semantic subtleties that machines have yet to master on their own.
"When pieces of information are labeled, computers can see what they mean; but most information is not labeled. It's unstructured," Knox says. "Without the labels or tags, computers are at a loss to make a human kind of sense out of things like text. The promise of semantic search is that it will provide that understanding to the machines, and that will result in more meaningful and relevant search results."
Powerset isn't the only company working on semantic search. Culver City, Calif.-based Cognition Technologies Inc. provides a set of NLP technologies that add word and phrase meanings and understanding to computer apps. Another player is New York-based Hakia Inc., which offers a general-purpose semantic search engine.
In July, Microsoft blogged about its plan to acquire Powerset. In the post, the Redmond software giant claimed to share Powerset's vision "to take search to the next level by adding understanding on the intent and meaning behind the words in searches and Web pages." In addition to acquiring Powerset, Microsoft shelled out $1.2 billion for corporate database search provider Fast Search & Transfer.
Google's massive search engine, which is based on keyword search technology, will make it difficult for Google to provide semantic understanding, Knox says. Large search engines like Google's (or Microsoft's, for that matter) have scanned and indexed reams of Web pages to support their keyword search processes. Semantic search, however, is a fundamentally different process, Knox says; any move in that direction might require them to rescan everything, a Herculean task, to say the least.
In a way, Microsoft is in the same boat, but its purchase of Powerset might give it a leg up on the competition. "There's a chance here that Microsoft could use the Powerset technology to leapfrog Google," Knox says. "Microsoft has very deep pockets, but I wouldn't underestimate Google. They're investing in semantic search, too."
Open APIs Coming?
During his presentation, Prevost emphasized the market opportunities semantic search presents: the "vast untapped dollars of the keyword economy." But if the post-keynote questions are any indication, the conference attendees were more interested in figuring out how to build applications that take advantage of semantic Web technologies and concepts. One attendee peppered Prevost with questions about the possibility of an open API.
"Semantic search is going to impact everything, including application developers," Prevost said. "Suddenly the world's information is available in a form we can compute with. Developers will be able to take content and have ways of understanding and importing it into other applications that they don't have now. The impact, I think, will be tremendous."
On the subject of providing open APIs to those developers, Prevost was circumspect. "Will we be doing that down the road? Absolutely," he said. "We've always had that somewhere on the roadmap, but we don't have any specific plans right now. Right now, we're focused on the core work. Now that we're part of Microsoft, we have a lot of resources, so stay tuned."
John K. Waters is the editor in chief of a number of Converge360.com sites, with a focus on high-end development, AI and future tech. He's been writing about cutting-edge technologies and the culture of Silicon Valley for more than two decades, and he's written more than a dozen books. He also co-scripted the documentary film Silicon Valley: A 100 Year Renaissance, which aired on PBS.