Microsoft Uses Powerset to Extend Open Source Effort
Microsoft’s Powerset acquisition brings the first open source code into a Microsoft product.
Last summer, when Microsoft acquired Powerset, a San Francisco-based provider of semantic search technologies, it touted the shared vision of the two organizations to advance search by incorporating context in the form of user intent and meaning.
But the Powerset acquisition that made Microsoft a contender in semantic search last fall also put the once-vocal critic of open source software in the position of contributing actively to a popular open source project. Powerset contributes code to the Apache Software Foundation's Hadoop project, which means that for the first time there's open source code in a Microsoft product.
Getting to Know Hadoop
The Hadoop Framework is an open source, distributed-computing platform designed to allow implementations of Google MapReduce to run on large clusters of commodity hardware. MapReduce is a programming model for processing and generating large data sets. MapReduce divides applications into small blocks of work, enabling parallel computation over large data sets on unreliable computer clusters.
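To make the programming model concrete, here is a minimal, single-process sketch of the map, shuffle and reduce steps applied to a word count. It is an illustration only and does not use Hadoop's API; the function names (map_phase, shuffle, reduce_phase, word_count) and the sample input are assumptions made up for this example.

```python
from collections import defaultdict
from itertools import chain


def map_phase(block):
    """Map: emit a (word, 1) pair for every word in one block of input."""
    return [(word.lower(), 1) for word in block.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(blocks):
    # On a real cluster each block could be mapped on a different machine;
    # this sketch simply loops over the blocks in one process.
    mapped = chain.from_iterable(map_phase(block) for block in blocks)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Hypothetical input, already split into small blocks of work.
blocks = ["search the web", "semantic search", "the web of data"]
counts = word_count(blocks)
```

Because each call to map_phase depends only on its own block, the blocks can be processed in parallel and a failed block can simply be rerun, which is what makes the model a fit for unreliable clusters.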
Powerset has based its index-build process entirely on a Hadoop cluster running the Hadoop Distributed File System (HDFS), and it also uses MapReduce. HDFS creates multiple replicas of data blocks for reliability and places them on compute nodes around the cluster, enabling MapReduce to process the data where it's located. The Powerset team also contributes to a Java-based, column-oriented, distributed database called HBase, which is part of Hadoop.
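The idea of replicating blocks and then running work where the data already lives can be sketched in a few lines. This is a simplified illustration, not HDFS's actual placement policy: the round-robin placement, the node and block names, and the helpers place_replicas and schedule_local are all assumptions invented for the example.

```python
def place_replicas(block_ids, nodes, replication=3):
    """Assign each block to `replication` distinct nodes, round-robin.

    Round-robin is an assumption for illustration only; a real
    file system would apply its own placement policy.
    """
    if replication > len(nodes):
        raise ValueError("need at least as many nodes as replicas")
    placement = {}
    for i, block in enumerate(block_ids):
        placement[block] = [
            nodes[(i + r) % len(nodes)] for r in range(replication)
        ]
    return placement


def schedule_local(block, placement, busy_nodes):
    """Pick a node that already holds the block and is free, else None."""
    for node in placement[block]:
        if node not in busy_nodes:
            return node
    return None


# Hypothetical four-node cluster holding two blocks.
nodes = ["node-a", "node-b", "node-c", "node-d"]
placement = place_replicas(["blk-0", "blk-1"], nodes)

# node-a is busy, so the task goes to another node that holds a replica.
chosen = schedule_local("blk-0", placement, busy_nodes={"node-a"})
```

Keeping several copies of each block serves two purposes in this sketch: the data survives the loss of a node, and the scheduler has more than one candidate node on which the computation can run without moving the data.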
Scott Prevost, general manager and product director at Powerset, says Microsoft is proving to be a hands-off master. "They really are committed to providing the resources we need and just letting us build out our technology," Prevost says.
In an Oct. 14 blog entry on the Microsoft Port 25 site, Bryan Kirschner, Microsoft's director of open source strategy, wrote about the new Hadoop relationship: "We're just scratching the surface on the range of opportunities for Microsoft to participate in and contribute to open source communities in ways that are good for customers, good for communities and good for business."
The Hadoop partnership is only the latest open source engagement by Microsoft, which in February 2008 launched a strategic interoperability initiative. Gartner Inc. analyst Mark Driver says that effort has been designed, in part, to draw more developers to Windows.
"Microsoft's first moves around open source have been about making sure that popular open source efforts are deployed on Windows," Driver says. "They don't want Linux to become the automatically preferred platform for open source. They want to give no one a reason not to use Windows."
Driver, who specializes in application-development technologies and open source software, sees Microsoft's strategy evolving now in ways that play to the company's strength as an effective supporter of its developers.
"In the last couple of years," he says, "we've seen the company promoting the use of Microsoft technologies to build open source projects. And that further increases the strength of Windows."
Microsoft now works with a number of purveyors of open source technologies, including PHP company Zend Technologies Ltd., CRM applications provider SugarCRM Inc. and the Apache Software Foundation. In October, Microsoft announced it would fund "advanced Silverlight development capabilities" for the Eclipse IDE.
"It's all about the ecosystem, with Windows at the heart of everything," Driver explains. "Why were they at EclipseCon last year? Why are they engaged with Apache? It's the only way they can bring those developers into the fold."
At a Glance
Microsoft says it will extend Powerset's open source contributions through technologies such as HBase:
- An open source, column-oriented, distributed database written in Java
- Runs on top of the Hadoop Distributed File System, providing BigTable-type capabilities
- Is supported by a very active development community
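As a rough picture of what "BigTable-type capabilities" means, the sketch below models a table as a sparse map from row key and column name to timestamped values. It is a toy in-memory illustration and not the HBase API; the class ToyColumnStore, its methods and the sample row keys and column names are hypothetical.

```python
from collections import defaultdict


class ToyColumnStore:
    """Toy model: row key -> column name -> list of (timestamp, value)."""

    def __init__(self):
        self._rows = defaultdict(lambda: defaultdict(list))

    def put(self, row, column, value, timestamp):
        # Every write is kept as a new version rather than overwriting.
        self._rows[row][column].append((timestamp, value))

    def get(self, row, column):
        """Return the value with the highest timestamp, or None if absent."""
        versions = self._rows.get(row, {}).get(column)
        if not versions:
            return None
        return max(versions, key=lambda tv: tv[0])[1]

    def scan_column(self, column):
        """Yield (row, latest value) for rows with this column, in key order."""
        for row in sorted(self._rows):
            if column in self._rows[row]:
                yield row, self.get(row, column)


store = ToyColumnStore()
store.put("com.example/page1", "content:title", "Old title", timestamp=1)
store.put("com.example/page1", "content:title", "New title", timestamp=2)
store.put("com.example/page2", "anchor:text", "link", timestamp=1)

latest = store.get("com.example/page1", "content:title")
titles = list(store.scan_column("content:title"))
```

Rows store only the columns they actually have, so page2 takes no space for a title it never set, and a scan over one column touches only the rows that contain it.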
Keeping a Balance
Bola Rotibi, principal industry analyst at Macehiter Ward-Dutton Ltd., sees Microsoft's evolving relationship with the open source community as a balancing act. Through Powerset's involvement with Hadoop, she says, Microsoft takes another step while maintaining a kind of arm's-length interaction.
"The Powerset group is a fairly separate entity within the company," Rotibi says. "But Microsoft will learn from its interactions with the open source community at that level, and I see that as a good thing for the company."
One factor that might hasten a cultural shift at Redmond, Driver points out, is the changing skill sets and interests of the workforce. Rotibi agrees: "Microsoft wants to hire the young hotshots out of the universities, and these kids are open source fanatics," she says. "Open source has been around for ages, but this is the first generation that grew up with things like PHP, and who have never developed a Java application without Eclipse. So there's a kind of organic growth taking place that's likely transforming the culture of the company. I believe that there's still a thick layer of management at Microsoft that harbors a bias against open source, but even they are looking at the world differently."