Microsoft Uses Powerset to Extend Open Source Effort

Microsoft’s Powerset acquisition brings the first open source code into a Microsoft product.

Last summer, when Microsoft acquired Powerset, a San Francisco-based provider of semantic search technologies, it touted the two organizations' shared vision of advancing search by incorporating context in the form of user intent and meaning.

But the Powerset acquisition that made Microsoft a contender in semantic search last fall also put the once-vocal critic of open source software in the position of contributing actively to a popular open source project. Powerset contributes code to the Apache Software Foundation's Hadoop project, which means that for the first time there's open source code in a Microsoft product.

Getting to Know Hadoop
The Hadoop Framework is an open source, distributed-computing platform designed to allow implementations of Google MapReduce to run on large clusters of commodity hardware. MapReduce is a programming model for processing and generating large data sets. MapReduce divides applications into small blocks of work, enabling parallel computation over large data sets on unreliable computer clusters.
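To make the model concrete, here is a minimal, single-process sketch of the MapReduce pattern in Python, counting words across several input splits. It is an illustration only and not Hadoop's API: the function names (map_phase, shuffle, reduce_phase, word_count) are hypothetical, and a real cluster would run the map and reduce steps in parallel on many machines.

```python
from collections import defaultdict
from itertools import chain


def map_phase(split):
    """Map step: emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in split.split()]


def shuffle(pairs):
    """Group emitted values by key, the job a framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: collapse all values for one key into a single total."""
    return key, sum(values)


def word_count(splits):
    """Run map, shuffle and reduce in sequence over a list of text splits."""
    mapped = chain.from_iterable(map_phase(s) for s in splits)
    grouped = shuffle(mapped)
    return dict(reduce_phase(k, v) for k, v in grouped.items())


if __name__ == "__main__":
    print(word_count(["open source search", "semantic search", "Open data"]))
```

Because each split is mapped independently and each key is reduced independently, both steps can be spread across many nodes, and a failed piece of work can simply be rerun elsewhere.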

Powerset has based its index-build process entirely on a Hadoop cluster running the Hadoop Distributed File System (HDFS), and it also uses MapReduce. HDFS creates multiple replicas of data blocks for reliability and places them on compute nodes around the cluster, enabling MapReduce to process the data where it's located. The Powerset team also contributes to a Java-based, column-oriented, distributed database called HBase, which is part of Hadoop.
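The idea of replicating blocks and then running work where the data lives can be sketched in a few lines of Python. This is a toy model under stated assumptions, not HDFS code: the names place_replicas and schedule_local are hypothetical, replicas are placed at random (the real system uses a more careful placement policy), and the replication factor of 3 is chosen here purely for illustration.

```python
import random


def place_replicas(block_ids, nodes, replication=3, seed=0):
    """Assign each block to `replication` distinct nodes (random placement, an assumption)."""
    rng = random.Random(seed)
    return {block: rng.sample(nodes, replication) for block in block_ids}


def schedule_local(placement):
    """Give each block's task to the least-loaded node that already holds a replica."""
    load = {}
    assignment = {}
    for block, holders in placement.items():
        chosen = min(holders, key=lambda node: load.get(node, 0))
        assignment[block] = chosen
        load[chosen] = load.get(chosen, 0) + 1
    return assignment


if __name__ == "__main__":
    nodes = ["node1", "node2", "node3", "node4", "node5"]
    placement = place_replicas(["b0", "b1", "b2", "b3"], nodes)
    for block, node in schedule_local(placement).items():
        print(block, "runs on", node, "which holds one of", placement[block])
```

Every task lands on a node that already stores its block, so no block has to cross the network before processing, and losing a single node still leaves other copies of each block available.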

Scott Prevost, general manager and product director at Powerset, says Microsoft is proving to be a hands-off master. "They really are committed to providing the resources we need and just letting us build out our technology," Prevost says.

In an Oct. 14 blog entry on the Microsoft Port 25 site, Bryan Kirschner, Microsoft's director of open source strategy, wrote about the new Hadoop relationship: "We're just scratching the surface on the range of opportunities for Microsoft to participate in and contribute to open source communities in ways that are good for customers, good for communities and good for business."

Selling Windows
The Hadoop partnership is only the latest open source engagement by Microsoft, which in February 2008 launched a strategic interoperability initiative. Gartner Inc. analyst Mark Driver says that effort has been designed, in part, to draw more developers to Windows.

"Microsoft's first moves around open source have been about making sure that popular open source efforts are deployed on Windows," Driver says. "They don't want Linux to become the automatically preferred platform for open source. They want to give no one a reason not to use Windows."

Driver, who specializes in application-development technologies and open source software, sees Microsoft's strategy evolving now in ways that play to the company's strength as an effective supporter of its developers.

"In the last couple of years," he says, "we've seen the company promoting the use of Microsoft technologies to build open source projects. And that further increases the strength of Windows."

Microsoft now works with a number of purveyors of open source technologies, including PHP company Zend Technologies Ltd., CRM applications provider SugarCRM Inc. and the Apache Software Foundation. In October, Microsoft announced it would fund "advanced Silverlight development capabilities" for the Eclipse IDE.

"It's all about the ecosystem, with Windows at the heart of everything," Driver explains. "Why were they at EclipseCon last year? Why are they engaged with Apache? It's the only way they can bring those developers into the fold."

At a Glance
Microsoft says it will extend Powerset's open source contributions through technologies such as HBase:
  • An open source, column-oriented, distributed database written in Java
  • Runs on top of the Hadoop Distributed File System, providing BigTable-type capabilities
  • Is supported by a very active development community

Keeping a Balance
Bola Rotibi, principal industry analyst at Macehiter Ward-Dutton Ltd., sees Microsoft's evolving relationship with the open source community as a balancing act. Through Powerset's involvement with Hadoop, she says, Microsoft takes another step while maintaining a kind of arm's-length interaction.

"The Powerset group is a fairly separate entity within the company," Rotibi says. "But Microsoft will learn from its interactions with the open source community at that level, and I see that as a good thing for the company."

One factor that might hasten a cultural shift at Redmond, Driver points out, is the changing skill sets and interests of the workforce. Rotibi agrees: "Microsoft wants to hire the young hotshots out of the universities, and these kids are open source fanatics," she says. "Open source has been around for ages, but this is the first generation that grew up with things like PHP, and who have never developed a Java application without Eclipse. So there's a kind of organic growth taking place that's likely transforming the culture of the company. I believe that there's still a thick layer of management at Microsoft that harbors a bias against open source, but even they are looking at the world differently."

About the Author

John K. Waters is the editor in chief of a number of Converge360.com sites, with a focus on high-end development, AI and future tech. He's been writing about cutting-edge technologies and the culture of Silicon Valley for more than two decades, and he's written more than a dozen books. He also co-scripted the documentary film Silicon Valley: A 100 Year Renaissance, which aired on PBS.
