Microsoft Uses Powerset to Extend Open Source Effort

Microsoft’s Powerset acquisition brings the first open source code into a Microsoft product.

Last summer, when Microsoft acquired Powerset, a San Francisco-based provider of semantic search technologies, it touted the two organizations' shared vision of advancing search by incorporating context in the form of user intent and meaning.

But the Powerset acquisition that made Microsoft a contender in semantic search last fall also put the once-vocal critic of open source software in the position of contributing actively to a popular open source project. Powerset contributes code to the Apache Software Foundation's Hadoop project, which means that for the first time there's open source code in a Microsoft product.

Getting to Know Hadoop
The Hadoop Framework is an open source, distributed-computing platform designed to allow implementations of Google MapReduce to run on large clusters of commodity hardware. MapReduce is a programming model for processing and generating large data sets. MapReduce divides applications into small blocks of work, enabling parallel computation over large data sets on unreliable computer clusters.
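
To make the programming model concrete, here is a minimal single-process sketch of the map, group and reduce steps, using word counting as the example. It is written in plain Python and is not the Hadoop API; the function names (map_phase, shuffle, reduce_phase, word_count) and the sample documents are assumptions chosen for illustration.

```python
from collections import defaultdict

def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]

def shuffle(pairs):
    """Group intermediate values by key, as a framework would between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    """Reduce step: combine all values emitted for one key."""
    return key, sum(values)

def word_count(documents):
    # Each document is an independent block of work. On a real cluster the
    # map calls would run in parallel on different machines; here they run
    # one after another in a single process.
    intermediate = []
    for doc in documents:
        intermediate.extend(map_phase(doc))
    grouped = shuffle(intermediate)
    return dict(reduce_phase(key, values) for key, values in grouped.items())

counts = word_count(["search the web", "semantic search", "the web of meaning"])
```

Because each map call depends only on its own input, a failed block of work can simply be rerun elsewhere, which is what makes the model a reasonable fit for unreliable clusters.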

Powerset has based its index-build process entirely on a Hadoop cluster running the Hadoop Distributed File System (HDFS), and it also uses MapReduce. HDFS creates multiple replicas of data blocks for reliability and places them on compute nodes around the cluster, enabling MapReduce to process the data where it's located. The Powerset team also contributes to a Java-based, column-oriented, distributed database called HBase, which is part of Hadoop.
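
The idea of replicating blocks and then running work where the data already sits can be sketched in a few lines. This is a toy model, not HDFS code: the round-robin placement, the node and block names, and the helper functions place_replicas and schedule_local are all assumptions for illustration, and real HDFS placement is more involved.

```python
def place_replicas(blocks, nodes, replication=3):
    """Assign each block to `replication` distinct nodes, round-robin (toy policy)."""
    if replication > len(nodes):
        raise ValueError("need at least as many nodes as replicas")
    placement = {}
    for i, block in enumerate(blocks):
        placement[block] = [nodes[(i + r) % len(nodes)] for r in range(replication)]
    return placement

def schedule_local(placement, free_nodes):
    """For each block, pick a free node that already stores a replica of it."""
    assignments = {}
    for block, holders in placement.items():
        local = [node for node in holders if node in free_nodes]
        # None means no replica holder is free, so the data would have to
        # be read over the network instead.
        assignments[block] = local[0] if local else None
    return assignments

nodes = ["node-a", "node-b", "node-c", "node-d"]   # hypothetical cluster
blocks = ["blk-0", "blk-1", "blk-2"]               # hypothetical data blocks
placement = place_replicas(blocks, nodes)
assignments = schedule_local(placement, free_nodes={"node-a", "node-d"})

# Losing any single node still leaves at least one copy of every block.
survives_one_failure = all(
    any(holder != failed for holder in holders)
    for failed in nodes
    for holders in placement.values()
)
```

With three copies of each block, the scheduler in this sketch usually has a choice of nodes that can process a block without moving it.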

Scott Prevost, general manager and product director at Powerset, says Microsoft is proving to be a hands-off master. "They really are committed to providing the resources we need and just letting us build out our technology," Prevost says.

In an Oct. 14 blog entry on the Microsoft Port 25 site, Bryan Kirschner, Microsoft's director of open source strategy, wrote about the new Hadoop relationship: "We're just scratching the surface on the range of opportunities for Microsoft to participate in and contribute to open source communities in ways that are good for customers, good for communities and good for business."

Selling Windows
The Hadoop partnership is only the latest open source engagement by Microsoft, which in February 2008 launched a strategic interoperability initiative. Gartner Inc. analyst Mark Driver says that effort has been designed, in part, to draw more developers to Windows.

"Microsoft's first moves around open source have been about making sure that popular open source efforts are deployed on Windows," Driver says. "They don't want Linux to become the automatically preferred platform for open source. They want to give no one a reason not to use Windows."

Driver, who specializes in application-development technologies and open source software, sees Microsoft's strategy evolving now in ways that play to the company's strength as an effective supporter of its developers.

"In the last couple of years," he says, "we've seen the company promoting the use of Microsoft technologies to build open source projects. And that further increases the strength of Windows."

Microsoft now works with a number of purveyors of open source technologies, including PHP company Zend Technologies Ltd., CRM applications provider SugarCRM Inc. and the Apache Software Foundation. In October, Microsoft announced it would fund "advanced Silverlight development capabilities" for the Eclipse IDE.

"It's all about the ecosystem, with Windows at the heart of everything," Driver explains. "Why were they at EclipseCon last year? Why are they engaged with Apache? It's the only way they can bring those developers into the fold."

At a Glance
Microsoft says it will extend Powerset's open source contributions through technologies such as HBase:
  • An open source, column-oriented, distributed database written in Java
  • Runs on top of the Hadoop Distributed File System, providing BigTable-type capabilities
  • Is supported by a very active development community
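
For readers unfamiliar with the BigTable-style data model, the following in-memory sketch shows the general shape: values are addressed by row key, column and timestamp, and rows are scanned in sorted key order. It is an illustration only and does not use the HBase API; the class ToyColumnStore, its methods and the sample rows are assumptions.

```python
class ToyColumnStore:
    """In-memory stand-in for a sorted map of row key -> column -> versions."""

    def __init__(self):
        # row_key -> {"family:qualifier": [(timestamp, value), ...]}
        self._rows = {}

    def put(self, row_key, column, value, timestamp):
        versions = self._rows.setdefault(row_key, {}).setdefault(column, [])
        versions.append((timestamp, value))
        # Keep the newest version first so reads return it by default.
        versions.sort(key=lambda version: version[0], reverse=True)

    def get(self, row_key, column):
        versions = self._rows.get(row_key, {}).get(column)
        return versions[0][1] if versions else None

    def scan(self, start_key, stop_key):
        """Yield rows with start_key <= key < stop_key, in sorted key order."""
        for key in sorted(self._rows):
            if start_key <= key < stop_key:
                latest = {col: vers[0][1] for col, vers in self._rows[key].items()}
                yield key, latest

store = ToyColumnStore()
store.put("row-1", "content:html", "<p>v1</p>", timestamp=1)
store.put("row-1", "content:html", "<p>v2</p>", timestamp=2)
store.put("row-2", "anchor:text", "Example", timestamp=1)
store.put("row-3", "anchor:text", "Other", timestamp=1)

latest_html = store.get("row-1", "content:html")
scanned = list(store.scan("row-1", "row-3"))
```

Rows need not share the same columns, which is why this kind of store suits sparse data such as web pages with varying attributes.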

Keeping a Balance
Bola Rotibi, principal industry analyst at Macehiter Ward-Dutton Ltd., sees Microsoft's evolving relationship with the open source community as a balancing act. Through Powerset's involvement with Hadoop, she says, Microsoft takes another step while maintaining a kind of arm's-length interaction.

"The Powerset group is a fairly separate entity within the company," Rotibi says. "But Microsoft will learn from its interactions with the open source community at that level, and I see that as a good thing for the company."

One factor that might hasten a cultural shift at Redmond, Driver points out, is the changing skill sets and interests of the workforce. Rotibi agrees: "Microsoft wants to hire the young hotshots out of the universities, and these kids are open source fanatics," she says. "Open source has been around for ages, but this is the first generation that grew up with things like PHP, and who have never developed a Java application without Eclipse. So there's a kind of organic growth taking place that's likely transforming the culture of the company. I believe that there's still a thick layer of management at Microsoft that harbors a bias against open source, but even they are looking at the world differently."

About the Author

John K. Waters is the editor in chief of a number of Converge360.com sites, with a focus on high-end development, AI and future tech. He's been writing about cutting-edge technologies and culture of Silicon Valley for more than two decades, and he's written more than a dozen books. He also co-scripted the documentary film Silicon Valley: A 100 Year Renaissance, which aired on PBS. He can be reached at [email protected].
