Code Focused

Free Databases in the Windows Azure Marketplace

The Windows Azure Marketplace has a hidden jewel: a host of free and nearly free databases ready for monetization.

During the second day keynote of the Microsoft BUILD conference in September, Satya Nadella, president of the Microsoft Server & Tools Business, announced that the Bing Translator API was available from the Windows Azure Marketplace and used by large sites such as eBay. Microsoft expanded the Marketplace to 25 new countries in early October. According to Nadella, developers can easily monetize databases through the Windows Azure Marketplace. (You can view the keynote and read the full announcement.)

I hadn't heard of the Windows Azure Marketplace prior to the keynote, so I was intrigued to see how data was being sold in it and, more specifically, to see what interesting databases I might find for free or at very low cost. And find them I did.

In this article, I'll look at how to access the Windows Azure Marketplace and a baker's dozen of databases (data feeds) that are free or available in limited subscriptions -- free for the first several thousand transactions per month. The code download includes a sample ASP.NET MVC 3 Web site that presents sample data queries from these services. The Web site is written in Visual Basic 10 for the Microsoft .NET Framework 4; equivalent data-access routines for C# developers are included for some of the databases as well.

Windows Azure Marketplace
The Windows Azure Marketplace is at datamarket.azure.com, and a good description of the highlights and benefits can be found here. The Marketplace includes software application subscriptions as well as databases. A Windows Live ID is required to create a Marketplace account. Once the account is created, you're issued a Customer ID and an Access Key. These serve as your username and password for each data service accessed; protect them as carefully as you would a credit card number, because they tie directly to your online billing account for paid services.

Some databases, such as those provided by the United Nations and the U.S. government, are free at any level of usage. Other databases, such as the Microsoft Translator, are free to a certain level and carry a fee above that level.

Fortunately, you determine what usage level and expense is appropriate for you. Each subscription level will prevent further use if the usage limit is exceeded, so you won't automatically incur fees beyond that subscription level. See Figure 1 for the subscription levels of the Microsoft Translator service as an example. To increase the available usage level, the existing subscription must be canceled and the higher-level subscription purchased. Any remaining usage on the canceled subscription level will be forfeited.


Figure 1. Subscription levels for the Microsoft Translator service.

Database subscriptions can be added to your account and become available under the "My Data" link within your account profile, collecting all your database subscriptions together for easy review.

Accessing the Databases
The Marketplace supports both fixed query and flexible query databases, and any given data feed is one or the other. A fixed query service exposes predefined methods and parameters through a secured OData feed at the provided Service URI, and comes with a C# source code library for calling them. A flexible query service is accessed by creating a Service Reference to the Service root URL and then using LINQ to query the service. The code download demonstrates accessing both fixed query and flexible query databases.

To run the code download and access the Windows Azure Marketplace databases, you'll need to use your Windows Live ID to create a Marketplace account and then subscribe to the appropriate databases at the desired level. No credit card is needed as long as only free and limited-tier subscriptions are added. The Settings tab of the AzureMarketplaceDemo project has a CustomerID field and AccessKey field to hold your identifying information. Alternatively, you can place that information in the hosting PC's Windows Registry, as described in the GetCredentials method.

The Service URIs are not meant to be accessed outside an application because they must be passed valid network credentials (username = CustomerID, password = AccessKey) in order to return a valid response.
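To make the credential requirement concrete, here is a minimal Python sketch of attaching the Customer ID and Access Key to a request. It is not taken from the code download, which uses Visual Basic and C#. It assumes the service accepts HTTP Basic authentication; the function name build_authorized_request and the placeholder credentials are hypothetical.

```python
import base64
import urllib.request


def build_authorized_request(service_uri, customer_id, access_key):
    """Return a Request carrying HTTP Basic credentials (assumed scheme)."""
    # username = Customer ID, password = Access Key, as the article describes.
    raw = f"{customer_id}:{access_key}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    request = urllib.request.Request(service_uri)
    request.add_header("Authorization", f"Basic {token}")
    return request


if __name__ == "__main__":
    # Placeholder values; substitute your own Marketplace credentials.
    req = build_authorized_request(
        "https://api.datamarket.azure.com/Bing/MicrosoftTranslator",
        "my-customer-id",
        "my-access-key",
    )
    # The request is only built here, not sent.
    print(req.get_header("Authorization"))
```

Because the header is derived directly from the Access Key, it deserves the same protection as the key itself: send it only over HTTPS and keep it out of logs.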

[Screen images of sample query results from the databases are included in the code download for those who would like to see more of the type of data available without creating a Windows Azure Marketplace account. -- Ed.]

Microsoft Translation Services
The Microsoft Translator page is found here. It's a fixed query service that provides the first 2,000 monthly translations for free. In general, try to keep each translation request at less than 1,500 characters. Translation requests are capped at 50 per second per IP address.
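One way to honor the 1,500-character guideline is to split longer text on word boundaries before submitting it. The following Python sketch shows the idea; the function name split_for_translation is hypothetical and is not part of the Translator service or the code download.

```python
def split_for_translation(text, max_chars=1500):
    """Split text into chunks that are each shorter than max_chars.

    Chunks break on whitespace, so runs of spaces and newlines are
    normalized to single spaces.
    """
    chunks = []
    current = ""
    for word in text.split():
        # A single word at or over the limit is hard-split into pieces.
        while len(word) >= max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(word[: max_chars - 1])
            word = word[max_chars - 1:]
        candidate = word if not current else current + " " + word
        if len(candidate) < max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = word
    if current:
        chunks.append(current)
    return chunks
```

Each chunk counts as its own transaction against the monthly allowance, so splitting trades request size for request count.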

As you can see in Listing 1, the service takes the source language string, the destination language code and the source language code. The OData Service URI is https://api.datamarket.azure.com/Bing/MicrosoftTranslator. The available transaction count is decremented for the database on your "My Data" Marketplace account page after each call, so it's easy to determine how much usage remains at the subscription level for the current subscription period.
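For readers who want to see the shape of such a call outside .NET, the Python sketch below builds a query URL from the three inputs. The Service URI comes from the text above, but the operation name Translate and the parameter names Text, To and From are assumptions made for illustration; check the service's own documentation for the real names.

```python
from urllib.parse import quote, urlencode

SERVICE_URI = "https://api.datamarket.azure.com/Bing/MicrosoftTranslator"


def odata_string(value):
    """Format a value as an OData string literal (single quotes doubled)."""
    return "'" + value.replace("'", "''") + "'"


def build_translate_url(text, to_language, from_language):
    # "Translate", "Text", "To" and "From" are assumed names,
    # not confirmed by the article.
    params = {
        "Text": odata_string(text),
        "To": odata_string(to_language),
        "From": odata_string(from_language),
    }
    # quote_via=quote encodes spaces as %20 rather than "+".
    return SERVICE_URI + "/Translate?" + urlencode(params, quote_via=quote)


if __name__ == "__main__":
    print(build_translate_url("Good morning", "es", "en"))
```

The URL alone is not enough to get a response; the request must also carry your Customer ID and Access Key as credentials.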

Twitter Rank
The Trstrank service by Infochimps provides a measure of the quality of a Twitter account, based on the number and quality of the followers. Located here, the database is updated the first day of each month. It returns the TrstQuotient value, a decimal number between 0 and 100 that "indicates how 'normal' a user's Trstrank is, given his number of followers." Two Twitter accounts with the same number of followers can have very different TrstQuotient values based on the "trustworthy" value, or quality, of their followers. A spam or abusive Twitter account will have a low TrstQuotient.

Up to 100,000 Trstrank transactions are permitted each month at no charge. Trstrank is a fixed query service with a sample C# class library provided. Note that Twitter account names need to be in lowercase when passed to the service. See the code download for a Visual Basic sample.

United Nations
The UNdata organization provides five separate databases in the Windows Azure Marketplace for the United Nations. Each one provides a wealth of information and is available for unlimited access at no charge:

  • Gender Info 2007
  • Key Global Indicators
  • Millennium Development Goals
  • Joint United Nations Program on HIV/AIDS
  • World Health Organization (WHO)

The UNdata databases are flexible query services accessed via a Service Reference. Listing 2 shows the code used to obtain the list of data measures available from the United Nations Gender database. The controller code is very similar to the controller code in Listing 1. Because a maximum of 100 results are returned per service query, the GetDataSeriesList method is written to loop until all the results are returned.
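The looping pattern is independent of language. As a rough Python sketch (not the article's GetDataSeriesList method), the function below keeps requesting pages of 100 until a short page signals the end. The fetch_page callable is a hypothetical stand-in for whatever actually queries the service, for example with the OData $skip and $top options.

```python
def fetch_all(fetch_page, page_size=100):
    """Collect every result by requesting successive pages.

    fetch_page(skip, top) is a caller-supplied function (hypothetical here)
    that returns a list of at most `top` items starting at offset `skip`.
    """
    results = []
    skip = 0
    while True:
        page = fetch_page(skip, page_size)
        results.extend(page)
        if len(page) < page_size:
            break  # a short (or empty) page means the feed is exhausted
        skip += page_size
    return results


if __name__ == "__main__":
    # In-memory stand-in for the remote feed.
    measures = [f"measure-{i}" for i in range(116)]

    def fake_fetch(skip, top):
        return measures[skip:skip + top]

    print(len(fetch_all(fake_fetch)))  # prints 116
```

When the total is an exact multiple of the page size, the loop makes one extra request that returns an empty page; that is the price of not knowing the total in advance.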

The United Nations Gender Info 2007 data measures available include 116 items, such as "Female/male ratio of population," "Women's share of labor force," "Women's share of tertiary enrollment in social sciences, business and law" and other interesting measures.

U.S. Crime 2006-2008
The U.S. Crime 2006-2008 database, located here, provides data from the Federal Bureau of Investigation (FBI) Uniform Crime Reporting database. This flexible query database returns data fields including aggravated assault, arson, burglary, forcible rape, larceny-theft, motor vehicle theft, murder and non-negligent manslaughter, population, property crime, robbery and violent crime. Searches are performed by city, state and year. Access is free for unlimited usage.
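In .NET such a search is written as a LINQ query against the Service Reference. On the wire, a flexible query service receives an OData $filter expression, and the Python sketch below shows roughly what one looks like for a city, state and year search. The property names City, State and Year, the entity set name and the service root used here are all assumptions for illustration, not values taken from the service.

```python
from urllib.parse import quote


def build_crime_query(service_root, entity_set, city, state, year):
    """Build an OData URL that filters on city, state and year."""

    def literal(value):
        # OData string literal: wrap in single quotes, double any inside.
        return "'" + value.replace("'", "''") + "'"

    # City, State and Year are assumed property names.
    filter_expr = (
        f"City eq {literal(city)} and State eq {literal(state)} "
        f"and Year eq {int(year)}"
    )
    root = service_root.rstrip("/")
    return f"{root}/{entity_set}?$filter={quote(filter_expr, safe='')}"


if __name__ == "__main__":
    # Hypothetical root and entity set; use the values from your subscription.
    print(build_crime_query(
        "https://example.invalid/crime", "CityCrime", "Lansing", "Michigan", 2008
    ))
```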

Wolfram|Alpha Facts
The Wolfram|Alpha Facts database is the hardest to describe, but perhaps one of the most useful. It's published by a private company and provides data on thousands of measures over a wide range of topics, many over a time series. The database uses the same underlying data as that used at wolframalpha.com. It's a fixed query database located here. Access is free for up to 5,000 transactions per month. Currently no paid subscription level with a higher transaction count is available.

World Development Indicators
The World Bank database "provides a comprehensive selection of economic, social and environmental indicators, drawing on data from the World Bank and more than 30 partner agencies. The database covers more than 900 indicators for 210 economies with data back to 1960 in many cases." This fixed query service is available here. It's updated quarterly in April, July, September and December. Access is free for unlimited use.

Zillow Real Estate Information
Zillow.com provides four different fixed query databases that are each available at no charge, but capped at 30,000 transactions per month (see Table 1). That should be sufficient for most usage scenarios.

Data Feed Name | Publisher Description | Windows Azure Marketplace Page
Home Valuation | Search results lists, Zestimate home valuations, home valuation charts, comparable houses and market trend charts. | bit.ly/n0dKnr
Mortgage Information | Current mortgage rates from Zillow Mortgage Marketplace as well as monthly payment calculations based on the current rates. | bit.ly/oayaX8
Neighborhood Data | Neighborhood and city affordability statistics, demographic data at the city and neighborhood level, and lists of regions. | bit.ly/qBRgCo
Property Details | Property-level data, including historical sales, price and year, taxes, beds/baths and more. | bit.ly/qOUEaT

Table 1. The four Zillow.com databases.

Data, Data and More Data
For those of you counting, I included one extra database for a total of 14. I've focused on databases that are either free or available at a limited subscription level with no fees for that level. There are many more databases in the Windows Azure Marketplace, with more than 100 available at the time of this writing.

The programming required to access the information in these databases in the .NET Framework -- from either C# or Visual Basic -- is extremely approachable and follows a distinct pattern. The code download contains an ASP.NET MVC 3 Web site that demonstrates sample queries for these databases. Screen images are included in the download to allow readers to see a bit of the sample query results without having to acquire a Windows Azure Marketplace ID to run the queries. There's a lot of great data available in the Windows Azure Marketplace: Go out and make full use of it.

About the Author

Joe Kunk is a Microsoft MVP in Visual Basic, three-time president of the Greater Lansing User Group for .NET, and developer for Dart Container Corporation of Mason, Michigan. He's been developing software for over 30 years and has worked in the education, government, financial and manufacturing industries. Kunk co-authored the book "Professional DevExpress ASP.NET Controls" (Wrox Programmer to Programmer, 2009). He can be reached via email at [email protected].

