VSM Cover Story

Retire Your Data Center

Visual Studio 2008, ASP.NET, and the Azure Services Platform combine to simplify local development of data-intensive Web apps and automate their deployment in Microsoft data centers. The result: You get maximized availability and reliability with almost limitless on-demand scalability, while you pay only for resources consumed.

TECHNOLOGY TOOLBOX: VB.NET, C#, ASP.NET, XML, Other: Azure Services Platform, Amazon Elastic Compute Cloud (EC2)

The three-year hiatus between the Microsoft Professional Developers Conference (PDC) 2008 and its 2005 predecessor gave Ray Ozzie and his newly expanded development team the opportunity to play catch-up with Amazon.com Inc., Google Inc., and other major players in the race to capture developer mindshare for cloud computing. Microsoft's cloud candidate -- Windows Azure -- occupied the bulk of Ozzie's Day One PDC keynote and was the subject of 39 sessions, more than twice as many as the next-most-discussed topic -- Visual Studio. Windows Azure is an Internet-facing operating system that promises to enable .NET developers to leverage their current ASP.NET, Windows Communication Foundation (WCF), and Windows Workflow programming skills to deploy .NET Web applications to Microsoft's newly built data centers quickly and easily. Microsoft's message to VS developers is to "use your existing tools, knowledge, and skill set" for projects you deploy to Windows Azure.

The Azure Services Platform, for which the Windows Azure OS serves as the foundation, provides "massively scalable" table and blob storage services, a persistent message-queue service, several .NET utility services (formerly code-named "Zurich"), a Live operating environment, and the successor to SQL Server Data Services that's now called SQL Data Services (see Figure 1). I'll give you a brief introduction to the platform's objectives and the architecture that implements them, and describe how Azure differs from its cloud-based competitors. Then I'll drill down into programming data-intensive Web applications that take advantage of Azure's unique development and deployment tools. Finally, I'll discuss an instrumented ASP.NET test harness for Azure Table Services that you can download, run locally with the community technology preview (CTP) of the Windows Azure SDK, and then deploy to the Windows Azure CTP running in a Microsoft data center. Future VSM issues will cover Blob Services, Queue Services, SQL Data Services, and .NET Services.

"Cloud computing" is a catch-all term for Web-based utility computing operations provisioned as pay-by-usage software services and accessed over the Internet. Amazon Web Services and Google App Engine are probably the best-known cloud-computing services, but Web-hosting firms, such as Rackspace Hosting Inc., and specialty vertical-market providers like SalesForce.com Inc., also fit into the cloud-computing picture. The primary economic justification for moving data center activity to the cloud is avoidance of capital expenditure for servers and associated networking hardware to handle peak instead of average loads. Other benefits include reduction of IT management and operating costs, improved application reliability and availability, and the ability to scale Web applications up-and-out quickly to match rapid increases in traffic. When the traffic subsides, such as after holiday sales, resources can return to the pool automatically or by manual intervention. Cloud deployment also offers a rapid method for proving Web application or service concepts without making infrastructure investments.

Azure Echoes AWS
Azure's architecture most closely resembles a combination of Amazon Web Services (AWS) Elastic Compute Cloud (EC2) running Windows Server 2003 R2 with SimpleDB for semi-structured tables, Simple Storage Service (S3) for blob storage, Simple Queue Service (SQS) for messaging between applications, and Elastic Block Store for persisting other instance data (see Table 1 and Additional Resources). Amazon EC2 running Windows Server and SimpleDB are in the beta-testing stage, as is the Google App Engine (GAE). Microsoft won't reveal pricing for Azure services until later in 2009, when version 1 is closer to release. However, the current word is that Azure pricing will be "competitive" -- presumably with the AWS price list, the GAE price list, or both -- and that the Service Level Agreement will be factored into monthly charges. It's not known whether Microsoft will adopt Google's approach of billing only for App Engine usage in excess of fixed quotas for free CPU time, network ingress and egress, and storage.

Windows Azure runs on Windows Server 2008 with virtualization provided by Microsoft's Hyper-V hypervisor technology to deliver a runtime fabric that handles load balancing, data replication, and resource management. According to Microsoft's Erick Smith, the Azure Fabric Controller maintains a graph of the inventory of physical and virtual machines, load balancers, routers, and switches it manages in a Microsoft data center. Edges of the graph are interconnections of various types; for example, network, serial, and power cables. You specify the topology of your service -- the number and connectivity of roles, the attributes and locations of the various hardware components, and the numbers of fault/update domains and maximum instances of each role you need -- with a declarative Service Model. In this respect, Windows Azure management features are similar to those offered by RightScale for AWS and employed by the GAE. Roles are runnable components of an application; role instances run on the fabric's nodes, and channels connect roles. The CTP limits applications to managed code authored in VS 2008 that runs under a custom version of medium-trust Code Access Security. Microsoft promises future support for Python, Ruby, native code, and Eclipse. Fault domains for role instances represent a single point of failure, such as a rack; update domains for performing rolling software upgrades or patches run across multiple fault domains (see Figure 2). Ultimately, you'll be able to specify your Service Model with Oslo's domain-specific language tools and store the model in the Oslo repository.

The current CTP released at PDC 2008 doesn't expose the Service Model; instead, the Windows Azure Tools for Microsoft Visual Studio add-in defines common Azure application-role templates for Web Role, Worker Role, and Workflow. The Web Role template creates a new Web Cloud Service or Web and Worker Cloud Service as an ASP.NET Web application running on IIS 7 instance(s) under Windows Server 2008 (see Figure 3). Windows Azure doesn't support file-system Web site projects. Worker Roles perform asynchronous background processing when added to Web projects; they also can run as standalone Worker Cloud Services. The most common use for Worker Roles is processing messages added to an Azure Queue, as the sketch that follows illustrates. Workflows enable writing standalone CloudSequentialWorkflow projects or can be incorporated in Web or Worker Cloud Services. Azure's October 2008 CTP limits testers to a maximum of 2,000 runtime hours with up to eight instances of a single production application having one Web and, optionally, one Worker Role.
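
To make the Worker Role pattern concrete, here's a minimal sketch of a queue-polling worker, assuming the CTP's RoleEntryPoint base class and the StorageClient sample library's queue wrappers; the "workitems" queue name is hypothetical:

using System.Threading;
using Microsoft.ServiceHosting.ServiceRuntime;
using Microsoft.Samples.ServiceHosting.StorageClient;

public class WorkerRole : RoleEntryPoint
{
    public override void Start()
    {
        // Bind to the queue account defined in
        // ServiceConfiguration.cscfg
        StorageAccountInfo account = StorageAccountInfo.
            GetDefaultQueueStorageAccountFromConfiguration();
        MessageQueue queue = QueueStorage.
            Create(account).GetQueue("workitems");
        queue.CreateQueue(); // no-op if it already exists

        while (true)
        {
            // Poll for work; delete each message only
            // after successful processing
            Message msg = queue.GetMessage();
            if (msg != null)
            {
                RoleManager.WriteToLog("Information",
                    "Processing " + msg.ContentAsString());
                queue.DeleteMessage(msg);
            }
            else
            {
                Thread.Sleep(1000); // back off when idle
            }
        }
    }

    public override RoleStatus GetHealthStatus()
    {
        return RoleStatus.Healthy;
    }
}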

Emulate the Cloud Locally
Azure was in a limited (private) beta stage at press time. You can apply for admission to the Azure Services Platform beta program at the Microsoft Connect site; the OakLeaf blog offers a walkthrough of the somewhat convoluted sign-up process (see Additional Resources). However, you don't need to wait for a beta invitation because downloading and installing the Windows Azure SDK (October 2008 CTP) and Windows Azure Tools for Microsoft Visual Studio (October 2008 CTP) lets you emulate the Azure Services Platform on your local development machine. Installing the SDK adds a start menu folder with Development Fabric and Development Storage nodes, as well as Release Notes, Windows Azure SDK Command Prompt, and Windows Azure SDK Documentation nodes. Choosing the Development Fabric and Development Storage nodes or clicking their icons in the Taskbar's Notification Area opens management dialogs for the two service emulators (see Figure 4).

When you specify one of the Cloud Services templates (typically a Web Cloud Service) for a new VS 2008 project, the add-in generates a solution with a Cloud Service project, which contains ServiceConfiguration.cscfg and ServiceDefinition.csdef files, as well as an ASP.NET Web Role application. You also must expand the Windows Azure SDK's \Program Files\Windows Azure SDK\v1.0\samples.zip file to a temporary folder. Then add the \Temp\StorageClient\Lib\StorageClient.csproj and \Temp\HelloFabric\Common\Common.csproj projects to your solution by right-clicking on the solution in Solution Explorer and selecting Add an Existing Project, and add references to the two projects in your ProjectName_WebRole application. The StorageClient library delivers wrapper classes for REST API operations on Azure Blob, Queue, and Table Services; the Common library provides ApplicationEnvironment classes for logging and other local fabric-related activities. The Azure Tools add a reference to Microsoft.ServiceHosting.ServiceRuntime.dll for the local fabric automatically. Right-click on the Cloud Service project node and choose Create Test Storage Tables to add ProjectName tables to your local instance of SQL Server 2005+ Express. At this point, you can import page and class files from existing ASP.NET projects or create a new Web application from scratch.

Settings in the ServiceConfiguration.cscfg file determine whether your application uses local or cloud storage when emulating the hosted service. Here's a ServiceConfiguration.cscfg file with entries for both local and cloud storage and AccountSharedKey values abbreviated:

<?xml version="1.0"?>
<ServiceConfiguration 
    serviceName="SampleWebCloudService" 
    xmlns="http://schemas.microsoft.com/ 
      ServiceHosting/2008/10/
      ServiceConfiguration">
  <Role name="WebRole">
    <Instances count="3"/>
    <ConfigurationSettings>
      <Setting name="AccountName" 
          value="devstoreaccount1"/>
      <Setting name="AccountSharedKey" 
          value="Eby8vd … MGw=="/>
      <Setting name="BlobStorageEndpoint" 
          value="http://127.0.0.1:10000/"/>
      <Setting name="QueueStorageEndpoint" 
          value="http://127.0.0.1:10001/"/>
      <Setting name="TableStorageEndpoint" 
          value="http://127.0.0.1:10002/"/>
      <!-- <Setting name="AccountName" 
          value="<youraccountname>"/>
      <Setting name="AccountSharedKey" 
          value="<YourPrimaryAccessKey>" />
      <Setting name="BlobStorageEndpoint" 
          value="http://blob.core.windows.net" />
      <Setting name="QueueStorageEndpoint" 
          value="http://queue.core.windows.net" />
      <Setting name="TableStorageEndpoint" 
          value="http://table.core.windows.net" /> -->
    </ConfigurationSettings>
  </Role>
</ServiceConfiguration>

Settings for cloud storage services are commented out in the preceding example; local setting values are the same for all users. The <Instances count="3"/> element causes three instances to run in the Development or Azure Fabric. Note that the preceding 127.0.0.1 (localhost) port numbers appear in Figure 4's Development Storage dialog. YourPrimaryAccessKey is the base64-encoded Primary Access Key value on the Azure Portal's Project Summary page.
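
If you need one of these values directly in role code, the CTP's RoleManager class reads settings from the service configuration. Here's a minimal sketch:

using Microsoft.ServiceHosting.ServiceRuntime;

// Returns "devstoreaccount1" with the local settings above
string accountName = 
    RoleManager.GetConfigurationSetting("AccountName");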

Each <Setting name="*Endpoint" … /> element in the ServiceConfiguration.cscfg file must have a corresponding <Setting> entry in the ServiceDefinition.csdef file's <ConfigurationSettings> section:

<?xml version="1.0" encoding="utf-8"?>
<ServiceDefinition name="SampleWebCloudService" 
    xmlns="http://schemas.microsoft.com/
      ServiceHosting/2008/10/ServiceDefinition">
  <WebRole name="WebRole">
    <InputEndpoints>
      <!-- Must use port 80 for http and port 443 for 
             https when running in the cloud -->
      <InputEndpoint name="HttpIn" 
                               protocol="http" 
                               port="80" />
    </InputEndpoints>
    <ConfigurationSettings>
      <Setting name="AccountName"/>
      <Setting name="AccountSharedKey"/>
      <Setting name="BlobStorageEndpoint"/>
      <Setting name="QueueStorageEndpoint"/>
      <Setting name="TableStorageEndpoint"/>
    </ConfigurationSettings>
  </WebRole>
</ServiceDefinition>

If the <Setting> element name attribute values don't match in the two files, you receive an "Invalid configuration file" message when running in the Development Fabric; the Azure Fabric's version of the message adds the names of the missing or misspelled <ConfigurationSettings> elements. The matching <Setting> name attributes requirement is hidden in an "Issues and Constraints in the Windows Azure Tools and SDK" white paper (see Additional Resources). The <InputEndpoint> element's port attribute determines whether Azure expects clear-text HTTP or encrypted HTTPS -- Secure Sockets Layer (SSL) -- protocol. Setting up HTTPS requires completing the entries on the SSL page of the ProjectName node's properties sheet.

Connect to Schemaless EAV Tables with REST
Windows Azure tables are similar to the schemaless Entity-Attribute-Value (EAV) tables of the initial SQL Server Data Services (SSDS) CTP's Authority-Container-Entity (ACE) model, which has been incorporated into Windows Azure as SQL Data Services (SDS).

The schemaless EAV data model scales far better than conventional file-system-based relational tables. Azure tables have a distributed architecture that's related to Google's Bigtable, and Microsoft says Azure tables are designed to scale to billions of entities and terabytes of data. Both Azure tables and SDS offer free-form "open properties" (formerly "flex properties"), or property bags, and permit adding or removing table attribute/value pairs at will. However, the two versions have different required properties and sets of data types.

Azure tables require PartitionKey and RowKey string property values, which form a concatenated primary key, and automatically add a DateTime Timestamp property value to each entity. In the sample project, for example, every entity in the Customers table has the PartitionKey value Customer and its CustomerID value, such as CENTC, as the RowKey. SDS entities require a unique Id property value, permit an optional Kind property, and supply an autoincrementing Version value for concurrency management.

Azure tables support Binary, Bool, DateTime, Double, GUID, Int, Long, and String data types; SDS offers Base64Binary, Blob, Boolean, DateTime, Decimal, and String data types. SimpleDB property values are limited to strings and require padding numbers with leading zeros and adding offsets for negative values, which complicates client programming. My "Comparing Google App Engine, Amazon SimpleDB, and Microsoft SQL Server Data Services" blog post provides a detailed comparison of those three services (see Additional Resources).

Azure tables and SDS containers are units of consistency and have a size limit of 2GB; SimpleDB domains, which correspond approximately to Azure tables, hold a maximum of 10GB. Azure tables and SDS containers offer strong consistency: all observers see the same value immediately after an update, and special algorithms guarantee consistency across multiple replicas. SimpleDB domains promise only eventual consistency after a period with no new updates, and Amazon doesn't specify the inconsistency window's maximum duration.

Transactions are crucial for online order processing and other business applications, but neither Azure nor SimpleDB tables currently support transactions. However, the Azure team promises that their tables will "at some point in the future perform atomic transactions across multiple entities within the same partition." Partitions are similar to Bigtable's tablets with some characteristics of the App Engine's entity groups thrown in; a single VM or server stores all entities in the table with the same PartitionKey value. (App Engine offers transactions for entities in the same Entity Group.) The CTP provides equality filters only and sorts in PartitionKey/RowKey order; inequality filters and developer-specified secondary indexes for alternative sorting orders are slated for the release version.

Progress Toward Standards
Microsoft's data-access teams are moving to standards-based RESTful data access protocols at an increasingly rapid pace. Azure tables, blobs, and queues have a REST API, while SDS and SimpleDB provide REST and SOAP protocols.

Azure tables and SDS use a new version of the ADO.NET Data Services ("Astoria") runtime, which supports "open properties" with the Atom syndication wire protocol and operates with the existing ADO.NET Data Services client libraries and tools. Pablo Castro's "ADO.NET Data Services in Windows Azure: pushing scalability to the next level" blog post describes how the Astoria team modified the runtime to accommodate a dynamic "open properties" provider for Azure and SDS, in addition to the static CLR types defined by the more common Entity Framework and LINQ to SQL data providers (see Additional Resources).

The REST API for Azure defines HTTP GET (query), POST (create), PUT (update with properties replaced), MERGE (update without properties replaced), and DELETE operations on entities. Here's the GET request header for the second page of the sample project's Customers GridView, which starts with CENTC as the CustomerID and RowKey value:

GET /CustomerTable()?$top=12&NextPartitionKey=
    Customer&NextRowKey=CENTC HTTP/1.1
User-Agent: Microsoft ADO.NET Data Services
x-ms-date: Fri, 19 Dec 2008 23:37:31 GMT
Authorization: SharedKeyLite oakleaf:Z/KA … 5Yc=
Accept: application/atom+xml,application/xml
Accept-Charset: UTF-8
DataServiceVersion: 1.0;NetFx
MaxDataServiceVersion: 1.0;NetFx
Host: oakleaf.table.core.windows.net

Azure Table and Queue Services require authentication, as do private Blob Services; blobs are the only storage type that can be specified for public access. The Authorization header's SharedKeyLite value, shown abbreviated in the preceding request header, is specific to the Astoria client and is the HMAC-SHA256 encoding of this string:

Fri, 19 Dec 2008 23:37:31 GMT\n/oakleaf/Tables
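
The StorageClient library computes this signature for you; purely for illustration, here's a minimal sketch of the encoding step, with SignSharedKeyLite as a hypothetical helper name:

using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper: builds the SharedKeyLite 
// Authorization header value from the request date 
// and the canonicalized resource path
static string SignSharedKeyLite(string accountName, 
    string base64Key, string httpDate, 
    string canonicalizedResource)
{
    // String to sign: date, newline, canonicalized resource
    string stringToSign = httpDate + "\n" + 
        canonicalizedResource;
    using (HMACSHA256 hmac = new HMACSHA256(
        Convert.FromBase64String(base64Key)))
    {
        byte[] digest = hmac.ComputeHash(
            Encoding.UTF8.GetBytes(stringToSign));
        return "SharedKeyLite " + accountName + ":" + 
            Convert.ToBase64String(digest);
    }
}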

Queries return a maximum of 1,000 entities and support the $top operator. The Atom-formatted response body includes x-ms-continuation-NextPartitionKey and x-ms-continuation-NextRowKey continuation tokens to identify the first entity of the next page. Optimistic concurrency management uses eTags and Timestamp values to detect conflicts with If-Match HTTP headers. Here's the GET response header and abbreviated body of the second GridView page:

HTTP/1.1 200 OK
Cache-Control: no-cache
Content-Type: application/atom+xml;charset=utf-8
Server: Table Service Version 1.0 Microsoft-
    HTTPAPI/2.0
x-ms-request-id: 2a109a5d- … f6e
x-ms-continuation-NextPartitionKey: Customer
x-ms-continuation-NextRowKey: FRANK
Date: Fri, 19 Dec 2008 23:37:00 GMT
Content-Length: 15691

<?xml version="1.0" encoding="utf-8" 
    standalone="yes"?>
<feed xml:base=
    "http://oakleaf.table.core.windows.net/" 
          xmlns:d="http://schemas.microsoft.com/
              ado/2007/08/dataservices" 
          xmlns:m="http://schemas.microsoft.com/
              ado/2007/08/dataservices/metadata" 
          xmlns="http://www.w3.org/2005/Atom">
  <title type="text">CustomerTable</title>
  <id>
    http://oakleaf.table.core.windows.net/
        CustomerTable
  </id>
  <updated>2008-12-19T23:37:01Z</updated>
  <link rel="self" title="CustomerTable" 
      href="CustomerTable" />
  <entry m:etag="W/&quot;datetime'2008-12-
      19T22%3A10%3A30.2752Z'&quot;">
    <id>
      http://oakleaf.table.core.windows.net/
          CustomerTable(PartitionKey='Customer',
                                    RowKey='CENTC')
    </id>
    <title type="text"></title>
    <updated>2008-12-19T23:37:01Z</updated>
    <author>
      <name />
    </author>
    <link rel="edit" title="CustomerTable" 
             href="CustomerTable(PartitionKey=
             'Customer',RowKey='CENTC')" />
    <category term="oakleaf.CustomerTable" 
        scheme="http://schemas.microsoft.com/
            ado/2007/08/dataservices/scheme" />
    <content type="application/xml">
      <m:properties>
        <d:PartitionKey>Customer</d:PartitionKey>
        <d:RowKey>CENTC</d:RowKey>
        <d:Timestamp m:type="Edm.DateTime">
          2008-12-19T22:10:30.2752Z
        </d:Timestamp>
        <d:Address>
          Sierras de Granada 9993
        </d:Address>
        <d:City>México D.F.</d:City>
        <d:CompanyName>
            Centro comercial Moctezuma
        </d:CompanyName>
        <d:ContactName>
          Francisco Chang
        </d:ContactName>
        …
      </m:properties>
    </content>
  </entry>
</feed>

Use LINQ to REST to Query Tables
ADO.NET Data Services enables a subset of the LINQ Standard Query Operators to compose LINQ queries, which an expression tree translates to an HTTP URI. The StorageClient library's TableStorage, TableStorageDataServiceContext, and TableStorageDataServiceQuery classes and TableStorageEntity abstract class handle interaction with the ADO.NET Data Services client library (System.Data.Services.Client) and implement Azure Table Services helper functions for CRUD operations on tables, authentication, and error handling.

You define a TableName DataModel type that inherits from TableStorageEntity and a TableName DataServiceContext that inherits from TableStorageDataServiceContext (a sketch of both types follows the listings below). Next, you instantiate the latter in the Page.PreRender event handler with a StorageAccountInfo type as its parameter, and create and execute a LINQ to REST query with optional paging, as shown here for the sample project:

protected void Page_PreRender(object sender,  
                                    EventArgs e)
{
    // This LINQ to REST query gets a page of 12
    // Customer entities at a time
    // From paging code by Microsoft's Steve Marx

    var query = 
        (DataServiceQuery<CustomerDataModel>)
        (new CustomerDataServiceContext(account).
        CustomerTable.Take(12));
    
    // Get the continuation tokens from the request
    var cTokens = Request["ct"];
    if (cTokens != null)
    {
        // ct parameter format is "<partition>/<row>"
        string[] tokens = cTokens.Split('/');
        var partitionToken = tokens[0];
        var rowToken = tokens[1];

        // These QueryOptions become continuation
        // token query parameters in the request
        query = query.AddQueryOption(
            "NextPartitionKey", partitionToken)
           .AddQueryOption("NextRowKey", rowToken);
    }
    // Execute the LINQ to REST query
    var result = query.Execute();

    // Cast result to a QueryOperationResponse
    var qor = (QueryOperationResponse)result;

    // Get the continuation token values
    string nextPartition = null;
    string nextRow = null;
    qor.Headers.TryGetValue(
        "x-ms-continuation-NextPartitionKey", 
        out nextPartition);
    qor.Headers.TryGetValue(
        "x-ms-continuation-NextRowKey", out nextRow);

    if (nextPartition != null && nextRow != null)
    {
        // Add the continuation tokens to the GET query
        nextLink.NavigateUrl = 
            string.Format("?ct={0}/{1}", nextPartition, 
            nextRow);
    }
    // Set the customersView DataView's DataSource 
    // to the query result
    customersView.DataSourceID = null;
    customersView.DataSource = result;
    customersView.DataBind();
}
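
For reference, here's a minimal sketch of the two types the preceding query assumes; treat the exact shape as illustrative, with the open-property list abbreviated to a few of the Customers table's attributes:

using System.Linq;

public class CustomerDataModel : TableStorageEntity
{
    public CustomerDataModel() { }
    public CustomerDataModel(string partitionKey, 
        string rowKey) : base(partitionKey, rowKey) { }

    // Open properties; attribute/value pairs can be 
    // added or removed at will
    public string CompanyName { get; set; }
    public string ContactName { get; set; }
    public string Address { get; set; }
    public string City { get; set; }
}

public class CustomerDataServiceContext : 
    TableStorageDataServiceContext
{
    public CustomerDataServiceContext(
        StorageAccountInfo account) : base(account) { }

    public IQueryable<CustomerDataModel> CustomerTable
    {
        get
        {
            return CreateQuery<CustomerDataModel>(
                "CustomerTable");
        }
    }
}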

You'll also need code like this that runs once for each session in the Global.asax.cs file to create the initial table from your TableName DataModel class:

StorageAccountInfo account = StorageAccountInfo.
    GetDefaultTableStorageAccountFromConfiguration();
TableStorage.CreateTablesFromModel(
    typeof(CustomerDataServiceContext), account);
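
Because TableStorageDataServiceContext derives from the ADO.NET Data Services DataServiceContext, the POST, MERGE, and DELETE operations described earlier follow the standard AddObject/UpdateObject/DeleteObject pattern. Here's a minimal sketch with made-up values:

var context = new CustomerDataServiceContext(account);

// POST (create) a new entity
var cust = new CustomerDataModel("Customer", "BOGUS")
    { CompanyName = "Bogus Software, Inc." };
context.AddObject("CustomerTable", cust);
context.SaveChanges();

// MERGE (update without replacing other properties)
cust.City = "Oakland";
context.UpdateObject(cust);
context.SaveChanges();

// DELETE the entity
context.DeleteObject(cust);
context.SaveChanges();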

After debugging your project in the SDK's local Development Fabric, copy the ApplicationID from the Live Services and Active Directory Federation section of the Azure Services Development Portal to the Portal page of the CloudService node's properties sheet. Then right-click on the node and choose Publish to open a window containing your project's …\bin\Debug\Publish folder. Click on Deploy in the Manage page's Staging area to open the Staging Deployment page, browse to the ProjectName.cspkg and ServiceConfiguration.cscfg files to fill the App Package and Configuration Settings text boxes, add a unique name (label) for the project to the Properties text box, and click on Deploy to copy the package and configuration data to a Staging instance (see Figure 5). After you test the role with the private Web site URL, click on the central icon to move the project to Production status and test it with your public accountname.cloudapp.net URL (see Figure 6). "ClickTwice" deployment (once for Staging, once for Production) of roles to the Azure Fabric is a snap -- and much easier than migrating apps to AWS or the GAE.

SDS vs. Azure
There's considerable overlap between Azure Table Services and SDS features, which portends a future conflict at least as serious as that between LINQ to SQL and the Entity Framework. A SQL Services Labs incubation project is in the works to align SDS with ADO.NET Data Services and enable Atom and JavaScript Object Notation (JSON) as wire protocols. The SDS team intends to add more relational-style features to the SDS data model in the future.

According to Microsoft partner architect Gopal Kakivaya, "Among the things SQL Data Services will include in the future are improvements in backup and restore, differential b-trees, distributed transactions, geo-replication (synchronous and asynchronous), distributed materialized views, job framework, and distributed queries."

The SDS team has promised for months to support schemas and full-text search in later CTPs. SDS probably will become a premium offering, with a substantial surcharge over basic Azure Table Storage. For example, Amazon extracts a $0.60 to $1.20 per instance-hour surcharge for SQL Server 2005 Standard. (SQL Server 2005 Express comes free with the $0.125 to $1.20 hourly charge for Windows Server 2003 R2 instances.) Amazon's surcharge for Windows Authentication Services is $0.25 to $0.60 per instance-hour, depending on instance type. The combined surcharges of $0.85 to $1.80 per hour add up to $7,344 to $15,552 extra per instance-year of 8,640 hours. Windows Azure with Azure Table Storage probably will be the price leader that entices enterprises to give the Azure Services Platform a serious shot at playing a significant role in their data center expansion plans.
