VSM Cover Story
Accelerate Apps With Velocity Reporting
Learn to set up and change durations in a supply-chain application that's both metadata- and template-driven.
Technology Toolbox: VB .NET, C#, SQL Server 2005, other Check out the XMLA specification at www.xmla.org.
Supply-chain applications must often deal in durations: How long does it take to move a commodity through a warehouse? How long does it take to do some particular work on said commodity? How much time is spent waiting needlessly?
The answers to these questions reflect how well a business is serving its role and are of central importance to executives and accountants who track time, expenses, revenue, and, ultimately, profitability. In supply chains that deal with adding value -- where a company has been tasked with enhancing items as they flow through the chain -- the customer who has outsourced these tasks will be keenly interested in these metrics as well. Being able to deliver this information quickly, accurately, and clearly is sometimes the difference between making a sale and losing one.
Velocity reporting describes a class of reports that focus on durations, such as those within a supply chain. This article provides an example of building a specialized reporting infrastructure for a hypothetical logistics company. This by no means limits the solution, and you can extend it to deal with many other situations, including order processing, software development projects, customer service, and help-desk management, to name a few.
One challenge imposed on this hypothetical company is the need to accommodate changes in what kinds of durations are tracked. You may even have certain reporting criteria that demand you track different types of durations. For example, product A might be quite different from product B and thus have different reporting needs. A solution that's both metadata- and template-driven allows you to be agile in dealing with new requirements. The advantages will become obvious as you go through the steps involved in setting up and then changing the durations of interest.
A key part of the reporting solution is the use of SQL Server Analysis Services (SSAS). You may be familiar already with the advantages of using SSAS to not only improve the performance of reporting, but to simplify it in many cases as well. It also brings you in contact with Analysis Management Objects (AMO), which is the fully managed replacement for Decision Support Objects (DSO) of SSAS in SQL 2000. In this article, you'll learn how to use the Microsoft.AnalysisServices namespace to programmatically create and process cubes from a C# console application that's the heart of the template engine.
Once you have rich data to use in reports, you'll also implement a simple front end using Microsoft Office SharePoint Server 2007 (MOSS). MOSS provides a sophisticated repository for enterprise reports, including those implemented using Excel 2007.
Understand Analysis Services
What exactly is Analysis Services? It's an On-line Analytical Processing (OLAP) storage and retrieval engine that you can populate from a variety of external data sources. You need a mechanism to get at this data, and for SSAS, you typically use Multidimensional Expressions (MDX). In many ways, MDX is to SSAS, as Data Manipulation Language (DML) is to SQL Server. (DML includes statements such as SELECT, INSERT, and so on.)
Again similar to SQL Server, SSAS contains objects that represent the structure of data, as well as programmability elements. For SQL Server, this means tables, views, stored procedures, and so on. For SSAS, this includes cubes, dimensions, measures, and calculated members, to name only a few.
How do you manipulate these structural objects? SSAS supports the XML for Analysis (XMLA) industry specification, which defines a general API for OLAP engines.
You can use XMLA quite easily, but sometimes it's preferable to use an object model to manipulate underlying structures. Many developers are familiar with this pattern, which avoids the need to understand the corresponding XMLA commands. AMO is the object model for SSAS, and it includes classes to represent cubes, dimensions, measures, and all other structures. It's worth noting, however, that under the hood, AMO is translating actions performed against object instances into XMLA (Listing 1).
You can also use Analysis Services Scripting Language (ASSL) to manipulate SSAS. ASSL is an XML dialect that packages object definition commands. Client tools such as SQL Management Studio and Visual Studio use ASSL to interact with SSAS.
Work Within a Business Scenario
Let's assume that the fictional company provides services such as inspections and other various jobs on items (widgets) that come to a warehouse. A central entity in this model is an Event table, which tracks discrete happenings to each widget as it moves through the supply chain (Figure 1). The EventCode is a master list of all possible things that can be done to the widgets, or specific milestones as they move through the chain. EventCodeGroup becomes useful when you want to track the completion of a number of events as a whole. In fact, some sample data implies a particular event flow, where some events are natural predecessors for others (Figure 2). Assuming an application writes to the Event table when the event completes, this implies an interval is really the time frame that exists between the completions of two classes of events.
You need to do some analysis in order to arrive at an acceptable reporting solution, which can mean listening to the questions users want to ask of the data. For example, they might ask, "What's the average duration between arrival and inspections?" or "What's the count of widgets that took between three and five days from inspection to release?" The latter question leads you deeper into requiring "interval duration ranges," which may in fact be variable, depending on the interval type. For example, durations between adjacent events would be much smaller than the time frame from start to end.
In a typical Business Intelligence (BI) development effort, you'd generally follow a set of steps that lead from analysis through construction and deployment. The modeling includes designing the proper data-warehouse schemas, followed by writing extract, transform, and load (ETL) logic as needed. In order to construct the actual SSAS structures, you could use XMLA, ASSL, or even AMO, but it's easiest to use the graphical Analysis Services Project template in Visual Studio.
Don't be confused about the distinction between Visual Studio and SQL Server Business Intelligence Development Studio: They're the same thing, essentially. If you don't have Visual Studio installed, Business Intelligence Development Studio will allow you to create projects based on the Business Intelligence templates.
With an SSAS database built, you need to think about latency and ask, "How current does data need to be kept?" SSAS is effectively a second-level storage mechanism for pre-aggregated source data. SSAS 2005 offers a number of options, all the way from unscheduled multidimensional OLAP (MOLAP) to realtime relational OLAP (ROLAP). The term "processing a cube" is the act of ensuring the cube's contents are consistent with the source data and therefore ready for use.
With all this in place, what happens when users ask for some basic changes, such as new measures or dimensions? You're forced to go back to the early stages of the development cycle, sometimes even to the beginning to alter warehouse schema. The improvements in tools over the years have made this less painful for developers, but it'd still be nice to avoid tedious changes, such as adding elements that vary only slightly from other existing elements.
An Alternative: Use the Template Processing Engine
Realizing that users may change their minds about which intervals are of interest, you can take advantage of AMO and build a .NET application that alters the development process to make it more data-driven. The goal is to abstract some common aspects of the solution you might have arrived at in the static development process (Figure 3).
The first step is to introduce a set of metadata tables. Different fundamental business rules or structural needs could lead to multiple cubes, so the model includes a many-to-many relationship between Interval and Summary types (in other words, cubes). Intervals are also allowed to have a DurationRange, which in turn can have multiple-range start/end pairs.
The .NET console application (DynCube, short for "Dynamic Cube") is called a template engine because, for maximum flexibility, it adds a simplistic scripting language on top of the ETL step (Table 1).
It's also possible to loop through all Interval types associated with each Summary (Listing 2, which contains the script used to construct a fact table for shipped widgets). Assume users want to deal with the granularity of a widget per shipment. This leads to a transformational requirement: The Event table is at the granularity of widget events (of all types), so you pivot Interval types of interest into columns on the fact table -- one for the interval duration in days, and another describing the range this duration falls into. These intervals are now represented as the attributes of a widget, from the time it arrives to the time it is shipped.
The <!CUBE/> tag is unique. Rather than being replaced with a value from metadata, it expresses the structure of the resulting cube. For example, <!CUBE(measure|MyField|MyFact)/> is a request to create a measure called MyField, sourced from the MyFact table's MyField column. Tags can be nested, so you can use other tags within <!CUBE/>.
The engine doesn't currently re-create cubes from scratch. The implication is that you need to establish an initial baseline SSAS database. This has the benefit of being able to include other things, such as drill-through actions, aggregations, security, and so on.
What's the easiest way to create a baseline? DynCube accepts a /factonly command-line argument that you can use to run the ETL process and create the shipped product fact table. Next, you can create an Analysis Services Project in Visual Studio, construct a Data Source View and a Cube, and deploy it. Now when you run DynCube without the /factonly switch, it will perform ETL, update the SSAS elements, and process cubes. This is a good time to set up an automated schedule to run it.
The DynCubeASDB.sln solution included in the article's code download illustrates the elements that were set up as part of the baseline step (download the code here). Because this step corresponds to what you would do in an "ordinary" BI effort, it's outside the scope of this article's focus on AMO.
The baseline can contain dimensions and measures that you don't need to express using the <!CUBE/> tag, so you can add "custom" to the first tag argument to effectively say, "Ignore this during parsing." This is important because the engine deals with the possibility of removing Interval types by clearing away all measures, dimensions, and calculated members that are not flagged as being "custom" before adding back the ones that are currently defined through metadata. An example of a custom measure is a Shipped Product Count, which would typically never change after you create the cube initially (Listing 3).
How does DynCube do its SSAS magic? One possible task you might need to perform is a new cube dimension, given a CubeDefinitionItem instance (the variable "i"), which is populated for each <!CUBE/> tag (Listing 4). Notice that you can manipulate the underlying Data Source View itself, which is represented in AMO as a DataSet, complete with Data-Relations and DataTables. Also note that a CubeDimension object effectively expresses that an existing Dimension (created during the baseline step) is being reused with a different measure group column. To see how measures and calculated members are dealt with, consult the Functions.cs file in the DynCube project.
Now that you have rich data in an SSAS database, you're probably wondering how to format and deliver the data for end users. You have a number of choices, each with pros and cons, and it's a case of finding the right tool for the job, given the trade-offs. Excel has always been a common tool used to access cubes. Pivot tables are a natural fit for ad hoc slicing and dicing of OLAP data.
With the advent of MOSS, the deployment headaches of Office Web Components (OWC) have disappeared. MOSS-based Excel reports are effectively XSLX files that are stored in SharePoint and can be opened in a browser window. Opening them this way causes server-side code (Excel Services) to render them using DHTML, meaning that there's a certain amount of richness within the browser: You can drill up and down, and apply sorts and filters. If you want to take advantage of drill-through or change the Pivot table layout itself, you can select "Open in Excel." Also note that it's possible to deploy Pivot charts as well (Figure 4).
Suppose now that your users have come back with a new request: They'd like to track the time between inspection completion and approval to ship. Due to the fact that intervals are defined using metadata, all you need to do is execute two SQL INSERT statements (Listing 5), then run the DynCube.exe application. (You could already be running it on a schedule.)
After you run DynCube, the cube structure will change to include a new dimension, new measures, a new calculated member, and the updated underlying Data Source View. Now if you refresh a Pivot table that references this cube, the field list will include the new items.
At this point, the structures actually in SSAS have diverged from the baseline Visual Studio Business Intelligence project. However, if you want to work through Visual Studio again, there are options. For example, you could manipulate the SSAS database directly using Management Studio, perform a degree of refresh on your existing BI solution, or you could reverse-engineer what's sitting in SSAS by using the Import Analysis Services 9.0 Database project template.
AMO is a powerful object model, but it can be daunting to understand. Even trying to examine AMO object properties while debugging in Visual Studio can be more difficult than it should be: The inability to show indexed properties (such as Item) is needlessly limiting. This article's code download includes a small Windows application called DumpAMO, which you can use to navigate through the AMO objects graph. To prevent infinite cycles, the text "... referenced elsewhere (double-click)" is displayed for object instances that appear elsewhere in the tree (Figure 5).
The code that accompanies this article includes the DynCube and DumpAMO applications, a backup of the SQL 2005 database, and the DynCubeASDB Business Intelligence project. The instructions contained in the ReadMe.txt file cover the necessary installation steps, in sequence, along with some troubleshooting hints.
Although DynCube includes a number of assumptions about the problem domain, it's possible to extend it in a number of ways. For example, you could add more tag types, and more complex scenarios, such as multiple fact tables, could be supported. Ideally, you could do everything from the engine, removing the need to create the initial SSAS baseline database. In addition, you could enhance and adapt the metadata to meet other business needs. Using a business object framework also helps to simplify the application, although it's already less than 1,000 lines of code, including templates. Clearly, the flexibility available through AMO complements the advantages inherent in SSAS, opening doors to numerous new custom reporting solutions.