Developer's Toolkit


Architecting for Scalability

I heard Pat Helland speak several times over the last couple of years, when he was at Microsoft, and enjoyed his Megalopolis talk. That presentation compared an enterprise architecture to the evolving infrastructure of a city, putting the architect in the role of city planner and government official.

Well, Pat moved on to Amazon last year, but his talk this week on Architecting for Scalability was as entertaining and thought-provoking as any I heard during his Microsoft days. One thought in particular that sticks in my mind is his assertion that it might be time to rethink some of the accepted wisdom concerning data management.

For example, he notes that data normalization is not necessarily a good thing. Normalization, which manifests itself (in one way) as the requirement that any given piece of data be changed in only one location in the database, makes sense only if you're planning on changing that data, or, as Pat put it, executing UPDATE WHERE . . .
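To make the "changed in one location" idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. The customers and orders tables, their columns, and the sample values are my own illustration, not anything from Pat's talk. Because the city is stored once and orders only reference it, a single UPDATE is reflected in every order.

```python
import sqlite3


def build_normalized_db():
    """Create an in-memory database where each customer's city is stored once."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customers (id INTEGER PRIMARY KEY, city TEXT);
        CREATE TABLE orders (
            id INTEGER PRIMARY KEY,
            customer_id INTEGER REFERENCES customers(id)
        );
        INSERT INTO customers (id, city) VALUES (1, 'Seattle');
        INSERT INTO orders (id, customer_id) VALUES (100, 1);
        INSERT INTO orders (id, customer_id) VALUES (101, 1);
        """
    )
    return conn


def order_cities(conn):
    """Return (order id, customer city) pairs by joining back to the single copy."""
    return conn.execute(
        "SELECT o.id, c.city FROM orders o "
        "JOIN customers c ON c.id = o.customer_id ORDER BY o.id"
    ).fetchall()


conn = build_normalized_db()
# One UPDATE, one location: every order sees the new city through the join.
conn.execute("UPDATE customers SET city = ? WHERE id = ?", ("Redmond", 1))
print(order_cities(conn))
```

If the city had instead been copied into each order row, the same change would have to touch every copy. That cost matters only if the data is actually going to be updated.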

In many cases, that is a normal database activity. But he points out that this is changing. First, offline storage (especially nonvolatile storage) is cheap, making it easier for people to save whole copies of databases for archive. Second, more businesses have to save database information as it is, rather than continually update it. Humorously, he notes that not only are businesses required not to update certain databases, but in the Sarbanes-Oxley era, doing so can be a felony (not so humorous, I suppose, if you are the one declared a felon).
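One way to picture "saving information as it is" is an append-only table, where a change is recorded as a new row and earlier rows are never touched. The sketch below is only an illustration under my own assumptions: the price_history table, its columns, and the helper functions are hypothetical names, and it again uses Python's built-in sqlite3 module.

```python
import sqlite3


def open_ledger():
    """Create an in-memory table that is only ever appended to."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE price_history ("
        " version INTEGER PRIMARY KEY AUTOINCREMENT,"
        " sku TEXT NOT NULL,"
        " price_cents INTEGER NOT NULL)"
    )
    return conn


def record_price(conn, sku, price_cents):
    """Append a new row; existing rows are never modified."""
    conn.execute(
        "INSERT INTO price_history (sku, price_cents) VALUES (?, ?)",
        (sku, price_cents),
    )


def current_price(conn, sku):
    """Return the most recently recorded price, or None if the sku is unknown."""
    row = conn.execute(
        "SELECT price_cents FROM price_history "
        "WHERE sku = ? ORDER BY version DESC LIMIT 1",
        (sku,),
    ).fetchone()
    return row[0] if row else None


def history(conn, sku):
    """Return every recorded price for the sku, oldest first."""
    rows = conn.execute(
        "SELECT price_cents FROM price_history WHERE sku = ? ORDER BY version",
        (sku,),
    )
    return [r[0] for r in rows]


ledger = open_ledger()
record_price(ledger, "widget", 1000)
record_price(ledger, "widget", 1250)
print(current_price(ledger, "widget"))
print(history(ledger, "widget"))
```

Nothing here issues an UPDATE. The current value is simply the latest row, and the full record of what was stored, and in what order, stays intact.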

But the point is that data management requirements are changing dramatically, and both architects and DBAs have to understand the implications for their jobs. For architects, it affects how you go about architecting an application. Scalability has a different meaning when you have to treat data differently. In Pat's case, he notes that it becomes more important not to apply data across different business entities. It doesn't necessarily matter what database that data comes from; rather, it depends greatly on how it is used.

I don't normally get excited about data management, but I am in this case for two reasons. First, I work for a company (Progress) that produces a non-normalized database (although it can be validated for most of the behavior of the third normal form), so it was interesting to hear conventional wisdom being questioned in that regard. Second, it is always interesting to watch business requirements change and observe how both technology and skills adapt. We're still in the early stages here, so keep your eyes open to see how database technologies and best practices shift over the coming years to adapt to new business and legal standards.

Posted by Peter Varhol on 02/01/2006

