Redefining Windows Storage
Get the scoop on Windows Server 2003 enhanced data storage features that enable enterprises to transcend traditional Windows storage limitations in storage area networks (SAN).
For This Solution: Windows Server 2003, SAN hardware, Backup tools
There is a new trend in IT today. More and more organizations are supplementing their server-centric architectures, in which you designate each service the network requires and which server will render it, with data-centric architectures (DCAs). To design a DCA, you must identify the information in your network, categorize it, rate it, determine how to protect it, determine who needs access to it and, if your organization is geographically dispersed, determine how to make it available regionally. As you know, information is made of both data and documents, and categorizing it is no easy feat. However, it's an essential step if your organization wants to manage data storage effectively, especially if you want to develop strategies that allow you to implement storage area networks (SANs).
The release of Windows Server 2003 changes this playing field, and we expect it will boost the use of DCAs in the years to come. It provides new data and storage management features that have the potential to revolutionize the way you look at storage strategies today. In fact, Windows Server 2003 builds on storage features Microsoft included in Windows 2000 to transcend traditional Windows storage limitations. With their PC origins, Windows servers traditionally demanded exclusive access to attached storage units. In a SAN, this exclusive control is contrary to basic SAN concepts. Therefore, SAN manufacturers had to develop proprietary utilities that circumvented Windows limitations. With Windows Server 2003, this is no longer the case because it boasts several storage enhancements (see Table 1). The combination of these features helps map out a new story for Windows Server 2003 storage, but to profit fully from them, you need to review your DCA needs.
Design Your Storage Solution
According to storage guru Val Bercovici, chief technical architect with Network Appliance Canada Ltd., data is the most stable aspect of any architecture. Hardware platforms and operating systems are changing and evolving constantly, but data is constant and will always form the core of information systems. This is one of the main reasons why storage networks make sense: they free data held captive by single application (including middleware and database) servers. Bercovici says that to create a DCA, you need to focus on four pillars:
- Storage Networking and Consolidation
- Data Center Operations
- Business Continuance
- Distributed Enterprise
The first pillar, Storage Networking and Consolidation, focuses on decoupling data from single servers. This helps you liberate data from the traditional one server, one data store view and enables you to consolidate storage into volumes or logical units (LUNs) that can scale independently of your computing environment. When designing your storage architecture, look for storage networking platforms that don't bind you to one particular protocol. Fibre Channel is the dominant block-level SAN protocol today, while the Common Internet File System (CIFS) provides file-level access for Windows clients. In addition, Internet Small Computer System Interface (iSCSI), a new IP-based storage-networking standard, has a promising future with Windows Server 2003 as a low-cost alternative to Fibre Channel. Many organizations also require good Network File System (NFS) support for Unix and Linux interoperability.
Next is Data Center Operations. Once you regroup your storage in a single area, you need to implement the right tools to administer, as well as backup and restore, the data. This is where you can introduce the concept of provisioning (robbing Peter to pay Paul, so to speak) to gain a better understanding of your storage needs. For example, a file server might have more storage space than it requires while a database server might not have enough. When you store data on single servers, it's hard to share or reallocate this space; but when it's regrouped into a single storage network, it becomes much easier to reallocate space. Provisioning capacity on a storage network can involve many steps (storage frame, storage network, and server volume and file system management), so pay special attention to solutions that automate and optimize these complex tasks.
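The provisioning idea described above can be sketched in a few lines of Python. This is a purely conceptual toy model (the class, method, and volume names are all invented for illustration, not part of any Windows API): capacity lives in a shared pool, and shrinking an over-provisioned volume frees space that another volume can claim.

```python
class StoragePool:
    """Toy model of a consolidated storage pool from which volumes draw capacity."""

    def __init__(self, total_gb):
        self.total_gb = total_gb
        self.volumes = {}  # volume name -> allocated GB

    def free_gb(self):
        return self.total_gb - sum(self.volumes.values())

    def provision(self, name, size_gb):
        if size_gb > self.free_gb():
            raise ValueError("insufficient pool capacity")
        self.volumes[name] = self.volumes.get(name, 0) + size_gb

    def reclaim(self, name, size_gb):
        # Shrink an over-provisioned volume, returning capacity to the pool.
        self.volumes[name] -= size_gb


pool = StoragePool(total_gb=1000)
pool.provision("fileserver", 600)   # file server over-provisioned
pool.provision("database", 300)
pool.reclaim("fileserver", 200)     # rob Peter...
pool.provision("database", 200)     # ...to pay Paul
```

With storage on individual servers, the `reclaim`/`provision` pair would mean physically moving disks; on a storage network it is just an allocation change.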
Backup becomes consolidated with storage networks, yet remains a time-consuming chore for most administrators. The challenge of protecting multiple terabytes within ever-shrinking backup windows is straining even state-of-the-art tape and robotic library innovations. Restoring data promptly to a storage network requires advanced technology to wade through millions of backed-up files and/or rapidly copy terabytes of data back to their desired locations.
Business Continuance is where the availability of storage networks comes into play. Today more than ever, it's important for organizations to offer data availability guarantees, especially in light of new, post 9/11 legislative regulations requiring readily available copies of your data at all times. Implementing disaster recovery solutions has suddenly become a core information system requirement.
The Distributed Enterprise pillar enables you to consider new data-centric approaches for geographically dispersed data. You can determine whether staff in your Los Angeles office need access to the same data as people in the Seattle office or how to disperse data regionally. You can also figure out the level of latency that's acceptable when users access information, and whether you should put in a fat pipe and replicate data everywhere, or use data caching technologies to make appropriate data available in each location. The answers to these questions will help you define where you localize data, how you'll replicate it, and how people will access it.
With previous editions of Windows, DCAs had a significant drawback: When you decided to move to SAN-based architectures, you were often locked into a single hardware solution because of limitations in the operating system. With Windows Server 2003, this is no longer the case.
Windows Server 2003 Writes Its Own Data Story
The Windows Server 2003 data story begins with dynamic disks. Windows 2000 introduced the concept of dynamic disks to support the capability to manage disks as volumes instead of as partitions. This volume-based approach was designed to transcend traditional disk management strategies, but when they're used with Windows server clustering, dynamic disks require a third-party tool, Veritas Volume Manager, to work properly. Therefore, organizations that didn't want to acquire this tool used basic disks, even in SAN configurations. The logic was sound: If Windows doesn't see the SAN and doesn't offer the extensions required for the SAN to expose data to operators, there was no reason to use Windows features, even if they might make more sense. Microsoft even released a Technical Advisory warning customers to limit their use of dynamic disks and advising them to stick with basic disks unless they had a situation that required Windows-based disk management features, such as disk spanning, disk mirroring, or disk striping (see Resources). Although Windows 2000 offers a unified disk management snap-in, it was basically useless for both clusters and SANs.
With the introduction of the Virtual Disk Service (VDS) in Windows Server 2003, Microsoft now provides storage manufacturers with a tool that extends the Dynamic Volume Management (DVM) snap-in to create a single storage management interface. VDS provides a set of disk management APIs that let developers tie into this unified view. In fact, VDS is comparable to the Unidriver Microsoft uses for printer management. Printer manufacturers can simply write to the Unidriver specification to produce a single, unified method for accessing, managing, and using printers (see Figure 1). This feature supports the first pillar of a DCA. VDS makes it easier to decouple storage networking and disk devices from servers, and to implement multi-vendor storage networking with SANs.
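The real VDS API is a set of COM interfaces that vendors implement as hardware providers. As a language-neutral illustration of the pattern (all class and method names below are invented, not the actual VDS interfaces), the "write to one spec, manage through one console" idea looks like this in Python:

```python
from abc import ABC, abstractmethod


class DiskProvider(ABC):
    """Conceptual stand-in for a vendor's VDS hardware provider."""

    @abstractmethod
    def list_luns(self):
        ...

    @abstractmethod
    def create_lun(self, size_gb):
        ...


class VendorAProvider(DiskProvider):
    """One vendor's implementation of the common provider contract."""

    def __init__(self):
        self._luns = []

    def list_luns(self):
        return list(self._luns)

    def create_lun(self, size_gb):
        self._luns.append(("vendorA-lun", size_gb))


class ManagementConsole:
    """One management interface (like the DVM snap-in) over many providers."""

    def __init__(self, providers):
        self.providers = providers

    def all_luns(self):
        return [lun for p in self.providers for lun in p.list_luns()]


console = ManagementConsole([VendorAProvider()])
console.providers[0].create_lun(100)
```

The point of the pattern is that the console never knows which vendor's hardware it is talking to; each manufacturer only has to implement the common contract once.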
In addition, SANs can now expose data to the VDS and the DVM, allowing operators to view extensive information about their storage networks such as port information and the hardware redundant array of inexpensive disks (RAID) system for each volume. This feature simplifies and standardizes storage management in Windows and is related directly to the Data Center Operations pillar.
Windows Server 2003 also includes the capability to support Multipath I/O. This feature lets operators create redundant paths to storage in the SAN, making sure it's always available to the system that needs it. Even more important, Multipath I/O supports the use of multiple storage paths on heterogeneous systems, freeing your organization from single-vendor solutions and allowing you to mix and match storage technologies according to need. According to Bercovici, this new approach will have an impact on hardware certification for Windows Server 2003: It's likely Microsoft will certify only storage technologies that use both VDS and Multipath I/O. Eventually, no legacy storage device drivers will be certified under Windows Server 2003. Although you might not agree with this approach, it has a definite advantage. With this strategy, Microsoft forces vendors to streamline and standardize storage solutions for Windows networks. In a world where stability is the foremost preoccupation, this can't be all bad.
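The failover behavior at the heart of Multipath I/O can be illustrated with a small toy model. This sketch is conceptual only (the classes and the simulated paths are invented, not Windows driver interfaces): when one path to the LUN fails, the request transparently retries over the next path.

```python
class MultipathDevice:
    """Toy model of Multipath I/O: try redundant paths to the same LUN in order."""

    def __init__(self, paths):
        self.paths = paths      # list of callables simulating HBA paths
        self.failed = set()     # indexes of paths marked down

    def read(self, block):
        for i, path in enumerate(self.paths):
            if i in self.failed:
                continue
            try:
                return path(block)
            except IOError:
                self.failed.add(i)  # mark path down, fail over to the next
        raise IOError("all paths to storage failed")


def good_path(block):
    return f"data@{block}"


def dead_path(block):
    raise IOError("link down")


# Two redundant paths to the same storage; the first has lost its link.
dev = MultipathDevice([dead_path, good_path])
```

A read through `dev` still succeeds: the dead path is detected, recorded, and skipped on subsequent requests, which is exactly the availability guarantee the feature is meant to provide.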
Take a Picture: It's Faster
Windows Server's Volume Shadow Copy service feature also provides new storage management capabilities. Shadow copies, commonly referred to as disk snapshots, are one of the major reasons why people have implemented SANs to date. SANs include built-in technologies that support the capability to maintain multiple concurrent copies of data. Among other things, you can use these copies for backups (backing up a copy is much simpler than backing up live data because all files are closed) and restores (through special interfaces, users can even restore their own files). But constructing these features is complex. Manufacturers have to design systems that will tell a running application to freeze all requests for data input and output for the time it takes to create the snapshot. This is a costly undertaking at best. Even worse, because most organizations implement heterogeneous solutions, they don't necessarily use the same drivers for backups as for undeletes. This is often a cause of instability.
By building snapshots into Windows, Microsoft provides both Volume Shadow Copy service engines for its own applications (Exchange, Active Directory, and SQL Server) and a single standard set of APIs for application developers to write their own Volume Shadow Copy service engines, taking the onus away from the SAN hardware manufacturers. In addition, Volume Shadow Copy service virtually eliminates the impact of data usage on backups, and lets users perform their own restores through the Previous Versions client (see Figure 2). It also allows organizations to make data available in different formats for analysis, development, and stress-testing purposes (see the sidebar, "Volume Shadow Copy at Your Service"). This is directly related to the third pillar of a DCA, Business Continuance.
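The reason a snapshot is so much cheaper than a full copy is the copy-on-write technique commonly used to implement it. Here is a minimal Python sketch of that idea (a toy model, not the actual Volume Shadow Copy implementation; all names are invented): the snapshot shares the live volume's blocks until one of them changes, and only then is the old block preserved.

```python
class Volume:
    """Toy copy-on-write volume: a snapshot shares blocks until they change."""

    def __init__(self):
        self.blocks = {}        # block number -> data
        self.snapshots = []

    def write(self, block, data):
        # Preserve the old contents in any snapshot that hasn't copied it yet.
        for snap in self.snapshots:
            if block not in snap and block in self.blocks:
                snap[block] = self.blocks[block]
        self.blocks[block] = data

    def snapshot(self):
        snap = {}               # starts empty: creating a snapshot is instant
        self.snapshots.append(snap)
        return snap

    def read_snapshot(self, snap, block):
        # A snapshot sees its own preserved copy, else the live block.
        return snap.get(block, self.blocks.get(block))


vol = Volume()
vol.write(1, "original")
snap = vol.snapshot()
vol.write(1, "changed")   # live volume moves on; the snapshot is frozen in time
```

Because the snapshot starts empty and fills in only as blocks change, creating it takes almost no time and no space, which is why backups taken from snapshots barely disturb the running system.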
In addition to the ability to make backups based on snapshots, Windows Server 2003 supports features that make it easier to rebuild servers. Automated System Recovery (ASR) enables you to reconstruct a non-working system. It's composed of two parts: the ASR Preparation Wizard and the ASR Restore. The Preparation Wizard captures everything on a system, from disk signatures to system state and other data, and creates an ASR boot floppy disk. Then, if you need to restore a system because of a disk crash and total data loss, you simply start system setup using the proper Windows Server installation CD, press F2 when prompted, then insert the ASR floppy. ASR Restore will restore the disk signatures, install a minimal version of Windows, and restore all system and data files. It's not a 100-percent perfect solution, but it's the best recovery tool yet to ship with Windows.
The ASR is really a recovery strategy built for standalone servers. When working with SANs, Windows Server 2003 offers much better recovery strategies because it can actually boot from the SAN. This means that Windows Server finally supports the concept of a diskless server, something most Windows network administrators are not used to. If a server's entire disk structure is stored within a SAN, it's fairly easy to restore the server because you simply need to restore the disk structure the server believes it owns. Besides, massive data losses don't occur frequently in a SAN.
The Distributed File System (DFS) rounds out the storage features that let Windows Server write its own data story. For Bercovici, DFS is the Domain Name System (DNS) of storage. "No administrator today would ever think of managing server naming by hand, yet most still name file servers and file shares manually. DFS was a diamond in the rough in Windows 2000, but with all its enhancements, it should become more important in Windows Server 2003," he says.
DFS is designed to replace network drive mapping. It creates a unique file share alias, through which users can access files on a server. This means you can change the target file server and/or share without impacting users, because they access the alias and not the physical file server and share name. DFS works in conjunction with the File Replication Service (FRS) to facilitate disaster recovery. DFS roots integrated with Active Directory (AD) are domain DFS roots. Domain DFS roots can now include up to 5,000 targets or links that can be geographically dispersed.
DFS also supports standalone DFS roots. Standalone roots aren't fault tolerant in the same way domain DFS roots are, because they're located on a single machine. On the other hand, a standalone DFS root can be located on a cluster server and provide fault tolerance through cluster services. It can contain up to 50,000 targets.
DFS is extremely powerful. For example, if your developers need to work in different environments when preparing corporate applications, they can take advantage of DFS by creating a standalone DFS root for development purposes in each staging environment and using the same DFS name that will be used in the production network. This way, they don't have to modify paths within the code whenever they change environments, even into production. DFS has many other useful implementations such as public folders replicated in each regional site, project folders that span a specific number of regions, and file shares that are transparent to mobile users. DFS, especially domain DFS roots, is designed to support the fourth pillar of a DCA, the Distributed Enterprise.
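The aliasing behavior that makes all of this possible is simple to model. The Python sketch below is a toy (the class and the UNC paths are invented for illustration, and it ignores real DFS concerns like site costing and referral caching): a logical link resolves to one of several physical targets, so retargeting a share never changes the path users see.

```python
class DfsRoot:
    """Toy DFS namespace: a logical path maps to one or more physical targets."""

    def __init__(self):
        self.links = {}  # logical link -> list of physical shares

    def add_link(self, link, targets):
        self.links[link] = list(targets)

    def resolve(self, link):
        # A real DFS client picks a target by site cost; here, just the first.
        return self.links[link][0]


root = DfsRoot()
root.add_link(r"\\corp\dfs\projects",
              [r"\\filesrv1\projects", r"\\filesrv2\projects"])

# Retire filesrv1: retarget the link without users changing the path they use.
root.links[r"\\corp\dfs\projects"] = [r"\\filesrv2\projects"]
```

Users (and hard-coded paths in applications) keep opening `\\corp\dfs\projects`; only the administrator's mapping behind the alias changes, which is exactly the developer staging scenario described above.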
As you can see, Windows Server 2003 writes a powerful data story. If you want to take full advantage of its storage features, you'll need to rethink your data solutions and reconsider your data decisions. As you do, you might find that these features justify a migration to Windows Server 2003. If so, you can begin your migration with the file servers. But remember, before you do so, learn about your data needs and test everything thoroughly before implementing production systems.