Implementing a Low-Budget Disaster Recovery Plan
How to prepare a small or medium business for disaster without overextending its already tight budget or overwhelming management with the details of an in-depth proposal.
- By Daniel Curry
I found myself in a challenging situation recently. I was hired on contract to help an acquaintance who runs a small and quickly growing startup rebuild its IT infrastructure. I think her business has potential, so I took the contract at below-normal rates. Redesigning the network was simple; I could have set up the infrastructure in my sleep. What kept me awake at night, though, was the lack of forethought about the consequences of an IT mishap. I knew the company needed some kind of disaster recovery plan; the challenge was getting the company's managers to think about it without overextending their already tight budget or overwhelming them with the details of an in-depth proposal.
Rather than walk away and cross my fingers that nothing bad would happen, I decided to take another approach. I based this compromise approach on education and focused on finding inexpensive solutions to critical items that provide a measure of security and the ability to recover easily. Here is the solution I presented and eventually implemented for my client.
Data backups are critical in today's IT structure. But they can be very difficult to manage, and a good solution can be expensive. With that in mind, we developed a two-phase backup approach. First, we needed a tape backup solution for dynamic data such as Exchange and SQL databases on the company's rack-mounted server running Small Business Server 2003 (SBS). I bought a used 4-mm tape drive on eBay for less than the cost of filling my car with gasoline. DAT drives offer modest capacity and speed, but the drives and media are reliable, available, and cheap. The company backs up all data stored on the servers nightly, and tapes are rotated off-site on a weekly schedule. Users are reminded to store company data on the file server's network drives.
Second, we added a DVD burner to the backup server to handle "static" data, such as users' home directories and the company intranet site. We bought Symantec's Ghost, which we used to make a baseline image of the servers and the workstations. This allows us to restore a dead workstation from a common file share, using a boot floppy to get the ailing machine on the network. The DVD burner allows us to take snapshots of the static data twice a month. The oldest copy goes to an off-site location every payday. The most current copy stays in-house in the event of an accidental deletion.
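The disc rotation described above can be sketched in a few lines of Python. This is only an illustration of the rule (oldest snapshot goes off-site, newer ones stay in-house); the function name plan_rotation and the sample dates are hypothetical and were not part of the client's setup.

```python
from datetime import date


def plan_rotation(snapshot_dates):
    """Split snapshot discs into (disc to send off-site, discs kept in-house).

    Assumption: each disc is identified only by the date it was burned.
    """
    ordered = sorted(snapshot_dates)
    if len(ordered) < 2:
        # With zero or one disc, keep it in-house so a recent copy is at hand.
        return None, ordered
    # The oldest disc leaves the building; everything newer stays.
    return ordered[0], ordered[1:]


# Hypothetical burn dates for two twice-monthly snapshots.
offsite, in_house = plan_rotation([date(2004, 3, 15), date(2004, 3, 1)])
```

Keeping the newest disc in the building is what makes an accidental deletion a five-minute fix rather than a trip to the off-site location.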
Even though my client is using SBS 2003 (a great platform for small to medium-sized businesses), the e-mail worm/virus du jour is a top-level concern. Even with current virus scanners on the Exchange server, some bugs still slip through. My client is painfully aware of how unprofessional bouncing e-mail appears to potential customers, and wants to avoid bounces while Exchange is offline for the inevitable cleaning, recovery, and maintenance. So, we put a small Linux system in the DMZ to act as an SMTP gateway. This server was an old workstation on its way to a landfill when I intercepted it and installed a Debian Linux distribution on it. We configured Exchange to accept inbound messages from this gateway.
Because stopping spam is another issue, I installed SpamAssassin on the Linux server to provide a first line of defense against it. In the event that the Exchange server is offline, this SMTP gateway queues the inbound messages. We configured the DNS MX records to give first priority to the Linux gateway and second priority to the Exchange server, so that mail still arrives should the Linux gateway be offline. This keeps either server from being a single point of failure for inbound SMTP traffic.
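For readers unfamiliar with MX priorities, here is a minimal Python sketch of how a sending mail server chooses among MX records: the record with the lowest preference number is tried first. The hostnames, the preference values 10 and 20, and the is_up callback are assumptions for illustration, not the client's actual DNS data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MxRecord:
    preference: int  # lower number means higher priority
    host: str


def delivery_order(records):
    """Return hosts in the order a sending server would normally try them."""
    return [r.host for r in sorted(records, key=lambda r: r.preference)]


def first_reachable(records, is_up):
    """Return the first host in delivery order for which is_up(host) is true."""
    for host in delivery_order(records):
        if is_up(host):
            return host
    return None  # nothing reachable; the sender would queue and retry later


# Hypothetical records: the Linux gateway first, the Exchange server second.
records = [
    MxRecord(20, "exchange.example.com"),
    MxRecord(10, "gateway.example.com"),
]
```

With the gateway down, first_reachable(records, lambda h: h != "gateway.example.com") falls through to the Exchange server, which is the behavior the two-record setup relies on.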
I convinced the client to move the company's Web and e-mail presence to an off-site hosting facility. This was a difficult change for company managers to accept, and it required me to spend all the political capital I had to bring them around to my way of thinking. I cited news items such as the New England blackout and pointed out how many of the affected companies are still recovering their lost business. UPSs and backup generators are wonderful for keeping your PBX and data center up and running, but they don't help when your ISP has no power.
There are several benefits to off-site hosting. First, with a tertiary MX record, all mail that can't get through to the SMTP gateway or Exchange server goes to the hosting facility server in one mailbox. In the event of a disaster or other event that knocks the first two mail servers offline, one person can dial into the off-site host and parcel out e-mail to the correct people. Orders can still be processed on paper and entered into the database later if need be.
Second, the Web site is on a faster server with higher-speed Internet connections and more redundancy. With plans starting at as little as $100 a year from various providers, there are many options. You should do some research before deciding which to use, and you should choose one that is in another portion of the country to avoid the possibility of a local disaster wiping out both the client and the ISP simultaneously.
My client's ordering system and the back-end Web application for its Web sales are still hosted on its business-class DSL. A static e-mail form page is on the hosted Web server. In the event of an outage, a five-minute dial-up connection via NetZero or some other free or low-cost provider is enough to put that page in place of the order system. This supports the goal of mitigating the loss of sales after a disaster. Later, we plan to add aliases and individual mailboxes on the hosted server to simplify processes in the event that this solution is needed.
My client was happy with the way we resolved her IT issues in an inexpensive and relatively painless manner. After everything was set up, she surprised me by asking if I thought there were any process or security issues the company needed to address. Underlying her question, as I understood it, were the questions, "What can we automate or change around here to save money or make my life easier?" and "Are we a sitting duck for hackers?"
By this point, critical services had failover, and it had not taken a great outlay of operating capital. My client is a bit of a control fanatic, and as a result, the next steps were much more difficult for her to accept. Not all were implemented fully, and those that weren't I'll tackle later.
The Easy Items
The first step was to get the receptionist's phone list organized and distributable. Every employee was given a business card-sized phone list. The employees' cards had their phone numbers, the phone numbers of others on their team or in their department, and the phone numbers of those up the chain of command, with the exception of the owner's home phone number. She did not want to have her employees calling her at all hours. Instead they would call her voice mail or route through their supervisor. This same phone list was published on the company intranet.
Each employee was asked to use Outlook's Journal feature continually for one week. This permitted them to see how their time was spent and how processes might be integrated, separated, and otherwise improved. We were able to determine that some people spent much more time than needed on the phone, in meetings, or surfing the Web. We were also able to establish that some tasks are best performed during the morning hours.
Simple things such as ordering office supplies were performed faster when done via e-mail or over the Web versus by phone or fax. This did require a change of supplier, but it saves money in two ways: labor time and cost control. The receptionist/office manager places the order, and someone else (the owner in this case) approves the order. This will change once trust with the vendor is built and the relatively new office manager proves that she can keep spending at a reasonable level.
Next, I cornered the in-house IT person. He also provides operational support where needed, including sorting and delivering snail mail. Together, we generated a list of all the equipment and the corresponding passwords, and wrote the information down by hand in two notebooks. He stashed one away in a locked desk drawer and gave the other to the owner.
We also established documentation processes. Using the collaboration features of Small Business Server 2003 and the default company intranet, each department had its own Web page, viewable by the whole company, as well as "private" pages, limited to people in that department. Here, procedural documents are stored for all to access. Each team leader and department head was asked to start documenting all processes within their sphere of responsibility. Even the part-time human resources/finance person saw an immediate benefit of this. No longer did he have to maintain forms for vacation requests and expense reports. He could keep one copy on the Web site, and users could copy and print as needed.
For the IT stuff, we set up subdepartments for each computer and server. The in-house technician will be performing an internal audit of each machine's hardware, serial numbers, software, and licenses. Whenever a machine is modified in any way, its pages get updated in an easy-to-manage system. Users are also being educated and encouraged to use the help-desk feature of the company's default Web installation. I provide contract support on the issues that can't be solved or answered in-house.
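The per-machine audit pages amount to a simple record for each computer. The Python sketch below shows one way such a record could be modeled. The field names, the MachineRecord class, and the sample data are all hypothetical; the actual pages lived on the SBS intranet, not in code.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class MachineRecord:
    name: str
    serial_number: str
    hardware: list = field(default_factory=list)     # e.g. "512 MB RAM"
    software: dict = field(default_factory=dict)     # title -> license note, or "" if unknown
    change_log: list = field(default_factory=list)   # (date, description) pairs

    def record_change(self, when, description):
        """Append an entry whenever the machine is modified in any way."""
        self.change_log.append((when, description))

    def unlicensed_software(self):
        """Return installed titles that have no license information on file."""
        return sorted(title for title, lic in self.software.items() if not lic)


# Hypothetical workstation entry.
pc = MachineRecord(name="front-desk", serial_number="SN-0001")
pc.software["Office suite"] = "volume license on file"
pc.software["Image editor"] = ""
pc.record_change(date(2004, 3, 1), "Added second hard drive")
```

A check like unlicensed_software() is the kind of question the audit is meant to answer quickly when a machine has to be rebuilt.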
Managers Learn to Cross-Train
After I had a lengthy discussion with the owner, she decided that the managers would start an in-place rotation program. Each manager continues to perform his or her own weekly duties, but each week one manager is chosen to rotate among the others. The floating manager shadows each of the other managers for one full day. This gives each manager insight into the duties, responsibilities, and managing styles of his or her peers. One result is that if a manager is unavailable due to illness or vacation, the other managers have a notion of how that department operates. They might not have the answers, but at least they know the right questions.
The owner also decided to start a mentoring program. Department managers will start special training for at least two people on their teams, so that someone can cover when a manager is not available.
The intranet is critical to mentoring. It provides a crutch for others to lean on when filling in for a manager. This also supports the ideal of promoting from within and structures teams and departments for expansion and growth. Morale is improved by exposure to training that could lead to promotion. Other managers get a sense of the stresses of other departments, and reasonable suggestions for process changes to increase efficiency are more easily accepted from an experienced peer than from the other manager down the hall.
Physical Access, Security, and Control
Surprisingly, the owner was the only person with a key to the office. When she is not around, she sends a family member to unlock or lock the building at the opening and closing of the day. In my opinion, it is demeaning for a worker to have to wait for the owner's 17-year-old son to unlock the door. She agreed this was not her best idea. As a result, one manager was chosen to be given a key and security codes. This manager was chosen partly because of her attendance record and her duties. As the manager of the shipping department, she was unlikely to be out on a sales call, for example, at the start of a business day. Giving a key to only one manager isn't optimal, of course, but it's a start.
The owner also expressed a desire for an increase in productivity and growth within the company. Although I disagree with several of the following techniques, I implemented them at the owner's insistence.
First, we added a Web proxy. This helps limit the Web sites employees visit, as well as track Internet usage. In accordance with her wishes, I also configured the company firewall to block the ports that IM tools commonly use.
Next, I wrote a script to parse the logs of desktop machines to track computer login and idle times. The owner wanted a way to track employees' time in various tasks as well as to try to match up their reported weekly hours with what their computers are doing. Of course, many managers don't require their computers to do their jobs, so the login and activity time report is best taken with a grain of salt.
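The original script is not reproduced here, but the idea can be sketched in Python. The log format below (timestamp, user, and one of LOGIN, LOGOUT, IDLE_START, or IDLE_END, separated by commas) is an assumption made for illustration; real desktop logs would need their own parsing.

```python
from collections import defaultdict
from datetime import datetime, timedelta

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"  # assumed timestamp layout


def summarize(lines):
    """Return {user: {"logged_in": timedelta, "idle": timedelta}}."""
    totals = defaultdict(lambda: {"logged_in": timedelta(), "idle": timedelta()})
    open_since = {}  # (user, kind) -> time the current interval started
    for line in lines:
        line = line.strip()
        if not line:
            continue
        stamp, user, event = [part.strip() for part in line.split(",")]
        when = datetime.strptime(stamp, TIME_FORMAT)
        if event == "LOGIN":
            open_since[(user, "logged_in")] = when
        elif event == "IDLE_START":
            open_since[(user, "idle")] = when
        elif event in ("LOGOUT", "IDLE_END"):
            kind = "logged_in" if event == "LOGOUT" else "idle"
            start = open_since.pop((user, kind), None)
            if start is not None:  # ignore an end event with no matching start
                totals[user][kind] += when - start
    return dict(totals)


# Hypothetical log lines for one user on one day.
sample = [
    "2004-03-01 09:00:00,alice,LOGIN",
    "2004-03-01 12:00:00,alice,IDLE_START",
    "2004-03-01 13:00:00,alice,IDLE_END",
    "2004-03-01 17:00:00,alice,LOGOUT",
]
report = summarize(sample)
```

Subtracting the idle total from the logged-in total gives a rough active-time figure, which is still only a proxy for actual work.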
Electronic security is easy, and not so easy, as you probably know. A predecessor put in place a low-end Cisco router to handle firewall duties. It's up to the task. I applied appropriate patches, double-checked to make certain no obvious ports were open, and reset the administrator password. That was the easy part. The hard part is maintaining the discipline to check the log often to make certain no unauthorized access has taken place. This is where I will spend at least a portion of my monthly fee in making certain the first line of defense is maintained properly.
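Part of the log-checking discipline can be automated. The Python sketch below counts denied connections per source address and reports the noisiest sources. The log line layout it expects (DENY, protocol, source address, arrow, destination) is a simplified, hypothetical format and not the router's actual syslog output, so the pattern would need adapting to real logs.

```python
import re
from collections import Counter

# Assumed line shape: "DENY tcp 203.0.113.5 -> 192.0.2.10:23"
DENY_LINE = re.compile(r"^DENY\s+\S+\s+(?P<src>\d+\.\d+\.\d+\.\d+)\s+->")


def noisy_sources(lines, threshold=3):
    """Return {source address: count} for sources denied at least `threshold` times."""
    counts = Counter()
    for line in lines:
        match = DENY_LINE.match(line.strip())
        if match:
            counts[match.group("src")] += 1
    return {src: n for src, n in counts.items() if n >= threshold}


# Hypothetical log excerpt using documentation-only IP ranges.
log_lines = [
    "DENY tcp 203.0.113.5 -> 192.0.2.10:23",
    "DENY tcp 203.0.113.5 -> 192.0.2.10:25",
    "PERMIT tcp 198.51.100.7 -> 192.0.2.10:25",
    "DENY udp 203.0.113.5 -> 192.0.2.10:161",
    "DENY tcp 198.51.100.9 -> 192.0.2.10:80",
]
suspects = noisy_sources(log_lines)
```

A summary like this does not replace reading the log, but it points out where to look first.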
About the Author
Daniel Curry is a founder of Professional Computer Programming & Networking, a San Francisco-area IT management firm. He specializes in Linux and Windows network cohabitation. When not working with computers, he is riding his motorcycle on California's back roads.