Data Center Disaster Recovery

(Ping! Zine Issue 38) – Protecting company data and ensuring 24 x 7 x 365 uptime of IT infrastructure requires people, process, and technology. It just so happens those same components are essential to extending that protection and assurance through a disaster recovery (DR) strategy. Whether natural or man-made, a disaster is defined as something that makes the continuation of standard functions impossible.

The people aspect of a DR plan is an entire topic of its own and is, of course, a vital piece to address. Processes, policies and procedures are also crucial, and there are a number of excellent resources available to assist in developing the right mix for your business. An easy and effective way to tackle process is to follow the ITIL (Information Technology Infrastructure Library) guidelines. Additionally, there are templates, consultants and best practices that can be applied when it comes to writing the definitive DR plan document for your company. A substantial portion of the disaster recovery strategy addresses the very facility that, under normal conditions, protects the technology: the data center. Depending on the size of the IT infrastructure and the budget, there are many options and levels of protection available.

Three recovery strategy options for the physical data center are cold, warm and hot sites. A cold site is a standby data center facility that contains no equipment, but has the right electrical, environmental and telecommunication accommodations. This is the least expensive option; however, it makes for a much longer recovery time. A number of businesses offer shared cold site services, and a cold site can be an economical choice compared to not having anywhere to go in the event of a disaster. A warm site contains all IT equipment and is ready to go live, but does not have live data and would require a brief setup period when the DR plan is initiated. A hot site is a fully equipped location ready to take over operations at a moment’s notice, and it either receives frequent backups or contains continuously replicated data. While this is the most expensive option, it provides the most flexibility and the shortest recovery time, so it can meet the most demanding recovery time objective.
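The trade-off between cost and recovery time can be sketched in a few lines of Python. This is a minimal illustration, not a sizing tool: the recovery hours and relative cost ranks below are assumed placeholder values chosen only to preserve the ordering described above (cold is slowest and cheapest, hot is fastest and most expensive), and the names `SiteTier` and `cheapest_tier_for` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SiteTier:
    name: str
    typical_recovery_hours: float  # assumed placeholder, not a measured figure
    relative_cost: int             # 1 = least expensive, 3 = most expensive


# Illustrative values only; replace with figures from your own DR planning.
TIERS = [
    SiteTier("cold", typical_recovery_hours=168.0, relative_cost=1),
    SiteTier("warm", typical_recovery_hours=24.0, relative_cost=2),
    SiteTier("hot", typical_recovery_hours=1.0, relative_cost=3),
]


def cheapest_tier_for(rto_hours: float) -> SiteTier:
    """Return the least expensive tier whose recovery time fits the RTO."""
    candidates = [t for t in TIERS if t.typical_recovery_hours <= rto_hours]
    if not candidates:
        raise ValueError(f"No site tier can recover within {rto_hours} hours")
    return min(candidates, key=lambda t: t.relative_cost)


if __name__ == "__main__":
    for rto in (200, 48, 2):
        print(f"RTO {rto}h -> {cheapest_tier_for(rto).name} site")
```

With these assumed numbers, a business that can tolerate a week of downtime lands on a cold site, while one that needs to be back within hours is pushed toward a hot site.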

Site selection is an important component of the recoverability of the data center. If more than one disaster recovery site is feasible for the business and budget, one strategy could be to have a secondary data center within the same general region of the country as the primary, and then have a tertiary facility in a completely opposite or distant region of the country or the world. If your DR plan or budget does not call for building that level of redundancy yourself, the same two- or three-site strategy can be implemented with colocation or hosting providers around the globe. When selecting a colocation provider, apply the same rigid requirements to vendors as you would to building your own facility. Look at the provider's own site selection criteria to confirm that its location has a lower potential impact from natural disasters. Available network connections in the facility and connectivity to your primary data center should be investigated as well, especially if you intend to use it as a hot site. For extreme data protection and recovery needs, there are data centers built in underground, military-grade bunkers that are designed to shield against natural disasters as well as an electromagnetic pulse (EMP) or a nuclear blast.

A design alternative for the provisioning of a DR site is the data center container. Many major hardware vendors have container products available that provide a modular, mobile solution for rapid deployment. The data center container is based on ISO standard shipping containers and typically comes in 20-foot or 40-foot configurations. Think of it as an enclosed row in your brick-and-mortar facility that contains the supporting elements of a data center, but can be built and delivered in a matter of weeks instead of months or years. Maybe your site selection for a disaster recovery site is any one of five places; depending on the disaster event, you can then select the best fit for where to deliver the container(s). Additionally, there are manufacturers that offer containerized power and cooling solutions to complement a data center container. Another mobile option is the Cisco NERV (Network Emergency Response Vehicle), a mobile command and control center that delivers mobile-IP enabled solutions.

Another alternative has come about over the past several years in the form of cloud computing. With the advent of virtualization and cloud computing, the portability of server images and data can be extended to aid disaster recovery. Cloud providers allow for massive scale as well as the easy and quick setup of virtual infrastructure. Services like Amazon Elastic Compute Cloud (EC2) offer a variety of options for a flexible, controlled and geographically dispersed cloud. Amazon EC2 has location options composed of regions and availability zones. Availability zones are distinct locations that are insulated from failures in other zones, while providing inexpensive, low-latency network connectivity to other zones. Many other providers have private cloud, virtual private cloud, or public cloud offerings suited to a variety of needs.
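To show why insulated zones matter for recovery, here is a small sketch in plain Python that spreads servers across zones in round-robin order and then checks what survives the loss of one zone. It does not call any cloud provider API; the zone names, server names, and the functions `spread_across_zones` and `survivors_if_zone_fails` are hypothetical and exist only for illustration.

```python
from itertools import cycle


def spread_across_zones(servers, zones):
    """Assign servers to zones in round-robin order."""
    if not zones:
        raise ValueError("At least one zone is required")
    placement = {zone: [] for zone in zones}
    for server, zone in zip(servers, cycle(zones)):
        placement[zone].append(server)
    return placement


def survivors_if_zone_fails(placement, failed_zone):
    """List the servers still running when one zone is lost."""
    return [
        server
        for zone, members in placement.items()
        if zone != failed_zone
        for server in members
    ]


if __name__ == "__main__":
    # Hypothetical server and zone names.
    placement = spread_across_zones(
        ["web-1", "web-2", "web-3", "db-1"], ["zone-a", "zone-b"]
    )
    print(placement)
    print(survivors_if_zone_fails(placement, "zone-a"))
```

The design choice here is simply to avoid placing every copy of a workload in a single zone, so that a failure confined to one zone leaves part of the infrastructure running.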

Before diving too deep into the disaster recovery plan for the data center, remember that it must be orchestrated with the overall business and IT requirements for recovery. Another major influence on the disaster recovery planning process is the Business Continuity Plan (BCP). The BCP's long-term plans for how the business will continue to operate after a disaster greatly affect which data center decisions to make for a disaster recovery plan.

Writer’s Bio: John Rath is an independent consultant and blogger at