Disaster Recovery Plan and Testing for a 5,000-Site Global Retailer
A global retailer with more than 5,000 locations worldwide sought to improve disaster recovery (DR) for order processing. The company turned to Burwood Group to address technical challenges and tailor a solution within its current infrastructure. Upon completion, the company wanted to run a major test exercise encompassing its mainframe, production data center, and a wide range of applications.
The Challenge: Slow Recovery in a Complicated Network Environment
The risks of downtime in mission-critical systems include lost data, productivity, and revenue, as well as a negative impact on the customer experience. When the company experienced a five-day mainframe outage, its board and IT leaders agreed it was time for an update to uphold the company’s commitment to customer satisfaction. Its existing infrastructure was too complex for an off-the-shelf solution, so Burwood sought to improve recovery time without a complete network overhaul.
The Solution: Create an Isolated Disaster Recovery Environment
Working as an extension of the retailer’s IT team, Burwood Group engaged stakeholders across the company to identify past order-processing DR strategies and requirements and to recommend the foundation for a new approach. The critical concern was that, although the company’s DR backup site was a remote mainframe, the order-processing backup was not functioning properly. Burwood created an isolated order-processing network, collaborating with the company’s IT team to identify the critical order-processing applications to be replicated. Using the isolated environment, Burwood was able to test the solution without disrupting the corporate network.
One challenge was that the company’s current infrastructure is based on the legacy SNA network protocol, which is not compatible with the TCP/IP protocol used in contemporary DR solutions. Burwood configured a virtual “tunnel” based on a Cisco switch to integrate the order-processing system with the new DR solution.
Intensive Testing for Fail-safe Disaster Recovery
Upon implementing the solution, the company wanted to test a broad DR scenario far more extensive than any disaster that might occur, to encompass the entire ordering, payment processing, and delivery lifecycle.
One challenge involved the unknown number of dependencies between the Tier 1 and Tier 2 applications to be tested. In addition, the company was planning to eliminate its order-processing mainframe but had not yet implemented the new solution. Therefore, Burwood needed to design the testing exercise to accommodate both the current and future states of the order-processing workflow.
Working as an extension of the company’s IT team, Burwood Group interviewed roughly 70 stakeholders across the company to gather requirements and uncover hundreds of dependencies in dozens of applications. With a deep understanding of the entire ecosystem, the Burwood project team determined the optimal failover order for every application, and structured a comprehensive testing exercise that would ensure business continuity in the event of a major outage.
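Deriving a failover order from a dependency map is, in general terms, a topological sort: each application is brought up only after everything it depends on. The following is a minimal illustrative sketch in Python, not the tooling Burwood used. The application names and the dependency map are hypothetical, and the standard-library graphlib module (Python 3.9+) stands in for whatever method the project team actually applied.

```python
from graphlib import TopologicalSorter, CycleError


def failover_order(dependencies):
    """Return applications ordered so each follows everything it depends on.

    `dependencies` maps an application name to the set of applications
    that must already be running at the DR site before it can start.
    Raises ValueError if the dependencies form a cycle.
    """
    try:
        return list(TopologicalSorter(dependencies).static_order())
    except CycleError as exc:
        # A cycle means no valid failover order exists until it is broken.
        raise ValueError(f"circular dependency: {exc.args[1]}") from exc


# Hypothetical applications and dependencies, for illustration only.
deps = {
    "database": set(),
    "auth": {"database"},
    "order-entry": {"database", "auth"},
    "payment": {"order-entry", "auth"},
    "delivery": {"payment"},
}

order = failover_order(deps)
print(order)
# ['database', 'auth', 'order-entry', 'payment', 'delivery']
```

With hundreds of real dependencies, several orders are usually valid; a sort like this yields one of them and, more usefully, flags circular dependencies that have to be resolved before a failover sequence can be fixed.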
Given the scope of the DR test, Burwood recommended using an automated toolset to rapidly and seamlessly transition workflow from the primary systems to the DR servers. The Burwood team created the automation scripts and executed multiple iterations to resolve emerging technical issues.
Burwood Group Services:
Disaster Recovery
Data Optimization
Network Management
Program Management
Data Center Automation
Technology Strategy
The Outcome: Improved Business Continuity
Collaborating with the retailer, Burwood executed a successful DR test for all critical business processes. The four-day exercise encompassed four main sites, including two data centers, in the United States and India, and a total of 171 servers. In addition, the company gained a deeper and more detailed view of its IT environment and application dependencies. Using the playbooks and automated tools created by Burwood, the company can now respond to a disaster affecting any of its core business systems.
Now that an isolated DR network is in place, the company can use it not only for testing DR, but also for testing application upgrades without risking the critical production environment. Based on the success of the order-processing DR solution and testing, Burwood and the retailer are now creating a DR and refresh strategy for all of the company’s applications and data. Most important, the exercise demonstrated that the business could withstand a major disruption without missing a beat. For senior management, that assurance means confidence and peace of mind.