What is it ?
• Risk reduction by designing a recovery capability into each service.
o At the site level, provide an alternate building
o At the server level, provide High Availability
o At the human level, ensure more than one person holds each skill
• Analysis of resources to highlight existing Single Points of Failure
o An alternate data path from each end user to each application
o More than one support person for each path and application
o A tested and verified back-up source of data
• A requirement for resilience as part of the development cycle
o Based on the Business Impact Analysis (BIA) requirement
o Resilience intrinsic to the design and build of the application
o The requirement is included in the development budget
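
The "alternate data path" check above can be sketched in code. This is only an illustration under stated assumptions: the network is modelled as a directed adjacency map, and the node names ("switch_a", "firewall" and so on) and both function names are hypothetical, not taken from any real inventory or tool. The idea is to remove each intermediate node in turn and see whether the end user can still reach the application.

```python
from collections import deque


def reachable(graph, start, goal, removed=None):
    """Breadth-first search from start to goal, ignoring the `removed` node."""
    if removed in (start, goal):
        return False
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            return True
        for neighbour in graph.get(node, ()):
            if neighbour != removed and neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return False


def single_points_of_failure(graph, user, application):
    """Return intermediate nodes whose loss cuts the user off from the application."""
    nodes = set(graph) | {n for targets in graph.values() for n in targets}
    candidates = nodes - {user, application}
    return sorted(
        n for n in candidates
        if not reachable(graph, user, application, removed=n)
    )


# Hypothetical topology: two switches, but only one firewall.
network = {
    "user": ["switch_a", "switch_b"],
    "switch_a": ["firewall"],
    "switch_b": ["firewall"],
    "firewall": ["app"],
    "app": [],
}

print(single_points_of_failure(network, "user", "app"))  # ['firewall']
```

In this made-up topology the switches are duplicated, so losing either one leaves a path, while the lone firewall is reported. Adding a second firewall reachable from both switches would empty the list.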
How does it work ?
• Avoid Single Points of Failure (SPoF) in new services
• Seek out and eliminate SPoF in existing services
• Apply SPoF rules to vendor products and services
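
One way to apply the same SPoF rules to new, existing and vendor services is a simple inventory check. The sketch below is hypothetical throughout: the `Service` fields, the `spof_findings` helper, the sample entries and the "at least two" threshold are illustrative assumptions, not a standard. It covers the site, server and human levels plus the tested back-up.

```python
from dataclasses import dataclass


@dataclass
class Service:
    name: str
    sites: int            # buildings the service can run from
    servers: int          # servers able to host the service
    skilled_staff: int    # people able to support the service
    backup_tested: bool   # back-up restored and verified


def spof_findings(service, minimum=2):
    """List the levels at which the service has a Single Point of Failure."""
    findings = []
    if service.sites < minimum:
        findings.append("site")
    if service.servers < minimum:
        findings.append("server")
    if service.skilled_staff < minimum:
        findings.append("staff")
    if not service.backup_tested:
        findings.append("backup")
    return findings


# Hypothetical inventory, including one vendor-provided service.
services = [
    Service("payroll", sites=2, servers=2, skilled_staff=1, backup_tested=True),
    Service("vendor_crm", sites=1, servers=2, skilled_staff=2, backup_tested=False),
]

for service in services:
    print(service.name, spof_findings(service))
# payroll ['staff']
# vendor_crm ['site', 'backup']
```

Running one rule set over every entry, in-house or vendor, keeps the comparison consistent. An empty list means no SPoF was found at the levels checked, not that the service is free of risk.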
Blunder points ?
• Assuming resources will remain safe and available over time
• Regarding resilience as a design afterthought
• Assuming service providers think about resilience as you do
What next ?
Disaster Recovery Planning and Rehearsal