Part IV. Maintaining Systems

Organizations that are prepared to cope with uncomfortable situations have a better chance of dealing with a critical incident.

Even though it’s impossible to plan for every scenario that might disrupt your organization, the first steps toward a comprehensive disaster planning strategy, as discussed in Chapter 16, are pragmatic and approachable. They include setting up an incident response team, prestaging your systems and people before an incident, and testing your systems and response plans—preparation steps that will also help equip you for crisis management, the subject of Chapter 17. When dealing with a security crisis, people with a variety of skills and roles will need to be able to collaborate and communicate effectively to keep your systems running.

In the wake of an attack, your organization will need to take control of recovery and deal with the aftermath, as discussed in Chapter 18. Some up-front planning during these stages will also help you mount a thorough response and learn from what happened to prevent a recurrence.

To paint a more complete picture, we’ve included some chapter-specific contextual examples:

  • Chapter 16 features a story about Google creating a response plan for how to handle a devastating earthquake in the San Francisco Bay Area.

  • Chapter 17 tells the story of an engineer discovering that a service account they don’t recognize has been added to a cloud project they haven’t seen before.

  • Chapter 18 discusses the tradeoffs ...
