From the course: DevOps Foundations: Site Reliability Engineering

Incident management

- Incident management: not always the most popular topic. Problems with production are not usually people's favorite part of the job.
- But problems do strike, and having a high-quality playbook for handling them will reduce your downtime. It also makes your engineers, customers, and internal stakeholders happier.
- I wrote my company's incident response process, and I regularly train the organization on it. I've done that in a number of places, but it all started with one great conference presentation.
- That's right, at Velocity 2008, Brent Chapman gave a presentation called "Incident Command for IT: What We Can Learn from the Fire Department". It adapted the incident command system used by emergency first responders in the real world to IT incidents.
- [Instructor] The process scales from the smallest incident to the largest. Essentially, when a first responder looks into a problem, often prompted by an alert from the…
