# Incident Response Process
## How Incidents Flow

GitHub Issues are the system of record for all incident data. Slack is the coordination and post-mortem authoring surface. GCP Cloud Monitoring provides automated alerting via Slack notification channels.
### Monitored Incidents (primary path)

When a GCP Cloud Monitoring alert policy fires:

1. GCP Cloud Monitoring fires the alert — an uptime check or custom metric condition evaluates to a violated state.
2. The notification channel posts to `#incidents` via Slack with alert details.
3. The responder acknowledges in the `#incidents` Slack thread — this starts the response clock.
4. The responder follows the Incident Response Runbook for containment and recovery.
5. Coordination happens in the `#incidents` Slack thread for the duration of the incident.
6. The responder creates a GitHub Issue as the incident record and marks it resolved when done.
7. A postmortem is authored via Slack modal (SEV-1/2) — the `incident-postmortem` Edge Function posts a “Write Post-Mortem” button to the thread; submitted content is stored as a GitHub issue comment.
8. Nightly evidence automation pulls incident data from GitHub Issues and the GCP Cloud Monitoring API.
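The postmortem prompt in step 7 can be illustrated with a short sketch of the Block Kit payload such a button post would carry. This is an illustration only, not the real Edge Function (which runs server-side and is not shown here); the `action_id` and the issue number are assumptions.

```python
import json

def postmortem_button_blocks(issue_number: int) -> list:
    """Block Kit payload resembling the "Write Post-Mortem" prompt posted
    to the #incidents thread. Illustrative sketch: the action_id and
    message wording below are assumptions, not the real values."""
    return [
        {
            "type": "section",
            "text": {
                "type": "mrkdwn",
                "text": f"Incident #{issue_number} is resolved. SEV-1/2 requires a postmortem.",
            },
        },
        {
            "type": "actions",
            "elements": [
                {
                    "type": "button",
                    "text": {"type": "plain_text", "text": "Write Post-Mortem"},
                    "action_id": "write_postmortem",  # assumed action_id
                    "value": str(issue_number),
                }
            ],
        },
    ]

# Issue number 142 is a placeholder for illustration.
print(json.dumps(postmortem_button_blocks(142), indent=2))
```

A payload like this would be sent with Slack's `chat.postMessage`, with the modal opened in response to the button's interaction event.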
### User-Reported Incidents (fallback)

For incidents reported by users or observed outside monitoring coverage:

1. Open a GitHub Issue using the Incident Response template with the title `[SEV-?] Brief description`.
2. Continue with the standard runbook from step 4 above.
## Detection Sources

| Source | What It Detects | Alert Channel |
|---|---|---|
| GCP Cloud Monitoring | Uptime check failures, API degradation, alert policy violations | Slack #incidents notification channel |
| CrowdStrike | Endpoint malware, lateral movement, exploits | Falcon console + email |
| Google Workspace | Suspicious sign-ins, admin audit anomalies, DLP | Admin alerts + nightly evidence |
| GitHub | Secret exposure, vulnerable dependencies | Repository notifications |
| 1Password | Compromised credentials, unusual sign-in patterns | Events API (nightly evidence) |
## Severity and SLAs

| Level | Definition | Response SLA |
|---|---|---|
| SEV-1 | Confirmed data breach, complete production outage, active exploitation | 15 minutes |
| SEV-2 | Partial outage, confirmed compromise, unauthorized access | 1 hour |
| SEV-3 | Degraded service, minor security event, single-user compromise | 24 hours |
| SEV-4 | Suspicious but unconfirmed activity, low-severity alert | 72 hours |
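The SLA table maps directly to code, for example to compute the acknowledgement deadline from an alert's fire time. A minimal sketch (the function and constant names are illustrative, not from the runbook):

```python
from datetime import datetime, timedelta, timezone

# Response SLAs from the severity table above.
RESPONSE_SLA = {
    "SEV-1": timedelta(minutes=15),
    "SEV-2": timedelta(hours=1),
    "SEV-3": timedelta(hours=24),
    "SEV-4": timedelta(hours=72),
}

def ack_deadline(severity: str, fired_at: datetime) -> datetime:
    """Latest time a responder must acknowledge the alert."""
    return fired_at + RESPONSE_SLA[severity]

fired = datetime(2024, 3, 5, 9, 0, tzinfo=timezone.utc)
print(ack_deadline("SEV-1", fired))  # 2024-03-05 09:15:00+00:00
```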
## Escalation Routing (GCP Cloud Monitoring → Slack)

| Severity | Routing |
|---|---|
| SEV-1/2 | Slack #incidents + CISO DM follow-up after 15 min without ack |
| SEV-3 | Slack #incidents |
| SEV-4 | CISO manual tracking only |
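The routing table above, including the 15-minute escalation for unacknowledged SEV-1/2 alerts, can be sketched as a small function (names are illustrative):

```python
def routing_targets(severity: str, minutes_without_ack: int = 0) -> list:
    """Escalation targets per the routing table above (illustrative sketch)."""
    if severity in ("SEV-1", "SEV-2"):
        targets = ["#incidents"]
        if minutes_without_ack >= 15:
            targets.append("CISO DM")  # follow-up after 15 min without ack
        return targets
    if severity == "SEV-3":
        return ["#incidents"]
    return ["CISO manual tracking"]  # SEV-4

print(routing_targets("SEV-1", minutes_without_ack=20))
# ['#incidents', 'CISO DM']
```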
## Evidence Collection

Nightly automation collects incident evidence from the GCP Cloud Monitoring API (alert policies, notification channels, uptime check state) and commits it to `evidence/logs/gcp-monitoring/YYYY/MM/`. No manual collection is needed for routine compliance evidence.
For incident-specific artifacts (log exports, screenshots, containment actions) collected during response, store them in `evidence/incidents/YYYY/YYYY-MM-DD-incident-brief/`.
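The artifact directory convention can be encoded as a small helper so responders build consistent paths. A sketch (the function name and the `api-outage` slug are illustrative):

```python
from datetime import date

def incident_artifact_dir(incident_date: date, slug: str) -> str:
    """Build the evidence/incidents/YYYY/YYYY-MM-DD-<brief>/ path."""
    return f"evidence/incidents/{incident_date:%Y}/{incident_date:%Y-%m-%d}-{slug}/"

print(incident_artifact_dir(date(2024, 3, 5), "api-outage"))
# evidence/incidents/2024/2024-03-05-api-outage/
```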
To run an immediate evidence collection outside the nightly schedule, invoke the collector manually against the same GCP Cloud Monitoring API.
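An on-demand pull can reuse the same API the nightly job reads. A hedged sketch using the `google-cloud-monitoring` client, writing into the evidence layout described above (this is not the actual nightly collector; the project ID and output filename are assumptions):

```python
import json
from datetime import datetime, timezone
from pathlib import Path

def evidence_output_path(now: datetime) -> Path:
    """Mirror the nightly layout: evidence/logs/gcp-monitoring/YYYY/MM/.
    The alert-policies.json filename is an assumed convention."""
    return Path(f"evidence/logs/gcp-monitoring/{now:%Y}/{now:%m}") / "alert-policies.json"

def collect_alert_policies(project_id: str) -> None:
    """Snapshot alert policies for one project (requires GCP credentials
    and `pip install google-cloud-monitoring`)."""
    from google.cloud import monitoring_v3

    client = monitoring_v3.AlertPolicyServiceClient()
    policies = [
        {"name": p.name, "display_name": p.display_name}
        for p in client.list_alert_policies(name=f"projects/{project_id}")
    ]
    out = evidence_output_path(datetime.now(timezone.utc))
    out.parent.mkdir(parents=True, exist_ok=True)
    out.write_text(json.dumps(policies, indent=2))

if __name__ == "__main__":
    collect_alert_policies("my-gcp-project")  # project ID is a placeholder
```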
After containment, verify no compliance regressions were introduced before closing the incident.
## Postmortem

SEV-1 and SEV-2 incidents require a postmortem within 5 business days.
Authoring flow: after resolution, the “Write Post-Mortem” button appears in the `#incidents` Slack thread. Clicking it opens a structured modal. The submitted content is stored as a comment on the GitHub incident issue and is picked up by nightly evidence automation.
For the full postmortem template and review process, see Incident Response Runbook §5.
## Related Documents

Meridian Seven — Confidential