# Incident Response Process
## How Incidents Flow

GitHub Issues are the system of record for all incident data. Slack is the coordination and post-mortem authoring surface. GCP Cloud Monitoring provides automated alerting via Slack notification channels.
### Monitored Incidents (primary path)

When a GCP Cloud Monitoring alert policy fires:

1. GCP Cloud Monitoring fires the alert — an uptime check or custom metric condition evaluates to a violated state.
2. The notification channel posts to `#incidents` via Slack with alert details.
3. The responder acknowledges in the `#incidents` Slack thread — this starts the response clock.
4. The responder follows the Incident Response Runbook for containment and recovery.
5. Coordination happens in the `#incidents` Slack thread for the duration of the incident.
6. The responder creates a GitHub Issue as the incident record and marks it resolved when done.
7. A postmortem is authored via Slack modal (SEV-1/2) — the `incident-postmortem` Edge Function posts a “Write Post-Mortem” button to the thread; submitted content is stored as a GitHub issue comment.
8. Nightly evidence automation pulls incident data from GitHub Issues and the GCP Cloud Monitoring API.
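The postmortem prompt in step 7 can be illustrated with a short sketch of the Block Kit payload such a button post would carry. This is an illustration only, not the real Edge Function (which runs server-side and is not shown here); the `action_id` and the issue number are assumptions.

```python
import json

def postmortem_button_blocks(issue_number: int) -> list:
    """Block Kit payload resembling the "Write Post-Mortem" prompt posted
    to the #incidents thread. Illustrative sketch: the action_id and
    message wording below are assumptions, not the real values."""
    return [
        {
            "type": "section",
            "text": {
                "type": "mrkdwn",
                "text": f"Incident #{issue_number} is resolved. SEV-1/2 requires a postmortem.",
            },
        },
        {
            "type": "actions",
            "elements": [
                {
                    "type": "button",
                    "text": {"type": "plain_text", "text": "Write Post-Mortem"},
                    "action_id": "write_postmortem",  # assumed action_id
                    "value": str(issue_number),
                }
            ],
        },
    ]

# Issue number 142 is a placeholder for illustration.
print(json.dumps(postmortem_button_blocks(142), indent=2))
```

A payload like this would be sent with Slack's `chat.postMessage`, with the modal opened in response to the button's interaction event.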
### User-Reported Incidents (fallback)

For incidents reported by users or observed outside monitoring coverage:

1. Open a GitHub Issue using the Incident Response template with the title `[SEV-?] Brief description`.
2. Continue with the standard runbook from step 4 above.
## Detection Sources

| Source | What It Detects | Alert Channel |
|---|---|---|
| GCP Cloud Monitoring | Uptime check failures, API degradation, alert policy violations | Slack #incidents notification channel |
| CrowdStrike | Endpoint malware, lateral movement, exploits | Falcon console + email |
| Google Workspace | Suspicious sign-ins, admin audit anomalies, DLP | Admin alerts + nightly evidence |
| GitHub | Secret exposure, vulnerable dependencies | Repository notifications |
| 1Password | Compromised credentials, unusual sign-in patterns | Events API (nightly evidence) |
## Severity and SLAs

| Level | Definition | Response SLA |
|---|---|---|
| SEV-1 | Confirmed data breach, complete production outage, active exploitation | 15 minutes |
| SEV-2 | Partial outage, confirmed compromise, unauthorized access | 1 hour |
| SEV-3 | Degraded service, minor security event, single-user compromise | 24 hours |
| SEV-4 | Suspicious but unconfirmed activity, low-severity alert | 72 hours |
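The SLA table maps directly to code, for example to compute the acknowledgement deadline from an alert's fire time. A minimal sketch (the function and constant names are illustrative, not from the runbook):

```python
from datetime import datetime, timedelta, timezone

# Response SLAs from the severity table above.
RESPONSE_SLA = {
    "SEV-1": timedelta(minutes=15),
    "SEV-2": timedelta(hours=1),
    "SEV-3": timedelta(hours=24),
    "SEV-4": timedelta(hours=72),
}

def ack_deadline(severity: str, fired_at: datetime) -> datetime:
    """Latest time a responder must acknowledge the alert."""
    return fired_at + RESPONSE_SLA[severity]

fired = datetime(2024, 3, 5, 9, 0, tzinfo=timezone.utc)
print(ack_deadline("SEV-1", fired))  # 2024-03-05 09:15:00+00:00
```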
## Escalation Routing (GCP Cloud Monitoring → Slack)

| Severity | Routing |
|---|---|
| SEV-1/2 | Slack #incidents + CISO DM follow-up after 15 min without ack |
| SEV-3 | Slack #incidents |
| SEV-4 | CISO manual tracking only |
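The routing table above, including the 15-minute escalation for unacknowledged SEV-1/2 alerts, can be sketched as a small function (names are illustrative):

```python
def routing_targets(severity: str, minutes_without_ack: int = 0) -> list:
    """Escalation targets per the routing table above (illustrative sketch)."""
    if severity in ("SEV-1", "SEV-2"):
        targets = ["#incidents"]
        if minutes_without_ack >= 15:
            targets.append("CISO DM")  # follow-up after 15 min without ack
        return targets
    if severity == "SEV-3":
        return ["#incidents"]
    return ["CISO manual tracking"]  # SEV-4

print(routing_targets("SEV-1", minutes_without_ack=20))
# ['#incidents', 'CISO DM']
```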
## Evidence Collection

Nightly automation collects incident evidence from the GCP Cloud Monitoring API (alert policies, notification channels, uptime check state) and commits it to `evidence/logs/gcp-monitoring/YYYY/MM/`. No manual collection is needed for routine compliance evidence.
For incident-specific artifacts (log exports, screenshots, containment actions) collected during response, store them in `evidence/incidents/YYYY/YYYY-MM-DD-incident-brief/`.
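The artifact directory convention can be encoded as a small helper so responders build consistent paths. A sketch (the function name and the `api-outage` slug are illustrative):

```python
from datetime import date

def incident_artifact_dir(incident_date: date, slug: str) -> str:
    """Build the evidence/incidents/YYYY/YYYY-MM-DD-<brief>/ path."""
    return f"evidence/incidents/{incident_date:%Y}/{incident_date:%Y-%m-%d}-{slug}/"

print(incident_artifact_dir(date(2024, 3, 5), "api-outage"))
# evidence/incidents/2024/2024-03-05-api-outage/
```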
To run an immediate evidence collection outside the nightly schedule, invoke the collector manually against the same GCP Cloud Monitoring API.
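An on-demand pull can reuse the same API the nightly job reads. A hedged sketch using the `google-cloud-monitoring` client, writing into the evidence layout described above (this is not the actual nightly collector; the project ID and output filename are assumptions):

```python
import json
from datetime import datetime, timezone
from pathlib import Path

def evidence_output_path(now: datetime) -> Path:
    """Mirror the nightly layout: evidence/logs/gcp-monitoring/YYYY/MM/.
    The alert-policies.json filename is an assumed convention."""
    return Path(f"evidence/logs/gcp-monitoring/{now:%Y}/{now:%m}") / "alert-policies.json"

def collect_alert_policies(project_id: str) -> None:
    """Snapshot alert policies for one project (requires GCP credentials
    and `pip install google-cloud-monitoring`)."""
    from google.cloud import monitoring_v3

    client = monitoring_v3.AlertPolicyServiceClient()
    policies = [
        {"name": p.name, "display_name": p.display_name}
        for p in client.list_alert_policies(name=f"projects/{project_id}")
    ]
    out = evidence_output_path(datetime.now(timezone.utc))
    out.parent.mkdir(parents=True, exist_ok=True)
    out.write_text(json.dumps(policies, indent=2))

if __name__ == "__main__":
    collect_alert_policies("my-gcp-project")  # project ID is a placeholder
```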
After containment, verify no compliance regressions were introduced before closing the incident.
## Postmortem

SEV-1 and SEV-2 incidents require a postmortem within 5 business days.
Authoring flow: after resolution, the “Write Post-Mortem” button appears in the `#incidents` Slack thread. Clicking it opens a structured modal. The submitted content is stored as a comment on the GitHub incident issue and is picked up by nightly evidence automation.
For the full postmortem template and review process, see Incident Response Runbook §5.
## Related Documents

Meridian Seven — Confidential