Incident Response Runbook

Operational companion to the Incident Response Plan. GitHub Issues are the system of record for all incident data. Slack is the coordination and post-mortem authoring surface. Incident channels: #incidents (live coordination), #security-alerts (automated alerts), #incident-YYYY-MM-DD-brief (dedicated SEV-1/2).


GCP Cloud Monitoring creates alerts automatically when uptime checks or alert policies detect a failure. Notification channels route alerts to Slack #incidents — no manual severity assignment needed for monitored systems.

Upon alert firing, GCP Cloud Monitoring:

  1. Evaluates alert policy conditions and fires the notification channel
  2. Posts to #incidents via Slack notification channel
  3. On-call responder acknowledges and begins response

User-reported incidents not detected by monitoring: Open a GitHub Issue using the Incident Response template with title [SEV-?] Brief description.
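For user-reported incidents, the issue can be opened from the command line. A minimal sketch using the GitHub CLI — only the [SEV-?] title format comes from this runbook; the label and the dry-run echo are illustrative assumptions:

```shell
# Build the "[SEV-?] Brief description" title and print the gh command for
# review before running it. The "incident" label is an assumption.
make_incident_title() {
  sev="$1"    # 1-4, or "?" if severity is not yet assigned
  brief="$2"
  printf '[SEV-%s] %s' "$sev" "$brief"
}

title=$(make_incident_title "?" "Checkout API returning 500s")
echo "gh issue create --title \"$title\" --label incident"
```

Severity in the title can start as "?" and be corrected once triage assigns it.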

| Observation | Severity |
| --- | --- |
| Confirmed data breach / active exploitation / complete outage | SEV-1 |
| Partial outage / suspected compromise / critical vuln in the wild | SEV-2 |
| Degraded performance / single account compromise / failed brute-force | SEV-3 |
| Suspicious but unconfirmed / policy violation / informational | SEV-4 |

For monitored incidents, severity is set by Catalog metadata. For user-reported incidents, assign within 15 min (SEV-1/2) or 1 hour (SEV-3/4).
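The triage table can be approximated as a small lookup for tooling. A sketch only — the keyword matching is illustrative, and a human still confirms the final severity within the deadlines above:

```shell
# Map an observation to a severity per the triage table. Keywords are
# illustrative assumptions; unmatched observations default to SEV-4.
assign_severity() {
  case "$1" in
    *"data breach"*|*"active exploitation"*|*"complete outage"*) echo "SEV-1" ;;
    *"partial outage"*|*"suspected compromise"*|*"critical vuln"*) echo "SEV-2" ;;
    *"degraded"*|*"single account"*|*"brute-force"*) echo "SEV-3" ;;
    *) echo "SEV-4" ;;
  esac
}

assign_severity "complete outage of the web service"   # prints SEV-1
```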

| Source | Where to Check |
| --- | --- |
| GCP Cloud Monitoring | console.cloud.google.com → Monitoring → Alerting |
| CrowdStrike | falcon.crowdstrike.com → Detections |
| Google Workspace | admin.google.com → Reporting → Audit |
| GitHub | github.com → Security → Code scanning |
| 1Password | 1password.com → Watchtower |
| Manual report | #security-alerts Slack |

First-response steps:

  1. Acknowledge the alert in the #incidents Slack thread.
  2. Open the GCP Cloud Monitoring alert link for full context on the affected system.
  3. Start a Slack thread in #incidents for SEV-1/2/3 coordination.

GCP Cloud Monitoring alert policies handle primary notification routing via Slack and email channels. The table below describes what happens automatically and what requires human action.

| Severity | Automatic (GCP Cloud Monitoring + Slack) | Human Action |
| --- | --- | --- |
| SEV-1 | Slack #incidents notification + CISO follow-up after 15 min | Open #incident-YYYY-MM-DD-brief; phone CTO; if customer data at risk, call legal |
| SEV-2 | Slack #incidents notification + CISO follow-up after 15 min | Post updates every 2 hours in #incidents thread |
| SEV-3 | Slack #incidents notification | Post daily update in #incidents thread |
| SEV-4 | None automatic | CISO logs a GitHub Issue if tracking is warranted; no broader notification |
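The dedicated SEV-1/2 channel name follows the #incident-YYYY-MM-DD-brief convention. A sketch deriving it from today's UTC date and a short slug for the brief:

```shell
# Derive the dedicated incident channel name from the current UTC date.
# The slug (e.g. "db-breach") stands in for the brief description.
channel_name() {
  echo "#incident-$(date -u +%Y-%m-%d)-$1"
}

channel_name db-breach   # e.g. #incident-2025-06-01-db-breach
```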

Status update format for #incidents thread:

[TIME] Status: [Investigating / Contained / Remediating / Resolved]
What we know: [1-2 sentences]
Current action: [what is happening now]
Next update: [time]

Cadence: SEV-1 every 30 min, SEV-2 every 2 hours, SEV-3 daily.
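The status update above can be generated so every post follows the same shape. A minimal sketch, assuming UTC timestamps:

```shell
# Print a status update in the runbook format, ready to paste into the
# #incidents thread. Times are UTC; adjust to team convention if needed.
status_update() {
  printf '[%s] Status: %s\n' "$(date -u +%H:%M)" "$1"
  printf 'What we know: %s\n' "$2"
  printf 'Current action: %s\n' "$3"
  printf 'Next update: %s\n' "$4"
}

status_update "Investigating" "Elevated 500s on checkout since 14:02 UTC." \
  "Rolling back to last known-good Cloud Run revision." "15:00 UTC"
```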

Only the CTO or CISO may authorize external communications, and legal must review all of them before release.

| Event | Action | Timing |
| --- | --- | --- |
| Customer data breach | CTO + Legal via email | Within 72 hours of confirmation |
| Status page update | CTO or CISO authorizes update via GCP Cloud Monitoring status dashboard | When customer-visible impact occurs |
| Regulatory notification | Legal leads, coordinated with CISO | Per applicable regulation |

Preserve evidence (Section 4) before destructive containment actions.

Google Workspace:

  • Account compromise: Admin console → Users → Suspend; revoke all active sessions; reset password; export the 30-day audit log.
  • Unauthorized OAuth grants: Security → API controls → App access control → revoke suspicious grants.

GitHub:

  • Compromised PAT or deploy key: Revoke immediately at github.com/settings/tokens or repo → Deploy keys. Check the audit log for actions taken.
  • Compromised GH_PAT_ADMIN: Revoke, generate a new token, run doppler secrets set GH_PAT_ADMIN -p m7-security -c prd, then verify evidence workflows.
  • Malicious workflow: Disable via the GitHub UI (Actions → workflow → disable); review recent runs.

CrowdStrike:

  • Endpoint threat: Endpoint Security → Detections → review details → Hosts → “Contain host” (isolates from network, preserves forensics). Export the detection report. Do not reimage until the investigation is complete.

Cloudflare:

  • Active attack: Enable “Under Attack Mode”; under Security → WAF, create a block rule. Export security events before making changes.
  • DNS tampering: DNS → compare to known-good state → revert unauthorized changes. If the token is compromised: revoke, create a replacement, doppler secrets set CF_SECURITY_API_TOKEN -p m7-security -c prd.

1Password:

  • Compromised account: Admin console → People → Suspend; review vault access; rotate all credentials the account could reach.
  • Service account compromise: Admin console → Service Accounts → revoke token → generate replacement.

Doppler:

  • Compromised secret: Rotate at the source → doppler secrets set SECRET_NAME -p m7-security -c prd. Check the Activity log for unauthorized reads.
  • Compromised DOPPLER_SA_TOKEN: Revoke in Workplace → Service Accounts; generate a new token; update DOPPLER_TOKEN in GitHub Actions secrets; rotate all secrets it had read access to.

Supabase:

  • Unauthorized database access: Roll the service_role key in API Settings → doppler secrets set SB_SERVICE_ROLE_KEY -p m7-security -c prd. Check auth.audit_log_entries and the Postgres logs.
  • Leaked anon key: Dashboard → API Settings → Roll anon key. Updates all application deployments immediately.
  • Emergency lockdown: Settings → API → disable “Enable API” (takes the application offline; last resort only).

Cloud Run (web service):

  • Compromised deployment: GCP Console → Cloud Run → web service → Revisions → select last known-good → route 100% traffic. Check Cloud Logging for the service's request/error logs.
  • Compromised service account: IAM → locate the service account → remove all role bindings → create a replacement → update Doppler.

Cloud Run (agent service):

  • Compromised deployment: GCP Console → Cloud Run → agent service → Revisions → select previous known-good → route 100% traffic.
  • Compromised service account: IAM → locate the service account → remove all role bindings → create a replacement → update Doppler.
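Several of the containment steps above end with the same rotate-then-update-Doppler motion. A minimal sketch of that pattern; it defaults to a dry run, and the step that supplies the new secret value to the doppler CLI is deliberately left interactive because that detail is not specified here:

```shell
# Rotate-then-update pattern: print (or run) the doppler command that points
# the m7-security prd config at a freshly rotated secret. DRY_RUN defaults
# to 1 so nothing is changed until you opt in with DRY_RUN=0.
rotate_secret() {
  name="$1"
  cmd="doppler secrets set $name -p m7-security -c prd"
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "would run: $cmd"
  else
    $cmd   # assumption: the CLI prompts for the new value interactively
  fi
}

rotate_secret GH_PAT_ADMIN
```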

Capture within the first hour, before destructive containment. Nightly automation collects GCP Cloud Monitoring alert data (policies, notification channels, uptime check state) — manual collection below is for system-specific artifacts and mid-incident snapshots.

  • Screenshots of triggering alert with timestamps
  • Current state of affected system
  • Log exports (at least 24 hours before the alert)
  • GitHub audit log (if access-related)
  • Doppler activity log (if secrets may be compromised)

| System | Log Location | Export Method |
| --- | --- | --- |
| GCP Cloud Monitoring | console.cloud.google.com → Monitoring → Alerting | Nightly automated collection via API |
| GCP Cloud Logging | console.cloud.google.com → Logging → Log Explorer | Filter by service, export to Cloud Storage or CSV |
| CrowdStrike | falcon.crowdstrike.com → Detections | Export detection report |
| Google Workspace | admin.google.com → Reporting → Audit | Export to CSV |
| GitHub | github.com/orgs/Meridian7-io/audit-log | gh api /orgs/Meridian7-io/audit-log |
| Cloudflare | dash.cloudflare.com → Security → Events | Export security events |
| Supabase | Supabase dashboard → Logs | SQL query + screenshot |
| Doppler | doppler.com → Activity | Screenshot |
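For the GitHub row above, the export usually wants a date filter so the evidence window matches the incident. A sketch that prints the command for review; the created:>= filter follows GitHub's audit-log search-phrase syntax:

```shell
# Print the gh api command that exports the org audit log since a given
# date. Printed rather than run so it can be reviewed first.
audit_log_cmd() {
  since="$1"   # e.g. 2025-06-01
  echo "gh api --paginate \"/orgs/Meridian7-io/audit-log?phrase=created:>=$since\""
}

audit_log_cmd 2025-06-01
```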

Store incident-specific artifacts in evidence/incidents/YYYY/YYYY-MM-DD-incident-brief/. Naming: YYYY-MM-DD-HH-MM-[system]-[artifact-type].[ext]. Commit the evidence and sign the commit.
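The directory and filename conventions above can be built mechanically from the current UTC time. A sketch, assuming "brief" is a short hyphenated slug:

```shell
# Build the evidence directory and artifact filename per the runbook's
# naming convention, using the current UTC time.
evidence_path() {
  slug="$1"      # short incident brief, e.g. api-outage
  system="$2"    # e.g. cloudflare
  artifact="$3"  # e.g. security-events
  ext="$4"       # e.g. csv
  year=$(date -u +%Y)
  day=$(date -u +%Y-%m-%d)
  stamp=$(date -u +%Y-%m-%d-%H-%M)
  echo "evidence/incidents/$year/$day-incident-$slug/$stamp-$system-$artifact.$ext"
}

evidence_path api-outage cloudflare security-events csv
```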


| Severity | Required | Due |
| --- | --- | --- |
| SEV-1 | Yes | 5 business days |
| SEV-2 | Yes | 5 business days |
| SEV-3 | CISO discretion | 10 business days if written |
| SEV-4 | No | — |

For SEV-1/2, after resolution the incident-postmortem Edge Function receives a GCP Cloud Monitoring notification and posts a “Write Post-Mortem” button to the #incidents Slack thread. Clicking it opens a structured modal. The submitted content is stored as a GitHub incident issue comment — this is the system-of-record entry picked up by nightly evidence automation.

Template for the modal (and for manually authored post-mortems):

Incident: [Title]
Date: YYYY-MM-DD  Severity: SEV-[N]  Duration: HH:MM
Incident Commander: [Role]  Responders: [Roles]

Summary
[1-2 sentences: what happened and impact]

Timeline
HH:MM — Alert triggered
HH:MM — Severity assigned
HH:MM — Containment taken
HH:MM — Root cause identified
HH:MM — Remediation complete

Root Cause
Immediate cause: ...  Root cause: ...

Impact
Systems affected: ...
Customer impact: [None / Description]
Data exposed: [None / Description]

What Went Well / What Could Be Improved
- ...

Action Items
[Action] | [Owner] | [Due Date]

Process: draft via Slack modal (incident commander) → submitted as GitHub incident issue comment → share in #incidents for 24h async review → 30-min sync with CTO + CISO (SEV-1/2) → create GitHub issues for all action items → mark incident resolved.


| Condition | Escalate To | Method |
| --- | --- | --- |
| On-call does not acknowledge within 15 min | CISO (backup) | Slack DM |
| Both on-call and backup unreachable | Next available responder | Phone |
| Active data breach suspected | CTO + Legal | Phone immediately |

GCP Cloud Monitoring routes SEV-1/2 alerts to Slack #incidents with CISO follow-up after 15 min without acknowledgment. SEV-3/4 routes to Slack #incidents only.

| Condition | Contact | When |
| --- | --- | --- |
| Confirmed customer data breach | Legal counsel | Immediately |
| Customer notification required | Legal + CTO | Within 24h of confirmed breach |
| Regulatory notification | Legal counsel | Per applicable regulation |
| Criminal activity suspected | Legal, then law enforcement | After legal consultation |
| Active intrusion beyond internal capability | CrowdStrike Overwatch | As needed |

| Vendor | Contact |
| --- | --- |
| CrowdStrike | support.crowdstrike.com |
| Google Cloud / GCP | cloud.google.com/support |
| Google Workspace | Admin console → Support |
| Cloudflare | dash.cloudflare.com → Support |
| 1Password | support.1password.com |
| Supabase | support.supabase.com |

Meridian Seven — Confidential