🚨 Incident Response August 28, 2025 · 11 min read

Incident Response Planning: Build Your IR Playbook

A well-rehearsed incident response plan reduces breach costs significantly. Learn how to build an IR playbook, define escalation paths, and run tabletop exercises your team will actually use.


Incident response is not a skill you learn during a breach. The organisations that handle incidents well have done the work in advance — defined roles, documented playbooks, tested their communications, and practised their procedures. When an attack hits at 3 AM, the teams that have prepared execute confidently. The teams that haven’t are improvising under maximum pressure. Our managed SOC provides the 24/7 detection and initial response capability that gives you the time to execute your playbook effectively.

This guide walks through how to build an incident response programme that works — not as a compliance checkbox, but as a functional capability that reduces breach impact when you need it most.


What Is Incident Response?

Incident response (IR) is the structured process of detecting, containing, eradicating, and recovering from security incidents, then learning from them to improve future defences.

NIST SP 800-61 defines four phases:

  1. Preparation — building capability before incidents occur
  2. Detection and Analysis — identifying and understanding incidents
  3. Containment, Eradication, and Recovery — stopping damage, removing threats, restoring operations
  4. Post-Incident Activity — learning and improving

Most organisations focus too much on phases 2–3 and too little on phase 1. Preparation is what determines whether phases 2–3 go smoothly.


Phase 1: Preparation

Build Your Incident Response Team (IRT)

Define who is responsible for what before an incident occurs:

| Role | Responsibilities | Who Fills It |
|---|---|---|
| Incident Commander (IC) | Overall coordination, decision authority | CISO, Security Director, or designate |
| Security Lead | Technical investigation and containment | Senior security engineer or SOC lead |
| IT/Infrastructure Lead | System isolation, recovery execution | IT Director or senior sysadmin |
| Legal Counsel | Legal obligations, regulatory notifications, privilege | General counsel or external legal |
| Communications Lead | Internal comms, customer/media communications | Head of Marketing, Comms, or PR |
| HR Representative | If insider threat involved | HR Director |
| Executive Sponsor | Strategic decisions, resource allocation | CTO, CEO, or equivalent |

For small organisations without dedicated security staff, some roles will overlap. Define who covers each role explicitly — ambiguity during an incident is costly.

External resources to pre-contract:

  • IR retainer firm — a cybersecurity firm on retainer that can deploy forensic experts within hours. Don’t search for one during a breach. Our managed SOC includes an IR retainer for active incident support.
  • Legal with breach response expertise — attorney-client privilege applies to IR communications, which is important if litigation follows
  • Cyber insurance broker — understand what your policy covers before you need to file a claim
  • PR firm (for larger organisations) — experienced in data breach communications

Prepare Your Toolbox

Stock your IR toolkit before you need it:

Communication:

  • An out-of-band communication channel (Signal, Wickr, or a separate email domain) — don’t rely on your potentially-compromised email/Slack during an incident
  • Encrypted note-taking for sensitive case notes
  • Pre-drafted email templates for customer notification, regulatory notification, employee communication

Technical tools:

  • Forensic imaging: FTK Imager, dc3dd
  • Memory acquisition: DumpIt, Magnet RAM Capture, WinPmem
  • Log collection: Windows Event Log export scripts, Linux auditd
  • Network packet capture: Wireshark, tcpdump
  • Malware analysis: Sandbox (Any.run, Joe Sandbox, hybrid-analysis.com)
  • IOC search: Grep across logs, Velociraptor, KAPE for rapid triage
  • Offline antivirus: Bootable AV media for offline scanning without network connectivity

Documentation:

  • Incident log template (timeline of events, actions taken, decisions made)
  • Chain of custody form (for digital evidence if law enforcement may be involved)
  • Network diagrams and asset inventory (printed or stored offline — don’t trust cloud docs that may be compromised)

Define Incident Severity Levels

Establish a severity classification before incidents occur:

| Severity | Definition | Response Time | Escalation |
|---|---|---|---|
| P1 — Critical | Active breach, data exfiltration, ransomware, business-stopping system compromise | Immediate (24/7) | Executive team + legal + external IR firm |
| P2 — High | Malware on single system, credential compromise, phishing with payload | Within 1 hour (business hours: immediate) | Security lead + IT |
| P3 — Medium | Phishing email (no payload), policy violation, suspicious activity | Within 4 business hours | Security team |
| P4 — Low | Spam, AV alert (auto-remediated), failed brute force | Next business day | Ticket queue |
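The classification above can be made mechanical at intake. A minimal triage sketch — the indicator names are illustrative assumptions, not tied to any specific alerting platform:

```python
# Minimal severity-triage sketch mirroring the P1-P4 table above.
# Indicator names are illustrative, not a standard taxonomy.

def classify_severity(indicators: set[str]) -> str:
    """Map observed indicators to a P1-P4 severity level."""
    p1 = {"active_breach", "data_exfiltration", "ransomware", "business_stopping"}
    p2 = {"malware_single_host", "credential_compromise", "phishing_with_payload"}
    p3 = {"phishing_no_payload", "policy_violation", "suspicious_activity"}
    if indicators & p1:
        return "P1"
    if indicators & p2:
        return "P2"
    if indicators & p3:
        return "P3"
    return "P4"  # spam, auto-remediated AV alerts, failed brute force

print(classify_severity({"ransomware"}))           # P1
print(classify_severity({"phishing_no_payload"}))  # P3
```

Encoding the table this way also forces the team to agree, in advance, on which indicators land in which bucket — which is exactly the ambiguity you don’t want to resolve at 3 AM.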

Tabletop Exercises

Run tabletop exercises quarterly. A tabletop is a facilitated discussion where the team walks through a simulated incident scenario — no systems involved, just thinking through decisions and process.

Scenarios to test:

  • Ransomware: Your backup system is also encrypted. What do you do?
  • Business email compromise: CFO’s email was compromised, $250K wire transfer was approved
  • Data exfiltration: Insider took customer data before resigning. Where is it? What are your obligations?
  • Supply chain: A vendor you use for HR software notifies you of a breach. What’s exposed?
  • DDoS: Your e-commerce site is down on Black Friday. Is it DDoS or something more serious?

The goal is not to have perfect answers — it’s to surface gaps, clarify roles, and build muscle memory for decision-making under pressure.


Phase 2: Detection and Analysis

Detection Sources

Security incidents come to your attention through:

  • SIEM alerts — correlated events from your monitoring platform
  • EDR/MDR alerts — malware detection, suspicious process execution, lateral movement
  • Threat intelligence — external notification that your IPs are in a botnet, your credentials are for sale
  • External reports — security researcher, pen tester, vendor notification
  • User reports — “my computer is acting weird,” “I clicked a phishing link”
  • Service degradation — unexplained performance issues, encrypted files, changed admin passwords
  • Ransomware note — at this point, the incident is already severe

Initial Triage Questions

When a potential incident is reported or detected, the first 30 minutes determine how well the response goes:

1. What was detected or reported? By whom, and when?
2. When did this potentially start? (incident timestamp vs. detection timestamp)
3. What systems/accounts/data are potentially affected?
4. Is this still actively happening?
5. Has any data potentially left the organisation?
6. What's the potential business impact?
7. Does this trigger any notification obligations (PCI, GDPR, HIPAA)?
8. What is the P1/P2/P3/P4 severity classification?

Document every question and answer. Your incident log starts now.
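One lightweight way to start that log is an append-only JSON-lines file, one timestamped entry per action or decision. A sketch — the field names are illustrative, not a standard schema:

```python
import json
from datetime import datetime, timezone

# Append-only incident log: one JSON object per line, UTC-timestamped.
# Field names are illustrative assumptions, not a standard schema.
def log_entry(path: str, actor: str, entry: str) -> dict:
    record = {
        "utc": datetime.now(timezone.utc).isoformat(),
        "actor": actor,
        "entry": entry,
    }
    with open(path, "a") as f:
        f.write(json.dumps(record) + "\n")
    return record

log_entry("incident-2025-001.jsonl", "security-lead",
          "EDR alert on HOST-042; triage started")
```

Append-only text files survive better than shared cloud docs during an incident, and the per-entry timestamps become the raw material for the post-mortem timeline.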

Preserving Evidence

Before you change anything, preserve evidence:

  • Do NOT reboot an affected system without first capturing volatile memory — RAM contains running processes, network connections, encryption keys, and attacker artifacts that are lost on reboot
  • Capture system memory: winpmem_mini_x64.exe incident-memory.raw
  • Take a forensic image of affected disks before remediation
  • Export event logs before clearing or rotation
  • Capture network traffic if ongoing attack
  • Screenshot and export relevant logs from cloud platforms before any changes

Collect evidence with an eye toward legal admissibility:

  • Document chain of custody (who collected what, when, how)
  • Use write blockers for disk imaging (prevent modification of evidence media)
  • Hash all collected evidence (MD5 + SHA256) immediately after collection
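Hashing can be scripted at collection time so the digests land straight in the chain-of-custody record. A sketch using Python’s standard hashlib, streaming so large disk images never need to fit in memory:

```python
import hashlib

def hash_evidence(path: str, chunk_size: int = 1 << 20) -> dict:
    """Compute MD5 and SHA256 of an evidence file in one pass,
    reading in 1 MiB chunks so large images stream from disk."""
    md5, sha256 = hashlib.md5(), hashlib.sha256()
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            md5.update(chunk)
            sha256.update(chunk)
    return {"md5": md5.hexdigest(), "sha256": sha256.hexdigest()}

# Small sample file standing in for a disk or memory image:
with open("sample.bin", "wb") as f:
    f.write(b"abc")
print(hash_evidence("sample.bin"))
# md5:    900150983cd24fb0d6963f7d28e17f72
# sha256: ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad
```

Run it immediately after acquisition, before the image leaves the collection workstation, and record both digests on the chain-of-custody form.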

Scope the Incident

Before containment, understand what you’re dealing with:

# On potentially compromised systems — gather initial data quickly
# (commands depend on OS, adjust for Linux as needed)

# Current network connections
netstat -anob  # Windows
ss -antup      # Linux

# Recent processes
tasklist /v    # Windows
ps aux         # Linux

# Scheduled tasks (common persistence mechanism)
schtasks /query /fo list /v  # Windows
crontab -l; ls /etc/cron*    # Linux

# Recently modified files
Get-ChildItem C:\ -Recurse -ErrorAction SilentlyContinue | Sort-Object LastWriteTime -Descending | Select-Object -First 50  # Windows
find / -mtime -7 -type f 2>/dev/null | head -n 50  # Linux, last 7 days

# Autorun locations
Get-ItemProperty HKLM:\SOFTWARE\Microsoft\Windows\CurrentVersion\Run
Get-ItemProperty HKCU:\SOFTWARE\Microsoft\Windows\CurrentVersion\Run

# Active users
query session  # Windows
who; w         # Linux

# Event log — failed/successful logons
Get-WinEvent -FilterHashtable @{LogName='Security'; Id=4624,4625} | Select -First 100

Phase 3: Containment

Short-Term Containment

Immediate actions to stop the bleeding:

Network isolation:

  • Isolate affected systems from the network (disable NIC, place in isolated VLAN, or remove from switch port)
  • Do NOT shut systems down before memory capture
  • Block attacker C2 domains/IPs at the firewall if identified
  • Revoke compromised credentials immediately (passwords, API keys, OAuth tokens)

Account lockdown:

  • Disable (not delete) compromised accounts
  • Force password resets for all accounts that may have been exposed
  • Revoke active sessions

Cloud-specific containment:

# AWS — isolate EC2 instance
aws ec2 modify-instance-attribute \
  --instance-id i-xxxx \
  --groups sg-isolated  # Security group with no inbound/outbound

# Disable IAM credentials
aws iam update-access-key \
  --user-name compromised-user \
  --access-key-id AKIAXXXX \
  --status Inactive

# Azure — isolate VM
az vm deallocate --resource-group rg --name vm-name
# Or apply NSG with deny-all rules

Long-Term Containment

While investigation continues, implement temporary security controls that let the business operate while you remediate:

  • Enhanced monitoring on all remaining systems
  • Additional authentication requirements for sensitive operations
  • Temporary decommission of non-essential systems
  • Restrict external communications from internal networks

Communication During Containment

Internal communications during an incident should be:

  • Need-to-know only — don’t blast the whole company; contain information within the response team
  • On out-of-band channels — if email or Slack is compromised, don’t use them for incident comms
  • Documented — keep a log of who was told what and when

Executive communication: brief every 2–4 hours during active incidents. Executives need:

  • What happened (brief summary)
  • Current status (contained / ongoing / unknown scope)
  • Business impact (what’s down, what data may be affected)
  • What we’re doing about it
  • When next update will be
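Those five items are worth pre-drafting as a fill-in-the-blanks template so briefings stay consistent under pressure. A minimal sketch — the wording and example values are placeholders:

```python
# Sketch of a pre-drafted executive status update. The five fields
# mirror the bullets above; all values here are placeholders.
EXEC_UPDATE = """\
INCIDENT STATUS UPDATE — {utc} UTC
What happened:    {summary}
Current status:   {status}
Business impact:  {impact}
Actions underway: {actions}
Next update at:   {next_update} UTC
"""

print(EXEC_UPDATE.format(
    utc="2025-08-28 14:00",
    summary="Ransomware detected on two file servers",
    status="Contained — affected hosts isolated",
    impact="Finance file shares offline; customer systems unaffected",
    actions="Forensic imaging, backup integrity verification",
    next_update="2025-08-28 16:00",
))
```

A fixed structure also makes it obvious when something is unknown — an empty field is a prompt, not an omission.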

Phase 4: Eradication and Recovery

Eradication

Remove all attacker artifacts and close all entry points:

  1. Identify all compromised systems — don’t assume the incident is limited to what you found first
  2. Remove malware — but preserve copies for forensic analysis before deletion
  3. Remove persistence mechanisms — scheduled tasks, startup entries, backdoor accounts, compromised SSH keys
  4. Patch the vulnerability exploited — don’t recover a system to the same vulnerable state
  5. Reset all credentials — any credential that may have been accessible on a compromised system must be rotated
  6. Revoke and reissue certificates — if PKI infrastructure was touched

Rebuild vs. remediate:

  • For ransomware, severe malware, or any case where you can’t be certain of the scope of compromise: rebuild from known-good backups or clean images
  • Trying to clean a sophisticated attacker’s installation is unreliable — they leave multiple persistence mechanisms

Recovery

Restoring operations:

  1. Restore from clean backups — verify backups are clean (check backup creation timestamps against incident timeline)
  2. Validate systems before bringing online — AV scan, log review, integrity checks
  3. Monitor restored systems intensively — attackers commonly have dormant persistence that activates post-recovery
  4. Gradual restoration — bring critical systems back first, test thoroughly, then restore others
  5. Change all credentials — even on systems not directly compromised (assume lateral movement)
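The timestamp check in step 1 can be made mechanical: given the suspected compromise start, the newest backup created strictly before it is the restore candidate. A sketch — still verify the chosen backup forensically before restoring:

```python
from datetime import datetime
from typing import Optional

def newest_clean_backup(backups: list[datetime],
                        compromise_start: datetime) -> Optional[datetime]:
    """Return the most recent backup taken before the suspected
    compromise began, or None if every backup post-dates it."""
    clean = [b for b in backups if b < compromise_start]
    return max(clean) if clean else None

backups = [datetime(2025, 8, 20), datetime(2025, 8, 25), datetime(2025, 8, 27)]
print(newest_clean_backup(backups, datetime(2025, 8, 26)))
# 2025-08-25 00:00:00 — the Aug 27 backup post-dates the compromise
```

Note the asymmetry: a `None` result is the “backups are also suspect” case, which should escalate the incident rather than trigger a restore.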

Regulatory Notification

Many incidents trigger notification obligations. These are time-sensitive:

| Regulation | Trigger | Notification Deadline |
|---|---|---|
| GDPR | Personal data breach likely to result in risk to individuals | 72 hours to supervisory authority |
| PCI DSS | Cardholder data compromise | Immediately to payment brands and acquirer |
| HIPAA | Protected Health Information breach | 60 days of discovery (individuals + HHS) |
| SEC (US public companies) | Material cybersecurity incident | 4 business days of materiality determination |
| India CERT-In | Various cybersecurity incidents | 6 hours of discovery |
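Because these clocks start at discovery (or, for the SEC, at the materiality determination), it helps to compute the concrete deadlines at triage time. A sketch using the fixed-window deadlines from the table — the SEC’s business-day rule is deliberately omitted because it needs counsel, not arithmetic:

```python
from datetime import datetime, timedelta, timezone

# Fixed notification windows from the table above, measured from
# discovery. SEC's 4-business-day rule is excluded: it runs from the
# materiality determination and needs legal counsel to compute.
WINDOWS = {
    "GDPR": timedelta(hours=72),
    "India CERT-In": timedelta(hours=6),
    "HIPAA": timedelta(days=60),
}

def notification_deadlines(discovered_at: datetime) -> dict:
    """Return the latest permissible notification time per regulation."""
    return {reg: discovered_at + delta for reg, delta in WINDOWS.items()}

discovered = datetime(2025, 8, 28, 3, 0, tzinfo=timezone.utc)
for reg, due in notification_deadlines(discovered).items():
    print(f"{reg}: notify by {due.isoformat()}")
```

A 6-hour CERT-In window from a 3 AM discovery expires at 9 AM the same day — which is exactly why the deadline calculation belongs in the triage checklist, not in someone’s head.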

Engage legal counsel immediately if any regulation applies. Attorney-client privilege may apply to incident response communications.


Phase 5: Post-Incident Review

The Post-Mortem

Within 2 weeks of incident closure, conduct a blameless post-mortem. Blameless means: focus on systems and processes, not individual mistakes. If people are afraid of blame, they won’t be honest about what happened.

Agenda:

  1. Timeline reconstruction — precise sequence of events from initial compromise to full recovery
  2. Root cause analysis — what was the fundamental cause? (vulnerability exploited, human error, control gap)
  3. What worked? — detection, response, communications that went well
  4. What didn’t work? — gaps in detection, slow response, communication failures
  5. Action items — specific, assigned, time-bound improvements

Improvement Actions

Convert post-mortem findings into concrete actions:

| Finding | Action | Owner | Due |
|---|---|---|---|
| Attacker maintained access for 47 days before detection | Deploy EDR with detection rules for lateral movement | Security Lead | 2 weeks |
| No MFA on VPN — credential stuffing was initial vector | Enforce MFA on all VPN access | IT Director | 1 week |
| Customer notification took 72 hours | Pre-draft customer notification templates | Legal + Comms | 2 weeks |
| Backup restoration took 3 days | Test and document restoration procedures quarterly | IT Director | Quarterly |

Playbook Examples

Document specific playbooks for the most likely incident types:

Ransomware Playbook (abbreviated)

DETECTION
□ Identify encrypted files (unusual extensions, ransom note)
□ Identify affected systems
□ Identify patient zero (first infected system)

IMMEDIATE CONTAINMENT (first 30 minutes)
□ Isolate all encrypted systems from network
□ Disable all file shares
□ Take all backup systems offline immediately (protect backups)
□ Activate out-of-band communications
□ Page: IC, Security Lead, IT Lead, Legal, Executive sponsor

INVESTIGATION
□ Capture memory from encrypted systems before power-off
□ Identify initial access vector (phishing? VPN creds? RDP brute force?)
□ Identify encryption scope (what's encrypted, what's not)
□ Verify backup integrity (are backups encrypted too?)
□ Identify ransomware family (ransom note, file extension, ID Ransomware)
□ Assess likelihood of exfiltration (many groups exfiltrate before encrypting)

DECISION POINT: Pay or recover?
□ Assess recovery feasibility (backup coverage vs. encrypted scope)
□ Engage cyber insurance
□ Engage external IR firm
□ Legal assessment of ransom payment implications (sanctions, reporting)
□ NEVER pay without legal clearance

RECOVERY
□ Identify clean recovery point in backups
□ Rebuild/restore affected systems
□ Patch initial access vector
□ Force credential resets across organisation
□ Restore from backup, validate, monitor

NOTIFICATION
□ Legal assessment of notification obligations
□ CERT-In (India): within 6 hours
□ Affected customers: per legal guidance
□ Cyber insurance claim

Phishing with Payload Playbook (abbreviated)

DETECTION
□ User reports clicking link or opening attachment
□ AV/EDR alert on endpoint

TRIAGE (first 15 minutes)
□ Collect the phishing email (headers, attachments, links) — without clicking
□ Submit attachment/URL to sandbox (Any.run, hybrid-analysis)
□ Check if payload executed (EDR telemetry, process tree)
□ Check for persistence (scheduled tasks, autoruns)
□ Check for lateral movement (new connections, account usage)

CONTAINMENT
□ Isolate affected endpoint
□ Reset user credentials (email, VPN, SaaS apps)
□ Block phishing domain/IP at email gateway and firewall
□ Search email platform for same email to other users → quarantine
□ Notify affected user

ERADICATION
□ If payload executed: forensic triage of endpoint, consider full rebuild
□ If payload did not execute: AV scan, verify clean, return to service
□ Update email gateway rules based on phishing indicators

POST-INCIDENT
□ User security awareness training
□ Test and update phishing simulation programme

CyberneticsPlus provides incident response planning, tabletop exercises, and active incident response for organisations across financial services, SaaS, and enterprise through our managed SOC service. We help you build the capability before you need it, and respond effectively when incidents occur. Contact us for incident response readiness assessment.

#incident response #IR planning #cybersecurity incident #SOC #DFIR #ransomware response #playbook
