Incident response is not a skill you learn during a breach. The organisations that handle incidents well have done the work in advance — defined roles, documented playbooks, tested their communications, and practised their procedures. When an attack hits at 3 AM, the teams that have prepared execute confidently. The teams that haven't are improvising under maximum pressure. Our managed SOC provides the 24/7 detection and initial response capability that gives you the time to execute your playbook effectively.
This guide walks through how to build an incident response programme that works — not as a compliance checkbox, but as a functional capability that reduces breach impact when you need it most.
What Is Incident Response?
Incident response (IR) is the structured process of detecting, containing, eradicating, and recovering from security incidents, then learning from them to improve future defences.
NIST SP 800-61 defines four phases:
- Preparation — building capability before incidents occur
- Detection and Analysis — identifying and understanding incidents
- Containment, Eradication, and Recovery — stopping damage, removing threats, restoring operations
- Post-Incident Activity — learning and improving
Most organisations focus too much on phases 2–3 and too little on phase 1. Preparation is what determines whether phases 2 and 3 go smoothly.
Phase 1: Preparation
Build Your Incident Response Team (IRT)
Define who is responsible for what before an incident occurs:
| Role | Responsibilities | Who Fills It |
|---|---|---|
| Incident Commander (IC) | Overall coordination, decision authority | CISO, Security Director, or designate |
| Security Lead | Technical investigation and containment | Senior security engineer or SOC lead |
| IT/Infrastructure Lead | System isolation, recovery execution | IT Director or senior sysadmin |
| Legal Counsel | Legal obligations, regulatory notifications, privilege | General counsel or external legal |
| Communications Lead | Internal comms, customer/media communications | Head of Marketing, Comms, or PR |
| HR Representative | If insider threat involved | HR Director |
| Executive Sponsor | Strategic decisions, resource allocation | CTO, CEO, or equivalent |
For small organisations without dedicated security staff, some roles will overlap. Define who covers each role explicitly — ambiguity during an incident is costly.
External resources to pre-contract:
- IR retainer firm — a cybersecurity firm on retainer that can deploy forensic experts within hours. Don’t search for one during a breach. Our managed SOC includes an IR retainer for active incident support.
- Legal with breach response expertise — attorney-client privilege may attach to IR communications conducted under counsel's direction, which matters if litigation follows
- Cyber insurance broker — understand what your policy covers before you need to file a claim
- PR firm (for larger organisations) — experienced in data breach communications
Prepare Your Toolbox
Stock your IR toolkit before you need it:
Communication:
- An out-of-band communication channel (Signal, Wickr, or a separate email domain) — don’t rely on your potentially-compromised email/Slack during an incident
- Encrypted note-taking for sensitive case notes
- Pre-drafted email templates for customer notification, regulatory notification, employee communication
Technical tools:
- Forensic imaging: FTK Imager, dc3dd
- Memory acquisition: DumpIt, Magnet RAM Capture, WinPmem
- Log collection: Windows Event Log export scripts, Linux auditd
- Network packet capture: Wireshark, tcpdump
- Malware analysis: Sandbox (Any.run, Joe Sandbox, hybrid-analysis.com)
- IOC search: Grep across logs, Velociraptor, KAPE for rapid triage
- Offline antivirus: Bootable AV media for offline scanning without network connectivity
Documentation:
- Incident log template (timeline of events, actions taken, decisions made)
- Chain of custody form (for digital evidence if law enforcement may be involved)
- Network diagrams and asset inventory (printed or stored offline — don’t trust cloud docs that may be compromised)
Define Incident Severity Levels
Establish a severity classification before incidents occur:
| Severity | Definition | Response Time | Escalation |
|---|---|---|---|
| P1 — Critical | Active breach, data exfiltration, ransomware, business-stopping system compromise | Immediate (24/7) | Executive team + legal + external IR firm |
| P2 — High | Malware on a single system, credential compromise, phishing with payload | Immediate during business hours; within 1 hour after hours | Security lead + IT |
| P3 — Medium | Phishing email (no payload), policy violation, suspicious activity | Within 4 business hours | Security team |
| P4 — Low | Spam, AV alert (auto-remediated), failed brute force | Next business day | Ticket queue |
Tabletop Exercises
Run tabletop exercises quarterly. A tabletop is a facilitated discussion where the team walks through a simulated incident scenario — no systems involved, just thinking through decisions and process.
Scenarios to test:
- Ransomware: Your backup system is also encrypted. What do you do?
- Business email compromise: CFO’s email was compromised, $250K wire transfer was approved
- Data exfiltration: Insider took customer data before resigning. Where is it? What are your obligations?
- Supply chain: A vendor you use for HR software notifies you of a breach. What’s exposed?
- DDoS: Your e-commerce site is down on Black Friday. Is it DDoS or something more serious?
The goal is not to have perfect answers — it’s to surface gaps, clarify roles, and build muscle memory for decision-making under pressure.
Phase 2: Detection and Analysis
Detection Sources
Security incidents come to your attention through:
- SIEM alerts — correlated events from your monitoring platform
- EDR/MDR alerts — malware detection, suspicious process execution, lateral movement
- Threat intelligence — external notification that your IPs are in a botnet, your credentials are for sale
- External reports — security researcher, pen tester, vendor notification
- User reports — “my computer is acting weird,” “I clicked a phishing link”
- Service degradation — unexplained performance issues, encrypted files, changed admin passwords
- Ransomware note — at this point, the incident is already severe
Initial Triage Questions
When a potential incident is reported or detected, the first 30 minutes determine how well the response goes:
1. What was detected or reported? By whom, and when?
2. When did this potentially start? (incident timestamp vs. detection timestamp)
3. What systems/accounts/data are potentially affected?
4. Is this still actively happening?
5. Has any data potentially left the organisation?
6. What's the potential business impact?
7. Does this trigger any notification obligations (PCI, GDPR, HIPAA)?
8. What is the P1/P2/P3/P4 severity classification?
Document every question and answer. Your incident log starts now.
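A plain append-only text file with UTC timestamps is enough to start with. A minimal sketch in shell, where the `log_entry` helper, the `incident.log` filename, and the sample entries are illustrative rather than a standard format:

```shell
#!/bin/sh
# Append-only incident log: one UTC-timestamped line per entry.
# Never edit earlier lines; corrections go in as new entries.
LOG_FILE="${LOG_FILE:-incident.log}"

log_entry() {
    # $1 = author initials, remaining args = free-text note
    author="$1"; shift
    printf '%s [%s] %s\n' "$(date -u '+%Y-%m-%dT%H:%M:%SZ')" "$author" "$*" >> "$LOG_FILE"
}

# Hypothetical sample entries
log_entry "JS" "EDR alert: suspicious PowerShell on FIN-WS-042, reported by SOC"
log_entry "JS" "Classified P2; paged security lead"
```

Whoever is scribing runs this as facts come in; the timestamped file later becomes the backbone of the post-mortem timeline.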
Preserving Evidence
Before you change anything, preserve evidence:
- Do NOT reboot an affected system without first capturing volatile memory — RAM contains running processes, network connections, encryption keys, and attacker artifacts that are lost on reboot
- Capture system memory, e.g. with WinPmem: `winpmem_mini_x64.exe incident-memory.raw`
- Take a forensic image of affected disks before remediation
- Export event logs before clearing or rotation
- Capture network traffic if ongoing attack
- Screenshot and export relevant logs from cloud platforms before any changes
Collect evidence with an eye toward legal admissibility:
- Document chain of custody (who collected what, when, how)
- Use write blockers for disk imaging (prevent modification of evidence media)
- Hash all collected evidence (MD5 + SHA256) immediately after collection
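The hashing step can be scripted so it happens in one pass right after collection. A minimal sketch using the coreutils `md5sum` and `sha256sum` tools; the directory layout and the placeholder evidence file are illustrative:

```shell
#!/bin/sh
# Hash every file in the evidence directory (MD5 + SHA-256) immediately
# after collection, so any later tampering is detectable against the manifest.
EVIDENCE_DIR="${1:-./evidence}"
MANIFEST="$EVIDENCE_DIR/manifest.txt"

mkdir -p "$EVIDENCE_DIR"
printf 'placeholder' > "$EVIDENCE_DIR/memory.raw"   # stands in for real evidence

: > "$MANIFEST"
for f in "$EVIDENCE_DIR"/*; do
    [ -f "$f" ] || continue
    if [ "$f" != "$MANIFEST" ]; then                # don't hash the manifest itself
        printf 'MD5     %s  %s\n' "$(md5sum "$f" | cut -d' ' -f1)" "$f" >> "$MANIFEST"
        printf 'SHA256  %s  %s\n' "$(sha256sum "$f" | cut -d' ' -f1)" "$f" >> "$MANIFEST"
    fi
done
cat "$MANIFEST"
```

Store the manifest with the chain-of-custody form, not alongside the evidence alone.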
Scope the Incident
Before containment, understand what you’re dealing with:
```shell
# On potentially compromised systems — gather initial data quickly
# (commands depend on OS; adjust for Linux as needed)

# Current network connections
netstat -anob    # Windows
ss -antup        # Linux

# Recent processes
tasklist /v      # Windows
ps aux           # Linux

# Scheduled tasks (common persistence mechanism)
schtasks /query /fo list /v    # Windows
crontab -l; ls /etc/cron*      # Linux

# Recently modified files
Get-ChildItem C:\ -Recurse | Sort-Object LastWriteTime -Descending | Select -First 50
find / -mtime -7 -type f 2>/dev/null | head -50    # Linux, last 7 days

# Autorun locations (Windows)
Get-ItemProperty HKLM:\SOFTWARE\Microsoft\Windows\CurrentVersion\Run
Get-ItemProperty HKCU:\SOFTWARE\Microsoft\Windows\CurrentVersion\Run

# Active users
query session    # Windows
who; w           # Linux

# Event log — failed/successful logons (Windows)
Get-WinEvent -FilterHashtable @{LogName='Security'; Id=4624,4625} | Select -First 100
```
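To avoid losing this output to terminal scrollback, wrap each command so its output lands in a per-host, timestamped evidence directory. A minimal Linux-side sketch; the `run_step` helper and the directory naming are illustrative:

```shell
#!/bin/sh
# Capture live-triage command output into a timestamped, per-host
# evidence directory instead of the terminal. Substitute the
# OS-appropriate commands on Windows hosts.
HOST="$(hostname)"
OUTDIR="triage-${HOST}-$(date -u '+%Y%m%dT%H%M%SZ')"
mkdir -p "$OUTDIR"

run_step() {
    # $1 = output filename, remaining args = command to run
    name="$1"; shift
    echo "== $* ==" > "$OUTDIR/$name"
    "$@" >> "$OUTDIR/$name" 2>&1
}

run_step connections.txt ss -antup    # netstat -anob on Windows
run_step processes.txt   ps aux
run_step logins.txt      who
ls "$OUTDIR"
```

The resulting directory can then be hashed and added to the evidence manifest like any other artifact.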
Phase 3: Containment
Short-Term Containment
Immediate actions to stop the bleeding:
Network isolation:
- Isolate affected systems from the network (disable NIC, place in isolated VLAN, or remove from switch port)
- Do NOT shut systems down before memory capture
- Block attacker C2 domains/IPs at the firewall if identified
- Revoke compromised credentials immediately (passwords, API keys, OAuth tokens)
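For the C2-blocking step, it helps to generate the firewall rules from your IOC list rather than typing them by hand, and to review them before applying. A hedged sketch that emits `iptables` commands instead of executing them; the IPs are documentation-range placeholders, and the syntax should be translated for your firewall:

```shell
#!/bin/sh
# Turn a confirmed C2 IOC list into firewall block commands.
# Emits the rules rather than running them, so they can be reviewed
# and attached to a change ticket first.
cat > c2-iocs.txt <<'EOF'
198.51.100.23
203.0.113.77
EOF

while read -r ip; do
    [ -n "$ip" ] || continue
    echo "iptables -A OUTPUT -d $ip -j DROP"
    echo "iptables -A INPUT  -s $ip -j DROP"
done < c2-iocs.txt > block-commands.sh

cat block-commands.sh
```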
Account lockdown:
- Disable (not delete) compromised accounts
- Force password resets for all accounts that may have been exposed
- Revoke active sessions
Cloud-specific containment:
```shell
# AWS — isolate EC2 instance
aws ec2 modify-instance-attribute \
  --instance-id i-xxxx \
  --groups sg-isolated    # Security group with no inbound/outbound

# Disable IAM credentials
aws iam update-access-key \
  --user-name compromised-user \
  --access-key-id AKIAXXXX \
  --status Inactive

# Azure — isolate VM
az vm deallocate --resource-group rg --name vm-name
# Or apply NSG with deny-all rules
```
Long-Term Containment
While investigation continues, implement temporary security controls that let the business operate while you remediate:
- Enhanced monitoring on all remaining systems
- Additional authentication requirements for sensitive operations
- Temporary decommission of non-essential systems
- Restrict external communications from internal networks
Communication During Containment
Internal communications during an incident should be:
- Need-to-know only — don’t blast the whole company; contain information within the response team
- On out-of-band channels — if email or Slack is compromised, don’t use them for incident comms
- Documented — keep a log of who was told what and when
Executive communication: brief every 2–4 hours during active incidents. Executives need:
- What happened (brief summary)
- Current status (contained / ongoing / unknown scope)
- Business impact (what’s down, what data may be affected)
- What we’re doing about it
- When next update will be
Phase 4: Eradication and Recovery
Eradication
Remove all attacker artifacts and close all entry points:
- Identify all compromised systems — don’t assume the incident is limited to what you found first
- Remove malware — but preserve copies for forensic analysis before deletion
- Remove persistence mechanisms — scheduled tasks, startup entries, backdoor accounts, compromised SSH keys
- Patch the vulnerability exploited — don’t recover a system to the same vulnerable state
- Reset all credentials — any credential that may have been accessible on a compromised system must be rotated
- Revoke and reissue certificates — if PKI infrastructure was touched
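On Linux hosts, the persistence sweep can start from a scripted enumeration of the usual locations. A minimal sketch: review the output manually and preserve forensic copies before deleting anything; the report filename is illustrative.

```shell
#!/bin/sh
# Enumerate common Linux persistence locations during eradication.
# This finds candidates for review; it deliberately removes nothing.
{
    echo "== user crontabs =="
    crontab -l 2>/dev/null
    echo "== system cron =="
    ls /etc/cron* 2>/dev/null
    echo "== systemd units changed in last 30 days =="
    find /etc/systemd/system -type f -mtime -30 2>/dev/null
    echo "== SSH authorized_keys (review every key) =="
    find /root /home -name authorized_keys 2>/dev/null
} > persistence-review.txt
cat persistence-review.txt
```

Run it again after eradication and diff the two reports to confirm nothing reappeared.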
Rebuild vs. remediate:
- For ransomware, severe malware, or any case where you can’t be certain of the scope of compromise: rebuild from known-good backups or clean images
- Trying to clean a sophisticated attacker’s installation is unreliable — they leave multiple persistence mechanisms
Recovery
Restoring operations:
- Restore from clean backups — verify backups are clean (check backup creation timestamps against incident timeline)
- Validate systems before bringing online — AV scan, log review, integrity checks
- Monitor restored systems intensively — attackers commonly have dormant persistence that activates post-recovery
- Gradual restoration — bring critical systems back first, test thoroughly, then restore others
- Change all credentials — even on systems not directly compromised (assume lateral movement)
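The backup-timeline check can be scripted: any backup written at or after the estimated initial-compromise time is suspect. A minimal sketch assuming GNU `date` and `stat` (Linux); the paths, filenames, and dates are illustrative:

```shell
#!/bin/sh
# Compare each backup's modification time against the estimated
# initial-compromise time; anything at or after it may contain
# attacker artifacts and needs forensic review before restoration.
BACKUP_DIR=./backups
mkdir -p "$BACKUP_DIR"
touch -t 202404250300 "$BACKUP_DIR/fileserver-2024-04-25.tar.gz"   # demo: before compromise
touch -t 202405020300 "$BACKUP_DIR/fileserver-2024-05-02.tar.gz"   # demo: after compromise

COMPROMISE_EPOCH=$(date -d '2024-05-01 00:00 UTC' +%s)   # GNU date

for b in "$BACKUP_DIR"/*.tar.gz; do
    mtime=$(stat -c %Y "$b")                             # GNU stat
    if [ "$mtime" -ge "$COMPROMISE_EPOCH" ]; then
        echo "SUSPECT: $b (created after estimated compromise)"
    else
        echo "CANDIDATE: $b (predates compromise; verify integrity)"
    fi
done > backup-triage.txt
cat backup-triage.txt
```

A "CANDIDATE" backup still needs validation (AV scan, integrity checks) before it is trusted for restoration.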
Regulatory Notification
Many incidents trigger notification obligations. These are time-sensitive:
| Regulation | Trigger | Notification Deadline |
|---|---|---|
| GDPR | Personal data breach likely to result in risk to individuals | 72 hours to supervisory authority |
| PCI DSS | Cardholder data compromise | Immediately to payment brands and acquirer |
| HIPAA | Protected Health Information breach | Within 60 days of discovery (individuals + HHS) |
| SEC (US public companies) | Material cybersecurity incident | Within 4 business days of materiality determination |
| India CERT-In | Various cybersecurity incidents | Within 6 hours of discovery |
Engage legal counsel immediately if any regulation applies. Attorney-client privilege may apply to incident response communications.
Phase 5: Post-Incident Review
The Post-Mortem
Within 2 weeks of incident closure, conduct a blameless post-mortem. Blameless means: focus on systems and processes, not individual mistakes. If people are afraid of blame, they won’t be honest about what happened.
Agenda:
- Timeline reconstruction — precise sequence of events from initial compromise to full recovery
- Root cause analysis — what was the fundamental cause? (vulnerability exploited, human error, control gap)
- What worked? — detection, response, communications that went well
- What didn’t work? — gaps in detection, slow response, communication failures
- Action items — specific, assigned, time-bound improvements
Improvement Actions
Convert post-mortem findings into concrete actions:
| Finding | Action | Owner | Due |
|---|---|---|---|
| Attacker maintained access for 47 days before detection | Deploy EDR with detection rules for lateral movement | Security Lead | 2 weeks |
| No MFA on VPN — credential stuffing was initial vector | Enforce MFA on all VPN access | IT Director | 1 week |
| Customer notification took 72 hours | Pre-draft customer notification templates | Legal + Comms | 2 weeks |
| Backup restoration took 3 days | Test and document restoration procedures quarterly | IT Director | 1 month, then quarterly |
Playbook Examples
Document specific playbooks for the most likely incident types:
Ransomware Playbook (abbreviated)
DETECTION
□ Identify encrypted files (unusual extensions, ransom note)
□ Identify affected systems
□ Identify patient zero (first infected system)
IMMEDIATE CONTAINMENT (first 30 minutes)
□ Isolate all encrypted systems from network
□ Disable all file shares
□ Take all backup systems offline immediately (protect backups)
□ Activate out-of-band communications
□ Page: IC, Security Lead, IT Lead, Legal, Executive sponsor
INVESTIGATION
□ Capture memory from encrypted systems before power-off
□ Identify initial access vector (phishing? VPN creds? RDP brute force?)
□ Identify encryption scope (what's encrypted, what's not)
□ Verify backup integrity (are backups encrypted too?)
□ Identify ransomware family (ransom note, file extension, ID Ransomware)
□ Assess likelihood of exfiltration (many groups exfiltrate before encrypting)
DECISION POINT: Pay or recover?
□ Assess recovery feasibility (backup coverage vs. encrypted scope)
□ Engage cyber insurance
□ Engage external IR firm
□ Legal assessment of ransom payment implications (sanctions, reporting)
□ NEVER pay without legal clearance
RECOVERY
□ Identify clean recovery point in backups
□ Rebuild/restore affected systems
□ Patch initial access vector
□ Force credential resets across organisation
□ Restore from backup, validate, monitor
NOTIFICATION
□ Legal assessment of notification obligations
□ CERT-In (India): within 6 hours
□ Affected customers: per legal guidance
□ Cyber insurance claim
Phishing with Payload Playbook (abbreviated)
DETECTION
□ User reports clicking link or opening attachment
□ AV/EDR alert on endpoint
TRIAGE (first 15 minutes)
□ Collect the phishing email (headers, attachments, links) — without clicking
□ Submit attachment/URL to sandbox (Any.run, hybrid-analysis)
□ Check if payload executed (EDR telemetry, process tree)
□ Check for persistence (scheduled tasks, autoruns)
□ Check for lateral movement (new connections, account usage)
CONTAINMENT
□ Isolate affected endpoint
□ Reset user credentials (email, VPN, SaaS apps)
□ Block phishing domain/IP at email gateway and firewall
□ Search email platform for same email to other users → quarantine
□ Notify affected user
ERADICATION
□ If payload executed: forensic triage of endpoint, consider full rebuild
□ If payload did not execute: AV scan, verify clean, return to service
□ Update email gateway rules based on phishing indicators
POST-INCIDENT
□ User security awareness training
□ Test and update phishing simulation programme
CyberneticsPlus provides incident response planning, tabletop exercises, and active incident response for organisations across financial services, SaaS, and enterprise through our managed SOC service. We help you build the capability before you need it, and respond effectively when incidents occur. Contact us for an incident response readiness assessment.