
The First 72 Hours: A Practitioner's IR Playbook


The opening three days of an incident decide whether you contain a breach or amplify it. This is the hour-by-hour playbook we run when a Tier-1 alert escalates into a confirmed compromise.

By Arjun Raghavan, Security & Systems Lead, BIPI · March 1, 2024 · 8 min read


Every major incident we worked in 2023 followed the same arc. The first 72 hours either stabilize the environment or hand the adversary a second wind. The team that wins is the one that has rehearsed who picks up the phone at 2:14 AM and what they do for the next three days.

Hour 0 to 4: Convene and confirm

An alert is not an incident. Get a confirmed indicator before you wake the CFO. Pull the triggering EDR detection and validate it against the process tree, the parent process hash, and a second telemetry source. If the alert is real, declare. Use a tiered declaration model so a SOC analyst can call a Severity 2 without waiting for a director.
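A tiered declaration model can be written down as a small policy table. The sketch below is a minimal illustration, not our production tooling: the role names, the severity-to-role mapping for Severity 1 and 3, and the `can_declare` helper are all assumptions. Only the rule that a SOC analyst can call a Severity 2 comes from the text above.

```python
from enum import IntEnum


class Role(IntEnum):
    # Hypothetical seniority ladder; higher value means more authority.
    SOC_ANALYST = 1
    SOC_LEAD = 2
    DIRECTOR = 3


# Assumed policy: the lowest role allowed to declare each severity.
# Only "a SOC analyst can call a Severity 2" is taken from the playbook.
MIN_ROLE_TO_DECLARE = {
    1: Role.DIRECTOR,
    2: Role.SOC_ANALYST,
    3: Role.SOC_ANALYST,
}

# The triggering EDR detection plus at least one second telemetry source.
MIN_TELEMETRY_SOURCES = 2


def can_declare(role, severity, telemetry_sources):
    """Return True if this role may declare this severity on this evidence."""
    if severity not in MIN_ROLE_TO_DECLARE:
        raise ValueError(f"unknown severity: {severity}")
    # Count distinct sources so the same feed listed twice does not corroborate itself.
    corroborated = len(set(telemetry_sources)) >= MIN_TELEMETRY_SOURCES
    authorized = role >= MIN_ROLE_TO_DECLARE[severity]
    return corroborated and authorized
```

Keeping the policy in a table rather than in people's heads means the 2:14 AM analyst does not have to guess whether they are allowed to declare.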

Open the incident channel (we use a pre-created #ir-bridge with the right roster pinged automatically)
Assign the four roles: Incident Commander, Investigator Lead, Comms Lead, Scribe
Start the timeline log in the first ten minutes; if it is not written down, it did not happen
Snapshot the affected host with a KAPE triage collection before anyone reboots anything
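The timeline log from the list above does not need special tooling. A minimal sketch, assuming an append-only JSON Lines file with UTC timestamps; the `TimelineLog` class, its field names, and the file layout are hypothetical and not part of any tool named in this playbook.

```python
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
from pathlib import Path


@dataclass(frozen=True)
class TimelineEntry:
    timestamp_utc: str  # ISO 8601 in UTC so entries sort across time zones
    author: str         # who observed or did it
    role: str           # e.g. "Scribe", "Investigator Lead"
    event: str          # what happened, in plain language


class TimelineLog:
    """Append-only incident timeline stored as JSON Lines (assumed format)."""

    def __init__(self, path):
        self.path = Path(path)

    def record(self, author, role, event, when=None):
        # Default to "now"; a caller may back-date an entry with an aware datetime.
        when = when or datetime.now(timezone.utc)
        stamp = when.astimezone(timezone.utc).isoformat()
        entry = TimelineEntry(stamp, author, role, event)
        # Append mode: earlier entries are never rewritten.
        with self.path.open("a", encoding="utf-8") as fh:
            fh.write(json.dumps(asdict(entry)) + "\n")
        return entry

    def entries(self):
        if not self.path.exists():
            return []
        with self.path.open(encoding="utf-8") as fh:
            return [TimelineEntry(**json.loads(line)) for line in fh if line.strip()]
```

A call such as `TimelineLog("timeline.jsonl").record("scribe-1", "Scribe", "Severity 2 declared")` is enough to get the first entry on record inside the ten-minute window.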

Hour 4 to 24: Contain without burning evidence

Over-containment is the most common mistake we see. A panicked SOC isolates twelve hosts, the adversary notices, burns their access, and you lose the ability to follow them. Contain the hosts you have confirmed as compromised, and leave the suspected ones under heightened monitoring with EDR in block mode. Capture volatile memory with WinPMEM or AVML before isolation when feasible.
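The containment rule above reduces to a small decision function. This is a sketch under stated assumptions: the `Host` fields and `Action` names are hypothetical, and a real decision would also weigh business impact and what the adversary can see.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    CAPTURE_MEMORY_THEN_ISOLATE = "capture memory, then isolate"
    ISOLATE = "isolate"
    MONITOR = "heightened monitoring, EDR in block mode"


@dataclass(frozen=True)
class Host:
    name: str
    confirmed_compromised: bool
    memory_capture_feasible: bool = True


def containment_action(host):
    """Pick an action for one host following the confirmed/suspected split."""
    if not host.confirmed_compromised:
        # Suspected only: watch it, do not tip off the adversary by isolating.
        return Action.MONITOR
    if host.memory_capture_feasible:
        # Volatile memory first, because isolation or a reboot can destroy it.
        return Action.CAPTURE_MEMORY_THEN_ISOLATE
    return Action.ISOLATE


def containment_plan(hosts):
    """Map each host name to its action so the Incident Commander can review it."""
    return {h.name: containment_action(h) for h in hosts}
```

Writing the plan out per host before anyone clicks "isolate" is what keeps a twelve-host panic from happening.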

Hour 24 to 48: Scope ruthlessly

Scoping is where junior teams burn time. Use the indicators you have, lateral movement artifacts, and identity telemetry to map the full blast radius. Run Chainsaw or Hayabusa across the EVTX corpus you collected, and parse it with EvtxECmd if you need CSV or JSON for timelining. Look for 4624 Type 3 (network) logons from the patient zero host, 4688 process creations with unusual command lines, and service installations (7045) outside the change window.

Pull EVTX from every host that patient zero touched in the last 30 days
Run Chainsaw with the Sigma ruleset to surface lateral movement, persistence, and credential access
Cross-reference with identity logs: Azure AD sign-ins, Okta system log, IdP failures from new geos
Build the scope diagram before the 48-hour mark or you will lose the executive thread
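The three event checks described in this section can be sketched as a filter over already-parsed events. Everything about the input shape is an assumption: the dictionary keys (`host`, `event_id`, `logon_type`, `source_host`, `command_line`, `timestamp`) are hypothetical and do not match the output schema of EvtxECmd, Chainsaw, or Hayabusa, and the command-line markers are illustrative rather than a detection ruleset.

```python
from datetime import datetime

LOGON = 4624
PROCESS_CREATE = 4688
SERVICE_INSTALL = 7045
NETWORK_LOGON_TYPE = 3

# Illustrative markers only; a real hunt would lean on the Sigma ruleset.
SUSPICIOUS_MARKERS = ("-encodedcommand", " -enc ")


def flag_event(event, patient_zero, change_windows):
    """Return a reason string if the event matches one of the three checks, else None."""
    event_id = event["event_id"]
    if event_id == LOGON:
        if (event.get("logon_type") == NETWORK_LOGON_TYPE
                and event.get("source_host") == patient_zero):
            return "network logon from patient zero"
    elif event_id == PROCESS_CREATE:
        command_line = event.get("command_line", "").lower()
        if any(marker in command_line for marker in SUSPICIOUS_MARKERS):
            return "unusual command line"
    elif event_id == SERVICE_INSTALL:
        ts = event["timestamp"]
        # change_windows is a list of (start, end) datetime pairs.
        if not any(start <= ts <= end for start, end in change_windows):
            return "service install outside change window"
    return None


def scope_hits(events, patient_zero, change_windows):
    """Collect (host, reason) pairs to seed the scope diagram."""
    hits = []
    for event in events:
        reason = flag_event(event, patient_zero, change_windows)
        if reason:
            hits.append((event["host"], reason))
    return hits
```

The output is deliberately just hosts and reasons: that is the raw material for the scope diagram, and each hit still needs an analyst to confirm it.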

Hour 48 to 72: Communicate like an adult

Premature attribution kills credibility. We have watched a CISO tell the board it was nation-state activity on Day 2, then walk it back to a commodity loader on Day 4. The board never trusted that team again. Communicate what you know, what you are still investigating, and when the next checkpoint is. Give regulators a holding statement that buys you time on the legal clock without making promises.

If your comms lead has not rehearsed the holding statement before the incident, you are writing it under stress and it will read like it.

Common mistakes that cost organizations real money

Rebooting the patient zero host before memory was captured (lost beacon, lost case)
Resetting passwords before invalidating refresh tokens (attacker stays in via OAuth grants)
Telling staff there is an incident over email the attacker is currently reading
Calling cyber insurance after engaging counsel instead of before (privilege issues)
Declaring all-clear because EDR is quiet, ignoring that the dwell time was 47 days

Lessons we keep relearning

Run the tabletop quarterly with the same roster who will get paged. Pre-stage triage collection scripts on every endpoint so KAPE can run without a domain admin walking to the server room. Write the holding statement in peacetime. The team that responds well at 3 AM is the team that prepared in the daylight.
