BIPI

Writing YARA Rules That Catch Tomorrow's Sample, Not Yesterday's

Cybersecurity

YARA is twelve years old and most analysts still write rules that only catch the exact sample they had on their desk. Here's how to build rules that generalize across families without drowning your hunters in false positives.

By Arjun Raghavan, Security & Systems Lead, BIPI · January 7, 2024 · 8 min read

#yara #malware #threat-hunting

I've reviewed roughly 400 YARA rules in the last year, written by analysts at banks, telcos, and government CERTs. A clear majority fall into one of two failure modes. Either the rule is so tight it matches one sample and nothing else, or it's so loose it matches half of legitimate Windows binaries. Both are useless.

Good YARA rules sit in a narrow band. They generalize across a malware family or technique without lighting up benign software. Getting there requires deliberate choices about strings, hex patterns, regex, and condition logic. None of those choices are obvious from reading the YARA documentation.

Strings, Hex, and Regex Have Different Jobs

Plain text strings catch hardcoded URLs, mutex names, error messages, and PDB paths. They're fast and they survive most repacking. The trap is variant fragility: if the malware author changes "connection_failed" to "conn_fail", the rule misses. Use plain strings for content the author is unlikely to vary: legal notices, embedded copyright strings, distinctive error messages.
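As a minimal sketch, a rule anchored on author-stable plain strings might look like this (the PDB path, mutex name, and error message are hypothetical, stand-ins for whatever your family hardcodes):

```yara
rule loader_hardcoded_strings
{
    strings:
        // Content the author rarely varies between builds:
        $pdb = "D:\\work\\loader\\Release\\loader.pdb"
        $err = "unable to decrypt stage two"
        $mtx = "Global\\qx_loader_mtx"
    condition:
        // Require two, so one recycled string in unrelated code isn't enough
        2 of them
}
```

Requiring 2 of 3 rather than any single string buys tolerance for one string changing between variants without opening the rule up to coincidental matches.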

Hex patterns catch opcode sequences and structural artifacts. They survive string obfuscation but break on recompilation with different optimization flags. Use them for unique decoder stubs, distinctive crypto routines, or PE structure quirks. Wildcards (??) and jumps ([4-8]) give you tolerance for register variation and small offset changes.
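A hex pattern with that kind of tolerance built in might be sketched like this (the byte sequence is illustrative, not a real decoder stub; the comments mark which positions are wildcarded and why):

```yara
rule family_decoder_stub
{
    strings:
        // Illustrative XOR-decode loop shape:
        //   ?? nibble/byte wildcards absorb register choice,
        //   [4-8] jump absorbs small differences in offsets between builds.
        $stub = { 8B ?? 04 [4-8] 80 3? ?? 75 ?? C3 }
    condition:
        $stub
}
```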

Regex is the most expensive primitive: YARA has to hand bytes to a full regex engine rather than its fast fixed-string matcher. Restrict regex to anchored, short patterns. "/cmd\.exe \/c [a-z]+/" is fine; it has a fixed literal prefix the scanner can search for. "/.*payload.*/" will crater your scan performance across a 100GB sample corpus.
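The same contrast in rule form, with bounded repetition instead of an open-ended `+` (the pattern itself is a hypothetical example):

```yara
rule anchored_regex_example
{
    strings:
        // Good: fixed literal prefix, bounded repetition {1,16}
        $cmd = /cmd\.exe \/c [a-z]{1,16}/
        // Bad (shown as a comment, don't ship this):
        //   /.*payload.*/  -- unbounded .* forces wide scans per hit
    condition:
        $cmd
}
```

Bounding the quantifier (`{1,16}` instead of `+`) caps how far the engine can run past each candidate match, which matters at corpus scale.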

Condition Logic Is Where Rules Live or Die

The condition block is where you express "how many of these strings need to match, and where". This is where most analysts give up and write "any of them" or "all of them". Both are wrong for non-trivial families.

A good condition pattern: require at least one structural anchor (PE magic, specific section name, distinctive import) plus M of N family-specific strings. For example: pe.is_pe and pe.imports("crypt32.dll", "CryptStringToBinaryA") and 3 of ($cmd*) and 1 of ($url*). This says "PE file that imports a specific crypto function, with at least three of the command strings and at least one of the URL patterns". That kind of rule generalizes across a campaign while staying narrow enough to avoid noise.
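Written out as a full rule, the pattern above might look like this. The command strings and URL regex are hypothetical placeholders for your campaign's artifacts; the `uint16(0) == 0x5A4D` check is the standard MZ-header anchor used alongside the pe module:

```yara
import "pe"

rule campaign_generic_example
{
    strings:
        // Hypothetical family-specific command strings
        $cmd1 = "cmd.exe /c schtasks /create" ascii
        $cmd2 = "cmd.exe /c vssadmin delete" ascii
        $cmd3 = "reg add HKCU\\Software\\" ascii
        $cmd4 = "powershell -enc " ascii

        // Hypothetical C2 URL shape
        $url1 = /https?:\/\/[a-z0-9]{6,12}\.top\//

    condition:
        // Structural anchor: MZ header plus a specific import
        uint16(0) == 0x5A4D
        and pe.imports("crypt32.dll", "CryptStringToBinaryA")
        // M-of-N family strings: survives one or two strings changing
        and 3 of ($cmd*)
        and 1 of ($url*)
}
```

Four command strings with a 3-of threshold means any single string can change in the next build and the rule still fires.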

Performance Mistakes That Kill Hunts

  • Short strings (under 4 bytes) trigger atom warnings and force YARA into slow, near-brute-force scanning
  • Unanchored regex with .* destroys throughput on large corpora
  • Hundreds of strings in one rule slow the entire ruleset, not just that rule
  • Missing the 'fullword' modifier creates substring matches inside larger benign strings
  • Forgetting 'nocase' on Windows API names misses samples that vary capitalization
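The last two pitfalls are one modifier each. A small sketch of the modifiers in use (strings are illustrative):

```yara
rule modifier_examples
{
    strings:
        // fullword: won't match as a substring inside a longer benign token
        $a = "sekurlsa" ascii fullword
        // nocase: catches AdFind.exe, ADFIND.EXE, adfind.exe alike
        $b = "adfind.exe" nocase
        // wide + ascii: covers both UTF-16LE and ANSI copies in a PE
        $c = "cmd.exe /c" wide ascii nocase
    condition:
        any of them
}
```

Note that `wide` alone drops the ASCII match; stacking `wide ascii` is usually what you want when the string may appear in either encoding.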

Testing Against Real Corpora

Write the rule, then run it against three datasets before deploying. First, the original family sample set: if it doesn't catch what you wrote it for, fix it. Second, a clean Windows install plus the top 100 GitHub releases (PuTTY, 7-Zip, Notepad++, and so on): if it fires on those, tighten the condition. Third, your historical malware repository or VirusTotal Retrohunt: this catches both variants you didn't have and overlap with unrelated families.

Retrohunt is the highest-value test because it shows you whether the rule generalizes. A rule that matches your three reference samples and 800 historical samples in the same family is doing real work. A rule that only matches your three samples is a hash check with extra steps.

Version, Review, and Retire

Treat YARA rules like detection code. Git repo, peer review for every new rule, metadata fields for author, family, date, MITRE technique. Set a quarterly review where any rule that hasn't fired in 12 months gets re-evaluated. Either the threat moved on, the rule was always wrong, or the sample base shifted. All three demand action.
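A meta block carrying those fields might look like this (field names and values are a suggested convention, not a YARA requirement — YARA accepts arbitrary meta keys):

```yara
rule family_example_with_meta
{
    meta:
        author    = "a.raghavan"
        family    = "ExampleLoader"    // hypothetical family name
        date      = "2024-01-07"
        version   = "1.2"
        mitre     = "T1059.003"        // Windows Command Shell
        reference = "internal case ID"
    strings:
        $s = "example marker string"
    condition:
        $s
}
```

Because meta keys are free-form, the useful discipline is enforcing them in peer review or with a CI linter, since YARA itself won't.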

When YARA Isn't the Right Tool

YARA is for files. For network detection, you want Suricata or Snort. For runtime behavior, EDR rules or Sigma. For memory artifacts on live hosts, YARA still works (Volatility plugins, EDR memory scanning), but the rules need different conditions than disk-based YARA, since unpacked in-memory images lack the on-disk PE structure. Mixing scopes inside one rule is a common cause of the "works in the lab, fails in production" problem.
