Fuzzing Strategies for Pentesters: AFL++, libFuzzer, Honggfuzz, and Coverage Tactics

Choosing and running fuzzers for pentest engagements: AFL++ for binaries, libFuzzer for libraries, Honggfuzz for parallelism, and coverage-driven harness design.

By Arjun Raghavan, Security & Systems Lead, BIPI · December 30, 2024 · 11 min read

#fuzzing #afl++ #libfuzzer #honggfuzz #coverage

Fuzzing is the highest-yield bug-finding technique for native code. Use AFL++ for whole-binary fuzzing, libFuzzer for in-process library harnesses, and Honggfuzz for parallel and persistent modes. The fuzzer is the easy part. Harnesses, corpus, and coverage are where the real engineering lives.

Choosing Your Fuzzer

AFL++ (community fork of AFL): the default. Coverage-guided, fork-server, persistent mode, supports QEMU and Frida instrumentation for binary-only targets
libFuzzer: in-process, integrated with Clang. Best for library code where you can write LLVMFuzzerTestOneInput()
Honggfuzz: multi-process and multi-threaded by design, supports hardware perf-counter feedback, uses every core from a single instance, fast for big targets
syzkaller: kernel fuzzing driven by a syscall grammar, has found hundreds of Linux kernel CVEs since 2015
Jazzer: JVM fuzzer for Java and Kotlin, built on the libFuzzer engine
atheris: coverage-guided Python fuzzer for Python code and native extensions

Building a Harness

The harness is a function that takes raw bytes and exercises the target. For libFuzzer, write int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) that parses input as the relevant format and calls into the API. Keep harnesses small, deterministic, and free of global state. Compile with -fsanitize=fuzzer,address for combined fuzzing and ASan.
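To make the shape concrete, here is a minimal sketch of such a harness in C. The target parse_record and its length-prefixed format are hypothetical stand-ins for whatever API you are testing; only the LLVMFuzzerTestOneInput signature comes from libFuzzer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical target: parses a record laid out as
 * [1-byte length][payload] into a fixed-size buffer.
 * Returns the payload length, or -1 on malformed input. */
static int parse_record(const uint8_t *buf, size_t len, uint8_t out[255]) {
    if (len < 1) return -1;
    size_t n = buf[0];
    if (n > len - 1) return -1;      /* declared length exceeds the input */
    memcpy(out, buf + 1, n);
    return (int)n;
}

/* libFuzzer entry point: called once for every generated input. */
int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    uint8_t out[255];                /* local, so no state leaks between runs */
    int n = parse_record(data, size, out);
    if (n >= 0) {
        /* Cheap invariants: a failed assert aborts and is reported as a crash. */
        assert((size_t)n < size);
        assert(memcmp(out, data + 1, (size_t)n) == 0);
    }
    return 0;
}
```

Assuming the file is saved as harness.c (a placeholder name), something like clang -g -fsanitize=fuzzer,address harness.c -o harness followed by ./harness corpus/ should build and run it. The asserts turn silent logic errors into crashes the fuzzer can report, which is often worth the few extra lines.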

Coverage and Corpus

Coverage-guided fuzzers prioritize inputs that hit new code paths. Seed corpus matters enormously. For a JSON parser, seed with thousands of real JSON files, valid and invalid. For an image decoder, seed with PNGs, JPGs, and crafted broken files. AFL++ minimizes corpus with afl-cmin, then minimizes individual cases with afl-tmin.

Seed corpus from real-world data: parse logs, public samples, vendor test suites
Run for hours, then re-minimize, re-feed, and run again; this loop is corpus distillation
Track coverage with afl-showmap, llvm-cov for libFuzzer, or honggfuzz coverage reports
Dictionary support in AFL++ (-x dict.txt) accelerates discovery of magic bytes and protocol tokens
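The idea behind corpus minimization can be sketched as a greedy set cover: keep only the inputs that contribute coverage nobody else has contributed yet. The C below is a simplified illustration, not the algorithm afl-cmin actually implements. It assumes each input's coverage fits in a toy 64-edge bitmask, and the name minimize_corpus is made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Count set bits in a 64-bit coverage mask. */
static int popcount64(uint64_t x) {
    int n = 0;
    while (x) { x &= x - 1; n++; }
    return n;
}

/* Greedy set cover: repeatedly keep the input that adds the most
 * not-yet-covered edges. cov[i] is a toy 64-edge bitmap for input i.
 * Sets keep[i] to 1 for retained inputs and returns how many were kept. */
size_t minimize_corpus(const uint64_t *cov, size_t n, uint8_t *keep) {
    assert((cov != NULL && keep != NULL) || n == 0);
    uint64_t covered = 0;
    size_t kept = 0;
    for (size_t i = 0; i < n; i++) keep[i] = 0;

    for (;;) {
        size_t best = n;             /* n means "nothing useful found" */
        int best_gain = 0;
        for (size_t i = 0; i < n; i++) {
            int gain = popcount64(cov[i] & ~covered);
            if (gain > best_gain) { best_gain = gain; best = i; }
        }
        if (best == n) break;        /* no remaining input adds coverage */
        keep[best] = 1;
        covered |= cov[best];
        kept++;
    }
    return kept;
}
```

Real tools work on much larger maps and also weigh file size and execution time, but the intuition is the same: an input that reaches no new edges only costs cycles.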

Sanitizers

Fuzzing without sanitizers is half the value. ASan catches heap and stack memory errors, MSan catches uninitialized reads, UBSan catches integer overflow and undefined behavior, TSan catches data races. Run ASan+UBSan together as the default. MSan and TSan cannot be combined with ASan, so each needs its own build. Reserve MSan for targets where every linked library is instrumented too, because uninstrumented code produces false positives.

Every memory corruption CVE we have reported in 2024 started life as an ASan crash in a 10-minute fuzz session. Fuzzing is not magic, it is leverage.

Binary-only Fuzzing

For closed-source targets, AFL++ supports QEMU mode (user-mode emulation with coverage instrumentation), Unicorn mode (snapshot fuzzing), and Frida mode (in-process instrumentation via Frida-gum). Expect throughput 5x to 50x lower than with source instrumentation, but it works on firmware, mobile, and proprietary parsers.

Structure-Aware Fuzzing

Random bytes rarely produce valid protocol messages past the first parser layer. Use libprotobuf-mutator with a .proto schema to generate structurally valid protobufs. Use FuzzedDataProvider to carve typed values out of the raw input. For grammars, use Grammarinator or Domato. Google's OSS-Fuzz runs structure-aware harnesses for hundreds of projects in production.
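FuzzedDataProvider is a C++ header, so the sketch below is a hand-rolled C stand-in that shows the same typed-input idea: consume fields from the raw bytes and build a well-formed message before calling the target. Every name here (cursor, take_u8, take_u16, msg, build_msg, handle_msg) is hypothetical and exists only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read cursor over the fuzzer-supplied bytes. */
typedef struct { const uint8_t *p; size_t left; } cursor;

/* Consume one byte; yields 0 once the input is exhausted. */
static uint8_t take_u8(cursor *c) {
    if (c->left == 0) return 0;
    c->left--;
    return *c->p++;
}

/* Consume a little-endian 16-bit value. */
static uint16_t take_u16(cursor *c) {
    uint16_t lo = take_u8(c);
    uint16_t hi = take_u8(c);
    return (uint16_t)(lo | (hi << 8));
}

/* Hypothetical structured message the target API expects. */
typedef struct {
    uint8_t opcode;
    uint16_t flags;
    uint8_t body[64];
    size_t body_len;
} msg;

/* Hypothetical target: stands in for the real API under test. */
static int handle_msg(const msg *m) {
    return (m->opcode % 4 == 0 && m->body_len > 0) ? 1 : 0;
}

/* Map raw bytes onto typed fields so mutations change fields, not framing. */
static msg build_msg(const uint8_t *data, size_t size) {
    cursor c = { data, size };
    msg m;
    memset(&m, 0, sizeof m);
    m.opcode = (uint8_t)(take_u8(&c) % 8);   /* clamp to an assumed opcode range */
    m.flags = take_u16(&c);
    m.body_len = c.left < sizeof m.body ? c.left : sizeof m.body;
    if (m.body_len > 0) memcpy(m.body, c.p, m.body_len);
    assert(m.body_len <= sizeof m.body);
    return m;
}

int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    msg m = build_msg(data, size);
    handle_msg(&m);
    return 0;
}
```

Because every byte string maps to some valid msg, the fuzzer spends its time inside handle_msg rather than bouncing off framing checks. The trade-off is that the framing parser itself no longer gets fuzzed, so it usually deserves its own raw-bytes harness.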

Engagement Workflow

Identify high-value parsers: network protocol decoders, file format readers, deserialization sinks
Write minimal harness per target, compile with ASan+UBSan and coverage
Build seed corpus from project test cases, public samples, and crafted edge cases
Run each harness for 8 to 24 hours on multiple cores and watch for coverage saturation
Triage crashes with afl-collect or a custom dedup script, file CVEs with PoC and root cause
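For the custom dedup step in the workflow above, one common approach is to bucket crashes by a hash of the top few stack frames from the sanitizer report. The sketch below assumes the frames have already been extracted as function-name strings. The function name crash_bucket and the choice of FNV-1a are illustrative assumptions, not something any of the tools above mandate.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* FNV-1a hash over the top `depth` frames of a stack trace.
 * Crashes that share those frames land in the same bucket.
 * A separator is mixed in after each frame so that {"ab","c"}
 * and {"a","bc"} do not hash identically. */
uint64_t crash_bucket(const char *const *frames, size_t n, size_t depth) {
    assert(frames != NULL || n == 0);
    uint64_t h = 14695981039346656037ULL;      /* FNV-1a 64-bit offset basis */
    size_t use = n < depth ? n : depth;
    for (size_t i = 0; i < use; i++) {
        for (const char *s = frames[i]; *s; s++) {
            h ^= (uint8_t)*s;
            h *= 1099511628211ULL;             /* FNV-1a 64-bit prime */
        }
        h ^= 0xFF;                             /* frame separator */
        h *= 1099511628211ULL;
    }
    return h;
}
```

A depth of three to five frames is a reasonable starting point: too shallow and distinct bugs that both crash inside memcpy merge into one bucket, too deep and a single bug reached through different callers splits into many.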

CVEs Found by Fuzzing in 2024

CVE-2024-4577 PHP CGI argument injection, root cause caught via input fuzzing
Multiple ImageMagick CVEs from OSS-Fuzz across coders in 2024 (TIFF, PCX, PSD)
CVE-2024-43485 .NET CoreFX certificate parser overflow, found via SharpFuzz harness
Curl and OpenSSL continue to ship fixes from continuous OSS-Fuzz coverage

Common Pitfalls

Forgetting to disable randomization in the target (seeded RNGs, time-based values), which makes coverage feedback noisy
Letting the corpus explode without periodic afl-cmin runs, which slows the fuzzer to a crawl
Skipping sanitizers, which misses 90 percent of findable bugs
Fuzzing the wrong layer, such as network-level fuzzing when the bug lives in the parser called by the network layer

When to Skip Fuzzing

Pure-business-logic code, web app routes, and HTML rendering rarely benefit from byte-level fuzzing. Use property-based testing (Hypothesis, jqwik) for logic. Use scanners and manual review for web. Save fuzzers for parsers, decoders, deserializers, and crypto primitives where a single byte can pivot control flow.

The fuzzer you run consistently beats the fuzzer you tune perfectly once. Start small, run hot, harvest crashes, then iterate on harness and corpus.
