Fuzzing Strategies for Pentesters: AFL++, libFuzzer, Honggfuzz, and Coverage Tactics
Choosing and running fuzzers for pentest engagements: AFL++ for binaries, libFuzzer for libraries, Honggfuzz for parallelism, and coverage-driven harness design.
By Arjun Raghavan, Security & Systems Lead, BIPI · December 30, 2024 · 11 min read
Fuzzing is one of the highest-yield bug-finding techniques for native code. Use AFL++ for whole-binary fuzzing, libFuzzer for in-process library harnesses, and Honggfuzz for parallel and persistent modes. The fuzzer is the easy part. Harnesses, corpus, and coverage are where the real engineering lives.
Choosing Your Fuzzer
- AFL++ (community fork of AFL): the default. Coverage-guided, fork-server, persistent mode, supports QEMU and Frida instrumentation for binary-only targets
- libFuzzer: in-process, integrated with Clang. Best for library code where you can write LLVMFuzzerTestOneInput()
- Honggfuzz: multi-process and multi-threaded by design, with optional hardware perf-counter feedback; a single instance spreads across all available cores, which makes it fast for big targets
- syzkaller: kernel fuzzing, syscall grammar, found hundreds of Linux kernel CVEs since 2015
- Jazzer: JVM fuzzer for Java and Kotlin, built on the libFuzzer engine
- atheris: coverage-guided Python fuzzer built on libFuzzer, for Python code and CPython native extensions
Building a Harness
The harness is a function that takes raw bytes and exercises the target. For libFuzzer, write int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) that parses input as the relevant format and calls into the API. Keep harnesses small, deterministic, and free of global state. Compile with -fsanitize=fuzzer,address for combined fuzzing and ASan.
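A minimal sketch of such a harness follows. The `parse_record` function and its length-prefixed format are hypothetical stand-ins for the real API under test; only the `LLVMFuzzerTestOneInput` signature and the compile flags come from libFuzzer itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical target: parses a record of the form [1-byte length][payload].
 * It stands in for the real API; it is not taken from any library. */
static int parse_record(const uint8_t *buf, size_t len,
                        uint8_t *out, size_t out_cap) {
    if (len < 1) return -1;                /* need the length byte */
    size_t payload_len = buf[0];
    if (payload_len > len - 1) return -1;  /* declared length exceeds input */
    if (payload_len > out_cap) return -1;  /* would overflow caller's buffer */
    memcpy(out, buf + 1, payload_len);
    return (int)payload_len;
}

/* libFuzzer entry point: called once per generated input. */
int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    uint8_t out[64];  /* local buffer, so no state leaks between runs */
    int n = parse_record(data, size, out, sizeof out);
    /* Invariant check: a violated assert shows up as a fuzzer crash. */
    assert(n >= -1 && n <= (int)sizeof out);
    return 0;         /* libFuzzer expects 0; other values are reserved */
}
```

Assuming the file is saved as harness.c, it can be built with clang -g -fsanitize=fuzzer,address harness.c -o harness and started with ./harness corpus/. The harness owns no globals and allocates nothing that outlives the call, which keeps runs reproducible.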
Coverage and Corpus
Coverage-guided fuzzers prioritize inputs that hit new code paths. Seed corpus matters enormously. For a JSON parser, seed with thousands of real JSON files, valid and invalid. For an image decoder, seed with PNGs, JPGs, and crafted broken files. AFL++ minimizes corpus with afl-cmin, then minimizes individual cases with afl-tmin.
- Seed corpus from real-world data: parse logs, public samples, vendor test suites
- Run for hours, then re-minimize, re-feed, and run again; this loop is corpus distillation
- Track coverage with afl-showmap, llvm-cov for libFuzzer, or honggfuzz coverage reports
- Dictionary support in AFL++ (-x dict.txt) accelerates discovery of magic bytes and protocol tokens
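The idea behind corpus minimization can be sketched as a greedy set cover over per-input coverage bitmaps. This is a conceptual illustration only, not the algorithm afl-cmin actually implements; the `coverage_t` type, the 256-edge map size, and `minimize_corpus` are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAP_WORDS 4  /* toy coverage map: 4 * 64 = 256 edges (assumed size) */

typedef struct {
    uint64_t edges[MAP_WORDS];  /* bit i set => input exercised edge i */
} coverage_t;

/* Count edges in `cov` that are not yet present in `seen`. */
static unsigned new_edges(const coverage_t *cov, const coverage_t *seen) {
    unsigned count = 0;
    for (size_t w = 0; w < MAP_WORDS; w++) {
        uint64_t fresh = cov->edges[w] & ~seen->edges[w];
        while (fresh) {       /* clear the lowest set bit until none remain */
            fresh &= fresh - 1;
            count++;
        }
    }
    return count;
}

/* Greedy set cover: repeatedly keep the input that adds the most unseen
 * edges. Sets keep[i] for retained inputs and returns how many were kept. */
size_t minimize_corpus(const coverage_t *inputs, size_t n, bool *keep) {
    assert(inputs != NULL && keep != NULL);
    coverage_t seen = {{0}};
    size_t kept = 0;
    for (size_t i = 0; i < n; i++) keep[i] = false;

    for (;;) {
        size_t best = n;
        unsigned best_gain = 0;
        for (size_t i = 0; i < n; i++) {
            if (keep[i]) continue;
            unsigned gain = new_edges(&inputs[i], &seen);
            if (gain > best_gain) { best_gain = gain; best = i; }
        }
        if (best == n) break;  /* no remaining input adds coverage */
        keep[best] = true;
        kept++;
        for (size_t w = 0; w < MAP_WORDS; w++)
            seen.edges[w] |= inputs[best].edges[w];
    }
    return kept;
}
```

Inputs whose coverage is a subset of what is already retained are dropped, which is why a distilled corpus stays small while preserving every edge the full corpus reached.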
Sanitizers
Fuzzing without sanitizers is half the value. ASan catches heap and stack memory errors, MSan catches uninitialized reads, UBSan catches signed integer overflow and other undefined behavior, and TSan catches data races. Run ASan+UBSan together as the default. MSan cannot be combined with ASan and needs its own build, so switch to it only when every linked library can be compiled with MSan instrumentation; uninstrumented code produces false positives.
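To see why sanitizers matter, consider a sketch with a deliberate off-by-one. The `read_field` function is hypothetical and the bug is intentional. In a plain build the stray read usually returns whatever byte sits past the buffer and the process carries on; built with -fsanitize=fuzzer,address, ASan reports a heap-buffer-overflow as soon as the fuzzer produces an input whose first byte equals the payload length.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical decoder, for illustration only: the first input byte is an
 * index into the remaining bytes. Returns the selected byte, or -1. */
static int read_field(const uint8_t *data, size_t size) {
    if (size < 2) return -1;
    size_t payload_len = size - 1;
    uint8_t *payload = malloc(payload_len);
    if (!payload) return -1;
    memcpy(payload, data + 1, payload_len);

    size_t idx = data[0];
    int result = -1;
    /* DELIBERATE BUG: the check should be idx < payload_len. When
     * idx == payload_len this reads one byte past the heap buffer. */
    if (idx <= payload_len)
        result = payload[idx];

    free(payload);
    return result;
}

int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    int r = read_field(data, size);
    assert(r >= -1 && r <= 255);  /* holds even when the bug fires */
    return 0;
}
```

The assert never trips on the bad read, because the stray byte is still a value in range. Without ASan the fuzzer has no signal that anything went wrong.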
Every memory corruption CVE we reported in 2024 started life as an ASan crash in a 10-minute fuzz session. Fuzzing is not magic; it is leverage.
Binary-only Fuzzing
For closed-source targets, AFL++ supports QEMU mode (user-mode emulation with coverage instrumentation), Unicorn mode (emulating selected code from a memory snapshot), and Frida mode (in-process instrumentation via Frida-gum). Expect throughput roughly 5x to 50x lower than with source instrumentation, depending on mode and target, but it works on firmware, mobile, and proprietary parsers.
Structure-Aware Fuzzing
Random bytes rarely produce valid protocol messages past the first parser layer. Use libprotobuf-mutator with a .proto schema to generate structurally valid protobufs. Use FuzzedDataProvider to carve the raw input into typed values. For grammars, use Grammarinator or Domato. Google's OSS-Fuzz continuously fuzzes more than a thousand open-source projects, some of them with structure-aware harnesses.
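The typed-input idea can be sketched in plain C. FuzzedDataProvider itself is a C++ header shipped with LLVM, so the `input_cursor` helpers below are a hand-rolled analog written for this example, and the `message` layout and `handle_message` target are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Cursor over the fuzzer-supplied bytes. */
typedef struct {
    const uint8_t *data;
    size_t remaining;
} input_cursor;

/* Consume one byte; yields 0 once the input is exhausted. */
static uint8_t take_u8(input_cursor *c) {
    if (c->remaining == 0) return 0;
    c->remaining--;
    return *c->data++;
}

/* Consume a little-endian 16-bit value and map it into [lo, hi]. */
static uint16_t take_u16_in_range(input_cursor *c, uint16_t lo, uint16_t hi) {
    assert(lo <= hi);
    uint32_t low = take_u8(c);   /* separate statements fix the read order */
    uint32_t high = take_u8(c);
    uint32_t raw = low | (high << 8);
    uint32_t span = (uint32_t)hi - lo + 1;
    return (uint16_t)(lo + raw % span);
}

/* Hypothetical message the target expects: typed header plus payload. */
typedef struct {
    uint8_t  opcode;       /* one of 4 commands */
    uint16_t window;       /* 1..1024 */
    size_t   payload_len;
    uint8_t  payload[32];
} message;

/* Build a structurally valid message from arbitrary bytes. */
static message build_message(const uint8_t *data, size_t size) {
    input_cursor c = { data, size };
    message m;
    memset(&m, 0, sizeof m);
    m.opcode = take_u8(&c) % 4;
    m.window = take_u16_in_range(&c, 1, 1024);
    m.payload_len = c.remaining < sizeof m.payload ? c.remaining
                                                   : sizeof m.payload;
    if (m.payload_len > 0) memcpy(m.payload, c.data, m.payload_len);
    return m;
}

/* Stand-in for the real target entry point (hypothetical). */
static int handle_message(const message *m) {
    return m->opcode < 4 && m->window >= 1 && m->window <= 1024;
}

int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    message m = build_message(data, size);
    handle_message(&m);
    return 0;
}
```

Every input, however short or random, now yields a header that passes the first validation layer, so the fuzzer spends its mutations on the logic behind it.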
Engagement Workflow
- Identify high-value parsers: network protocol decoders, file format readers, deserialization sinks
- Write minimal harness per target, compile with ASan+UBSan and coverage
- Build seed corpus from project test cases, public samples, and crafted edge cases
- Run 8 to 24 hours per harness, multiple cores, monitor coverage saturation
- Triage crashes with afl-collect or a custom dedup script, then report each unique bug with a PoC and root cause, requesting a CVE where appropriate
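The core of a custom dedup script can be sketched as a bucket key derived from the sanitizer error type and the top few stack frames. The `crash_bucket` function and the three-frame depth are assumptions for illustration. Symbolized frame names are assumed to have been extracted from the sanitizer report already.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOP_FRAMES 3  /* assumed bucket depth; tune per target */

/* FNV-1a over a string, continuing from a running hash. */
static uint64_t fnv1a(uint64_t h, const char *s) {
    for (; *s; s++) {
        h ^= (uint8_t)*s;
        h *= 0x100000001b3ULL;  /* FNV 64-bit prime */
    }
    return h;
}

/* Bucket key for one crash: hash of the error type plus the top
 * TOP_FRAMES function names, innermost frame first. */
uint64_t crash_bucket(const char *error_type,
                      const char *const *frames, size_t n_frames) {
    assert(error_type != NULL && (frames != NULL || n_frames == 0));
    uint64_t h = 0xcbf29ce484222325ULL;  /* FNV-1a 64-bit offset basis */
    h = fnv1a(h, error_type);
    size_t depth = n_frames < TOP_FRAMES ? n_frames : TOP_FRAMES;
    for (size_t i = 0; i < depth; i++) {
        h = fnv1a(h, "|");  /* separator keeps frame boundaries distinct */
        h = fnv1a(h, frames[i]);
    }
    return h;
}
```

Crashes that share an error type and innermost frames land in the same bucket even when the outer call chain differs, which keeps hundreds of crashing inputs from one bug out of the report. Bucketing by stack hash is a heuristic: it can merge distinct bugs or split one bug, so a human still reviews each bucket.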
CVEs Found by Fuzzing in 2024
- CVE-2024-4577 PHP CGI argument injection on Windows, an input-handling flaw in the class that input fuzzing targets
- Multiple ImageMagick CVEs from OSS-Fuzz across coders in 2024 (TIFF, PCX, PSD)
- CVE-2024-43485 .NET denial of service in System.Text.Json deserialization, the kind of deserializer bug a SharpFuzz harness is built to reach
- Curl and OpenSSL continue to ship fixes from continuous OSS-Fuzz coverage
Common Pitfalls
- Forgetting to disable randomization in the target, which makes coverage feedback noisy
- Letting the corpus explode without periodic cmin runs, which slows the fuzzer to a crawl
- Skipping sanitizers, which hides the many memory errors that corrupt state without crashing
- Fuzzing the wrong layer, such as fuzzing at the network level when the bug lives in the parser called by the network layer
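The first pitfall can be sketched as follows. The `process` function is a hypothetical target that branches on the C PRNG. The harness pins the seed before each call and asserts that two runs on the same input agree. How randomization is disabled in a real target depends on that target; reseeding `rand` is only one possibility.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical target: picks a code path based on the C PRNG, the way a
 * real target might randomize a hash seed or a probing order. */
static unsigned process(const uint8_t *data, size_t size) {
    unsigned acc = (((unsigned)rand()) & 1u) ? 17u : 42u;  /* random branch */
    for (size_t i = 0; i < size; i++)
        acc = acc * 31u + data[i];
    return acc;
}

/* Pin the PRNG seed so the same input always takes the same path. */
static unsigned run_seeded(const uint8_t *data, size_t size) {
    srand(1);
    return process(data, size);
}

int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    unsigned first = run_seeded(data, size);
    unsigned second = run_seeded(data, size);
    assert(first == second);  /* determinism check: fails loudly if noisy */
    return 0;
}
```

Running the target twice halves throughput, so the double-run check is best kept for a short validation pass and removed for the long campaign.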
When to Skip Fuzzing
Pure-business-logic code, web app routes, and HTML rendering rarely benefit from byte-level fuzzing. Use property-based testing (Hypothesis, jqwik) for logic. Use scanners and manual review for web. Save fuzzers for parsers, decoders, deserializers, and crypto primitives where a single byte can pivot control flow.
The fuzzer you run consistently beats the fuzzer you tune perfectly once. Start small, run hot, harvest crashes, then iterate on harness and corpus.