XML External Entity (XXE)
XML External Entity (XXE) is a security vulnerability that lets attackers interfere with an application's processing of XML data. By exploiting insecurely configured XML parsers, attackers can read sensitive files, perform server-side request forgery (SSRF), cause denial of service, and in some configurations achieve remote code execution.
What Is XML External Entity (XXE)?
XML External Entity (XXE) injection is a web security vulnerability that occurs when an XML parser processes external entity references within XML documents without proper validation or restriction. XML parsers by default are often configured to process Document Type Definitions (DTDs) and external entities, which can be exploited to read local files, interact with internal systems, or cause denial-of-service conditions. The vulnerability arises when user-controlled XML input is parsed by an application that has not disabled these potentially dangerous features.
External entities in XML are defined using the <!ENTITY> declaration within a DTD. An attacker can craft a malicious XML document that defines an external entity pointing to a local file path (e.g., file:///etc/passwd) or a remote URL. When the XML parser processes this document, it resolves the external entity and includes its content in the parsed output. If the application returns this output to the user or uses it in further processing, the attacker can extract sensitive data or manipulate application behavior.
XXE vulnerabilities are particularly dangerous in applications that process XML from untrusted sources, such as SOAP web services, RSS feeds, file upload features accepting XML formats (SVG, DOCX, XLSX), and configuration file parsers. XXE had its own category in the OWASP Top 10 2017 (A4:2017 – XML External Entities) and was merged into A05:2021 – Security Misconfiguration in the 2021 edition, reflecting the fact that the root cause is usually an insecurely configured parser. Despite widespread awareness, XXE continues to be discovered in modern applications, especially in legacy systems and third-party libraries.
How It Works
The application provides functionality that accepts XML data from users or external sources. This could be a SOAP API endpoint, a file upload feature that accepts XML-based formats (SVG, Office documents), a configuration parser, or any other feature that processes XML content.
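The application uses an XML parser that has not been configured to disable external entity processing or DTD resolution. Java's DocumentBuilderFactory still resolves external entities by default for backward compatibility. Older versions of other platforms did as well, including .NET's XmlDocument and XmlTextReader before .NET Framework 4.5.2 and PHP's libxml-based parsers such as SimpleXML before libxml2 2.9.0, so the risk depends on the parser and runtime version in use.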
The attacker creates an XML document containing a malicious DTD that defines an external entity. For example: <!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///etc/passwd">]>. The entity &xxe; can then be referenced within the XML document to trigger the file read.
When the XML parser processes the malicious document, it encounters the entity reference and attempts to resolve it. The parser reads the content from the specified external resource (a local file, remote URL, or internal network resource) and substitutes it into the XML document.
If the application returns the parsed XML content in a response, error message, or log entry, the attacker can read the contents of the external entity. In blind XXE scenarios where output is not directly visible, attackers can use out-of-band techniques to exfiltrate data to an external server they control.
Vulnerable Code Example
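import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RestController;
import org.w3c.dom.Document;

@RestController
public class XmlController {
    @PostMapping("/parse-xml")
    public ResponseEntity<?> parseXml(@RequestBody String xmlData) {
        try {
            // VULNERABLE: Default DocumentBuilderFactory allows
            // external entity processing and DTD resolution
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            DocumentBuilder db = dbf.newDocumentBuilder();
            InputStream is = new ByteArrayInputStream(xmlData.getBytes(StandardCharsets.UTF_8));
            Document doc = db.parse(is);
            String result = doc.getDocumentElement().getTextContent();
            return ResponseEntity.ok(Map.of("content", result));
        } catch (Exception e) {
            // Also risky: parser error messages can echo entity content
            return ResponseEntity.status(500).body(e.getMessage());
        }
    }
}
// An attacker can submit malicious XML:
// <?xml version="1.0"?>
// <!DOCTYPE foo [
//   <!ENTITY xxe SYSTEM "file:///etc/passwd">
// ]>
// <data>&xxe;</data>
//
// The parser will read /etc/passwd and include its contents
// in the response, exposing sensitive system files.
Secure Code Example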
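import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RestController;
import org.w3c.dom.Document;

@RestController
public class XmlController {
    @PostMapping("/parse-xml")
    public ResponseEntity<?> parseXml(@RequestBody String xmlData) {
        try {
            // SECURE: Disable DTDs and external entity processing
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            // Reject any document that contains a DOCTYPE declaration
            dbf.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            // Defense in depth: disable external general entities
            dbf.setFeature("http://xml.org/sax/features/external-general-entities", false);
            // Defense in depth: disable external parameter entities
            dbf.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
            // Disable XInclude processing
            dbf.setXIncludeAware(false);
            // Do not expand entity references into the DOM tree
            dbf.setExpandEntityReferences(false);
            DocumentBuilder db = dbf.newDocumentBuilder();
            InputStream is = new ByteArrayInputStream(xmlData.getBytes(StandardCharsets.UTF_8));
            Document doc = db.parse(is);
            String result = doc.getDocumentElement().getTextContent();
            return ResponseEntity.ok(Map.of("content", result));
        } catch (Exception e) {
            // Return a generic message so parser errors leak nothing
            return ResponseEntity.status(500).body("Invalid XML");
        }
    }
}
// With DOCTYPE declarations disallowed, any XML containing a DTD
// is rejected before entities are processed, which closes off the
// XXE attacks described here. The remaining settings are defense
// in depth in case DTD support is ever re-enabled.
Types of XXE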
Classic XXE (File Disclosure)
The most straightforward type of XXE attack. The attacker defines an external entity that references a local file on the server's filesystem using the file:// protocol. When the XML parser processes the entity reference, it reads the file's contents and includes them in the parsed document. If the application returns this content to the user, the attacker can read sensitive files such as /etc/passwd, configuration files containing database credentials, source code, or private keys. This attack is particularly effective when error messages or application responses echo the parsed XML content.
Blind XXE (Out-of-Band)
Used when the application does not return the parsed XML content in responses, so the attacker cannot read the entity's value directly. Attackers instead use out-of-band exfiltration to send data to an external server they control. The malicious DTD defines an entity that makes an HTTP or DNS request to the attacker's server, embedding file contents in the URL. For example, parameter entities can be used to read /etc/passwd and append its contents to the URL of an HTTP request. Blind XXE can also be used for Server-Side Request Forgery (SSRF), forcing the server to make requests to internal network resources.
Billion Laughs (DoS)
Usually grouped with XXE, this attack causes denial of service by exploiting XML entity expansion rather than external entities. The attack defines nested entities that expand exponentially when parsed, consuming massive amounts of memory and CPU. The classic "Billion Laughs" payload defines ten entities; each of the last nine references the previous one ten times, so a small XML document expands to one billion copies of a short string. Parsers that enforce no expansion limit can be crashed by just a few kilobytes of XML. This attack does not require external entity support, only DTD processing, making it effective even when file:// access is restricted.
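The arithmetic behind the expansion can be sketched in a few lines of Java. This is an illustrative sketch, not a reference implementation: the class and method names are hypothetical, the generated document is deliberately tiny (2 references per level, 3 levels), and the only parser setting used is the disallow-doctype-decl feature from the secure example above.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.SAXException;

// Hypothetical helper class, for illustration only.
class BillionLaughsSketch {

    // Copies of the base string produced when `levels` nested entities
    // each reference the previous entity `fanOut` times: fanOut^levels.
    static long expansions(int fanOut, int levels) {
        long total = 1;
        for (int i = 0; i < levels; i++) {
            total = Math.multiplyExact(total, fanOut);
        }
        return total;
    }

    // Builds a nested-entity document; keep fanOut and levels small.
    static String nestedEntityDocument(int fanOut, int levels) {
        StringBuilder xml = new StringBuilder("<!DOCTYPE lolz [<!ENTITY lol0 \"lol\">");
        for (int level = 1; level <= levels; level++) {
            xml.append("<!ENTITY lol").append(level).append(" \"");
            for (int i = 0; i < fanOut; i++) {
                xml.append("&lol").append(level - 1).append(';');
            }
            xml.append("\">");
        }
        return xml.append("]><lolz>&lol").append(levels).append(";</lolz>").toString();
    }

    // True if a factory that forbids DOCTYPE declarations rejects the input.
    static boolean rejectedByHardenedParser(String xml) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        try {
            dbf.newDocumentBuilder()
               .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return false;
        } catch (SAXException e) {
            return true;
        }
    }

    public static void main(String[] args) throws Exception {
        // Ten entities: one base string plus nine levels of tenfold nesting.
        long copies = expansions(10, 9);
        System.out.println(copies + " copies, about " + (copies * 3) + " bytes of \"lol\"");
        System.out.println("rejected: " + rejectedByHardenedParser(nestedEntityDocument(2, 3)));
    }
}
```

Because the hardened factory fails on the DOCTYPE declaration itself, the nested entities are never expanded, regardless of how deep the nesting goes.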
Impact
XXE vulnerabilities can lead to severe security breaches with consequences ranging from data theft to complete server compromise. The impact depends on the permissions of the application process, the network architecture, and the attacker's objectives.
Attackers can read any file the application process has permission to open, including configuration files containing database credentials, API keys, private encryption keys, source code, and system files like /etc/passwd (or /etc/shadow if the process runs as root). On Windows systems, files such as C:\Windows\win.ini can be read in the same way. This can lead to credential theft, exposure of business logic, and further system compromise.
XXE can be exploited to make the vulnerable server send HTTP requests to internal network resources that are not accessible from the internet. Attackers can scan internal ports, access internal APIs and admin panels, interact with cloud metadata services (like AWS EC2 metadata at 169.254.169.254), and potentially pivot to other internal systems.
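One way to cut off this request path at the parser level, for cases where DOCTYPE declarations cannot simply be rejected, is to install an EntityResolver that refuses every external lookup. The following is a minimal sketch under that assumption; the class and method names are hypothetical, and it does not address internal entity expansion.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper class, for illustration only.
class DenyingResolverSketch {

    // Parses the XML with a resolver that refuses every external entity
    // or external DTD, so the parser never opens the file or URL named
    // in a SYSTEM identifier. Returns the root element's text content.
    static String parseWithoutExternalFetches(String xml) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        DocumentBuilder db = dbf.newDocumentBuilder();
        db.setEntityResolver((publicId, systemId) -> {
            // The resolver is consulted before any resource is fetched.
            throw new SAXException("External entity blocked: " + systemId);
        });
        return db.parse(new InputSource(new StringReader(xml)))
                 .getDocumentElement()
                 .getTextContent();
    }
}
```

A document that references an entity declared with a SYSTEM identifier such as http://169.254.169.254/ fails to parse with this resolver in place, while documents without external references are parsed normally.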
Using entity expansion attacks like "Billion Laughs," attackers can cause the XML parser to consume excessive memory and CPU resources, leading to application crashes, server unavailability, and service degradation. If the parser enforces no expansion limit, even a small malicious XML document can bring down a production server in seconds.
In certain configurations, XXE can escalate to remote code execution. In PHP, the expect:// wrapper allows command execution through an entity URL, but only when the optional expect extension is installed. On systems with specific XML parsers or when combined with other vulnerabilities, attackers may be able to write files to disk, trigger deserialization flaws, or exploit SSRF to interact with services that enable code execution.
Prevention Checklist
The most effective defense is to disable Document Type Definition (DTD) processing entirely. Configure your XML parser to reject any XML containing a DOCTYPE declaration. If DTDs must be supported for legitimate use cases, disable external entity resolution and parameter entity processing. Every XML parsing library provides features to disable these dangerous capabilities.
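The secure example earlier covers DocumentBuilderFactory; the same idea applies to the other JAXP entry points. The sketch below shows settings commonly recommended for the JDK's built-in SAX, StAX, and XSLT factories. The class and method names are hypothetical, and third-party implementations may not recognize every feature or attribute shown, so treat it as a starting point to verify against the parser actually in use.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.stream.XMLInputFactory;
import javax.xml.transform.TransformerFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, for illustration only.
class HardenedJaxpFactories {

    // SAX: reject any document that contains a DOCTYPE declaration.
    static SAXParserFactory hardenedSax() throws Exception {
        SAXParserFactory spf = SAXParserFactory.newInstance();
        spf.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        return spf;
    }

    // StAX: turn off DTD support and external entity resolution.
    static XMLInputFactory hardenedStax() {
        XMLInputFactory xif = XMLInputFactory.newFactory();
        xif.setProperty(XMLInputFactory.SUPPORT_DTD, false);
        xif.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, false);
        return xif;
    }

    // XSLT: forbid fetching external DTDs and stylesheets over any protocol.
    static TransformerFactory hardenedTransformer() {
        TransformerFactory tf = TransformerFactory.newInstance();
        tf.setAttribute(XMLConstants.ACCESS_EXTERNAL_DTD, "");
        tf.setAttribute(XMLConstants.ACCESS_EXTERNAL_STYLESHEET, "");
        return tf;
    }

    // True if the hardened SAX parser accepts the document.
    static boolean saxAccepts(String xml) throws Exception {
        try {
            hardenedSax().newSAXParser()
                         .parse(new InputSource(new StringReader(xml)), new DefaultHandler());
            return true;
        } catch (SAXException e) {
            return false;
        }
    }
}
```

Centralizing these factories in one place, rather than configuring parsers at each call site, makes it easier to audit that no code path falls back to an unhardened default.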
Whenever possible, avoid XML altogether and use simpler data formats like JSON, which do not support external entity references or complex DTD processing. If your API or service does not require XML-specific features, migrating to JSON significantly reduces your attack surface. For new projects, prefer JSON unless XML is absolutely necessary.
Ensure all XML parsing libraries and dependencies are kept up to date with the latest security patches. Many older versions of XML parsers have XXE vulnerabilities or insecure defaults. Use dependency scanning tools to identify outdated libraries and regularly review security advisories for the XML processors your application uses.
Use XML schema validation (XSD) to enforce strict structure requirements for incoming XML documents. Reject any XML that does not conform to the expected schema. While schema validation alone does not prevent XXE, it provides defense-in-depth by limiting the complexity and content of XML documents the application will process.
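Schema validation runs its own XML parsing, so the validator needs the same hardening as any other parser. The sketch below assumes the JDK's built-in javax.xml.validation implementation and a hypothetical one-element schema; the class name, method name, and schema are illustrative only.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

// Hypothetical helper class, for illustration only.
class SchemaValidationSketch {

    // Hypothetical schema: a single <data> element containing a string.
    static final String XSD =
        "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
      + "<xs:element name=\"data\" type=\"xs:string\"/>"
      + "</xs:schema>";

    // True if the document validates against XSD. Both the schema factory
    // and the validator are barred from fetching external DTDs or schemas.
    static boolean conformsToSchema(String xml) throws SAXException, IOException {
        SchemaFactory sf = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        sf.setProperty(XMLConstants.ACCESS_EXTERNAL_DTD, "");
        sf.setProperty(XMLConstants.ACCESS_EXTERNAL_SCHEMA, "");
        Schema schema = sf.newSchema(new StreamSource(new StringReader(XSD)));

        Validator validator = schema.newValidator();
        validator.setProperty(XMLConstants.ACCESS_EXTERNAL_DTD, "");
        validator.setProperty(XMLConstants.ACCESS_EXTERNAL_SCHEMA, "");
        try {
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // Validation errors surface as SAXException by default.
            return false;
        }
    }
}
```

Under these assumptions, conformsToSchema("<data>hello</data>") returns true and a document with any other root element returns false.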
Run the application with minimal filesystem and network permissions. Use network segmentation to prevent the application server from accessing internal resources that should not be reachable. Implement egress filtering to block outbound connections to unexpected destinations, mitigating blind XXE and SSRF exploitation. Use dedicated service accounts with restricted read access.
Integrate Static Application Security Testing (SAST) tools into your CI/CD pipeline to detect insecurely configured XML parsers during development. Use Dynamic Application Security Testing (DAST) and penetration testing tools to actively probe for XXE vulnerabilities in deployed applications. Tools like OWASP ZAP, Burp Suite, and specialized XXE scanners can identify vulnerable endpoints.
Real-World Examples
Facebook XXE Vulnerability
Security researcher Reginaldo Silva discovered an XXE vulnerability in Facebook's infrastructure that allowed reading arbitrary files from Facebook's servers. The vulnerability was in Facebook's handling of OpenID, where a server-side component parsed attacker-supplied XML with external entities enabled. The flaw could also be used to reach internal network resources through SSRF techniques, and Facebook assessed that it could have been escalated to remote code execution.
SAP NetWeaver XXE
Multiple XXE vulnerabilities were discovered in SAP NetWeaver's web services. Attackers could exploit these flaws to read arbitrary files from SAP application servers, including configuration files containing database credentials and customer data. The vulnerabilities affected numerous SAP products and required extensive patching across enterprise installations worldwide.
Apache Struts XML Deserialization (CVE-2017-9805)
CVE-2017-9805 is often listed alongside XXE, but it is an unsafe deserialization flaw rather than an external entity flaw. The Apache Struts 2 REST plugin deserialized XML payloads with XStream without restricting the types that could be instantiated, which allowed remote attackers to execute arbitrary code on affected servers. Exploits appeared within days of disclosure. It demonstrates how unsafe handling of untrusted XML, not only external entities, can serve as an initial attack vector for complete system compromise.
Cisco WebEx XXE
Cisco disclosed an XXE vulnerability in WebEx Network Recording Player for Windows that could allow an attacker to execute arbitrary code. The vulnerability existed in the XML parser used to process WebEx recording files. A malicious recording file could exploit the XXE flaw to read sensitive files or execute commands on a victim's system when the recording was opened.