XXE stands for XML External Entity injection.
An XXE vulnerability typically arises wherever an application accepts XML input, such as an XML file upload, and parses it without filtering. The parser can then be made to load malicious external files and code, leading to arbitrary file reading, command execution, internal network port scanning, attacks on internal network websites, denial-of-service (DoS) attacks, and other hazards. To understand XXE vulnerabilities, it's essential to first understand the basic structure of XML documents.
XML is a structured markup language for marking up electronic documents. It can be used to tag data and define data types, and it lets users define their own markup languages. An XML document consists of an XML declaration, an optional DTD (document type definition), and the document elements.
Introduction and Purpose of XML
XML is designed to transport and store data. XML documents form a tree structure that starts at the "root" and expands to "branches and leaves."
XML allows authors to define their own tags and document structures.
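The tree structure can be made visible by walking a parsed document. The sketch below uses Python's standard library purely for illustration; the note document and the walk helper are assumptions made for this example, not part of any particular application.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: <note> is the root, its children are the branches,
# and the text inside each child is a leaf.
DOC = """<note>
  <to>George</to>
  <from>John</from>
  <heading>Reminder</heading>
  <body>Don't forget the meeting!</body>
</note>"""


def walk(element, depth=0):
    """Return one indented line per element, depth-first from the root."""
    text = (element.text or "").strip()
    line = "  " * depth + element.tag
    if text:
        line += ": " + text
    lines = [line]
    for child in element:
        lines.extend(walk(child, depth + 1))
    return lines


root = ET.fromstring(DOC)
print("\n".join(walk(root)))
```

The indentation of each printed line reflects how deep the element sits below the root.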
Building Blocks of XML Documents
All XML documents (and HTML documents) are made up of the following simple building blocks:
Elements
Attributes
Entities
PCDATA
CDATA
Below is a brief description of each building block.
1. Element
Elements are the primary building blocks of XML and HTML documents. Elements can contain text, other elements, or be empty. Example:
<body>body text in between</body>
<message>some message in between</message>
Examples of empty HTML elements are "hr", "br", and "img".
2. Attribute
Attributes provide additional information about elements. Example: in <img src="computer.gif" />, the attribute name is src and its value is computer.gif.
3. Entity
Entities are variables used to define common text. An entity reference is a reference to an entity.
4. PCDATA
PCDATA stands for Parsed Character Data. PCDATA is text that the parser examines: entities in it are expanded and tags in it are treated as markup.
5. CDATA
CDATA stands for Character Data. CDATA is text that the parser does not examine: tags in it are not treated as markup and entities are not expanded.
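The building blocks above can be seen side by side in one small document. This is a minimal sketch using Python's standard library for illustration; the message document and its priority attribute are made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical document containing an element, an attribute,
# entity references inside PCDATA, and a CDATA section.
DOC = """<message priority="high">
  <text>5 &lt; 10 &amp; 10 &gt; 5</text>
  <raw><![CDATA[<b>not markup</b> &amp; not an entity]]></raw>
</message>"""

root = ET.fromstring(DOC)

print(root.tag)                 # element name: message
print(root.attrib["priority"])  # attribute value: high
print(root.find("text").text)   # PCDATA, entities expanded: 5 < 10 & 10 > 5
print(root.find("raw").text)    # CDATA, taken literally: <b>not markup</b> &amp; not an entity
```

The difference between the last two lines is the point: the same characters are expanded in PCDATA but passed through untouched in CDATA.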
XML Syntax Rules
All XML elements must have a closing tag
XML tags are case sensitive
XML must be properly nested
XML attribute values must be quoted
Special characters such as < and & must be written as entity references (&lt; and &amp;)
In XML, whitespace is preserved
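These rules can be checked by handing small fragments to a parser and seeing which ones it rejects. The sketch below uses Python's standard library for illustration; the is_well_formed helper and the sample fragments are assumptions made for this example.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts the fragment, False if it rejects it."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Hypothetical fragments, one per syntax rule.
SAMPLES = [
    ("closing tag present", "<p>text</p>"),
    ("closing tag missing", "<p>text"),
    ("tag case mismatch", "<Message>hi</message>"),
    ("improper nesting", "<b><i>text</b></i>"),
    ("unquoted attribute", "<note date=2024>x</note>"),
    ("quoted attribute", '<note date="2024">x</note>'),
    ("raw < in text", "<m>a < b</m>"),
    ("escaped < in text", "<m>a &lt; b</m>"),
]

for label, fragment in SAMPLES:
    print(label, "->", "accepted" if is_well_formed(fragment) else "rejected")

# Whitespace inside an element is kept exactly as written.
print(repr(ET.fromstring("<m>a   b</m>").text))
```

Only the fragments that follow every rule are accepted; the parser stops at the first violation rather than trying to repair the input.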
Function Introduction
Introduction to the file_get_contents Function
The file_get_contents() function reads an entire file into a string.
Introduction to php://input
php://input is a read-only stream that gives access to the raw data of the request body.
Combined as file_get_contents("php://input"), it reads the raw data submitted via POST.
Introduction to the simplexml_load_string Function
The simplexml_load_string() function in PHP converts a well-formed XML string into a SimpleXMLElement object.
XML Injection Echo Output Function
In PHP, you can use print_r() or echo to output the parsed content. When the parsed result is echoed back in the response, an attacker can read the value of an injected entity directly.
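The three PHP pieces above form a read, parse, echo pipeline. The sketch below mirrors that flow in Python's standard library for illustration only; it is not the PHP code itself. The handle_request function, the in-memory stream standing in for the request body, and the username and password fields are all assumptions made for this example.

```python
import io
import xml.etree.ElementTree as ET


def handle_request(body_stream):
    """Read the raw request body, parse it as XML, and echo one field back."""
    raw = body_stream.read()      # plays the role of file_get_contents("php://input")
    doc = ET.fromstring(raw)      # plays the role of simplexml_load_string()
    username = doc.findtext("username", default="")
    return "Hello, " + username   # plays the role of echo / print_r()


# Hypothetical POST body submitted by a client.
request_body = io.BytesIO(
    b"<user><username>alice</username><password>secret</password></user>"
)
print(handle_request(request_body))  # prints: Hello, alice
```

Whatever the parser places in the username element ends up in the response, which is why the parsing step is where untrusted input needs to be controlled.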
The purpose of a DTD (Document Type Definition) is to define the legal building blocks of an XML document.
A DTD can be declared inside the XML document or referenced externally.
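1. Internal Declaration: the DTD is wrapped in a DOCTYPE definition inside the XML document, in the form <!DOCTYPE root-element [element-declarations]>. Complete example:
<!DOCTYPE note [
<!ELEMENT note (to,from,heading,body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>
]>
<note>
<to>George</to>
<from>John</from>
<heading>Reminder</heading>
<body>Don't forget the meeting!</body>
</note>
2. External Declaration: the DTD lives in a separate file that the document references in the form <!DOCTYPE root-element SYSTEM "filename">. The external DTD file contains the element declarations:
<!ELEMENT note (to,from,heading,body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>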
DTD Entities
DTD entities are variables that define shortcuts for plain text or special characters. They can be declared internally or referenced externally.
Entities are divided into general entities and parameter entities.
1. A general entity is declared as <!ENTITY entityName "entity value"> and referenced as follows:
&entityName;
2. A parameter entity can only be used inside the DTD. It is declared as <!ENTITY % entityName "entity value"> and referenced as %entityName;.
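To see a general entity being declared and then referenced, the following sketch parses a document with an internal DTD. It uses Python's standard library for illustration, and it deliberately declares only internal entities with literal values; the entity names writer and reminder are made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: two internal general entities are declared in the
# DTD and then referenced from the element content.
DOC = """<!DOCTYPE note [
  <!ENTITY writer "George">
  <!ENTITY reminder "Don't forget the meeting!">
]>
<note><to>&writer;</to><body>&reminder;</body></note>"""

root = ET.fromstring(DOC)

print(root.findtext("to"))    # George
print(root.findtext("body"))  # Don't forget the meeting!
```

The parser substitutes each reference with the declared value before the application sees the text. XXE abuses exactly this substitution step by making the declared value come from an external source instead of a literal string.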
The protocols available to external entities differ between programs.
[Figure: protocols supported by different programs]
XXE Impact 1: Arbitrary File Reading
This case reads /etc/passwd. Some XML parsing libraries also support directory listing, which lets an attacker list directories, read files, and harvest credentials. For example, reading tomcat-users.xml yields account passwords that can be used to log in to Tomcat's manager and deploy a web shell.
The data that is read can also be sent to a remote server.
The content of the remote evil.dtd file is as follows:
After the XXE attack is triggered, the server sends the file content to the attacker's website.
XXE Impact 2: Execute System Commands
This case executes system commands in a PHP environment that has the expect extension installed. Other protocols may also allow system commands to be executed.
XXE Impact 3: Internal Network Port Scanning
This case probes ports 80 and 81 of 192.168.1.1. A "Connection refused" response for port 81 indicates that it is closed, while the absence of that error for port 80 indicates that it is open.
XXE Impact 4: Attacking Internal Network Websites
This case involves attacking an internal Struts2 website and executing system commands remotely.
How to Defend Against XXE Attacks
1. Use the Development Language's Built-In Function for Disabling External Entities
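Where an application never needs DTDs from its clients, a stricter variant of this defense is to refuse any document that carries a DOCTYPE at all, since entities can only be declared there. The sketch below shows that policy with Python's standard library for illustration; the parse_untrusted function and the DoctypeForbidden exception are names invented for this example, and the two-pass approach is one possible design, not the only one.

```python
import xml.etree.ElementTree as ET
import xml.parsers.expat


class DoctypeForbidden(ValueError):
    """Raised when untrusted XML carries a DOCTYPE declaration."""


def _reject_doctype(name, system_id, public_id, has_internal_subset):
    # Called by expat as soon as it encounters a DOCTYPE declaration.
    raise DoctypeForbidden("DOCTYPE declarations are not accepted: " + name)


def parse_untrusted(data):
    """Parse XML bytes, refusing any document that declares a DTD."""
    checker = xml.parsers.expat.ParserCreate()
    checker.StartDoctypeDeclHandler = _reject_doctype
    checker.Parse(data, True)   # first pass: raises if a DOCTYPE is present
    return ET.fromstring(data)  # second pass: build the element tree


# A plain document is parsed normally.
doc = parse_untrusted(b"<user><name>alice</name></user>")
print(doc.findtext("name"))  # alice

# A document that tries to declare an external entity is rejected up front.
payload = b'<!DOCTYPE r [<!ENTITY x SYSTEM "file:///etc/passwd">]><r>&x;</r>'
try:
    parse_untrusted(payload)
except DoctypeForbidden as error:
    print("rejected:", error)
```

Because the check runs before the tree is built, the entity declaration is never acted on. The cost is that legitimate documents with a DTD are rejected too, so this fits services that only exchange simple data payloads.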