Mastering TLS Fingerprint Blocking: Techniques for Bypassing Fingerprint Recognition and Anti-Crawling Shields

1. Introduction: TLS Fingerprint Blocking

In previous projects, it was discovered that some websites produce different results depending on the client used. For example, accessing the site with a browser might work fine, but using a script written in Python or making requests with curl would be blocked. Attempts were made to replicate data packets exactly, but this still did not resolve the issue. This behavior is often related to TLS Fingerprint Blocking.

Test fingerprint blocking site: https://ascii2d.net

Recently, I read an insightful article titled “Bypassing Cloudflare Fingerprint Shield”. The issue I had run into earlier appears to be the same one: when writing crawlers that hit fingerprint shields (anti-crawling mechanisms), my workaround had been to drive a real browser with Selenium. This time I learned a different approach.

The content is mainly divided into two parts: 1. Bypassing TLS fingerprint recognition, 2. Bypassing Akamai fingerprint (HTTP/2 fingerprint) recognition.

2. TLS Fingerprint Blocking

2.1. What is TLS Fingerprint Blocking?

A TLS fingerprint is a compact signature derived from the way a client performs the TLS (Transport Layer Security) handshake.

It is built from the fields of the ClientHello message, such as the **protocol version, cipher suites, extensions, and elliptic curves** the client offers. Since different TLS implementations offer different values in a different order, a server can compare the fingerprint against known values to decide whether the connection comes from the kind of client it expects, regardless of the HTTP headers sent afterwards.

TLS fingerprints can help detect security threats such as spoofing, man-in-the-middle attacks, and espionage, and can also be used for device and application identification and management.

The principle of TLS fingerprint recognition (ja3 algorithm): https://github.com/salesforce/ja3


2.2. Testing TLS Fingerprint Blocking

This section compares the fingerprints (ja3_hash) produced by different clients.

For in-depth analysis, you can use Wireshark to capture and analyze TLS packets.

Test site: https://tls.browserleaks.com/json
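One way to compare clients is to save the JSON each one receives from the test site and diff the fingerprint fields. The sketch below assumes the response contains the keys ja3_hash, ja3_text, and akamai_hash (the names used elsewhere in this article); the helper names and the sample payloads are hypothetical, and fetch_fingerprint is defined but not called, so nothing touches the network here.

```python
import json
import urllib.request

# Keys assumed to be present in the test site's JSON response.
FIELDS = ("ja3_hash", "ja3_text", "akamai_hash")


def extract_fingerprint(payload: str) -> dict:
    """Keep only the fingerprint-related keys of a JSON response body."""
    data = json.loads(payload)
    return {key: data.get(key) for key in FIELDS}


def fetch_fingerprint(url: str = "https://tls.browserleaks.com/json",
                      timeout: float = 10.0) -> dict:
    """Query the test site with urllib and return the fingerprint fields."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return extract_fingerprint(resp.read().decode())


def differing_fields(first: dict, second: dict) -> list:
    """Return the names of the fingerprint fields that differ."""
    return [key for key in FIELDS if first.get(key) != second.get(key)]


# Made-up payloads standing in for two saved responses (not real fingerprints).
client_a = extract_fingerprint('{"ja3_hash": "aaa", "akamai_hash": "bbb"}')
client_b = extract_fingerprint('{"ja3_hash": "ccc", "akamai_hash": "bbb"}')
print(differing_fields(client_a, client_b))
```

Calling fetch_fingerprint() from a script and comparing the result with the JSON saved from a browser shows which fields give the script away.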

  • curl 7.79.1
  • curl 7.68.0
  • Chrome 112.0.5615.137 (Official Build) (x86_64)
  • Burp Chromium 103.0.5060.114 (Official Build) (x86_64)
  • Python 2.11.1

The fingerprints clearly differ between clients. Taking the ja3_text of the last client (Python) as an example, its comma-separated fields are:

  • The first value 771: The TLS version offered in the ClientHello. 771 is 0x0303, i.e., TLS 1.2.
  • The second value 4866-4867-4865-49196-49200-49195-49199-163-159-162-158-49327-49325-49188-49192-49162-49172-49315-49311-107-106-57-56-49326-49324-49187-49191-49161-49171-49314-49310-103-64-51-50-52393-52392-49245-49249-49244-49248-49267-49271-49266-49270-52394-49239-49235-49238-49234-196-195-190-189-136-135-69-68-157-156-49313-49309-49312-49308-61-60-53-47-49233-49232-192-186-132-65-255: The cipher suites offered by the client, in order of preference.
  • The third value 0-11-10-35-22-23-13-43-45-51-21: The TLS extensions sent by the client, such as SNI (0).
  • The fourth value 29-23-30-25-24: The supported elliptic curves (supported groups), e.g., 29 is x25519 and 23 is secp256r1.
  • The fifth value 0-1-2: The supported elliptic curve point formats.
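The field layout above can be sketched in a few lines. In JA3, the hash is the MD5 digest of the comma-separated string, so ja3_hash follows directly from ja3_text. The function names and the shortened example string below are hypothetical and only illustrate the format.

```python
import hashlib

# Names for the four list-valued fields that follow the TLS version.
LIST_FIELDS = ("ciphers", "extensions", "supported_groups", "ec_point_formats")


def parse_ja3(ja3_text: str) -> dict:
    """Split a JA3 string into its five fields."""
    parts = ja3_text.split(",")
    if len(parts) != 5:
        raise ValueError("a JA3 string has exactly five comma-separated fields")
    version, *lists = parts
    parsed = {"tls_version": int(version)}
    for name, part in zip(LIST_FIELDS, lists):
        # An empty field means the client sent no values for it.
        parsed[name] = [int(value) for value in part.split("-")] if part else []
    return parsed


def ja3_hash(ja3_text: str) -> str:
    """JA3 hash: the MD5 hex digest of the JA3 string."""
    return hashlib.md5(ja3_text.encode("ascii")).hexdigest()


# Shortened, made-up example string (not a real client's fingerprint).
example = "771,4866-4867-4865,0-11-10,29-23,0"
fields = parse_ja3(example)
print(hex(fields["tls_version"]))  # 771 in hexadecimal
print(fields["ciphers"])
print(ja3_hash(example))
```

Because the hash covers the whole string, changing the order of a single cipher suite or extension yields a completely different ja3_hash.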

2.3. Bypassing TLS Fingerprint Blocking

Since we know the principle, bypassing comes down to masquerading as a legitimate client. In simple terms, it means changing the values that make up ja3_text so that the resulting fingerprint is no longer blocked, primarily by modifying the offered cipher suites.

2.3.1. Method Zero: Use Native urllib for TLS Fingerprint Blocking

import ssl
import urllib.request

url = 'https://tls.browserleaks.com/json'

# Baseline: request with Python's default TLS settings
req = urllib.request.Request(url)
resp = urllib.request.urlopen(req)
print(resp.read().decode())

# Forge TLS fingerprint: restrict and reorder the offered cipher suites.
# Entries are separated by ':'; '+' combines conditions
# (here: ECDHE key exchange AND AES-GCM encryption).
context = ssl.create_default_context()
context.set_ciphers("ECDHE-RSA-AES128-GCM-SHA256:ECDHE+AESGCM")

req = urllib.request.Request(url)
resp = urllib.request.urlopen(req, context=context)
print(resp.read().decode())
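To see what a cipher string actually changes without sending a request, the offered suites can be inspected locally. This is a minimal sketch using only the standard ssl module; the helper name offered_ciphers is hypothetical. Two caveats apply: according to the Python documentation, TLS 1.3 suites cannot be disabled with set_ciphers, and the cipher list is only one of the five JA3 fields, so the extensions and curves stay as they were.

```python
import ssl


def offered_ciphers(cipher_string=None):
    """Return the cipher names a client context would offer, in priority order."""
    ctx = ssl.create_default_context()
    if cipher_string is not None:
        ctx.set_ciphers(cipher_string)
    return [cipher["name"] for cipher in ctx.get_ciphers()]


default_names = offered_ciphers()
custom_names = offered_ciphers("ECDHE+AESGCM")

print(len(default_names), "suites with the default settings")
print(len(custom_names), "suites after set_ciphers")
for name in custom_names:
    print(" ", name)
```

The exact lists depend on the local OpenSSL build, which is also why the same Python script can produce different ja3_hash values on different machines.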


2.3.2. Method One: Use Other Established Libraries

You can try the curl_cffi library, which focuses on emulating various fingerprints.

Python binding for curl-impersonate via cffi. An HTTP client that can impersonate browser TLS/ja3/HTTP2 fingerprints.

In addition, you can also try pyhttpx or pycurl.

pip install --upgrade curl_cffi

Test code:

from curl_cffi import requests

print("edge99:", requests.get("https://tls.browserleaks.com/json", impersonate="edge99").json().get("ja3_hash"))
print("chrome110:", requests.get("https://tls.browserleaks.com/json", impersonate="chrome110").json().get("ja3_hash"))
print("safari15_3:", requests.get("https://tls.browserleaks.com/json", impersonate="safari15_3").json().get("ja3_hash"))

# Support proxy
proxies = {"https": "http://localhost:7890"}
r = requests.get("https://tls.browserleaks.com/json", impersonate="chrome101", proxies=proxies)
print(r.json().get("ja3_hash"))

The effect is as follows:

[Screenshot: curl_cffi]

The supported browser spoof list is as follows:

# curl_cffi.requests.session.BrowserType
class BrowserType(str, Enum):
    edge99 = "edge99"
    edge101 = "edge101"
    chrome99 = "chrome99"
    chrome100 = "chrome100"
    chrome101 = "chrome101"
    chrome104 = "chrome104"
    chrome107 = "chrome107"
    chrome110 = "chrome110"
    chrome99_android = "chrome99_android"
    safari15_3 = "safari15_3"
    safari15_5 = "safari15_5"

2.3.3. Method Two: Add a Client Proxy Layer

Here, Burp is used as an intermediate proxy that performs the TLS handshake with the target site on the client's behalf. This works provided Burp’s own TLS fingerprint is not blocked.

Burp’s TLS fingerprint can be modified in the following way:

[Screenshot: Modify burp TLS fingerprint]

2.3.4. Method Three: Modify the Underlying Code of Requests

The Requests library delegates its SSL/TLS handling to the urllib3 library, so modifying the underlying code means changing the urllib3 code.

Check the installation location of urllib3:

python3 -c "import urllib3; print(urllib3.__file__)"

/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/urllib3/__init__.py

Modify the relevant SSL code, typically located at site-packages/urllib3/util/ssl_.py:

DEFAULT_CIPHERS = ":".join(
    [
        "ECDHE+AESGCM",
        "ECDHE+CHACHA20",
        "DHE+AESGCM",
        "DHE+CHACHA20",
        "ECDH+AESGCM",
        "DH+AESGCM",
        "ECDH+AES",
        "DH+AES",
        "RSA+AESGCM",
        "RSA+AES",
        "!aNULL",
        "!eNULL",
        "!MD5",
        "!DSS",
    ]
)

There is plenty of room to experiment here. As a script kiddie, I mostly stick to deleting entries and rearranging their order, as shown below:

[Screenshot: vs]
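The "delete and rearrange" idea can be tried out without editing site-packages. The sketch below builds a modified cipher string from the same entries and checks that OpenSSL accepts it. The helper rearranged_cipher_string and the choice of which entries to drop or move are hypothetical. Note also that, to my knowledge, urllib3 2.x no longer ships DEFAULT_CIPHERS, so the file edit described above may only apply to 1.x installs.

```python
import ssl

# Entries in the '+' notation that OpenSSL cipher strings use.
BASE_ENTRIES = [
    "ECDHE+AESGCM", "ECDHE+CHACHA20", "DHE+AESGCM", "DHE+CHACHA20",
    "ECDH+AESGCM", "DH+AESGCM", "ECDH+AES", "DH+AES",
    "RSA+AESGCM", "RSA+AES", "!aNULL", "!eNULL", "!MD5", "!DSS",
]


def rearranged_cipher_string(entries, drop=(), move_to_front=()):
    """Delete some entries, move others to the front, and join with ':'."""
    kept = [entry for entry in entries if entry not in drop]
    front = [entry for entry in move_to_front if entry in kept]
    rest = [entry for entry in kept if entry not in front]
    return ":".join(front + rest)


cipher_string = rearranged_cipher_string(
    BASE_ENTRIES,
    drop=("DH+AES", "RSA+AES"),        # arbitrary choice for illustration
    move_to_front=("ECDHE+CHACHA20",),  # arbitrary choice for illustration
)
print(cipher_string)

# set_ciphers raises ssl.SSLError if no cipher can be selected from the string.
ctx = ssl.create_default_context()
ctx.set_ciphers(cipher_string)
print([cipher["name"] for cipher in ctx.get_ciphers()][:5])
```

The resulting string can be pasted into DEFAULT_CIPHERS or passed to set_ciphers on a context as in Method Zero. Whether a given variant gets past a particular shield still has to be checked against the test site.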

3. Akamai Fingerprint (HTTP/2 Fingerprint)

3.1. What is Akamai Fingerprint

Akamai Fingerprint is a technology provided by Akamai Technologies to prevent malicious bots and automated attacks, based on browser fingerprint recognition technology.

Browser fingerprinting is a technique that identifies web browsers by collecting and analyzing various attributes and behaviors, such as user-agent strings, plugins, fonts, language, screen resolution, and more. It is widely used in Internet security for detecting malicious bots, fraudulent actions, phishing, etc.

Akamai Fingerprint combines browser fingerprinting with other security technologies to identify and block automated attacks. It can identify and verify the browsers accessing the site without affecting the user experience, preventing automated attacks, account abuse, and data leaks. The part relevant here is the HTTP/2 fingerprint: a passive signature built from the SETTINGS parameters, the WINDOW_UPDATE increment, the PRIORITY frames, and the pseudo-header order a client sends when it opens an HTTP/2 connection.

You can view detailed fingerprints on https://tls.peet.ws/api/all, which mainly include the following:


Fingerprint is: 1:65536,2:0,3:1000,4:6291456,6:262144|15663105|0|m,a,s,p

The string has four parts separated by vertical bars '|'. The first part lists the parameters of the client's SETTINGS frame as identifier:value pairs:

  1. 1:65536: HEADER_TABLE_SIZE, the maximum size of the HPACK header compression table. This field indicates a table size of 64KB (i.e., 65536 bytes).
  2. 2:0: ENABLE_PUSH, which indicates whether the client accepts server push. 0 means server push is disabled.
  3. 3:1000: MAX_CONCURRENT_STREAMS, the maximum number of concurrent streams the client permits the server to open. This field indicates a maximum of 1000 concurrent streams.
  4. 4:6291456: INITIAL_WINDOW_SIZE, the initial stream-level flow-control window, i.e., how many bytes the server may send on a stream before it has to wait for a WINDOW_UPDATE. This field indicates an initial window size of 6MB (i.e., 6291456 bytes).
  5. 6:262144: MAX_HEADER_LIST_SIZE, the maximum size of the header list the client is prepared to accept. This field indicates 256KB (i.e., 262144 bytes).

The remaining three parts have the following meanings:
- `15663105`: `WINDOW_UPDATE`, the window size increment carried by the `WINDOW_UPDATE` frame the client sends, here 15663105 bytes.
- `0`: `PRIORITY`, the `PRIORITY` frames sent by the client. `0` means no `PRIORITY` frames were sent.
- `m,a,s,p`: the order of the pseudo-headers (those starting with ':'), each encoded by its first letter and separated by commas. Here the order is `:method`, `:authority`, `:scheme`, `:path`.

Details can be found in Passive Fingerprinting of HTTP/2 Clients.
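The fingerprint string above can be split mechanically. The sketch below is a minimal parser under the layout just described; the setting names follow the HTTP/2 specification, while the function name and the returned dictionary shape are hypothetical. It also accepts ';' between settings, since some tools print that separator instead of ','.

```python
# SETTINGS identifiers defined by the HTTP/2 specification.
SETTINGS_NAMES = {
    1: "HEADER_TABLE_SIZE",
    2: "ENABLE_PUSH",
    3: "MAX_CONCURRENT_STREAMS",
    4: "INITIAL_WINDOW_SIZE",
    5: "MAX_FRAME_SIZE",
    6: "MAX_HEADER_LIST_SIZE",
}

PSEUDO_HEADERS = {"m": ":method", "a": ":authority", "s": ":scheme", "p": ":path"}


def parse_http2_fingerprint(fingerprint: str) -> dict:
    """Split 'settings|window_update|priority|pseudo_headers' into named parts."""
    settings_part, window_update, priority, pseudo_part = fingerprint.split("|")
    settings = {}
    for item in settings_part.replace(";", ",").split(","):
        identifier, value = item.split(":")
        name = SETTINGS_NAMES.get(int(identifier), "UNKNOWN_" + identifier)
        settings[name] = int(value)
    return {
        "settings": settings,
        "window_update": int(window_update),
        "priority": priority,  # kept as raw text; "0" means no PRIORITY frames
        "pseudo_header_order": [PSEUDO_HEADERS[c] for c in pseudo_part.split(",")],
    }


parsed = parse_http2_fingerprint(
    "1:65536,2:0,3:1000,4:6291456,6:262144|15663105|0|m,a,s,p"
)
for name, value in parsed["settings"].items():
    print(name, "=", value)
print("WINDOW_UPDATE increment:", parsed["window_update"])
print("pseudo-header order:", parsed["pseudo_header_order"])
```

Parsing the fingerprints of a browser and a script side by side makes it easy to see which of the four parts differs.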

3.2. Testing Akamai Fingerprint

Test site: https://tls.browserleaks.com/json

  • curl
  • Chrome
  • Python

It can be seen that Python requests produces an empty result. The requests library speaks only HTTP/1.1, so there are no HTTP/2 frames to fingerprint, which makes the crawler easy to single out and block.

3.3. Bypassing Akamai Fingerprint

Bypassing means forging the specific fields of the HTTP/2 fingerprint described above so that they match those of a real browser.

3.3.1. Method One: Use Other Established Libraries

Again, you can use the curl_cffi library, which focuses on emulating various fingerprints.

Python binding for curl-impersonate via cffi. An HTTP client that can impersonate browser TLS/ja3/HTTP2 fingerprints.

pip install --upgrade curl_cffi

Test code:

from curl_cffi import requests

print("edge99:", requests.get("https://tls.browserleaks.com/json", impersonate="edge99").json().get("akamai_hash"))
print("chrome110:", requests.get("https://tls.browserleaks.com/json", impersonate="chrome110").json().get("akamai_hash"))
print("safari15_3:", requests.get("https://tls.browserleaks.com/json", impersonate="safari15_3").json().get("akamai_hash"))

The effect is as follows:

[Screenshot: akamai]

The supported browser spoof list is the same as the BrowserType list in section 2.3.2.

4. Final Effect

https://ascii2d.net sits behind Cloudflare’s fingerprint shield, which denies crawlers. Let’s test it.

A direct curl request is blocked.

Bypass

from curl_cffi import requests

req = requests.get("https://ascii2d.net", impersonate="chrome110")
print(req.text)

The page can be accessed normally.
