Understanding TLS Encryption: Capturing and Decrypting HTTPS Traffic with Wireshark

A previous article used the Wireshark capture tool to debug HTTP requests and responses and walked through the full TCP connection setup and teardown. This article uses Wireshark to capture TLS packets and follow the complete HTTPS request/response process under TLS encryption.

If you would rather skip the full text, scroll to the bottom for a diagram of the TLS process.

First, install curl and Wireshark; installation instructions are easy to find online.

HTTPS runs on top of TLS, and the traffic cannot be decrypted without the key material of the connection. With Wireshark alone, you see only the encrypted TLS records, not the HTTP data inside them.

Open Wireshark and start capturing packets, then send a request in curl:


Then enter tls in the Wireshark filter bar to show only TLS packets, and look for the IP address that is likely the target website.

Filter on that IP address to confirm it is the target site, httpbin, then try to view the returned data. Because the traffic is TLS-encrypted, the returned JSON data is not visible.

TLS uses the Diffie-Hellman key exchange to derive the symmetric keys that encrypt the data, so decrypting a capture requires the key material produced during that exchange.

Before solving this problem, let's review some key points first: encryption algorithms, CA digital certificates, and the TLS handshake.

Early cryptography relied on symmetric encryption, where the same key encrypts and decrypts. Its weakness is key distribution: both parties must share the key in advance, and if that one key leaks, everything encrypted with it can be read.

Later, asymmetric encryption was invented. This method uses a public key for encryption and a private key for decryption. One party encrypts with the other party's public key and transmits the ciphertext, and the other party decrypts it with the matching private key. Without the private key, the ciphertext cannot be decrypted. However, asymmetric encryption and decryption are slow, which led to hybrid encryption, a method that combines asymmetric and symmetric encryption.

Hybrid encryption first uses an asymmetric method to agree on a symmetric key, then uses that symmetric key to encrypt the actual data, which greatly improves efficiency. The Diffie-Hellman exchange used in TLS is one such method.

A brief note on the Diffie-Hellman key exchange algorithm. It has several features:

The encryption process of this algorithm is as follows:

Thus both sides arrive at the same symmetric key, P^(SA·SB) = P^(SB·SA), and can use it to encrypt the data they transmit.
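To make the exchange concrete, here is a minimal sketch of Diffie-Hellman with toy numbers. The prime modulus 23, the generator 5, and the secrets 6 and 15 are assumptions chosen for readability and do not come from this article; the names `P_MODULUS` and `G_BASE` are this sketch's own and may not map one-to-one onto the symbols used above. Real TLS uses far larger groups or elliptic curves.

```python
# Toy public parameters, known to everyone (assumed values, insecure sizes).
P_MODULUS = 23  # public prime modulus
G_BASE = 5      # public generator


def dh_public(secret: int) -> int:
    """Value a party sends over the wire: g^secret mod p."""
    return pow(G_BASE, secret, P_MODULUS)


def dh_shared(peer_public: int, secret: int) -> int:
    """Shared key: (peer's public value)^secret mod p."""
    return pow(peer_public, secret, P_MODULUS)


# Each side keeps its own secret and never transmits it.
sa = 6   # A's secret (fixed here only so the run is reproducible)
sb = 15  # B's secret

a_pub = dh_public(sa)  # A -> B, value is 8
b_pub = dh_public(sb)  # B -> A, value is 19

key_a = dh_shared(b_pub, sa)  # A computes (g^sb)^sa
key_b = dh_shared(a_pub, sb)  # B computes (g^sa)^sb

print(a_pub, b_pub, key_a, key_b)  # prints: 8 19 2 2
```

An eavesdropper sees only 8 and 19. Recovering the shared value 2 from those requires solving a discrete logarithm, which is trivial at this size but believed infeasible at real key sizes.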

⚠️ The issue is this: when A sends P to B, how can B confirm that P really came from A and was not altered by a man-in-the-middle? This requires CA authentication. With a digital signature from a CA, validated using the CA's public key, B can confirm that P came from A.

So here is a brief look at CA certificates and how they work:

For a website A to get a digital certificate, the process is as follows:

This way, A's public key PA can be trusted as actually belonging to A, because the CA-issued digital certificate vouches for it.
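The sign-and-verify idea behind a certificate can be sketched with textbook RSA. Everything here is an illustrative assumption: the CA key is built from two small Mersenne primes, the certificate body is a made-up byte string, and the functions `ca_sign` and `verify` are hypothetical. Real certificates use X.509 encoding, padded signatures, and much larger keys.

```python
import hashlib

# Toy RSA key pair for a hypothetical CA (insecure, for illustration only).
CA_P = 2**61 - 1
CA_Q = 2**89 - 1
CA_N = CA_P * CA_Q                               # public modulus
CA_E = 65537                                     # public exponent
CA_D = pow(CA_E, -1, (CA_P - 1) * (CA_Q - 1))    # private exponent (Python 3.8+)


def digest_as_int(data: bytes) -> int:
    """Hash the data and reduce it so it fits under the toy modulus."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % CA_N


def ca_sign(cert_body: bytes) -> int:
    """The CA signs the hash of the certificate body with its private key."""
    return pow(digest_as_int(cert_body), CA_D, CA_N)


def verify(cert_body: bytes, signature: int) -> bool:
    """Anyone holding the CA's public key can check the signature."""
    return pow(signature, CA_E, CA_N) == digest_as_int(cert_body)


# Made-up certificate body: who the subject is and which public key is theirs.
cert_body = b"subject=site-a.example;public_key=PA-placeholder"
signature = ca_sign(cert_body)

genuine_ok = verify(cert_body, signature)
tampered_ok = verify(cert_body.replace(b"site-a", b"evil-a"), signature)
print(genuine_ok, tampered_ok)
```

The genuine body verifies, while a body in which the subject was swapped fails the check, because its hash no longer matches what the CA signed.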

⚠️ Another issue: who certifies the CA's public key? What if the CA's public key is maliciously replaced? The CA is in turn certified by a higher-level CA, and the chain ends at a root CA, whose certificate is preinstalled in the operating system or browser and trusted directly.

Finally, review the TLS encryption process:

TLS is built on top of TCP, so a TCP three-way handshake must establish the TCP connection before the TLS handshake begins.

As seen above, the session key is generated from two random numbers and a pre-master secret, and the pre-master secret comes from a key exchange such as the Diffie-Hellman exchange mentioned earlier. The client receives the server's CA-issued digital certificate to obtain the server's public key, relying on the CA authentication process mentioned earlier. In RSA-based key exchange, the client encrypts the pre-master secret with that public key and the server decrypts it with its private key, as covered in the asymmetric encryption section; in Diffie-Hellman-based key exchange, the server's key pair instead signs the server's key-exchange parameters. TLS is therefore a secure transport technology that combines several authentication and encryption methods. The hash calculation mentioned above but not elaborated on protects against data tampering, much like verifying the MD5 checksum of software downloaded online.
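As a rough illustration of how two random numbers and a pre-master secret become session keys, here is a sketch of the TLS 1.2 pseudorandom function as I understand it from RFC 5246, assuming a SHA-256-based cipher suite. The article does not state which TLS version its capture used, so treat the version as an assumption; TLS 1.3 derives keys differently. The input values are random placeholders.

```python
import hashlib
import hmac
import os


def p_sha256(secret: bytes, seed: bytes, length: int) -> bytes:
    """P_hash expansion: chain HMACs until enough output bytes exist."""
    out = b""
    a = seed  # A(0) = seed
    while len(out) < length:
        a = hmac.new(secret, a, hashlib.sha256).digest()            # A(i)
        out += hmac.new(secret, a + seed, hashlib.sha256).digest()  # HMAC(secret, A(i) + seed)
    return out[:length]


def tls12_prf(secret: bytes, label: bytes, seed: bytes, length: int) -> bytes:
    """PRF(secret, label, seed) = P_hash(secret, label + seed)."""
    return p_sha256(secret, label + seed, length)


def master_secret(pre_master: bytes, client_random: bytes, server_random: bytes) -> bytes:
    """48-byte master secret from the pre-master secret and both randoms."""
    return tls12_prf(pre_master, b"master secret", client_random + server_random, 48)


def key_block(master: bytes, client_random: bytes, server_random: bytes, length: int) -> bytes:
    """Key material for the record layer; note the randoms are in reverse order."""
    return tls12_prf(master, b"key expansion", server_random + client_random, length)


# Placeholder inputs standing in for values exchanged during the handshake.
client_random = os.urandom(32)
server_random = os.urandom(32)
pre_master = os.urandom(48)

ms = master_secret(pre_master, client_random, server_random)
keys = key_block(ms, client_random, server_random, 72)
print(len(ms), len(keys))  # prints: 48 72
```

Both sides run the same computation on the same inputs, which is why neither the master secret nor the session keys ever cross the network.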

⚠️ If there are errors, please correct them in the comments section, thank you

As mentioned earlier, capturing TLS packets directly in Wireshark does not reveal the decrypted data, so we need another way to obtain it.

So how can we decrypt the data? This article explains one way: https://jimshaver.net/2015/02/11/decrypting-tls-browser-traffic-with-wireshark-the-easy-way/

It turns out that Firefox and Chrome both support logging the symmetric session key used to encrypt TLS traffic to a file. You can then point Wireshark at said file and presto! decrypted TLS traffic.

The principle is that the browser checks whether a particular environment variable (SSLKEYLOGFILE) is set in the system. If it is, the browser saves the key material generated for each HTTPS connection (the random numbers, preMasterSecret, and MasterSecret) to the file that the variable points to.
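To show what ends up in that file, here is a small sketch that parses the key log format as I understand it: one entry per line with a label, the client random in hex, and a secret in hex, plus comment lines starting with `#`. The function name `parse_keylog` and the sample line are made up for illustration; the sample is not real key material.

```python
from typing import Dict, Tuple


def parse_keylog(text: str) -> Dict[Tuple[str, str], bytes]:
    """Map (label, client_random_hex) to the secret bytes logged for it."""
    secrets_by_key: Dict[Tuple[str, str], bytes] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split()
        if len(parts) != 3:
            continue  # not a "label random secret" entry
        label, client_random_hex, secret_hex = parts
        try:
            bytes.fromhex(client_random_hex)  # validate the hex only
            secret = bytes.fromhex(secret_hex)
        except ValueError:
            continue  # malformed hex, ignore the line
        secrets_by_key[(label, client_random_hex.lower())] = secret
    return secrets_by_key


# Synthetic sample: a 32-byte client random and a 48-byte secret.
sample = "# synthetic example, not real keys\n" \
         "CLIENT_RANDOM " + "ab" * 32 + " " + "cd" * 48 + "\n"

entries = parse_keylog(sample)
for (label, client_random), secret in entries.items():
    print(label, client_random[:16] + "...", len(secret), "bytes")
```

Wireshark performs a similar lookup: it reads the client random from the captured Client Hello and finds the matching secret in the file.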

Let's get hands-on. On Windows, configure the environment variable:

Once this is configured, the browser writes the key information to the file at the specified path.

On macOS/Linux, setting an environment variable is simple and easy to look up, so it is not covered here.

Open Wireshark, go to Edit > Preferences > Protocols > TLS, and set the (Pre)-Master-Secret log filename to the file path specified by the environment variable.

With this configuration complete, let’s try opening the Chrome browser to visit an https URL:

Visiting the Baidu homepage, you can see that Wireshark now shows the decrypted data.

Then open a terminal and use curl to send an HTTPS request:
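If you script the request instead of using curl, a rough Python equivalent can write the same kind of key log. This sketch relies on `ssl.SSLContext.keylog_filename`, available in Python 3.8+ when built against OpenSSL 1.1.1 or later. The helper names, the file name `tls-keys.log`, and the URL `https://httpbin.org/get` are assumptions for illustration.

```python
import ssl
import urllib.request


def make_logging_context(keylog_path: str) -> ssl.SSLContext:
    """Default verifying context that also appends TLS secrets to keylog_path."""
    ctx = ssl.create_default_context()
    ctx.keylog_filename = keylog_path  # opens the file in append mode
    return ctx


def fetch(url: str, keylog_path: str) -> bytes:
    """Send one HTTPS GET and log the connection's secrets for Wireshark."""
    ctx = make_logging_context(keylog_path)
    with urllib.request.urlopen(url, context=ctx, timeout=10) as resp:
        return resp.read()


# Example call (requires network access, so it is left commented out):
# body = fetch("https://httpbin.org/get", "tls-keys.log")
# print(body.decode())
```

Point Wireshark at the same log file and the Python request should decrypt just like the browser and curl traffic.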

Still using httpbin as the example, filter on http to view the results in Wireshark:

Right-click a packet and choose Follow:

This is the entire process from TCP establishment to TLS handshake to HTTP request/response to TCP disconnection. Let's analyze it step by step:

The previous article already covered TCP, so we skip it here and go straight to the TLS part:

In the Client Hello stage, the client sends a random number to the server, along with all the cipher suites the client supports.

In the Server Hello stage, the server sends a random number to the client, along with the cipher suite it has chosen.
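The negotiation in these two messages can be sketched as follows. `client_offer` lists the cipher suites that Python's default TLS client context has enabled, which approximates what it would offer in a Client Hello; the list depends on the local OpenSSL build. `choose_suite` is a hypothetical helper that assumes the server picks by its own preference order, which is a common policy but not the only one.

```python
import ssl
from typing import List, Optional


def client_offer() -> List[str]:
    """Names of the cipher suites enabled in a default client context."""
    ctx = ssl.create_default_context()
    return [cipher["name"] for cipher in ctx.get_ciphers()]


def choose_suite(offer: List[str], server_prefs: List[str]) -> Optional[str]:
    """Return the first server-preferred suite that the client also offered."""
    for suite in server_prefs:
        if suite in offer:
            return suite
    return None  # no overlap: the handshake would fail


offer = client_offer()
print(len(offer), "suites offered, for example:", offer[:3])

# Made-up preference list for a hypothetical server.
server_prefs = ["SUITE_THE_CLIENT_LACKS"] + offer[-1:]
print("server picks:", choose_suite(offer, server_prefs))
```

In the capture, the offered list appears inside the Client Hello and the single chosen suite appears inside the Server Hello.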

Then the server sends the client its CA-issued digital certificate, a Server Key Exchange, and a Server Hello Done message, completing the first phase of the handshake:

This is the certificate:

This is the Server Key Exchange, which carries the server's parameters for the negotiated key-exchange algorithm:

This is Server Hello Done:

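The client then sends Client Key Exchange, Change Cipher Spec, and Finished messages.

The Verify Data in the Finished message is a check value computed over all handshake messages exchanged so far on this connection. The Finished message is the first one protected with the newly negotiated session keys, not with the server's public key.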
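For a sense of how that verification value is produced, here is a sketch of the TLS 1.2 computation as I understand it from RFC 5246: a 12-byte PRF output keyed by the master secret over a hash of the handshake transcript. It assumes a SHA-256-based cipher suite, and the message bytes are placeholders rather than real handshake records.

```python
import hashlib
import hmac
from typing import List


def tls12_prf(secret: bytes, label: bytes, seed: bytes, length: int) -> bytes:
    """TLS 1.2 PRF with SHA-256: P_hash(secret, label + seed)."""
    full_seed = label + seed
    out = b""
    a = full_seed  # A(0)
    while len(out) < length:
        a = hmac.new(secret, a, hashlib.sha256).digest()
        out += hmac.new(secret, a + full_seed, hashlib.sha256).digest()
    return out[:length]


def finished_verify_data(master: bytes, handshake_messages: List[bytes],
                         label: bytes = b"client finished") -> bytes:
    """12-byte check value over the hash of every handshake message so far."""
    transcript_hash = hashlib.sha256(b"".join(handshake_messages)).digest()
    return tls12_prf(master, label, transcript_hash, 12)


# Placeholder transcript and master secret, for illustration only.
master = bytes(48)
transcript = [b"client hello", b"server hello", b"certificate",
              b"server key exchange", b"server hello done",
              b"client key exchange"]

client_vd = finished_verify_data(master, transcript)
tampered_vd = finished_verify_data(master, [b"CLIENT HELLO"] + transcript[1:])
print(client_vd.hex())
print(client_vd == tampered_vd)
```

If an attacker altered any handshake message in transit, the two sides would compute different values and the handshake would abort. The server's Finished uses the label `server finished` over its own view of the transcript.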

The client is ready to switch to symmetric key encryption

Finally, the server returns its own Change Cipher Spec and Finished messages.

The server is ready to switch to symmetric key encryption

At this point, the TLS handshake is successful, and in Wireshark, you can see the following HTTP request/response packets:

Finally, here is a simplified mind map for easier understanding: