Understanding HTTPS: Secure Communication Protocols to Combat Man-in-the-Middle Attacks

1. Summary: Secure Communication

This illustrated article introduces the layered structure of the HTTPS protocol stack, outlines how access works, and walks through the message exchange. It also explains how HTTPS counters man-in-the-middle attacks, which is why secure communication matters.

2. Secure Communication Content

2.1 Secure Communication: HTTPS Protocol Stack Hierarchy

HTTPS (full name: Hyper Text Transfer Protocol over Secure Socket Layer, or Hypertext Transfer Protocol Secure) is a security-oriented HTTP channel; simply put, it is the secure version of HTTP. An SSL layer is added beneath HTTP, and because SSL is the security foundation of HTTPS, understanding its encryption details means understanding SSL. https: is a URI scheme (an abstract identifier system), syntactically similar to the http: scheme, and is used for secure HTTP data transmission. An https: URL indicates that HTTP is used, but with a different default port (443 instead of 80) and an encryption/authentication layer between HTTP and TCP.

To encourage the global adoption of HTTPS by websites, some internet companies have set forth their own requirements:

1) Google has adjusted its search engine algorithm to rank websites that use HTTPS higher in search results;

2) Since 2017, Chrome has marked websites that use HTTP as not secure;

3) As of 2017, Apple requires all applications in the App Store to use HTTPS connections;

4) The popular WeChat Mini Programs also require HTTPS;

5) In practice, browser support for the newer HTTP/2 protocol requires HTTPS.

Therefore, as a programmer, knowledge of HTTPS is essential.

[Figure: HTTP and HTTPS protocol stacks]

As shown in the above figure, HTTPS has an additional SSL/TLS layer compared to HTTP.

SSL (Secure Sockets Layer): Developed by Netscape in 1994, the SSL protocol sits between TCP/IP and the various application-layer protocols, providing security support for data communication.

TLS (Transport Layer Security): The successor to SSL. The earlier versions (SSL 1.0, SSL 2.0, SSL 3.0) were developed by Netscape. In 1999 the IETF standardized the protocol and renamed it, so what would have been SSL 3.1 became TLS 1.0. It was followed by TLS 1.1 and TLS 1.2, and then by TLS 1.3, which changed the protocol significantly and was published as RFC 8446 in 2018. Because of security vulnerabilities, SSL 3.0, TLS 1.0, and TLS 1.1 are now deprecated; TLS 1.2 and TLS 1.3 are the most widely used.

2.2 Secure Communication: TCP 3-Way Handshake and HTTP Access Process

2.2.1 TCP 3-Way Handshake

When a user enters a URL and presses Enter, DNS resolves the domain name to the server’s IP address. The server listens for client requests on port 80, and a connection is established over TCP/IP (this can be implemented with sockets). HTTP is an application-layer protocol in the TCP/IP model, so communication corresponds to data moving down the protocol stack on the sending side and back up it on the receiving side.

[Figure]

The packet is transmitted from the application layer to the transport layer, establishing a connection with the server through a TCP 3-way handshake and releasing the connection with a 4-way handshake.

Analyzing the 3-Way Handshake process:

First Handshake: Host A sends a segment with the flag SYN=1 and a randomly generated sequence number seq=X to the server. From SYN=1, Host B knows that A wants to establish a connection;

Second Handshake: After receiving the request, Host B confirms it by sending A a segment with SYN=1, ACK=1, ack number=X+1, and a randomly generated seq=Y;

Third Handshake: Host A checks that the ack number is correct (X+1, from the first transmission) and that the ACK flag is 1; if so, Host A sends ACK=1, ack number=Y+1, seq=X+1. After receiving this, Host B checks that ACK=1 and that the ack number is Y+1, and the connection is successfully established.

Once the 3-way handshake is completed, host A and host B start transmitting data.
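The bookkeeping in the three handshakes can be sketched in a few lines of Python. This is only a simulation under stated assumptions: the `Segment` class and the `three_way_handshake` function are hypothetical names invented for illustration, no real packets are sent, and sequence-number wraparound is ignored. In real TCP the third segment carries seq=X+1, which is what the sketch uses.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """Only the TCP header fields that matter for the handshake."""
    syn: int = 0      # SYN flag
    ack: int = 0      # ACK flag
    seq: int = 0      # sequence number
    ack_num: int = 0  # acknowledgment number


def three_way_handshake(x: int, y: int) -> list:
    """Simulate the handshake; x and y are the initial sequence numbers of A and B."""
    # First handshake: A -> B, SYN=1, seq=X
    first = Segment(syn=1, seq=x)

    # Second handshake: B -> A, SYN=1, ACK=1, seq=Y, ack number=X+1
    second = Segment(syn=1, ack=1, seq=y, ack_num=first.seq + 1)

    # A checks B's reply before answering
    if second.ack != 1 or second.ack_num != x + 1:
        raise ValueError("unexpected reply from B")

    # Third handshake: A -> B, ACK=1, seq=X+1, ack number=Y+1
    third = Segment(ack=1, seq=x + 1, ack_num=second.seq + 1)

    # B checks A's final acknowledgment
    if third.ack != 1 or third.ack_num != y + 1:
        raise ValueError("unexpected acknowledgment from A")

    return [first, second, third]
```

Calling `three_way_handshake(1000, 5000)` returns the three segments in order, which makes it easy to see how each side acknowledges the other side's sequence number plus one.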

2.2.2 HTTP Access Process

Packet capture as follows:

As shown in the figure above, the HTTP request process includes no identity confirmation between the client and the server, and all data is transmitted in plaintext, “running naked” across the Internet. This makes it susceptible to hacker attacks, as shown below:

[Figure: an HTTP request intercepted by a hacker]

Clearly, requests sent by the client can easily be intercepted by hackers. If a hacker impersonates the server, they can return arbitrary information to the client without the client noticing; this is the “hijacking” we often hear about. HTTP transmission therefore faces the following risks:

(1) Eavesdropping risk: Hackers can access communication content.

(2) Tampering risk: Hackers can modify communication content.

(3) Impersonation risk: Hackers can impersonate others to participate in communications.

2.3 The Three Principles of Secure Communication and HTTPS Design Thought

2.3.1 Ensuring Secure Communication with the Three Principles

A. Encryption of data content

This is easy to understand, isn’t it? Sensitive information must be encrypted; plaintext transmission is akin to committing suicide. No further explanation is needed.

B. Identity verification of communication parties

Many people do not understand what this means. In theory, asymmetric encryption should be perfect. But remember, our data packet does not travel directly from A to B.

It is forwarded by countless routers along the way. If someone intercepts our packet in between and replaces it with their own, the situation becomes highly dangerous.

So we also need a mechanism to verify the identity of both communication parties, ensuring I am speaking with my wife and not my mother-in-law.

C. Integrity of data content

This one is also not easy to grasp. In theory, TCP ensures that data reaches the other end complete and in order. But TCP’s checks only catch accidental corruption, not deliberate modification, and do not forget the countless routers that forward our packets along the way, any of which might be hijacked. A hijacked router can tamper with the packet, so we need a mechanism that protects our data from tampering and detects it when it happens, ensuring my letter to my wife arrives complete so she can read the entire content, not just half.
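One common way to detect tampering is a keyed message authentication code. The sketch below uses HMAC-SHA256 from Python's standard library and assumes the two parties already share a secret key; the function names `make_tag` and `is_intact` are hypothetical. It illustrates principle C only and is not a description of how any particular protocol version does it.

```python
import hashlib
import hmac


def make_tag(message: bytes, key: bytes) -> bytes:
    """Sender: compute a keyed digest that travels along with the message."""
    return hmac.new(key, message, hashlib.sha256).digest()


def is_intact(message: bytes, received_tag: bytes, key: bytes) -> bool:
    """Receiver: recompute the digest and compare it in constant time."""
    expected = hmac.new(key, message, hashlib.sha256).digest()
    return hmac.compare_digest(expected, received_tag)
```

If a router in the middle cuts the letter in half, the recomputed tag no longer matches, and because the attacker does not know the key, he cannot produce a matching tag for the altered text.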

2.3.2 HTTPS Design Thought

Based on the encryption algorithms (covered in 2.3.3) and the three principles of secure communication, one plausible process for securing our communication is:

  • The server generates a key pair for each client and sends the public key to the client;
  • The client chooses an encryption algorithm, encrypts that choice with the public key, and sends it to the server;
  • The server decrypts the message with its private key and learns which encryption algorithm was chosen. From then on, communication uses this algorithm.

At first glance this design seems perfect. Even if a middleman intercepts the public key, it is useless to him: he cannot decrypt the message or work out which algorithm the two parties are using.

But there is a major flaw:

A middleman can replace the server’s public key in transit, and the client has no way of knowing whether the public key it received truly came from the server or was forged by a middleman.

The diagram illustrates this basic man-in-the-middle attack.

So is there a way to obtain the public key securely and prevent impersonation by hackers? This is where the ultimate tool comes into play: the SSL certificate, obtained from a certificate authority (CA).

Specific Step Explanation:

As shown in the figure above, at step ②, the server sends an SSL certificate to the client. The SSL certificate contains:

(1) Certificate issuer CA

(2) Certificate validity period

(3) Public key

(4) Certificate owner

(5) Signature





When the client receives the SSL certificate sent by the server, it will validate the authenticity of the certificate. Taking browsers as an example, here is how it works:

(1) First, the browser reads the certificate owner and validity period from the certificate and checks them;

(2) The browser looks through the trusted certificate issuers (CAs) built into the operating system and compares them with the issuer CA in the server’s certificate to verify that it was issued by a legitimate authority;

(3) If no match is found, the browser reports an error indicating that the server’s certificate is not trustworthy;

(4) If a match is found, the browser takes the issuer CA’s public key from the operating system and uses it to decrypt the signature in the server’s certificate, recovering the hash value that the CA signed;

(5) The browser computes the hash value of the server’s certificate using the same hash algorithm and compares it with the hash value recovered from the signature;

(6) If the two match, the server’s certificate is legitimate and has not been forged;

(7) At this point, the browser can read the public key from the certificate and use it for subsequent encryption.

Thus, sending the SSL certificate solves two problems at once: the client obtains the public key securely, and hackers cannot impersonate the server. This forms the HTTPS encryption process.
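In code, a client rarely performs these checks by hand; a TLS library does them. As a rough illustration, the sketch below uses Python's standard `ssl` module, whose default context loads the system's trusted CA certificates and enables certificate and hostname verification. The function names `make_verifying_context` and `describe_server_certificate` are hypothetical, and the second function needs network access to whatever host you pass in.

```python
import socket
import ssl


def make_verifying_context() -> ssl.SSLContext:
    """Build a client context that trusts the system CAs and verifies the server."""
    # create_default_context() enables certificate-chain and hostname checks.
    return ssl.create_default_context()


def describe_server_certificate(host: str, port: int = 443) -> dict:
    """Connect to host, let the library verify the certificate, return key fields."""
    context = make_verifying_context()
    with socket.create_connection((host, port), timeout=10) as raw_sock:
        # wrap_socket raises ssl.SSLCertVerificationError if verification fails.
        with context.wrap_socket(raw_sock, server_hostname=host) as tls_sock:
            cert = tls_sock.getpeercert()
    return {
        "issuer": cert.get("issuer"),          # certificate issuer CA
        "owner": cert.get("subject"),          # certificate owner
        "valid_from": cert.get("notBefore"),   # validity period start
        "valid_until": cert.get("notAfter"),   # validity period end
    }
```

The returned fields correspond to the issuer, owner, and validity period listed earlier; the signature check itself happens inside the handshake and is not exposed as a separate step.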

Compared to HTTP, HTTPS transmission is more secure:

(1) All information is encrypted, preventing hackers from eavesdropping.

(2) There is a verification mechanism, so either party detects tampering immediately.

(3) Equipped with identification certificates to prevent identity impersonation.

2.3.3 Encryption Basics

1. Symmetric Encryption

There are two types: stream and block-based. Both encryption and decryption use the same key.

For example: DES, AES-GCM, ChaCha20-Poly1305, etc.

[Anecdote]

One party (e.g., Xiaohong) uses the key K to encrypt the text M; the other party (e.g., Xiaoming) uses the same key to decrypt:

Xiaohong: C = E(M, K)
Xiaoming: M = D(C, K)

There is a problem here: once one party generates the key K, it must share K with the other party. But along the treacherous roads of the network, K may be eavesdropped on, and an eavesdropper who holds K can impersonate either party in the communication. This is called a man-in-the-middle attack.
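The notation C = E(M, K) and M = D(C, K) can be made concrete with a toy example. Python's standard library does not ship AES or ChaCha20, so this sketch substitutes a home-made stream cipher whose keystream comes from SHA-256. It is an assumption-laden stand-in meant only to show that one key does both jobs; it must not be used to protect real data, and the names `keystream`, `encrypt`, and `decrypt` are hypothetical.

```python
import hashlib


def keystream(key: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 of the key plus a running counter."""
    stream = bytearray()
    counter = 0
    while len(stream) < length:
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(stream[:length])


def encrypt(message: bytes, key: bytes) -> bytes:
    """C = E(M, K): XOR the message with the keystream derived from K."""
    return bytes(m ^ k for m, k in zip(message, keystream(key, len(message))))


def decrypt(ciphertext: bytes, key: bytes) -> bytes:
    """M = D(C, K): XOR with the same keystream undoes the encryption."""
    return encrypt(ciphertext, key)
```

Xiaohong would call `encrypt(M, K)` and Xiaoming `decrypt(C, K)`; anyone who learns K along the way can do both as well, which is exactly the weakness described above.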

2. Asymmetric Encryption

The key used for encryption differs from the one used for decryption; the two are called the public key and the private key. The public key and the algorithm are both public, whereas the private key is kept secret. Asymmetric algorithms are slow but, by their nature, very secure; they also limit the amount of data that can be encrypted in one operation.

For example: RSA (encryption and signatures), DSA and ECDSA (signatures), DH and ECDHE (key exchange).

[Anecdote]

Asymmetric encryption uses a pair of two keys: K1 and K2. Xiaohong uses one key to encrypt the text, and Xiaoming can use the other to decrypt it. For instance, Xiaohong encrypts with K1, and Xiaoming decrypts with K2:

Xiaohong: C = E(M, K1)
Xiaoming: M = D(C, K2)

In this way, one of the parties (e.g., Xiaohong) can generate K1 and K2, privately keeping K1 as a private key while publicly sharing K2 as a public key. Once the other party (e.g., Xiaoming) obtains the public key, they can communicate.

However, a middleman might intercept the exchange when Xiaoming fetches the public key, generate a key pair of his own (κ1, κ2), and tell Xiaoming that κ2 is Xiaohong’s public key. The middleman can then decrypt the ciphertext Xiaohong sends to Xiaoming (and even modify it), re-encrypt it with κ1, and pass it on; Xiaoming decrypts it with κ2 and notices nothing.
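The key substitution is easy to reproduce with textbook RSA on tiny numbers. The sketch below is a toy under loud assumptions: the primes are far too small for real use, there is no padding, messages are plain integers smaller than the modulus, and the names `make_keypair`, `apply_key`, and `intercepted_delivery` are hypothetical. Following the anecdote, K1 is the private key and K2 the public key, and the middleman's own pair is called kappa1 and kappa2.

```python
def make_keypair(p: int, q: int, e: int = 17):
    """Textbook RSA key pair from two primes: returns (private key, public key)."""
    n = p * q
    phi = (p - 1) * (q - 1)
    d = pow(e, -1, phi)          # modular inverse, Python 3.8+
    return (d, n), (e, n)


def apply_key(value: int, key) -> int:
    """E and D are the same operation here: modular exponentiation with a key."""
    exponent, modulus = key
    return pow(value, exponent, modulus)


def intercepted_delivery(message: int, tampered: int):
    """Return (what the middleman read, what Xiaoming ends up decrypting)."""
    k1, k2 = make_keypair(61, 53)            # Xiaohong's pair, K2 is published
    kappa1, kappa2 = make_keypair(89, 97)    # middleman's own pair

    ciphertext = apply_key(message, k1)      # Xiaohong: C = E(M, K1)

    # The middleman reads the message with Xiaohong's genuine public key ...
    seen = apply_key(ciphertext, k2)
    # ... and re-encrypts his altered version with his own key.
    forged = apply_key(tampered, kappa1)

    # Xiaoming was told that kappa2 is Xiaohong's public key.
    received = apply_key(forged, kappa2)
    return seen, received
```

With `intercepted_delivery(42, 99)` the middleman reads 42 and Xiaoming decrypts 99 without any error, because nothing tells him that the public key he holds is not Xiaohong's.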

3. Hash Algorithm

Transforms information of any length into a fixed-length value, typically much shorter than the information itself; the algorithm is irreversible.

For example: MD5, SHA-1, and the SHA-2 family (such as SHA-256).
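The fixed-length property is easy to observe with Python's standard `hashlib` module. In this small sketch the helper name `digest_hex` is hypothetical; SHA-256 is used as the example algorithm.

```python
import hashlib


def digest_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


short_digest = digest_hex(b"a")               # 1 byte of input
long_digest = digest_hex(b"a" * 1_000_000)    # 1,000,000 bytes of input
```

Both digests are 64 hexadecimal characters (256 bits) long even though the inputs differ in size by a factor of a million, and the two digests are different from each other.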

4. Digital Signature

Digital signature technology encrypts a summary (digest) of the message with the sender’s private key and transmits it to the receiver together with the original text. Only the sender’s public key can decrypt the encrypted summary. The receiver decrypts it, then uses the same hash function to generate a summary from the received original text and compares the two. If they are identical, the message is complete and was not modified in transit; otherwise it has been altered. A digital signature can therefore verify the integrity of the message.

Digital signing is an encryption process, while digital signature verification is a decryption process.
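The sign-then-verify flow can be sketched with textbook RSA and SHA-256. Everything here is an illustrative assumption: the key is built from two small Mersenne primes and is nowhere near a safe size, the digest is reduced modulo N so that it fits the toy key, no padding scheme is applied, and the names `summary`, `sign`, and `verify` are hypothetical.

```python
import hashlib

# Toy RSA key from two Mersenne primes; real keys are 2048 bits or more.
P, Q = 2**61 - 1, 2**89 - 1
N = P * Q
E = 65537                              # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))      # private exponent, Python 3.8+


def summary(text: bytes) -> int:
    """Hash the text and reduce it so it fits below the toy modulus."""
    return int.from_bytes(hashlib.sha256(text).digest(), "big") % N


def sign(text: bytes) -> int:
    """Sender: 'encrypt' the summary with the private key."""
    return pow(summary(text), D, N)


def verify(text: bytes, signature: int) -> bool:
    """Receiver: 'decrypt' with the public key and compare with a fresh summary."""
    return pow(signature, E, N) == summary(text)
```

The sender transmits the text together with `sign(text)`; the receiver accepts the message only if `verify(text, signature)` is true, so changing either the text or the signature in transit makes the comparison fail.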

Common digital signature algorithms include RSA, ElGamal, Fiat-Shamir, Guillou-Quisquater, Schnorr, Ong-Schnorr-Shamir, DSS/DSA, the elliptic curve digital signature algorithm, and the finite automaton digital signature algorithm.

HTTPS relies on certificates issued by a CA and combines digital signatures, asymmetric encryption, and symmetric encryption into a hybrid encryption scheme.

Digital Signature Implementation:

  • Xiaohong combines her public key and ID (ID number or domain name) into an identity card application (certificate signing request, CSR) and sends this CSR to a highly respected individual (known as a certificate authority, CA), such as Xiaoliang.
  • Xiaoliang encrypts Xiaohong’s CSR with his private key; the resulting ciphertext is called a digital signature.
  • Xiaoliang merges the signature and the CSR plaintext into a CA-signed identity card (CA-signed certificate, CRT) and sends it to Xiaohong.

Xiaohong:  CSR = Xiaohong’s public key + Xiaohong’s domain
Xiaoliang: signature = E(CSR, Xiaoliang’s private key)
           CRT = CSR + signature

Whenever someone (e.g., Xiaoming) wants to chat with Xiaohong (establish an HTTPS connection), Xiaohong presents her identity card signed by Xiaoliang. Anyone who receives this identity card and trusts Xiaoliang (that is, has Xiaoliang’s identity card installed on their machine) can extract Xiaoliang’s public key from the CSR inside Xiaoliang’s identity card;

then decrypt the signature in Xiaohong’s identity card with Xiaoliang’s public key to obtain a CSR, call it CSR';

if CSR' matches the CSR plaintext in Xiaohong’s identity card, this proves that “this Xiaohong identity card is endorsed and signed by Xiaoliang.”

Xiaoming: Xiaoliang’s public key = Xiaoliang’s CRT.CSR.Xiaoliang’s public key
     CSR' = D(CRT.signature, Xiaoliang’s public key)
     if CSR' == CRT.CSR then OK

2.4 The Actual HTTPS Message Exchange

2.4.1 HTTPS Message Exchange

Notes:

(1) Look at the blue part, which is the TCP connection. The encryption layer of HTTPS therefore sits above TCP.

(2) The client first sends a ClientHello message. It contains a client-generated random number Random1, the encryption algorithms the client supports, and SSL version information.

(3) Upon receiving the ClientHello, the server extracts Random1 and the client’s supported encryption algorithms, selects one algorithm, generates a random number Random2, and sends both back to the client in the ServerHello. So that the client can verify the server’s identity, the server also sends its public key to the client inside a digital certificate.

(4) After receiving the certificate, the client first verifies its legitimacy against the CA. If verification passes, it extracts the server’s public key from the certificate, generates a random number Random3, encrypts Random3 with the server’s public key to create the PreMaster Key, and sends it to the server.

(5) The server decrypts the PreMaster Key with its private key to obtain Random3. At this point the client and server both hold three random numbers: Random1, Random2, and Random3. Using the same algorithm, both parties derive a symmetric encryption key from these three random numbers, and all subsequent application-layer data is encrypted with this key.

Change Cipher Spec / Finished: notifies the other side that subsequent communication will use this key.

(6) Finally, all ApplicationData is encrypted symmetrically, because asymmetric encryption is too slow, whereas symmetric encryption has little impact on performance. It is clear, then, that the real purpose of the HTTPS handshake is to ensure that the symmetric encryption key is not cracked, replaced, or exposed to a man-in-the-middle attack. If any of these happens, the encryption layer in HTTPS detects it and prevents an incident.
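How both sides arrive at the same key from the three random numbers in step (5) can be sketched as follows. The derivation is modeled on the TLS 1.2 pseudorandom function (HMAC-SHA256 expansion with the label "master secret"); treat it as an illustration rather than a complete implementation, since a real stack goes on to expand this secret into separate encryption and MAC keys. The function names `p_sha256` and `derive_master_secret` are hypothetical.

```python
import hashlib
import hmac
import os


def p_sha256(secret: bytes, seed: bytes, length: int) -> bytes:
    """HMAC-based expansion: stretch secret and seed into length bytes."""
    output = b""
    a = seed                                                   # A(0)
    while len(output) < length:
        a = hmac.new(secret, a, hashlib.sha256).digest()       # A(i)
        output += hmac.new(secret, a + seed, hashlib.sha256).digest()
    return output[:length]


def derive_master_secret(random1: bytes, random2: bytes, random3: bytes) -> bytes:
    """Random3 (the pre-master secret) is the key; Random1 and Random2 are the seed."""
    return p_sha256(random3, b"master secret" + random1 + random2, 48)


# Client and server each run the same function on the same three values.
random1, random2 = os.urandom(32), os.urandom(32)   # sent in the clear
random3 = os.urandom(48)                            # sent encrypted
client_secret = derive_master_secret(random1, random2, random3)
server_secret = derive_master_secret(random1, random2, random3)
```

An eavesdropper sees Random1 and Random2 but not Random3, so he cannot run the same derivation; that is why only Random3 needs the protection of asymmetric encryption.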

2.4.2 Using WireShark to Reconstruct an HTTPS Interaction Process

The target site is GitHub. Here is how the capture looks.

Focus on the tlsv1 section, which is the encryption layer. Let’s analyze step by step:

(1) ClientHello (line 2330)

(2) ServerHello (line 2380)

Note that at this point the server and client both hold two random numbers, and the encryption algorithm has been decided.

(3) Certificate / ServerHelloDone (line 2435)

This mainly involves sending certificate information. Opening it reveals all the certificate details. ServerHelloDone indicates the server’s work is completed.

(4) Client Key Exchange / ChangeCipherSpec (line 2449)

There are three steps here. Let’s analyze what each of these actions accomplished:

  • Client Key Exchange

When the server receives the encrypted Random3, it decrypts it with its private key, so that server and client both hold Random1, Random2, and Random3. From these they generate a key, which serves as the symmetric encryption key for the subsequent application data exchange.

  • ChangeCipherSpec

(5) Change Cipher Spec Finished / new session ticket (line 2926)

Explanation based on the image description.

This session ticket is the data transmitted to the client by the server in the final step.

Once the client receives this encrypted data, it can store it. The next time it makes an HTTPS request, it can send this session ticket and save a lot of handshake time and resources (as analyzed earlier, the handshake is quite complex, and asymmetric encryption in particular consumes substantial resources on the server). In practice, when most browsers open several HTTPS connections to the same domain name, they deliberately wait for the first connection to complete its handshake before opening the others. The later connections can then carry this information, which saves significant resources. In this respect the ticket is similar to a cookie.

When the author accessed GitHub with Chrome, the browser did not use ticket technology but session ID technology:

A session ID plays practically the same role as a ticket. However, a session ID cannot be shared between servers out of the box: the ID lives in one server’s memory, so synchronizing that state behind a load balancer is a major problem.

(6) Application Data (line 2964)
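The difference between session IDs and session tickets in step (5) can be shown with a toy model. The classes `SessionIdServer` and `TicketServer` are hypothetical and simulate nothing of the actual wire format. One simplification is labeled loudly: a real ticket is encrypted as well as authenticated, whereas this sketch only authenticates it with HMAC to stay short.

```python
import hashlib
import hmac
import json


class SessionIdServer:
    """Keeps session state in its own memory, indexed by session ID."""

    def __init__(self):
        self._sessions = {}

    def issue(self, session_id: str, master_secret_hex: str) -> str:
        self._sessions[session_id] = master_secret_hex
        return session_id

    def resume(self, session_id: str):
        return self._sessions.get(session_id)   # None if this server never saw it


class TicketServer:
    """Hands the session state to the client inside an authenticated ticket."""

    def __init__(self, ticket_key: bytes):
        self._key = ticket_key                   # shared by all servers in the pool

    def issue(self, master_secret_hex: str) -> bytes:
        body = json.dumps({"master_secret": master_secret_hex}).encode()
        mac = hmac.new(self._key, body, hashlib.sha256).digest()
        return mac + body

    def resume(self, ticket: bytes):
        mac, body = ticket[:32], ticket[32:]
        expected = hmac.new(self._key, body, hashlib.sha256).digest()
        if not hmac.compare_digest(mac, expected):
            return None                          # forged or damaged ticket
        return json.loads(body)["master_secret"]
```

Two `SessionIdServer` instances behind a load balancer cannot resume each other's sessions, while two `TicketServer` instances that share the same ticket key can, because the state travels with the client.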

2.5 CA Certificates Cost Money. What if I Don’t Want to Pay?

You can create your own certificate and ship its public key with the client (e.g., in the app’s installation directory), so that the app verifies the server with that certificate’s public key instead of relying on the system. But this raises a question: what if someone gets hold of this public key certificate?

Digital signature verification can address such issues. Simply put, the server and client agree on a rule in advance that lets them determine whether anything has been modified.
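One simple way for an app to rely on a certificate it ships itself is to compare fingerprints, often called pinning. The sketch below assumes the app stores the SHA-256 fingerprint of the server's certificate in DER form; the function names `fingerprint` and `matches_pin` are hypothetical, and this is one possible approach rather than the one the article has in mind.

```python
import hashlib
import hmac


def fingerprint(der_certificate: bytes) -> str:
    """SHA-256 fingerprint of a certificate given as DER-encoded bytes."""
    return hashlib.sha256(der_certificate).hexdigest()


def matches_pin(der_certificate: bytes, pinned_fingerprint: str) -> bool:
    """True only if the presented certificate is exactly the one the app shipped with."""
    return hmac.compare_digest(fingerprint(der_certificate), pinned_fingerprint.lower())
```

In a real client the DER bytes of the server's certificate can be obtained from Python's `ssl.SSLSocket.getpeercert(binary_form=True)` after the handshake, and the connection should be dropped when `matches_pin` returns False.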

Since this isn’t the focal point, it won’t be explained in detail for now. Knowing there is such an issue is sufficient. Once you understand the entire HTTPS, you’ll naturally comprehend this aspect.

Refer to “Lesson 9 of Ant Blockchain: SSL/TLS Working Principles and Its Application in Ant BAAS” for an understanding of SSL/TLS principles and its application in Ant Blockchain.