Mastering WireGuard: The Ultimate Guide to Simplifying VPN Setup and Security


This article is 4,500 words long, taking about 10 minutes for a casual read and 30 minutes for a thorough read. The content includes information about WireGuard.

Recently, during an internal BBL session with my team, I gave a talk on WireGuard. WireGuard (hereafter WG), a representative of the new generation of VPNs, is likely familiar to many tech enthusiasts. As with other VPN technologies, we can use it to establish a secure channel between home and company networks and thereby access “intranet” data and applications.

Before diving into WG, let’s first abstract the general requirements for VPNs:

  • Security: Ensuring the data between two private networks can be securely transmitted over an insecure network such as the Internet
  • Authenticity: The visitor is a legitimate user, accessing the correct network
  • Efficiency: Enabling the VPN doesn’t significantly slow down the network access, and the tunnel setup should be fast
  • Stealthiness: Third parties should not easily sniff out the presence of the gateway
  • Accessibility: Easy to configure, easy to turn on and off

We must thank Martin E. Hellman, Bailey W. Diffie, and Ralph C. Merkle for making it possible to transmit data securely over insecure networks. Their patent, Cryptographic apparatus and method, introduced the widely used DH algorithm for key exchange.

This algorithm relies on properties of modular congruence and on the commutativity of multiplication in the exponent. The process is simple, and those interested can refer to Wikipedia. WG uses ECDH, a variant of the DH algorithm that works over elliptic curves to improve performance and security.

Through the DH algorithm, both ends of the network can negotiate a key in an insecure network for encrypting the data to be transmitted. Subsequently, the data stream can be efficiently symmetrically encrypted with this key.
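
To make the exchange concrete, here is a minimal sketch of classic DH using modular exponentiation. The modulus, the base, and the helper names are illustrative assumptions chosen to keep the arithmetic visible; WG itself uses Curve25519, not this toy group.

```python
import secrets

# Toy public parameters (assumptions for illustration only).
# WG uses Curve25519 ECDH rather than modular exponentiation.
P = 2**127 - 1   # a Mersenne prime used as the modulus
G = 3            # public base


def keypair():
    """Return (private exponent, public value G^private mod P)."""
    private = secrets.randbelow(P - 3) + 2   # secret exponent in [2, P-2]
    public = pow(G, private, P)
    return private, public


def shared_secret(my_private, their_public):
    """Raise the other side's public value to our private exponent."""
    return pow(their_public, my_private, P)


# Each side publishes only its public value over the insecure network.
a_priv, a_pub = keypair()
b_priv, b_pub = keypair()

# (G^b)^a and (G^a)^b are the same number, so both sides agree on a key
# without ever sending it.
alice = shared_secret(a_priv, b_pub)
bob = shared_secret(b_priv, a_pub)
print(alice == bob)  # True
```

An eavesdropper sees only the two public values; recovering the shared secret from them is the hard problem the scheme rests on.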

With security addressed, how do we solve the issue of identity authentication at both ends of the network? Currently, there are two general solutions to this issue:

  1. Pre-shared key
  2. Certificate

For example, when we access the website of a particular bank, the browser verifies the bank’s certificate to ensure that the network we are accessing is indeed the one we intend to visit. When a company’s headquarters and branch networks need to communicate, they can pre-configure each other’s public keys and authenticate each other through digital signatures. This is a variant of the pre-shared key (pure pre-shared keys don’t satisfy forward secrecy and should almost never be used in communications).

Once the issues of security and identity authenticity are solved, the most important VPN problems are resolved. Our current VPN solutions, whether IPSec VPN operating at the network layer or SSL/TLS/OpenVPN at the session layer, all utilize the algorithms discussed above for key exchange and authentication. Their complexity largely stems from handling configurations, encryption algorithm negotiations, and various compatibility issues. Meanwhile, WG, though not innovative in algorithms, cleverly organizes requirements and implements alternative approaches, resulting in breathtaking simplicity.

Here’s a comparison of code volume:

WG achieves its implementation with just 4k lines of kernel code! It’s so ingeniously crafted. It may sound a bit disrespectful, but by comparison OpenVPN or StrongSwan looks like code written by contractors paid by the line, whereas WG is the masterpiece of a real programmer! Linus himself was full of praise for WG, writing in an email on August 2, 2018:

Btw, on an unrelated issue: I see that Jason actually made the pull request to have wireguard included in the kernel. Can I just once again state my love for it and hope it gets merged soon? Maybe the code isn’t perfect, but I’ve skimmed it, and compared to the horrors that are OpenVPN and IPSec, it’s a work of art. Linus

Notably, Linus’s usual style of commenting on code is like this (Mauro is a Kernel maintainer):

“It’s a bug alright – in the kernel. How long have you been a maintainer? And you still haven’t learnt the first rule of kernel maintenance?” [...] “Shut up, Mauro. And I don’t ever want to hear that kind of obvious garbage and idiocy from a kernel maintainer again. Seriously.”

Getting Linus to “state my love” is about as difficult as climbing to the moon. So let’s study WG devotedly: the way it approaches product design and coding is worth our deep study!

The Concept of WireGuard Interface

Let’s start with the concept.

Many product managers don’t bother to clearly explain the new and old concepts within a product, especially when they create new ones. This is very wrong. From the beginning of architecture design, a product should have all its concepts clarified. When existing concepts cannot adequately describe parts of the product, we should have the courage to create new ones so that the description is complete. Concepts form the basis of communication among engineers, and between engineers and the outside world. Communicating through mutually agreed concepts is more precise and efficient. For example, having described ECDH earlier as a variant of the DH algorithm that uses elliptic curves, I don’t need to re-explain it every time I mention ECDH. Once a new concept is created, we can attach attributes to it to distinguish it from other concepts.

WG first defined an important concept — WireGuard Interface (hereafter referred to as wgi). Why do we need wgi? Why aren’t existing tunnel interfaces suitable? A wgi is a special interface:

  • It has its private key (curve25519)
  • It has a UDP port for listening to data
  • It has a group of peers (peer is another important concept), with each peer’s identity confirmed through its public key

By defining this new interface, WG sets wgi apart from a regular tunnel interface. With such an interface definition, the mapping to other data structures and the sending and receiving of data become clear and straightforward.

Let’s look at the WG interface configuration:

    [Interface]
    Address = 10.1.1.1/24
    ListenPort = 12345
    PrivateKey = blablabla

    [Peer]
    PublicKey = IWNVZYx0EacOpmWJq6lE8RfcFBd8EeUliOi+uYKQfG8=
    AllowedIPs = 0.0.0.0/0,::/0
    Endpoint = 1.1.1.1:54321

The initiator/responder of a WG VPN tunnel is symmetrical, hence there’s no client/server or spoke/hub distinction like in usual VPNs. Thus, configurations are also symmetrical.

In this configuration, we learn more about the peer concept: it is the counterpart of a WG node, with a statically configured public key, a whitelist of the networks behind the peer (AllowedIPs), and the peer’s address and port (Endpoint, which is not always required and may change automatically as the peer roams between networks).

In just 9 lines of configuration, we describe the simplest VPN network. There are no endless certificate setups, no long and hard-to-understand options, and no CA to set up. If you’ve had the misfortune of configuring IPSec VPN or OpenVPN, you will marvel at how much simplicity drives productivity.

From a data structure perspective, a hash table of peers and a hash table of key_index entries hang off the wgi. The key_index carried in each received data packet lets us locate the peer immediately. Each peer stores the endpoint state, the handshake state, and three sets of keypairs (the one currently in use, the one used before the last rekey, and the one to be used after the next rekey), where each keypair holds one key for the receiving direction and one for the sending direction.
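
The layout described above can be sketched roughly as follows. This is a simplified model, not the kernel’s actual structs: the class names, field names, and helper methods are all assumptions made for illustration, and the handshake state is omitted.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Keypair:
    send_key: bytes   # key for the sending direction
    recv_key: bytes   # key for the receiving direction


@dataclass
class Peer:
    public_key: bytes
    allowed_ips: List[str]
    endpoint: Optional[Tuple[str, int]] = None   # may be unset or change
    previous: Optional[Keypair] = None   # used before the last rekey
    current: Optional[Keypair] = None    # in use now
    upcoming: Optional[Keypair] = None   # to be used after the next rekey


@dataclass
class Interface:
    private_key: bytes
    listen_port: int
    peers: Dict[bytes, Peer] = field(default_factory=dict)    # by public key
    key_index: Dict[int, Peer] = field(default_factory=dict)  # by key_index

    def add_peer(self, peer: Peer) -> None:
        self.peers[peer.public_key] = peer

    def register_index(self, index: int, peer: Peer) -> None:
        self.key_index[index] = peer

    def lookup(self, index: int) -> Optional[Peer]:
        """Find the peer for a received packet in one hash lookup."""
        return self.key_index.get(index)


wg0 = Interface(private_key=b"\x01" * 32, listen_port=12345)
remote = Peer(public_key=b"\x02" * 32, allowed_ips=["0.0.0.0/0", "::/0"],
              endpoint=("1.1.1.1", 54321))
wg0.add_peer(remote)
wg0.register_index(0x1234, remote)
print(wg0.lookup(0x1234) is remote)  # True
```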

When the wgi interface is enabled (wg-quick up wg0), it gets initialized, and consequently, its related peers are created; conversely (wg-quick down wg0), wgi stops running, and related peers are deleted. The data structure’s outline is exceptionally clear.

The Process of Channel Negotiation Encryption with WireGuard

WG’s simplicity is also reflected in how it negotiates the encrypted tunnel. It builds the negotiation on the Noise Protocol Framework, an ingeniously designed framework for creating secure protocols, which won’t be discussed here but will be introduced in a later article. WG uses Noise_IKpsk2_25519_ChaChaPoly_BLAKE2s; from the protocol name you can probably infer that it selects Curve25519 for ECDH, ChaChaPoly for symmetric encryption, and BLAKE2s for hashing. In IKE/SSL/TLS, these algorithms are usually negotiated between the parties. WG sees no need to negotiate and fixes them in the protocol, which sharply reduces the number of supported algorithms and eliminates the negotiation step. Because both ends are already configured with each other’s public keys, the tunnel can be established in just 1-RTT (one round trip), that is, 2 messages. Compared with IPSec’s IKE protocol, which needs 6 messages in main mode (3-RTT) or at least 3 messages in aggressive mode (2-RTT), the benefit is clear. From Beijing to Seattle, one RTT is about 175 ms (cloudping.info), so every additional round trip noticeably delays tunnel setup. For any protocol, reducing the number of RTTs in tunnel negotiation can greatly improve performance.
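
As a back-of-the-envelope check, the round-trip counts above can be turned into setup delays. The sketch uses the 175 ms Beijing to Seattle figure quoted in the text and ignores processing time and packet loss; the function and table names are assumptions.

```python
RTT_MS = 175  # Beijing to Seattle round-trip time quoted above

# Round trips needed before the tunnel is usable, as stated in the text.
HANDSHAKE_ROUND_TRIPS = {
    "WireGuard (2 messages)": 1,
    "IKE aggressive mode (3 messages)": 2,
    "IKE main mode (6 messages)": 3,
}


def setup_delay_ms(round_trips, rtt_ms=RTT_MS):
    """Network delay spent on the handshake, ignoring CPU time."""
    return round_trips * rtt_ms


for name, round_trips in HANDSHAKE_ROUND_TRIPS.items():
    print(f"{name}: {setup_delay_ms(round_trips)} ms")
# WireGuard (2 messages): 175 ms
# IKE aggressive mode (3 messages): 350 ms
# IKE main mode (6 messages): 525 ms
```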

1-RTT also implies connection-less operation, as there’s no mutual confirmation. You can compare connection-oriented TCP (where the three-way handshake is like making eye contact before speaking) with connection-less UDP. Connection-oriented networks have numerous benefits, but connection-less shines in its simplicity, like a fish with only a seven-second memory, free from the burdens of past, present, and future.

Connection-oriented protocols generally need a state table to record how far each conversation has progressed. This dynamically generated table easily becomes a target for DoS attacks; TCP has struggled with SYN floods since its inception. Connection-less protocols don’t carry this burden: the server doesn’t need to treat a client’s handshake requests specially, doesn’t need to worry about packet loss (with 1-RTT, just handshake again), and doesn’t need to manage timers for a table of half-open connections (since no such table exists).

WG handshake packets encapsulate:

  • unencrypted_ephemeral: The sender’s temporary public key, generated for this handshake (unencrypted, for ECDH)
  • encrypted_static: The sender’s static public key, encrypted with a temporary key (key1) derived via ECDH from the receiver’s public key and the sender’s temporary private key
  • encrypted_timestamp: The current timestamp, encrypted with key2, which is obtained by mixing the ECDH result of the receiver’s public key and the sender’s static private key into key1
  • mac1: A keyed hash (MAC) over the message content, keyed with a hash of the receiver’s public key

The receiving side first verifies mac1 (a cheap authentication check; most attackers fail here) and discards the packet if it is incorrect. It then decrypts and verifies encrypted_static (identity confirmation; without the private key, the attacker fails here too), and finally verifies encrypted_timestamp (replay protection, so replayed handshakes fail here as well). Once everything checks out, the receiver creates its own temporary key pair. At this point, holding the sender’s temporary public key, it can already compute the keys both sides will use to encrypt data after the handshake. It still needs to send a handshake reply, though, so that the sender obtains the receiver’s temporary public key and can compute the same keys:

  • unencrypted_ephemeral: Receiver’s temporary public key generated for this handshake (unencrypted, for ECDH)
  • mac1: A keyed hash (MAC) over the message content, keyed with a hash of the peer’s public key

This way, both ends, each holding the other’s temporary public key and its own temporary private key, can run ECDH followed by HKDF (a method of deriving symmetric encryption keys from a DH result) to derive the symmetric keys for data encryption in both directions.
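
The HKDF step can be sketched with the standard library alone. This follows the extract-then-expand construction of RFC 5869, but it is only an illustration: it uses SHA-256 as a stand-in where WG uses BLAKE2s, it takes the ECDH result as a given byte string, and the info label and function names are assumptions.

```python
import hashlib
import hmac

HASH = hashlib.sha256   # assumption: stand-in hash; WG itself uses BLAKE2s
HASH_LEN = 32


def hkdf_extract(salt: bytes, ikm: bytes) -> bytes:
    """RFC 5869 extract step: concentrate the DH result into one key."""
    return hmac.new(salt, ikm, HASH).digest()


def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
    """RFC 5869 expand step: stretch the key into `length` output bytes."""
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), HASH).digest()
        okm += block
        counter += 1
    return okm[:length]


def derive_transport_keys(shared_secret: bytes, is_initiator: bool):
    """Return (send_key, recv_key); the two ends get mirrored results."""
    prk = hkdf_extract(b"\x00" * HASH_LEN, shared_secret)
    okm = hkdf_expand(prk, b"toy transport keys", 64)  # hypothetical label
    first, second = okm[:32], okm[32:]
    return (first, second) if is_initiator else (second, first)


secret = bytes(range(32))  # pretend this came out of ECDH
i_send, i_recv = derive_transport_keys(secret, is_initiator=True)
r_send, r_recv = derive_transport_keys(secret, is_initiator=False)
print(i_send == r_recv and i_recv == r_send)  # True
```

Deriving a separate key per direction means that what one side encrypts with its sending key, the other decrypts with its receiving key.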

If there’s packet loss, say the receiver didn’t get the sender’s handshake request or the sender didn’t get the handshake reply, the whole process can just restart. Since it’s 1-RTT anyway, it doesn’t waste any time.

This process also takes stealthiness into account: the receiver silently discards any unauthorized handshake (for example, one from a peer it doesn’t know, or a retransmitted packet). From the sender’s perspective, such handshake packets seem to vanish into a black hole. Unless an attacker is authorized to have their public key added as a peer on the WG gateway, they have virtually no chance of detecting that the receiver exists. By contrast, other VPN protocols can be detected during tunnel establishment, as with IPSec’s IKE or OpenVPN’s SSL/TLS handshake.
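
The “black hole” behavior can be sketched as a receiver that returns nothing at all when a check fails. The mac1 computation below follows the WireGuard protocol description (a 16-byte keyed BLAKE2s over the message, keyed with a BLAKE2s hash of the label "mac1----" and the receiver’s public key). Everything else is a simplifying assumption: the Responder class is hypothetical, the timestamp is passed in the clear, and decrypting encrypted_static is skipped.

```python
import hashlib
import hmac

LABEL_MAC1 = b"mac1----"


def mac1(receiver_public: bytes, message: bytes) -> bytes:
    """16-byte keyed BLAKE2s, keyed by a hash of the receiver's public key."""
    key = hashlib.blake2s(LABEL_MAC1 + receiver_public).digest()
    return hashlib.blake2s(message, digest_size=16, key=key).digest()


class Responder:
    """Hypothetical receiver that never answers an invalid handshake."""

    def __init__(self, public_key: bytes):
        self.public_key = public_key
        self.last_timestamp = {}   # peer id -> greatest timestamp accepted

    def handle_initiation(self, peer_id, body, tag, timestamp):
        # 1. Cheap check first: a wrong mac1 means a silent drop.
        if not hmac.compare_digest(tag, mac1(self.public_key, body)):
            return None
        # 2. Replay protection: the timestamp must be strictly newer.
        if timestamp <= self.last_timestamp.get(peer_id, -1):
            return None
        self.last_timestamp[peer_id] = timestamp
        return b"handshake-response"   # placeholder for the real reply


gateway = Responder(public_key=b"\x07" * 32)
body = b"initiation payload"
tag = mac1(gateway.public_key, body)

print(gateway.handle_initiation("laptop", body, tag, 100))        # b'handshake-response'
print(gateway.handle_initiation("laptop", body, tag, 100))        # None (replay)
print(gateway.handle_initiation("laptop", body, b"\x00" * 16, 101))  # None (bad mac1)
```

Because a prober who does not know the gateway’s public key cannot produce a valid mac1, the gateway never has to reveal that it is listening.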

Sending and Receiving Data Packets

With keys established, user data packets are straightforward to handle. The handling logic is exceptionally simple and clear, covered in just a few lines:

  • Sending:
    • User Space: Application sends data packets destined for the VPN peer network
    • Kernel: Routes are determined that the packets should be sent via the wg0 interface, thus handing it over to WG for processing
    • WG: Based on the destination address, it reverse-maps to the peer that should receive the packet, encrypts the packet with the key previously negotiated with that peer (renegotiating first if there is no key yet or the key has expired), and encapsulates it, together with the key_index, in a UDP packet addressed to the peer’s endpoint
  • Receiving:
    • Kernel: If a data packet’s UDP port is monitored by WG, it’s handed to WG for processing (WG’s recv handles this packet)
    • WG: Using the key_index in the packet, it locates the corresponding key in a hash table and decrypts the packet (not inline, but by enqueuing it on a decryption queue, a small network-system design trick)
    • WG: It identifies the peer from the key_index again and checks whether the source address of the decrypted original packet is in that peer’s allowed IP list. If it is, the original packet is passed to the kernel for further processing
    • Kernel: Based on routing table for the original packet’s destination address sends out the packet
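
The AllowedIPs lookups in both directions can be sketched with Python’s ipaddress module. This is a linear-scan illustration with a hypothetical AllowedIPsTable class and peer names; it is not how the kernel module stores or searches its table.

```python
import ipaddress


class AllowedIPsTable:
    """Hypothetical table mapping AllowedIPs networks to peer names."""

    def __init__(self):
        self.entries = []   # list of (network, peer name)

    def allow(self, peer, cidr):
        self.entries.append((ipaddress.ip_network(cidr), peer))

    def peer_for(self, address):
        """Longest-prefix match: the most specific network wins."""
        addr = ipaddress.ip_address(address)
        matches = [(net, peer) for net, peer in self.entries
                   if net.version == addr.version and addr in net]
        if not matches:
            return None
        return max(matches, key=lambda m: m[0].prefixlen)[1]

    def outbound_peer(self, dst):
        """Sending: which peer should receive a packet for dst?"""
        return self.peer_for(dst)

    def inbound_allowed(self, peer, src):
        """Receiving: may this peer send packets with source address src?"""
        return self.peer_for(src) == peer


table = AllowedIPsTable()
table.allow("office", "10.1.1.0/24")
table.allow("laptop", "10.1.1.7/32")
table.allow("exit", "0.0.0.0/0")

print(table.outbound_peer("10.1.1.7"))              # laptop
print(table.outbound_peer("8.8.8.8"))               # exit
print(table.inbound_allowed("office", "10.1.1.9"))  # True
print(table.inbound_allowed("office", "8.8.8.8"))   # False
```

The same table serves as a routing table on the way out and as an access control list on the way in, which is why a decrypted packet with an unexpected source address is simply dropped.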

Too Dry? Let’s Add Some Color!

In the BBL, I conducted a demonstration: establishing a WG VPN between my machine and a DigitalOcean machine, then sending an HTTP GET request, with the server returning a hello world text. Below is a Wireshark packet capture, slightly annotated by me:

[Figure: annotated Wireshark packet capture of the WireGuard demonstration]

Additional Reading

  1. WireGuard Protocol: https://www.wireguard.com/protocol/
  2. Noise Protocol: https://noiseprotocol.org/
  3. Authenticated Encryption with Associated Data (AEAD) algorithm, RFC 7539: https://tools.ietf.org/html/rfc7539
  4. HKDF: https://tools.ietf.org/html/rfc5869
  5. DH Algorithm Patent: https://patents.google.com/patent/US4200770
  6. WireGuard Source Code: https://github.com/WireGuard/WireGuard
  7. Linus’s Email: https://lists.openwall.net/netdev/2018/08/02/124

Moments of Reflection

As a veteran of network and security protocols, I found WireGuard’s impact on me profound. It hit me like a hammer on the head. With rational compromise and fewer bureaucratic processes, something as complex as a VPN protocol can become graceful and refined. A simple, well-thought-out user interface (the configuration) makes for a user-friendly product, and it shows the kind of design that hides great wisdom behind apparent simplicity. That simplicity streamlines everything downstream. Because the interface is simple and clear, nearly all data structures can be pre-generated. Because the protocol itself is simple (1-RTT), it’s easy to renegotiate. Lost packets during the handshake? Let them go; handshakes are fast and easy. Ultimately, simplicity leads to less code, free of complex twists and turns, so an engineer familiar with C and Linux development can grasp the main flow in an afternoon. That means the code is easier to review, tests take less time to write and reach higher coverage, there are fewer errors, and less time goes into fixing bugs. Engineers get more time to think deeply and perhaps even plan for the future. Hence there is no need for a 996 culture, and the time saved can be spent joyfully with family, or reading books and attending concerts with friends. It’s all worthwhile.
