Solving DNS Timeout Issues: A Guide to Configuring Your Own DNS Service

In production we found that, with certain ISPs in some countries or regions, the program occasionally fails to obtain the server's IP address because DNS resolution times out. The cause is that the ISP's DNS server is slow at recursive queries, so the DNS request times out.

The fix on the user's side is simple: stop using the DNS server automatically assigned by the ISP and switch to 8.8.8.8, and the problem goes away.

However, asking users to reconfigure their DNS is cumbersome and not user-friendly. So I started thinking: can I implement my own DNS service, so that when a request to the ISP's DNS server times out or fails, the program sends the request directly to 8.8.8.8 internally?

If we stay with gethostbyname() or getaddrinfo(), the only way to do this is to modify /etc/resolv.conf. That is not the right approach: it is imprecise and affects every other DNS request on the system. A feasible solution is to construct the DNS requests ourselves and parse the responses ourselves to obtain the IP information we need.
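The fallback idea can be sketched independently of the wire format. This is only an illustration under stated assumptions: the callback type dns_try_fn, the function dns_resolve_with_fallback() and the stub demo_try() are hypothetical names, not part of the author's code, and the stub stands in for a real UDP query so the sketch runs offline.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One resolution attempt against one server (hypothetical interface).
 * Returns 0 on success, nonzero on timeout or failure. */
typedef int (*dns_try_fn)(const char *server, const char *host,
                          char *ip_out, size_t ip_cap);

/* Tries each server in order and stops at the first one that answers.
 * Returns the index of that server, or -1 if every attempt failed. */
int dns_resolve_with_fallback(const char *const *servers, size_t count,
                              dns_try_fn try_one, const char *host,
                              char *ip_out, size_t ip_cap)
{
    assert(servers != NULL && try_one != NULL);
    assert(host != NULL && ip_out != NULL);
    for (size_t i = 0; i < count; i++) {
        if (try_one(servers[i], host, ip_out, ip_cap) == 0)
            return (int)i;
    }
    return -1;
}

/* Stand-in for a real query: pretends that every server except 8.8.8.8
 * times out, and answers with a made-up documentation address. */
int demo_try(const char *server, const char *host, char *ip_out, size_t ip_cap)
{
    (void)host;
    if (strcmp(server, "8.8.8.8") != 0)
        return -1;                          /* simulated timeout */
    if (ip_cap < sizeof "192.0.2.1")
        return -1;
    memcpy(ip_out, "192.0.2.1", sizeof "192.0.2.1");
    return 0;
}
```

Passing the ISP-assigned server first and 8.8.8.8 second gives the behaviour described above; in a real program the callback would send the query and wait for the response with a timeout.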

This article is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

Originally published at: https://segmentfault.com/a/1190000009369381, which is also in the author’s own column.

Reference

Even though DNS is considered one of the simpler networking protocols, none of the reference materials seemed to cover everything I needed for such a straightforward requirement.


I also captured packets myself. When capturing, it is better not to query authoritative DNS servers directly, but to target devices that relay DNS, such as gateways and routers. Their responses contain more information than what is shown in the last reference material below.

Basic Concepts of DNS

Here is a brief summary of the points relevant to this article:

In essence, DNS is a hierarchical, domain-based naming scheme implemented as a distributed database system. Its primary function is to map host names to IP addresses.

DNS resolution is usually initiated by the client in the Internet client/server model (below, “client” means the initiator of a DNS lookup). Most clients written in C now use getaddrinfo(). The older gethostbyname() is no longer recommended: it is obsolete, not reentrant, and only supports IPv4.

A DNS server listens on port 53. In reply to a client's request, it returns not just the IP information but the resource records associated with the domain name.

From the name string alone, it is impossible to tell whether it is a domain name or a host name. The total length of a domain name must be at most 255 bytes, and each segment of the domain name must be at most 63 bytes.
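The two length limits can be checked before a request is ever built. The sketch below is a hypothetical helper (dns_name_length_ok() is not from the author's code), and it assumes the 255-byte limit applies to the encoded form of the name, which is two bytes longer than the dotted text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

enum { DNS_MAX_LABEL = 63, DNS_MAX_NAME = 255 };

/* Returns true if `name` (dotted text such as "www.google.com") respects
 * the per-segment and total length limits. One trailing dot is accepted. */
bool dns_name_length_ok(const char *name)
{
    assert(name != NULL);
    size_t n = strlen(name);
    if (n > 0 && name[n - 1] == '.')
        n--;                              /* ignore one trailing dot */
    /* Encoded length = text length + 1 leading length byte + 1 final zero. */
    if (n == 0 || n + 2 > DNS_MAX_NAME)
        return false;

    size_t label = 0;                     /* length of the current segment */
    for (size_t i = 0; i < n; i++) {
        if (name[i] == '.') {
            if (label == 0)
                return false;             /* empty segment such as "a..b" */
            label = 0;
        } else if (++label > DNS_MAX_LABEL) {
            return false;
        }
    }
    return label > 0;
}
```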

DNS Message Format

DNS requests and responses share the same format, so they are not described separately. Starting from the first byte of the UDP payload, a DNS message is structured as follows:

| Data Type | Name in Ethereal | Description |
|---|---|---|
| uint16_t | Transaction ID | Identifier. Explained below |
| uint16_t | Flags | Flags. Explained below |
| uint16_t | Questions | Number of questions in the query list |
| uint16_t | Answer RRs | Number of (direct) answers |
| uint16_t | Authority RRs | Number of authority records (only in response packet) |
| uint16_t | Additional RRs | Number of additional records (only in response packet) |
| variable | Queries | Main content of the request. Present in the request packet and echoed back in the response packet |
| variable | Answers | Main content of the response |
| variable | Authoritative name servers | Name server data for the domain |
| variable | Additional records | Additional data |

  • Transaction ID: This is an identifier specified by the client; the DNS server returns this field as is, allowing the client to distinguish between different DNS requests.
  • RR: Abbreviation for Resource Record
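The six fixed fields occupy the first 12 bytes of the message, and all multi-byte DNS fields are sent in network byte order (big-endian). A minimal sketch of packing and unpacking them follows; the struct and function names are my own choices for illustration and are not taken from AMCDns.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { DNS_HEADER_LEN = 12 };

struct dns_header {
    uint16_t id;       /* Transaction ID */
    uint16_t flags;    /* Flags */
    uint16_t qdcount;  /* Questions */
    uint16_t ancount;  /* Answer RRs */
    uint16_t nscount;  /* Authority RRs */
    uint16_t arcount;  /* Additional RRs */
};

static void put_u16(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v >> 8);             /* high byte first */
    p[1] = (uint8_t)(v & 0xFF);
}

static uint16_t get_u16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/* Writes the header into buf. Returns 12, or 0 if buf is too small. */
size_t dns_header_pack(const struct dns_header *h, uint8_t *buf, size_t cap)
{
    assert(h != NULL && buf != NULL);
    if (cap < DNS_HEADER_LEN)
        return 0;
    put_u16(buf + 0, h->id);
    put_u16(buf + 2, h->flags);
    put_u16(buf + 4, h->qdcount);
    put_u16(buf + 6, h->ancount);
    put_u16(buf + 8, h->nscount);
    put_u16(buf + 10, h->arcount);
    return DNS_HEADER_LEN;
}

/* Reads the header from buf. Returns 12, or 0 if the message is too short. */
size_t dns_header_unpack(struct dns_header *h, const uint8_t *buf, size_t len)
{
    assert(h != NULL && buf != NULL);
    if (len < DNS_HEADER_LEN)
        return 0;
    h->id      = get_u16(buf + 0);
    h->flags   = get_u16(buf + 2);
    h->qdcount = get_u16(buf + 4);
    h->ancount = get_u16(buf + 6);
    h->nscount = get_u16(buf + 8);
    h->arcount = get_u16(buf + 10);
    return DNS_HEADER_LEN;
}
```

Serializing byte by byte, rather than casting the buffer to the struct, avoids depending on the host's endianness and struct padding.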

Flags

The 16-bit value is structured as follows (each entry gives the bit number, the Ethereal name, and a description):

  • Bit 15, Response: 0 indicates a query, 1 indicates a response
  • Bit 14~11, Opcode: the kind of query, present in both request and response packets:
      • 0: Standard query (most common)
      • 1: Inverse query (obsolete)
      • 2: Server status request
      • 4: Notify
      • 5: Update (used for dynamic updates, i.e. DDNS)
  • Bit 10, Authoritative: Used in response packets to indicate whether the server is an authoritative server for the domain
  • Bit 9, Truncated: Whether the message has been truncated. Used in both send and receive packets
  • Bit 8, Recursion desired: Set by the client and copied into the response; indicates whether recursion is wanted. As a client it is best to set it to 1; otherwise the server will not perform recursive queries and the data returned may be incomplete
  • Bit 7, Recursion available: Used in response packets to indicate whether the server can perform recursive queries
  • Bit 6, Reserved: Must be 0. Ethereal is right here; books that describe this bit as “data authenticated” are actually describing bit 5
  • Bit 5, Answer authenticated: A DNSSEC bit indicating that the server has verified the data (usually 0 in captured packets)
  • Bit 4, Checking disabled: Another DNSSEC bit, added after RFC 1035, where bits 6~4 were all still reserved
  • Bit 3~0, Reply code: Response status codes are as follows (see Microsoft documentation “DNS update message flags field” section). Codes 6 to 10 are used by dynamic update:
      • 0: No error
      • 1: Format error (the server could not interpret the query)
      • 2: Server failure
      • 3: Name does not exist
      • 4: Not implemented (the server does not support this kind of query)
      • 5: Request refused
      • 6: A name exists when it should not
      • 7: An RR set exists when it should not
      • 8: An RR set that should exist does not
      • 9: Server is not authoritative for the zone
      • 10: Name is not in the zone
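The bit positions above translate directly into masks and shifts. The following is a small sketch with names of my own choosing (not the author's API): it builds the Flags value for a query and performs a basic sanity check on the Flags value of a response.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DNS_FLAG_QR 0x8000u   /* bit 15: response */
#define DNS_FLAG_AA 0x0400u   /* bit 10: authoritative */
#define DNS_FLAG_TC 0x0200u   /* bit 9:  truncated */
#define DNS_FLAG_RD 0x0100u   /* bit 8:  recursion desired */
#define DNS_FLAG_RA 0x0080u   /* bit 7:  recursion available */

/* Bits 14~11. */
static inline unsigned dns_opcode(uint16_t flags) { return (flags >> 11) & 0xFu; }

/* Bits 3~0. */
static inline unsigned dns_rcode(uint16_t flags) { return flags & 0xFu; }

/* Builds the Flags field of a query: Response bit 0, the given opcode,
 * and optionally Recursion desired. */
uint16_t dns_make_query_flags(unsigned opcode, bool recursion_desired)
{
    assert(opcode <= 0xFu);
    uint16_t flags = (uint16_t)(opcode << 11);
    if (recursion_desired)
        flags |= DNS_FLAG_RD;
    return flags;
}

/* True if the flags belong to a response that is complete (not truncated)
 * and carries reply code 0. */
bool dns_response_usable(uint16_t flags)
{
    return (flags & DNS_FLAG_QR) != 0
        && (flags & DNS_FLAG_TC) == 0
        && dns_rcode(flags) == 0;
}
```

A standard recursive query therefore carries 0x0100, and a typical successful response from a recursive server carries 0x8180 (Response, Recursion desired, Recursion available, reply code 0).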

Format of Resource Records (RR)

The format of each RR is as follows:

| Data Type | Name in Ethereal | Description |
|---|---|---|
| variable | Name | Domain name of the resource, already mentioned earlier |
| uint16_t | Type | Type. Explained below |
| uint16_t | Class | Mostly 0x0001, representing IN |
| uint32_t | Time to Live | TTL seconds |
| uint16_t | Data length | Length of the remaining part of this RR |
| variable | | Main data of the RR |

Entries in the Queries section carry only Name, Type, and Class; the TTL, Data length, and main RR data fields are not present there.

Most Type values are defined in RFC 1035, with later additions defined elsewhere (for example, AAAA for IPv6 in RFC 3596). The ones used here are:

  • 1: “A”, indicates an IPv4 address
  • 2: “NS”, name of the domain name server
  • 5: “CNAME”, canonical name, often followed by a set of A and AAAA records
  • 28: “AAAA”, indicates an IPv6 address
Domain Name Compression

This section is based on RFC 1035, section 4.1.4, “Message Compression”.

The Name field of an RR can take three forms (this classification is my own, not an official one):

Full Domain Name Representation

For example, the full domain name “www.google.com” requires 16 bytes as follows:

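| B0 | B1 | B2 | B3 | B4 | B5 | B6 | B7 | B8 | B9 | B10 | B11 | B12 | B13 | B14 | B15 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| \3 | w | w | w | \6 | g | o | o | g | l | e | \3 | c | o | m | \0 |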

Note that the name is not simply copied as a dotted char * string; it is split into segments. In this example, the domain name is split into three segments: www, google, com. Each segment is preceded by one byte giving the length in bytes of the segment that follows. A \0 byte marks the end of the name (it looks the same as the terminating \0 of a char * string, but here it is a zero-length segment).
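The encoding just described can be sketched as a small function. dns_name_encode() is a hypothetical name, not the author's implementation; it accepts dotted text, tolerates one trailing dot, and rejects empty or over-long segments.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Encodes dotted text such as "www.google.com" as length-prefixed segments
 * ending with a zero byte. Returns the number of bytes written, or 0 if a
 * segment is empty or longer than 63 bytes, the result exceeds 255 bytes,
 * or `out` is too small. */
size_t dns_name_encode(const char *name, uint8_t *out, size_t cap)
{
    assert(name != NULL && out != NULL);
    size_t pos = 0;
    while (*name != '\0') {
        size_t len = strcspn(name, ".");      /* length of this segment */
        /* Need room for the length byte, the segment, and the final zero. */
        if (len == 0 || len > 63 || pos + 1 + len + 1 > cap)
            return 0;
        out[pos++] = (uint8_t)len;
        memcpy(out + pos, name, len);
        pos += len;
        name += len;
        if (*name == '.')
            name++;                           /* skip the separator */
    }
    if (pos + 1 > cap)
        return 0;
    out[pos++] = 0;                           /* terminating zero-length segment */
    return pos <= 255 ? pos : 0;
}
```

Encoding "www.google.com" with this function yields the 16 bytes shown in the table above.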

Labeled Representation

As mentioned earlier, a domain segment cannot exceed 63 bytes, so in the first representation the two highest bits (mask 0xC0) of a length byte are always 0. The second representation takes advantage of this.

This representation works like a pointer to a domain segment elsewhere in the DNS message. When parsing a Name field, the first byte of each segment is interpreted as follows:

  • if the highest two bits are 00, it is a length byte, as in the first representation
  • if the highest two bits are 11, it is a compressed representation. The remaining 6 bits of this byte, together with the 8 bits of the next byte, form a 14-bit offset counted from the start of the DNS message; it points to a domain segment in the message (not necessarily a full domain, see the third case).

For example, 0xC150 indicates it refers to the domain segment found at offset = 0x0150 in the DNS packet. 0x0150 is derived from clearing the highest two bits of 0xC150.

Mixed Representation

This is essentially a combination of the previous two representations. For instance, assuming the domain segment representing www.google.com in its entirety is at offset 0x20, the following usages apply:

  • 0xC020: naturally refers to www.google.com.
  • 0xC024: starts from the second segment of the full domain, referring to google.com.
  • 0x016DC024: where 0x6d represents the character m; 0x016D alone refers to m; the second segment 0xC024 refers to google.com, thus representing m.google.com.
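All three representations can be handled by one decoding loop. The sketch below is my own illustration, not the code from AMCDns; the jump limit DNS_MAX_JUMPS is an assumption added to guard against pointer loops in malformed packets.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { DNS_MAX_JUMPS = 16 };   /* assumed safety limit, not from the RFC */

/* Decodes the name starting at msg[off] into dotted text in `out`.
 * Returns the number of bytes the name occupies at `off` (so the caller
 * can continue parsing after it), or 0 on malformed input. */
size_t dns_name_decode(const uint8_t *msg, size_t msg_len, size_t off,
                       char *out, size_t out_cap)
{
    assert(msg != NULL && out != NULL);
    if (out_cap == 0)
        return 0;
    size_t pos = off, out_len = 0, consumed = 0;
    int jumps = 0;

    for (;;) {
        if (pos >= msg_len)
            return 0;
        uint8_t len = msg[pos];
        if ((len & 0xC0) == 0xC0) {                /* compression pointer */
            if (pos + 1 >= msg_len)
                return 0;
            if (jumps == 0)
                consumed = pos + 2 - off;          /* name ends here in place */
            if (++jumps > DNS_MAX_JUMPS)
                return 0;                          /* probably a pointer loop */
            pos = (size_t)((len & 0x3F) << 8) | msg[pos + 1];
            continue;
        }
        if ((len & 0xC0) != 0)
            return 0;                              /* 01 and 10 are reserved */
        if (len == 0) {                            /* end of name */
            if (jumps == 0)
                consumed = pos + 1 - off;
            break;
        }
        if (pos + 1 + len > msg_len || out_len + len + 2 > out_cap)
            return 0;
        if (out_len > 0)
            out[out_len++] = '.';
        memcpy(out + out_len, msg + pos + 1, len);
        out_len += len;
        pos += 1 + (size_t)len;
    }
    out[out_len] = '\0';
    return consumed;
}
```

With www.google.com stored at offset 0x20, decoding the four bytes 0x016DC024 produces m.google.com and reports that the name occupies 4 bytes at its own position.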

Analysis Tools

Recommended analysis tools:

  • Wireshark: A packet capturing tool, the successor to Ethereal, whose field names are used in the tables above
  • BIND: A DNS server that can be installed in your development environment to observe and generate DNS responses. FTP address: ftp.isc.org/isc/bind9/, Simple tutorial

Code Implementation

The code implementation is in a branch I used to study epoll(). GitHub repository here, licensed under LGPL.

The logic is straightforward and follows the principles described above. Most of the code in the repository is unrelated to this article; only the AMCDns.c / AMCDns.h files are relevant.

This code can completely replace the blocking getaddrinfo() function, and it can also integrate into asynchronous I/O libraries. The usage process is as follows:

  1. Call socket() to create a UDP socket and bind()
  2. Call AMCDns_GetDefaultServer() to get the system’s default DNS server configuration
  3. If not using the system’s default DNS server, specify it using the struct addrinfo type.
  4. Call AMCDns_SendRequest() to request IP information for a specified domain
  5. Call AMCDns_RecvAndResolve() to get either summary or full response.
  6. Call AMCDns_FreeResult() to clear DNS response data to avoid memory leaks
  7. Call close() on the socket
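The AMCDns function signatures are not reproduced in this article, so the following sketch does not use them. It shows the same flow with plain POSIX sockets only: build a query, send it over UDP to port 53, and wait for a response with a timeout. The function names, the IPv4-only server address, and the use of SO_RCVTIMEO are assumptions made for illustration.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/* Builds a standard recursive query (flags 0x0100, one question, class IN).
 * Returns the packet length, or 0 if the name is invalid or buf is too small. */
size_t dns_build_query(uint16_t id, const char *name, uint16_t qtype,
                       uint8_t *buf, size_t cap)
{
    assert(name != NULL && buf != NULL);
    if (cap < 12)
        return 0;
    memset(buf, 0, 12);
    buf[0] = (uint8_t)(id >> 8);
    buf[1] = (uint8_t)(id & 0xFF);
    buf[2] = 0x01;                        /* Recursion desired */
    buf[5] = 1;                           /* Questions = 1 */
    size_t pos = 12;
    while (*name != '\0') {               /* length-prefixed segments */
        size_t len = strcspn(name, ".");
        if (len == 0 || len > 63 || pos + 1 + len > cap)
            return 0;
        buf[pos++] = (uint8_t)len;
        memcpy(buf + pos, name, len);
        pos += len;
        name += len;
        if (*name == '.')
            name++;
    }
    if (pos + 5 > cap)                    /* zero byte + Type + Class */
        return 0;
    buf[pos++] = 0;
    buf[pos++] = (uint8_t)(qtype >> 8);
    buf[pos++] = (uint8_t)(qtype & 0xFF);
    buf[pos++] = 0x00;
    buf[pos++] = 0x01;                    /* Class IN */
    return pos;
}

/* Sends the query to server_ip (IPv4 text form) on port 53 and waits up to
 * timeout_ms for a datagram. Returns its length, or -1 on error or timeout. */
ssize_t dns_exchange_udp(const char *server_ip, const uint8_t *query, size_t qlen,
                         uint8_t *resp, size_t cap, int timeout_ms)
{
    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_port = htons(53);
    if (inet_pton(AF_INET, server_ip, &addr.sin_addr) != 1)
        return -1;

    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;
    struct timeval tv = { .tv_sec = timeout_ms / 1000,
                          .tv_usec = (timeout_ms % 1000) * 1000 };
    ssize_t n = -1;
    if (setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof tv) == 0 &&
        sendto(fd, query, qlen, 0, (struct sockaddr *)&addr, sizeof addr) == (ssize_t)qlen)
        n = recv(fd, resp, cap, 0);
    close(fd);
    return n;
}
```

A real client should also check that the Transaction ID and the source address of the response match the request before trusting it. When dns_exchange_udp() returns -1 for the ISP's server, the same query can be sent again to 8.8.8.8.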