My job involves simulating various real-life failures, including network packet loss, latency, and resource saturation. Recently, while reviewing the "DNS Access Unavailable" feature, I discovered unexpected behavior that prompted a closer look at its implementation. The feature works by modifying the /etc/hosts file, and this article explores the cases in which that modification does not produce the intended outcome.
For example, to make www.baidu.com unavailable, you can add the following line to /etc/hosts:
127.0.0.1 www.baidu.com #chaosblade
As a result, accessing the site through a browser, or from a terminal with curl www.baidu.com, reports an error.
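To make the mechanics concrete, here is a minimal sketch of adding and later removing such a tagged entry. It works on hosts-file content held in a string rather than on the real /etc/hosts, and the helper names addEntry and removeEntries are hypothetical; this is not the tool's actual implementation, only an illustration of why the #chaosblade marker is useful for cleanup.

```go
package main

import (
	"fmt"
	"strings"
)

// marker tags the lines added by the experiment so they can be found again.
const marker = "#chaosblade"

// addEntry (hypothetical helper) returns hosts content with
// "ip domain #chaosblade" appended as a new line.
func addEntry(hosts, ip, domain string) string {
	if hosts != "" && !strings.HasSuffix(hosts, "\n") {
		hosts += "\n"
	}
	return hosts + fmt.Sprintf("%s %s %s\n", ip, domain, marker)
}

// removeEntries (hypothetical helper) drops every line that ends with the
// marker and leaves all other lines unchanged.
func removeEntries(hosts string) string {
	var kept []string
	for _, line := range strings.Split(hosts, "\n") {
		if strings.HasSuffix(strings.TrimSpace(line), marker) {
			continue
		}
		kept = append(kept, line)
	}
	return strings.Join(kept, "\n")
}

func main() {
	original := "127.0.0.1 localhost\n"
	injected := addEntry(original, "127.0.0.1", "www.baidu.com")
	fmt.Print(injected)
	fmt.Print(removeEntries(injected))
}
```

Tagging the line means the experiment can be reverted without touching entries that were already in the file.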
The problem I encountered is that in some cases, even with the above setting in place, requests still reached Baidu's server instead of the configured 127.0.0.1.
The following discusses two failure scenarios, ping and go-http, covering how each failure was captured and what caused it.
Our goal is to map www.baidu.com to the local address 127.0.0.1.
Before modifying the /etc/hosts file, we first run ping www.baidu.com:
~$ ping www.baidu.com
PING www.baidu.com (110.242.68.4) 56(84) bytes of data.
64 bytes from 110.242.68.4: icmp_seq=1 ttl=55 time=3.32 ms
64 bytes from 110.242.68.4: icmp_seq=2 ttl=55 time=4.39 ms
64 bytes from 110.242.68.4: icmp_seq=3 ttl=55 time=2.39 ms
64 bytes from 110.242.68.4: icmp_seq=4 ttl=55 time=3.66 ms
Without interrupting the above ping command, we make the domain name unavailable by adding 127.0.0.1 www.baidu.com #chaosblade to the /etc/hosts file.
After the /etc/hosts file is modified, the running ping www.baidu.com still reports 110.242.68.4; its result has not changed.
But if we now open another terminal and run ping www.baidu.com, the result is in line with expectations:
~$ ping www.baidu.com
PING www.baidu.com (127.0.0.1) 56(84) bytes of data.
64 bytes from localhost (127.0.0.1): icmp_seq=1 ttl=64 time=0.014 ms
64 bytes from localhost (127.0.0.1): icmp_seq=2 ttl=64 time=0.007 ms
64 bytes from localhost (127.0.0.1): icmp_seq=3 ttl=64 time=0.011 ms
64 bytes from localhost (127.0.0.1): icmp_seq=4 ttl=64 time=0.009 ms
64 bytes from localhost (127.0.0.1): icmp_seq=5 ttl=64 time=0.007 ms
64 bytes from localhost (127.0.0.1): icmp_seq=6 ttl=64 time=0.006 ms
Is there a cache in the ping implementation?
In fact, it comes down to how the ping command is implemented.
ping implementation: github.com/iputils/iputils/blob/master/ping/ping.c#L656
Reading the source code shows that ping resolves the domain name specified by the user only once, before it enters the loop that continuously sends icmp packets. Therefore, even if the hosts file is modified while ping is running, the running ping process is not aware of the change.
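The resolve-once behavior can be sketched in a few lines of Go. This is an illustration of the pattern, not a port of ping.c: pingLike and lookupFunc are hypothetical names, the lookup is an in-memory map standing in for a real resolver call, and no packets are sent.

```go
package main

import "fmt"

// lookupFunc stands in for a real resolver call (assumption for this sketch).
type lookupFunc func(host string) (string, error)

// pingLike resolves host once, then records the address used for each of
// count probes. between runs after every probe and may change the mapping.
func pingLike(host string, count int, lookup lookupFunc, between func()) ([]string, error) {
	addr, err := lookup(host) // the only resolution in the whole run
	if err != nil {
		return nil, err
	}
	used := make([]string, 0, count)
	for i := 0; i < count; i++ {
		used = append(used, addr) // every probe targets the address resolved up front
		if between != nil {
			between()
		}
	}
	return used, nil
}

func main() {
	// hosts simulates the name-to-address mapping seen by the resolver.
	hosts := map[string]string{"www.baidu.com": "110.242.68.4"}
	lookup := func(h string) (string, error) {
		ip, ok := hosts[h]
		if !ok {
			return "", fmt.Errorf("no such host: %s", h)
		}
		return ip, nil
	}
	// Change the mapping while the loop is running, as in the experiment.
	edit := func() { hosts["www.baidu.com"] = "127.0.0.1" }

	used, _ := pingLike("www.baidu.com", 3, lookup, edit)
	fmt.Println("addresses used by the running loop:", used)

	fresh, _ := lookup("www.baidu.com")
	fmt.Println("address seen by a new lookup:", fresh)
}
```

The running loop keeps the address it resolved at startup, while a fresh lookup (the second terminal in the experiment) sees the edited mapping.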
3.1 Reproducing the Problem
Later I discovered that a situation similar to ping occurred when making an http request with go.
In the following example, an http get request for the www.baidu.com homepage is made 50 times, with an interval of 3s between requests. While the program is running, we modify the /etc/hosts file to see whether the requests are still sent and answered normally.
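// Test code
package main

import (
	"fmt"
	"io"
	"net/http"
	"testing"
	"time"
)

// TestHTTPGet requests the homepage 50 times, 3s apart, so that /etc/hosts
// can be edited while the loop is running.
func TestHTTPGet(t *testing.T) {
	url := "http://www.baidu.com"
	for i := 0; i < 50; i++ {
		resp, err := http.Get(url)
		if err != nil {
			t.Fatal(err) // resp is nil on error, so stop here
		}
		body, err := io.ReadAll(resp.Body)
		resp.Body.Close() // close per iteration rather than deferring inside the loop
		if err != nil {
			t.Fatal(err)
		}
		fmt.Println(string(body))
		time.Sleep(3 * time.Second)
	}
}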
Packet capture display 1: Details
Packet capture display 2: Statistics
The capture shows that the go http library did not point www.baidu.com to the IP 127.0.0.1, even though the hosts file had been changed.
3.2 Troubleshooting
Packet capture analysis shows that the 50 http requests share one long-lived connection: the source port is 58443 every time. This is, of course, consistent with the characteristics of an HTTP 1.1 persistent connection.
You may have a question here: looking at the request headers, nothing in them marks the HTTP connection as a persistent connection.
Indeed, there is no such header, but only because persistent connections are enabled by default in HTTP/1.1.
3.3 Other Ways to Reproduce
The go code above has to be run together with a packet capture to confirm that the http requests share a long connection. Because of that connection, no new dns query is performed even after the /etc/hosts file is modified, so the ip from the first dns query keeps being used instead of the configured 127.0.0.1.
Is there a way to know that the dns query was performed only once without capturing packets? Of course.
We can use the standard library httptrace:
package main

import (
	"context"
	"fmt"
	"io"
	"net/http"
	"net/http/httptrace"
	"testing"
	"time"
)

// createHTTPTraceRequest builds a GET request whose context carries trace hooks.
func createHTTPTraceRequest(ctx context.Context, url string) (*http.Request, error) {
	// Add some behavior log printing
	trace := &httptrace.ClientTrace{
		// Called once the underlying connection used to send the request is obtained
		GotConn: func(info httptrace.GotConnInfo) {
			fmt.Printf("GotConn info:%+v\n", info)
		},
		// Called when a dns query starts
		DNSStart: func(info httptrace.DNSStartInfo) {
			fmt.Printf("DNSStart info:%+v\n", info)
		},
		// Called when the dns query completes
		DNSDone: func(info httptrace.DNSDoneInfo) {
			fmt.Printf("DNSDone info:%+v\n", info)
		},
	}
	traceCtx := httptrace.WithClientTrace(ctx, trace)
	return http.NewRequestWithContext(traceCtx, http.MethodGet, url, nil)
}

func TestWithTraceHTTP(t *testing.T) {
	url := "http://www.baidu.com"
	for i := 0; i < 50; i++ {
		// Build request structure
		req, err := createHTTPTraceRequest(context.Background(), url)
		if err != nil {
			t.Fatal(err)
		}
		// Send request
		resp, err := http.DefaultClient.Do(req)
		if err != nil {
			t.Fatal(err) // resp is nil on error, so stop here
		}
		_, err = io.ReadAll(resp.Body)
		resp.Body.Close() // close per iteration rather than deferring inside the loop
		if err != nil {
			t.Fatal(err)
		}
		time.Sleep(3 * time.Second)
	}
}
The output is as follows:
DNSStart info:{Host:www.baidu.com}
DNSDone info:{Addrs:[{IP:110.242.68.3 Zone:} {IP:110.242.68.4 Zone:} {IP:2408:871a:2100:2:0:ff:b09f:237 Zone:} {IP:2408:871a:2100:3:0:ff:b025:348d Zone:}] Err:<nil> Coalesced:false}
GotConn info:{Conn:0x14000186000 Reused:false WasIdle:false IdleTime:0s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.001353958s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.001554083s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.00088375s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.00107s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.001237583s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.001469083s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.002459209s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.00587425s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.001838042s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.007866292s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.00083075s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.000906125s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.001565917s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.002019208s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.001439292s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.000538625s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.001538042s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.001537375s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.000745167s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.000566916s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.001680083s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.001349s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.001143166s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.00183775s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.001327459s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.001370041s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.001351875s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.001454833s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.001327166s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.000833042s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.001082209s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.001355333s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.002008875s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.001410666s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.000633791s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.001041583s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.000719583s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.000613334s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.000894167s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.000837292s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.001611s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.001473875s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.000864875s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.000621208s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.000998083s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.001407s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.00110875s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.00126275s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.001923709s}
If you look at the above log information carefully, you will find the following facts:
1. Only one dns query was performed for the 50 http requests, which is the root cause of the /etc/hosts modification not taking effect as expected.
2. A new underlying connection is created only when the first http request is sent; all subsequent requests reuse that connection.
How do you know?
In the GotConn output, Reused:false indicates a new connection and Reused:true indicates reuse of the underlying connection.
Summary
When the operating system receives a dns query request, it first checks whether the domain name has a corresponding entry in the /etc/hosts file; if it does not, it calls the dns service to perform a recursive query.
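That lookup order can be sketched as follows. This is a simplified model under assumptions: lookupHosts and resolve are hypothetical names, the hosts content is passed in as a string, and the dns step is a stand-in function. On a real Linux system the order is governed by the resolver configuration (for example nsswitch.conf), which this sketch does not model.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// lookupHosts (hypothetical helper) scans hosts-file content and returns the
// first IP mapped to name.
func lookupHosts(content, name string) (string, bool) {
	sc := bufio.NewScanner(strings.NewReader(content))
	for sc.Scan() {
		line := sc.Text()
		if i := strings.IndexByte(line, '#'); i >= 0 {
			line = line[:i] // strip comments such as "#chaosblade"
		}
		fields := strings.Fields(line)
		if len(fields) < 2 {
			continue // blank line or an address without any host name
		}
		for _, h := range fields[1:] {
			if strings.EqualFold(h, name) {
				return fields[0], true
			}
		}
	}
	return "", false
}

// resolve checks the hosts content first and falls back to dns, a stand-in
// for a real DNS query, only when no entry matches.
func resolve(content, name string, dns func(string) (string, error)) (string, error) {
	if ip, ok := lookupHosts(content, name); ok {
		return ip, nil
	}
	return dns(name)
}

func main() {
	fakeDNS := func(string) (string, error) { return "110.242.68.4", nil }

	withEntry := "127.0.0.1 localhost\n127.0.0.1 www.baidu.com #chaosblade\n"
	ip, _ := resolve(withEntry, "www.baidu.com", fakeDNS)
	fmt.Println("with hosts entry:", ip)

	ip, _ = resolve("127.0.0.1 localhost\n", "www.baidu.com", fakeDNS)
	fmt.Println("without hosts entry:", ip)
}
```

Each call to resolve reads the current content, which is why a newly started process sees the change while a process that already holds a resolved address or an open connection does not.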
After modifying the /etc/hosts file, we expect the change to take effect immediately, and it does.
However, there are special cases in which the change never reaches an already running program, such as the ping and http scenarios described above.
I think we can call it a caching problem.
Ahahaha, this is one of the two biggest problems in the world of computers:
- name
- cache