1. Introduction to Network Jitter
In the previous article, we provided a detailed introduction to the usage of Wireshark, focusing on how it can be used to analyze Network Jitter.
Network Troubleshooting in Practice (3): Detailed Explanation of Wireshark Usage
The most important usage of Wireshark is, of course, diagnosing network issues.
In this article, we'll use Wireshark to see how to approach these types of network problems.
This article mainly draws on the first four sections of Chapter 9 of "Network Analysis Using Wireshark Cookbook".
2. Viewing TCP Connection Information and Network Jitter with Wireshark
The process of TCP connection establishment and communication should already be familiar to you, and you can also refer to an article I wrote previously:
Transmission Control Protocol (TCP)
2.1 Connection Establishment with Network Jitter
As shown in the diagram, these three lines represent TCP's three-way handshake:
First, the client's TCP process sends a SYN packet with an initial sequence number (Seq) of 0. Wireshark displays relative sequence numbers by default, which is why Seq appears as 0; the actual initial sequence number on the wire is a random 32-bit value. Wireshark also shows further details of the packet, such as MSS and Selective ACK:
Among this information, you might be more interested in:
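- Maximum Segment Size (MSS): the largest amount of TCP payload, excluding headers, that the sender is willing to receive in a single segment.
- Window Scale (WSopt): the shift count applied to the 16-bit window field, which allows receive windows larger than 65,535 bytes.
- SACK Permitted: announces support for Selective ACK, which lets the sender retransmit only the segments that were actually lost; it takes effect only if both ends support it.
- Timestamps (TSopt): timestamp values echoed by the peer, which let each end measure the round-trip time between client and server.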
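To make these options concrete, here is a minimal Python sketch that decodes them from the raw options bytes of a SYN. The function name `parse_tcp_options` and the sample bytes are illustrative assumptions, not taken from the capture discussed here; the option kinds (2 = MSS, 3 = Window Scale, 4 = SACK Permitted, 8 = Timestamps) are the standard TCP option numbers.

```python
import struct

def parse_tcp_options(raw: bytes) -> dict:
    """Decode the handshake-relevant TCP options from raw option bytes."""
    opts = {}
    i = 0
    while i < len(raw):
        kind = raw[i]
        if kind == 0:          # End of Option List
            break
        if kind == 1:          # No-Operation: single padding byte
            i += 1
            continue
        if i + 1 >= len(raw):  # truncated option, stop parsing
            break
        length = raw[i + 1]
        if length < 2 or i + length > len(raw):
            break              # malformed length, stop parsing
        body = raw[i + 2:i + length]
        if kind == 2 and length == 4:
            opts["mss"] = struct.unpack("!H", body)[0]
        elif kind == 3 and length == 3:
            opts["window_scale"] = body[0]
        elif kind == 4 and length == 2:
            opts["sack_permitted"] = True
        elif kind == 8 and length == 10:
            opts["tsval"], opts["tsecr"] = struct.unpack("!II", body)
        i += length
    return opts

# Hypothetical sample: MSS 1460, SACK Permitted, Timestamps, NOP, Window Scale 7
sample = bytes.fromhex("020405b4" "0402" "080a0000303900000000" "01" "030307")
print(parse_tcp_options(sample))
```

With a window scale of 7, a window field of 512 in a later packet would mean an effective window of 512 << 7 = 65,536 bytes.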
The second line is the server's reply: it acknowledges (ACK) the client's SYN and carries the server's own SYN.
This packet contains the server's initial sequence number and the server's window size.
In the third line, the client's ACK packet carries the client's sequence number and states the client's window size again.
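The relationship between the sequence and acknowledgment numbers in these three packets can be sketched as a small consistency check. This is an illustrative sketch only: the `Segment` type and `is_valid_handshake` are hypothetical names, and the flags are simplified to plain strings.

```python
from typing import NamedTuple

SEQ_MOD = 2 ** 32  # TCP sequence numbers wrap around at 32 bits

class Segment(NamedTuple):
    flags: str  # simplified: "SYN", "SYN-ACK" or "ACK"
    seq: int
    ack: int

def is_valid_handshake(syn: Segment, syn_ack: Segment, ack: Segment) -> bool:
    """Check that three segments form a consistent three-way handshake."""
    return (
        syn.flags == "SYN"
        and syn_ack.flags == "SYN-ACK"
        and ack.flags == "ACK"
        # the server acknowledges the client's SYN (a SYN consumes one sequence number)
        and syn_ack.ack == (syn.seq + 1) % SEQ_MOD
        # the client continues from its own initial sequence number + 1
        and ack.seq == (syn.seq + 1) % SEQ_MOD
        # the client acknowledges the server's SYN
        and ack.ack == (syn_ack.seq + 1) % SEQ_MOD
    )

# Relative numbers, as Wireshark displays them by default
print(is_valid_handshake(
    Segment("SYN", 0, 0),
    Segment("SYN-ACK", 0, 1),
    Segment("ACK", 1, 1),
))
```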
2.2 Troubleshooting Issues
Quite simply, if the capture shows that the server does not reply to the client's SYN, or replies with an RST packet, then the corresponding port on the server is probably not listening, is actively rejecting connections, or is blocked by a firewall.
After confirming that both the client and the server are running properly, check the firewall configuration, verify that the username and password you transmitted are correct, and confirm that the IP address and port you're trying to access are correct.
You might use the ping command to check whether the server is online, but in many cases a firewall on the server blocks ICMP packets. Failing to ping the server therefore doesn't mean that it is down.
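Because ping can be blocked, a direct TCP connection attempt is often a more telling check. The sketch below is one possible way to do this with Python's standard `socket` module; `probe_tcp` and its return strings are hypothetical names chosen for illustration. A refused connection generally corresponds to an RST reply, while a timeout corresponds to no reply at all, for example because a firewall silently drops the SYN.

```python
import socket

def probe_tcp(host: str, port: int, timeout: float = 3.0) -> str:
    """Attempt a TCP handshake and summarize what came back."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"        # handshake completed
    except ConnectionRefusedError:
        return "refused"         # peer answered with RST: nothing is listening
    except socket.timeout:
        return "no reply"        # SYN went unanswered: host down or traffic filtered
    except OSError as exc:
        return f"error: {exc}"   # e.g. name resolution failure or unreachable network
```

For example, `probe_tcp("192.0.2.10", 443)` (a documentation-only address used here as a placeholder) would report how that endpoint responds to a SYN.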
3. TCP Retransmission
One of the most common issues during TCP communication is TCP retransmission.
TCP retransmission is an important mechanism used by TCP to recover from damaged, lost, duplicated, or out-of-order packets. If the sender does not receive an acknowledgment of the sent packet within a certain time, it will trigger a retransmission.
As a rule of thumb, a retransmission rate of 0.5% already degrades performance noticeably, and at around 5% TCP connections start to break.
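As a rough illustration of these thresholds, the following sketch computes the retransmission rate from two packet counts and labels it. The function names and the labels are assumptions made for this example; the counts would come from your own capture.

```python
def retransmission_rate(retransmitted: int, total: int) -> float:
    """Fraction of TCP packets in a capture that are retransmissions."""
    if total <= 0:
        raise ValueError("total must be a positive packet count")
    return retransmitted / total

def classify_rate(rate: float) -> str:
    """Label a rate using the 0.5% and 5% thresholds mentioned in the text."""
    if rate >= 0.05:
        return "severe"
    if rate >= 0.005:
        return "degraded"
    return "ok"

# Hypothetical counts: 12 retransmissions out of 1,000 TCP packets
rate = retransmission_rate(12, 1000)
print(f"{rate:.2%} -> {classify_rate(rate)}")
```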
In Wireshark, retransmitted packets are marked as TCP Retransmission.
To list all retransmitted packets in the current capture, apply the following display filter (the filter tcp.analysis.retransmission is a commonly used alternative):
expert.message == "Retransmission (suspected)"
As shown in the figure:
3.1 Case 1. Retransmission to Multiple Destinations
As depicted in the figure above, the retransmissions are not concentrated on one destination but spread across multiple destination servers. This usually points to a link problem, for example a heavily loaded network card.
Open the IO Graph option under the Statistics menu in Wireshark to monitor I/O load and see whether traffic on the current machine has reached the network card's bottleneck.
If, as in the figure above, the network card load is not high, the cause may be a faulty network card or link, or other heavily loaded links consuming the bandwidth.
You can log in to the network devices along the path and check their packet loss rates.
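The load check that the IO Graph performs visually can be approximated in a few lines. The sketch below assumes you have exported (timestamp, frame length) pairs from the capture; `throughput_bps`, `peak_utilization`, and the sample values are hypothetical and only illustrate the idea of comparing per-second throughput with the link capacity.

```python
from collections import defaultdict

def throughput_bps(packets, bucket_seconds=1.0):
    """Sum frame lengths per time bucket and convert to bits per second.

    packets: iterable of (timestamp_seconds, frame_length_bytes).
    """
    buckets = defaultdict(int)
    for ts, length in packets:
        buckets[int(ts // bucket_seconds)] += length
    return {b: total * 8 / bucket_seconds for b, total in sorted(buckets.items())}

def peak_utilization(packets, link_bps, bucket_seconds=1.0):
    """Highest per-bucket throughput as a fraction of the link capacity."""
    rates = throughput_bps(packets, bucket_seconds)
    return max(rates.values(), default=0.0) / link_bps

# Hypothetical sample: three frames seen on a 1 Gbit/s link
packets = [(0.1, 1500), (0.5, 1500), (1.2, 500)]
print(throughput_bps(packets))
print(peak_utilization(packets, link_bps=1_000_000_000))
```

A peak utilization close to 1.0 would suggest the network card is the bottleneck; a low value points toward the other causes described above.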
3.2 Case 2. Retransmission Only to the Same Destination
In a situation like this, where all retransmissions are concentrated on the same destination, it's usually caused by the low processing performance of the application itself.
To further confirm if this is the cause, you can check by following these steps:
- As introduced in the previous section, use Wireshark's IO Graph to check whether the network load is too high.
- Open the network session window through the Conversations option under the Statistics menu. In the IPv4 tab, check the Limit to display filter box to see only the sessions in which retransmissions occurred.
- In the same window, click the TCP tab and again check Limit to display filter to see which ports the retransmissions involve, identify the application behind them, and so pinpoint the specific issue.
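The grouping that the Conversations window performs can also be done on exported data. The sketch below assumes a list of (destination IP, destination port) pairs, one per retransmitted packet; `dominant_destination` and its 80% threshold are assumptions for illustration, not a rule from Wireshark or from the book.

```python
from collections import Counter

def dominant_destination(retransmissions, threshold=0.8):
    """Return the (ip, port) receiving at least `threshold` of all
    retransmissions, or None if they are spread across destinations."""
    counts = Counter(retransmissions)
    total = sum(counts.values())
    if total == 0:
        return None
    dest, hits = counts.most_common(1)[0]
    return dest if hits / total >= threshold else None

# Hypothetical data: nine retransmissions to one service, one elsewhere
concentrated = [("10.0.0.5", 3306)] * 9 + [("10.0.0.9", 80)]
spread = [("10.0.0.5", 3306), ("10.0.0.9", 80), ("10.0.0.7", 443)]
print(dominant_destination(concentrated))  # a single destination dominates (Case 2)
print(dominant_destination(spread))        # no dominant destination (Case 1)
```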
Pay special attention to whether the retransmissions are periodic or triggered by an event. For instance, in the image below a retransmission occurs approximately every 30 ms, coinciding with a particular operation the client performs in the software, which suggests that this operation triggers the slow request.
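One simple way to test for periodicity is to look at how much the gaps between retransmissions vary. The sketch below uses the coefficient of variation of the gaps; the function names and the 0.2 cutoff are assumptions chosen for illustration, and the timestamps are made-up values.

```python
from statistics import mean, pstdev

def interval_stats(timestamps):
    """Return (average gap, coefficient of variation) for event timestamps,
    or None if there are too few events to judge."""
    ts = sorted(timestamps)
    gaps = [b - a for a, b in zip(ts, ts[1:])]
    if len(gaps) < 2:
        return None
    avg = mean(gaps)
    cv = pstdev(gaps) / avg if avg > 0 else float("inf")
    return avg, cv

def looks_periodic(timestamps, max_cv=0.2):
    """Treat the events as periodic when the gaps vary little around their mean."""
    stats = interval_stats(timestamps)
    return stats is not None and stats[1] <= max_cv

# Hypothetical retransmission timestamps in seconds, roughly 30 ms apart
regular = [0.000, 0.030, 0.061, 0.090, 0.120]
irregular = [0.00, 0.01, 0.50, 0.52, 3.00]
print(looks_periodic(regular), looks_periodic(irregular))
```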
3.3 Case 3. Application Unresponsiveness Leading to Retransmission
If multiple retransmissions occur immediately after sending SYN or ACK packets when establishing a connection, and the intervals between retransmissions grow longer, this is usually due to application unresponsiveness.
In such circumstances, troubleshoot the reasons for application unresponsiveness. After 15 to 20 seconds, the application may attempt to re-establish the connection, or you can manually restart the application to retry connection establishment.
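The growing intervals described above are the signature of TCP's retransmission backoff, where each retry waits roughly twice as long as the previous one. A minimal sketch for spotting this pattern in a list of retransmission timestamps follows; `looks_like_backoff` and the 1.5 growth factor are assumptions for illustration.

```python
def looks_like_backoff(timestamps, min_ratio=1.5):
    """True when every gap between retransmissions is at least `min_ratio`
    times the previous gap, as with exponential backoff."""
    ts = sorted(timestamps)
    gaps = [b - a for a, b in zip(ts, ts[1:])]
    if len(gaps) < 2:
        return False
    return all(
        earlier > 0 and later >= earlier * min_ratio
        for earlier, later in zip(gaps, gaps[1:])
    )

# Hypothetical SYN retransmissions at 1 s, 3 s, 7 s and 15 s after the first SYN
print(looks_like_backoff([0, 1, 3, 7, 15]))
# Evenly spaced retransmissions do not match the backoff pattern
print(looks_like_backoff([0.00, 0.03, 0.06, 0.09]))
```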
3.4 Case 4. Retransmission Caused by Network Jitter
TCP itself has mechanisms such as Nagle's algorithm, the sliding window, slow start, congestion avoidance, and fast recovery to keep the network from becoming congested.
However, network jitter poses a significant problem for the TCP protocol and often triggers TCP retransmissions.
To confirm this, ping the destination address and watch how much the reported time value fluctuates from one reply to the next.
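To put a number on that fluctuation, you can summarize the round-trip times reported by ping. The sketch below treats jitter as the mean absolute difference between consecutive samples, which is one common simplification; `rtt_summary` and the sample values are hypothetical.

```python
from statistics import mean, pstdev

def rtt_summary(rtts_ms):
    """Summarize ping round-trip times given in milliseconds."""
    if len(rtts_ms) < 2:
        raise ValueError("need at least two RTT samples")
    # differences between consecutive replies show how unstable the delay is
    deltas = [abs(b - a) for a, b in zip(rtts_ms, rtts_ms[1:])]
    return {
        "min": min(rtts_ms),
        "avg": mean(rtts_ms),
        "max": max(rtts_ms),
        "stddev": pstdev(rtts_ms),
        "jitter": mean(deltas),
    }

# Hypothetical samples: mostly around 10 ms with one spike to 30 ms
print(rtt_summary([10.0, 12.0, 11.0, 30.0, 10.0]))
```

A jitter value that is large relative to the average round-trip time suggests that delay variation is worth investigating further.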
You can check:
- Whether the link is congested and whether its state is stable.
- Whether the server hosting the application lacks resources, has hardware faults, or is inadequately configured.
- Whether any device along the network path is overloaded or short of resources.
4. Summary
Overall, the problems above can be approached by asking the following questions:
- Scope: Is the problem tied to a particular host, a specific TCP connection, or a particular behavior?
- Step-by-step investigation: Is the link overloaded? Is the link losing packets? Are there performance issues on the server or client host? Are there performance issues in the application?
- Last resort: Is the problem caused by network jitter?
From my experience, most performance problems originate in the business layer, that is, in application code. So the first thing to check is whether the application code was modified in a way that could cause these problems around the time they appeared. Only after thoroughly ruling this out should you invest effort in capturing and analyzing network traffic with tools. Otherwise, you may be working hard in the wrong direction.
Problems are typically not caused by network jitter, although it is the easiest thing to blame. More often than not, attributing an issue to network jitter is merely a sign of laziness.