1. Fault Description
Fault Location:
The internal network of a power supply bureau, with problems affecting VLAN connectivity.
Fault Phenomenon:
Severe network congestion, with internal hosts experiencing intermittent connectivity to the internet and even among themselves.
Detailed Fault Description:
The network suddenly experienced communication interruptions. Certain VLANs were unable to access the internet, and access between different VLANs was also interrupted. Ping tests run from the server room showed that replies from hosts in the affected VLANs to the central switch were significantly delayed, with intermittent packet loss. Packet loss between VLANs was even more severe.
2. Detailed Fault Analysis
1. Preliminary Analysis
Preliminary judgment of potential causes:
- Switch ARP table update issues
- Broadcast or routing loop failures
- Malicious or virus attacks
Further information needed:
- Network topology and operational conditions under normal circumstances
- Switch ARP table details and switch load conditions
- Raw data packets transmitted over the network
2. Specific Analysis
Firstly, we obtained information from network administrators indicating approximately 450 hosts in the network, along with a simplified network topology diagram as shown in Figure 1.
(Figure 1: Original Simplified Network Topology)
From Figure 1, we know the network is divided into 6 VLANs, namely 10.230.201.0/24, 10.230.202.0/24, 10.230.203.0/24, 10.230.204.0/24, 10.230.205.0/24, and 10.230.206.0/24. VLANs 201 to 205 are allocated to different departments, and VLAN 206 is dedicated to servers. All VLANs are connected to the central switch (Passport 8010), which then connects to a firewall that links to the Internet and the provincial unit.
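The addressing plan above can be written down as a small lookup, which is handy when deciding whether two hosts talk within one VLAN or have to cross the central switch. This is only a sketch using Python's standard ipaddress module; it assumes that VLAN N corresponds to 10.230.N.0/24, as the text implies, and the names vlan_for and same_vlan are illustrative.

```python
import ipaddress

# Six /24 subnets from the text. Assumption: VLAN N uses 10.230.N.0/24.
VLAN_SUBNETS = {
    vlan_id: ipaddress.ip_network(f"10.230.{vlan_id}.0/24")
    for vlan_id in range(201, 207)
}


def vlan_for(ip):
    """Return the VLAN ID whose subnet contains ip, or None if none does."""
    addr = ipaddress.ip_address(ip)
    for vlan_id, subnet in VLAN_SUBNETS.items():
        if addr in subnet:
            return vlan_id
    return None


def same_vlan(ip_a, ip_b):
    """True when both addresses sit in the same known VLAN."""
    vlan_a = vlan_for(ip_a)
    return vlan_a is not None and vlan_a == vlan_for(ip_b)
```

With this mapping, an address such as 10.230.206.10 would be classified as belonging to the server VLAN, and any pair of addresses for which same_vlan returns False must be routed through the central switch.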
After gaining a general understanding of the network topology, we logged into the central switch using HyperTerminal and found that the switch was under heavy load. We immediately cleared the switch's ARP table and rebooted it, but the problem persisted. Hence, we decided to perform packet capture analysis on the network.
We configured port mirroring on the central switch (Passport 8010) and connected a laptop with the Colasoft Network Analyzer installed to the mirror port. With the analyzer in place, the simplified network topology appeared as shown in Figure 2.
(Figure 2: Simplified Network Topology After Installing Colasoft Network Analyzer)
Since the Colasoft Network Analyzer can capture and analyze data across VLANs, connecting the laptop to the central switch for analysis did not alter the network's topology.
We launched the Colasoft Network Analyzer on the laptop and captured data packets for about one minute (exact capture time was 53 seconds) before stopping. We then analyzed the captured data communications.
In the endpoint browser, we located the local segment under physical endpoints and identified a host with a MAC address of 00:00:E8:40:44:99, which had 40 IP addresses associated with it, as shown in Figure 3.
(Figure 3: Endpoint View for Local Segment)
Under normal circumstances, multiple IP addresses map to a single MAC address only in the following scenarios: a gateway, a proxy server, or a host with several IP addresses bound manually. The network administrator confirmed that each machine in this segment is bound to only one IP address and that there are no proxy servers. Additionally, this MAC is not the gateway MAC address. Hence, we suspected this host of conducting a spoofing attack.
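The check described here, one MAC address appearing with many IP addresses while not being a gateway or proxy, can be sketched in a few lines. This is not how the Colasoft endpoint view is implemented; it is an assumed, simplified equivalent that takes (MAC, IP) pairs already extracted from a capture. The function name and its parameters are illustrative.

```python
from collections import defaultdict


def macs_with_many_ips(observations, known_multi_ip_macs=(), threshold=1):
    """Return {mac: sorted IP list} for MACs seen with more than `threshold` IPs.

    observations: iterable of (mac, ip) string pairs taken from a capture.
    known_multi_ip_macs: MACs allowed to own several IPs (gateways, proxies).
    """
    ips_by_mac = defaultdict(set)
    for mac, ip in observations:
        ips_by_mac[mac.lower()].add(ip)

    allowed = {mac.lower() for mac in known_multi_ip_macs}
    return {
        mac: sorted(ips)
        for mac, ips in ips_by_mac.items()
        if len(ips) > threshold and mac not in allowed
    }
```

In this case, the gateway MAC would be passed in known_multi_ip_macs, leaving a host such as 00:00:E8:40:44:99 with its 40 associated addresses as the only entry in the result.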
By right-clicking on the 00:00:E8:40:44:99 node in Figure 3, and choosing the "Locate Browser Node (L)" command from the menu, the browser was focused on the 00:00:E8:40:44:99 node. The protocol view showed that this node had proactively sent 22,613 ARP reply packets, while sending only 2 ARP request packets, as shown in Figure 4.
(Figure 4: Protocol Distribution of Communications for 00:00:E8:40:44:99 Host)
From the data packets shown in Figure 4, it is clear that 00:00:E8:40:44:99 was actively sending ARP reply packets to other hosts on the network, claiming to be various IP addresses, which changed continuously. This confirms that the machine with MAC address 00:00:E8:40:44:99 was performing an ARP spoofing attack.
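The imbalance between ARP replies and ARP requests is what gives the attacker away, and it can be measured directly from raw ARP payloads. The sketch below decodes the standard 28-byte Ethernet/IPv4 ARP layout with Python's struct module and tallies requests, replies, and distinct claimed IP addresses per sender MAC. It assumes the payloads have already been extracted from the capture with the Ethernet header removed; the function names, including the build_arp helper for synthetic packets, are illustrative and not part of any tool mentioned here.

```python
import socket
import struct
from collections import Counter

# htype, ptype, hlen, plen, opcode, sender MAC, sender IP, target MAC, target IP
ARP_FORMAT = "!HHBBH6s4s6s4s"
ARP_SIZE = struct.calcsize(ARP_FORMAT)  # 28 bytes for Ethernet/IPv4
ARP_REQUEST, ARP_REPLY = 1, 2


def build_arp(oper, sender_mac, sender_ip,
              target_mac="00:00:00:00:00:00", target_ip="0.0.0.0"):
    """Build a synthetic ARP payload, for experiments only."""
    sha = bytes.fromhex(sender_mac.replace(":", ""))
    tha = bytes.fromhex(target_mac.replace(":", ""))
    return struct.pack(ARP_FORMAT, 1, 0x0800, 6, 4, oper,
                       sha, socket.inet_aton(sender_ip),
                       tha, socket.inet_aton(target_ip))


def parse_arp(payload):
    """Return (opcode, sender_mac, sender_ip) from a raw ARP payload."""
    if len(payload) < ARP_SIZE:
        raise ValueError("ARP payload too short")
    fields = struct.unpack(ARP_FORMAT, payload[:ARP_SIZE])
    oper, sha, spa = fields[4], fields[5], fields[6]
    mac = ":".join(f"{byte:02X}" for byte in sha)
    return oper, mac, socket.inet_ntoa(spa)


def arp_profile(payloads, capture_seconds):
    """Summarize ARP behavior per sender MAC over one capture."""
    requests, replies = Counter(), Counter()
    claimed = {}
    for payload in payloads:
        oper, mac, ip = parse_arp(payload)
        if oper == ARP_REQUEST:
            requests[mac] += 1
        elif oper == ARP_REPLY:
            replies[mac] += 1
            claimed.setdefault(mac, set()).add(ip)
    return {
        mac: {
            "requests": requests[mac],
            "replies": replies[mac],
            "claimed_ips": len(claimed.get(mac, ())),
            "replies_per_second": replies[mac] / capture_seconds,
        }
        for mac in set(requests) | set(replies)
    }
```

For scale, the 22,613 replies observed in the 53-second capture work out to roughly 427 replies per second from a single host, against only 2 requests. A normal host answers ARP only when asked, so its reply count stays small and each reply carries its own IP address.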
Additionally, the ARP diagnostic events section within the diagnostics view also provided relevant alert information, as detailed in Figure 5.
(Figure 5: ARP Diagnostics Information for 00:00:E8:40:44:99)
Through the above analysis, we determined that 00:00:E8:40:44:99 was indeed conducting an ARP spoofing attack. The network administrators promptly located the host, thanks to their existing IP and MAC address registry, and disconnected its network cable from the layer 2 switch. This action swiftly restored normal network operation, with both internal and external communications (including Internet and provincial network unit access) returning to normal speeds.
Additionally, as indicated in Figure 3, the hosts with MAC addresses 00:02:B0:BC:68:D2, 00:0B:DB:4B:46:81, and 00:11:25:8D:7D:C1 exhibited higher traffic usage. Upon examining the specific traffic of these machines, it was found that 00:02:B0:BC:68:D2 and 00:0B:DB:4B:46:81 were engaged in mutual data transfers, while 00:11:25:8D:7D:C1 was linked to the IP address 10.230.204.1, which is the gateway for the 10.230.204.0/24 segment. Its higher traffic consumption was thus deemed normal. Consequently, it was confirmed that the intermittent network disruptions were caused by the previously identified host machine with the MAC address 00:00:E8:40:44:99.
Having identified the fault point and assisted in restoring network functionality, we departed from the site to attend to other matters, without investigating the specifics of the 00:00:E8:40:44:99 host.
Later in the afternoon, we received a call from the power bureau's network manager, who reported that when they located the host with the MAC address 00:00:E8:40:44:99, the user was merely editing a document in Microsoft Word and had not engaged in any deliberate attack. After deploying antivirus software and conducting a thorough scan, several viruses were detected and removed. Upon reconnecting the host to the network, communication remained normal. This led to the conclusion that the network issue was caused by a worm on the host with MAC address 00:00:E8:40:44:99, which autonomously carried out ARP spoofing attacks, resulting in intermittent network access.
3. Conclusion
In medium to large networks, network faults can be complex and challenging to troubleshoot without the aid of professional network analysis tools. As demonstrated in this case, without packet capture, it would have been difficult to identify the fault point even after examining switch traffic, as the traffic of 00:00:E8:40:44:99 was not exceptionally high.
Additionally, due to the short duration of packet capture, only 53 seconds, there may still be undetected issues within the network because some hosts might be inactive and thereby fail to send or receive relevant packets, making them difficult to identify. Therefore, for enterprise network operations, network administrators must employ specialized network analysis tools to conduct long-term, effective monitoring and analysis to minimize potential network faults and security threats.
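The long-term monitoring recommended above differs from a one-off capture mainly in that it has to work on a stream and forget old data. One way to sketch that is a sliding-window monitor that flags a MAC as soon as it claims too many distinct IP addresses within a recent time window. The class name, the default window of 60 seconds, and the default limit of one IP per MAC are all assumptions chosen for illustration, and timestamps are assumed to arrive in non-decreasing order.

```python
from collections import defaultdict, deque


class ArpClaimMonitor:
    """Flag a MAC that claims more than max_ips distinct IPs within a window."""

    def __init__(self, window_seconds=60.0, max_ips=1):
        self.window_seconds = window_seconds
        self.max_ips = max_ips
        self._events = defaultdict(deque)  # mac -> deque of (timestamp, ip)

    def observe(self, timestamp, mac, ip):
        """Record one ARP reply; return True if mac now exceeds the limit."""
        events = self._events[mac]
        events.append((timestamp, ip))
        # Discard observations that have aged out of the window.
        while events and timestamp - events[0][0] > self.window_seconds:
            events.popleft()
        distinct_ips = {seen_ip for _, seen_ip in events}
        return len(distinct_ips) > self.max_ips
```

Feeding each ARP reply's capture time, sender MAC, and sender IP into observe lets an alert fire while an attack is in progress, and hosts that were idle during a short capture are still covered once they become active. Legitimate multi-IP devices such as gateways would need a higher max_ips or an exemption list.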