Abstract: This article walks through the working process of the SRT protocol, analyzes the structure of SRT packets, and uses examples to show how Wireshark packet captures can be used to analyze link faults and solve practical problems.
Introduction
The SRT (Secure Reliable Transport) protocol is an emerging audio-video transmission protocol that enables high-quality, low-latency, real-time audio-video transmission over the public Internet.
"SRT Protocol Analysis for Public Network Transmission (Part 1)" discussed how to measure the reliability of the SRT protocol and how to configure SRT link parameters for different application scenarios. This article, the second part, starts with the working process of the SRT protocol, analyzes the SRT packet structure, and then uses examples to show how Wireshark packet captures can help troubleshoot link faults or obtain link information.
1
SRT Protocol Working Process
The most common working mode in the SRT protocol is the "Caller-Listener" mode. The listener continuously listens on a fixed UDP port, and the caller establishes the SRT connection by contacting the listener's public IP address and that port. The caller and listener roles matter mainly during the handshake phase, and either the encoding end or the decoding end can act as caller or listener.
Figure 1 shows the working process of the SRT protocol, including the handshake, parameter exchange, data transfer, and connection shutdown. While payload data is being transmitted, both parties also send control packets for functions such as packet loss recovery and connection maintenance.

Figure 1 SRT Protocol Working Process
2
SRT Packet Structure
The SRT protocol is an improvement on UDT (UDP-based Data Transfer Protocol). An Internet-Draft describing SRT was submitted to the IETF on March 10, 2020, indicating that the protocol has entered a relatively stable phase of development.
The traditional strength of SRT is point-to-point real-time audio-video transmission. In the past two years, SRT has also developed rapidly in upstream streaming, with many mainstream platforms and companies supporting SRT as a replacement for RTMP. The key enabler is the StreamID feature of SRT, which is carried in the configuration extension module of the SRT handshake packet.
Overall, the SRT protocol has two types of packets, Data Packets and Control Packets, distinguished by the highest bit (flag bit) of the SRT header. A value of 0 indicates a Data Packet, and 1 indicates a Control Packet. Control Packets come in several types, including Handshake, Acknowledgement (ACK), Negative Acknowledgement (NAK), Acknowledgement of ACK (ACKACK), Keepalive, and Shutdown.
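The flag-bit rule above can be illustrated with a minimal Python sketch. It assumes the UDP payload has already been extracted from a capture; the function name `classify_packet` is hypothetical, and the numeric control type values (Handshake = 0, Keepalive = 1, ACK = 2, NAK = 3, Shutdown = 5, ACKACK = 6) are taken from the SRT Internet-Draft rather than from this article.

```python
import struct

# Control type values as listed in the SRT Internet-Draft (an assumption here;
# this article itself only states Handshake = 0, ACK = 2 and NAK = 3).
CONTROL_TYPES = {
    0x0000: "Handshake",
    0x0001: "Keepalive",
    0x0002: "ACK",
    0x0003: "NAK",
    0x0005: "Shutdown",
    0x0006: "ACKACK",
}


def classify_packet(payload: bytes) -> str:
    """Return 'Data' or the control packet name for one SRT UDP payload."""
    if len(payload) < 16:
        raise ValueError("an SRT header is 16 bytes long")
    # The first 32-bit word is big-endian; its highest bit is the flag bit.
    (first_word,) = struct.unpack("!I", payload[:4])
    if first_word >> 31 == 0:
        return "Data"
    # For control packets, the next 15 bits hold the control type.
    control_type = (first_word >> 16) & 0x7FFF
    return CONTROL_TYPES.get(control_type, f"Control type {control_type:#06x}")
```

For example, a payload whose first four bytes are `80 02 00 00` is classified as an ACK, while a payload starting with a zero bit is a Data Packet.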
2.1
Data Packet Structure
Figure 2 shows the structure of an SRT Data Packet, which carries the data to be transmitted. The SRT header is 16 bytes long, with the highest bit being the flag bit. The SRT Data Packet header includes four areas: Packet Sequence Number, Message Number, Timestamp, Destination Socket ID.
- Packet Sequence Number: SRT uses a sequence number-based packet sending mechanism, incrementing the packet sequence number each time a packet is sent from the sender.
- Message Number: Counted independently of the Packet Sequence Number. It is preceded by four flag fields (see Figure 2).
- Timestamp: A relative timestamp based on the connection establishment time (StartTime), in microseconds.
- Destination Socket ID: Used to distinguish different SRT streams in the case of multiplexing.

Figure 2 SRT Data Packet
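As a rough illustration of the header layout described above, the following Python sketch unpacks the 16-byte Data Packet header. The bit widths of the four flag fields (PP: 2 bits, O: 1 bit, KK: 2 bits, R: 1 bit, followed by a 26-bit Message Number) are assumed from the SRT Internet-Draft; the class and function names are hypothetical.

```python
import struct
from dataclasses import dataclass


@dataclass
class DataHeader:
    sequence_number: int   # 31-bit Packet Sequence Number
    position_flag: int     # PP, 2 bits: position of the packet in its message
    in_order: bool         # O, 1 bit
    encryption_flag: int   # KK, 2 bits
    retransmitted: bool    # R, 1 bit
    message_number: int    # 26 bits, counted independently
    timestamp_us: int      # microseconds since connection start
    dest_socket_id: int


def parse_data_header(payload: bytes) -> DataHeader:
    """Parse the 16-byte header of an SRT Data Packet (layout assumed)."""
    if len(payload) < 16:
        raise ValueError("an SRT header is 16 bytes long")
    word0, word1, timestamp, socket_id = struct.unpack("!IIII", payload[:16])
    if word0 >> 31:
        raise ValueError("flag bit is 1: this is a Control Packet")
    return DataHeader(
        sequence_number=word0 & 0x7FFFFFFF,
        position_flag=word1 >> 30,
        in_order=bool((word1 >> 29) & 0x1),
        encryption_flag=(word1 >> 27) & 0x3,
        retransmitted=bool((word1 >> 26) & 0x1),
        message_number=word1 & 0x03FFFFFF,
        timestamp_us=timestamp,
        dest_socket_id=socket_id,
    )
```

Comparing the parsed values with the fields Wireshark displays for the same packet is a quick way to check that the assumed layout matches the SRT version in use.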
2.2
Handshake Packet Structure
Handshake packets come in two versions: HSv4 (SRT version < 1.3) and HSv5 (SRT version >= 1.3). Figure 3 shows the structure of an HSv5 handshake packet, which mainly includes five areas: SRT Header, Handshake Control Info (cif.hsv5), Handshake Request/Response Extension Module (HSREQ/HSRSP), Encryption Extension Module (KMREQ/KMRSP), and Configuration Extension Module (CONFIG). The discussion below focuses on the first three areas.

Figure 3 HSv5 Handshake Packet
1. The headers of all SRT control packets share the same layout, containing four areas: Control Type and Reserved Area, Additional Information, Timestamp, and Destination Socket ID. For handshake packets, the Control Type field equals 0.
2. In the Handshake Control Info Area (cif.hsv5), the following fields are important:
- ISN: Randomly generated Initial Sequence Number for packets. All subsequent data packets are counted based on this.
- Handshake Type: This field first indicates the handshake phase of the packet (in "Caller-Listener" mode, the phases are Induction and Conclusion). Its second purpose, more important to the user, is to carry an error code when the handshake fails, as listed in Table 1 below.
| Error Code | Error Type | Error Code | Error Type |
|---|---|---|---|
| 1000 | Unknown Reason | 1008 | Peer Version Too Old |
| 1001 | System Function Error | 1009 | Socket Conflict in Rendezvous Mode |
| 1002 | Peer Rejection | 1010 | Password Error |
| 1003 | Resource Allocation Issue | 1011 | Password Requirement |
| 1004 | Error Data in Handshake | 1012 | Stream Flag Conflict |
| 1005 | Listener Backlog Overflow | 1013 | Congestion Control Type Conflict |
| 1006 | Internal Program Error | 1014 | Packet Filter Conflict |
| 1007 | This Socket Is Closed | 1015 | Group Conflict |
Table 1 Error Codes and Corresponding Error Types
- SRT Socket ID: This field needs to be distinguished from the Destination Socket ID in the SRT header, as it only applies to the handshake phase, while the Destination Socket ID applies throughout data transfers.
- Sync Cookie: In "Caller-Listener" mode, to prevent DoS attacks, only the listener generates the sync cookie, derived from the listener's host, port, and current time, accurate to one minute.
3. Key fields in the Handshake Request Extension Module (HSREQ) include:
- SRT Version: If either party's SRT version is below 1.3, the connection is established using the HSv4 handshake, which requires three or four round trips, while the newer HSv5 handshake requires only two. For compatibility reasons, even if both parties run version 1.3 or later, the initial handshake request is sent in HSv4 format.
- SRT Flag Bit: Seven flag bits are used to implement various modes and functions of SRT.
- Send and Receive Delays: Since version 1.3, SRT supports bidirectional transmission, so a different delay can be set for each direction. In conventional unidirectional transmission (e.g., A sending data to B), the delay (Latency) is the larger of A's send delay (PeerLatency) and B's receive delay (RcvLatency), and it is negotiated during the handshake phase. Some codecs use the same value for PeerLatency and RcvLatency for simplicity, which does not affect unidirectional transmission.
4. Encryption Extension Module (KMREQ) and Configuration Extension Module (CONFIG)
- These last two extension modules are optional and are not discussed in detail here due to space limitations. The Encryption Extension Module (KMREQ) implements SRT's AES-128/AES-192/AES-256 encryption. The Configuration Extension Module (CONFIG) has four types: SRT_CMD_SID, SRT_CMD_CONGESTION, SRT_CMD_FILTER, and SRT_CMD_GROUP. The SRT_CMD_SID extension is what carries the StreamID used in upstream streaming. Interested readers can capture packets to inspect these modules in detail.
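The Handshake Type field and the latency rule described above can be sketched in Python as follows. This is an illustration under stated assumptions, not libsrt code: the function names are hypothetical, the rejection reasons follow Table 1, and the phase values (Induction = 1, Conclusion = 0xFFFFFFFF, plus the rendezvous-only values Wavehand, Agreement, and Done) are taken from the SRT Internet-Draft rather than from this article.

```python
# Handshake phase values, assumed from the SRT Internet-Draft.
HANDSHAKE_PHASES = {
    0x00000000: "Wavehand",    # rendezvous mode only
    0x00000001: "Induction",
    0xFFFFFFFD: "Done",
    0xFFFFFFFE: "Agreement",   # rendezvous mode only
    0xFFFFFFFF: "Conclusion",
}

# Rejection reasons, following Table 1.
REJECTION_REASONS = {
    1000: "Unknown Reason",
    1001: "System Function Error",
    1002: "Peer Rejection",
    1003: "Resource Allocation Issue",
    1004: "Error Data in Handshake",
    1005: "Listener Backlog Overflow",
    1006: "Internal Program Error",
    1007: "This Socket Is Closed",
    1008: "Peer Version Too Old",
    1009: "Socket Conflict in Rendezvous Mode",
    1010: "Password Error",
    1011: "Password Requirement",
    1012: "Stream Flag Conflict",
    1013: "Congestion Control Type Conflict",
    1014: "Packet Filter Conflict",
    1015: "Group Conflict",
}


def describe_handshake_type(value: int) -> str:
    """Translate a 32-bit Handshake Type value into a readable label."""
    if value in HANDSHAKE_PHASES:
        return HANDSHAKE_PHASES[value]
    if value in REJECTION_REASONS:
        return f"Rejected ({value}): {REJECTION_REASONS[value]}"
    return f"Unknown handshake type {value:#010x}"


def negotiated_latency_ms(sender_peer_latency_ms: int,
                          receiver_rcv_latency_ms: int) -> int:
    """Latency for A -> B: the larger of A's send delay and B's receive delay."""
    return max(sender_peer_latency_ms, receiver_rcv_latency_ms)
```

For instance, a Handshake Type of 1002 is reported as a peer rejection, and a sender configured with 120 ms facing a receiver configured with 200 ms ends up with a 200 ms latency.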
2.3
ACK Packet Structure
An ACK packet is a positive acknowledgment sent by the SRT receiver to the sender. Upon receiving an ACK, the sender assumes the corresponding data packet has been successfully delivered. The ACK packet also contains estimated link data from the receiver, which can assist with congestion control for the sender. Figure 4 shows the ACK packet structure, highlighting several key fields:

Figure 4 ACK Control Packet
- Control Type: This field equals 2, indicating an ACK packet.
- Additional Information: Includes the independently counted ACK sequence number, primarily used to match ACK packets with ACKACK packets.
- Recently Received Data Packet Sequence Number +1: Equals the sequence number of the most recently received data packet plus 1. For instance, if this field in the ACK packet shows 6, the first 5 data packets have all been received, and the sender can purge them from its buffer. Note that this field relates to the Packet Sequence Number and has nothing to do with the ACK sequence number.
- RTT Estimate: An RTT estimate calculated using ACK and ACKACK packets, providing the round-trip time for the link.
- RTT Jitter Value: Measures the RTT's variability, where a higher value indicates greater link instability.
- Receiver's Available Buffer Data: Shows how much buffering data the receiver currently holds, which is available for decoding. A higher value is better, and it is capped by the Latency parameter.
- Link Bandwidth Estimate: Provides a bandwidth estimate for the current link.
- Reception Rate Estimate: Estimates the receiver's downstream network bandwidth.
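The ACK fields listed above can be read out of a captured payload with a short Python sketch. The field order and units (seven 32-bit words after the 16-byte header, with RTT values in microseconds) are assumed from the full ACK layout in the SRT Internet-Draft and may differ for shorter ACK variants; the names `AckInfo` and `parse_ack` are hypothetical.

```python
import struct
from typing import NamedTuple


class AckInfo(NamedTuple):
    ack_number: int              # Additional Information: ACK sequence number
    last_ack_seq: int            # last received data packet sequence number + 1
    rtt_ms: float                # RTT estimate
    rtt_variance_ms: float       # RTT jitter value
    available_buffer_pkts: int   # receiver buffer field, in packets
    packets_receiving_rate: int  # packets per second
    estimated_link_capacity: int # packets per second
    receiving_rate_bytes: int    # bytes per second


def parse_ack(payload: bytes) -> AckInfo:
    """Parse a full ACK control packet (assumed layout: header + 7 words)."""
    if len(payload) < 16 + 28:
        raise ValueError("payload too short for a full ACK packet")
    word0, ack_number, _timestamp, _socket_id = struct.unpack("!IIII", payload[:16])
    # Flag bit set and Control Type equal to 2 give 0x8002 in the top 16 bits.
    if word0 >> 16 != 0x8002:
        raise ValueError("not an ACK control packet")
    seq, rtt_us, var_us, buf, pkt_rate, capacity, byte_rate = struct.unpack(
        "!7I", payload[16:44]
    )
    return AckInfo(ack_number, seq, rtt_us / 1000, var_us / 1000,
                   buf, pkt_rate, capacity, byte_rate)
```

With this layout, raw values of 20610 and 9786 in the RTT fields correspond to an RTT of 20.61 ms and an RTT jitter of 9.786 ms.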
2.4
NAK Packet Structure
When the SRT receiver detects a gap in the packet sequence numbers, it infers a packet loss and immediately sends a Negative Acknowledgement (NAK) packet to the sender. In addition, the receiver periodically sends a NAK report listing the sequence numbers of all packets lost during the interval. This redundancy ensures that a NAK packet lost on the return path does not leave a loss unreported. Figure 5 illustrates the NAK packet structure, where the Control Type equals 3 and the packet contains a list of lost packet sequence numbers.

Figure 5 NAK Control Packet
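A minimal Python sketch of how the loss list in a NAK packet might be decoded is shown below. It assumes the encoding described in the SRT Internet-Draft: each entry is a 32-bit word, a word with its highest bit clear reports a single lost packet, and a word with its highest bit set starts a range whose last sequence number is the following word. The function name is hypothetical.

```python
import struct
from typing import List, Tuple


def decode_loss_list(cif: bytes) -> List[Tuple[int, int]]:
    """Decode the NAK loss list (bytes after the 16-byte header).

    Returns (first, last) pairs of lost sequence numbers; a single lost
    packet is returned as (n, n).
    """
    if len(cif) % 4 != 0:
        raise ValueError("loss list must be a whole number of 32-bit words")
    words = struct.unpack(f"!{len(cif) // 4}I", cif)
    ranges = []
    i = 0
    while i < len(words):
        word = words[i]
        if word >> 31:
            # Highest bit set: start of a range; the next word is its end.
            if i + 1 >= len(words):
                raise ValueError("range start without a range end")
            ranges.append((word & 0x7FFFFFFF, words[i + 1]))
            i += 2
        else:
            ranges.append((word, word))
            i += 1
    return ranges
```

For example, the three words `0x80000005`, `8`, `12` decode to the lost range 5 to 8 plus the single lost packet 12.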
2.5
ACKACK Packet Structure
The main role of the ACKACK packet is to allow the link's Round Trip Time (RTT) to be calculated; the RTT is one of the key link statistics carried in the ACK packet. Figure 6 shows the ACKACK packet structure. Both ACK and ACKACK packets carry precise timestamps and an ACK sequence number. When the data receiver sends an ACK packet, the data sender promptly returns an ACKACK packet with the same ACK sequence number. The data receiver can then match each ACKACK with the ACK it answers and calculate the RTT as the time elapsed between sending the ACK and receiving the ACKACK.

Figure 6 ACKACK Packet Structure
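The ACK/ACKACK matching described above can be sketched in Python. The class below is a hypothetical illustration, not libsrt code: it records when each ACK was sent, and when the ACKACK with the same ACK sequence number arrives it takes the elapsed time as an RTT sample. The smoothing weights (7/8 and 1/8 for the RTT, 3/4 and 1/4 for the variance) and the initial values (100 ms and 50 ms) are assumptions based on the SRT Internet-Draft.

```python
from typing import Dict, Optional


class RttEstimator:
    """Estimate RTT on the ACK-sending side by pairing ACKs with ACKACKs."""

    def __init__(self, initial_rtt_us: float = 100_000.0,
                 initial_var_us: float = 50_000.0) -> None:
        self.rtt_us = initial_rtt_us
        self.rtt_var_us = initial_var_us
        self._ack_sent_at: Dict[int, int] = {}  # ACK number -> send time (us)

    def on_ack_sent(self, ack_number: int, now_us: int) -> None:
        self._ack_sent_at[ack_number] = now_us

    def on_ackack_received(self, ack_number: int, now_us: int) -> Optional[int]:
        """Return the RTT sample in microseconds, or None if unmatched."""
        sent_at = self._ack_sent_at.pop(ack_number, None)
        if sent_at is None:
            return None  # no matching ACK on record (late or duplicate ACKACK)
        sample = now_us - sent_at
        # Update the variance against the previous smoothed RTT, then the RTT.
        self.rtt_var_us = (3 * self.rtt_var_us + abs(self.rtt_us - sample)) / 4
        self.rtt_us = (7 * self.rtt_us + sample) / 8
        return sample
```

Because the pairing uses the ACK sequence number rather than the data packet sequence number, the RTT measurement is unaffected by how much data is in flight.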
2.6
Keepalive and Shutdown Packet Structure
The last two packet types in SRT are the Keepalive and Shutdown packets. Their structures are shown in Figures 7 and 8.

Figure 7 Keepalive Packet Structure

Figure 8 Shutdown Packet Structure
3
Wireshark Packet Capture Analysis
Wireshark is a widely used open-source packet analyzer that can capture many kinds of network packets and display their contents in detail. As the broadcast industry moves to IP, Wireshark is becoming as important as waveform monitors are for SDI signals and stream analyzers are for TS streams.
The following are two examples of using Wireshark for link analysis:
3.1
Scenario 1: Connection Failure
When setting up SRT links, connections can fail for various reasons. In such cases, a Wireshark packet capture can be used to determine the error type.
Figure 9 shows the packets captured after a connection failure. The repeated exchange of handshake packets indicates that the handshake never completes, but it also confirms that the IP address and port settings are correct, since the two parties are able to communicate.
Because both parties run SRT version 1.3 or later, the handshake takes two round trips and therefore four handshake packets. The initial handshake packet always uses the HSv4 format, which makes it easy to identify. The "Handshake Type" of the fourth handshake packet is 1002-Reject, meaning "Peer Rejection", which suggests that a parameter mismatch caused the handshake failure.
Next, we examine the second handshake packet, the listener's response to the caller. Its "Encryption Field" specifies AES-128, indicating that the listener requires AES-128 encryption from the caller. The third handshake packet, sent by the caller to the listener, shows KMREQ as not set in the "Extended Field", indicating that the caller did not respond with encryption.
From this analysis, we conclude that the connection failed because the Listener required AES-128 encryption and the Caller did not provide it. To connect successfully, AES-128 encryption must be enabled on the Caller and set to the Listener's password.

Figure 9 Scenario 1: Diagnosing Faults via Packet Capture Analysis
3.2
Scenario 2: Obtaining Link Information
The Round Trip Time (RTT) of an Internet link is the time data takes to travel from sender to receiver and back, and it affects how the latency of an SRT link should be set. Firewall restrictions often make the RTT hard to measure directly, but it can be read from ACK packets captured with Wireshark.
Figure 10 shows an RTT of 20.61 milliseconds with an RTT fluctuation of 9.786 milliseconds, indicating that the RTT is unstable. Variations in RTT change the time needed for packet retransmission and therefore affect SRT error control, so link parameters need to be adjusted to fit the link's characteristics.

Figure 10 Scenario 2: RTT Estimate and Variability
Conclusion
The SRT protocol, known for its excellent performance, modest hardware and software requirements, and open-source nature, is widely used across many fields and has recently advanced in upstream streaming. Understanding the SRT packet structure makes it possible to use packet capture software effectively for fault analysis, so that practical problems can be diagnosed quickly and accurately. We hope this article proves helpful and welcome discussion and exchange.