Understanding RTP: The Key Protocol for Media Data Transmission

Previously, we spent considerable time on the details of the RTSP protocol, but RTSP streaming actually involves three protocols: RTSP, RTP, and RTCP. RTSP is primarily responsible for session control, such as setting up and tearing down the connection. The actual media data, however, is carried by RTP. In this section, we introduce the RTP protocol.

RTP is an application-layer protocol that can run over either TCP or UDP at the transport layer (UDP is more common).

An RTP data packet consists of two parts: the RTP Header and the RTP Body, also called the RTP Payload. The RTP Header occupies a minimum of 12 bytes and a maximum of 72 bytes (12 fixed bytes plus up to 15 CSRC identifiers of 4 bytes each). The RTP Payload carries the actual media data, such as H.264-encoded video. Let's now take a closer look at the organization of the RTP Header and RTP Body!
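The header size limits quoted above can be checked with a little arithmetic. The following is a minimal sketch, assuming the header carries no extension; the constant and function names are illustrative only and do not come from any library.

```python
RTP_FIXED_HEADER_LEN = 12  # bytes: V/P/X/CC, M/PT, sequence number, timestamp, SSRC
CSRC_ENTRY_LEN = 4         # each CSRC identifier is 32 bits
MAX_CSRC_COUNT = 15        # CC is a 4-bit field, so it can hold 0..15


def rtp_header_length(csrc_count: int) -> int:
    """Return the RTP header size in bytes for a given CSRC count (no extension)."""
    if not 0 <= csrc_count <= MAX_CSRC_COUNT:
        raise ValueError("CSRC count must fit in 4 bits (0-15)")
    return RTP_FIXED_HEADER_LEN + CSRC_ENTRY_LEN * csrc_count


smallest = rtp_header_length(0)                # header with no CSRC entries
largest = rtp_header_length(MAX_CSRC_COUNT)    # header with all 15 CSRC entries
```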

Illustration of RTP packet format

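Description:

V (version): 2 bits. The RTP version; the current version is 2.
P (padding): 1 bit. When set, the packet ends with padding bytes that are not part of the payload.
X (extension): 1 bit. When set, the fixed header is followed by a header extension.
CC (CSRC count): 4 bits. The number of CSRC identifiers that follow the fixed header.
M (marker): 1 bit. Its meaning depends on the payload profile; for video it typically marks the last packet of a frame.
PT (payload type): 7 bits. Identifies the format of the payload.
Sequence number: 16 bits. Increases by one for each RTP packet sent.
Timestamp: 32 bits. The sampling instant of the first byte of the payload.
SSRC: 32 bits. Identifies the synchronization source.
CSRC list: 0 to 15 entries of 32 bits each. Identifies the contributing sources.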

Definition of PT in GB28181

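| Payload Type | Encoding Name | Clock Rate | Channel Count | Media Type in the SDP m Field |
|---|---|---|---|---|
| 4 | G.723 | 8 kHz | 1 | audio |
| 8 | PCMA (G.711 A) | 8 kHz | 1 | audio |
| 9 | G.722 | 8 kHz | 1 | audio |
| 18 | G.729 | 8 kHz | 1 | audio |
| 20 | SVACA (SVAC Audio) | 8 kHz | 1 | audio |
| 96 | PS | 90 kHz | | video |
| 97 | MPEG-4 | | | video |
| 98 | H.264 | | | |
| 99 | SVAC (SVAC Video) | | | |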

Depending on the PT type, the Payload is organized differently.
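As a rough illustration of how a receiver might choose a payload handler from the PT value, the table above can be held in a lookup dictionary. This is only a sketch: the dictionary and function names are hypothetical, and cells that are blank in the table are kept as `None` rather than guessed.

```python
# PT -> (encoding name, clock rate in Hz, media type), taken from the table above.
# Blank table cells are represented as None.
GB28181_PAYLOAD_TYPES = {
    4: ("G.723", 8000, "audio"),
    8: ("PCMA (G.711 A)", 8000, "audio"),
    9: ("G.722", 8000, "audio"),
    18: ("G.729", 8000, "audio"),
    20: ("SVACA (SVAC Audio)", 8000, "audio"),
    96: ("PS", 90000, "video"),
    97: ("MPEG-4", None, "video"),
    98: ("H.264", None, None),
    99: ("SVAC (SVAC Video)", None, None),
}


def encoding_name(payload_type: int) -> str:
    """Return the encoding name for a PT, or 'unknown' if it is not in the table."""
    entry = GB28181_PAYLOAD_TYPES.get(payload_type)
    return entry[0] if entry is not None else "unknown"
```

A receiver could then branch on `encoding_name(pt)` to pick the matching depacketizer.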

Let's take a look at an actual RTP packet capture:

The red-framed part is the RTP Header; the green-framed part is the RTP Payload. Let's examine it in detail:

The hexadecimal representation of the RTP Header in this data packet is:

Its binary representation is as follows:

The version field (V) is binary 10, so the version number is 2. Let's compare it with the Wireshark analysis of the capture:

The padding bit (P) is 0, indicating no padding. Here is the Wireshark packet capture:

The extension bit (X) is 0, indicating that no header extension follows the fixed RTP header. The Wireshark packet capture is as follows:

The CSRC count (CC) is 0, indicating that there are no CSRCs in the RTP header. Wireshark analysis:

The marker bit (M) is 0, indicating that this packet is not the last packet of a frame. Wireshark analysis:

PS: When this value is 1, it indicates that this packet is the last packet of a data frame!
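The marker rule suggests a simple way to regroup packets into frames: buffer payloads until a packet with M = 1 arrives. The sketch below assumes packets arrive in order with none lost, and that the payload format allows plain concatenation, which is not true of every payload format. The function name and the `(marker, payload)` tuple shape are illustrative choices.

```python
from typing import Iterable, Iterator, Tuple


def frames_from_packets(packets: Iterable[Tuple[int, bytes]]) -> Iterator[bytes]:
    """Yield one bytes object per frame from in-order (marker, payload) pairs."""
    buffered = bytearray()
    for marker, payload in packets:
        buffered += payload
        if marker == 1:
            # M = 1: this was the last packet of the current frame.
            yield bytes(buffered)
            buffered.clear()
    # Payloads after the final marker belong to an incomplete frame and are dropped.


frames = list(frames_from_packets([(0, b"ab"), (1, b"cd"), (1, b"ef")]))
```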

The PT value is 96, which falls in the dynamic payload type range (96 to 127), so the payload format is not fixed by the RTP specification itself. The capture is of video data pulled from a camera, so it follows the GB28181 standard, in which PT 96 indicates a PS packet. Wireshark analysis is as follows:

The sequence number is 0x12ed, which is 4845 in decimal.

Wireshark analysis is as follows:

The timestamp is 0x4bcffa46. Wireshark analysis shows:

The synchronization source identifier (SSRC) of this packet is 0x6b2fdd87. Wireshark analysis shows:

Since the CC value in the RTP header is 0, this packet contains no CSRC entries. An RTP header can carry 0 to 15 CSRCs.
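The field-by-field walkthrough can be condensed into a small parser. This is a minimal sketch, not a complete RTP implementation: it ignores header extensions and padding, and the names `RtpHeader` and `parse_rtp_header` are illustrative. The 12 sample bytes are reassembled from the field values quoted in the walkthrough (V = 2, P = 0, X = 0, CC = 0, M = 0, PT = 96, sequence number 0x12ed, timestamp 0x4bcffa46, SSRC 0x6b2fdd87), not copied from the capture itself.

```python
import struct
from typing import NamedTuple, Tuple


class RtpHeader(NamedTuple):
    version: int
    padding: int
    extension: int
    csrc_count: int
    marker: int
    payload_type: int
    sequence_number: int
    timestamp: int
    ssrc: int
    csrcs: Tuple[int, ...]


def parse_rtp_header(packet: bytes) -> RtpHeader:
    """Parse the fixed RTP header and CSRC list (header extension not handled)."""
    if len(packet) < 12:
        raise ValueError("an RTP header is at least 12 bytes")
    # Network byte order: two single bytes, 16-bit sequence, 32-bit timestamp, 32-bit SSRC.
    first, second, seq, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    csrc_count = first & 0x0F
    end = 12 + 4 * csrc_count
    if len(packet) < end:
        raise ValueError("packet too short for the CSRC list")
    csrcs = struct.unpack(f"!{csrc_count}I", packet[12:end])
    return RtpHeader(
        version=first >> 6,
        padding=(first >> 5) & 1,
        extension=(first >> 4) & 1,
        csrc_count=csrc_count,
        marker=second >> 7,
        payload_type=second & 0x7F,
        sequence_number=seq,
        timestamp=timestamp,
        ssrc=ssrc,
        csrcs=csrcs,
    )


# Header bytes rebuilt from the values in the text (an assumption, see the lead-in).
header = parse_rtp_header(bytes.fromhex("806012ed4bcffa466b2fdd87"))
```

Printing `header` shows each field by name, which makes it easy to compare against the Wireshark dissection.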

The blue-shaded part is the RTP Payload. Its first byte is 0x67, which suggests that the data is an H.264 SPS (sequence parameter set), demonstrating that the RTP Payload carries the transmitted media data. The details of the SPS will not be elaborated here.
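To see why 0x67 points to an SPS, the byte can be split into the three fields of an H.264 NAL unit header. The sketch below assumes the payload begins directly with a single NAL unit header byte; the function name is illustrative.

```python
def parse_nal_header(first_byte: int):
    """Split an H.264 NAL unit header byte into its three fields."""
    forbidden_zero_bit = first_byte >> 7      # 1 bit, must be 0 in a valid stream
    nal_ref_idc = (first_byte >> 5) & 0x03    # 2 bits, importance as a reference
    nal_unit_type = first_byte & 0x1F         # 5 bits, 7 means SPS
    return forbidden_zero_bit, nal_ref_idc, nal_unit_type


fields = parse_nal_header(0x67)
```

For 0x67 (binary 0110 0111), the low five bits are 00111, which is NAL unit type 7, the sequence parameter set.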

Alright, after dissecting and illustrating the RTP packet format, we now have a thorough understanding of it. This concludes our discussion for this section! Next time, we will continue our exploration!