The author is manager of technology for Wheatstone.
If you’ve been working with networks for a while, you’re familiar with Transmission Control Protocol and its fast-talking cousin User Datagram Protocol. You might even have wondered, as we have, whether it’s possible to combine TCP and UDP into one.
TCP is more reliable and UDP is faster, which is why we use TCP as the IP transport layer for web browsing, email and other enterprise applications, and UDP as the IP transport layer for AoIP, streaming and other real-time media applications.
Their main difference is that TCP acknowledges and retransmits data, which takes a bigger buffer (read: more delay), to guarantee that packets are accounted for and delivered reliably, whereas UDP is a simple message-oriented protocol that can transport audio and control data at very low latency.
We can reduce WheatNet IP audio network packet timing to 1/4 msec for minimum latency in part because of UDP multicasting, which means that local audio transport and studio controls are almost instantaneous.
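To put that packet timing in perspective, here is a rough back-of-the-envelope sketch in Python. The 48 kHz sample rate, 24-bit samples and stereo channel count are assumptions chosen for illustration, not WheatNet-IP specifications, and the function name is hypothetical.

```python
def packet_budget(packet_time_us: int, sample_rate_hz: int = 48_000,
                  channels: int = 2, bytes_per_sample: int = 3) -> dict:
    """Estimate how much audio one packet carries.

    The defaults (48 kHz, stereo, 24-bit) are illustrative assumptions.
    """
    if packet_time_us <= 0:
        raise ValueError("packet_time_us must be positive")
    # Samples per channel that fit into one packet interval
    samples = sample_rate_hz * packet_time_us // 1_000_000
    return {
        "samples_per_channel": samples,
        "payload_bytes": samples * channels * bytes_per_sample,
        "packets_per_second": 1_000_000 // packet_time_us,
    }


# 1/4 msec is 250 microseconds
print(packet_budget(250))
```

Under those assumptions, each packet carries only 12 samples per channel, so very little audio sits waiting before it goes out on the wire.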
But what if we go beyond the studio network and want to live-stream audio and control data in real time across the public network, where links are less reliable, and distance adds more delay? Is it possible to have both the reliability of TCP and the speed of UDP?
Enter Reliable Internet Stream Transport, or RIST, an open transport protocol specification developed for reliable transmission of video and audio in real time.
RIST adds TCP-like error correction and packet recovery to UDP transport, but without the huge buffering time. It uses RTP sequence numbers to identify lost packets and multi-link bonding to ensure media delivery over these public links with very little delay, and without having to compress the audio and so reduce its quality.
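The idea of spotting losses from RTP sequence numbers can be sketched in a few lines of Python. This is a minimal illustration, not the RIST implementation: the function name is hypothetical, and it assumes packets arrive in order. The only protocol fact it relies on is that RTP sequence numbers are 16 bits wide and wrap around.

```python
from typing import Iterable, List, Optional

SEQ_MOD = 1 << 16  # RTP sequence numbers are 16 bits and wrap to 0


def find_missing(received: Iterable[int]) -> List[int]:
    """Return the sequence numbers skipped between consecutive arrivals.

    Packets that arrive late or twice are ignored in this sketch.
    """
    missing: List[int] = []
    expected: Optional[int] = None
    for seq in received:
        if expected is None:
            expected = seq  # first packet sets the starting point
        gap = (seq - expected) % SEQ_MOD
        if gap >= SEQ_MOD // 2:
            continue  # seq is behind what we expect: late or duplicate
        # Every number from expected up to seq - 1 never showed up
        missing.extend((expected + i) % SEQ_MOD for i in range(gap))
        expected = (seq + 1) % SEQ_MOD
    return missing


# Packet 65535 and packet 2 are lost; the counter wraps in between
print(find_missing([65533, 65534, 0, 1, 3]))  # [65535, 2]
```

A receiver that knows exactly which numbers are missing can ask for just those packets again, rather than stalling the whole stream the way TCP does.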
Established by our industry
RIST is based on established protocols widely adopted by the broadcast industry. It uses the interoperability profiles found in VSF TR-06-1 and VSF TR-06-2, the technical recommendations of the Video Services Forum (VSF), which give us important features like link bonding, forward error correction, seamless switching and many of the specifications found in SMPTE 2022-1.
All of this makes it highly interoperable with broadcast-grade equipment, yet gives manufacturers the freedom to innovate within their own product implementations.
Fig. 1 shows RIST use for transporting real-time audio between two broadcast facilities across the public internet.
We added it to our protocol list for Wheatstone AoIP and streaming products for these and other reasons. RIST is a relatively new arrival in the media transport world, and it’s especially timely given the growing number of high-speed links now making it possible to stream at full audio bandwidth as part of a low-cost contribution network or transport between regional centers.
More and more high-speed links are popping up on the public network, and RIST now gives us a way to carry full-bandwidth audio across them. We are using RIST, for example, to get streams back and forth between studios and AWS data centers.
How it works
RIST adjusts in real time to achieve the lowest latency and fastest performance for a given link, whether that link sits close to an AWS data center or several hops away. In our test runs, RIST doesn’t seem to degrade performance the way TCP does.
TCP is designed to be error correcting; the sender waits for confirmation that packets were received correctly and, if it doesn’t get it, sends them again. If too many retries pile up, TCP will throttle down its transmission speed. It doesn’t help that TCP uses large data packets, so by the time it figures out that a packet needs to be resent, a half-second may already have passed.
All of this makes TCP packet timing unpredictable and requires deeper buffering, making it unsuitable for real-time transport. When you click a link in your web browser, it can take a second or two for the web page to appear. That works for most applications, but when you’re mixing or streaming live, even 100 milliseconds is too much delay.
Fig. 2 shows the impact of packet loss on decoded audio.
By comparison, RIST uses smaller packets and can therefore hold to a much better tradeoff between latency and speed.
RIST also uses RTP under the hood, which means it gives us RTP time stamps critical to real-time audio or video transport. We’ve been able to add RTP to UDP streams, but sequencing of the RTP time stamps using UDP is not reliable over long distances. Depending on distance and the quality of the link, there’s no guarantee that packets will arrive in the proper order; with RIST, RTP time stamps get sequenced by the protocol in the correct order, making it ideal for real-time transport of audio and video.
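Putting packets back in order by sequence number can be illustrated with a small reorder buffer. This is a sketch under stated assumptions, not the RIST algorithm: the class and method names are hypothetical, and unlike a real receiver it never gives up on a missing packet once the buffer deadline passes.

```python
from typing import Dict, List, Tuple

SEQ_MOD = 1 << 16  # RTP sequence numbers are 16 bits and wrap to 0


class ReorderBuffer:
    """Hold out-of-order packets until the gap before them is filled."""

    def __init__(self, first_seq: int) -> None:
        self.next_seq = first_seq            # next number owed to the decoder
        self.pending: Dict[int, bytes] = {}  # packets that arrived early

    def push(self, seq: int, payload: bytes) -> List[Tuple[int, bytes]]:
        """Store one packet; return every packet now deliverable in order."""
        behind = (seq - self.next_seq) % SEQ_MOD >= SEQ_MOD // 2
        if not behind:
            # Keep the first copy only; already-delivered numbers are dropped
            self.pending.setdefault(seq, payload)
        ready: List[Tuple[int, bytes]] = []
        while self.next_seq in self.pending:
            ready.append((self.next_seq, self.pending.pop(self.next_seq)))
            self.next_seq = (self.next_seq + 1) % SEQ_MOD
        return ready


buf = ReorderBuffer(first_seq=10)
print(buf.push(11, b"second"))  # nothing yet: packet 10 is still missing
print(buf.push(10, b"first"))   # both packets released, in order
```

The decoder only ever sees packets in sequence, whatever order the network delivered them in.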
RIST supports IP multicast natively, so it can provide one-to-many transmission with very low latency and lower network overhead. RIST also supports load sharing and seamless switching, so that should a link go down, traffic routes around it over the alternate link without interruption.
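Seamless switching is easiest to picture as the same stream sent over two links, with the receiver keeping whichever copy of each packet arrives first. The Python sketch below illustrates that idea only; the function name, the packet tuple and the window size are assumptions, not part of any RIST API.

```python
from typing import Dict, Iterable, Iterator, Tuple

Packet = Tuple[int, bytes]  # (sequence number, payload), a hypothetical shape


def merge_links(arrivals: Iterable[Packet], window: int = 1024) -> Iterator[Packet]:
    """Yield the first copy of each sequence number and drop later duplicates.

    `arrivals` is the combined feed from both links in arrival order.
    Only the most recent `window` sequence numbers are remembered.
    """
    seen: Dict[int, None] = {}  # dicts keep insertion order, oldest first
    for seq, payload in arrivals:
        if seq in seen:
            continue  # the other link already delivered this packet
        seen[seq] = None
        if len(seen) > window:
            del seen[next(iter(seen))]  # forget the oldest number
        yield seq, payload


# Link A loses packet 3, but link B still delivers it
feed = [(1, b"a"), (1, b"a"), (2, b"b"), (2, b"b"), (3, b"c"), (4, b"d"), (4, b"d")]
print([seq for seq, _ in merge_links(feed)])  # [1, 2, 3, 4]
```

Because the receiver never depends on one link alone, a failure on either path leaves no gap to recover from.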
With RIST, you can dial in a 100-millisecond buffer for a live event like a concert and know that the delay will stay at about 100 milliseconds, with no problem getting the audio delivered. It’s as easy as opening a RIST stream session between two points (in our case from a WheatNet streaming appliance or software) and establishing dedicated communication between the local IP address and an IP address on the far end.
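A rough way to reason about how large that delay setting needs to be is to count how many resend attempts fit inside it. The Python sketch below is a simplification under stated assumptions: each resend costs one full round trip, losses are independent, and the function names and example figures (30 ms round trip, 1 percent loss) are hypothetical.

```python
def retransmit_attempts(buffer_ms: float, rtt_ms: float) -> int:
    """Number of resend round trips that fit inside the receive buffer."""
    if rtt_ms <= 0:
        raise ValueError("rtt_ms must be positive")
    return int(buffer_ms // rtt_ms)


def residual_loss(loss_rate: float, attempts: int) -> float:
    """Chance a packet is still missing after the original send plus
    `attempts` resends, assuming each try is lost independently."""
    return loss_rate ** (attempts + 1)


attempts = retransmit_attempts(buffer_ms=100, rtt_ms=30)
print(attempts)                       # 3
print(residual_loss(0.01, attempts))  # roughly 1e-08
```

The tradeoff is visible in the arithmetic: a longer round trip leaves room for fewer resends within the same buffer, so distant links need a larger delay setting for the same protection.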
Bank-level security
One little-known, but increasingly important, benefit of RIST is its encryption and validation technology.
Going from a closed AoIP network in your building to a third-party cloud provider like AWS changes everything about security measures. Your media becomes a much more likely target sitting on a public server in the open Internet.
Similar to Secure Reliable Transport or SRT, another protocol you might have heard about, RIST uses the pre-shared key (PSK) method provided in the VSF TR-06-2 specification for access control, long a problem for standard IP multicast delivery. The method requires that all receivers be configured with a secret passphrase, which can change on the fly as necessary. But unlike SRT, RIST augments PSK with the Secure Remote Password (SRP) protocol and also offers a DTLS (Datagram Transport Layer Security) mode with certificate-based authentication. This is fundamentally the same security technology used by banks; it’s almost impossible for a third party to listen in or pirate your broadcast streams.
Fig. 3 shows the RIST certificate authentication process using DTLS specifications. When a device connects, it presents a certificate that, unless trusted, will be rejected.
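The pre-shared key idea can be illustrated with Python’s standard library: every endpoint configured with the same passphrase derives the same encryption key, and anyone without it derives something else. This is a generic sketch, not the key schedule defined in TR-06-2; the function name, salt handling and iteration count are assumptions for illustration.

```python
import hashlib
import os


def derive_key(passphrase: str, salt: bytes,
               iterations: int = 100_000, key_len: int = 32) -> bytes:
    """Stretch a shared passphrase into a fixed-length key with PBKDF2.

    The iteration count and key length are illustrative, not taken
    from the RIST specification.
    """
    return hashlib.pbkdf2_hmac(
        "sha256", passphrase.encode("utf-8"), salt, iterations, dklen=key_len
    )


salt = os.urandom(16)  # sent in the clear alongside the stream in this sketch
sender_key = derive_key("station-secret", salt)
receiver_key = derive_key("station-secret", salt)
intruder_key = derive_key("wrong-guess", salt)

print(sender_key == receiver_key)  # True
print(sender_key == intruder_key)  # False
```

Changing the passphrase on every endpoint changes the derived key, which is what lets operators rotate access without touching the rest of the stream setup.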
For these and other reasons, RIST is being adopted by a variety of media companies, including Wheatstone. RIST is now included in our streaming appliances and software as well as our WheatNet-IP Blade 4s, and it is used by our industry partners such as StreamGuys.