The Evolution of Audio Streaming

In my almost 40-year career in broadcast technology sales, we’ve gone from NAB carts and pressure rollers to high-definition audio streaming. It has been an amazing journey through technology. Over the last 10 years I’ve had the privilege to work with one of the true gurus of audio streaming, Greg Ogonowski, first with Orban and now as part of his own company, Modulation Index, including the StreamS Hi Fi Audio brand.

EARLY INTERNET AUDIO
My first experience with Internet audio was in 1998, while I was at TM Century, when someone gave me a folder full of photocopied material on a new company called Broadcast.com, headed up by some guy by the name of Mark Cuban. I remember returning it to the person who had asked me to review it with a comment that this wasn’t going to make it. What kind of yahoo would invest in something like this, right?

Available bandwidth was very limited; however, Broadcast.com was doing sports talk, which meant that low-bitrate MP3s could be tolerated. The rest is history.

By 2000, TM had sold its technology department to a German company called ON AIR, and they had developed an MP3 encoder that we kept running for a long time. It was a mono stream of 64 kilobits per second, which for the day sounded good. This new technology showed promise, but then the tech boom crashed. Along with the crash, royalties on Internet streams became an issue, and the technology went into hibernation for a couple of years.

Eventually, once the royalty issues were worked out, several entrepreneurs established music channels using high-bitrate MP3, 128 kbps and above. Not all browsers would accept MP3 directly, so either Windows Media Player or another player (typically WinAmp) had to be downloaded, which in corporate America was a big no-no. Port-blocking issues would continue to halt the progress of streaming audio and limit it to home computers.

At the same time there were some new codecs being developed.

HIGH QUALITY, LOW BITRATE
AAC, and especially the ultra-low-bitrate HE-AAC, showed the ability of the Internet to deliver high-quality audio at low bitrates, including bitrates that would allow streaming to some of the new generation of 3G phones. These phones required a different streaming protocol, RTSP, rather than the well-established ICY HTTP (Shoutcast/Icecast) protocols.

While this showed great promise, it was never pushed by the phone carriers, which had no problem promoting downloads but had no desire to promote a technology that required real time streaming reliability.

Nonetheless, streaming gained in popularity as almost all browsers now played MP3s. Adobe’s Flash player, delivered on almost all PCs, became a way around the port issues. However, both methods required fairly high bitrates, and despite the ability to play streams in most browsers, audio streaming was being discouraged by companies whose 10BASE-T connections were being overloaded.

In the meantime, the new streaming codecs AAC and HE-AAC were gaining popularity because Apple pushed the AAC format. (Neither A in AAC stands for Apple; the name is Advanced Audio Coding.) However, users still had to download a player like WinAmp to play AAC and HE-AAC streams. That would change in the late 2000s, when Adobe added both AAC and HE-AAC to its Flash Player. Windows Media Player would include native support, which would become the standard for PC audio playback.

In the office environment, bandwidth would continue to increase, making concerns about data throughput a thing of the past. Audio streaming was on its way to becoming the most popular way to listen to audio programming in the office, replacing radio, which was plagued by reception problems and by listener dissatisfaction with 20-minute stop sets.

High-quality audio streaming was now available in the home and office. However, with the exception of the few who actually knew they could stream audio to their 3G phones, streaming audio was still not mobile.

This would all change in 2007 when Apple introduced the iPhone. Ironically, the iPhone could not stream audio by itself. However, apps that could stream audio soon appeared, and all of a sudden streaming was everywhere. It was the iPhone’s easy-to-use UI that changed everything.

SEGMENTED AND ADAPTIVE STREAMING
There were still issues, however. With the number of listeners increasing and the advent of streaming video, carrier networks were getting overloaded and stream reliability became a big issue. To address this came two new streaming formats: HTTP Live Streaming (HLS), developed by Apple, and MPEG-DASH, an evolving MPEG standards-based protocol that has made some impact in the video field. Both HLS and DASH use segmented audio files and support adaptive bit rate streaming. However, the lack of metadata standards within DASH has hindered the development of players capable of playing those streams, and DASH has yet to make an impact in the audio-only field.

Segmenting refers to the fact that instead of an endless real-time data stream, the audio is sent as a series of short files, or segments, which are fed into a buffer on the player end. This allows network interruptions, in some cases as long as a couple of minutes, to take place without any loss of audio.
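
To put rough numbers on that buffer, here is a back-of-the-envelope sketch in Python. The function names and the 10-second segment length are illustrative assumptions, not values taken from any particular encoder or player, and container overhead is ignored.

```python
def buffer_bytes(bitrate_kbps: int, seconds: float) -> int:
    """Approximate bytes of encoded audio needed to cover `seconds` of playback."""
    # kbps here means 1,000 bits per second; divide by 8 to get bytes.
    return int(bitrate_kbps * 1000 * seconds / 8)


def segments_needed(outage_seconds: float, segment_seconds: float = 10.0) -> int:
    """Whole segments the player must already hold to ride out an outage."""
    # Round up: a partially covered stretch still needs a full segment.
    full, remainder = divmod(outage_seconds, segment_seconds)
    return int(full) + (1 if remainder > 0 else 0)


if __name__ == "__main__":
    # Two minutes of 64 kbps audio, split into assumed 10-second segments.
    print(buffer_bytes(64, 120))   # 960000
    print(segments_needed(120))    # 12
```

In other words, riding out a two-minute interruption at 64 kbps takes a little under 1 MB of buffered audio, which is a small amount of memory for a phone or PC.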

The adaptive part is a means of eliminating interruptions caused by the bandwidth reductions that can take place in mobile applications. As an example, the content provider sets up three HE-AAC streams: 64 kbps (very high quality), 48 kbps (excellent quality) and 32 kbps (good quality). When the player can no longer maintain its buffer at 64 kbps, it switches to the 48 kbps stream, and if the buffer continues to drain, it goes to 32 kbps. A properly designed encoder keeps the audio in sync across the switch from the higher bitrate to the lower one, and it does so without transcoding the audio, which would reduce audio quality.
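
The switching decision in that example can be sketched in a few lines of Python. This is a simplified illustration, not how any specific player works: the function name and the two buffer thresholds are hypothetical, and the step back up to a higher bitrate once the buffer recovers is an added assumption, since the example above describes only stepping down.

```python
# Rendition ladder from the example above, highest bitrate first (kbps).
LADDER = [64, 48, 32]


def next_bitrate(current_kbps: int, buffer_seconds: float,
                 low_water: float = 10.0, high_water: float = 30.0) -> int:
    """Pick the bitrate for the next segment from the buffer level.

    low_water and high_water are hypothetical thresholds in seconds.
    """
    i = LADDER.index(current_kbps)
    if buffer_seconds < low_water and i < len(LADDER) - 1:
        return LADDER[i + 1]   # buffer draining: step down one rung
    if buffer_seconds > high_water and i > 0:
        return LADDER[i - 1]   # buffer healthy again: step back up
    return current_kbps        # otherwise stay put


if __name__ == "__main__":
    rate = 64
    for level in [40, 8, 6, 20, 35]:
        rate = next_bitrate(rate, level)
        print(level, rate)
```

Because every rendition is cut into segments at the same points in time, the player can change rungs at a segment boundary and the listener hears continuous audio.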

Currently, most HLS streams are generated at the server end, which requires dedicated and expensive media servers from Adobe or Wowza. However, a couple of companies are providing, or are in the process of providing, encoders that do the segmenting at the encoder level. Since HLS consists of segmented files, these streams can be delivered using virtually any server or cloud-based solution, which should dramatically reduce content distribution costs while increasing stream reliability.
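
Part of the reason any server will do is that an HLS stream is just files: audio segments plus small text playlists. The Python sketch below writes the top-level (master) playlist that points a player at the three renditions. The function name and the file paths are hypothetical, and using the nominal bitrate for BANDWIDTH is a simplification, since the real value should reflect the peak rate including container overhead.

```python
def master_playlist(variants):
    """Build an HLS master playlist from (bitrate_kbps, uri) pairs."""
    lines = ["#EXTM3U"]
    for kbps, uri in variants:
        # BANDWIDTH is in bits per second; mp4a.40.5 is the codec string for HE-AAC.
        lines.append(f'#EXT-X-STREAM-INF:BANDWIDTH={kbps * 1000},CODECS="mp4a.40.5"')
        lines.append(uri)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical paths; each one would hold that rendition's own playlist.
    print(master_playlist([
        (64, "he-aac-64k/playlist.m3u8"),
        (48, "he-aac-48k/playlist.m3u8"),
        (32, "he-aac-32k/playlist.m3u8"),
    ]), end="")
```

The encoder uploads this file along with the per-rendition playlists and segments, and the server only has to hand them out over plain HTTP.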

To date, few HLS encoders are known to be 100 percent compliant with the HLS specification that Apple submitted to the IETF. Those that are offer secure server connection modes and also play and display metadata correctly in iTunes. Many content delivery networks currently offer poor interpretations of the HLS standard.

With HTTP 2.0 coming, the buyer should beware.

John Schaab is a 44-year broadcast hardware/software sales and marketing veteran who started at International Tapetronics in 1972. He was the manager of TM Century’s advanced technology department and general manager of the department when it was spun off to ON AIR Digital GmbH. He spent the last 11 years as PC products manager and then North American sales manager for Orban before taking on the marketing director position for StreamS Hi Fi.

Share your own experiences and comments about audio streaming technology. Email [email protected].