...One Giant Leap for an Industry?
We conclude our exploration of surround sound on the radio with a discussion of compressed digital formats and the recent proposals for digital radio applications.
You may recall from previous articles in this series that Dolby Labs became the leading developer of successful surround formats in the analog domain, which appeared first in cinema sound systems, and eventually migrated to the home theater via the 4-2-4 matrix encoding of consumer VHS tapes' hi-fi soundtracks. The company was therefore poised to move surround sound into the digital domain, and did so with equal success.
After an abortive attempt at digital surround for film soundtracks by Kodak in the early 1990s, Dolby released its AC-3 format in 1992 (once again, initially in the cinematic environment only), which provided a 5.1-channel mix based on the analog "six-track" surround system featured in earlier 70 mm films, using a compressed audio data rate of 320 kbps or higher. Subsequently, the AC-3 format - commercially marketed as Dolby Digital - was adopted by the Laserdisc, ATSC and other DTV broadcast systems, as well as the DVD-V format, at 384 or 448 kbps. AC-3 also provided a stereo ("2.0-channel") option, and a small amount of metadata about the technical parameters of the content.
Two other compressed digital surround formats followed, from DTS and Sony (the latter called SDDS), using somewhat higher data rates (i.e., less aggressive compression than AC-3). These were also initially designed for cinematic applications, but DTS has subsequently become an optional DVD format for consumer releases. Most major motion pictures are now released with all three formats included on their distribution prints, since the formats can all coexist (along with legacy analog optical stereo) on 35 mm film. DTS and SDDS can also include up to 7.1 channels of audio. The additional channels are generally placed as "front surrounds," which can help produce more uniform atmospheres in theatrical environments.
More recently, the DVD-A and SACD formats have offered a number of new and catalog music releases in digital 5.1 surround, using Dolby Digital or DTS. These products also offer stereo mixes in the uncompressed PCM format, usually at higher resolutions than the 44.1 kHz/16-bit format used on CDs. (For example, DVD-A can include up to 192 kHz/24-bit audio.) DVD-A also offers an "in-between" format called Meridian Lossless Packing (MLP), which allows those higher resolutions to be used in multichannel mixes.
Importantly, when surround mixes are offered on DVD-A or SACD formats, in most cases separate 5.1-channel and stereo mixes of all content are provided, so that the consumer can choose the one best suited for the playback environment at hand. This is in contrast to digital video formats with surround sound, in which a single 5.1-channel soundtrack is intended to "downmix" appropriately to 4-channel matrix surround, or to stereo, or to mono. So unlike the cinematic and TV sound industry, the music industry has established a tradition of not guaranteeing downmix compatibility of its 5.1 mixes to stereo or mono listening. (We'll return to this point below.)
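To make the downmix-compatibility concern concrete, here is a minimal sketch of folding a 5.1 mix down to stereo and then to mono. The -3 dB gains on the center and surround channels and the decision to drop the LFE channel are assumptions chosen for illustration (they are commonly used defaults, not values taken from any format discussed here), and the function names are hypothetical.

```python
import math

# Assumed -3 dB gain for center and surround channels (illustrative default).
MINUS_3_DB = 1 / math.sqrt(2)


def downmix_to_stereo(l, r, c, lfe, ls, rs):
    """Fold one 5.1 sample frame to stereo.

    The LFE sample is accepted but discarded, which is an assumption
    made for this sketch.
    """
    left = l + MINUS_3_DB * c + MINUS_3_DB * ls
    right = r + MINUS_3_DB * c + MINUS_3_DB * rs
    return left, right


def downmix_to_mono(left, right):
    """Average a stereo pair into a single mono sample."""
    return 0.5 * (left + right)


# A sound placed only in the surrounds, with opposite polarity in each,
# survives the stereo fold-down but cancels completely in mono.
left, right = downmix_to_stereo(l=0.0, r=0.0, c=0.0, lfe=0.0, ls=1.0, rs=-1.0)
mono = downmix_to_mono(left, right)
print(left, right, mono)
```

A mix built without regard for this arithmetic can lose or distort elements when folded down, which is why downmix compatibility has to be a deliberate choice at the mixing stage.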
Now radio broadcasting comes to the party late, which is not necessarily a bad thing, and considers the many methods available to incorporate surround (or, more generically, "multichannel") sound as it converts to digital broadcasting.
First and simplest, receivers could incorporate one of several available systems for derived surround (or "pseudo-surround") sound. Broadcasters would not have to do anything to enable this, because the receivers would synthesize a surround soundfield from existing stereo content. Most audio professionals feel that this approach falls short of truly enabling surround sound, however, and that if digital radio chose such a path it would be another opportunity missed.
Among true encode-decode systems, then, the simplest alternative is the adoption of a matrixed surround system, along the lines of the home theater environment. Encoding of content would be simple, and existing two-channel infrastructures could continue to be used throughout the radio plant. Multipath problems that prevented this system's use in analog stereo radio broadcasting would be solved in an IBOC system. The major drawback here is that - unlike the cinema/TV world - there is no large inventory of radio content (i.e., commercial music releases) already encoded in matrixed surround. The legacy nature of this format makes it unlikely that this will change. The commercial music that is produced in surround today (on DVD-A and SACD) uses one of the compressed digital 5.1 formats described above.
Thus to use the matrixed approach, discrete digital 5.1 surround audio content would have to be converted to a four-channel matrix system through a transcoding process, and compatible matrix decoders would have to be placed in digital radio receivers. A company called SRS, familiar to many PC audio users, has proposed just such a system, based on its implementation of the long-standing but not widely implemented Circle Surround matrix. As with any matrix system, the surround content is inherently designed to downmix well to stereo or mono, but surround imaging may not be as robust as in discrete systems.
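The trade-off in a matrix system can be illustrated with a simplified passive 4-2-4 matrix. This is a sketch only: it omits the phase shifting and active steering logic that commercial matrix systems use, it is not the Circle Surround algorithm, and the coefficients and function names are assumptions made for illustration.

```python
import math

G = 1 / math.sqrt(2)  # assumed -3 dB mixing gain


def matrix_encode(l, c, r, s):
    """Fold four channels (left, center, right, surround) into two.

    Center goes equally to both outputs; surround goes to both with
    opposite polarity, so it cancels when the outputs are summed to mono.
    """
    lt = l + G * c + G * s
    rt = r + G * c - G * s
    return lt, rt


def matrix_decode(lt, rt):
    """Passive decode: recover four outputs from the two-channel signal."""
    return {
        "L": lt,
        "R": rt,
        "C": G * (lt + rt),
        "S": G * (lt - rt),
    }


# A center-only source decodes to the center at full level, but it also
# remains in the left and right outputs at about -3 dB.
decoded = matrix_decode(*matrix_encode(l=0.0, c=1.0, r=0.0, s=0.0))
print({name: round(value, 3) for name, value in decoded.items()})
```

The leakage of the center signal into the left and right outputs is the kind of crosstalk that makes matrixed imaging less robust than discrete delivery, while the two-channel signal itself remains an ordinary stereo program.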
So as it has been since the quad days, there are those who feel a discrete solution is preferable to a matrix, and this argument is made even more strenuously in the context of a transition to digital radio. This camp argues that digital radio would be better served by a discrete 5.1-channel compressed digital format than by an older matrix method. Unlike packaged media, however, bandwidth limitations in broadcasting require a single audio coding approach that can compatibly address mono, stereo and surround receivers.
This has led to the development of a system called parametric surround, which allows the addition of a "side-channel" of steering data to an existing stereo codec platform. New multichannel decoders can reconstruct a surround mix from the signal, while legacy stereo decoders ignore the steering data and simply decode the stereo audio signal as before. Currently two such formats have been developed, one by Fraunhofer and Agere Systems, the other from Philips and Coding Technologies. ISO/MPEG has initiated an effort to converge these two formats into a single parametric surround coding standard.
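The backward-compatibility idea can be sketched in a few lines. The frame layout, field names and decoder functions below are hypothetical and greatly simplified; they do not represent either proposed format's actual bitstream, only the principle that the steering data rides alongside an unchanged stereo payload.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Frame:
    """One hypothetical coded audio frame."""
    stereo_payload: bytes             # core stereo codec data, unchanged
    steering: Optional[bytes] = None  # optional parametric side-channel


def legacy_stereo_decode(frame):
    """A legacy decoder reads only the stereo payload and ignores the rest."""
    return ("stereo", frame.stereo_payload)


def surround_decode(frame):
    """A surround-capable decoder uses the steering data when it is present,
    and falls back to plain stereo when it is not."""
    if frame.steering is None:
        return ("stereo", frame.stereo_payload)
    return ("surround", frame.stereo_payload, frame.steering)


frame = Frame(stereo_payload=b"core-audio", steering=b"cues")
print(legacy_stereo_decode(frame)[0], surround_decode(frame)[0])
```

Both decoders consume the same broadcast; only the receiver's capability determines whether the listener hears stereo or a reconstructed surround mix.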
In general terms, this approach would "steal" a fixed amount of the digital audio channel's bits and dedicate them to steering data. Initial systems have used 16 kbps for this parametric steering channel, so in the case of IBOC, with its 96 kbps audio channel, this would leave 80 kbps for audio coding. Recent tests of the HDC codec have shown little perceived difference between 96 and 64 kbps coding of audio, so this reassignment of the datastream should not produce much penalty for stereo listening. The surround sound results at these data rates are impressive, and are typically difficult to distinguish from the original discrete 5.1 source material.
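The bit budget described above is simple arithmetic, sketched here. The 96 kbps total is inferred from the figures in the text (80 kbps of audio plus 16 kbps of steering data); the function and constant names are illustrative only.

```python
TOTAL_KBPS = 96      # inferred IBOC audio channel rate (80 + 16)
STEERING_KBPS = 16   # rate used by initial parametric systems, per the text


def audio_budget(total_kbps, steering_kbps):
    """Return the bit rate left for the stereo audio codec."""
    if not 0 <= steering_kbps <= total_kbps:
        raise ValueError("steering rate must fit within the total rate")
    return total_kbps - steering_kbps


remaining = audio_budget(TOTAL_KBPS, STEERING_KBPS)
share = STEERING_KBPS / TOTAL_KBPS
print(f"{remaining} kbps for audio, {share:.0%} of the channel for steering")
```

The remaining 80 kbps sits between the 96 and 64 kbps rates that the HDC tests found hard to tell apart, which is the basis for expecting little penalty for stereo listeners.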
Unlike matrixed systems, it is assumed that this type of surround encoding would not happen until the broadcast codec is applied (i.e., in the transmission air chain), so surround audio signals would have to be maintained in discrete (compressed or uncompressed) modes until airing. This presents challenges to existing broadcast production infrastructures.
A third alternative therefore has been proposed, which allows the generation of a parametric steering channel during the upstream production process, with this signal encoded as a watermark that is perceptually hidden in an uncompressed digital stereo audio signal. Such content can be stored, edited and routed throughout the existing broadcast infrastructure as stereo audio, and broadcast as either an analog or digital signal without apparent consequence, according to the purveyors of this approach, Neural Audio and Harris Broadcast. This implies that the watermark remains inaudible after broadcast processing, and survives the digital broadcast codec's compression so it can be interpreted properly by a surround decoder.
Because the watermark must remain inaudible, this system is constrained as to the data rate it can apply to the steering channel. As a result, the Neural steering signal uses about half (or less) the data rate employed by parametric systems for their steering signals. This has caused some to comment that the watermarked approach provides less accurate imaging than parametric systems.
A potential downside to both the parametric and watermark surround systems is that (for optimal surround results) they rely on the stereo broadcast signal being a downmix of the original 5.1 source, so it can eventually be recombined with the steering data to extract the surround mix on surround-capable decoders. But the surround downmix may not produce an aesthetically pleasing result to stereo listeners.
(As described earlier, 5.1-channel music produced today is generally mixed without regard for downmix compatibility, since the release formats usually include separate stereo mixes on the same discs. In fact, the DVD-A format includes a "Do Not Downmix" flag for multichannel content, and some record companies are routinely turning on this flag in their current releases.)
Stereo listening will likely account for the lion's share of the audience for some time, if not always, so some engineers object to creating this potential problem for the majority of the audience simply for the sake of a probably permanent minority of surround listeners.
As a reaction to this, the parametric surround camp has developed an artistic downmix option, by which a dedicated stereo mix can be injected into the broadcast downstream of the parametric steering signal generation in the airchain, so the stereo listener hears the "regular" stereo mix. But this can foil the surround decoding, thus potentially defeating the whole point of the process.
Pick your poison
It appears that there is no clear winner here, with each of the three proposed methods (matrix, parametric or watermark) having its respective set of pros and cons. The NRSC and the World DAB Forum have each begun to explore these options for possible standardization or recommendation, but it is uncertain whether any single format will be selected in either the IBOC or DAB environments.
In fact, all of these systems as currently configured can technically coexist, and broadcasters could choose freely among them. Stereo listening could continue, and surround listening would be enabled if the proper decoder were available on a receiver. If multiple systems proliferated, it is likely that receivers could incorporate multiple decoders, just as many of today's home theater systems do for TV-audio surround (e.g., Dolby Digital and DTS).
The music industry - and radio broadcasters who produce their own original surround content - may also consider this movement, and could adapt their stylistic approach to 5.1 mixing such that aesthetic compatibility of surround downmixes to stereo and mono is more commonly assured. If this practice does not occur, such a content-compatibility problem could prove the downfall for the nascent surround-sound enterprise in digital radio.
The topic of digital radio surround will naturally occupy much conversation in upcoming months, but it is probable that the ultimate choices - including, once again, whether surround sound finally comes to radio - will be made in the marketplace.