
The Big Picture

08.17.05

The Architecture of Spatial Coding

A Look at the Proposed MPEG Standard for Sending Surround and Stereo Over the Same Channel

by Skip Pizzi

Much of the discussion regarding surround sound on digital radio worldwide these days has involved a developing specification in ISO/MPEG called Spatial Coding, or SC.

On its surface, it is a mechanism that allows a stereo or mono audio signal to be sent in its usual form, but accompanied by a small auxiliary data stream that describes how a surround mix of the current signal would be created.

Legacy receivers just ignore this aux data stream and play out the stereo or mono audio as usual, while SC-enabled devices interpret the aux data and apply it to the same audio signal to recreate the surround mix. The system is codec-agnostic, so it could conceivably be applied to any transmission or storage scheme. It also is scalable over a wide range of input and output channels (meaning that it is not fixed at encoding 5.1 audio into stereo, but could also be used to extract 10.2-channel audio from a mono signal, for example).
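The backward-compatible behavior described above can be illustrated with a deliberately simplified sketch. The `Frame` structure, its field names and the use of one static gain per output channel are assumptions made for illustration only; the actual spatial data is richer and varies over time and frequency.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Frame:
    # Mono audio samples, sent in their usual form.
    audio: List[float]
    # Hypothetical aux data: one gain per reconstructed output channel.
    spatial_gains: Optional[List[float]] = None


def legacy_decode(frame: Frame) -> List[float]:
    # A legacy receiver ignores the aux data and plays the audio as usual.
    return frame.audio


def sc_decode(frame: Frame) -> List[List[float]]:
    # An SC-enabled receiver applies the aux data to the same audio signal.
    if frame.spatial_gains is None:
        return [frame.audio]
    return [[g * s for s in frame.audio] for g in frame.spatial_gains]


frame = Frame(audio=[0.5, -0.25], spatial_gains=[1.0, 0.5, 0.0])
mono_out = legacy_decode(frame)      # unchanged mono audio
multichannel_out = sc_decode(frame)  # three channels derived from the same audio
```

Both decoders read the same frame; only the second one looks at the aux field, which is why the scheme does not disturb existing receivers.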

Conceptually this seems simple enough, and also sounds like a great solution for managing digital radio transmission that addresses a variety of emerging content and listening environments - just as the stereo multiplex provided backward compatibility to existing FM mono transmission in the 1960s.

But for those not intimately familiar with the technologies involved, how the system pulls this off seems hard to fathom when you actually start to think about it. For those accustomed to matrix surround, it's hard to understand how Spatial Coding can faithfully recreate a surround signal using a very low-bit-rate data channel (~5 kbps), even from a monaural audio feed. (Matrix surround always requires at least a stereo transmission channel, hence its "4-2-4" nomenclature.)

Perhaps hardest of all to grasp, however, even for those comfortable with other 5.1 coding systems (like AC-3), is how the system can allow the use of either a downmixed surround signal for the audio, or a wholly separate "artistic stereo" mix, and still recreate an acceptable multichannel presentation at the receiver. (This implies that the audio signal seen by the decoder may be different from the one used by the encoder in generating the spatial data signal.)

So to sort this all out, let's dig in a bit to the system's interesting design.

Do it with frequency

Like most perceptual audio coding systems, MPEG Spatial Coding does most of its work in the frequency domain. This means that multichannel source audio is first converted from the time domain to the frequency domain, and analysis of each audio channel is then done in so-called critical bands, which are based on how the human hearing sense perceives sound. (The bandwidths of critical bands are set to the minimum frequency resolution of human hearing at various frequencies - some bands are wider than others - and they are the basis for spectral masking algorithms used by all perceptual audio coding systems.)

Instead of using this analysis to reduce digital audio coding bit rates, however, Spatial Coding uses it to extract spatial cues from each band. Such cues are derived by comparing the channels against one another for level and phase differences within each spectral band. These deltas between pairs of channels can then be robustly encoded using a relatively small amount of data, which is sent to the receiver via a data side-chain transmission. (This technique is adapted in part from the older joint-stereo coding technique used by some perceptual coders.)
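As a rough illustration of comparing two channels for level and phase differences within each band, here is a minimal sketch. The function names, the plain DFT and the band edges are assumptions chosen for brevity; they are not the filterbank or the critical-band layout used by the MPEG system.

```python
import cmath
import math


def dft(x):
    # Plain DFT of a real signal, keeping only the non-negative frequency bins.
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n // 2 + 1)]


def band_cues(left, right, band_edges):
    # Return one (level difference in dB, phase difference in radians) pair
    # per band; band_edges lists the bin index where each band starts or ends.
    spec_l, spec_r = dft(left), dft(right)
    eps = 1e-12  # avoids log(0) in silent bands
    cues = []
    for lo, hi in zip(band_edges[:-1], band_edges[1:]):
        p_l = sum(abs(spec_l[k]) ** 2 for k in range(lo, hi))
        p_r = sum(abs(spec_r[k]) ** 2 for k in range(lo, hi))
        cross = sum(spec_l[k] * spec_r[k].conjugate() for k in range(lo, hi))
        level_db = 10 * math.log10((p_l + eps) / (p_r + eps))
        cues.append((level_db, cmath.phase(cross)))
    return cues


n = 64
left = [math.sin(2 * math.pi * 4 * t / n) for t in range(n)]
right = [0.5 * s for s in left]  # same tone at half amplitude, in phase
cues = band_cues(left, right, band_edges=[0, 8, 16, 33])
```

In this example the tone falls in the first band, where the sketch reports a level difference of about 6 dB and a phase difference near zero. Two numbers per band is far less data than the audio itself, which is the intuition behind the small side-chain bit rate.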

Also included in this data signal are prediction signals that help the system manage how audio elements spread over groups of channels are mapped, which is conceptually similar to the steering signals used in advanced matrix systems to aid in image stability. A final component of the data describes the actual audio signal's dynamic deviations from those fixed prediction models - a kind of steering "servo" signal.

Manual or automatic transmission

As this spatial data signal is sent to an aux data output, the multichannel audio signal is meanwhile downmixed to stereo (or mono, if necessary), then reconverted to the time domain for presentation to the transmission or storage system's coding and modulation components.

Alternatively, a wholly separate "artistic" mix (or "handmade downmix") can be substituted at this point, such that the content transmitted or stored will be this alternate signal rather than an automatic downmix of the multichannel audio. In either case, legacy decoders will encounter only the stereo (or mono) signal, while new systems will apply the data channel's spatial coding to the same audio signal and derive a multichannel output.
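For readers unfamiliar with automatic downmixing, the sketch below folds five main channels into stereo and then into mono. The -3 dB weighting of the center and surround channels follows the common ITU-R BS.775 convention and is an assumption here; the article does not specify the coefficients the SC encoder uses, and the LFE channel is simply left out.

```python
import math

G = 1 / math.sqrt(2)  # about -3 dB, assumed weighting for center and surrounds


def downmix_to_stereo(l, r, c, ls, rs):
    # Fold the center and surround channels into left and right.
    lo = [a + G * b + G * d for a, b, d in zip(l, c, ls)]
    ro = [a + G * b + G * d for a, b, d in zip(r, c, rs)]
    return lo, ro


def downmix_to_mono(lo, ro):
    # Average the two stereo channels when only a mono channel is available.
    return [0.5 * (a + b) for a, b in zip(lo, ro)]


# One-sample example: signal in the left and center channels only.
lo, ro = downmix_to_stereo(l=[1.0], r=[0.0], c=[1.0], ls=[0.0], rs=[0.0])
mono = downmix_to_mono(lo, ro)
```

An artistic mix would replace `lo` and `ro` with independently produced stereo audio at this point.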

As noted earlier, the spatial data is robust enough to yield a multichannel mix even when an artistic mix is transmitted instead of the original multichannel audio's downmix. Nevertheless, a relatively new feature of the system allows the SC encoder to compare its input and output audio signals. If it detects a substantial difference - as it might in some cases where the artistic mix option is selected - it can adjust its spatial coding parameters so they are optimized for the decoder to reconstruct the multichannel audio signal from the artistic stereo mix instead of the encoder's own automatic downmix.
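One possible reading of this comparison step is sketched below: the encoder measures, band by band, how far the artistic mix departs from its own automatic downmix, and emits a correction only where the gap is substantial. The function name, the per-band energy inputs and the 1 dB threshold are all hypothetical; the article does not describe the actual mechanism in this detail.

```python
import math


def artistic_compensation(auto_band_energy, artistic_band_energy, threshold_db=1.0):
    # Per-band gain in dB that would bring the artistic mix to the level of
    # the automatic downmix; 0.0 where the two already agree closely.
    eps = 1e-12  # avoids division by zero in silent bands
    gains = []
    for e_auto, e_art in zip(auto_band_energy, artistic_band_energy):
        diff_db = 10 * math.log10((e_auto + eps) / (e_art + eps))
        gains.append(diff_db if abs(diff_db) > threshold_db else 0.0)
    return gains


# Three bands: identical, artistic mix much quieter, artistic mix nearly equal.
gains = artistic_compensation([1.0, 1.0, 4.0], [1.0, 0.25, 4.1])
```

Only the second band exceeds the threshold here, so only that band receives a correction (about 6 dB); the other two are left alone.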

Tweaks

There are several other clever techniques used in the MPEG-SC system that improve its performance and efficiency. The system also offers quite a bit of encoding adjustment and scalability, along with the ability to remain transport-agnostic, allowing it to be used across a variety of applications besides digital radio broadcasting. (The spatial data channel includes a metadata block that communicates these settings to the decoder for optimum performance and extensibility.)

To learn more about this system's inner workings, see AES Convention Paper 6447, "The Reference Model Architecture for MPEG Spatial Audio Coding," presented at the 118th Convention, May 2005, Barcelona, Spain.

Skip Pizzi is contributing editor of Radio World.

 
