A Face-Off With Frank Over MPEG Surround and Treatment of Surround Broadcasting Compatibility
The Feb. 1 issue of Radio World included an opinion piece on the editorial page titled “Compatibility Begins at Home.” It described the dilemma broadcasters face as they move toward backward-compatible surround sound broadcasting, only to find that much of the 5.1-channel music content available today is produced in a form that may not be compatible with stereo or mono listening.
The column posed the central question of how FM stereo would have fared if much of the stereo music content of the day had not summed well to mono.
Point
To restate the premise of the opinion piece, this problem stems from practices the music industry has followed since the introduction of DVD-A and SACD release formats for surround content. Unlike the old quad days, or the tradition of the TV and cinematic industries – where an audio encoding system allows a producer to create a single sound mix to address all listening arrangements – the large storage capacity of these new formats allows the inclusion of separate stereo and surround versions of a release’s content.
This allows surround music mixes to be optimized for surround listening only, without regard for how they might sound when “downmixed” to stereo or mono. While this approach provides content creators with considerable creative freedom, it does not mesh well with the spectral efficiency requirements of broadcasting, which call for a singular, compatible solution.
Given this context, the RW opinion concluded that all the effort underway to develop a compatible mono/stereo/surround broadcasting system for HD Radio might be in vain if there was not a large and reliable source of content that could take advantage of it. If surround audio is produced in a non-compatible manner to begin with, no transmission system, however ingenious its design, can make the content compatible downstream.
Therefore, the opinion called for the broadcast and music industries to come to a mutually beneficial compromise, allowing radio stations to broadcast a single music mix that was compatible with all known listening formats.
Counterpoint
A well-known industry veteran took some issue with the opinion, however.
Frank Foti of Omnia/Telos fame – and a host of earlier, high-profile station chief-engineering credits – has been among those broadcast audio professionals actively involved with development of workable surround broadcasting for digital radio. Frank sent RW the following comment, from which I excerpt:
“I mostly agree with ‘Compatibility Begins at Home,’ but it leaves the impression that all surround systems are affected by the downmix problem.
“Downmix is only a problem for the matrix schemes. The MPEG Surround technology can transmit the ‘artistic’ stereo mix as the producer intended it. Listeners receive exactly the familiar stereo/mono version with no modification of any kind. Fidelity might be a bit better, though, since both SACD and DVD-Audio disks have better resolution than 16-bit CDs.
“In some cases,” Foti wrote, “the SACD or DVD-A stereo version might be re-mixed from the multi-track master to improve quality and motivate purchase of new disks. This happened in the transition from vinyl to CDs as well, and was seen as a benefit rather than a problem.
“Occasionally, the stereo version is completely different from the surround. For example, on the ‘Tommy’ SACD there are a couple of songs where the 5.1 versions are longer than the stereo versions. Pete Townshend used a different take for the 5.1. In cases like this, where the stereo mix is not useful, the simple ITU-775, 5.1-to-2.0 downmix method usually results in an acceptable compromise that is stereo/mono compatible and pleasing to listen to, though it may differ from the familiar stereo original.
“Matrix systems force stereo/mono listeners to a downmix because there is no way to transmit the original stereo version. But then the matrix systems go on to phase-shift the channels as well – an even bigger problem. Even if music producers were able to somehow constrain their surround mixes for better downmixed stereo/mono compatibility, you’d have the phase-shifting to contend with.
“I say, let producers mix as they wish. Let them go creatively wild to make the most impressive aural experience they can. Then let’s broadcast that faithfully to wow our listeners,” Foti concluded.
Rebuttal
I also agree with Frank on much of his rebuttal, but have to take issue with a few points. (I know he won’t mind if in the interest of full disclosure we also mention that his company has historical business alignments with Fraunhofer IIS, one of the developers of the MPEG Surround format.)
Yes, MPEG Surround attempts an elegant solution, and addresses some of the difficulties inherent to “matrix” (or what I prefer to call “composite”) surround systems – i.e., those that encode surround information directly into the stereo audio mix, rather than extracting the steering data and transmitting it as a separate signal, as the “component” approach used by the MPEG Surround format does.
But it is not a panacea, nor does it provide its solution without some additional cost over composite systems.
First, if the MPEG Surround system is used in the way Frank suggests, such that the “artistic stereo” audio is broadcast along with steering data gleaned from the same song’s surround mix, the stereo may come through as intended, but now the surround reproduction may be compromised. Let’s call this process substitution. It simply shifts the problem from the stereo to the surround listener, and although this may lessen the impact since that latter audience is smaller for now (and may always be), it’s only a displacement of the issue, not a true solution.
Further, while this substitution approach is indeed unique to MPEG Surround today, it only works when the song’s stereo and surround mixes are released in synchronized forms – which, as Frank mentions, is not always the case.
The MPEG Surround developers refer to these conditions (where the two mixes are actually different songs) as “pathological cases.” I imagine some musicians would resemble that remark, but such cases don’t allow MPEG Surround to work its substitution trick, and they may be on the increase.
So when the two mixes aren’t synchronous, an incompatible surround mix will suffer from the same problem in MPEG Surround as it does with composite systems, since all the formats then rely on a downmix for the stereo audio. (And Frank’s reference to such downmixing by an official ITU recommendation’s algorithm may look impressive, but it doesn’t make an incompatible surround mix sound any better.)
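To make the downmix point concrete, here is a minimal sketch of a 5.1-to-2.0 fold-down followed by a mono sum. It assumes the coefficients commonly cited for the ITU-R BS.775 stereo downmix (center and surrounds each attenuated by about 3 dB, LFE omitted); the function names and the two sample frames are hypothetical, chosen only for illustration, and a real implementation would also manage overall level to avoid clipping.

```python
import math

# Assumption: the roughly -3 dB coefficient commonly cited for the
# ITU-R BS.775 stereo downmix of center and surround channels.
K = 1 / math.sqrt(2)

def downmix_to_stereo(l, r, c, ls, rs):
    """Fold one 5.1 sample frame to stereo; the LFE channel is left out here."""
    lo = l + K * c + K * ls
    ro = r + K * c + K * rs
    return lo, ro

def mono_sum(lo, ro):
    """Sum a stereo frame to mono, as a mono receiver would."""
    return 0.5 * (lo + ro)

# Hypothetical frame A: a vocal placed in the center channel only.
compatible = downmix_to_stereo(l=0.0, r=0.0, c=1.0, ls=0.0, rs=0.0)

# Hypothetical frame B: an effect mixed out of phase between the surrounds.
incompatible = downmix_to_stereo(l=0.0, r=0.0, c=0.0, ls=1.0, rs=-1.0)

print("center-only frame:   stereo", compatible, "mono", mono_sum(*compatible))
print("out-of-phase frame:  stereo", incompatible, "mono", mono_sum(*incompatible))
```

In this sketch the center-only frame survives the mono sum, while the out-of-phase surround effect cancels completely. The downmix arithmetic is applied identically in both cases; the difference lies entirely in how the surround mix was made.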
By the way, the music industry isn’t fond of either the downmix or the substitution approach, since in each case it feels the broadcast may be violating the intent of the artist – either for the stereo or the surround output, respectively. The industry hasn’t waded into the fray yet officially, but it has made informal public comments that it doesn’t want radio messing with either mix. (One could counter that radio has always taken some liberties in this respect, given its tradition of audio processing, but that’s another argument.)
Cost concerns
The substitution approach also would require the maintenance of a double music inventory by broadcasters.
This is not as big a technical problem as it sounds, since it’s fairly easy to store all eight channels (5.1+2 = 8) together in a single audio file, and standard uncompressed audio file formats for this purpose have already been proposed. The ingest process might be a bit trickier and slightly more time-consuming, but audio routing is potentially more problematic, especially since not all content would be stored this way, and the main channel’s digital and analog services would require separate feeds. Audio storage capacity also would be affected (such uncompressed surround + stereo files are four times the size of stereo-only equivalents).
I know Frank’s colleagues at Axia have a good answer for this in moving to an IP-routed system, but some broadcasters may find this a bigger adjustment than they are willing to make just to add surround sound. Ultimately, that kind of system could be a wise choice when a facility move or rebuild is involved, but it’s likely that more justification than surround conversion alone will be required for such a shift.
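The fourfold storage figure mentioned above follows directly from the channel count, as a quick calculation shows. The sketch below assumes 48 kHz, 24-bit uncompressed PCM and a four-minute song; those values and the helper name pcm_bytes are illustrative assumptions, not figures from any proposed file format, and file headers are ignored.

```python
def pcm_bytes(channels, seconds, sample_rate_hz=48_000, bits_per_sample=24):
    """Size of uncompressed PCM audio in bytes, ignoring file headers."""
    return channels * seconds * sample_rate_hz * bits_per_sample // 8

SONG_SECONDS = 4 * 60  # hypothetical song length

stereo_only = pcm_bytes(channels=2, seconds=SONG_SECONDS)
# 5.1 surround is six discrete channels; add two more for the stereo mix.
surround_plus_stereo = pcm_bytes(channels=6 + 2, seconds=SONG_SECONDS)

ratio = surround_plus_stereo / stereo_only
print(f"stereo only:  {stereo_only / 1e6:.1f} MB")
print(f"5.1 + stereo: {surround_plus_stereo / 1e6:.1f} MB")
print(f"ratio:        {ratio:.0f}x")
```

The ratio depends only on the channel count (8 versus 2), so it holds for any sample rate, bit depth or song length as long as both files use the same settings.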
Finally, for optimal operation the MPEG Surround system also imposes an opportunity cost on the broadcaster by requiring the full-time dedication of ~5 kbps of a station’s IBOC payload bandwidth to deliver the “steering data” component. While this may seem negligible today, it may not be considered so if a robust IBOC datacasting business evolves in the future.
Note also that neither the original opinion nor my response here takes any position on the relative aural fidelity or imaging quality of the various surround systems proposed for IBOC use. While those attributes should certainly figure into the holistic assessment that broadcasters undertake when considering any surround solution, the sole issue under discussion here is the compatibility question.
So although the MPEG Surround system offers some unique help on the compatibility problem, it also presents some unique costs to broadcasters in doing so. It also doesn’t truly solve the problem. The sole, complete solution to this issue – as presented in the RW opinion column – remains downmix-compatibility in the original content. Let’s hope the music and broadcast industries can work together and successfully resolve this matter in the near future, as they have done many times in the past.