Mike Pappas says broadcasters need a surround sound option that works with existing content and equipment
This is a rebuttal to a commentary by Steve Church of Telos Systems in the June 16 issue, one of a series of opinion pieces on surround sound technologies as applied to radio. Church stated the answer does not lie in matrixed systems, but I have first-hand experience with another approach that is ready to go today.
As someone who has actually broadcast 5.1 using the Harris Broadcast/Neural Audio 5225 system, I thought I might comment on its supposed issues.
1. The Harris/Neural Audio 5225 is not a matrix system. Matrix systems can (and usually do) have severe image stability problems. Our broadcasts with the Neural 5225 system have exhibited rock-solid image stability even through multiple lossy codec passes, satellite up and down links and 2-channel DAW editing.
2. The Neural 5225 produces outstanding conventional stereo, carrying a watermark, from 5.1 surround sound. The center channel, low-frequency effects channel and rear surrounds are handled correctly, and there are no surprises in the stereo mix.
3. The watermarking is extremely robust and we haven’t been able to damage it.
4. The advantage of the 5225 system is that the watermarked stereo can be shipped via a conventional stereo broadcast plant using all of the conventional stereo equipment that broadcasters already have, including CD recorders, MiniDisc machines, ISDN codecs, analog STLs, digital STLs, air processors, transmitters and editors. It makes no difference whether the plant is analog or digital.
So how does the Harris/Neural 5225 system do this magic? Because of pending patent applications, in-depth information on the system is limited, but here is an overview.
Neural spatial compression is a new methodology that meets 2.0/5.1 broadcast challenges head-on. It lets the distributor or broadcaster capture original source 5.1 content and “downmix” it (via the Harris/Neural 5225 surround production appliance) to a 2.0 channel format that survives aggressive lossy compression, editing (yes, editing!) and even conversion to analog.
Stereo editing systems will work just fine with 5225 downmixed 5.1 because it’s not a bitstream (unlike certain other methodologies). Encoded content may be broadcast, stored or distributed through existing 2.0 infrastructures.
It may be rendered to 5.1 at any point for production or broadcast “confidence monitoring.” After capture, it may be stored in the server and treated as any other stereo content.
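For readers who have not worked with a downmix, here is a minimal sketch of a plain 5.1-to-stereo fold-down for a single sample frame. It uses the textbook -3 dB coefficients on the center and surround channels. It is not the 5225's process, whose details are not public, and it adds no watermark. The function name and the choice to drop the LFE channel are assumptions made to keep the illustration short.

```python
import math

# -3 dB gain applied to centre and surround channels in a textbook fold-down.
G = 1 / math.sqrt(2)

def downmix_51_to_stereo(l, r, c, lfe, ls, rs):
    """Fold one 5.1 sample frame down to a stereo (left, right) pair.

    Assumption for this sketch: the LFE channel is simply discarded.
    A real product may treat it differently.
    """
    left = l + G * c + G * ls
    right = r + G * c + G * rs
    return left, right

if __name__ == "__main__":
    # A centre-only signal lands equally in both stereo channels.
    print(downmix_51_to_stereo(0.0, 0.0, 1.0, 0.0, 0.0, 0.0))
```

The output of such a fold-down is ordinary two-channel audio, which is why it can pass through stereo editors, codecs and storage like any other stereo content.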
Consistent, renderable audio
To broadcasters, this could be a godsend, as the existing infrastructure, including production and storage, is 2.0. The stereo and 5.1 “mix” of content is 100-percent compatible with the Harris/Neural Neustar codec conditioner. This results in HDC-compatible content that is controlled, consistent and “renderable” to a 2.0 or 5.1 spatial environment.
While this is the attraction of 5:2:5 matrixes, matrix methodologies fall far short of what is possible regarding perceived discreteness. Matrixes are an excellent solution in that they solve transition compatibility issues and allow, to a certain extent, 2.0 content to co-exist with 5.1 content while extending the value of the ubiquitous 2.0 infrastructure.
That being said, matrixes do not satisfy the distributor’s and consumer’s expectation of what is now called “discrete” 5.1. The Neural Spatial Environment Engine rendering process allows the consumer to enjoy a consistent spatial environment with as many or as few loudspeaker elements as are available. It can spatially render content ranging from 5.1 original source digital to mono analog.
It is obvious that the interspersion of legacy 2.0 and 5.1 content is a reality. Unless this is handled on a system basis, the result will be less than transition-proof. In fact, an approach that cannot successfully integrate legacy content with “modern” content in a way that meets the consumer’s expectations is unfortunately naive and slows the adoption of 5.1.
The Neural SEE is a patent-pending programmable, transform-based spatial rendering system. SEE can render any two-dimensional audio source – both 5.1 and stereo are 2-D, for example – to as many as 256 or as few as two outputs with a high degree of perceived separation.
During encoding of 5.1 original source material, the 2-D image envelope of the 5.1 content is embedded in the two downmixed audio channels in the form of watermarking. Intensity/coherence watermarking is an excellent choice because of its similarity to the image construct of naturally occurring 2-D stereo and its compatibility with already prevalent Lt/Rt matrix content.
This simplifies the inevitable integration of 2.0 and 5.1 content on both the broadcaster and consumer sides. Upon decoding, the image envelope of the original 5.1 content may be re-synthesized based on the intensity/coherence information contained in the watermark. Using this methodology, an impression of the original source 5.1 content is rendered from the two downmixed audio channels with a high degree of merit.
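To give a feel for what “intensity” and “coherence” mean for a stereo pair, the sketch below measures both over a short block of samples. It is only an illustration of the two cues, not the Neural encoder or decoder. The function name, the balance scale of -1 to +1 and the use of zero-lag normalized correlation as a stand-in for coherence are all assumptions.

```python
import math

def intensity_and_coherence(left, right):
    """Return (balance, coherence) for one block of stereo samples.

    balance:   -1.0 means all energy is in the left channel, +1.0 all in the right.
    coherence: zero-lag normalized cross-correlation, from -1.0 to +1.0
               (a crude stand-in chosen for this sketch).
    """
    energy_l = sum(x * x for x in left)
    energy_r = sum(x * x for x in right)
    cross = sum(a * b for a, b in zip(left, right))

    total = energy_l + energy_r
    if total == 0:
        return 0.0, 0.0  # silent block: no meaningful image

    balance = (energy_r - energy_l) / total
    denom = math.sqrt(energy_l * energy_r)
    coherence = cross / denom if denom > 0 else 0.0
    return balance, coherence

if __name__ == "__main__":
    tone = [1.0, -1.0, 1.0, -1.0]
    print(intensity_and_coherence(tone, tone))        # centred, fully coherent
    print(intensity_and_coherence(tone, [0.0] * 4))   # hard left
```

A centered, in-phase source yields a balance near zero and a coherence near one, while a source panned hard to one side pushes the balance toward -1 or +1.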
Doing the downmix
The decoder segregates spatial elements of stereo based on the image envelope naturally residing in the content; nothing is either created or destroyed. If you re-downmix the 5.1 rendering of stereo back to 2.0 stereo, the result is “perfect reconstruction” of the original stereo content with the stereo image completely intact.
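The “nothing is created or destroyed” property can be illustrated with a toy three-channel example. The sketch assumes a very simple upmix that moves part of the common left/right signal into a center channel and subtracts exactly what it moved from the front pair. This is not the SEE algorithm. The function names and the `center_share` parameter are hypothetical.

```python
import math

G = 1 / math.sqrt(2)  # -3 dB centre gain used when folding back down

def toy_upmix(l, r, center_share=0.5):
    """Split a stereo sample into (front_left, front_right, centre).

    Whatever is routed to the centre is removed from the front pair,
    so no signal is invented or lost.
    """
    c = center_share * (l + r) / 2
    return l - G * c, r - G * c, c

def fold_down(fl, fr, c):
    """Fold the three channels back to stereo with the same centre gain."""
    return fl + G * c, fr + G * c

if __name__ == "__main__":
    original = (0.3, -0.8)
    restored = fold_down(*toy_upmix(*original))
    # The round trip matches the original to within floating-point rounding.
    print(all(math.isclose(a, b, abs_tol=1e-12) for a, b in zip(original, restored)))
```

Because the fold-down adds back exactly what the upmix removed, the round trip returns the original stereo regardless of how much signal was steered to the center.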
Our extensive listening tests of both the reconstructed 5.1 and the watermarked stereo have shown that image stability is not impacted by the Neural SEE system.
The advantage of handling 5.1 as watermarked stereo is huge. Broadcast plants are stereo, not six-channel.
Upgrading a stereo plant to discrete 5.1 is not a minor issue; when we did a study at KUVO, the cost ran into the tens of thousands of dollars. Many of the proposed 5.1 systems also require the use of HD Radio transmitted bits. The number of bits these systems propose to use, 16 kbps, is enough bandwidth to support a voice-grade secondary audio channel. It is my feeling that giving up 16 kbps to support 5.1 is not a good use of a very limited amount of bandwidth.
Our empirically derived results indicate that the Neural 5225 system is robust, makes great audio, has rock-solid imaging and works today.
RW welcomes other points of view.