Neural Audio Co-Founder Responds to Questions Posed by Steve Church, Frank Foti
This is in response to a Guest Commentary authored by Telos/Omnia's Steve Church and Frank Foti in the March 16 issue.
Where has Telos/Omnia been? The Neural system has been the focus of several presentations, technical panels and demonstrations for AES, NAB and CES since 2001. It has been scrupulously tested and validated by XM Satellite Radio, National Public Radio, Harris, Ibiquity and hundreds of engineers who have taken the time to verify Neural's "outrageous claims."
"Smoke and mirrors" don't survive reality. The Neural technology has been "on the air" for quite some time now and has been proven to be stable, reliable and practical. Many radio stations have planned their air chains to include Neural's surround technology.
Telos/Omnia has received countless invitations to visit the Neural facility and experience Neural's surround technology. They apparently refuse. Somehow, Telos/Omnia continues to cling to the dangerous assumption that bitstream-based technologies are compatible with the day-to-day operations of a real, live broadcast facility.
Neural Audio is distributing practical, workable and affordable 5.1 surround sound technology into the radio broadcast arena. That is a fact. The act of casting baseless aspersions in a public forum vs. the due diligence of proving a concept in the field is so lame that it merits a (really old) proverb:
"It is not the critic who counts, not the one who points out how the strong man stumbled or how the doer of deeds might have done them better. The credit belongs to the man who is actually in the arena."
Stop wasting time. Get into the arena and make your system function in the real world. Compete. Prove that it works in a real (from mic to antenna) broadcast environment. Neural has.
Your comments are in quotation marks; Neural's responses follow each one.
"With the Neural system, stereo is always derived (downmixed) from the 5.1 multichannels [sic]. Mike, if this is a satisfactory procedure, why don't DVD-Audio and Super Audio CD disks use the same approach? They could save a lot of bits and trouble by providing only the surround mix and letting stereo players do a mechanical downmix. But they never do, instead providing listeners with human-optimized mixes for each mode."
DVD-A and SACD formats don't have to. Broadcasters don't have these lofty bit rates to work with.
All spatial audio coders, including Neural and the Fraunhofer variant Telos/Omnia is promoting, must downmix 5.1 fine structures to two fine structures as part of the spatial compression process. Additional information is provided along with the stereo downmix to reconstruct the 5.1 content.
The "mechanical downmix" is something that has to happen in all spatial coding, including Fraunhofer, Agere, Philips and Coding Technologies. Neural happens to have one that works very well.
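To make the term concrete, here is a minimal sketch of a fixed, non-adaptive 5.1-to-stereo downmix of the kind the question calls "mechanical." It assumes the commonly used ITU-style coefficients (center and surrounds folded in at -3 dB, LFE discarded). The function name and the sample frame are hypothetical, and this is not a description of Neural's own downmix, which is not disclosed here.

```python
import math

# Assumed ITU-style coefficient: center and surrounds are folded in at -3 dB.
# This is a generic fixed downmix for illustration, not Neural's process.
G = 1.0 / math.sqrt(2.0)


def downmix_5_1_to_stereo(frame):
    """Fold one 5.1 sample frame (channel name -> sample) into (left, right).

    The LFE channel is discarded, as is common in simple stereo downmixes.
    """
    left = frame["L"] + G * frame["C"] + G * frame["Ls"]
    right = frame["R"] + G * frame["C"] + G * frame["Rs"]
    return left, right


# Hypothetical sample values, chosen only to exercise the function.
frame = {"L": 0.2, "R": 0.1, "C": 0.4, "LFE": 0.3, "Ls": 0.0, "Rs": -0.1}
left, right = downmix_5_1_to_stereo(frame)
print(left, right)
```

Because the coefficients never change, a downmix like this cannot react to the program material.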
Criticizing Mr. Mike Pappas for striving for stereo compatibility in his 5.1 mixes is really reaching for something to complain about. If I recall correctly, mono-compatible stereo was something that was desirable in the early (and present) days of the broadcast transition to stereo.
Stereo was never broadcast along with a separate mono mix. The mono was derived from stereo. History teaches us that the same should be true for the surround transition.
"The KUVO test broadcasts were with a live concert that your station produced for itself in surround, right? So, what reference is available to know that there, "were no surprises in the stereo mix," since there was no stereo original for comparison?"
Among the most obvious "surprises" in any stereo downmix are artifacts in the frequency domain. These are easy to hear, as they resemble the objectionable effect ("comb-filtering") of early "non-mono compatible" stereo, easily recognized by the mature broadcasting engineer.
The non-adaptive, mechanical downmixing used in side-information-based systems doesn't account for frequency domain artifacts caused by ITDs in the content. Neural's system does.
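The comb-filter effect can be illustrated with a short sketch. It assumes the simplest case: the same signal appears in two channels with a fixed time offset, and a fixed downmix sums them with equal gain. The function names and the 1 ms delay are hypothetical choices for illustration, not measurements of any system.

```python
import cmath
import math


def summed_gain(freq_hz, delay_s):
    """Gain of x(t) + x(t - delay) relative to x(t) alone, for a sine at freq_hz."""
    return abs(1 + cmath.exp(-2j * math.pi * freq_hz * delay_s))


def notch_frequencies(delay_s, max_hz):
    """Frequencies up to max_hz at which the two copies cancel completely."""
    notches = []
    k = 0
    while (2 * k + 1) / (2 * delay_s) <= max_hz:
        notches.append((2 * k + 1) / (2 * delay_s))
        k += 1
    return notches


delay = 0.001  # hypothetical 1 ms time difference between channels
print(notch_frequencies(delay, 3000))  # notch positions in Hz
print(summed_gain(1000, delay))        # the two copies reinforce here
print(summed_gain(500, delay))         # the two copies cancel here
```

With a 1 ms offset the nulls fall near 500, 1500 and 2500 Hz, a regularly spaced pattern that gives the effect its name. An adaptive downmix would have to detect and compensate for such offsets before summing.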
"You really need to test with DVD-Audio or SACD music as the source, so you can evaluate carefully and accurately if the stereo is OK. This is going to be critical to acceptance of a broadcast surround system since weird-sounding stereo on familiar music is certainly going to trigger protests from program directors, listeners and owners."
Yes. Objective testing is a good thing to do. Neural prefers MUSHRA (MUltiple Stimuli with Hidden Reference and Anchor) statistical testing as a validation of performance.
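For readers unfamiliar with the method, a MUSHRA session has listeners grade several versions of the same item on a 0-100 scale, including an unlabeled copy of the reference and a deliberately degraded anchor. The sketch below shows one simple way such grades might be summarized, as a mean with an approximate 95 percent confidence interval. The ratings are invented purely for illustration and are not results from any Neural, NPR or XM test; the function and key names are hypothetical.

```python
import math
import statistics


def summarize(scores):
    """Return (mean, half-width of an approximate 95% confidence interval)."""
    mean = statistics.mean(scores)
    if len(scores) < 2:
        return mean, float("nan")
    sem = statistics.stdev(scores) / math.sqrt(len(scores))
    # Normal approximation; a small listening panel would use a t value instead.
    return mean, 1.96 * sem


# Hypothetical 0-100 grades from five listeners. Illustrative only, not measured data.
ratings = {
    "hidden_reference": [100, 98, 95, 100, 97],
    "system_under_test": [82, 75, 88, 79, 84],
    "low_anchor": [20, 15, 30, 25, 18],
}

summary = {name: summarize(vals) for name, vals in ratings.items()}
for name, (mean, half_width) in summary.items():
    print(f"{name}: {mean:.1f} +/- {half_width:.1f}")
```

A well-run test should place the hidden reference near the top of the scale and the anchor near the bottom; the codec under test is judged by where it lands between them.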
During the transition from stereo to surround/stereo interoperability the grand majority of your audience will be listening to 5.1/stereo downmixed content interspersed with legacy stereo. As the transition continues, the majority of listeners will hear the 5.1 original source content reconstructed to 5.1.
The 5.1 content will be interspersed with legacy stereo that is rendered in a 5.1 format. As this happens, all content, regardless of original and/or eventual spatial goals, must be perceived as natural, entertaining and in context with the intention of the original content.
"What was done to try and damage it? No details were provided on the signal path for your experiment. But the exceptional claim that the Neural watermark can pass through multiple codecs with no problem requires exceptional proof!"
Mr. Pappas has authored a painfully detailed account of validating the performance of the Neural system. Extensive tests performed by National Public Radio agree with Mr. Pappas' findings... "The audio will fail before the watermark does." The article appeared in a recent RW supplement about the "Toast of the Nation" broadcast.
"In a recent Radio World article, Neural gave 16 kbps as its watermark bit rate. A 5-10 bits-per-second rate is considered robust in the context of anti-piracy watermarking. Experts say that around 100 bits per second would be pretty much the limit in order to withstand passage through usual codecs."
After a review of Radio World articles, I don't see where Neural gave 16 kbps as the "watermark bit rate." Neural doesn't discuss the capacity of the watermark. (Ed. Note: In the Nov. 17, 2004 issue, Mike Pappas of KUVO made reference to a 16 kbps watermark data rate. Neural disputes the figure.)
The real innovation of the Neural system isn't in the watermark, it's in a new ultra-efficient (patent-pending) method for translating the ITD, ILD and ICC spatial descriptors to a format that naturally fits watermarking.
Benchmarking the performance of a spatial audio coder by the available side information data rate under ideal conditions is a bad idea. Side information doesn't survive editing (cross-fades, voice-overs), tandem coding (ISDN, STLs, etc.) or analog stages (mixers, processors or cross-points), and it always penalizes the consumer with the side information cost even when they are hearing stereo or mono. The question to be asked is "does it get the job done in the broadcaster's real world?" XM, NPR and Harris think so.
Even Fraunhofer claims that its "Scalar Costas Scheme"-based watermarking has a low error rate data capacity of about 6.8 kbps. Don't sell watermarking short.
"So what is going on here? What is really the rate, and what tests have been performed, under what conditions?"
Users of the Neural system have been less concerned with the watermark bit rate and more concerned with the performance under real-world conditions. It has been tested under the most stringent of real-world conditions (by independent parties) and the spatial information it conveys clearly survives.
"Since the system is being proposed for analog FM as well as HD, what happens to it with multipath?"
Neural didn't originally propose its system for analog broadcast. However, many a clever broadcaster has concluded that conditions that support stereo content work just fine with the watermark.
It isn't, however, designed to work transparently through a mono format. Under heavy multipath conditions, the 5.1 system could "blend" to a "synthetic aperture" format, the spatial equivalent of the "blend-to-mono" feature of present stereo receivers. Under those conditions it could be used with analog broadcast with no more difficulty than regular stereo.
"Neural's secrecy is a barrier to making a valid assessment of their quite outrageous claims."
Neural's claims are not outrageous to those skilled in the art. Yes, there is secrecy. All broadcast and receiver manufacturers/OEMs work with Neural under strict non-disclosure agreements. With an NDA in place there is no barrier to disclosure and assessment.
"What would happen if two pre-coded [sic] sources (music stored on a delivery system, for e.g.) were to be cross-mixed on-air?"
A flawless, artifact-free, on-air cross-mix.
"During the overlap time, wouldn't the watermark be corrupted and the received result sound pretty bad, or collapse to stereo?"
"Has cross-mixing been demonstrated?"
Yes. It works very well.
"How would a surround or panned mic be added to the mix for voice-overs? This is something you wouldn't have tested in your live concert demonstration, but certainly cross-mixing and announcer voice-overs are routine in normal radio programming."
Intensity/coherence watermarking bears a strong spatial resemblance to the natural construct of legacy stereo or panned mono sources. Downmixed 5.1 may be mixed with stereo and/or mono with no ill consequences.
In fact, mono, stereo and 5.1 cross-mixing was an integral part of the broadcast. The received broadcast was flawless in both stereo and in 5.1.
"...the incremental costs to move from stereo to discrete surround are near zero. The majority of studios on-air today are still analog and need to be upgraded to digital anyway, so the surround capability comes along for the ride. Surround, digital and networking are coming together fortuitously."
Total, overnight displacement of any broadcast infrastructure is not going to happen. It is expensive.
To be successful, technology must work within an existing infrastructure, expanding capabilities while preserving downward compatibility with legacy formats and content. If a technology requires the client to "replace everything" it will meet with tremendous resistance.
I would bet all of the Radio World readers would love to see a detailed cost accounting showing how there is "near zero incremental cost differential" in building a discrete 6-channel plant over a conventional stereo one. I am also sure that the folks at NPR would be fascinated by Mr. Church and Mr. Foti's proposal for shipping a discrete 6-channel bitstream through the PRSS satellite distribution system to all of the NPR member stations for "near zero additional cost."
You must allow the broadcaster to upgrade at a pace that makes fiscal sense.
"Would you prefer to save those (spatial information) 16 kbps for a cell-phone-grade voice something or other, rather than provide a capable and compatible surround service? We respectfully disagree."
Talk to the radio stations, guys; 16 kbps can produce much higher quality than a cell phone. There are many speech and data services that generate far more revenue than the promise of surround "someday."
"Anyway, there will probably soon be more bits to play with. Ibiquity has a proposal before the FCC to increase HD Radio's data rate from 100 to 150 kbps and there are technologies on the horizon to deliver yet another 64 kbps within the current SCA spectral space."
"Probably," "soon" and "on the horizon" are not words that instill enough confidence in the broadcaster to move on a technology. It has to be proven that it works now with what they have now. Future available data will be used for more revenue generating services such as more channels and more data services.
"...a live concert with material no one has heard before and with no stereo reference is no way to evaluate how a system will work for what broadcasters will use it for, day-in, day-out. We need tests with normal radio programming and production techniques."
Neural believes that passing muster with broadcast giants like XM, Harris Broadcast and NPR is of great value, as these are organizations interested in the function of broadcast systems as a whole. This is where Neural has subjected its technology to scrutiny.
At this level, if technology has any holes in it, these guys will find them and kill the technology. XM and NPR have validated the performance for both satellite and terrestrial use and Harris, after much due diligence determining that the technology performs as claimed, distributes it. Product is available now from Harris. Not probably, not soon, not on the horizon, but now.
"Speaking of tests, why hasn't Neural submitted their technology to the scrutiny of the unbiased MPEG testing that has been ongoing the past months? At what point will Neural offer an honest description of their system so it can be evaluated on a reasonable basis? Thus far it's been a lot of smokey words and fogged mirrors."
Neural has been working with MPEG members for years. If you were to actually talk to MPEG, as we do (Neural Audio is an MPEG member), you would find that they don't have much to say about encoding standards (that is, the broadcast end).
They only recommend technology on the consumer or decoding end. That is why testing has been heavily performed on the broadcasting end with organizations that do have something to say about the broadcast end. Neural respectfully suggests that Telos/Omnia follow the same path.
Neural has recently started licensing decoder technology that is available now, that is compatible with Neural's verified broadcast technology. Licensing activity has been brisk, to say the least. Only recently has it been appropriate to disclose the technology to consumer organizations (like MPEG).
"Mike, we appreciate your enthusiasm. Surround is an impressive listening experience and you've heard a system that delivers it on the FM band. So naturally, you want to get on with it! But you are proposing that broadcasters adopt a system that has had no significant on-air testing, no disclosure of technology, no comparative evaluation of performance, a single vendor source and troublesome claims."
As Mr. Pappas can tell you, he has performed significant on-air testing and the technology has been disclosed to him as he has taken the appropriate legal steps. His many years of broadcast and surround experience make him uniquely qualified to generate a meaningful opinion. The comments of "vendor problems" and "troublesome claims" seem more like "sour grapes" than reality.
"The MPEG system we support has been carefully tested in a controlled scientific fashion with a wide variety of source audio material. Its developers include Fraunhofer Laboratory (inventors of MP3 and MPEG AAC), Agere (former Bell Labs and Lucent researchers), Coding Technologies (inventors of the "plus" enhancements to MP3 and AAC as well as the HD Radio codec), and Philips (co-inventor of MPEG Layer 2 and a consumer electronics firm).
"Yet more testing is forthcoming as the best ideas continue to be merged from each contributor. The technology approach has been published in a number of AES and other papers so that researchers have been able to evaluate claims and build upon each other's work."
Neural started down the same (bitstream) path several years ago. There are aspects of a bitstream-based surround system that are terribly cumbersome to the existing and future broadcast infrastructure. That path was rejected by the broadcaster; it doesn't matter where the technology came from.
Neural is delivering a tested, scalable, future-proof and reliable system that may be implemented into the existing infrastructure without costing the broadcaster an arm and a leg. It does work and it works very well.
Telos/Omnia has had a long-standing invitation to come to Neural to enjoy surround and salmon. What are they waiting for?
RW welcomes other points of view.