With IBOC Coming, and More Codecs Than Ever Already in Use, Radio Struggles to Preserve Decent Sound
What do you do when an audio feed has been stomped on several times before it reaches your station, and you need to encode and decode that audio even more before your station transmits it? How do you avoid ending up with ugly audio?
In today’s broadcast production environment, many stations use digital audio from a variety of sources, much of it subject to different codecs applied to conserve storage space or to reduce the data rate for transmission over links with limited bandwidth.
MP3 files from the Internet, ATRAC-encoded audio recorded on MiniDisc, MPEG Layers II and III over ISDN, proprietary coding schemes for POTS transmission, and soon, the Perceptual Audio Coder employed as the compression technology for in-band, on-channel digital audio broadcasting – all find their way into the broadcast chain, making these perceptual codecs an essential part of the audio production toolbox.
However, applying multiple encode-decode cycles to the same piece of audio can cause significant degradation that worsens with each generation until the audio becomes unlistenable.
This problem of “cascading codecs” is becoming more common, especially at radio stations that produce programs with audio from many sources, leading radio engineers to ask how an audio production and distribution chain involving multiple stages of perceptual coding should be designed.
A trio of presenters addressed the question at the Public Radio Engineering Conference in Las Vegas this spring.
Ken Pohlmann, professor of music at the University of Miami and author of “Principles of Digital Audio,” reviewed perceptual coding theory and presented an overview of techniques used by codec designers to encode bit-reduced digital audio.
Ultimately, he said, no coding technique will be able to remove unnecessary bits from the data stream completely without leaving or creating artifacts. These artifacts can add up through serial application of the same or different codecs, creating audible defects.
“The lower the bit rate, the greater the chances for problems,” Pohlmann said.
He discussed ways of assessing the characteristics and magnitude of particular coding artifacts. Traditional graphical methods for measuring audio signals are not particularly useful for detecting digital audio defects, he said, because single-tone, multitone or pink noise sources don’t emulate the very short duration and narrow frequency bands typical of coding errors.
FFT analysis is capable of measuring the difference between samples of actual source material, but the volume of data can be difficult to interpret consistently, he said. Subjective listening tests are the most accurate way to evaluate codec cascading problems, Pohlmann said, and they have the advantage of accessibility. An engineer with a critical ear can easily set up an A-B test to examine a specific set of codecs.
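For readers who want to try the FFT comparison Pohlmann mentioned, the following is a minimal sketch rather than a recommended measurement procedure. It assumes two equal-length, time-aligned frames of samples, one from the source and one from the decoded output, and reports the level change in each frequency bin. The function names and the `floor` parameter are illustrative, and a naive DFT stands in for a real FFT library to keep the example self-contained.

```python
import cmath
import math

def dft_magnitudes(samples):
    """Naive O(n^2) DFT; returns normalized magnitudes for bins 0..n//2."""
    n = len(samples)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n)
                  for t in range(n))
        mags.append(abs(acc) / n)
    return mags

def spectral_difference_db(source, decoded, floor=1e-9):
    """Per-bin level change in dB of `decoded` relative to `source`.

    `floor` (an assumed value) keeps empty bins from producing log(0).
    """
    if len(source) != len(decoded):
        raise ValueError("frames must be the same length and time-aligned")
    src = dft_magnitudes(source)
    dec = dft_magnitudes(decoded)
    return [20 * math.log10(max(d, floor) / max(s, floor))
            for s, d in zip(src, dec)]

# Illustrative frame: a tone in bin 4 of a 64-sample frame, and a copy
# whose level has been halved by a hypothetical processing stage.
frame = [math.sin(2 * math.pi * 4 * t / 64) for t in range(64)]
halved = [0.5 * x for x in frame]
diff = spectral_difference_db(frame, halved)
print(round(diff[4], 2))  # level change in the bin holding the tone
```

Even this small frame yields 33 numbers, and real program material produces thousands of frames, which illustrates why the volume of data is hard to interpret consistently.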
Pohlmann played examples of encoded audio using two different MP3 codecs. As expected, the audio quality of music as it passed through one codec and then the other degraded gradually with each successive generation until it became unlistenable after the fourth or fifth pass.
The most dramatic examples were of the same codec applied to the same piece of audio multiple times. One of the codecs performed remarkably well, with only slight degradation after each pass, so that the fifth-generation audio sounded fairly close to the second-generation audio.
With the other codec, however, significant degradation was evident even after the first encode-decode cycle, and the audio was seriously impaired after the second cycle. Pohlmann used these examples to demonstrate that not all codecs are created equal, even those based on the same encoding algorithm.
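A generation-loss comparison of this kind can be roughed out numerically before any listening test. The sketch below is an assumption-laden illustration, not Pohlmann's method: `toy_stage` is a hypothetical stand-in that merely smooths the signal and is not a perceptual codec, and the signal-to-noise figure it reports says nothing about how audible an artifact is. In practice each stage would be replaced by a function that runs the station's actual encoder and decoder.

```python
import math

def snr_db(reference, processed):
    """SNR in dB, treating (processed - reference) as noise."""
    if len(reference) != len(processed):
        raise ValueError("signals must be the same length and time-aligned")
    signal_power = sum(r * r for r in reference)
    noise_power = sum((p - r) ** 2 for r, p in zip(reference, processed))
    if noise_power == 0:
        return math.inf
    if signal_power == 0:
        return -math.inf
    return 10 * math.log10(signal_power / noise_power)

def toy_stage(amount):
    """Hypothetical lossy stage: circular 3-tap smoothing, NOT a real codec."""
    def process(samples):
        n = len(samples)
        return [(1 - amount) * samples[t]
                + amount * 0.5 * (samples[t - 1] + samples[(t + 1) % n])
                for t in range(n)]
    return process

def generation_loss(source, stages):
    """Run `source` through each stage in order; return SNR after every stage."""
    results = []
    current = list(source)
    for stage in stages:
        current = stage(current)
        results.append(snr_db(source, current))
    return results

# Five passes through a "mild" and a "harsh" stand-in stage.
source = [math.sin(2 * math.pi * 4 * t / 64) for t in range(64)]
mild = generation_loss(source, [toy_stage(0.1)] * 5)
harsh = generation_loss(source, [toy_stage(0.5)] * 5)
for gen, (m, h) in enumerate(zip(mild, harsh), start=1):
    print(f"generation {gen}: mild {m:.1f} dB, harsh {h:.1f} dB")
```

Passing a mixed list such as `[stage_a, stage_b, stage_a, stage_b]` models the alternating-codec case; the order of the stages is part of what is being tested.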
Many perceptual codecs available for computers are written by software engineers, not audio engineers, he said, so users should experiment with products and choose those that perform best with the other codecs in their particular broadcast chain.
Alex Cabanilla, senior engineer with the PAC Group at Ibiquity Digital Corp., detailed the PAC codec architecture and the subjective listening tests on PAC-encoded audio conducted during the development of Ibiquity’s IBOC system.
Ibiquity has conducted limited compatibility testing with cascading MPEG Layer II and Layer III codecs, Cabanilla said, but the company is looking to the users of PAC, both satellite radio and terrestrial IBOC broadcasters, to supply data for further assessment of any cascading problems.
Cabanilla discussed QPAC, a variant of the PAC codec developed by Ibiquity as a studio-side complement to the IBOC transmission system. Ibiquity designed the algorithm to allow storage, real-time audio transport and transcoding to PAC without requiring a decoding and re-encoding cycle, eliminating the possibility of cascading problems at this stage in the signal chain, according to Cabanilla.
Ibiquity’s testing indicates that QPAC-to-PAC transcoding allows higher-quality audio than MPEG Layer II-to-PAC or Layer III-to-PAC broadcasting chains.
As development of the QPAC algorithm continues, Cabanilla said, Ibiquity will test QPAC in a variety of real-world applications, including STL transmission, digital storage, content production and satellite distribution. The company also plans to study intercodec cascading problems in greater detail.
Rich Rarey, master control supervisor at NPR and a Radio World columnist, described a troubleshooting project that demonstrated how adding one more digital codec to a transmission chain can cause audio problems in conjunction with other codecs.
WUNC(FM) in Chapel Hill, N.C., found that a translator in Buxton would transmit severely garbled audio of certain elements within NPR programs, but those elements sounded fine on WUNC. After evaluating and testing the signal chain, NPR and WUNC were able to replicate the problem with a particular piece of audio fed from Scotland to NPR using a G.722 codec over ISDN.
When played from NPR’s MPEG Layer II audio storage system through an analog board to NPR’s MPEG Layer II satellite system, then through WUNC’s linear digital plant and a final MPEG Layer II satellite link to feed the translator, this audio element introduced enough digital artifacts to “break” the second satellite codec, affecting only the translator.
NPR has been more vigilant about the digital coding of audio feeds since the WUNC experience, Rarey said.
Now NPR does not accept audio feeds in MP3 format, because employing this algorithm can lead to degraded audio in fewer generations than does high-bit-rate MPEG Layer II.
The presenters agreed that the specific codecs employed and the sequence of their use create wide variations in the nature and severity of audio degradation in a particular signal chain.
Their recommendations for minimizing problems were similar:
* Keep digital audio linear with high bit rates as long as possible;
* Know your codecs. Experiment and test to see what makes them “break”;
* When you have to use a codec, keep the compression ratio as low as is practical.
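To put a number on "compression ratio," the coded bit rate can be compared with the bit rate of the equivalent linear PCM audio. The sketch below assumes CD-style linear audio (44.1 kHz, 16 bits, two channels) as the reference, and the 384 kbps and 128 kbps coded rates are illustrative values chosen for the example, not figures cited by the presenters.

```python
def pcm_bit_rate(sample_rate_hz, bits_per_sample, channels):
    """Bit rate of uncompressed linear PCM audio, in bits per second."""
    return sample_rate_hz * bits_per_sample * channels

def compression_ratio(coded_bit_rate, sample_rate_hz=44100,
                      bits_per_sample=16, channels=2):
    """Ratio of the linear PCM bit rate to the coded bit rate."""
    if coded_bit_rate <= 0:
        raise ValueError("coded bit rate must be positive")
    linear = pcm_bit_rate(sample_rate_hz, bits_per_sample, channels)
    return linear / coded_bit_rate

# Assumed example rates, in bits per second.
for rate in (384_000, 128_000):
    print(f"{rate // 1000} kbps -> {compression_ratio(rate):.2f}:1")
```

By this measure, moving a link from the lower to the higher of the two example rates cuts the ratio from roughly 11:1 to under 4:1, which is the direction the third recommendation points.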
They agreed that further research and development of new or modified coding algorithms optimized for multigenerational storage and transmission signal chains would be the best long-term solution to the cascading codec problem.
Radio World is interested in hearing from readers who have experience with cascaded algorithms, good or bad. Send e-mail to [email protected]