Commentary: Enhanced apt-X and Low-Delay Circuits

APT Says Its Algorithm Keeps Coding Delay Under 5 Milliseconds for Analog, Digital Users

Over the past 20 years, the audio services that telcos supply to broadcasters have migrated from balanced analog copper circuits to digital solutions based on synchronous E1 and T1 networks. The next decade will see the development of IP circuit provision, but that is another topic.

The legacy copper circuits were heavily dependent on regular human intervention to maximize and maintain the audio performance (stereo phasing) and presented a number of engineering difficulties.

They needed constant maintenance (equalization) and required repeaters at regular intervals to restore the signal level lost to line attenuation; they had a relatively poor dynamic range and were also vulnerable to crosstalk from adjacent audio and telephony channels. Being simplex in nature, analog channels did not offer ancillary services such as auxiliary data or contact closure, so broadcasters had to order additional audio circuits for return-feed confidence monitoring as well as dedicated POTS circuits for ancillary data. However, the one advantage that analog circuits did provide, which went some way toward offsetting all of the above, was a near-instantaneous link between two points (studio and transmitter).

When a viable alternative to the onerous operational overheads of analog circuits arrived in the late 1980s, service providers embraced E1 (2048 kilobits per second) and T1 (1536 kbps) circuits. Installing these circuits and terminating the data interfaces (G.703/G.704, X.21, V.35) with audio codecs on either end provided a service that was substantially easier to support, thus increasing the profit margin. The one legacy performance requirement they had to meet was that the encode/decode latency be similar to that of the analog circuit being replaced.

Coding delay/latency within broadcast distribution networks is defined as the time taken to encode a signal at Point A (the studio), move this signal across a digital medium and then decode the signal at Point B (the transmitter site). The transmitted signal may then be picked up "off-air" and fed back into the headphones of the incumbent jock/DJ.

For the complete round trip, various figures for maximum (or minimum) latency are bandied around; under 5 milliseconds is the desired figure and any delay beyond 10 milliseconds will start manifesting itself as a slight echo, which is irritating at best. Beyond 15 milliseconds this delay becomes extremely challenging and any delay over 20 milliseconds creates such an echo that it proves unworkable to all but the most seasoned on-air talent.
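The thresholds above can be captured in a small lookup. The following Python sketch is illustrative only: the function name and the category labels are assumptions, and the article does not characterize the 5 to 10 millisecond range, so the "tolerable" label for it is a guess rather than a stated figure.

```python
def classify_round_trip_delay(delay_ms: float) -> str:
    """Map a round-trip coding delay (ms) to the bands described in the text."""
    if delay_ms < 0:
        raise ValueError("delay must be non-negative")
    if delay_ms < 5:
        return "desired"                # under 5 ms is the target
    if delay_ms <= 10:
        return "tolerable"              # assumption: band not described in the text
    if delay_ms <= 15:
        return "slight echo"            # beyond 10 ms
    if delay_ms <= 20:
        return "extremely challenging"  # beyond 15 ms
    return "unworkable"                 # over 20 ms


for delay in (4, 12, 18, 25):
    print(delay, "ms:", classify_round_trip_delay(delay))
```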

Benefits of J.41

In the late 1980s, research into digital audio data rate reduction was in its infancy, and to deliver the required audio parameters, service providers turned to ITU Recommendation J.41. This recommendation gives the characteristics of equipment for the coding of 15 kHz monophonic analog program signals into a digital signal of 384 kbps. For stereophonic operation, two monophonic digital codecs can be utilized.

J.41 is a companding technique that takes a 14-bit word, reduces it to 11 bits and sends it over a digital medium; at the decoding end it expands the companded 11-bit signal back to a 14-bit word that closely approximates the original.
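To make the idea of 14-to-11-bit companding concrete, here is a minimal Python sketch. It is not the segment law defined in J.41; it assumes a simple illustrative layout of 1 sign bit, 3 segment bits and 7 mantissa bits, and the function names are hypothetical. It shows the general principle only: small samples pass through at full resolution, while larger samples are coded with progressively coarser steps.

```python
def compress_14_to_11(sample: int) -> int:
    """Compress a signed 14-bit sample (-8192..8191) to an 11-bit code.

    Illustrative layout (an assumption, not the J.41 law):
    1 sign bit | 3 segment bits | 7 mantissa bits.
    """
    if not -8192 <= sample <= 8191:
        raise ValueError("sample out of 14-bit range")
    sign = 1 if sample < 0 else 0
    mag = min(abs(sample), 8191)              # clamp the single value -8192
    if mag < 128:
        seg, mant = 0, mag                    # lowest segment: step size 1
    else:
        seg = mag.bit_length() - 7            # segments 1..6
        mant = (mag >> (seg - 1)) & 0x7F      # drop the leading 1, keep 7 bits
    return (sign << 10) | (seg << 7) | mant


def expand_11_to_14(code: int) -> int:
    """Expand an 11-bit code back to an approximate signed 14-bit sample."""
    sign = (code >> 10) & 1
    seg = (code >> 7) & 0x7
    mant = code & 0x7F
    if seg == 0:
        mag = mant
    else:
        mag = (0x80 | mant) << (seg - 1)      # restore the leading 1
        if seg >= 2:
            mag += 1 << (seg - 2)             # land mid-way through the step
    return -mag if sign else mag


for s in (100, 1000, -1000, 8191):
    print(s, "->", expand_11_to_14(compress_14_to_11(s)))
```

Quiet passages survive the round trip exactly, while the error on loud passages grows with level, which is the trade-off a compander makes to save bits.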

Designed as a simplex operation, J.41 has an encode/decode cycle time of approximately 4 milliseconds (including the transport time) and a dynamic range limited to 84 dB. In terms of consistent stereo imaging, dynamic range and crosstalk, the J.41 solution offered broadcasters an improved service over analog copper circuits, but they could still get only simplex services with no ancillary data channels. Furthermore, the operational cost savings enjoyed by service providers were, in general, not passed on to the broadcasters.

Different needs

Since the first generation of digital services was introduced 15 years ago, broadcasters' needs have changed dramatically, and the increased awareness and knowledge of areas such as data transport costs, compression algorithms and tariffed services have taken the mystique out of moving program content from Point A to Point B.

With the advent of DAB/HD Radio (5.1 multi-channel audio in the future), RDS/RBDS, AES/EBU and increased performance in FM analog services, broadcasters are starting to demand improvements in the services provided by telco operators. The benchmark has been raised to a level where most radio stations require:

* Increased dynamic range to at least 96 dB equating to 16-bit word resolution.

* Fully duplex links for return feeds to permit confidence monitoring and other full-bandwidth audio requirements.

* Ancillary services to include RS-232 for RDS, contact closure and status monitoring and control.

* Further reduction in bit rate; the 768 kbps required for stereo is now considered onerous.

* Above all, retention of the low coding delay associated with J.41 and the antiquated analog circuits.
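The link between word length and dynamic range in the first requirement follows a common rule of thumb for linear PCM: each bit is worth roughly 6 dB. A minimal Python sketch, assuming an ideal quantizer and ignoring dither and converter noise (the function name is illustrative):

```python
import math


def dynamic_range_db(bits: int) -> float:
    """Theoretical dynamic range of an ideal linear PCM word: 20*log10(2**bits)."""
    if bits <= 0:
        raise ValueError("bits must be positive")
    return 20 * math.log10(2 ** bits)


for bits in (14, 16):
    print(f"{bits}-bit: {dynamic_range_db(bits):.1f} dB")
```

Under this approximation a 16-bit word gives about 96 dB, and a 14-bit word gives about 84 dB, which is consistent with the dynamic range quoted for J.41.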

Service providers in the UK, Denmark, Finland, Sweden and the United States have conducted exhaustive tests (including blind listening tests with various genres of music and voice) of the improved feature set that broadcasters are demanding. So far, the results indicate that incorporating a low-bit-rate, non-destructive digital audio data compression algorithm into a codec will address all of these requirements.

To this end, the 24-bit Enhanced apt-X algorithm from APT (Audio Processing Technology) will constitute one of the main building blocks for the next generation of digital services.

Gentle and non-destructive

Keeping the same audio bandwidth (15 kHz stereo) and using only half the data capacity (384 kbps), Enhanced apt-X can improve the audio parameters while keeping the encode/decode latency to under 5 milliseconds including transport time.
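The capacity saving is simple arithmetic. The Python sketch below uses the gross E1 and T1 rates quoted earlier in this article and ignores framing and signalling overhead, so counts on a real circuit may be lower; the constant and function names are illustrative.

```python
E1_KBPS = 2048                 # E1 rate as quoted in the article
T1_KBPS = 1536                 # T1 rate as quoted in the article
J41_STEREO_KBPS = 2 * 384      # two monophonic J.41 codecs
APTX_STEREO_KBPS = 384         # Enhanced apt-X, 15 kHz stereo


def stereo_circuits(link_kbps: int, circuit_kbps: int) -> int:
    """Whole stereo circuits that fit in a link, ignoring overhead."""
    return link_kbps // circuit_kbps


for name, link in (("E1", E1_KBPS), ("T1", T1_KBPS)):
    print(name,
          "J.41:", stereo_circuits(link, J41_STEREO_KBPS),
          "Enhanced apt-X:", stereo_circuits(link, APTX_STEREO_KBPS))
```

On these figures, halving the per-circuit rate at least doubles the number of stereo programs a single link can carry.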

A service offering 24-bit Enhanced apt-X can expect the following:

* Dynamic range in excess of 100 dB.

* Fully duplex.

* Embedded RS-232 and an additional low-speed channel for contact closure.

* Option to expand to 22.5 kHz stereo.

As the Enhanced apt-X algorithm is based on time-domain ADPCM principles, its coding technique is gentle and non-destructive in nature. Coding algorithms in general have had bad press following the myriad problems associated with highly destructive, frequency-domain perceptual coders, e.g. MPEG Layer II, Layer III and AAC. Enhanced apt-X has been proven both academically and in practice to be immune to these problems.

In summary, by implementing Enhanced apt-X, broadcasters can now avail of audio services from telco operators that will ensure they are maximizing their existing analog FM services and transporting crystal-clear CD quality content for DAB and HD Radio services. Most important, coding delay figures are kept under 5 milliseconds.

Contributors to this article include Charlie Wooten of Clear Channel and Fred Wylie, formerly of BBC; apt-X is a trademark of APT.

RW welcomes other points of view to