Building a Better Listener Experience

HD Radio time and level alignment keeps the audience listening

The author is executive director of broadcast engineering of DTS Inc.

As use of HD Radio products expands in new cars and home receivers, consumers are providing feedback on the quality of the user experience. The most prominent comments concern audible blending artifacts and general audio quality. The car industry likes the HD Radio service and is looking at ways to expand the technology. At the same time, automakers want broadcasters to minimize the potential for a poor user experience as receivers blend between the analog program and the HD1 program. Time and phase offsets and level misalignments between the analog and HD1 audio programs are frequently cited as the source of customer complaints. The majority of the problems can be traced back to individual broadcast stations.

This paper will review the HD Radio broadcast system from the viewpoint of audio alignment. It will explore the various points in the chain that can cause misalignment: audio processing differences, audio distribution paths, network reliability and STL feeds. We will also explore operational requirements to improve audio alignment and thereby improve the user experience. Finally, we will offer suggestions for real-time monitoring and correction of misalignment at the station.

With more than 29 million HD Radio receivers in the marketplace, every day more radio listeners are hearing your station in digital. While a significant improvement over analog, digital signals are not immune to signal discontinuity. An important feature of the HD Radio technology is its ability to seamlessly transfer back to the analog signal during an outage or at the fringes of digital coverage. This feature — blending with time diversity — is known as “Diversity Delay.”

To work as intended, Diversity Delay requires the analog and digital program audio be kept in perfect synchronization to ensure a good listener experience at blend. This alignment not only pertains to time synchronization, but phase and the apparent audio loudness as well.


Fig. 1: The process of time-diverse transmission. When the samples are synchronized properly at the transmitter, a digital signal impairment causes the receiver to blend seamlessly to the analog signal.

Time-diverse transmission can be an effective method for dealing with channel fading in a mobile environment, providing a second channel conveying duplicate information that is uncompromised. Transmitting the information on the second channel, shifted in time, can enhance the total system performance when the two channels are recombined in the receiver. DTS’s HD Radio technology includes a time-diverse backup channel in all AM and FM modes. HD Radio technology takes advantage of time diversity by delaying the backup (analog) transmissions and realigning the signals in the receiver. Fig. 1 illustrates how the time-diverse blend operates during a digital signal impairment.

During a signal interruption, the receiver will automatically blend (quickly crossfade) from the digital signal to an analog backup channel that is time shifted in transmission. To make this time diverse methodology effective, a radio station must be able to align the analog and digital program signals accurately and reliably. This requires accurate measurement of the signal alignment and a system topology that supports stable phase coherent time aligned transmission. However, time and phase coherence is only part of the equation. The apparent loudness of both the analog and digital program streams also plays a critical part in a seamless blend transition.


Fig. 2: When the digital and analog audio samples are aligned, no frequency response artifacts are present.

Fig. 3: When the digital and analog audio differ by five samples, marked changes in the frequency response occur during blend. The most noticeable effects are at the mid-point of the blend from digital to analog audio.

Fig. 4: At a 50-sample offset, there is a significant frequency response change during the blend from digital to analog audio.

One fundamental challenge in maintaining station alignment centers on measuring the time alignment offset. To be effective, the method used to correlate the analog and digital signals must run close enough to real time to correct the conditions being measured. Because these metrics rely on an algorithm to correlate the offset, the measurement can draw an inaccurate conclusion from too limited a range of samples or from errors in the algorithm’s assumptions. The simplest approach, therefore, is to average several samples over time and invoke hysteresis (a lag between observing the samples and acting upon them) before large-scale corrective action is applied.
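The measure-then-average approach described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method used by any commercial monitor: the function `estimate_offset`, the class `AlignmentController`, the five-reading window and the three-sample threshold are all hypothetical, and white noise stands in for program audio.

```python
import random
from statistics import median

def estimate_offset(analog, digital, max_lag):
    """Return the lag in samples at which `digital` best matches `analog`.

    A positive result means the digital audio arrives later than the analog.
    """
    best_lag, best_score = 0, float("-inf")
    n = len(analog)
    for lag in range(-max_lag, max_lag + 1):
        # Correlate over the region that stays in range for every lag tested.
        score = sum(analog[i] * digital[i + lag]
                    for i in range(max_lag, n - max_lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag

class AlignmentController:
    """Hold off on corrections until several readings agree (hypothetical)."""

    def __init__(self, window=5, threshold=3):
        self.window = window        # readings to collect before acting
        self.threshold = threshold  # offsets smaller than this are left alone
        self.history = []

    def update(self, measured_offset):
        """Record a reading; return a correction in samples, or 0 to hold."""
        self.history.append(measured_offset)
        if len(self.history) < self.window:
            return 0
        self.history = self.history[-self.window:]
        typical = median(self.history)  # the median ignores a lone bad reading
        if abs(typical) < self.threshold:
            return 0
        self.history.clear()            # start a fresh window after acting
        return -round(typical)

# Demo with synthetic audio: noise stands in for program material.
random.seed(1)
analog = [random.uniform(-1.0, 1.0) for _ in range(2000)]
digital = [0.0] * 5 + analog[:-5]       # digital path runs 5 samples late
measured = estimate_offset(analog, digital, max_lag=20)
print("measured offset:", measured)

controller = AlignmentController(window=5, threshold=3)
readings = [5, 5, 40, 5, 5]             # one errant reading among good ones
corrections = [controller.update(r) for r in readings]
print("corrections:", corrections)
```

The median is used here instead of a plain mean so that a single errant correlation result does not trigger a large correction. A production system would also have to cope with processed audio that is not sample-identical on the two paths.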

Figs. 2, 3 and 4 illustrate how sample offsets between digital and analog audio affect the program audio during the blend period. As these illustrations show, the larger the sample offset, the more detrimental frequency response changes occur during the blend crossfade between digital and analog audio.
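One way to see why larger offsets do more damage is to model the blend as the sum of a signal and a delayed copy of itself, which behaves as a comb filter. The sketch below makes several assumptions that are not from the figures: a linear crossfade, identical audio on both paths apart from the delay, and a 44.1 kHz sample rate. The function names are illustrative.

```python
import cmath
import math

SAMPLE_RATE = 44100  # Hz, assumed for this illustration

def blend_gain_db(freq_hz, offset_samples, blend=0.5, sample_rate=SAMPLE_RATE):
    """Level in dB of (1 - blend) * digital + blend * analog at one frequency,
    when the analog copy lags the digital copy by `offset_samples`."""
    delay = cmath.exp(-2j * math.pi * freq_hz * offset_samples / sample_rate)
    magnitude = abs((1.0 - blend) + blend * delay)
    return 20.0 * math.log10(max(magnitude, 1e-12))  # clamp to avoid log(0)

def first_null_hz(offset_samples, sample_rate=SAMPLE_RATE):
    """Lowest frequency that cancels completely at the blend midpoint.

    Requires a nonzero offset; with no offset there is no null at all.
    """
    return sample_rate / (2 * offset_samples)

for offset in (5, 50):
    print(offset, "samples: first null near", first_null_hz(offset), "Hz")
```

Under this model the first null moves down in frequency as the offset grows, and further nulls repeat at odd multiples of that frequency, so a larger offset places more notches inside the audible band.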

Analog and IBOC program levels are often difficult to match in terms of apparent loudness. Often these differences are straightforward and can be explained by the nature of the transmission:

● Root Mean Square (RMS) audio differences between analog and digital program paths

● Absolute Frequency Response:

○ FM low-frequency cut-off (DC coupling adversely affects the exciter’s phase-locked loop)
○ FM high frequencies filtered at 15 kHz (to protect the 19 kHz stereo pilot)

● Dynamic range of transmission medium

○ up to 70 dB FM
○ up to 50 dB AM
○ up to 96 dB IBOC

● Square wave response: effects of FM pre-emphasis/de-emphasis on apparent loudness

With signal level adjustment, it is possible to accommodate these discrepancies by increasing or decreasing relative signal levels and frequency response to achieve parity in the apparent loudness. The increased frequency response and dynamic range capability of the HD Radio signal are in fact contributors to the distraction during blend. Simply stated, the audio ballistics of the analog and digital programming should be nearly identical. Matching the processor settings in equalization as well as frequency cut-off is a great place to start this effort. Since FM frequency response is limited to 15 kHz, the digital audio should be limited to 15 kHz as well.
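As a rough illustration of level matching, the following sketch compares the RMS level of the two paths and reports the trim needed on the digital path. RMS is only a crude stand-in for apparent loudness; a perceptually weighted measure such as ITU-R BS.1770 would be closer to what a listener hears. The function names and the test tones are assumptions made for the example.

```python
import math

def rms_db(samples):
    """RMS level in dB relative to a full-scale value of 1.0."""
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10.0 * math.log10(max(mean_square, 1e-20))  # clamp to avoid log(0)

def digital_trim_db(analog, digital):
    """Gain in dB to apply to the digital path so its RMS matches the analog."""
    return rms_db(analog) - rms_db(digital)

# Demo: the same 1 kHz tone, with the digital path twice as hot as the analog.
tone = [math.sin(2 * math.pi * 1000 * n / 44100) for n in range(44100)]
analog_audio = [0.5 * s for s in tone]
digital_audio = list(tone)
trim = digital_trim_db(analog_audio, digital_audio)
print("trim the digital path by", round(trim, 2), "dB")
```

A single broadband number like this says nothing about equalization differences between the paths, which is why matching processor settings is treated as a separate step above.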

In the early days of the technology’s rollout, budget constraints led to compromises in equipment selection that have had significant impact on blend performance. Radio stations frequently use castoff equipment from the main channel audio processing chain and press it into service on the digital path. When stations employ separate audio processors on the analog and digital audio chains, the probability of encountering time alignment stability issues increases significantly. The problem is twofold. First, dissimilar processing settings affect group delay on each path differently, so the alignment exhibits time domain movement centered on a given offset value. The second concern has to do with measuring the offset. In audio processing where minimal signal latency is desirable, low-order infinite impulse response (IIR) filters are often used to implement a desired equalization curve. These filters typically have poor phase response and generate disruptive signal discontinuities when the filter coefficients are switched. Finite impulse response (FIR) filters are preferred for digital equalization because they do not suffer from this discontinuity problem and can be designed with linear phase, thus eliminating group delay distortions.

Fig. 5: The resulting correlation between “normal voice” and phase-rotated vocal audio.

One variable encountered with separate audio processing is the use of audio phase rotation networks (commonly known as phase rotators). The use of phase rotation in audio processing began in the early 1980s as a way to make speech more symmetrical, reducing its peak-to-average ratio by as much as 6 dB without adding nonlinear distortion. A phase rotator is a series of all-pass filters with flat frequency response. The phase response is related to the time delay encountered by the signal’s frequency components: at a given frequency, time delay is related to phase shift. Simply stated, phase response changes as a function of frequency. Ideally, phase rotation would only be used in the microphone chain of the on-air and production studios. However, if you use separate analog and digital program audio processing, with phase rotation in only one program path and not the other, it is probable that correlation algorithms will draw errant conclusions and act on bad information. Fig. 5 shows the poor correlation between “normal voice” and phase-rotated vocal audio.
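The all-pass behavior described above can be checked numerically. The sketch below evaluates a cascade of first-order all-pass sections and shows that the magnitude stays flat while the phase shift varies with frequency. The coefficient values and the four-stage cascade are illustrative assumptions; commercial phase rotators use their own designs, often with second-order sections.

```python
import cmath
import math

def allpass_response(freq_hz, coeff, sample_rate=44100):
    """Complex response of H(z) = (z^-1 - coeff) / (1 - coeff * z^-1)."""
    z_inv = cmath.exp(-2j * math.pi * freq_hz / sample_rate)
    return (z_inv - coeff) / (1 - coeff * z_inv)

def rotator_response(freq_hz, coeffs, sample_rate=44100):
    """Response of several all-pass sections in series (product of stages)."""
    total = 1 + 0j
    for coeff in coeffs:
        total *= allpass_response(freq_hz, coeff, sample_rate)
    return total

STAGES = [0.5, 0.5, 0.5, 0.5]  # hypothetical coefficients, four stages

for freq in (100, 1000, 5000):
    single = allpass_response(freq, 0.5)
    cascade = rotator_response(freq, STAGES)
    print(freq, "Hz: gain", round(abs(cascade), 6),
          "single-stage phase", round(math.degrees(cmath.phase(single)), 1), "deg")
```

Because each frequency component is shifted by a different amount, the waveform on the rotated path no longer lines up sample for sample with the unrotated path, even though a spectrum analyzer shows identical levels. That is the condition under which a time-domain correlator can be misled.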

Fig. 6: The phase of the AES audio affects the blend.

Another opportunity for blend discontinuity may be found in AES audio phase. While digital audio processing has been in use since the mid-’90s, little attention was paid to AES audio phase until the introduction of HD Radio technology. With HD Radio transmission, the opportunity arose to compare the audio phase of separate analog and digital program paths. If the AES audio on one path is 180 degrees out of phase with the other, a sharp null occurs during an analog-to-digital or digital-to-analog blend. Fig. 6 illustrates this. Fortunately, software updates to audio processing have made available a simple AES audio phase reversal feature that can correct the problem.
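The null caused by a polarity reversal follows directly from the crossfade arithmetic, as this minimal sketch shows. It assumes a linear crossfade and otherwise identical, time-aligned audio on both paths; the helper names are illustrative.

```python
import math

def blend(digital, analog, position):
    """Linear crossfade: position 0.0 is all digital, 1.0 is all analog."""
    return [(1.0 - position) * d + position * a
            for d, a in zip(digital, analog)]

def peak(samples):
    """Largest absolute sample value."""
    return max(abs(s) for s in samples)

digital_path = [math.sin(2 * math.pi * 440 * n / 44100) for n in range(4410)]
analog_in_phase = list(digital_path)
analog_inverted = [-s for s in digital_path]  # 180 degrees out of phase

for position in (0.0, 0.25, 0.5, 0.75, 1.0):
    good = peak(blend(digital_path, analog_in_phase, position))
    bad = peak(blend(digital_path, analog_inverted, position))
    print(position, "in phase:", round(good, 3), "inverted:", round(bad, 3))
```

With matching polarity the level holds steady through the crossfade. With one path inverted, the two contributions cancel completely at the midpoint, which the listener hears as a momentary dropout.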

How the broadcast hardware is arranged can have a substantial impact on the overall time-domain stability of the transmission. HD Radio components evolved from a monolithic digital exciter package into the distributed platform of Importer, Exporter and Exgine to take advantage of the bit-reduced audio transport. While reducing the studio-to-transmitter link (STL) data throughput requirements per audio service, it added a system requirement for time base synchronization at both the studio and transmitter locations. This distributed synchronization is most efficiently accomplished at each end with a GPS-sourced 10 MHz precision time base and a 1 pps signal. While this structure works quite well in the lab, getting a GPS receiver to function properly in a high RF field at the transmitter site or delivering a usable GPS signal from the roof to the studios, 30 stories below, becomes a significant challenge. Due to these commonly encountered limitations, an alternate method of synchronization was developed using a synchronous 10 MHz clock at the transmitter site, derived from the Exporter’s Modem Frame Clock. While slow to achieve system lock (it takes hours), this method works well on links with low jitter. Nevertheless, cost constraints have won out, and dual GPS locked systems have become less commonplace. As a result, many stations have found performance benefits by co-locating the Exporter and Exgine where they can share a common locally generated time base.
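A back-of-the-envelope calculation suggests why a shared time base matters. The sketch below is a simplified model that assumes two free-running clocks with a fixed frequency difference and nothing in the system re-synchronizing them. The 1 ppm figure and the 44.1 kHz sample rate are illustrative assumptions, not measurements from any particular equipment.

```python
def drift_samples(ppm_difference, seconds, sample_rate=44100):
    """Samples of relative drift accumulated between two unlocked clocks."""
    return sample_rate * ppm_difference * 1e-6 * seconds

def seconds_until_offset(ppm_difference, offset_samples, sample_rate=44100):
    """Time for two unlocked clocks to drift apart by `offset_samples`."""
    return offset_samples / (sample_rate * ppm_difference * 1e-6)

hourly_drift = drift_samples(1.0, 3600)
time_to_five = seconds_until_offset(1.0, 5)
print("drift after one hour at 1 ppm:", round(hourly_drift, 2), "samples")
print("time to drift 5 samples at 1 ppm:", round(time_to_five, 1), "seconds")
```

In this model even a very small clock disagreement accumulates to an audible offset quickly, which is consistent with the preference for GPS-locked or co-located references.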

An additional timing consideration is that, regardless of the manufacturer, a restart of the digital broadcast system hardware typically results in a shift in the Modem Frame start sequence relative to the analog signal. As new equipment designs begin to focus on single-frequency network (SFN) operation, this issue should ultimately be resolved so systems will maintain consistent alignment after a power cycle. Until then, after a power outage, it is a good practice to confirm the station is in proper alignment.

Another practice that can introduce time shift is the use of separate program paths to the transmitter site. A common link that conveys both the analog and digital audio is desirable, because the signals can remain coherent. Whenever a station implements separate STL paths for the analog and digital programming, the probability increases that one path may experience dropped packets while the other remains solid. When this occurs, the time relationship between analog and digital begins to shift. Since these links are typically unidirectional, the loss can never be corrected. Even state-of-the-art digital STL solutions, however, may have variable latency between their audio and data program paths. Recent improvements in diversity delay monitoring have revealed instantaneous shifts in relative time of the AES and Ethernet channels on one popular system. While these excursions in time are not overtly apparent to the listener, they do exceed the system recommendations.

The HD Radio Broadcast Monitor Network, a system deployed by DTS to help stations improve performance, is capable of generating time and level alignment trend reports. This information, combined with continuous discrete station monitoring, has been instrumental in providing analytics on the stability of various system configurations. As broadcast monitoring tools improve, it is becoming ever more apparent that a form of automated monitoring and control is desirable to control the complex time and level domain of the hybrid HD Radio audio signals.

Several products are already in the marketplace from Belar, DaySequerra and Inovonics that allow for automated alignment control. The Belar FM-HD1 sends alignment correction information to HD Radio exciters enabled with their proprietary protocol. The DaySequerra M4DDC and M2Si Time Lock, as well as the Inovonics Justin 808, actively control time and level alignment by managing the analog path diversity delay and controlling the digital program path loudness.

In order to offer the listener the best user experience, stations should strive to match the audio characteristics of the analog and digital program paths. Doing so will always improve the blend experience during signal impairment and at the edges of coverage. Today, this is most easily accomplished with an audio processor designed to provide both analog and digital program path output simultaneously. Use of these dual-output processors resolves nearly all of the phase and level discrepancies found in the field.

Wherever possible, co-locate the Exporter and Exgine and share a common time base reference. If this isn’t practical, the next best option is a GPS-locked time base reference at the Exporter and Exgine ends of the system. Tie all of your station’s AES audio sources to a common 44.1 kHz wordclock, preferably tied to the GPS time base reference.

Last, automate the time and level alignment process whenever possible. Doing so will save countless hours of maintenance and provide a seamless user experience.

DTS’ Jeff Detweiler directs broadcast product development and the introduction and launch of its HD Radio brand of in-band on-channel (IBOC) technology to radio stations worldwide. He is responsible for licensing the technology to transmission equipment manufacturers and working with them to commercialize HD Radio products.