As a radio listener, I find commercial breaks incredibly frustrating. It’s not that I dislike advertising in general; it’s that my radio of choice happens to be my computer, and on some of the streams I listen to, ad breaks regularly cut in and out at the wrong times. Here is a common listener experience:
The on-air host teases the audience by announcing that an interesting segment will be coming up right after a few messages.
Ears perk up in anticipation.
An ad begins: “We’re having the biggest sale ever wi…”
Ad insertion kicks in with a new set of ads, targeted for the Internet audience.
This leaves the listener confused. “What just happened? What is the big sale event for?”
Frustration is further fueled when the ad insertion break ends and the listener discovers he has joined the interesting segment partway through.
This type of experience hurts radio advertisers, diminishes radio brands and encourages listeners to explore other options like Pandora.
A complicated problem
This problem originates at the studio, where audio and metadata go through separate process chains before the live streaming encoder joins them together. For instance, the audio feed needs to go through signal processing and PPM encoding equipment before it can be encoded into an Internet stream. Because the audio feed consists only of raw PCM audio, the streaming encoder needs more information to determine what song is playing and when commercial breaks begin and end. This metadata comes from the radio automation system, which can be configured to signal the streaming encoder with now-playing information on each event change.
The Listen Later interface shows clearly separated events produced with the Smooth Spots algorithm.
Streaming audio encoders are responsible for inserting ad replacement cues into the stream based on metadata. This creates an extremely challenging problem: the encoder must synchronize a raw audio feed that arrives through one chain of processes with metadata signals that originate from a separate place. The problem is further complicated by the wide variety of automation systems available. Some of the older systems make ad replacement particularly challenging, as their metadata output features were designed for RDS encoders and now-playing sections on station websites, where timing is not that critical. On these legacy systems, a metadata cue may come at precisely the right time on one event change and be delayed by a second or two on the next update. This is not a big deal for a station website, but when the same data is relied on for precise ad insertion timing, it creates a big problem.
Some automation systems send data directly over UDP and TCP, which provide consistent timing, but others use protocols such as HTTP, FTP, Windows file shares and even serial cable connections, which can delay transmission and affect ad insertion timing. The data itself is sometimes encapsulated in standards-compliant XML; other times it arrives in uniquely formatted XML, comma-separated lists, HTML with tables and several other proprietary formats. The amount of data can range from a mere title and artist truncated to 28 characters to 100 kB files with full stack updates listing every song queued up for the next hour, with normalization levels, song descriptions and millisecond-precise durations on each event. This sheer number of possible combinations requires streaming encoders to use complicated parsers that can add further latency to the ad insertion cueing process.
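To make the parsing burden concrete, here is a minimal Python sketch of how an encoder might normalize two of the formats mentioned above into one common event record. The XML element names, the CSV column order, the `NowPlaying` record and the category labels are all hypothetical, chosen only for illustration; real automation systems each define their own layouts.

```python
import csv
import io
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class NowPlaying:
    title: str
    artist: str
    category: str  # e.g. "song" or "spot"; this label set is an assumption


def parse_xml(payload: str) -> NowPlaying:
    # Hypothetical layout: <event><title/><artist/><category/></event>
    root = ET.fromstring(payload)
    return NowPlaying(
        title=root.findtext("title", default="").strip(),
        artist=root.findtext("artist", default="").strip(),
        category=root.findtext("category", default="unknown").strip().lower(),
    )


def parse_csv(payload: str) -> NowPlaying:
    # Hypothetical layout: title,artist,category on a single line.
    # A row with fewer than three fields raises ValueError in this sketch.
    row = next(csv.reader(io.StringIO(payload)))
    title, artist, category = (field.strip() for field in row[:3])
    return NowPlaying(title, artist, category.lower())


def parse_metadata(payload: str) -> NowPlaying:
    # Dispatch on the first non-blank character: "<" is treated as XML,
    # anything else as a comma-separated line.
    text = payload.strip()
    if text.startswith("<"):
        return parse_xml(text)
    return parse_csv(text)
```

Every additional format means another branch like these, which is where the extra parsing latency comes from.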
To address ad insertion timing and other streaming challenges, StreamOn provides broadcasters with a dedicated piece of hardware called a StreamOn Appliance. This equipment is passively cooled, has no hard drive or fans (common failure points on computers) and runs a flavor of Unix built for long-term robustness. Using a StreamOn Appliance rather than a software encoder on a Windows machine means the encoder does not have to compete with other system processes, such as antivirus software, that can cause periodic CPU spikes and create latency.
The StreamOn Appliance runs a chain of processes designed to make ad insertion as smooth as possible. We call this our Smooth Spots solution. The chain of processes works as follows:
1) Rather than having one big program to deal with metadata and encoding, the appliance runs a series of separate processes to handle specific tasks. The first task for handling metadata is to insert a timestamp into the audio feed. The moment a signal from the automation system arrives, we produce a floating-point Unix timestamp with microsecond accuracy (e.g., 1372440050.123456) and inject that value directly into the audio feed.
2) The system then parses the automation data to determine the type of content that just began playing, as well as artist and title information. We do this after injecting the timestamp into the feed rather than before, because parsing can sometimes take 100-300 milliseconds and we do not want that to delay the ad insertion cues. The content type and artist/title information are then mapped to the timestamp.
The ONdemand player shows separated events that the listener can go back and re-play, made possible by the Smooth Spots algorithm.
3) Finally, the timestamped audio gets fed into our patent-pending Transition Detection algorithm. This algorithm scans the PCM audio within a 2-second window around the timestamp and runs mathematical calculations on the audio waveform to search for perceived transition points in the audio. The metadata is then moved to the appropriate location in the audio and ad insertion cues are inserted if necessary. Though not perfect, this algorithm currently has an 87 percent success rate for identifying commercial transition points within a 2-second window of audio. For stations that run modern automation systems, the scanning window can be decreased from 2 seconds to 500 milliseconds for even better results.
4) Audio is then encoded and sent to our servers, which work with the Adswizz targeted advertising platform. The Adswizz server reads the inserted ad cues and uses an intelligent buffering system so that if a 2-minute stop set is replaced with 2:10 of content, a buffer is built up and the listener gets moved 10 seconds further behind the live broadcast. This takes away the need to map the exact durations of the original and replaced ads and ensures that when the stop set ends, no content is missed. If over time the listener falls more than 30 seconds behind, the server simply skips a 30-second spot to move the listener closer to the live broadcast.
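The four steps above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the appliance's actual code: the function names and parameters are hypothetical, and `find_transition` uses a simple "quietest frame in the window" search as a stand-in, since the Transition Detection algorithm itself is not described here. The clamp to zero in `plan_break` is also an assumption.

```python
import time


def stamp_then_parse(raw_signal, parse, clock=time.time):
    """Steps 1-2: record the arrival time first, then run the slow parser."""
    arrived_at = clock()        # float Unix time, e.g. 1372440050.123456
    event = parse(raw_signal)   # parsing delay no longer shifts the timestamp
    return arrived_at, event


def find_transition(samples, sample_rate, cue_index, window_s=2.0, frame_s=0.02):
    """Step 3 stand-in: start index of the quietest frame near the cue.

    Assumption: the lowest-energy frame inside the window marks the
    transition. The real algorithm is not public.
    """
    half = round(window_s * sample_rate / 2)
    frame = max(1, round(frame_s * sample_rate))
    start = max(0, cue_index - half)
    end = min(len(samples), cue_index + half)
    best_index, best_energy = cue_index, float("inf")
    for i in range(start, end - frame + 1, frame):
        energy = sum(s * s for s in samples[i:i + frame]) / frame
        if energy < best_energy:
            best_index, best_energy = i, energy
    return best_index


def plan_break(delay_s, original_s, spots_s, max_delay_s=30.0, skip_len_s=30.0):
    """Step 4: choose which replacement spots play and update the delay.

    delay_s is how far the listener already is behind the live broadcast.
    """
    spots = list(spots_s)
    if delay_s > max_delay_s and skip_len_s in spots:
        spots.remove(skip_len_s)  # skip one 30-second spot to catch up
    # Assumption: a listener cannot get ahead of live, so clamp at zero.
    new_delay = max(0.0, delay_s + sum(spots) - original_s)
    return spots, new_delay
```

With the numbers from step 4, `plan_break(0.0, 120.0, [30.0, 30.0, 30.0, 40.0])` plays all four spots and leaves the listener 10 seconds behind the live broadcast. Shrinking `window_s` from 2.0 to 0.5 corresponds to the narrower scanning window mentioned for modern automation systems.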
Ad insertion timing is an extremely complicated problem, but it is a problem that listeners care about and is worthy of focused attention. Our Smooth Spots solution combines dedicated hardware with software algorithms that we are continually improving to address this crucial problem. You can hear this solution in action on our demo page at smoothspots.com.
Snook is chief technology officer of StreamOn, Edmonton, Alberta.
StreamOn is an Internet Radio solutions provider, created by the OK Radio Group, that provides radio broadcasters with tools, technology and strategies for growing online audiences and generating meaningful revenue on social media.