This is one in a series of articles about trends and best practices in streaming.
Cristian Oneț is co-CTO of Zeno Media, an audio-focused streaming and podcasting infrastructure company that connects diaspora communities to content from their home countries. It works with broadcasters and creators and has partners in approximately 100 countries.

Radio World: What’s the most important trend in how audio streaming has evolved for radio companies?
Cristian Oneț: The base technology used for streaming did not change for a long time; even HLS dates back to 2009, although adoption took some time. Google Chrome added native support just this year.
I think AI has the biggest potential to improve radio station workflows; it will not replace content creation, but it reduces the cost of labor-intensive tasks like transcript generation, metadata extraction, content curation and summarization.
We are working on our own product that will produce podcasts from live shows by properly identifying the relevant parts of the show, building the episode media and generating the episode metadata — image, description, transcripts.
RW: Zeno Media recently adopted HTTP Live Streaming. Why is this important and what can you do with it that you could not do before?
Oneț: Although we adopted HLS, I don't think it will replace our ICY (Icecast-style) streaming anytime soon. I see the two protocols as covering different needs, and it doesn't make sense for HLS to completely replace ICY. Technically HLS is a much better protocol; its segmentation opens up all kinds of possibilities that were simply not there with ICY. But ICY is still a good solution for the particular problem of online radio streaming.
HLS enables flexible streaming and on-demand delivery, supporting both audio and video through a single protocol. It also allows client-side ad stitching using tags like EXT-X-DISCONTINUITY and EXT-X-DISCONTINUITY-SEQUENCE, while still keeping ad requests handled server-side. In addition, HLS includes built-in support for multiple quality levels, making adaptive streaming straightforward.
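To illustrate, a media playlist around a stitched ad break might look roughly like this; the segment names and durations here are hypothetical:

    #EXTM3U
    #EXT-X-VERSION:3
    #EXT-X-TARGETDURATION:10
    #EXT-X-MEDIA-SEQUENCE:872
    #EXT-X-DISCONTINUITY-SEQUENCE:14
    #EXTINF:10.0,
    live/segment872.aac
    #EXT-X-DISCONTINUITY
    #EXTINF:9.6,
    ads/spot-001.aac
    #EXT-X-DISCONTINUITY
    #EXTINF:10.0,
    live/segment873.aac

Each EXT-X-DISCONTINUITY marks a splice point where encoding parameters or timestamps may change, so the player knows to reset its decoder, and EXT-X-DISCONTINUITY-SEQUENCE keeps clients that join mid-stream in agreement about how many discontinuities precede the first listed segment.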
High availability is also made easier because there are no more long-running HTTP requests like with ICY; each HLS segment is a short, independent request, so clients can move between servers from one segment to the next.
RW: How can a user choose the most suitable streaming CDN?
Oneț: The most suitable CDN is the one with the best ratio of cost to service provided, while keeping service quality above a minimum acceptable threshold.
This is a very engineer-ish answer, but it's similar to the problem we face in picking cloud service providers that are good enough to run our streaming CDN, taking all aspects into consideration.
For example, bandwidth is the biggest cost a streaming CDN generates, so we need to make sure that the price the cloud service provider charges for the bandwidth our stations need is low enough to keep our service viable for them.
I can tell you for certain that the big cloud providers like AWS, Google Cloud Platform or Azure do not meet these criteria; the bandwidth would simply be too expensive for our stations. So we ended up picking mid-tier cloud service providers that offer machines with unmetered bandwidth at a decent price.
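As a rough illustration, with hypothetical numbers: a station averaging 100 concurrent listeners at 128 kbps pushes about 12.8 Mbps, which works out to roughly 4 TB of transfer per month. At typical hyperscaler egress pricing of somewhere around $0.09 per GB, that single modest station would cost a few hundred dollars a month in bandwidth alone, while an unmetered machine from a mid-tier provider carries the same traffic for a flat monthly fee.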
RW: Should a broadcaster have redundant CDNs?
Oneț: It depends on the level of service they are aiming for. If the broadcaster expects that downtime would cost a lot more than adding redundancy to their broadcast, it makes sense to invest in it.
Going back to the parallel with picking a cloud service provider: in our experience, one provider was good enough for a few years, but lately there have been routing issues, not necessarily caused by the provider but affecting the traffic going to it, and we felt we needed some redundancy at that level.
So we started actively searching for cloud service providers similar to our current one that offer guaranteed unmetered bandwidth with the machines they sell.
Everybody must judge for themselves whether they need redundancy based on their situation. Redundancy is good, but it has a cost and brings some complexity. If the main service works reliably it might not be worth it.
RW: How is metadata support accomplished in your streaming service?
Oneț: Since we just added support for HLS, most of our content is still ingested as ICY (Icecast) streams, so metadata is sent by the encoders using the Icecast metadata-update admin call.
This call can set two values, the song and the URL, which go into the StreamTitle and StreamUrl ICY metadata fields. StreamTitle is displayed by every player that supports ICY metadata, so StreamUrl is usually used to carry encoded metadata about the content being played.
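As a sketch, such an update against the Icecast admin endpoint looks roughly like this; the host, mount point and credentials are hypothetical:

    // Sketch of an Icecast metadata update (mode=updinfo); the host,
    // mount point and credentials here are hypothetical placeholders.
    const params = new URLSearchParams({
      mount: "/live",
      mode: "updinfo",
      song: "Artist - Track Title",          // ends up in StreamTitle
      url: "https://example.com/meta?id=42", // ends up in StreamUrl
    });

    const res = await fetch(`https://stream.example.com/admin/metadata?${params}`, {
      // the admin call is authenticated with HTTP basic auth
      headers: { Authorization: "Basic " + btoa("source:hackme") },
    });
    console.log(res.status); // 200 when the update is accepted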
Since the encoder sends this metadata out of band, synchronization between the metadata and the broadcast content is the encoder's responsibility, and sometimes it is not done properly.
Then there's the problem of web players, which usually don't support the ICY metadata protocol because it requires more complex code to separate the metadata from the actual stream bytes. For these players we provide a streaming API for every station: over a single HTTP connection, a web client can subscribe to server-sent events and receive metadata updates in real time.
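For a web player, consuming a server-sent events endpoint like that takes only a few lines; the endpoint URL and the JSON payload shape below are hypothetical:

    // Minimal browser client for a per-station metadata endpoint;
    // the URL and the payload shape are hypothetical.
    const source = new EventSource("https://api.example.com/stations/mystation/metadata");

    source.onmessage = (event) => {
      const meta = JSON.parse(event.data); // e.g. { title: "...", url: "..." }
      document.querySelector("#now-playing")!.textContent = meta.title;
    };

    // EventSource reconnects automatically after transient errors
    source.onerror = () => console.warn("metadata stream interrupted, retrying");

Because the browser's EventSource handles reconnection on its own, the player gets metadata updates without ever touching the audio stream's bytes.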
Frankly, we would prefer that all metadata be sent out of band, because metadata scanning by various systems generates a significant amount of streaming traffic, and that automated traffic makes monetization more difficult.
RW: How can a streaming station match audio levels among different sources including music and ad partners?
Oneț: Through loudness normalization, targeting the standard loudness levels defined by the industry. I won't go into the details of doing that on the broadcaster side, but our programmatic ad insertion system matches ads, transcoded and normalized to standard loudness levels, to the loudness of our stations so that there's a good match between them.
We track the broadcast loudness of all our stations over time, and when we do ad insertion, we pick the transcoded ad version whose loudness is closest to the station's broadcast loudness. By default we transcode every ad to two loudness levels, –16 and –24 LUFS. We could define more, but so far these have been enough. When an ad is fetched to be inserted, the station's loudness is used to select the closest match.
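A minimal sketch of that matching step, assuming the two default renditions from above and a measured station loudness in LUFS (the names and file paths are illustrative):

    // Pick the transcoded ad rendition whose loudness is closest to the
    // station's measured broadcast loudness; names here are illustrative.
    interface AdRendition {
      url: string;
      loudnessLufs: number;
    }

    function pickRendition(renditions: AdRendition[], stationLufs: number): AdRendition {
      return renditions.reduce((best, r) =>
        Math.abs(r.loudnessLufs - stationLufs) < Math.abs(best.loudnessLufs - stationLufs)
          ? r
          : best
      );
    }

    // The two default loudness levels mentioned above, -16 and -24 LUFS:
    const renditions: AdRendition[] = [
      { url: "ads/spot-001_-16lufs.aac", loudnessLufs: -16 },
      { url: "ads/spot-001_-24lufs.aac", loudnessLufs: -24 },
    ];

    // A station broadcasting around -19 LUFS gets the -16 LUFS version.
    console.log(pickRendition(renditions, -19).url);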