Pop Up Archive Solves Transcription Headaches

Pop Up Archive rescued five years of KALW(FM)’s content.

Audrey Dilling is a reporter/editor for KALW(FM) Public Radio’s “Crosscurrents” news magazine, which airs in San Francisco. Part of her reporting job is transcribing her research audio into text so it can be archived and searched for later stories.

In the past, Dilling had to produce these transcripts from scratch, literally listening to the broadcasts and typing what she heard. “This could take anywhere from four to five hours a day,” Dilling said. “Now, using Pop Up Archive, I have been able to reduce the transcription process to just an hour daily.”

POP UP ARCHIVE TO THE RESCUE
Created by former journalists Anne Wootton and Bailey Smith, Pop Up Archive (www.popuparchive.com) is a major advancement in audio-to-text transcription.

The idea is to allow broadcasters, podcasters and other producers of audio content to convert their audio files quickly into readable text files that can be stored, indexed and searched easily.

The transcribed files are automatically time-stamped and analyzed to create descriptive keywords, further improving the search process. (Wootton and Smith have also created Audiosear.ch, a full-text search and discovery engine for podcasts and radio that aggregates and analyzes transcripts with other data, like chart positions, reviews, recommendations and social media mentions.)

“Pop Up’s website offers tools to make audio-to-text transcription as easy as dragging-and-dropping audio files onscreen,” said Emily Saltz, Pop Up Archive’s content strategist. “The customer receives computer-generated transcripts within an hour or less that are indexable and searchable. They are made available on our website as soon as they are ready.”

For a monthly fee based on customer usage, the Pop Up Archive service provides the transcripts, online storage of the audio files, transcript editing tools for correcting any wrongly transcribed words, customer-specified and auto-generated tags to help identify each transcript’s content, and embeddable players that enable anyone to search within a Pop Up Archive audio file.

BROAD USAGE
In addition to employing Pop Up Archive to transcribe audio to text, KALW hired the company to build a “custom, easily searchable archive of five years’ worth of unindexed program transcripts,” said Dilling. “For us, this rescued five years of valuable content from being lost.”

KCRW Public Radio of Santa Monica, Calif., also uses Pop Up Archive to transcribe a number of its shows, including “To The Point,” “Bookworm” and “The Treatment.”

Pop Up Archive’s audio-to-text transcription client list includes KQED, “This American Life,” the Canadian Broadcasting Corp., Duke University, the New York Public Library, Illinois Public Media and Public Radio International.

“Overall we produce audio-to-text transcripts for about 30 major enterprise clients and 150 smaller ones,” said Saltz.

LIMITS
The toughest challenge for any speech-to-text recognition engine is accurately deciding which written word corresponds to a given spoken word, especially given the wide variations in human speech across regions, socioeconomic groups and personal vocal inflections and idiosyncrasies.

These factors explain why retail voice recognition software such as Nuance Dragon NaturallySpeaking insists on users “training” the software to “recognize” their voices first. The process entails each user reading a number of preset text passages aloud and recording them into the software using a headset microphone. Doing this allows the software to associate the user’s spoken words with specific text, building a sound/text database against which it can compare the user’s speech during actual audio-to-text transcription. Using this approach, Nuance Dragon NaturallySpeaking promises users “up to 99 percent accuracy out of the box,” according to the company’s website (http://shop.nuance.com/).

Because it does not enjoy the advantage of user training and has to convert voice-to-text from audio for an unrestricted number of voices, the accuracy of Pop Up Archive’s transcripts can vary widely depending on speaker, background noise and crosstalk (when two people are talking at the same time).

But even the 75 percent accuracy rate experienced by Dilling is a lot better than transcribing audio from scratch.

“Seventy-five percent accuracy means that three-quarters of the words in the transcript are already right,” Dilling told Radio World. “The time spent to correct the remaining 25 percent is still much less than having to do this all without Pop Up Archive’s help — and the accuracy rate keeps improving.”
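For readers who want to see how a figure like 75 percent might be worked out, here is a minimal Python sketch. It is not Pop Up Archive's method, which the article does not describe. It assumes accuracy is measured the common way, as one minus the word error rate, where the error count is the word-level edit distance between a corrected reference transcript and the machine transcript. The function names and the sample sentences are hypothetical and chosen only for illustration.

```python
def word_edit_distance(reference, hypothesis):
    """Count the word substitutions, insertions and deletions needed
    to turn the machine transcript into the reference transcript."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    # prev[j] holds the distance between the reference words seen so far
    # and the first j words of the machine transcript.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # word missing from machine text
                            curr[j - 1] + 1,      # extra word in machine text
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def word_accuracy(reference, hypothesis):
    """Return 1 minus the word error rate (assumed definition of accuracy)."""
    ref_len = len(reference.split())
    if ref_len == 0:
        raise ValueError("reference transcript is empty")
    return 1.0 - word_edit_distance(reference, hypothesis) / ref_len


# Hypothetical example: an eight-word sentence with two misheard words.
corrected = "pop up archive turns radio audio into text"
machine = "pop up archive turns ready audio onto text"
print(word_accuracy(corrected, machine))
```

In this made-up example, two of the eight words need fixing, so the editor corrects a quarter of the text instead of typing all of it.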

Hence, even with its flaws, Pop Up Archive is proving to be a useful tool.

“I can’t imagine doing my transcriptions without it,” said Dilling. “I certainly wouldn’t want to go back to the old ways of doing things.”