Seeing the Movies

This article originally appeared in The Entertainment Issue of APEX Experience.

Picture your favorite movie scene. Maybe it’s Julie Andrews’ towering performance that brings Austria’s alpine landscape to life in The Sound of Music. Or perhaps it’s the explosive opening of Woody Allen’s cinematic ode to Manhattan, where fireworks crescendo against the nighttime cityscape in rhythm with George Gershwin’s Rhapsody in Blue. Is it the cosmic 3-D magnitude of space in Alfonso Cuarón’s Gravity or Christopher Nolan’s Interstellar that gives you pause? Or maybe you’re a romantic and prefer watching Jack and Rose fly into the sunset from the bow of James Cameron’s colossal blockbuster, Titanic. Close your eyes and picture your favorite scene. Now, imagine what it would be like to watch it without sight.

Kim Charlson, president of the American Council of the Blind, lost her vision when she was 11 years old, but thanks to descriptive audio, watching movies is something she can still enjoy. Descriptive audio, or visual description, is a narrative track slipped between a film’s soundtrack and dialogue that describes the visual aspects of a production so that blind or visually impaired audience members can understand what’s going on. In that “flying” scene in Titanic, for example, a narrator describes what can’t be seen: “Jack’s face grins beside hers, as they soar over the waves from their solitary perch.” Says Charlson, after watching Titanic with descriptive audio: “One of my favorite scenes was when Leonardo DiCaprio and Kate Winslet were standing out on the prow of the ship looking at the setting sun and the reflection on the water. I really lost myself in the whole movie experience.”


Producing descriptive audio takes about two weeks on average. “When we get the program in, it’s ingested, and then we get it burned into time code so that our writer has something to work from,” explains Simone Cupid, producer, Accessible Media Inc. Once the writer receives the file, they watch the film carefully, dissecting each scene and identifying what visual information the narrator will need to convey. This can include anything from costume or scene changes to the entrance of a new character, sight gags, body language and more.

Writers are rigorously hired and trained to provide balanced descriptions in line with the company’s style. After the writer scripts out the visual elements of the program, the producer and their team are tasked with determining what information is and isn’t essential. “Sometimes it doesn’t really matter that a character’s drinking a cup of coffee, because that has nothing to actually do with the story,” Cupid says. “Even though it’s an action that someone can’t see, it isn’t relevant to the story and all it does is cloud up the issue.”

It may not be necessary to describe that a gun is shot, but who shot it and who was on the bullet’s receiving end is essential. Other visual details are less cut-and-dried. “When is it important that somebody’s black or white?” Cupid asks. “A lot of the times it is important, but most of the time it isn’t … Legitimately, you should mention it for all or mention it for none [of the characters], unless there’s a pertinent reason that we have to point out a person’s ethnicity.”

Beyond the challenge of identifying what needs to be described is the challenge of when to do it. In a dialogue-heavy movie, the production team may have to choose between description and dialogue: “Minute by minute it’s a toss-up between these two decisions,” Cupid explains. “Do I need to step on something a character is saying so that I can tell you he’s pulling out a gun at the same time, or is it more important for me to stay out and let him say whatever it is he’s got to say?”

Genre is a factor in description as well. “You want the tone and the vocabulary and the pacing of the actual words to match that of the program,” adds Bryan Gould, director, Accessible Learning and Assessment Technologies at The Carl and Ruth Shapiro Family National Center for Accessible Media (NCAM) at television station WGBH. “If you’re doing a Mickey Spillane noir show, you’re not going to do it in a clipped-English sort of way… but you don’t want anything to stick out like a sore thumb.” Likewise, language used for children’s programming may need to be adjusted so that it’s suited to their level of comprehension.


When the writer is finished with the script – it usually takes three to five days to write – it goes back to the producer, who ensures that crucial visual information has been accounted for. With a television series like CSI, which may have multiple writers working on it, the producer may edit for consistency so that the series keeps a uniform tone. “I always call it ‘the voice that binds,’” says Cupid.

Vocal talent is then hired to narrate the writer’s descriptions. “If the program has a lot of female voices or female characters in it, we may use a male narrator and vice versa,” Gould explains. So films such as Fight Club, The Rat Pack or Ghostbusters would have a female narrator, while Bridesmaids, Black Swan or Sex and the City may be narrated by a male. “That’s one thing we don’t want to do, is mask the description to make it sound like part of the program,” he adds.


In the United States, most major theatrical films are starting to be released with descriptive audio. With digitization, descriptions can be carried on a separate audio channel that can be turned on and off, and in movie theaters, this channel can also be transmitted to headset-equipped devices. The Department of Justice intends to issue a final ruling this fall that may require theaters to offer moviegoers devices for both closed captioning and descriptive audio.

For in-flight entertainment (IFE), “The natural progression would be to emulate the broadcast model, where you have one movie, but you provide different soundtracks for it,” says Geoff Freed, director, Technology Projects and Web Media Standards, NCAM. “Technologically it is like any other audio track,” adds Michael Childers, APEX Technical Committee chair and Board member. “Descriptive audio tracks would be delivered to post-production just like audio language tracks, and they are multiplexed alongside other audio tracks and synced to the video the same way.” In 2014, in partnership with Disney, Emirates became the first airline to offer passengers films with descriptive audio tracks. Moving forward, proactive agreements with studios will be key, as they will allow the necessary preparations to be made to create the descriptive audio track within the early content delivery window.

The US Department of Transportation (DOT) will provide recommendations to the Secretary of Transportation by July 25 this year, with rulemaking planned for December. Under APEX’s Closed Caption Working Group, or through a newly created descriptive audio working group, the APEX Technical Committee will work with parties such as DOT and NCAM on the submission and on bringing descriptive audio onboard. “The first action for us is… to draft specifications for the delivery of these tracks in the same way that we specify parameters for delivery of other tracks,” Childers says. “Most likely, it would be a revision to APEX 0403 [for closed captioning], which would be picked up and reused in new specifications, such as a potential H.265 specification.”


Bringing descriptive audio onboard is one huge step, but making it possible for passengers to find these films is a whole other matter. In 2014, Air Canada became the first airline to provide a fully accessible in-flight entertainment system with the unveiling of its Boeing 787 fleet, equipped with Panasonic’s eX3 system produced in collaboration with DTI Solutions (now part of Global Eagle Entertainment). The system allows visually impaired passengers to navigate the graphical user interface in complete autonomy thanks to text-to-speech technology.

“With our software partners,” explains Éric Lauzon, manager, multimedia entertainment, Air Canada, and APEX Board member, “we developed a content management system (CMS) for this accessible IFE solution that allows us to basically piggyback on the metadata and entertainment content that our content service provider enters into the media management system. The output of this CMS is then imported into a tool that creates text-to-speech audio files.” After being tested by Panasonic, the files are packaged and made available onboard so passengers can hear descriptions of the film in either French or English before making their selection.

The airline has also licensed programs such as the cooking show Four Senses, an original production by Accessible Media Inc. that features blind MasterChef winner Christine Ha and includes embedded descriptions.