How do I caption a video?
By Jennifer Beer
Services Coordinator, Digital Media Captioning
We regularly get calls from people and businesses who ask, “How do I add captions to a video?” They want to make their online video content accessible, but aren’t sure how to do it. CHS has researched captioning quality standards in Canada and other countries, and has also conducted its own research with members of the Deaf and hard of hearing community to develop quality standards for its own captioning services. Here are a few important points to consider:
Accuracy: Every word spoken needs to appear on screen. If it’s not essential to understanding, we’ll leave out an “um” or “uh.” But other than that, if you can hear it, it must appear in the captions.
Language mechanics: Text should be presented as it is generally spoken and written. Proper spelling and punctuation must be used, and blocks of text should begin and end with natural breaks in speech. Text that is broken up awkwardly can be difficult to understand. If the announcer mentions “Mr. Smith,” “Mr.” and “Smith” need to stay together. Words like conjunctions, pronouns, and adverbs need to stay with the words they belong with. And each sentence should start on a new line, unless it is one of several very short sentences that work together, e.g. “Hark! Who goes there?” In addition to aiding comprehension, breaking up captions according to proper language mechanics can help on platforms where translation services are available, such as YouTube.
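As a rough illustration of the “Mr. Smith” rule, here is a minimal Python sketch that wraps a caption into lines without splitting a title from the name that follows it. The list of titles, the 25-character line width, and the function name `wrap_caption` are assumptions made for this example only; they are not taken from any captioning standard, and real line breaking also has to respect the other points above.

```python
import re
import textwrap

NBSP = "\u00a0"  # non-breaking space; textwrap only breaks lines at ASCII whitespace

# Hypothetical list of titles that must stay with the following name.
TITLE_RE = re.compile(r"\b(Mr|Mrs|Ms|Dr)\.\s+(?=\w)")

def wrap_caption(text, width=25):
    """Split text into lines of at most `width` characters (an assumed limit)
    without separating a title such as "Mr." from the name that follows it."""
    # Temporarily glue the title to the name so the wrapper cannot break there.
    glued = TITLE_RE.sub(lambda m: m.group(1) + "." + NBSP, text)
    lines = textwrap.wrap(glued, width=width,
                          break_long_words=False, break_on_hyphens=False)
    # Restore ordinary spaces for display.
    return [line.replace(NBSP, " ") for line in lines]

print(wrap_caption("The announcer thanked Mr. Smith for coming"))
# ['The announcer thanked', 'Mr. Smith for coming']
```

A plain `textwrap.wrap` call at the same width would end the first line with “Mr.” and start the second with “Smith,” which is exactly the awkward break described above.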
Sound effects: It’s important to include sound effects and other non-spoken information that a Deaf or hard of hearing viewer may miss out on. Can you imagine watching “Close Encounters of the Third Kind” without hearing the music? Or watching a suspenseful film where someone is hiding, and listening for footsteps to be sure the coast is clear? Non-verbal information is often crucial to making sense of video, and needs to be included in the captions.
Speaker identification: I recently watched a video of a debate – with the sound off. Because the speakers weren't identified, it was impossible to tell who was saying what! Where there are multiple speakers, each one should be identified, by name where possible.
Pacing: Text blocks should contain a maximum of two lines where possible, and should remain onscreen for a minimum of two seconds. Text should not be edited to accommodate presentation rate unless absolutely necessary, and then editing should be minimal and retain both the tone and the meaning of the original speech.
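The two pacing numbers above lend themselves to a simple automated check. The sketch below is one possible way to flag caption blocks for review; the `Cue` structure, its field names, and the use of whole milliseconds are assumptions for illustration, not a description of any particular captioning tool.

```python
from dataclasses import dataclass
from typing import List

MAX_LINES = 2           # guideline: at most two lines per text block
MIN_DURATION_MS = 2000  # guideline: at least two seconds on screen

@dataclass
class Cue:
    # Hypothetical caption block; times are whole milliseconds from video start.
    start_ms: int
    end_ms: int
    text: str  # caption text, with a newline between on-screen lines

def pacing_issues(cue: Cue) -> List[str]:
    """Return human-readable pacing problems for one caption block."""
    issues = []
    line_count = len(cue.text.splitlines())
    if line_count > MAX_LINES:
        issues.append(f"{line_count} lines (max {MAX_LINES})")
    duration_ms = cue.end_ms - cue.start_ms
    if duration_ms < MIN_DURATION_MS:
        issues.append(f"on screen {duration_ms} ms (min {MIN_DURATION_MS} ms)")
    return issues

print(pacing_issues(Cue(0, 1500, "Hark!\nWho goes\nthere?")))
# ['3 lines (max 2)', 'on screen 1500 ms (min 2000 ms)']
print(pacing_issues(Cue(0, 2500, "Hark! Who goes there?")))
# []
```

Because the two-line limit applies “where possible,” a check like this is best treated as a list of blocks for a person to look at, not as a hard pass or fail.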
Timing: Text should be synchronized as closely as possible with sound. If you’ve ever watched a video where the audio was out of sync with the picture, you know how disorienting it can be – hard of hearing viewers feel the same way about out-of-sync captions! If necessary in order to accommodate text, timing can be off by as much as 0.3 seconds; more than that and it becomes distracting.
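The 0.3-second tolerance can be checked in the same spirit. This is a minimal sketch, assuming you already know when each line of speech starts (for example, from a manually timed transcript); the function name and the millisecond timestamps are hypothetical choices for this example.

```python
MAX_OFFSET_MS = 300  # guideline: captions may be off by as much as 0.3 seconds

def is_in_sync(caption_start_ms: int, speech_start_ms: int) -> bool:
    """True if the caption appears within the allowed offset of the speech,
    whether it shows up slightly early or slightly late."""
    return abs(caption_start_ms - speech_start_ms) <= MAX_OFFSET_MS

print(is_in_sync(10_300, 10_000))  # True: exactly 0.3 seconds late
print(is_in_sync(10_400, 10_000))  # False: 0.4 seconds late
```

Storing times as whole milliseconds avoids floating-point rounding surprises right at the 0.3-second boundary.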
Now, you may ask yourself: How do you take all of those issues into account at the same time? The answer is, it’s a balancing act requiring immense language skills as well as great attention to detail. But we’d be happy to do it for you! For more information or to request a quote, drop us a line at firstname.lastname@example.org.