Netflix Audio Description Style Guide v2.3
This document lists required practices for originating audio description for Netflix content. It is not intended to be, nor should it be used as, an exhaustive guide to Audio Description. Please consult your Netflix representative for any specification not covered in this document.
Make audiovisual material accessible by means of concisely conveying plot-critical and/or character-integral information that would otherwise be missed by a blind or low vision viewer.
1.1 The Basics
Use best judgement and be mindful of time constraints when determining the amount of detail to include, and prioritize description of the most relevant and important characters and actions in the scene. Avoid over-describing: do not include visual images that are not vital to the understanding or enjoyment of the scene. Allow room for dialogue, sound effects, music and intentional silence. Plot-pertinent dialogue and songs should always take priority.
1.2 Describing Actions
When describing actions, not all elements need to be included at all times. Determine what is most relevant for the story to flow without negatively impacting the viewers’ experience. Avoid information overload when not relevant or when the same details can be discerned from dialogue/music.
- Focus the description on the main and relevant supporting characters and describe visual aspects that reveal information about their identity, personality and traits (what they look like, how they move, what they’re wearing, facial expressions etc.).
- Our content is increasingly representative of the diversity of human experience. When considering whom to describe and in what detail, consider both the needs of the plot and the importance of representation. Description should be factual and should prioritize the visual attributes that address an individual’s most significant identity traits, such as hair texture, skin color, eye color, build, height, age (such as late thirties, fifties, teenage, etc.) and traits related to visible disabilities. Apply this consistently to all main and relevant supporting characters that are being described (i.e. do not single out a character because of a specific trait; describe everyone equally), and use a person-first approach (e.g. "a swimmer with one leg" instead of "a one-legged swimmer").
- If unable to confirm or if not established in the plot, do not guess or assume racial, ethnic or gender identity. Instead, focus on the characters’ physical attributes as described above.
- For non-fictional characters, determine how known/unknown they are in your territory to decide which elements to describe. This might apply to fictional characters too (e.g. a leprechaun).
- In case of time constraints or information overload, characters should be described gradually.
- Description should include known relationships when they have been revealed.
- Ideally, characters should remain unnamed until introduced through dialogue or plot-point. However, characters can be named when they first appear if they are part of pop-culture or when necessary for timing and clarification, as well as to identify characters in a large group.
- Do not name characters if they are purposefully supposed to remain unknown.
- When naming characters for the first time before they are introduced through dialogue, aim to include a descriptor before the name (e.g. "a bearded man, Jack").
- Description should convey facial expressions, body language and reactions, especially when in opposition to the dialogue. These elements can be omitted if they completely mimic the dialogue they are accompanying.
- Elements of the visual style or film language should be included when crucial to the story and/or genre (for example, text typographical features that may convey a meaning or shaky handheld camera work).
- Directional movement should be included when relevant.
- Description should be as specific as possible and avoid general terms and/or brand names, unless plot pertinent.
- Exception: if unable to confirm, do not guess. Instead, use the general term. (If you are unable to confirm what a chef is chopping, it’s better to say they chop herbs than to say they chop parsley. Please consult your Netflix representative.)
- Colors should be referenced when relevant to the scene and if time allows.
- Although some subjectivity is unavoidable, description should not be opinionated unless content demands it.
- Description should include location, time, and weather conditions when relevant to the scene or plot.
- When choosing the level of detail to provide, determine if a setting has a symbolic function (for example if it helps reconstruct the traits of a character) and if it carries more plot-relevant information compared to other elements.
- It is best to provide directional description for visual action in reference to the viewer’s body. (“The mouse ran behind a tree to the right of the house.”)
- When creating Audio Description for a language other than the original language, determine how known/unknown the setting is outside the original audience and describe accordingly (naming vs. explicitation). It is best to name and explicate if time allows, e.g. “Tower Bridge, a turreted bridge over the river Thames” or “He wears a barretina, a red Catalan hat”.
- Description should be informative and conversational, in present tense and third-person omniscient. Second-person plural can be used if relevant to the content style (She turns to the camera and winks at us) especially for children’s programs (Where is she taking us now?).
- The vocabulary should reflect the predominant language/accent of the program (for example American English vs British English; Castilian Spanish vs Mexican Spanish, etc.) and should be consistent with the genre and tone of the content, while also mindful of the target audience.
- As languages evolve, pay attention to the words you choose and their historical context. Conduct research as appropriate, and avoid using words that express negative connotations or bias towards a community or that are considered antiquated or no longer acceptable. Please consult your Netflix representative should you need assistance.
- Pay attention to verbs. Choosing the most appropriate verb is more vivid and quicker than coloring a bland verb with a modifier (e.g. “he hobbles” rather than “he walks with difficulty”).
- Common terms should be used in lieu of full description (plié vs. bending at the knee).
- Pronouns should only be used when it is clear to whom they refer. Please consult your Netflix representative should you need clarification on the pronouns to use.
- When noting shapes and sizes, comparisons to familiar objects are recommended. Use globally relevant objects to describe sizes, e.g. avoid describing 100 m as the length of a football field, a US-centric reference, and opt for a more global alternative.
- Description over dialogue should be utilized only as a last resort, for example where the plot cannot unfold properly without a description being added. In these scenarios, it is acceptable to describe over applause, laughter, repetitive dialogue or music. Do not describe over the main dialogue unless absolutely essential.
- Treat lyrics like dialogue, and only describe over them when necessary. In the case of having to describe over lyrics, allow for the song to establish itself. When lyrics are not meaningful and visuals are more important, describe what is happening. Preferably only add description when lyrics repeat e.g. during the chorus.
- Only interrupt music, sound effects, and intentional silence for vital, timely information that must be described.
Avoid censorship: do not censor any information. Description should be straightforward when addressing nudity, sexual acts, and violence. Language choice should reflect the target audience and rating (be guided by the program content). Please consult your Netflix representative should you need help determining the target audience and rating of a specific title.
1.4 Description Consistency
The word choice, character’s qualities, and visual elements (e.g. the naming of locations) should remain consistent within the description for the entirety of the content and across episodes/seasons. A glossary should be created listing common descriptors.
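One way to keep such a glossary usable is to store it as a simple mapping and scan a script for terms that drift from it. The following is only a minimal sketch under stated assumptions: the glossary entries, the sample script lines, and the function name `find_inconsistencies` are hypothetical illustrations, not part of any Netflix tooling or specification.

```python
import re

# Hypothetical glossary: preferred descriptor -> variants to avoid.
GLOSSARY = {
    "the lighthouse": ["the tower", "the beacon"],
    "the harbor": ["the port", "the docks"],
}


def find_inconsistencies(script_lines, glossary):
    """Return (line_number, variant, preferred) for each off-glossary term."""
    findings = []
    for number, line in enumerate(script_lines, start=1):
        for preferred, variants in glossary.items():
            for variant in variants:
                # Whole-phrase, case-insensitive match.
                pattern = r"\b" + re.escape(variant) + r"\b"
                if re.search(pattern, line, re.IGNORECASE):
                    findings.append((number, variant, preferred))
    return findings


# Hypothetical description lines from two different scenes.
script = [
    "Jack climbs the steps of the lighthouse.",
    "At the top of the tower, he lights a lamp.",
]

for number, variant, preferred in find_inconsistencies(script, GLOSSARY):
    print(f"line {number}: '{variant}' should be '{preferred}'")
```

A check like this only flags candidates for review; whether a variant is actually an inconsistency remains an editorial decision.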
2.0 Describing On-screen Elements
2.1 On-screen Text
Determine if the information is already being provided by other elements, such as dialogue, before adding to the description. Text may be rendered synchronously or asynchronously, verbatim or paraphrased.
Different techniques can be used to introduce text, e.g. an explanation (“words appear”), a change in tone of voice to distinguish read text from the description itself, or one or more different voices. Consult your Netflix representative before casting additional voices.
Legal Disclaimers should be read as-is.
2.2 Subtitles for Foreign Language and Difficult-to-understand Dialogue
The same techniques used for on-screen text should be used to introduce subtitles (explanation, name of the speaker, change in tone, multiple voices). The description should read the subtitles verbatim. The original dialogue audio should be dipped in order to avoid confusion, but still allow the viewer to hear the original dialogue in the background. State “subtitles” when necessary to avoid confusion (for example, the first time they appear on-screen) and reintroduce if considerable time has passed before they appear again.
Subtitles for difficult-to-understand dialogue should be included in the description only when the audio is unintelligible. Avoid describing over lines that can be understood from the original version.
For heavily subtitled content, multiple voices may be needed to help differentiate the speakers. Consult your Netflix representative before casting additional voices.
2.3 Subtitles for Foreign Language Songs
When song lyrics are plot pertinent and have been subtitled, they should be read by the AD voice. They should not be sung but be timed to fit within the rhythm of the music as much as possible while allowing key phrases of the original to be heard.
If the original lyrics are not subtitled but are plot pertinent, treat them as dialogue (do not speak over them).
If time allows, description should be provided for any on-screen logos to include any studio or company names and the details of the image. Be consistent with logo descriptions, respecting that they change through time.
If present, the Netflix Ident should be described as per Netflix Original Credits document.
2.5 Titles and Credits
The description should include any opening and closing credits, read with an adjusted tone, when this is not too distracting. If the credits interfere with simultaneous dialogue and action, timing adjustments such as grouping may be made to introduce the text before or after the credits actually appear. Include credits as time permits; they may be condensed if time is limited. Prioritize credits in order of appearance. Aim to have the following credits described during the opening and/or closing credits:
- Creator, Writer, Director, Main Cast, Producer, Executive Producer, Director of Photography, Editor, Music & Sound by
If unable to cover all credits and if time allows, state that edits have been made with a line such as “other credits follow.”
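The condensing rule above can be sketched as a small selection routine. This is an illustration only: the function name `select_credits`, the fixed `seconds_per_credit` reading time, and the placeholder names are assumptions made for the example, and real timing depends on the voice talent and the gaps available.

```python
def select_credits(credits, seconds_available, seconds_per_credit=2.0):
    """Pick credits to read, in order of appearance, within a time budget.

    credits: list of (role, name) tuples in order of appearance.
    seconds_per_credit: assumed average reading time per line (hypothetical).
    """
    budget = int(seconds_available // seconds_per_credit)
    if budget >= len(credits):
        # Everything fits, so no closing line is needed.
        return [f"{role}: {name}" for role, name in credits]
    # Reserve one slot for the closing line when at least one slot exists.
    kept = credits[:max(budget - 1, 0)]
    lines = [f"{role}: {name}" for role, name in kept]
    if budget >= 1:
        lines.append("Other credits follow.")
    return lines


# Placeholder credits, in order of appearance.
credits = [
    ("Creator", "A. Example"),
    ("Writer", "B. Example"),
    ("Director", "C. Example"),
    ("Editor", "D. Example"),
]

for line in select_credits(credits, seconds_available=6.0):
    print(line)
```

With six seconds available and two seconds assumed per line, the sketch keeps the first two credits and closes with the "other credits follow" line.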
When creating Audio Description for a language other than the original language (i.e. AD that is mixed with a dub) and if time allows, please read credits that will appear in the dub card after credits listed above or in place of Main Cast in original crawl.
Introduce the title of the content by stating “title” before the name. Reflect the typography if relevant. When creating Audio Description for a language other than the original language (i.e. AD that is mixed with a dub) use the Netflix approved translations for the Main Title as provided in the KNP.
3.1 Voice Casting
The AD voice should be selected according to the following categories:
The gender of the AD voice should be chosen either to complement or to contrast with the majority of voices in the film. Some think it should be easy to distinguish between the dialogue and the describer, others that it should match because of the subject matter. This should be decided on a case-by-case basis.
The age of the AD voice should match the content and age of the intended audience, e.g. a teenage or young adult voice would be preferred for Sex Education, although the gender probably does not matter. The exception to this would be programs for young children, where a nurturing voice might be best.
The quality of the AD voice should match the dominant mood of the content, e.g. a mellifluous voice for a love story, a grittier voice for a Western.
The accent of the voice actor should reflect the predominant accent in the program (for example American English vs British English; Castilian Spanish vs Mexican Spanish, etc.).
More important than their vocal characteristics is that the voice talent should have a sense for the content of the program and be able to reflect the emotion of a scene.
Creative possibilities in AD are opened up by having more than one AD voice. Consult your Netflix representative before casting additional voices.
3.2 Technical Requirements
The description should be mixed to sound as though it was part of the original content. For 5.1, description should be mixed to the center channel. For further information about the mix, please consult our technical specs.
- For a 5.1 Printmaster (PM), dip center channel only for descriptive events. For very loud sections or for films with very wide dynamic range, it’s acceptable to dip the Left and Right channels of a 5.1 PM as well, generally no more than -6 dB, and sparingly up to -12 dB when absolutely necessary.
- For a 2.0 Printmaster, dip both channels accordingly.
- Original Version/PM can also be manually dipped. Voiceover/AD may not be raised above Netflix loudness specifications to overcome very loud events in the Printmaster audio.
- Dip original version mix 6-12 dB, per mixer discretion. AD/VO audio should be clear and intelligible with the natural presence of OV dialog underneath. These are subjective choices relating to the dynamics of the OV/PM and the perceived volume of the mix with incorporated AD Voiceover.
- Side-chained compressors should not have an attack time shorter than 2 ms or longer than 15 ms. Use your best judgment to avoid compression artifacts, such as pumping or popping. Achieve a transparent natural response.
- Mix level should adhere to Netflix LKFS loudness and true-peak specifications.
- Mix level should transition to/from dips for descriptive events in no more than 5 seconds. Avoid abrupt transitions and noticeable level changes to create a seamless experience.
- Regarding EQ and dynamic processing: VO should sound natural. A good recording will generally need very little processing with EQ and dynamic compression. Avoid over-processing. Avoid using noise reduction, as recordings should already be quite clean. (NR can cause serious artifacts when not applied sparingly.)
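The numeric limits in the list above can be collected into a simple pre-delivery check. The sketch below is an assumption-laden illustration, not a Netflix tool: the `DipSettings` fields, the function names, and the idea of expressing dips as positive attenuation values are all hypothetical, and loudness and true-peak compliance still has to be measured against the technical specs.

```python
from dataclasses import dataclass


@dataclass
class DipSettings:
    ov_dip_db: float     # attenuation applied to the original mix, in dB (positive)
    lr_dip_db: float     # Left/Right attenuation on a 5.1 PM, 0 if unused
    attack_ms: float     # side-chain compressor attack time
    transition_s: float  # time to ramp into or out of a dip


def db_to_gain(db):
    """Convert a level change in dB to a linear amplitude factor."""
    return 10 ** (db / 20)


def check_dip(s):
    """Return a list of limits from the guide that the settings violate."""
    problems = []
    if not 6 <= s.ov_dip_db <= 12:
        problems.append("OV dip outside 6-12 dB")
    if s.lr_dip_db > 12:
        problems.append("L/R dip exceeds 12 dB")
    if not 2 <= s.attack_ms <= 15:
        problems.append("attack time outside 2-15 ms")
    if s.transition_s > 5:
        problems.append("transition longer than 5 s")
    return problems


# Hypothetical settings for one descriptive event.
settings = DipSettings(ov_dip_db=8, lr_dip_db=0, attack_ms=5, transition_s=2)
print(check_dip(settings) or "within the listed limits")
print(f"a 6 dB dip scales amplitude by about {db_to_gain(-6):.2f}")
```

For orientation, a 6 dB dip roughly halves the signal amplitude and a 12 dB dip reduces it to about a quarter. Passing these checks says nothing about how the mix sounds; the guide leaves the final choice to mixer discretion.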
3.3 Vocal Approach
The delivery of the description should match the volume, pace, emotional tone and rhythm of the content.
- Voice - Avoid a monotonous or sing-song delivery. The narrator’s voice must be distinguishable from other voices in the content, but it should not be distracting or over-animated, becoming the voice of a performer unless the content demands it. For selected titles, your Netflix representative may request a specific delivery depending on the type of content (for example, more empathetic for an emotional fiction title).
- Enunciation and speaking rate - Speak clearly and at a rate that can be understood. Avoid speaking too fast or too slowly. As far as possible, the pace of the description should reflect the pace of the scene. In a romantic sequence, the description should flow casually and allow silence and pauses as necessary. It should be quicker and more staccato for a fight or a chase.
3.4 Describer Consistency
The same voice talent should be used across all episodes and seasons of a series, as well as movie sequels when possible. Consult your Netflix representative if unable to secure the same voice talent.
3.5 Audio Description Credits
Include AD post-house name, script writer and voice talent credits within the AD track, after the last frame of picture and before the end credit crawl. In case of time constraints, consult your Netflix representative to determine where best to place the credits.
Determine the genre, visual style and spatiotemporal setting (where and when) of the content, as well as its audience. Choose words and expressions from the same semantic field, consulting the original script/screenplay if need be and when available.
4.1 Children’s Content
Tone and vocabulary should match the age range of the target audience, and a more intimate style may be appropriate. For educational materials, or situations in which the viewer is asked to follow specific actions of a character on-screen, the description should make clear to the sight-impaired audience that the viewer, rather than an on-screen character, is being addressed (“to us”; “let’s have a go”).
4.2 Horror/Suspense Content
Description should account for intentional pauses, dramatic silences and the musical score in order to allow the sight-impaired audience to experience the same build-up of suspense intended by the production. This should also be reflected in the delivery.
5.0 Plot Devices
While jargon and technical terms should not be used, film terminology that has entered the common vocabulary can be used when necessary for timing and clarification, or when in line with the story and/or genre (for example, “now in close-up”).
It is preferable to describe synchronously with the image, especially with regard to comic situations. However, description may adjust timings (pre-description) in order to introduce plot elements early, when there is no other way to sensibly inform the audience about the content.
5.2 Camera Angles & Shot Changes
When shot changes are critical to the understanding of the scene, indicate them by describing where the action is or where characters are present in the new shot. Camera angles or point-of-view should only be included in the description when content-appropriate (“from above” or “bird’s eye view”).
As time allows, describe montages of images or series of still images. When the images are relevant but time is restricted, highlight a couple of the most significant images.
5.4 Passage of Time
Always address time shifts in relation to the character(s). When describing certain passages of time, such as flashbacks or dream sequences, describe the visual cues that indicate such, and be consistent throughout the program.