A Closer Look at Netflix's Timed Text Style Guides and Subtitling Best Practices

Introduction

Netflix Timed Text Style Guides

Technical aspect
Linguistic aspect
Forced Narratives
Trailers
Subtitles vs. CC

Conclusion
Resources
Appendix: SDH Identifiers Table - HU

Watch my short "hook" about this post here:

Download the slides here.

Introduction

Subtitling and audiovisual translation

Dubbing and subtitling are very creative processes. Whether the audience watches with dubbed audio or in the original language with foreign-language subtitles, closed captions, or forced narratives, the ultimate goal is to make the shows enjoyable and easy to follow.

As well as making sure that any text is timed appropriately to the action, capturing the creative vision and the nuances in translation is critical for this goal. Audiovisual translation is like creating 3D translations. In traditional translation projects, you have the source text and the target text. It's two-dimensional. With audiovisual translations, you have the source text, the visuals, and the sounds, and you create a target file that flows seamlessly with the production yet stays invisible enough not to take attention away from it. It's not as easy as it seems. Subtitling borrows from many different practices and schools of thought. One of the biggest rules of subtitling is, naturally, to always keep the viewer in mind.

But let's look at the process more closely. In this article we will look at Netflix's Timed Text Style Guide (which is available on the internet at partnerhelp.netflixstudios.com) and use it as a model to acquire the necessary skills and approach for creating general subtitles for audiovisual localization projects. I will attempt to simplify their most common timing rules - which can look confusing at first, especially for those who have never done subtitling before - as well as discuss some of their other guidelines and the best practices that I learned during the time I worked as a subtitler and subtitle QCer. Of course, whenever they are available, you should always follow your client's guidelines and requirements. Please also note that the specifics of these guidelines might differ or change. Since I am not affiliated with Netflix, I know no more about such changes than anyone else who can access their publicly available material on the internet.

Netflix Timed Text Style Guide

"Timed Text" in the name means that it carries the dialogue along with the corresponding time code. Timed Text's requirements can vary from language to language, but there are some common elements.

The Netflix Timed Text Style Guide starts with this description: "Any timed-text created specifically for Netflix – Originals or non-Originals – should follow the Netflix Timed Text Style Guide unless otherwise advised." Let's look into the document itself.

I have separated the process into two major aspects: technical and linguistic.

The technical aspect covers things like timing, duration, character limitations, and positioning of the text. In the linguistic section, I will share some language-neutral tips and best practices I learned as a subtitler.

Hint: if you are planning to apply to become a subtitler for any major streaming or media company, you should not overlook any of the above-mentioned areas. A solid command of the language, with perfect grammar and an excellent understanding of idioms and phrases, spiced up with a good amount of creativity and artistic touch, is just as important as a perfect understanding of the technical requirements. The combination of the two will enable you, your client, and ultimately the audience to really enjoy the show and almost "forget" about the "distracting" white text on the screen while being perfectly able to follow the storyline just like a native speaker would without subtitles.

Technical Requirements/Timing Rules

Timing

According to the Timed Text Style Guide, the minimum duration of a subtitle event is 5/6 (five-sixths) of a second (e.g. 20 frames at 24 fps) and the maximum duration is 7 seconds. The character limit for most European languages is 42 characters per line. These values normally refer to adult shows. Children's productions and trailers (also known as supplementals) may have other timing requirements or guidelines, and the basic rules can be relaxed in favor of other, higher-priority criteria. For example, in the case of children's shows, given that children don't read as fast, the text needs to stay on the screen longer and cannot be as long as in adult productions. And with supplementals, the shot and scene changes can happen so fast that in some instances you cannot follow the basic timing rules while still satisfying other, more important criteria, such as making sure that your text stays on screen long enough to be readable at all.
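To make these numbers concrete, here is a minimal Python sketch that converts the two duration limits into frames and checks a single subtitle event. The function names, the frame-based representation of in- and out-times, and the choice to round the minimum up to a whole frame are my own assumptions for illustration; they are not taken from the guide or from any Netflix tool.

```python
import math
from fractions import Fraction
from typing import Tuple

MIN_DURATION_S = Fraction(5, 6)  # minimum event duration quoted above
MAX_DURATION_S = Fraction(7)     # maximum event duration quoted above


def duration_limits_in_frames(fps: int) -> Tuple[int, int]:
    """Return (min_frames, max_frames) for an integer frame rate.

    Assumption: the minimum is rounded up and the maximum is rounded down
    to a whole frame when the frame rate does not divide evenly.
    """
    min_frames = math.ceil(MIN_DURATION_S * fps)
    max_frames = int(MAX_DURATION_S * fps)
    return min_frames, max_frames


def check_duration(in_frame: int, out_frame: int, fps: int = 24) -> str:
    """Classify an event as 'too short', 'too long', or 'ok'."""
    min_frames, max_frames = duration_limits_in_frames(fps)
    duration = out_frame - in_frame
    if duration < min_frames:
        return "too short"
    if duration > max_frames:
        return "too long"
    return "ok"


print(duration_limits_in_frames(24))  # (20, 168)
print(check_duration(100, 115))       # too short (15 frames)
```

Working in whole frames and exact fractions avoids the small rounding errors that creep in when durations are compared as floating-point seconds.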

Timing to audio

Under normal circumstances, dialogue needs to be timed to the audio. Subtitles should have an in-time which is on the first frame of audio or as close to it as possible (within 1-2 frames of the first frame of audio is acceptable) unless the scenario falls into the timing to shot change rules (see below). The out-time can be extended up to half a second past the timecode at which the audio ends.

2-frames rule, chaining, and 12-frames rule

Around 2020, Netflix modified its timing rules and adopted the "chaining" (also called "linking" or "closing gaps") method. I assume this was done to follow general subtitling best practices and what had already been the general approach of other media or streaming services, such as YouTube. The chaining method basically means that when timing a sequence of subtitles, you create a run of subtitles with even gaps: wherever the gap between two subtitles is shorter than half a second, you extend the out-time of the previous subtitle to two frames before the in-time of the next one. Chaining relies on the 12-frames rule, that is, you adjust the subtitle event by no more than 12 frames, just enough to close any gaps between subtitles while respecting the 2-frames rule between all subtitles.
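The chaining idea can be sketched in a few lines of Python. This is only an illustration of the rule as I described it, not Netflix's implementation: the `Event` class, the `chain` function, and the assumption that events are sorted, non-overlapping, and expressed in frames are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Event:
    in_frame: int
    out_frame: int
    text: str = ""


def chain(events: List[Event], fps: int = 24, min_gap: int = 2) -> List[Event]:
    """Close gaps shorter than half a second down to `min_gap` frames.

    Assumes `events` is sorted by in-time and does not overlap.
    Returns new Event objects; the input list is left untouched.
    """
    half_second = fps // 2  # 12 frames at 24 fps
    result = [Event(e.in_frame, e.out_frame, e.text) for e in events]
    for prev, nxt in zip(result, result[1:]):
        gap = nxt.in_frame - prev.out_frame
        if gap < half_second:
            # Move the previous out-time to two frames before the next in-time.
            prev.out_frame = nxt.in_frame - min_gap
    return result


events = [Event(0, 40), Event(48, 90), Event(120, 160)]
print([(e.in_frame, e.out_frame) for e in chain(events)])
# [(0, 46), (48, 90), (120, 160)]
```

In the example, the 8-frame gap between the first two events is closed to 2 frames, while the 30-frame gap before the third event is left alone. A real tool would also have to re-check duration and reading speed after each adjustment.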

Hint: A gap of two frames is barely detectable by the human eye, while a longer gap can be noticed and feels as if the viewer's eyes jump twice, especially around shot changes.

Timing to shot changes

Complications start when the dialogue starts or overlaps a shot change. According to my understanding, timing to shot changes plays a major role in all Netflix subtitles. Timing to a shot change and audio are key aspects of the subtitling process which contribute to the ease with which subtitles are read by viewers.

Netflix's rule is that when the dialogue starts on the shot change or within half a second past the shot change, the in-time is set to the first frame of the shot change. Here, half a second really means 12 frames (at 24 fps).

To simplify things, Netflix's shot change timing rules boil down to a red zone and a green zone. What does that mean?

· Red zone: If the audio starts or ends within 12 frames of the shot change, the subtitle needs to be timed to the shot change while respecting the two-frames rule between subtitles. That is to say, in-times and out-times may be brought forward or extended to be in sync with shot changes within a half-second window in order to create an even viewing experience and to allow the subtitles to fit neatly within the edited content. This is simple in practice: just drag the subtitle and move it to the shot change. You can be assured that there are many studies behind this sometimes illogical-looking rule.

· Green zone: If the audio starts or ends more than 12 frames away from the shot change (before or after it), the subtitle should be timed to the audio.
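The red zone / green zone simplification can be expressed as a small Python sketch. It reflects only my simplified reading of the rule, with a 12-frame window at 24 fps. The function names and the list of shot-change frames are hypothetical, and the sketch deliberately ignores the extra spacing that a real workflow applies between an out-time and a shot change.

```python
from typing import Iterable


def zone(frame: int, shot_change: int, fps: int = 24) -> str:
    """Return 'red' if `frame` is within half a second of the shot change."""
    window = fps // 2  # 12 frames at 24 fps
    return "red" if abs(frame - shot_change) <= window else "green"


def snap_to_shot_change(frame: int, shot_changes: Iterable[int], fps: int = 24) -> int:
    """Move an in- or out-time onto the nearest shot change if it is in the red zone.

    In the green zone (or when there are no shot changes) the frame is
    returned unchanged, i.e. the subtitle stays timed to the audio.
    """
    nearest = min(shot_changes, key=lambda sc: abs(sc - frame), default=None)
    if nearest is not None and zone(frame, nearest, fps) == "red":
        return nearest
    return frame


shot_changes = [100, 200]
print(snap_to_shot_change(105, shot_changes))  # 100 (red zone: snapped)
print(snap_to_shot_change(150, shot_changes))  # 150 (green zone: timed to audio)
```

After snapping, the two-frame gap between neighboring subtitles still has to be checked separately.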

Netflix also advises to "apply good judgment when determining if a subtitle looks like it is hanging on-screen for too long and apply timing adjustments accordingly".

Hint: When I did the subtitling test in 2018, the "12 frames" was "7 frames", roughly half the window specified in the latest guideline. This created a more "jumpy" experience for the viewer, and it also meant that the chaining method was not used. The image above shows this "older" approach, particularly timing around shot or scene changes. The 2-frames rule was applied even then, and that never changed.

Hint: In general, I have been advised to always apply these two major timing rules when doing subtitles, no matter what:

· Respect the two-frame gap.

· Follow the shot change rule.

Hint: Use common sense when trying to figure out these rules. For example, no one wants the dialogue to travel from Sweden to Mexico when the scene is also changing with the shot.

Reading Speed

Most studies that incorporate eye-tracking mechanisms show that the comfortable reading speed for the general public ranges from 9 to 15 characters per second, or cps. These measurements are based on objective cps, i.e. they solely take the sheer character count in a specific period of time into account.

As of now, Netflix's guidelines, and thus their proprietary subtitling tool Originator (a central hub for project managers, translators, and vendors that also includes a subtitling environment, similar to a translation environment with added features), allow up to 17 cps without any warnings. This is usually plenty. If the speed exceeds around 23.5 cps, the platform flags it as an “error.” Anything between these two numbers is labeled as a “warning.” You can set up very similar warnings and errors in most subtitling tools to make sure that you don't exceed the reading speed.

For adult programs, Netflix suggests 17 cps as a limit, while the limit for children's programs is 15 cps for the above-mentioned reasons. The numbers can vary from language to language, so you want to make sure to check out your language's or client's requirements before setting up any subtitling environments or CAT tools.
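If your tool does not offer such checks, the calculation is easy to reproduce. The following Python sketch computes the objective cps of an event and classifies it with the thresholds mentioned above. The function names are hypothetical, and counting every character except the line break (spaces and punctuation included) is my assumption; individual tools may count differently.

```python
def chars_per_second(text: str, in_frame: int, out_frame: int, fps: int = 24) -> float:
    """Objective reading speed: character count divided by on-screen time."""
    duration_s = (out_frame - in_frame) / fps
    if duration_s <= 0:
        raise ValueError("out_frame must be later than in_frame")
    # Assumption: line breaks are not counted, everything else is.
    return len(text.replace("\n", "")) / duration_s


def classify_reading_speed(cps: float, warn_above: float = 17.0,
                           error_above: float = 23.5) -> str:
    """Return 'ok', 'warning', or 'error' for a reading speed in cps."""
    if cps > error_above:
        return "error"
    if cps > warn_above:
        return "warning"
    return "ok"


cps = chars_per_second("Hello, world!", in_frame=0, out_frame=24)
print(cps, classify_reading_speed(cps))  # 13.0 ok
```

For a children's program you could call `classify_reading_speed(cps, warn_above=15.0)` to apply the lower limit.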

Hint: You may encounter comments from your client or even from the general public that "the subtitles only include a part" or "the character used 8 words but it was really 10". Keep in mind that only about 60% to 70% of the dialogue can be conveyed through subtitles; otherwise, they would not be readable. Some people read faster; some do not. Some are already proficient enough in English so it’s easier for them; most aren’t. The subtitler's goal is to serve the majority. Perhaps that is the reason Netflix refers to their subtitle creation process as "origination" rather than translation or localization.

Positioning

We don't really think about this part, but the correct positioning of subtitles is crucial in order to provide a seamless user experience.

In general, all Netflix subtitles should be center justified, center-aligned, and placed at the bottom of the screen or raised to the top to avoid clashes in the lower third.

The number of lines should not exceed two but the preferred treatment is to keep all text on one line if possible.

Subtitles are positioned to avoid overlap with on-screen text (e.g. forced narratives), mouths, faces, and important actions happening in the lower third of the screen. This matters: just imagine that a character shows something essential to understanding the scene and you cover it with the subtitles. In cases where overlap cannot be avoided (text at both the top and the bottom of the screen), the subtitle should be placed at the bottom of the screen and the forced narrative at the top (or even left out if it's not plot pertinent!). And finally, if you need to raise some of your subtitles or on-screen texts, make sure that you raise the whole sequence (for example, your first 5 opening subtitles) consistently so that the viewer's eyes do not have to jump up and down.

Linguistic Best Practices

Line Treatment and Line Balance

Netflix's guidelines do not allow more than 2 lines for subtitles, and even with that, their preference is to fit the text on one line whenever it's possible. However, if the line needs to be broken, the following rules are to be followed:

· The line should be broken:

    · after punctuation marks

    · before conjunctions

    · before prepositions

· The line break should not separate:

    · a noun from an article

    · a noun from an adjective

    · the first name from the last name

    · a verb from a subject pronoun

    · a prepositional verb from its preposition

    · a verb from an auxiliary, reflexive pronoun, or negation

If the text is to be broken into two lines, you should always aim for a balanced look, that is, a pyramid shape with the first line slightly shorter than the second. A line with too few or too many characters is harder to read. A balanced break is not always doable, so rephrase whenever it will result in a better break. You can also add a filler word to the shorter line if appropriate. Generally, avoid having one short word on a line.
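As a rough illustration of how these preferences could be automated, here is a Python sketch that picks a break point for a line longer than 42 characters. It is a toy heuristic under my own assumptions: the `break_line` function, the tiny English word list, and the priority order (after punctuation first, then before a listed conjunction or preposition, then best balance) are hypothetical, and the sketch knows nothing about grammar, so it cannot enforce the "should not separate" rules.

```python
from typing import Optional

MAX_CHARS = 42
# Illustrative English sample only; a real list is language-specific.
BREAK_BEFORE = {"and", "but", "or", "in", "on", "at", "with", "for", "to"}
PUNCTUATION = ",.;:!?"


def break_line(text: str, max_chars: int = MAX_CHARS) -> Optional[str]:
    """Return `text` on one or two lines, or None if it cannot fit."""
    text = " ".join(text.split())
    if len(text) <= max_chars:
        return text  # one line is always preferred
    words = text.split(" ")
    best = None  # ((category, imbalance), top, bottom)
    for i in range(1, len(words)):
        top, bottom = " ".join(words[:i]), " ".join(words[i:])
        if len(top) > max_chars or len(bottom) > max_chars:
            continue
        if top[-1] in PUNCTUATION:
            category = 0  # break after punctuation
        elif words[i].lower() in BREAK_BEFORE:
            category = 1  # break before a conjunction or preposition
        else:
            category = 2
        imbalance = abs(len(top) - len(bottom))
        if len(top) > len(bottom):
            imbalance += 2  # mild preference for a shorter first line
        score = (category, imbalance)
        if best is None or score < best[0]:
            best = (score, top, bottom)
    if best is None:
        return None  # needs rephrasing or splitting into two events
    return best[1] + "\n" + best[2]


print(break_line("I told you yesterday, but you never listen to anything I say."))
# I told you yesterday,
# but you never listen to anything I say.
```

A result like this is only a starting point: a human still has to check that the break does not split a name or leave a single short word on a line, and rephrase when it does.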

With a little practice and thinking, these rules become second nature; I even noticed that it hurts my eyes when I see a line break "violation" in other localization projects. It's just so logical, and why not follow a consistent and logical line break rule on every localization project?

Hint: Never sacrifice your language and good grammar in order to fit your text in one line. Some languages naturally expand. I have seen (bad) cases when the translator used unnecessary abbreviations, slang, or sloppy language to satisfy the one-line request. That is a big no-no. Language first, then technical requirements. Believe it or not, a good audio-visual localizer finds a way to satisfy them both.

Spelling, grammar, and punctuation

One of the error codes you can receive from a QCer is the SPG error when your subtitle event contains either a spelling, grammar, or punctuation error.

When "originating", try to accommodate your language's grammar rules while also not forgetting that subtitling has its own conventions. Languages evolve fast, and official spellings of words might become obsolete over time.

Hint: One good practice I followed in all my subtitling projects was not "to reinvent the wheel". Yes, you need to be creative, but I cannot imagine your language does not offer enough room to find the right word or idiom, even if it's very challenging to localize content. You don't need to invent, improvise, or reconstruct already existing words or rules. Just think about it; do you think your audience will understand it if you start using phrases that sounded like a brilliant idea but only you know their meaning? Not likely.

Localizing numbers, currencies, dates, etc.

When localizing numbers, currencies, dates, and such things, the best practice is to follow your language's or client's guidelines and rules. If they are not available, create a separate spreadsheet for yourself and use it as a reference.

Objective vs. Subjective Translation Errors

For content translation, you can be flagged for either objective or subjective errors by QCers.

Subjective errors are meant to cover cases when the translation is not entirely wrong or bad but the viewer may not benefit from it as much as if it had been translated slightly differently.

Objective errors are given when you commit a mistranslation or the content is completely off relative to the original intention.

Hint: My general advice is: don't be afraid of creative ideas; don't translate word by word but try to find solutions in your language rather than sticking to the source's form; use your best judgment and be a little conservative as well. I have seen subtitle translations that were so eloquent and "innovative" that they drew attention away from the production itself. Be humble, and know that what you are creating is just a small portion of the whole. The subtitle exists for the production and not the other way around.

Terminology and consistency

Just like in any other translation or localization project, consistent terminology is crucial. Netflix has a separate "glossary" database for their "Key Names and Phrases", also known as KNP. Each show has its own KNP.

Netflix provides a publicly available KNP template here:

https://docs.google.com/spreadsheets/d/11u-tsOJq1r2HJy_ds7pD95C-Vy9jKEfT5wTYhhtNHAg/edit#gid=1062876492

Hint: You should always have a glossary dedicated to each of your subtitling productions. You never know how the production will scale or grow, and it's useful to remember how you translated your favorite character's nickname in the first season when you are already working on the fifth. :)

Other cases when general subtitling rules may differ

The above-mentioned rules - particularly the timing rules - apply to all general subtitles, but there are some special cases when it is necessary to follow different guidelines or to apply a more flexible approach in order to satisfy the intended objective of the situation.

The most common situations when we need to slightly adjust our general approach or bend the rules are when we use forced narratives, when we create closed captioning, or in the case of supplementals or trailers. Let's look at each of them.

Forced Narratives

A Forced Narrative (FN) subtitle is a text overlay that clarifies communications or alternate languages meant to be understood by the viewer. They can also be used to clarify dialogue, texted graphics, or location/person IDs that are not otherwise covered in the dubbed/localized audio. To enable the same viewing experience across multiple countries and devices, FN subtitles are localized and delivered as separate timed text files.

Simply put, a forced narrative is any additional text that appears in order to clarify on-screen text or foreign-language dialogue.

FN subtitles are used in the following cases:

· Short segments of foreign-language dialogue, intended to be understood by the audience, that differ from the original language of the show.

· Translation of original language location/person IDs, dates, or other labels (e.g. “White House, December 10”). As a creative element, these text graphics are usually burned into the image.

· Communication that would not otherwise be commonly understood (e.g. sign language, a subtitled dog, Klingons, etc.).

· Transcribed dialogue in the same language, often done for audience clarification (if audio is inaudible or distorted, commonly in documentaries).

In terms of timing, general timing rules don't apply to forced narratives unless they are covering foreign dialogue. The text mimics the duration of the on-screen time whenever it's possible. If it's not possible to include both subtitle and forced narrative, for example, because of truncating or timing errors, the most plot pertinent event is given priority.

Subtitles vs. CC/SDH

Subtitles for the deaf and hard of hearing (SDH), also known as closed captions (CC), are typically created in the language of the original audio, for example for English-speaking viewers of an English-language show. The main difference between subtitles and closed captions is that subtitles are meant to help the viewer understand a foreign-language production in their native language, while closed captions, which can be turned on and off in most cases, are intended to include all dialogue and other audio effects.

When working on a closed captioning project, you should always remember to include everything in the subtitles. For example, while in regular subtitles (or open captions) you should not repeat a word in the subtitle, in closed captioning, you must include it exactly as it was said. That is, with closed captioning, you are creating a mirror image of the audio while in subtitles you are localizing to have a separate timed-text project to convey the meaning of your show. Therefore, in subtitles, you can leave out words that are not as important in order to enhance timing or reading speed, while this is not possible with closed captioning. Naturally, you will have more challenges adhering to the timing rules with the SDH events. Moreover, closed captioning includes all audio effects, such as coughing or laughing, that are otherwise not obvious to the viewer, as well as the name or identifier of the character who is speaking if they are not visible on screen.

Hint: Don't include sound effects that are otherwise visible to the viewer, such as a door banging when that is clearly shown on screen.

Hint: Create your own language-specific cheat sheet for the most commonly used sound cues and have it handy when working on a CC project. While you may think that it's easy to remember the sound cues used for slamming a door or loading a gun, you would be surprised how often they don't come to mind when you are focusing on other things, such as catching the right word, timing, etc. (You can download my Hungarian cheat sheet for commonly used SDH identifiers, such as speaker identifiers, sound cues, and non-verbals here. It also includes music genres and music descriptors. The content is in Hungarian, but you can still create a similar one in your language or you can translate mine so that you have your own language-specific list!)

Short forms vs. Main production

Supplementals, also called trailers, are short form productions, usually a few minutes long, which are meant to catch the viewers' attention and serve as an introduction or "hook" for the main production.

Challenges of translating short forms vs. long forms:

· Higher exposure: It's likely that more people will watch a trailer in its entirety than the long-form production. If you make a spelling error within the first five minutes of a show that lasts 45 minutes in total, people will not remember it as much as if you make one single spelling mistake in a two-minute short form.

· Giving enough without giving it away: In short forms, it's not possible to gradually build up the dialogue as you can in long forms. Your dialogue needs to be clear, easily understood, and squeezed into a very short time frame, sometimes just half a minute. The content needs to be more of a hint than a narration that intends to explain everything.

· Fitting the subtitle events into a fast-cut environment: For trailers, timing sometimes has to be “loosened” so that reading speed requirements can be met. Due to the quick cuts of trailers, the shot change timing rules cannot always be followed, so extending the duration of subtitles in these cases is needed and expected even if it means breaking the 2-frames or 12-frames rules.

Hint: When working on an intro or "hook" file for your main project, always try to mirror the main production to the extent possible without giving away any puns or spoilers.

Conclusion

Globalization has increased the need for subtitles in video content as part of localization and other processes. Subtitles and captions (closed or open) play an integral role in bolstering viewer engagement, growing the audience, and supporting search engine optimization (SEO) and content discovery.

How captions are presented, both visually and structurally, can have a serious impact on the viewers' understanding and enjoyment of the content. The difference between a good and a bad experience usually comes down to minor, fixable issues.

Obviously, there is much more to say about the entire subject, but I hope you will find my initial post helpful to fix those "minor" issues and enter the world of audiovisual localization and translation with full confidence!

Resources:

https://partnerhelp.netflixstudios.com/

https://netflixtechblog.com/

https://slator.com/how-netflix-does-subtitling-for-the-world-ex-china/

https://blog.andovar.com/captions-subtitles-explained

https://translationjournal.net/October-2016/how-to-do-subtitles-well-basics-and-good-practices.html

Images courtesy of Netflix (partnerhelp.netflixstudios.com) and Unsplash public domain

Music in video: Bensound.com

Appendix:

SDH Identifiers Table - HU
