
NMT Customization Pilot Project



This month-long pilot project aimed to train Microsoft Translator's NMT engine and develop a custom Neural Machine Translation model for translating MOSTRA's 2021 editorial and media press releases from English into Brazilian Portuguese.

The project validated the MT engine for the above-mentioned purpose.

My team and I prepared a Statement of Work for our client with the following details: Objectives, including goals related to quality, efficiency, and costs; Project Timeline; Processes and Workflows; Detailed Costs Table; Deliverables.

By the end of the month, we were able to give our client a clear, data-backed answer on whether it is worth investing in training the engine or hiring human translators instead. We also created a Lessons Learned Video Presentation about the process.

Scroll down to see the proposed timeline of the project, download the sample files, and/or watch the video presentation.

PROJECT TIMELINE

March 1 - March 28

March 1st

KICKOFF MEETING

The official start date of the project; Kickoff Meeting with the client; Proposal presentation and Q&A.

by March 5th

PREPARATION PHASE

Preparation of the project including data mining, data cleaning, data alignment, setting up the workspace, etc.
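
To give a rough idea of what the data cleaning step can look like, here is a minimal Python sketch. It is not the script we used in the pilot; the function name `clean_parallel_pairs`, the `max_len_ratio` threshold, and the sample segments are all hypothetical and only illustrate common checks on aligned English and Brazilian Portuguese segment pairs: whitespace normalization, dropping empty segments, dropping pairs with a suspicious length mismatch, and removing duplicates.

```python
def clean_parallel_pairs(pairs, max_len_ratio=3.0):
    """Filter (source, target) segment pairs before MT training.

    max_len_ratio is an assumed threshold, not a value from the pilot.
    """
    seen = set()
    cleaned = []
    for src, tgt in pairs:
        # Collapse repeated whitespace and trim both sides.
        src = " ".join(src.split())
        tgt = " ".join(tgt.split())
        # Drop pairs where either side is empty.
        if not src or not tgt:
            continue
        # Drop pairs whose lengths differ too much (likely misaligned).
        ratio = max(len(src), len(tgt)) / min(len(src), len(tgt))
        if ratio > max_len_ratio:
            continue
        # Drop case-insensitive duplicates.
        key = (src.lower(), tgt.lower())
        if key in seen:
            continue
        seen.add(key)
        cleaned.append((src, tgt))
    return cleaned


# Hypothetical sample data, for illustration only.
sample = [
    ("The festival opens  today.", "O festival abre hoje."),
    ("The festival opens today.", "O festival abre hoje."),
    ("", "Linha sem origem."),
    ("Tickets", "Os ingressos estarão à venda a partir de segunda-feira na bilheteria."),
]

print(clean_parallel_pairs(sample))
```

In this sample, only the first pair survives: the second is a duplicate after whitespace normalization, the third has an empty source, and the fourth fails the length-ratio check.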

by March 19th

NMT TRAINING

10 MT training runs, 1 training/day.

by March 28th

ANALYSIS

MT output analysis, Post Editing, QA.

March 28th

DELIVERY

Delivery of completed items, findings, conclusion, and updated proposal.

You can see and download all the project files here:

NMT Pilot Project Downloadable Files

  • Statement of Work Initial Proposal
  • Statement of Work Updated Proposal
  • Lessons Learned Video Slides Presentation

Lessons Learned Video Presentation

At the end of the project, we created a video presentation that walks through the elements and workflows of our month-long customization project and describes the challenges we faced and how we overcame them. Watch the video at this link:

