INNOVATION: Actionable AI – How to Improve Translation Speed + Quality with NLP

welocalize May 21, 2019

Google and Welocalize recently shared the stage in San Francisco as part of Google Cloud Next ’19. Olga Beregovaya, Welocalize’s VP of Language Services, and Sarah Weldon, Google’s Product Manager for Cloud Translation API and AutoML Translation, presented the latest case study of actionable AI using AutoML Translation.

Professional translators and terminologists know a good glossary is like gold. AutoML’s latest release includes a new glossary feature. It gives users more granular control over MT output across multiple languages for content such as brand IP, location names, or other terminology tailored to an individual company or brand.

“Welocalize continues to be at the forefront of multilingual solutions for their global clients, continually helping to increase speed and scale. They recently started using our new glossary feature within Google AutoML Translation API [v3] to enable companies to maintain control of brand-specific terminology within translation workflows. Welocalize were able to increase accuracy, efficiency, and fluency for all languages – as high as 20% in some cases.” 
Sarah Weldon, Product Manager, Google

Importance of a Customized Glossary in MT

For any brand, an important asset when transforming content into other languages is the development of a glossary, also known as a terminology base. A [customized] glossary enforces consistency and content style in line with the brand, in all target languages. Plus, it eliminates uncertainty in the overall content transformation process, which adds up to shorter translation times, better-quality output, and lower costs over time. Even with small datasets or none at all, integrating a glossary can customize output and increase the performance of generic and pre-trained MT engines.
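To make the idea of a terminology base concrete, here is a minimal sketch in Python. It is not how AutoML Translation applies a glossary internally; it is only a hypothetical after-the-fact check that a translation uses the required terms. The glossary entries, the function name `missing_terms`, and the sample sentences are all invented for illustration.

```python
import re

# Hypothetical glossary (English -> German): source term -> required target term.
GLOSSARY = {
    "Cloud Translation": "Cloud Translation",  # brand name, must stay untranslated
    "post-editor": "Post-Editor",
}


def missing_terms(source, target, glossary):
    """Return required target terms that the source calls for but the target lacks."""
    missing = []
    for src_term, tgt_term in glossary.items():
        # Case-insensitive literal match of the source term in the source text.
        in_source = re.search(re.escape(src_term), source, flags=re.IGNORECASE)
        # Flag the term if the source uses it but the target omits its required form.
        if in_source and tgt_term.lower() not in target.lower():
            missing.append(tgt_term)
    return missing


source = "The post-editor reviews Cloud Translation output."
good = "Der Post-Editor prüft die Ausgabe von Cloud Translation."
bad = "Der Nachbearbeiter prüft die Ausgabe der Wolkenübersetzung."

print(missing_terms(source, good, GLOSSARY))
print(missing_terms(source, bad, GLOSSARY))
```

A check like this only reports inconsistencies after translation; the glossary feature described in this article is applied by the MT engine itself.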

WATCH NOW: Case Study: Google AutoML Translation + Customized Glossary

Using glossary case studies, Welocalize’s Olga Beregovaya compares Google pre-trained models vs Google pre-trained models with customized glossary using a selection of Romance, Germanic, Asian, and Slavic language groups.

As Olga observes in her presentation, ‘the gains in accuracy and fluency for all the languages using the glossary feature comes out on top for both on Google with Pre-Trained models and on Google with AutoML.’

“We process hundreds of millions of words per year using machine translation in widely disparate enterprise client scenarios. The ease of customization and API consumption allows us to enforce broad terminology coverage for both clients with voluminous data in Google AutoML Translation and clients with sparse data in Google Cloud Translation API. Our initial benchmarking in five languages shows a preference for translation with glossary as much as 20% over the non-glossary.”

Olga Beregovaya, VP of Language Solutions, Welocalize

Key Highlights

Even without large datasets, integrating a glossary using the AutoML glossary feature can improve accuracy and fluency results by as much as 20% in some cases.

    • Higher productivity + quality
    • Better accuracy + fluency results
    • Reduced time required by translators + post-editors
    • More consistent, tailored content

Welocalize is an early access partner for Google AutoML and continues to support Google in enhancing machine learning for translation challenges. Welocalize has deep practical experience customizing NMT to power content transformation solutions. Our solutions enable global brands to manage high volumes and achieve quality levels that drive user engagement with translated content.

Join us at the Global Transformation Summit in Silicon Valley on 6 November.