The State of AI, Machine Learning, and NLP in Localization

Welocalize, October 12, 2021

We’re constantly hearing about the cutting-edge technologies and incredible innovations happening thanks to Artificial Intelligence (AI). McKinsey’s global survey, The State of AI in 2020, found that most organizations have increased investment in AI and have adopted AI in at least one business function. Likewise, Servion claims that by 2025, AI will power 95% of all customer interactions.

AI has a reputation for being a bit of a buzzword, and for many it is still a new concept. AI, Machine Learning (ML), and Natural Language Processing (NLP) are often used interchangeably – but what exactly are these technologies, and what do they mean for the language and localization industry?

Our VP, AI Innovation, Olga Beregovaya, recently chatted with Slator about all things AI, ML, NLP, and more – taking it back to basics to explain these technologies, their meaning, their history, and their current applications.



Here are some of the key highlights from SlatorPod #87:

Broad Applications of AI

Nowadays, we see AI everywhere. For many global businesses, AI could be behind their supply chain and logistics, powering a company’s warehouse. If you go online to buy tickets, AI is behind selecting and recommending that particular ticket. When you visit a streaming site, the personalization engines recommending what to watch are AI-driven, based on your past behavior and previous selections.

Similarly, NLP powers conversational AI and chatbots. For example, when you visit a website or log onto your bank page and a chatbot pops up, it is AI-enabled.

“Practical applications of AI have evolved over the past five years and even more so the past three years. Now we see those technologies evolving, so I would say that practical applications are way more flexible.” – Olga Beregovaya

History of AI in the Language Industry

First and foremost, there was the introduction of translation memory systems…

“Those were pretty successful implementations of early days NLP algorithms based on fuzzy matching algorithms. Those were translators’ first introductions to actual productivity gains from NLP,” noted Olga.
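To make the fuzzy-matching idea concrete, here is a minimal sketch of how a translation memory lookup might work. It scores a new segment against stored source segments and returns the closest one above a threshold. The sample memory, the function names, and the 0.75 threshold are illustrative assumptions, and the similarity ratio from Python's standard difflib module stands in for whatever scoring a real translation memory tool uses.

```python
from difflib import SequenceMatcher

# Hypothetical translation memory: English source segment -> stored French translation.
TRANSLATION_MEMORY = {
    "Click the Save button to store your changes.":
        "Cliquez sur le bouton Enregistrer pour sauvegarder vos modifications.",
    "Restart the application to apply the update.":
        "Redémarrez l'application pour appliquer la mise à jour.",
}


def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity ratio between two segments, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def best_fuzzy_match(segment, memory, threshold=0.75):
    """Return (source, translation, score) for the closest stored segment,
    or None when nothing reaches the threshold (an assumed cut-off)."""
    best = None
    for source, target in memory.items():
        score = similarity(segment, source)
        if score >= threshold and (best is None or score > best[2]):
            best = (source, target, score)
    return best


if __name__ == "__main__":
    match = best_fuzzy_match("Click the Save button to keep your changes.", TRANSLATION_MEMORY)
    if match:
        source, target, score = match
        print(f"{score:.0%} match: {source!r} -> {target!r}")
```

A translator would then edit the suggested translation rather than start from a blank segment, which is where the productivity gain comes from.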

NLP then moved into terminology management:

“Then we came to the whole terminology management universe which used more advanced NLP because it involved using character-based, frequency-based algorithms, and extracting terminology. A lot of those statistical terminology extractions are still in use today. We started plugging in rule-based machine translation systems, which did not render humongous productivity, but some rule-based legacy we are still using now.”
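As a rough illustration of frequency-based terminology extraction, the sketch below counts one- and two-word phrases and keeps those that recur. It is a simplified, word-level example only: the stop-word list, the function name, and the minimum count are assumptions, and production extractors add further statistics and linguistic filtering.

```python
import re
from collections import Counter

# Small illustrative stop-word list; a real extractor would use a fuller one.
STOP_WORDS = {"the", "a", "an", "of", "to", "in", "and", "is", "for", "on", "your", "it"}


def candidate_terms(text, max_words=2, min_count=2):
    """Return (phrase, count) pairs for phrases of up to max_words words
    that occur at least min_count times, most frequent first."""
    counts = Counter()
    # Work sentence by sentence so phrases never span a sentence boundary.
    for sentence in re.split(r"[.!?]+", text):
        tokens = re.findall(r"[a-z]+", sentence.lower())
        for n in range(1, max_words + 1):
            for i in range(len(tokens) - n + 1):
                gram = tokens[i:i + n]
                # Skip phrases that start or end with a stop word.
                if gram[0] in STOP_WORDS or gram[-1] in STOP_WORDS:
                    continue
                counts[" ".join(gram)] += 1
    return [(term, count) for term, count in counts.most_common() if count >= min_count]


if __name__ == "__main__":
    sample = (
        "Open the control panel. The control panel lists every print queue. "
        "Select a print queue to pause it."
    )
    for term, count in candidate_terms(sample):
        print(count, term)
```

A terminologist would review the resulting candidates before adding any of them to a glossary; frequency alone only proposes terms.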

Impact of AI on Human-in-the-Loop

Professionals in the language industry now interact with the outputs of AI in a variety of tasks, such as annotation, post-editing, and validating the outputs of summarization. The whole language profession is evolving in many directions thanks to AI.

Olga shared: “We still need humans to generate the initial datasets to further generate data to create robust AI systems. To make generative models less biased and more accurate and inject reasoning into those language models. Right now, they cannot reason, and we want to teach them to reason so we can use AI for many more languages.”

How Do You Make Conversational AI Multilingual?

For global businesses, chatbots can make the customer service experience much more interactive and usable.

“Knowledge bases still exist, but how many knowledge bases have now been converted to your chatbot experience? Instead of building huge online help systems, you will be developing training data for chatbots which are data-hungry.”

One way of making chatbots multilingual is to use machine translation (MT). There are multiple chatbot systems where you build your chatbot in English and then plug in MT. However, Olga noted, “you do not get the same level of user engagement with machine translation that you get with English.”

Another way is to rewrite the scenarios and build a whole library of utterances for each language the chatbot must support, in effect rebuilding the chatbot from scratch in that language.

“Both approaches generate substantial amounts of data to train your global chatbots. The time may come when chatbots will be trainable on unlabeled data but for now you need annotated data to train chatbots.”
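The two approaches can be contrasted in a short sketch. Everything in it is hypothetical: the keyword-based bots, the Spanish utterance library, and the lookup table standing in for an MT engine are toy assumptions chosen to show the shape of each design, not how any particular chatbot platform works.

```python
from typing import Callable

# A translator takes (text, source_lang, target_lang) and returns translated text.
Translator = Callable[[str, str, str], str]


def english_bot(message: str) -> str:
    """Toy English-only bot that matches a single keyword."""
    if "refund" in message.lower():
        return "I can help you with a refund."
    return "Sorry, I did not understand that."


def mt_wrapped_bot(message: str, user_lang: str, translate: Translator) -> str:
    """Approach 1: translate the user's message into English, run the
    English bot, then translate the reply back into the user's language."""
    english_in = translate(message, user_lang, "en")
    english_out = english_bot(english_in)
    return translate(english_out, "en", user_lang)


# Approach 2: a separately authored utterance library per language.
NATIVE_UTTERANCES = {"es": {"reembolso": "Puedo ayudarle con un reembolso."}}
NATIVE_FALLBACK = {"es": "Lo siento, no le he entendido."}


def native_bot(message: str, user_lang: str) -> str:
    """Answer from the language-specific library, with no MT involved."""
    text = message.lower()
    for keyword, reply in NATIVE_UTTERANCES[user_lang].items():
        if keyword in text:
            return reply
    return NATIVE_FALLBACK[user_lang]


# Stand-in for a real MT engine: a tiny lookup table that returns the
# input unchanged when it has no entry.
_TOY_MT = {
    ("Quiero un reembolso", "es", "en"): "I want a refund",
    ("I can help you with a refund.", "en", "es"): "Puedo ayudarle con un reembolso.",
}


def toy_translate(text: str, source_lang: str, target_lang: str) -> str:
    return _TOY_MT.get((text, source_lang, target_lang), text)


if __name__ == "__main__":
    print(mt_wrapped_bot("Quiero un reembolso", "es", toy_translate))
    print(native_bot("Quiero un reembolso", "es"))
```

In the first design, reply quality in each language depends on the MT engine; in the second, every supported language needs its own authored and annotated utterances, which is why both approaches are data-hungry.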

Key AI Language Trends

Olga highlighted that advancements in neural machine translation (NMT) are something we can expect in the next several years, particularly for long-tail languages and domains.

“Advancements in NMT are expected and even more so when it comes to long-tail languages. The continued expansion into more remote regions automatically drives demand for translation in long-tail languages.”

As the technology evolves, it’s likely that MT will come closer to human parity and that NMT will improve at accurately capturing meaning.

Olga also touched upon integrating dynamically trained MT for NLP and AI-enabled work services:

“NLP is just a tiny chunk of what we can do with AI in our industry. AI is a part of the overall digital transformation in our industry. We see AI driving supply chains, AI driving logistics, the same should be happening in the language industry.”


Here’s a glossary of commonly used AI terms in localization:

Artificial Intelligence (AI): AI is a branch of computer science that teaches machines to mimic human reasoning and human behavior.

Machine Learning (ML): ML is the actual technology base of AI. It is the set of techniques and knowledge that enables machines to learn from data, improve on their own, and evolve.

Natural Language Processing (NLP): A subfield of AI and ML which applies the technique and knowledge to the analysis and creation of natural language and speech.

Natural Language Understanding (NLU): NLU is a subfield of NLP that enables an application to receive, analyze, and understand the intent of the text or speech to build and enable what feels like human-to-human interactions.

Natural Language Generation (NLG): NLG is a subfield of NLP that involves the generation of written or spoken natural language that is used to create or augment machine learning data.

Conversational AI: Conversational AI is technology that uses NLP to allow a computer or a program to carry out conversations with humans.


Learn More

You might also like Welocalize’s guide to Conversational AI, which contains great insights on leveraging AI systems (like chatbots) in multiple languages to deliver a great customer experience.

For more information on Welocalize AI Services, contact the team here.