How Sentiment Analysis and MT Can Help You Make Sense of UGC Content

User-generated content (UGC) plays a key role in global business, localization and marketing strategies. A growing number of consumers post and share comments and reviews about products, services and brand experiences. Many global brands have realized how valuable it is to harness the power and knowledge of their users and encourage conversations, discussions and opinion-sharing. Global companies like TripAdvisor, eBay, Facebook and YouTube are based on business models that share and rank user opinions.

TripAdvisor, the world’s largest travel site, processes over 320 million reviews per month! UGC is often posted in more than one language, and a growing area in the localization industry is translating and understanding UGC to monitor what multilingual consumers are saying about a brand and its products. This is called social listening.

By gathering and understanding UGC, businesses can use this data to promote further online sales, develop online digital marketing campaigns and provide feedback to product development.

FACT: 25% of Search Results for the world’s 20 largest brands are links to user-generated content. Source: Kissmetrics

One tweet or review can contain facts, tone and opinion that can have an impact on how others see a particular brand. It can be a challenge to make sense of, rank and monitor UGC data in the source language, not to mention translate it into other languages.

Global organizations often use machine translation (MT) to translate UGC and social media content. MT allows large volumes of data to be translated rapidly to a quality level that is acceptable for this type of content. Once UGC has gone through MT, it is often re-published automatically. As part of this localization and translation process, a growing number of organizations are embracing sentiment analysis as a value-added task to rank source and translated UGC.

Sentiment analysis (SA) is the process of computationally identifying and categorizing opinion expressed in UGC, such as product reviews, social media posts and comments. It analyzes the “sentiment” of UGC to identify whether it is positive, negative or neutral. On a more complex level, some sentiment analysis tools will break a review down into positive and negative sections and provide an overall outcome or rating for the piece of text.
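To make the idea concrete, here is a minimal sketch in Python of the simplest kind of sentiment analysis: a word-list approach that labels each sentence of a review and then gives an overall rating. The word lists, function names and sample review are hypothetical and chosen purely for illustration. Commercial tools rely on much larger resources or trained models, and this is not a description of any particular product.

```python
import re

# Hypothetical toy word lists, for illustration only.
POSITIVE = {"great", "good", "excellent", "love", "friendly", "clean"}
NEGATIVE = {"bad", "poor", "terrible", "dirty", "rude", "slow"}


def score_sentence(sentence: str) -> int:
    """Count positive words minus negative words in one sentence."""
    words = re.findall(r"[a-z']+", sentence.lower())
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def label(score: int) -> str:
    """Map a numeric score to positive, negative or neutral."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def analyze(review: str) -> dict:
    """Label each sentence, then label the review from the summed scores."""
    sentences = [s.strip() for s in re.split(r"[.!?]+", review) if s.strip()]
    sections = [(s, label(score_sentence(s))) for s in sentences]
    total = sum(score_sentence(s) for s in sentences)
    return {"sections": sections, "overall": label(total)}


result = analyze(
    "The room was clean and the staff were friendly. "
    "Breakfast was slow. We arrived on Monday."
)
for sentence, sentiment in result["sections"]:
    print(f"{sentiment:>8}: {sentence}")
print("overall:", result["overall"])
```

A word-list approach like this misses negation, sarcasm and domain-specific vocabulary, which is one reason production systems use trained NLP models instead.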

The technology behind sentiment analysis is natural language processing (NLP), which focuses on the interaction between computers and language to enable text analysis. As organizations generate huge amounts of online UGC data, sentiment analysis is a key tool to make sense of it and create valuable business knowledge and intelligence. Working as part of an enterprise MT program, sentiment analysis can assess translated UGC text to enable ranking of multilingual reviews.

Global brands can use sentiment analysis as part of the decision-making process, to decide whether to re-publish and keep certain reviews or UGC data live. Data collected can also be used to help assess the performance of a particular product or service by monitoring overall user feedback posted in social media forums.

Integrating sentiment analysis into an enterprise MT program is an effective way to manage and understand large volumes of UGC in more than one language.  Welocalize has recently partnered with an innovative NLP specialist and is now delivering sentiment analysis and other text analytics services for a range of languages. For more information about sentiment analysis and Welocalize weMT and language tools solutions, email

Based in the United States, Elaine O’Curran is MT Program Manager at Welocalize.

Read more about TripAdvisor and Welocalize partnering together in this case study: 

Topmost Localization Conversations in 2015


As 2015 draws to a close, Monique Nguyen reflects on her year of engaging with business leaders on localization hot topics. She identifies three subjects that have dominated conversations with language service buyers at top global organizations.

Many global brands and organizations have made real progress on their overall globalization and localization strategy. There are new techniques and innovations making the translation workflow more efficient, and many organizations are realizing the importance of localization as a centralized function. It is not always easy, however, for localization managers to drive decision making through the organization. Localization can be fragmented within an organization, and getting full buy-in and budgetary support from all the right management levels can be a challenge.

Working with localization managers and key decision-makers, it is common to find they are driving initiatives to raise the profile of globalization and localization within their organizations. They do this to ensure their internal customers know who they are and what services they provide to the organization. A higher profile localization function also gains more visibility at the C-Suite level, securing buy-in to drive a centralized localization effort and ultimately ensuring a consistent representation of the global brand, content and products across all target markets.

There are three subjects that created a lot of localization and translation buzz in 2015. These are not new topics of interest. They are topics that are common in language professional “circles” and continue to be at the forefront of most discussions.


Innovative companies are using cutting edge tools to produce more efficient and effective translation life cycles. Partnering and deploying the right global content platforms add value directly into the overall enterprise localization program. Smart technology means greater access to languages, markets and added business value. Integration of translation workflows into content publishing is a common theme with large content producers, in particular around the predictable nature of communication and branded localization projects.  There is also increased use of really smart tools that will publish multiple languages simultaneously, drawing in from a pool of talent that is integrated into the overall enterprise management system. This is the way forward for enterprise-level translation programs, increasing efficiency of the process and providing better economics for localization programs.


Conversations on the topic of machine translation (MT) have evolved from IF to WHEN. The MT programs we develop for clients are making a significant impact in managing large content volumes and providing considerable savings. They are actively generating usable content to boost the volume of translated content. Many organizations have been translating and localizing content for many years and MT is something that has been discussed at length for a while now. It is now a reality for most organizations. Our MT solutions are being deployed as a natural part of the overall localization strategy in conjunction with other methods. Depending on content type, MT and human translation (whether post-edited MT, straight human translation or transcreation) are working together to deliver high quality content, leveraging the usual TM assets from both disciplines.


Quality is probably the most used word in the globalization and localization industry and rightly so, because it is a number one priority for most of our clients and global brands. A drop in quality, regardless of language, can negatively impact the brand and brand value. It’s no longer just a case of achieving linguistic accuracy; it’s all about customer experience and being culturally appropriate and “in-context” with the content environment and target audience. Translators and reviewers must have access to the right information to ensure content is translated and represented in a natural way, and sometimes this means deviating from a straight translation.

This year, I have been lucky enough to work with some leading experts in the industry and help shape the localization programs of many established and emerging global enterprises. I look forward to 2016, as I know it is going to be a busy and progressive year for our industry. What stood out for you in 2015? I’d love to learn more.


Based in San Francisco, Monique Nguyen is Welocalize Regional Enterprise Sales Director for West Coast, North America.

Localization Lessons Learned in 2015

by Steve Maule, Business Development Director at Welocalize

As we draw to the end of the year, it’s a time for contemplation and reflection about the year that has almost passed. The globalization and localization industry is always changing, and it continues to evolve and make progress as more and more organizations go global and strive to increase international reach. There are a number of areas that I think have really evolved in the industry this year.

Here’s what comes to mind when thinking about the lessons learned in 2015.

Clients need more from us than just words.

Localization buyers need a valued long-term partner, an extension of their own, quite often small, internal localization teams. This year we consulted with more and more clients on how to set up and structure internal localization teams and programs, how best to evangelize localization internally within their company, and how to streamline digital content processes and drive recruitment for contingent staffing. Successful globalization and localization requires a strategic approach. Clients need an LSP for the long term, not on a transactional, project-by-project basis.

Clients now recognize that they need more than translation.

Clients appreciate the value of distinct services for different content types and realize that sometimes original content creation or transcreation has a far more effective ROI than just translating the source text. Linguistic accuracy is no longer a sign of high quality for all content types. Content has to be adapted to create a good customer experience and this involves more than just straight translation.

Cheap is expensive in the long-term.

You can “buy the business” with an inexpensive one-time price reduction; however, it’s not a long-term strategy that will gain customer loyalty. In fact, everyone may lose in the long term with a price-only approach.  This past year, I’ve lost a few client opportunities to competitors who have priced their translation services at super-low rates.  I can think of two examples where clients have realized that “buy cheap: pay twice” isn’t just some bumper-sticker waffle peddled by sales people. Those clients have often come back to me within months to resurrect a deal because they initially went with the cheapest quote and found the translated content unusable. It continues to be a fine balance in this industry between being competitive and being able to deliver a different service with value-add and quality.

We provide a world class service.

In the localization industry, we can get entrenched in the various language tools and applications required to enable localization, including: TMS solutions, connectors, machine translation, CMS, TMs, glossaries, style guides, fuzzy matches and more. It’s easy to lose sight of the fact that what we offer as an outsourced service is the ability for global brands to expand and reach new geographical markets. It is all about the business goals and defined business outcomes.

I’m not a linguist or technical expert; however, my best achievements in 2015 are probably when I have placed the Welocalize service portfolio front and center to define a customer-centric solution. From global localization management to people sourcing and supply chain management, we bring together all the service elements in our solution portfolio to meet our clients’ specific and unique requirements and challenges.

My final reflection is more about Welocalize and less about the wider industry. Be good with change. I am sure that 2016 will bring even more change for everyone. As Welocalize continues to grow into an insanely great company, we embrace change and look forward to where we will go and grow – together.

I hope you have all had a good 2015 and are looking forward to a better 2016! What have been your reflections for 2015 in this industry? I’d love to hear them.



How to Select the Best Translation Tools for Your Business

Most global organizations driving a localization strategy will use tools and technology to drive the translation workflow and to manage multilingual content and localization assets. Organizations at different levels of localization maturity and with different internal team structures will benefit from using TMS (translation management systems), TM (translation memory) and other tools to manage various elements in the translation workflow, including: managing ongoing translation projects, processes, vendors, suppliers, translation memory, digital content, terminology, glossaries and corporate wording.

Each company has varying translation and localization requirements and these need to be taken into consideration to assess what kind of technology is required to complete the task. The market for translation technology provides a wide variety of tools, some off-the-shelf and some which can be further developed and tailored for specific requirements. Partnering with a global language service provider (LSP) will help to plan the overall localization strategy and determine the right technology tools that will meet current and future localization requirements, as well as leveraging the strengths and benefits of existing technology and legacy systems.

At the planning stage, there are a number of questions that must be addressed before buying and implementing translation technology, including a TMS, TM and terminology management system. Here is a series of questions to help you get started in evaluating translation technology.

Translation Management System (TMS)

  • How many buyers, requestors and users are there internally? Some global organizations have multiple divisions, all requiring localization services.
  • Do we have the internal resources to manage the daily administration of some tools?
  • What are the daily tasks, such as quoting, order management, workflow setup, reporting or others?
  • What is the proposed vendor model: freelancers, multiple translation vendors or working with a sole LSP provider?
  • How is the review process managed?
  • Is the review process managed by external vendors or by in-house resources?
  • Which file formats will translators and reviewers be working with on a regular basis?
  • Is there third-party technology that requires integrating, for example, content management systems, into the overall translation process?
  • How many languages and content formats must be supported now and in the future?

Translation Memory (TM)

  • Will those who require access to the TMs be working online, offline or both?
  • Will there be multiple suppliers and TM contributors?
  • Do internal and in-house translation teams need coordinating?
  • Would they be useful in providing translation memory materials from previous projects?

Terminology Management

  • Who will authorize corporate wording and global brand guidelines?
  • How will terminology be shared with all translation partners?
  • How many languages do we need to support?

Combined with the overall globalization and localization business objectives, answering these questions will give relevant information that will help determine the best translation system for your organization. There are hundreds of tools, ranging from complete enterprise solutions for larger budgets, with detailed requirements and defined workflows, to niche tools from smaller software companies for the lower scale user of translation services.

The best way to address these questions appropriately is to engage with an established and expert LSP and translation partner. They will work with you to help establish translation requirements and determine the best system based on these questions, as well as any adapted workflow to meet your specific content requirements. Ensure your localization partner has extensive experience, knowledge and standards relating to translation technology and is not simply selling language tools without a strong business justification and alignment to your specific needs.

Welocalize utilizes some of the best and most collaborative tools in the industry. We remain independent and impartial in finding the right tools to fit the workflow and achieve localization and translation goals. Our team of Language Solution Architects works with each client to determine the best mix of tools and assets, whether Welocalize tools or market applications. Our vast experience in implementing solutions enables us to utilize and leverage existing resources, connectors and internal resources for our clients.

For example, Welocalize GlobalSight is a scalable multilingual content TMS solution that streamlines the entire global content life cycle and enables the efficient creation, update, maintenance and publishing of multilingual content in any format. GlobalSight is an open-source technology.  It is a low-risk, flexible and sustainable technology at a fraction of the cost of commercial TMS products in the market today.

Do you have any current concerns or challenges implementing translation tools? Send me an email at


Based in Germany, Tobias Wiesner is Business Development Director at Welocalize.



Highlights from Tekom Conference 2015

Welocalize recently attended tekom/tcworld 2015 held in Stuttgart, Germany. It represented an opportunity to engage in conversation with attending clients and colleagues, as well as share best practices on the localization of technical communications and documentation. Our attendance at such events is always important for Welocalize, as we benefit by engaging with tekom attendees and industry members to find out what they value in solutions provided by their language services providers.

Here are some of the key highlights from the event:

HIGHLIGHT #1: Automation and Content Management

One of our main findings from the tekom event was how processes and technologies involved in localization are moving closer together. They are becoming more integrated as one process and smooth workflow. Automating processes increases efficiency. It allows us to reduce administrative tasks and time-to-market. It also allows us to reduce the chances of human error and misunderstandings in file transfer and preparation. The content management systems (CMS) that many technical authors work with must be integrated with the various translation management systems (TMS), terminology and language tools to allow an efficient process and ensure important information is accessible to everyone involved in the translation supply chain.

Welocalize uses its open-source translation management system (TMS), GlobalSight. This platform is available to all clients and localization teams, and allows everyone to engage in an automated translation process.

HIGHLIGHT #2: Machine Translation (MT)

MT is becoming more significant in the language services industry. While human translators still play the most important role, many companies use MT to complement and support human translators and enable higher volumes of content to be translated for various content formats. Although high standards are still required for many technical communications, use of MT and post-edited MT is starting to play a key role. Many tekom attendees were keen to learn more about how Welocalize weMT and language tools can help the overall localization program.

HIGHLIGHT #3: Terminology Management

Good terminology management is crucial in the translation of technical communications. At Welocalize, we make it our duty to provide the best quality translations for our clients, with a high emphasis on consistency of terminology. Attendees were keen to learn more about terminology management solutions and how these solutions could work for them. Furthermore, 75% of our clients agree that inconsistent terminology causes them the most frustration when translating content. More information can be found in the Welocalize blog Terminology Management for Translating Technical Communications.

HIGHLIGHT #4: In-Context Translations and Content Management

Technical content is highly complex and must be localized to high levels of quality and standards. Global organizations demand that translators possess a thorough knowledge of their product and industry, to ensure good accuracy and content is “in-context.” Having access to relevant product and company information as part of the overall translation workflow is key to accurate and relevant translations and also provides a better working environment for the translators and reviewers.

At tekom/tcworld 2015, we were delighted to speak with clients and attendees and gain insight into the value they see in Welocalize localization programs. Attendees gave positive feedback on how open and transparent Welocalize is in its approach to localization. Deploying innovative tools and technology puts us at the forefront in technical content solutions. Many clients gain great value from the fact that we are willing to work with all tools, including MT and content management systems across a variety of platforms, as we are guided by interoperability. We work with numerous connectors and technologies to ensure our clients have the solution that best fits their unique needs.


Tobias Wiesner, Business Development Director, Germany

Predictive MT and Quality Analysis

Predictive analysis in the localization industry is swiftly becoming a key approach to improving quality and efficiency in the translation workflow. Extracting information from existing datasets to determine patterns and predict future outcomes can significantly help the translation automation process. In this blog, Welocalize Technology Solutions team members Dave Clarke and Dave Landan give us an update on how two new Welocalize tools are benefiting clients.

It is becoming increasingly important for language service providers (LSPs) to quickly determine the nature of content: how suitable it actually is for the envisaged localization outcomes and, subsequently, the appropriate processes and workflows it should be routed through to successfully meet those expectations. At the same time, today’s clients have an ever-increasing volume, and often diversity, of content to be translated. Therefore, the ability to analyze large quantities of source content quickly, accurately, and consistently becomes imperative.

Welocalize has recently added two new tools to its language tool portfolio to help automate these analyses:  TMTprime and StyleScorer. 

TMTprime was developed through a joint collaboration between Welocalize and the Centre for Next Generation Localisation (CNGL, now ADAPT).   TMTprime provides a way to predict which of multiple given translation assistance systems, whether translation memories (TMs) and/or machine translation (MT) engines, would provide the best output for any given content set.  By simply providing TMTprime with TMs and/or MT training data and a “tuning set,” TMTprime learns to predict which of the systems it is trained on is best for different source content types.  We are also currently researching the capabilities of TMTprime when applied to the task of predictive quality analysis, with a view to drastically reducing and, more often, replacing the running of costly and time-consuming human evaluations of multiple MT engines.

StyleScorer is a proprietary Welocalize tool that learns the content authoring style of a set of documents and then, through a scoring system, rates how well new content matches the style of the initial documents. Analytic tools like Welocalize StyleScorer can work with documents in any language and can be useful for analyzing source and target content. Automated analysis of source content gives a fast, accurate impression of the suitability or potential difficulty of translation at the very beginning of the production cycle, which is exactly the right time to be informed. Further through the cycle, analyzing target content gives us a way to automate certain tasks in linguistic quality analysis (LQA).

By running StyleScorer on raw MT output, the scores can be used to rank documents that are likely to need more post-editing (PE) to bring them in line with the style of known target documents. This is good news when time is precious because it allows us to focus PE work where it is needed.
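As a rough illustration of ranking by style score, the Python sketch below compares each new document with a set of known target documents and lists the weakest style matches first. It uses character trigram counts and cosine similarity, which is a generic stand-in technique and not the method StyleScorer itself uses. All function names and sample texts are hypothetical.

```python
import math
from collections import Counter


def trigram_profile(text: str) -> Counter:
    """Count overlapping character trigrams after normalizing case and spacing."""
    text = " ".join(text.lower().split())
    return Counter(text[i:i + 3] for i in range(len(text) - 2))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two trigram count profiles (0.0 if either is empty)."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def rank_for_post_editing(reference_docs, candidates):
    """Return (name, score) pairs sorted so the weakest style match comes first."""
    reference = Counter()
    for doc in reference_docs:
        reference.update(trigram_profile(doc))
    scored = [
        (name, cosine(trigram_profile(text), reference))
        for name, text in candidates.items()
    ]
    return sorted(scored, key=lambda item: item[1])


# Hypothetical known target documents and raw MT output.
reference_docs = [
    "Click Save to store your changes.",
    "Click Cancel to discard your changes.",
]
candidates = {
    "doc_a": "Click Print to print your changes.",
    "doc_b": "Saving of modifications is done by button pressing.",
}

ranking = rank_for_post_editing(reference_docs, candidates)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

Documents at the top of the list are the ones least like the reference style, so under this simplified model they would be the first candidates for post-editing attention.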

TMTprime and StyleScorer are just two examples of the cutting-edge tools that Welocalize uses to make sure that content gets translated as quickly as possible, to appropriate quality levels. More exciting innovation in the area of content analysis will be brought out later this year so watch this space!

Welocalize Technology Solutions

Dave Clarke and Dave Landan

For further reading on StyleScorer, read Dave Landan’s blog: Welocalize StyleScorer helps MT and Linguistic Review Workflow

Click here for more information on weMT

Getting to Know Welocalize Development

Interview with Doug Knoll, VP of Software Development at Welocalize

In our continuation of Getting to Know Welocalize, we want to introduce you to Doug Knoll, Vice President of Software Development at Welocalize. Doug worked at Welocalize from 2001 to 2009 as Director of Global Solutions and recently returned at the beginning of the year to take up his new post. Innovation is one of Welocalize’s key pillars, which underpin everything that the company does. Driving the software development for a global localization provider like Welocalize is no small task. Louise Law spent some time talking with Doug to learn more about his new global organization and some of the key development activities taking place at Welocalize. In Doug’s words, “I’m looking at the whole breadth of what we want technology to do for us.”

Where are you and your team based?

I am primarily based at the Welocalize office in Portland, Oregon, although I have traveled so much over the past few months! I have 40 full-time team members, who are supplemented by contract resources based globally in the US, China, Europe and India. We have talent and skills all over the world. We are focused on software development for all Welocalize platforms, tools and applications and we drive the overall research and development strategy at Welocalize.

You are pretty busy developing a new software development strategy for Welocalize, can you tell me a bit more about some of your key initiatives?

I am excited to be introducing some new concepts and technologies to Welocalize. We are increasing our responsiveness and agility by updating our continuous integration and development processes. We are also looking at our existing architecture and investigating new ways we can use it to meet future demands and requirements. One of the coolest projects we are working on is developing predictive analytics using machine learning.

What’s machine learning?

Machine learning explores algorithms that can learn from and make predictions on data. Any global language service provider collects and processes huge amounts of data and we can use this data to help project managers make decisions. For example, we have developed a model that looks at all the characteristics of a task and assigns it a risk score, something like a credit score. Tasks with a high risk score have an elevated chance of being delivered late, or having quality compromised. The score is calculated when the task begins, so we have the chance to alert the PM responsible and allow them to take action in time to address the situation. Applying predictive algorithms to help us see ahead is key to our development strategy and can be applied across the whole organization, not just for the PMs. We are also using it to assess our entire talent pool.
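For readers who want to picture a task risk score, here is a minimal Python sketch that turns a few task characteristics into a score between 0 and 1 and flags high-risk tasks for the PM. The features, weights and threshold are hypothetical and hand-picked for illustration. A real model would learn its weights from historical delivery data, and this sketch does not describe the Welocalize model.

```python
import math
from dataclasses import dataclass


@dataclass
class Task:
    word_count: int
    days_until_due: float
    translator_is_new: bool


# Hypothetical hand-picked weights; a trained model would learn these from data.
WEIGHT_WORKLOAD = 0.9        # per thousand words required per day
WEIGHT_NEW_TRANSLATOR = 0.8  # added when the translator is new to the account
BIAS = -2.5


def risk_score(task: Task) -> float:
    """Return a 0-1 risk estimate using a logistic function of the task features."""
    # Thousands of words per day needed to hit the deadline (guard against zero days).
    workload = task.word_count / max(task.days_until_due, 0.5) / 1000
    z = BIAS + WEIGHT_WORKLOAD * workload + WEIGHT_NEW_TRANSLATOR * task.translator_is_new
    return 1 / (1 + math.exp(-z))


def needs_alert(task: Task, threshold: float = 0.5) -> bool:
    """Flag the task for the PM when its score reaches the threshold."""
    return risk_score(task) >= threshold


relaxed = Task(word_count=2000, days_until_due=5, translator_is_new=False)
rushed = Task(word_count=12000, days_until_due=2, translator_is_new=True)

for name, task in (("relaxed", relaxed), ("rushed", rushed)):
    print(f"{name}: score={risk_score(task):.2f} alert={needs_alert(task)}")
```

Because the score is computed from information available when the task starts, the alert can be raised before any deadline is missed, which is the point made in the answer above.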

What are some of the other development initiatives?

Overall, the team is looking at the whole breadth of what we want technology to do for us. It is our job to provide the optimum service delivery software platform for the Welocalize business. We do this by continually assessing projects and identifying the gaps and areas for improvement. The Development teams have to deliver a unified architecture and a consistent backbone for service delivery by using our existing software technologies, like GlobalSight, and also introducing new tools and emerging techniques.

To use our data intelligently, we have to ensure that it is clean, consistent, and centralized, while still supporting the unique workflows and business requirements that each client brings. Striking that balance in our designs is an interesting challenge.

Being able to predict what is coming down the line is a pretty radical shift in the way we do things in our industry. This new approach will set the bar higher in terms of delivery, velocity, responsiveness and agility for our client-facing teams and ultimately our clients.

What are some of the things that happen to you in a typical day?

Right now I’m splitting my time evenly between the strategy and the team. Software developers spend all day making decisions that everyone has to live with for the next ten years or more. It’s critical that we build an environment here that attracts great people, and gives them the support to do their best work.

One piece of that puzzle is to use technologies that people are excited about, and give developers some latitude to experiment. Being able to do that while still pulling in the same direction means having very clear strategy and spending a lot of time communicating it. Right now, we’re working to support the growth that Welocalize as a whole is experiencing and that means the team is growing too. I am investing a lot of time into the recruiting process as we add key leaders.

What are the current disruptive technologies in the localization industry?

It has to be machine translation (MT) and post-edited MT going prime-time. MT has been around for many years; however, now it is really finding a place in the translation process. We continue to see MT and PEMT playing a key role in active projects.

If you had a crystal ball, what do you think the technology landscape of localization will look like in five years?

We will see a radical improvement in the velocity of translator output, to levels of 10,000 words per day. This will be because translators are working in a tool-assisted environment, which will give them powerful capabilities and the ability to perform at very high levels.


The Getting to Know Welocalize blog series highlights our team members around the globe and the work they do for our valued clients. In their words, it gives you a look into how Welocalize’s diversity, culture, and expertise empower us in doing things differently. You can view all Getting to Know Welocalize posts here:

Getting to Know Welocalize CEO Smith Yewell

Getting to Know Welocalize in Germany – Day in the Life of Antje Hecker, Production Business Director at Welocalize in Germany

Getting to Know Welocalize and Agostini Associati – Day in the Life of Guido Panini, Sales and Marketing Manager at Agostini Associati, a Welocalize Company

Getting to Know Welocalize Quality and Training -A Day in the Life of Liz Thomas, Senior Director of Quality and Training at Welocalize

Getting to Know Welocalize in the United Kingdom – A Day in the Life of Joanna Hasan, Enterprise Program Manager

Getting to Know Welocalize Marketing

Getting to Know Welocalize Business Development Europe – A Day in the Life of Steve Maule, Welocalize Business Development Director in Europe

Getting to Know Welocalize Interns by Louise Donkor, Welocalize Global Marketing and Sales Support

Getting to Know Welocalize Business Development in North America – A Day in the Life of Monique Nguyen

Getting to Know Welocalize in China –An Interview with Alex Matusescu, Director of Operations

Getting to Know Welocalize in Japan -Interview with Kohta Shibayama, Senior Project Manager in Tokyo

Getting to Know Welocalize Development -Interview with Doug Knoll, VP of Software Development at Welocalize

Getting to Know Park IP Translations Operations – A Day in the Life of Nicole Sheehan, Regional Director of Operations at Park IP Translations, a Welocalize Company

Getting to Know Park IP Translations

Getting to Know Welocalize – Ten Interesting Facts You May NOT Know About Welocalize

Getting to Know Welocalize Staffing – A Day in the Life of Brecht Buchheister

Welocalize Language Tools Team Highlights EAMT 2015 Conference

The Welocalize Language Tools Team recently presented at the 2015 EAMT Conference in Antalya, Turkey. Olga Beregovaya, Welocalize VP of Language Tools and Automation, was the invited guest speaker at the conference. She presented “What we want, what we need, what we absolutely can’t do without – an enterprise user’s perspective on machine translation technology and stuff around it,” with the main objective of promoting collaboration between academia and field users. Olga also presented “Streamlining Translation Workflows with Welocalize StyleScorer” with Welocalize Senior Computational Linguist Dave Landan, as part of the project and product description poster session.

In this blog, Olga Beregovaya, Dave Landan and Dave Clarke, Principal Engineer for the Language Tools Team, share their insights from the 2015 EAMT Conference.


Olga Beregovaya gives her impressions of EAMT 2015 and highlights her favorite presentations from the user track.

As a global language service provider, the language technology and translation automation strategy is very important. The EAMT conference and associated conferences are excellent forums to attend as the team can share real-life MT production experiences and learn more about the latest innovations and research projects. As always, there were many interesting research papers and posters at EAMT, all delivered by highly-talented colleagues in the field of MT and all describing very innovative and promising approaches.

I was proud of Welocalize’s own poster presentation, describing work by colleague Dave Landan, Streamlining Translation Workflows with StyleScorer. Capturing and evaluating the style of both training corpora and target text has traditionally been one of the biggest challenges in the industry. The tool Dave has created allows us to compare the style of the input text and the available training data, and build the most relevant MT engine, and also to assess the stylistic consistency of the target text and its adherence to the client’s style guide.

The poster presented by Mārcis Pinnis, Dynamic Terminology Integration Methods in Statistical Machine Translation, was very interesting for the team. Integrating terminology in a linguistically aware way is a major pain point for domain adaptation of SMT engines. Speaking as a program owner, this poster presentation was particularly relevant to our work.

Another very relevant presentation was the paper delivered by Laxström et al, called Content Translation: Computer-assisted translation tool for Wikipedia articles. This presentation talked about a tool created by Wikipedia to promote translation and post-editing of machine-translated articles by Wikipedia users. Community translation is more important for Wikipedia than for any other organization in the world. As content democratization is the key paradigm shift of the modern times, such tools that enable a “casual translator” to contribute and make content available globally have become an essential component of the global content universe.

Finally, Joss Moorkens and Sharon O’Brien presented an excellent poster called Post-Editing Evaluations: Trade-offs between Novice and Professional Participants. Building an efficient and productive supply chain for  post-editing, that would be open to new tools and new ways of working, is an essential component of an LSP MT program success. Joss and Sharon compare the perception of MT output and a new CAT environment by experienced translators and by novice users.


Dave Landan, Computational Linguist at Welocalize and EAMT 2015 presenter identified two presentations he found particularly interesting.

This year’s EAMT conference started strong with several interesting talks and papers on a range of topics.  While there were many strong research papers, I would like to mention two that stood out for me. Bruno Pouliquen presented findings on linear interpolation of small, domain-specific models with larger general models. At Welocalize, we hope to try these methods with our own data, and we are optimistic about the possibilities!  The other research paper that stood out for me was by Wäschle and Riezler. This paper presented innovations around using fuzzy matches from monolingual target language documents to improve translations. I am excited about expanding our collaborations with the academic community.
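For readers unfamiliar with linear interpolation, the idea can be sketched with two toy unigram models. This is only an illustration under simplifying assumptions: the tiny corpora, the function names and the weight `lam` are hypothetical and are not taken from the paper or from any Welocalize tool.

```python
from collections import Counter


def unigram_model(tokens):
    """Return a maximum-likelihood unigram distribution for a token list."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def interpolate(p_domain, p_general, lam):
    """Mix two distributions as lam * p_domain + (1 - lam) * p_general."""
    vocab = set(p_domain) | set(p_general)
    return {
        word: lam * p_domain.get(word, 0.0) + (1.0 - lam) * p_general.get(word, 0.0)
        for word in vocab
    }


# Hypothetical data: a small in-domain sample and a larger general sample.
domain_tokens = "press the reset button then press start".split()
general_tokens = "the weather is nice and the sun is out today".split()

# lam is an assumed weight; in practice it would be tuned on held-out data.
mixed = interpolate(unigram_model(domain_tokens), unigram_model(general_tokens), lam=0.7)
```

A higher `lam` trusts the small domain-specific model more, while the general model still gives some probability to words the domain sample never saw.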


Dave Clarke, Principal Engineer at Welocalize is a regular participant at EAMT. One topic that was touched on many times at EAMT 2015 was the evolution of CAT tools and their impact on productivity. He shared the following perspective.

From a technical or tools perspective, the EAMT conference provided considerable insight into how translation tools could and should evolve. One such insight was provided by the best paper award winner, “Assessing linguistically aware fuzzy matching in translation memories,” by Tom Vanallemeersch and Vincent Vandeghinste from the University of Leuven. The algorithms typically used in CAT tools to calculate fuzzy match values from translation memories have little or no linguistic awareness. They are firmly established as stable units in our industry word currency. This paper implemented and tested alternative fuzzy match algorithms that identify potentially useful matches, based on their linguistic similarities. The results were gathered from tests carried out with translation master’s degree students measuring translation time and keystrokes. The results strongly suggest the potential for unlocking further productivity from existing resources.
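To make the point about linguistic awareness concrete, here is a minimal sketch of a purely surface-level fuzzy match score based on word-level edit distance. It is an assumed stand-in for illustration only: commercial CAT tools use their own formulas, and the function names here are hypothetical.

```python
def edit_distance(a, b):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(b) + 1))
    for i, tok_a in enumerate(a, start=1):
        curr = [i]
        for j, tok_b in enumerate(b, start=1):
            cost = 0 if tok_a == tok_b else 1
            # Cheapest of deletion, insertion and substitution.
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def fuzzy_match(segment, tm_source):
    """Return a 0-100 similarity score between a new segment and a TM source segment."""
    a, b = segment.lower().split(), tm_source.lower().split()
    longest = max(len(a), len(b))
    if longest == 0:
        return 100.0
    return 100.0 * (1 - edit_distance(a, b) / longest)


print(fuzzy_match("Click the Save button", "Click the Cancel button"))  # 75.0
print(fuzzy_match("The file was deleted", "The files were removed"))    # 25.0
```

The second pair scores low even though the two sentences are close in structure and meaning, because a surface comparison treats "file" and "files" as entirely different words. That is the kind of gap a linguistically aware matcher aims to close.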

The other presentation that stood out for me was “Can Translation Memories afford not to use paraphrasing?” by Rohit Gupta, Constantin Orasan, Marcos Zampieri, Mihaela Vela and Josef Van Genabith.

More MT productivity and quality can be achieved with incremental and specialized improvements; however, it will be a cumulative process. Importantly, NLP can drive ‘intelligent’ aids to productivity, including auto-suggest/complete, advanced fuzzy matching and automatic repair and others, within a translator’s working environment. Not all will benefit every user. CAT tool platforms may now evolve so that these innovations can be quickly absorbed into the environment with little cost or effort. This leads to how each translator can maximize their own productivity with the combination of aids that best suits their style of work. We even saw a project from ADAPT in the early stages of developing a platform for CAT tool designers that allows the fast definition and measuring of data during testing of prototype productivity-enhancement functions.

To echo the words of the outgoing EAMT President, Professor Andy Way, it was good to see researchers really getting to grips with specific, known problems. It was encouraging to see more focused work on such errors that we know first-hand to have a particular impact on productivity, for example, improvements in terminology selection, new methods to improve choice of preposition and more. It was also encouraging to see the increase in research presented with supporting data gained from end-user evaluation rather than the automatic evaluation metric staples that have long been the norm. In fact, ‘BLEU scores’ almost, just almost, became a dirty… bi-gram.

“Overall, EAMT 2015 was a great conference, attended by extremely talented people, and we should not forget to mention beautiful Antalya, Turkey, where the conference was held this year,” said Olga Beregovaya.

View Olga Beregovaya’s EAMT presentation, “What we want, what we need, what we absolutely can’t do without – an enterprise user’s perspective on machine translation technology and stuff around it” below.

For more information about Welocalize’s MT program, weMT, click here.

Click the link to see Dave Landan and Olga Beregovaya’s EAMT poster presentation, Streamlining Translation Workflows with StyleScorer: EAMT_POSTER 2015 by Welocalize.


Welocalize to Present at 18th European Association for Machine Translation Conference

Frederick, Maryland – May 7, 2015 – Welocalize, global leader in innovative translation and localization solutions, will share industry insight and expertise at the 18th Annual Conference of the European Association for Machine Translation (EAMT) taking place in Antalya, Turkey, May 11-13, 2015, at the WOW Topkapi Palace.

“I am very excited to be taking part as an invited speaker at this year’s EAMT 2015 Conference in Turkey,” said Olga Beregovaya, VP of language tools and automation at Welocalize. “EAMT is an important international conference for the MT community. It is where experts, thought leaders and users of machine translation can meet and share research, findings and new tools to help their language technology strategy.”

Featured Welocalize presentations at the 18th Annual Conference of the European Association for Machine Translation:

  • Welocalize VP of Language Tools and Automation, Olga Beregovaya will deliver her keynote, “What We Want, What We Need, What We Absolutely Can’t Do Without – An Enterprise User’s Perspective on Machine Translation Technology and Stuff Around It” at 9:30 – 10:00am on Tuesday, May 12.
  • Olga Beregovaya along with Welocalize Senior Computational Linguist Dave Landan will be presenting “Streamlining Translation Workflows with Welocalize StyleScorer” as part of the poster project and product description session on Tuesday, May 12.

For more information about the EAMT 2015 conference, visit

About Welocalize – Welocalize, Inc., founded in 1997, offers innovative translation and localization solutions helping global brands to grow and reach audiences around the world in more than 157 languages. Our solutions include global localization management, translation, supply chain management, people sourcing, language services and automation tools including MT, testing and staffing solutions and enterprise translation management technologies. With over 600 employees worldwide, Welocalize maintains offices in the United States, United Kingdom, Germany, Ireland, Italy, Japan and China.

Welocalize to Present at GALA 2015 Sevilla

Frederick, Maryland – March 18, 2015 – Welocalize, global leader in innovative translation and localization solutions, will share industry insights and expertise at the annual Globalization and Localization Association (GALA) Language of Business conference, taking place in Sevilla, Spain, March 22-25, 2015, at the Barceló Sevilla Renacimiento Hotel.

Laura Casanellas from Welocalize’s Language Tools Team will be presenting “Localizing for Travel: Diverse Solutions for Diverse Needs” on Monday, March 23 as part of a special conference forum designed to address the needs of the travel and tourism sector.

“The presentation at GALA 2015 Sevilla will discuss Welocalize’s localization and language approaches and processes specific to travel and hospitality,” said Laura Casanellas, machine translation and CAT tools program manager at Welocalize. “There are diverse localization models across the travel sector, from full transcreation to raw machine translation output for gisting purposes. Welocalize works with several global brand leaders and online travel companies, enabling us an opportunity to share our best practices at this year’s GALA Conference.”

“The GALA organization and events provide a great platform for the localization industry where we can network with our colleagues and collaborate with thought leaders,” said Jamie Glass, vice president of global marketing at Welocalize. “We are delighted to share our expertise at GALA 2015 Sevilla.”

GALA Language of Business conferences are gatherings for the translation and localization community, including providers of language services, managers of global content and language technology developers. Welocalize is a corporate sponsor and member of GALA.



Welocalize StyleScorer Helps MT and Linguistic Review Workflow

Innovation is one of Welocalize’s four pillars which form the foundation of everything we do as a business. Clients and partners rely on our leadership to drive technological innovation in the localization industry. One of our latest innovative efforts is the soon-to-be-deployed language tool, Welocalize StyleScorer, which will form part of the Welocalize weMT suite of linguistic and automation language tools. One of the driving forces behind StyleScorer is Dave Landan, computational linguist at Welocalize and a key player in many Welocalize MT programs.

In this blog, Dave shares the key components of StyleScorer and how style analysis tools can help the MT and linguistic review workflow.

At Welocalize, we are constantly looking for ways to improve the quality and efficiency of the translation process. Part of my job as a computational linguist is to create tools that help people spend less time on looking for potential problems and more time on fixing them. One of my team’s latest efforts in this area is StyleScorer.

Welocalize StyleScorer is currently in the early deployment testing phase. This tool will be deployed as part of the Welocalize weMT suite of language tools around linguistic analysis and process automation. I’d like to share some of the key components of StyleScorer and the role it will play in the MT and linguistic review workflow.

What is StyleScorer?

Welocalize StyleScorer is a tool that compares a single document to a set of two or more other documents and evaluates how closely they match in terms of writing style. The documents being compared must all be in the same language; however, there is no restriction on which language that is.

The main difference between StyleScorer and existing style analysis tools is that rather than summarize types of style differences (for example: “17 sentences with passive voice”), it takes a gestalt approach and gives each document a score anywhere between 0 and 4, with 0 being a very poor match to the style and 4 being a very good match.

To do this, StyleScorer uses statistical language modeling as well as innovations from NLP (natural language processing), forensic linguistics and neural networks (machine learning) in order to rate documents on how closely they match the style of an existing body of work. Because it learns from the documents it’s given, even if you don’t have a formal style guide, StyleScorer will still work as long as the training documents can be identified by a human as belonging to a cohesive group.
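To give a feel for what a single 0 to 4 style score can look like, here is a deliberately simple sketch. It is not StyleScorer's method: it swaps in cosine similarity between character-trigram profiles as a stand-in for the language modeling and machine learning described above, and every function name and sample text is hypothetical.

```python
import math
from collections import Counter


def trigram_profile(text):
    """Count character trigrams after lowercasing and collapsing whitespace."""
    text = " ".join(text.lower().split())
    return Counter(text[i:i + 3] for i in range(len(text) - 2))


def cosine(p, q):
    """Cosine similarity between two trigram count profiles."""
    dot = sum(p[g] * q[g] for g in p if g in q)
    norm = math.sqrt(sum(v * v for v in p.values())) * math.sqrt(sum(v * v for v in q.values()))
    return dot / norm if norm else 0.0


def style_score(document, reference_docs):
    """Score a document from 0 (poor match) to 4 (good match) against reference documents."""
    reference = Counter()
    for doc in reference_docs:
        reference.update(trigram_profile(doc))
    # Clamp to guard against floating-point overshoot above 4.
    return min(4.0, 4.0 * cosine(trigram_profile(document), reference))


# Hypothetical reference set written in a user-manual style.
manuals = [
    "Press the power button to turn on the device.",
    "Connect the cable to the port on the back of the device.",
]
manual_like = style_score("Press the reset button on the back of the device.", manuals)
marketing_like = style_score("Wow!!! Unbelievable deals, shop now, save big!!!", manuals)
```

Because the score is learned from the reference documents rather than from written rules, the same function works with any cohesive set of texts, which mirrors the point that no formal style guide is required.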

How does StyleScorer help the MT workflow?

While we think StyleScorer will be very useful as part of the linguistic review workflow for human translation, we are even more excited about how it can benefit the MT (machine translation) workflow at several points of the process both on source and target language documents.

One of the key components to training a successful MT system is starting with a sufficient amount of quality bilingual data. We are seeing more and more clients who are very interested in MT; however, they don’t have a lot of bilingual training data to get started. In the past, the only option available to those clients was a generic MT engine (similar to what you’d get off-the-shelf). This gets someone started in MT, though the quality of generic engines is generally lower than engines trained with documents that match the client’s domain and style.

We can use StyleScorer to filter open-source training data to find additional documents to train from that are closest to the client’s documents. High-scoring open-source data can then be used to augment the client’s training data, which allows us to build better quality MT engines for those clients early in the project life cycle.
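The filtering step can be pictured as a simple threshold over pre-computed scores. The sketch below assumes each candidate document already has a 0 to 4 style score; the threshold, the function name and the sample data are hypothetical.

```python
def select_training_data(scored_docs, threshold=3.0):
    """Keep documents whose style score meets the threshold, best match first."""
    kept = [(name, score) for name, score in scored_docs if score >= threshold]
    return sorted(kept, key=lambda item: item[1], reverse=True)


# Hypothetical open-source documents with assumed style scores.
candidates = [
    ("forum_thread.txt", 1.2),
    ("vendor_manual.txt", 3.7),
    ("news_article.txt", 2.4),
    ("product_faq.txt", 3.1),
]
augmentation_set = select_training_data(candidates)
```

In this toy run only the manual and the FAQ would be added to the client's training data.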

If some documents are getting lower quality translations from MT than others, we can use StyleScorer as a sanity check as to whether the source document being translated matches the style of the client’s other documents in the same language and domain. An engine trained exclusively on user manuals probably won’t do well on translating marketing materials. StyleScorer gives us a way to look for those anomalies automatically.

We are particularly excited about using StyleScorer on target language documents to help streamline workflows. If we run StyleScorer on raw MT output, we can use the scores to rank which documents are likely to need more PE (post-editing) effort to bring them in line with the style of known target documents. This is particularly useful for clients with limited budgets for PE and clients with projects that require extremely fast turnaround because it allows us to focus PE work where it is needed the most.
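One way to picture this prioritization is to sort raw MT output by style score and spend a fixed post-editing budget on the lowest-scoring documents first. The sketch below is an assumption about how such a ranking could be used, not a description of the Welocalize workflow; the scores, word counts and budget are invented.

```python
def prioritize_for_post_editing(docs, word_budget):
    """Pick documents for PE, lowest style score first, within a word budget.

    docs is a list of (name, style_score, word_count) tuples.
    """
    selected, used = [], 0
    for name, score, words in sorted(docs, key=lambda d: d[1]):
        if used + words <= word_budget:
            selected.append(name)
            used += words
    return selected


# Hypothetical raw MT output with assumed style scores and word counts.
mt_output = [
    ("faq", 3.6, 1200),
    ("release_notes", 1.4, 800),
    ("landing_page", 0.9, 1500),
    ("manual", 2.8, 2000),
]
print(prioritize_for_post_editing(mt_output, word_budget=3000))
# ['landing_page', 'release_notes']
```

Documents that do not fit the remaining budget are skipped rather than partially edited, which is a design choice made here for simplicity.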

Finally, we envision StyleScorer becoming part of the QA & linguistic review process by spot-checking post-edited and/or human translated documents against existing target language documents. Translations that receive lower scores may need to be double-checked by a linguist to make sure the translations adhere to established style guides. If it turns out that low-scoring translations pass linguistic review, we use them to update the StyleScorer training set for the client’s next batch of documents.


Based in Portland, Oregon, Dave Landan is a Senior Computational Linguist for Welocalize’s MT and language tools team.

A Refresh on MT Post-Editing

The Globalization and Localization Association (GALA) recently asked machine translation expert Olga Beregovaya, Vice President of Language Tools and Automation at Welocalize, to be the organization’s GALAxy Guest Editor. In the GALAxy Newsletter Q4 2014, Olga provides a fresh perspective on a number of MT trends and hot topics in the feature, Letter from the Guest Editor: MT Post-editing — A Fresh Perspective.

Why did GALA choose Olga to edit this issue? Olga is a well-regarded language services advisor who works with multinational organizations on MT and post-editing strategies and implementations. As the Guest Editor, she was charged with the task of selecting the most relevant topics and contributors, while working closely with the GALAxy editorial team to produce a high-impact edition of the popular newsletter for Q4 2014. The latest issue shines light on the current trends, opportunities and challenges in MT post-editing, as well as the impact it has on the future of the translation and localization industry.

“When I was offered the role of Guest Editor for this issue of GALAxy Newsletter, I knew immediately who I would want to reach out to for their insights and what aspects of this exciting new field I would want the issue to cover,” said Olga Beregovaya. “The process was a great experience and the GALA editorial team are fantastic. I hope the readers get as much out of this issue of the GALAxy newsletter as I have in my role as editor.”

Here’s a quick summary of the lead articles and authors that were included in the publication:

If you are considering machine translation or would like to talk about any of the topics raised in the GALAxy newsletter, reach out to Olga at

For information about Welocalize’s weMT solutions, click here.

Welocalize is a member of GALA.

Welocalize Highlights from AMTA-2014 Conference

Elaine O’Curran, Alex Yanishevsky and Olga Beregovaya from Welocalize’s Language Tools and Automation Team presented at the 11th Biennial Conference of the Association for Machine Translation in the Americas (AMTA-2014) in Vancouver last month. Elaine O’Curran, Welocalize Training Manager and presenter at AMTA, provides her highlights and insights from the conference.

Last week, several members of the Welocalize Language Tools and Translation Automation Team went to Vancouver, Canada and participated in a number of presentations and panel discussions related to machine translation and post-editing at this year’s AMTA conference. We enjoyed four full days of presentations, demos and workshops in a very collaborative environment that brought together developers, researchers and translators from various backgrounds.

Here are my top three highlights from this year’s conference:

  • MT for User-Generated Content (UGC). This was the topic of several presentations, including the opening keynote by eBay’s Hassan Sawaf. The discussion covered not only post-editing but also the general use of MT for a global personalized customer experience. My own presentation, “Machine Translation and Post-Editing for User Generated Content”, also covered the key considerations for this topic. You can find the full presentation below.
  • Dynamic engine adaptation methodologies and dynamic “real time” engine building were other hot topics. Welocalize’s VP of Language Tools and Automation, Olga Beregovaya, delivered a joint presentation with Alon Lavie from Safaba Translation Solutions on dynamic overnight engine training to meet the challenges of content drift. Prashant Mathur, a researcher from the Bruno Kessler Foundation, shared their work on multi-user adaptive statistical MT, which takes into consideration that several translators will often work on the same document or project and that the engine needs to adapt to the style and post-editing choices of each individual user.  The presentation is posted for full view below.
  • The Post-Editing Workshop provided new approaches to post-editing, including the use of monolingual post-editors and casual post-editors. The paper that resonated most with me related to the cognitive effort of MT errors for post-editors. It identifies specific categories of cognitively challenging MT errors and the goal is to understand what features of the source text are associated with cognitively demanding errors in MT output. Once this is identified, efforts can be concentrated on reducing the types of errors that require the most time and effort to fix during post-editing. This, combined with the dynamic engine adaptation mentioned above should improve productivity for post-editors.

All the Welocalize AMTA-2014 presentations are posted below. If you would like to discuss any of these topics, please contact me directly.

Training Manager, Welocalize Language Tools and Automation

EAMT Conference 2014: Welocalize Language Tools Team Overview

The Welocalize Language Tools team attended and presented at the 2014 EAMT Conference in Croatia. In this blog, Laura Casanellas, Welocalize Language Tools Program Manager and presenter at EAMT, provides her highlights and insights from her Welocalize colleagues who took part in the conference.

Just like Trento in 2012 and Nice in 2013, the Welocalize Language Tools Team participated in the Annual Conference of the European Association for Machine Translation (EAMT). The conference took place June 16 – 18 in the city of Dubrovnik, Croatia and four members of the Welocalize Language Tools team attended:

Olga Beregovaya, VP of Language Tools, and Dave Landan, Pre-sales Support Engineer, presented a project poster on “Source Content Analysis and Training Data Selection Impact on an MT-driven Program Design with a Leading LSP.”

Lena Marg, Training Manager, and I delivered our presentation “Assumptions, Expectations and Outliers in Post-Editing.”

We take the EAMT conference and associated conferences (International, Asian and American) seriously, as most of the important developments that are currently taking place around machine translation (MT) are presented and followed up in those forums.

As a global language services provider (LSP), Welocalize adds value to the EAMT conference by being able to share real-life MT production experiences, demonstrated through thorough analysis of large and varied quantities of actual data. We are privileged in that we work in a real scenario where some of the new technologies around natural language processing (NLP) and MT can be tested in depth.

In their poster, Source Content Analysis + Training Data Selection Impact – EAMT POSTER by Welocalize, Olga and Dave stressed the importance of preparing the training corpus in advance and matching it to the specific requirements of the content that will subsequently be translated. To give an example, many translation memories come from different projects created at different points in time. They may contain inconsistencies, or the sentences in these translation memories can simply be too long or may contain a lot of “noisy” data. They need to be cleaned up before they can be used as engine training assets. Going deeper into the possibilities of automatic data selection and matching it with the source content, Olga and Dave spoke about our suite of analytic applications, divided between proprietary tools like Candidate Scorer, Perplexity Evaluator, StyleScorer and others that are being developed as part of an industry partnership with CNGL: Source Content Profiler and TMT Prime.
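The kind of translation memory cleanup described above can be sketched with a few basic filters. This is a minimal illustration under assumed rules (drop empty, overlong, badly length-mismatched and duplicate segment pairs); the function name, thresholds and sample data are hypothetical and do not describe the Welocalize tools named in the poster.

```python
def clean_translation_memory(pairs, max_words=80, max_ratio=3.0):
    """Filter (source, target) segment pairs before using them as training data."""
    seen = set()
    cleaned = []
    for source, target in pairs:
        source, target = source.strip(), target.strip()
        if not source or not target:
            continue  # one side is empty
        src_len, tgt_len = len(source.split()), len(target.split())
        if src_len > max_words or tgt_len > max_words:
            continue  # segment is too long
        if max(src_len, tgt_len) / min(src_len, tgt_len) > max_ratio:
            continue  # lengths differ too much, likely misaligned or noisy
        key = (source.lower(), target.lower())
        if key in seen:
            continue  # duplicate pair
        seen.add(key)
        cleaned.append((source, target))
    return cleaned


# Hypothetical English-German segments.
tm = [
    ("Save the file.", "Datei speichern."),
    ("Save the file.", "Datei speichern."),
    ("", "Leer"),
    ("OK", "Dies ist eine sehr lange und offensichtlich falsche Übersetzung"),
]
print(clean_translation_memory(tm))
# [('Save the file.', 'Datei speichern.')]
```

The word-count ratio is a crude check that suits language pairs of similar length; real pipelines would likely tune or replace it per language pair.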

Olga Beregovaya’s impressions about the EAMT Conference and Welocalize’s role within it are very positive. “Overall, the great thing about the conference was the applicability of the new generation of academic research in real live production scenarios. Many of the academic talks were relevant for the work on MT adaptation and customization that we do at Welocalize. Today, we need to cover more and more domains and content types so the domain and sub-domain adaptation is becoming the key area of our R&D. This means that we benefit greatly from academic and field research around data acquisition for training SMT systems and the relatively new developments around using terminology databases to augment the SMT training data. Not all of our clients come to us with their legacy translation memories, and while there is some public corpora available, we still need to rely on us acquiring and aligning data ourselves.”

Dave found two presentations he attended particularly interesting that focused on common pain points within the industry. “The challenges of using MT with morphologically rich languages are well-known, and we were happy to see interesting research in possible ways to overcome those challenges. We also found a talk on gathering training data from the web very interesting. The presenters discussed using general and specific data to train separate engines which could be weighted and combined to give improved results in cases of sparse in-domain training data. Indeed there were several innovations from academia that we are looking forward to incorporating into our bleeding-edge MT tools and processes.”

In our presentation, Lena and I focused on different challenges in a real MT production scenario: the necessity of forecasting future post-editing effort, with an emphasis on post-editors’ behavior and their personal and cultural circumstances as an important variable of the MT + PE equation. As part of a large LSP, we have been able to gather a large amount of data and focus on the quality of a number of MT outputs related to different languages and content types. Our presentation elaborated on our findings around correlations between different types of evaluation methods (automatic scoring, human evaluations and productivity tests). We obtained interesting findings on the relationship between the adequacy score in our human evaluation tests and the productivity gains in post-editing effort. We will continue gathering data and investigating this area.

Another topic that was touched upon during the conference was the area of quality. Lena and Olga both shared their perspectives:

“After closely following the QTLaunchpad project for several months, it was particularly interesting to see and discuss results from their error annotation exercises using MQM earlier in the year. Welocalize took part in these exercises by providing data and annotator resources. The findings of this exercise are contributing to further advances both in quality estimation and quality evaluation, fine-tuning metrics further for better inter-annotator agreement, etc. These discussions also provided some immediate take-aways for our approach to evaluation.” – Lena Marg

“The other area of high relevance to us is Quality Evaluation. Again, it is great to see so many research projects dealing with predicting MT quality and utility. While it still may be challenging to deploy such quality estimation systems in-production as various CAT tools and TMS systems have their own constraints around metadata-driven workflows, it is very encouraging to know that this research is available.” – Olga Beregovaya

“A general theme of the EAMT Conference was the question of how to increase cooperation between the translation and the MT research community. In this context, Jost Zetsche’s keynote speech was important in pointing out that translators should take an active interest in providing constructive feedback on MT and on how they work, to ensure new advances in MT developments are truly benefiting them. And yet, with the presence of some interested freelance translators, translation study researchers and a handful of LSPs presenting on MT, it would seem that progress has already been made in bringing the two sides together.” – Lena Marg

The EAMT Conference was a great opportunity to meet professionals, academics and researchers who work in the field of MT. The Welocalize team members were able to exchange ideas around the current pressing challenges surrounding MT technology and we still had time to admire the beautiful surroundings of historical Dubrovnik.

Laura Casanellas is program manager on the Welocalize Language Tools team.

Welocalize to Present at 17th Annual European Association for Machine Translation Conference

FREDERICK, MD – June 16, 2014 – Welocalize, global leader in innovative translation and localization solutions, will share industry insight and expertise at the 17th Annual Conference of the European Association for Machine Translation (EAMT) taking place in Dubrovnik, Croatia, June 16 – 18, 2014.

Senior members of the Welocalize Language Tools Team will be taking part in a number of presentations and discussions related to machine translation (MT) and automation at this year’s EAMT conference.

“Welocalize is excited to participate at this year’s EAMT 2014 Conference in Dubrovnik,” said Olga Beregovaya, VP of language tools and automation at Welocalize. “As more content is created every day, the demands for language services related to machine translation deployments is growing exponentially. EAMT is an important international conference where thought leaders and experts in machine translation can collaborate through shared research and innovations to advance our industry and meet the escalating demands.”

Featured Welocalize presentations at EAMT 2014:

For more information about the EAMT 2014 conference, visit and to find out more general information about EAMT, visit

About Welocalize – Welocalize, Inc., founded in 1997, offers innovative translation and localization solutions helping global brands to grow and reach audiences around the world in more than 125 languages. Our solutions include global localization management, translation, supply chain management, people sourcing, language services and automation tools including MT, testing and staffing solutions and enterprise translation management technologies. With over 600 employees worldwide, Welocalize maintains offices in the United States, UK, Germany, Ireland, Japan and China.

Welocalize Office Exchange Program from Portland to Dublin

Dave Landan is a pre-sales support engineer for Welocalize’s machine translation (MT) and language tools team and is based in Portland, Oregon. He recently spent a week in Dublin as part of the Welocalize Office Exchange Program. He shares his experience of Ireland and recaps his journey.

I’m a pre-sales support engineer on Welocalize’s MT and Language Tools team. I spend a lot of my time on MT pre-sales with prospective external clients and on supporting our existing clients’ MT programs. Some of my time is spent supporting internal clients and continually working to make MT better for everyone involved. I work from my home office in my garage near Portland, Oregon, with occasional visits to the Welocalize Portland office or down to California for meetings with clients or prospects.

I love traveling – new food, new people, and new sights. So, I jumped at the opportunity to participate in the Welocalize Office Exchange Program and visit our Welocalize Dublin office.  In addition to meeting several of my colleagues for the first time, the main purpose of my Dublin visit was for me to get up-to-speed on an exciting project that springs from our partnership with the Centre for Next Generation Localisation (CNGL).  CNGL is a collaborative academia-industry research center combining the expertise of researchers at Trinity College Dublin, Dublin City University, University College Dublin, and University of Limerick with localization industry partners. Welocalize is one of CNGL’s key industry partners and has worked with them for over three years now.

I left Portland on a Friday morning and touched down in Dublin about 13 hours later on Saturday morning.  Since I did not have any meetings scheduled until Monday, I rented a car and headed west.  Ireland is a beautiful country – lush, green, and very pastoral.  There’s so much history that I could have stopped every 5 km to see a centuries-old castle, abbey, monastery, or pub.  Instead, I put 600 km of touring on the car before passing out in my room. The next day, I enjoyed a hearty Irish Sunday breakfast of eggs, bacon, sausages, black and white puddings, grilled tomatoes, toast, and coffee. I was ready for anything!

The bulk of my week in Dublin was incredibly productive with meetings, demonstrations and presentations.  I spent two days in our Dublin office, meeting with colleagues and two and a half at Dublin City University meeting with some of the CNGL folks.  Meeting my colleagues in the Welocalize office was great. It gave us a chance to dive into details that we would not normally have the time for during the regular work week.  I was also able to get a good sense of what a “day in the life” is like for the people who I support. I will try to translate that into better tools and processes for my internal work.

Olga Beregovaya, Lena Marg, and Alex Yanishevsky were in Dublin as well that week. We all got together with the Dublin MT and Language Tools folks for a lovely dinner in Dun Laoghaire.

As for the CNGL work that I took part in during my visit, I can’t tell you much about that just yet.  Suffice to say there’s some exciting and unique work in the world of weMT that should come to light soon. Watch this space for updates as they are available.

We managed to accomplish all of the goals we had set before the trip with a half day to spare. On Friday afternoon, I managed to visit several of the Dublin sights that I missed in my first tour.  In all, the trip was both fun and productive and I can’t wait to go back.


Welocalize Brings to Market Innovative Translation Productivity iOmegaT Tool in Collaboration with CNGL

FREDERICK, MD–(Marketwired – Mar 12, 2014) – Welocalize, a global leader in translation and localization services, announced today they are now licensing the innovative iOmegaT technology to help translators maximize productivity. iOmegaT was developed in collaboration between Welocalize and the Irish-based academia-industry research center, CNGL (Centre for Global Intelligent Content).

The iOmegaT Translation Productivity Test-Bench is a suite of language tools used to gather translator activity data and calculate the effect of machine translation (MT) on translation speed. iOmegaT integrates with and gathers data from OmegaT, a free, open-source translation memory Computer Aided Translation tool used by many professional translators.

iOmegaT is one of three language technology projects that Welocalize has been developing in close collaboration with CNGL over the past two years in order to measure and improve translation productivity and delivery. Welocalize contributed significant engineering and project management expertise to the development of iOmegaT.

“The iOmegaT language software tools make it possible to measure the impact of machine translation on translator speed relative to traditional translation, which is a huge improvement over traditional methods where comparisons are made of raw and post-edited MT outputs,” said Dave Clarke, principal engineer of language tools at Welocalize.

For the localization industry, iOmegaT will help measure the success of MT programs and aid translators in defining fair pricing for post-editing work. Translators using the iOmegaT tool can increase productivity by reducing onerous reporting requirements of daily activities and use more of their time for translating and post-editing activities.

Welocalize recently announced that OmegaT integrates with Welocalize’s open-source translation management system (TMS), GlobalSight. With the iOmegaT integration, Language Service Providers (LSPs) can collate real-time data on translator activity. LSPs can use this data to assess the success of MT programs.

The prototype for iOmegaT was developed in 2011 by John Moran, a CNGL Ph.D. student. Welocalize has worked closely with John Moran, who completed most of the development work for iOmegaT on-site at Welocalize’s office in Dublin. The licensing of iOmegaT is the culmination of years of work and collaboration between Welocalize and CNGL.

“Welocalize and CNGL have worked closely over the years to move the iOmegaT project to completion,” said John Moran, CNGL. “Licensing the iOmegaT software will help the global translation and localization industry by bringing another dimension in business intelligence and data analysis, ultimately enhance MT quality, translator productivity and most importantly, the client and their content.”

CNGL combines the expertise of researchers from four top universities in Ireland with industry partners, including Welocalize, to produce research and technologies for MT, localization and global content adaptation and delivery, as well as driving standards in the localization industry.


About CNGL – CNGL is a EUR 58M collaborative academia-industry research center dedicated to the development of advanced content processing technologies to adapt and personalize digital content and services to meet the different language and technical needs and preferences of users across global markets. The CNGL research centre combines the expertise of its more than 150 investigators at four universities: Trinity College Dublin (lead institute), Dublin City University, University College Dublin and University of Limerick, as well as that of its industry partners to produce technologies with significant economic and societal impact.

CNGL Media Contact: Laura Grehan, Marketing and Communications Officer, Tel: +353 1 7006705

Link to press release:

Media Contacts:
US: Jamie Glass
Europe/Asia: Louise Law

Welocalize at memoQfest Americas Discusses MT and Better Translations

The annual translation technology conference, memoQfest Americas, took place in Los Angeles last week. David Landan from Welocalize’s Language Tools Team was invited to present about MT at the conference. In this blog, he shares his experience.

Presenting at memoQfest Americas 2014 was an important event for me in several ways. Not only was it my first time attending a memoQfest conference, it was also my first time representing Welocalize at a conference. The icing on the cake was being asked to give a talk about MT at the event.

Public speaking isn’t my strong suit. (Unless it’s about wine, but that’s another story.) I live near Portland, Oregon (rain, anyone?), so when I was asked to spend a few days in sunny Los Angeles in February, I didn’t need to think twice. I put my nervousness aside, prepared a talk, and packed my bags.

MemoQfest was a unique experience. Kilgray Translation Technologies is a fast-growing translation technology company that makes memoQ, an advanced translation environment for translators and reviewers. The company has been putting on the event for several years in their home country of Hungary. For the past few years, they have also hosted an annual event in the US. While many company-sponsored conferences are free to attend and used as an opportunity to sell product or gain exposure, memoQfest attendees pay to attend and most are die-hard memoQ users. Attendees are primarily translators and project managers (PMs), with a few executives, salespeople and tech support folks.

Kilgray uses the event for education, and it is a way for them to both offer workshops on current versions of their products and to announce what is in the works for the next release. What is most notable and refreshing is that the Kilgray folks court criticism at the event. They genuinely want to make their users happy and they take the criticism and feature suggestions seriously. This year’s upcoming release includes fixes and new features suggested at last year’s memoQfest conferences.

Machine translation (MT) is a big, exciting topic in the localization industry. MT was represented in the presentations (mine included) and in the discussions that were happening outside of the scheduled events. I presented a rather technical talk about Welocalize’s work in improving localization throughput by using a set of analytical tools to make MT better. Click here: Better translations through automated source and post-edit analysis, to view the slides from my presentation.

One thing that surprised me was how many translators use generic MT (like Google or Bing) in their day-to-day work. The thing that people need to understand is that computers are dumb. If I ask you what word comes next in the sentence “I need to pick up a dozen eggs and some milk from the …” you’d probably guess something like “store” or “market”. In statistical natural language processing, if your training data includes the phrase “milk from the” followed by “cow” often, then the system will think that “I need to pick up a dozen eggs and milk from the cow” is a perfectly reasonable sentence, because it’s the one with the best probability given the data that was used to train it.
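
The eggs-and-milk example can be made concrete with a toy n-gram counter. This is only a minimal sketch, not how any production MT engine is built: the training sentences are invented for illustration, and the function names `train_ngram_counts` and `predict_next` are hypothetical. It shows how a model that only counts what follows "milk from the" will pick "cow" when farm text dominates the training data.

```python
from collections import Counter, defaultdict

def train_ngram_counts(sentences, context_size=3):
    """Count which word follows each run of `context_size` words."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i in range(len(tokens) - context_size):
            context = tuple(tokens[i:i + context_size])
            counts[context][tokens[i + context_size]] += 1
    return counts

def predict_next(counts, context_words):
    """Return (word, probability) for the most frequent continuation, or None."""
    followers = counts.get(tuple(w.lower() for w in context_words))
    if not followers:
        return None
    word, count = followers.most_common(1)[0]
    return word, count / sum(followers.values())

# Invented toy corpus: three farm sentences, one shopping sentence.
training = [
    "we get fresh milk from the cow every morning",
    "the farmer takes milk from the cow at dawn",
    "children drink milk from the cow on the farm",
    "i bought milk from the store yesterday",
]

counts = train_ngram_counts(training)
prediction = predict_next(counts, ["milk", "from", "the"])
print(prediction)  # ('cow', 0.75)
```

The model has no idea that eggs and milk suggest a shopping trip; it only sees that "cow" followed the same three words three times out of four.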

MT output is only as good as the data used to train the engine. With large generic MT engines, the training data is very noisy. In fact, some of the training data that’s automatically scraped ends up being someone else’s unedited bad MT output. Garbage in, garbage out, as they say. That is not to say that everything you get from generic MT is garbage. Google and Bing do reasonably well for high-resource languages in general domains. But if you need professional quality work, you need a professional quality MT engine. To get a professional quality MT engine, you need good data and you need to use translators to post-edit the MT output, depending on what quality levels are required.

What we have developed within the MT and Language Tools team at Welocalize is a way to identify good, clean data so you start with a better engine. We don’t stop there — our tools can identify trouble spots in MT output and we have tools and processes for post-editing that provide a feedback loop to keep improving on every project. Exciting stuff, right?

Now, if only I hadn’t brought the rain with me from Portland to LA.




Welocalize to Present in LA at Translation Technology Conference memoQfest Americas

Frederick, Maryland – February 25, 2014 – Welocalize, a global leader in translation and localization, will be sharing machine translation (MT) knowledge and expertise at the 2014 memoQfest Americas conference in Los Angeles, February 27 through March 1.

David Landan from Welocalize’s Language Tools Team will be presenting “Better translations through automated source and post-edit analysis” on day two of the conference.

“My presentation at memoQfest Americas will discuss how Welocalize is developing processes and tools grounded in computational linguistics and NLP to reduce post-editing effort,” said David Landan, support engineer at Welocalize. “We analyze data using techniques from machine learning, language modeling, and information retrieval. Our data-driven approach allows us to build more targeted, more accurate MT systems.”

David will explore ways of automating training data selection using a source content analysis suite and show how the selected data led to improved MT engine quality by using Welocalize’s WeScore and StyleScorer as a way to evaluate translations. Welocalize’s WeScore is a dashboard for viewing several metrics in a single application. It makes automatic scoring of MT output easier by handling input parsing formats, tokenization, and running multiple scoring algorithms in parallel.
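
The idea of tokenizing once and then running several scoring algorithms side by side can be sketched as follows. This is an illustration only, not WeScore itself: the source does not say which metrics WeScore computes, so the three metrics here (clipped unigram precision, word error rate and length ratio) and every function name are assumptions chosen for simplicity.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

def tokenize(text):
    """Lowercase and split on whitespace; a stand-in for real tokenization."""
    return text.lower().split()

def unigram_precision(hyp, ref):
    """Share of hypothesis tokens found in the reference, with clipped counts."""
    hyp_counts, ref_counts = Counter(hyp), Counter(ref)
    matched = sum(min(n, ref_counts[tok]) for tok, n in hyp_counts.items())
    return matched / len(hyp) if hyp else 0.0

def word_error_rate(hyp, ref):
    """Word-level Levenshtein distance divided by reference length."""
    prev = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, start=1):
        curr = [i]
        for j, r in enumerate(ref, start=1):
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + (h != r)))
        prev = curr
    return prev[-1] / len(ref) if ref else 0.0

def length_ratio(hyp, ref):
    """Hypothesis length relative to reference length."""
    return len(hyp) / len(ref) if ref else 0.0

# Hypothetical metric registry; a real dashboard would plug in its own scorers.
METRICS = {
    "unigram_precision": unigram_precision,
    "word_error_rate": word_error_rate,
    "length_ratio": length_ratio,
}

def score_all(hypothesis, reference):
    """Tokenize once, then submit every metric to a thread pool."""
    hyp, ref = tokenize(hypothesis), tokenize(reference)
    with ThreadPoolExecutor() as pool:
        futures = {name: pool.submit(fn, hyp, ref) for name, fn in METRICS.items()}
        return {name: future.result() for name, future in futures.items()}

scores = score_all("the cat sat on mat", "the cat sat on the mat")
print(scores)
```

A thread pool keeps the sketch short; for CPU-heavy metrics in CPython, a process pool would likely be the better fit for real parallelism.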

Machine translation (MT) is a topic with a high level of interest at localization and translation industry events. As global organizations produce more and more content and the demands for quick localization grow, Welocalize will highlight how combining translation approaches, like MT and post-edit analysis, can achieve the desired quality of output that meets time and budget goals.

memoQfest is an annual conference, hosted by Kilgray Translation Technologies, to learn more about trends within the translation technology industry. The memoQfest event also provides networking opportunities for translators, language service providers and translation end-users.


Welocalize Office Exchange Program 5,078 Miles from San Francisco to Dublin

Andy Mallett is a project manager at Welocalize for the Language Tools Team. She is based at the Welocalize office in San Mateo, California. Andy works regularly with a team in the Dublin office. Keen to meet her colleagues, she applied for the Office Exchange Program, got accepted and hopped on a plane bound for Ireland. She gave us answers to a few questions we asked about her experience.

What do you do at Welocalize?

I work as a Project Manager (PM)  for the Welocalize Language Tools Team. My role covers all main PM tasks and includes coordinating machine translation (MT) assessments and conducting productivity tests.

Why did you apply to the Welocalize Office Exchange Program?

A lot of the Language Tools Team are based in Dublin and we are on opposite time zones. Sometimes, we only have a few hours that overlap each day. Visiting the team in person gave me a chance to spend time with my team as well as meet with other Welocalize teams that we work with on a regular basis. I was also able to meet up with a local client. It gave me and the team the opportunity to have longer, sustained discussions about important issues that sometimes don’t get addressed when there are immediate production issues.

What surprised you the most?

Although I know how many people are based at our Dublin office, it still surprised me to see how busy it was in the office! I was told it was an unusually busy week. I loved how lively the office was, with almost every division of the company represented in person. The energy was great.

Did you try anything new whilst you were in Dublin?

Yes, meat! I’m a vegetarian but some things you just have to try, like the full Irish breakfast – even the black pudding.

What two things did you learn from your trip?

First, the history of the entire localization industry could be told by the people at the Welocalize Dublin office. The institutional memory is amazing and exciting.  Second, if you arrive at Dublin Airport at 2am, the only place open is McDonalds. (Not very Irish)

How do you think the business will benefit from your exchange experience?

A central theme for this exchange was to find ways to position our team as a more accessible shared service in 2014. The Welocalize Language Tools Team hopes to better educate internal teams so they can then go on to communicate what we offer and why it is so valuable. This will develop relationships with existing clients and with new clients who are looking for Language Tools solutions.


Welocalize and Intuit at TAUS 2013: From Zero to MT Deployment

Welocalize Senior Solutions Architect, Alex Yanishevsky, delivered a joint presentation with Render Chiu, Group Manager, Global Content & Localization from Intuit, at the recent TAUS annual conference in Portland, Oregon.

In their presentation “How STE and Analytical Tools Enabled MT Program”, Alex and Render shared valuable insights about the Welocalize-Intuit machine translation (MT) program. They specifically detailed experiences and best practices in going from zero to MT deployment across 11 languages in a short 90 days.

Their TAUS presentation focused on the role of language tools and analytics in meeting a global organization’s need for fast product expansion with localized solutions. Alex and Render presented how Welocalize and Intuit leveraged publicly available data to train an initial set of MT engines and build a business case to go into production with MT.

Alex Yanishevsky shares his five key highlights from the presentation:

  • The Welocalize and Intuit MT program was deployed in 3 months for 11 languages
  • We trained Microsoft Translator with significant improvement over baseline engines on very sparse bilingual data
  • Intuit’s adherence to Simplified Technical English made MT onboarding much easier
  • Welocalize-specific analytics, part of our secret sauce, along with POS Candidate Scorer, Perplexity Evaluator and Tag Density Calculator were used to analyze source content suitability
  • We used weScore, part of Welocalize weMT framework, to calculate analytics on MT engine quality, such as auto-scoring, human evaluation, productivity metrics
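
To give a feel for what a source-suitability check might look like, here is a minimal sketch of a tag density measure. The source does not describe how the Tag Density Calculator works, so the definition used here (inline markup tags as a share of all tags plus words in a segment) and the names `TAG_PATTERN` and `tag_density` are assumptions made purely for illustration.

```python
import re

# Assumed definition of a tag: simple HTML/XML-style inline tags such as <b> or </b>.
TAG_PATTERN = re.compile(r"</?[A-Za-z][^<>]*>")

def tag_density(segment):
    """Fraction of units in the segment that are markup tags rather than words."""
    tags = TAG_PATTERN.findall(segment)
    words = TAG_PATTERN.sub(" ", segment).split()
    total = len(tags) + len(words)
    return len(tags) / total if total else 0.0

segments = [
    "Click <b>Save</b> to keep your changes.",
    "Open the File menu and choose Export.",
]
densities = [tag_density(s) for s in segments]
print(densities)  # [0.25, 0.0]
```

Under this assumed definition, segments with a high share of tags could be flagged for closer review before they are sent through MT.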

The full TAUS presentation is available to view here: How STE and Analytical Tools Enabled Intuit MT Program Welocalize TAUS 2013

You can also view the presentation given by Welocalize’s Olga Beregovaya at the TAUS Showcase at LocWorld here: WeMT Tools and Processes

In addition, you can learn more about the Welocalize and Intuit MT story as presented by Tuyen Ho, Senior Director at Welocalize, and Intuit’s Render Chiu at LocWorld 2013 in Silicon Valley by viewing: “Silver Linings Playbook – Intuit’s MT Journey”. You can also read Tuyen’s blog about the presentation with Intuit.

MT and the French Riviera

By Laura Casanellas

This month I found myself in Nice, at the heart of the French Riviera, discussing Machine Translation (MT) and surrounded by experts in the field of MT and localization. As a member of Welocalize’s global Language Tools Team, this was a great opportunity to learn about the latest advances in MT.

I attended the Machine Translation Summit XIV, the international conference which takes place every two years, bringing together important names in the area of MT from the European, Asian and American sister associations.

In the world of localization and translation, MT is a growing phenomenon. Successful MT deployments are on the up. Perception around MT is rapidly changing; whereas a couple of years ago people would have focused on quality of MT output (or the lack of it as they saw it), nowadays what users want to know is how to make it work. This approach is helping the time, cost and quality equation which is always at the centre of every localization business. As more digital content is created and the possibilities of making it accessible to different locales become tangible, many companies are beginning to understand that not every type of text needs to be translated to the same level of quality. Once this level has been established, time to market and cost can be adjusted accordingly. The role of MT is becoming more significant as a consequence of the dramatic increase in volume of content.

The subject of MT is spreading into the commercial world. Proof of this was the large presence of commercial users among the conference attendees.

I presented at the MT Summit together with my colleague Lena Marg. Lena is in charge of training our language force on Post-Editing practices. In our presentation, “Connectivity, Adaptability, Productivity, Quality, Price….Getting the MT Recipe Right”, we explained Welocalize’s practices around MT and what we consider to be all the necessary elements of a successful Machine Translation program.

Olga Beregovaya and Dave Clarke from the Welocalize Language Tools Team also took part in the Summit, delivering a joint presentation with Dr. Alon Lavie, CEO of Safaba Translation Solutions and CMU research professor, a highly regarded figure in the world of MT. The presentation was entitled “Analyzing and Predicting MT Utility and Post-Editing Productivity in Enterprise-scale Translation Projects”. This presentation set out the joint Welocalize – Safaba research which has begun to identify the effect of features in ‘real-world’ content on post-editing efficiency and predictability, such as: presence, retention and placement of tags; recognized terminology; do-not-translate lists and more.

For the first time ever, this MT Summit showed a slightly higher attendance figure for industry and user representatives than for academics, with a high proportion of participants attending the user track presentations. This is a clear sign that the commercial world is following academics and researchers and that MT is becoming mainstream.

The strategy of Welocalize is to keep abreast of all technological advances currently taking place around language tools in general and MT in particular, and to use and deploy the new MT technologies that we identify as having true commercial potential.

You can view the presentations by visiting:

Another MT case study will be showcased at the forthcoming Localization World Silicon Valley, 9-11 October. “Silver Linings Playbook – Intuit’s MT Journey” will be presented by Render Chiu from American software company Intuit and Tuyen Ho, Welocalize’s Senior Sales Director for North America. Render and Tuyen will be talking about how Intuit and Welocalize architected an MT program that supports the enterprise and successfully met an aggressive product launch schedule.

Welocalize to Present at Machine Translation XIV Summit in Nice, France

FREDERICK, MD–(Marketwired – Aug 28, 2013) Welocalize, a global leader in translation and localization services and products, will share insights and expertise at the Machine Translation XIV Summit taking place September 2-6, 2013 at the Acropolis Conference Centre in Nice, France.

The 14th Machine Translation (MT) Summit, organized by the International Association for Machine Translation and the European Association for Machine Translation, will include keynote speeches by renowned experts in MT, as well as panel discussions and presentations of papers.

Welocalize MT Program Manager Laura Casanellas will be presenting her paper, Connectivity, Adaptability, Productivity, Quality, Price: What are the Necessary Ingredients to get the MT Recipe Right?, Wednesday, September 4.

Olga Beregovaya, Welocalize vice president of language tools, and David Clarke, Welocalize MT architect, will present Analyzing and Predicting MT Utility and Post-Editing Productivity in Enterprise-Scale Translation Projects with Welocalize MT technology partner, Safaba. The presentation takes place Thursday, September 5.

“We know machine translation is crucial to a successful global localization and translation strategy,” said Olga Beregovaya, Welocalize vice president of language tools. “This conference is a meeting of the greatest minds in MT. We are pleased that Welocalize has a strong presence at the annual MT Summit and plays a key strategic role in shaping the future of MT for global brands and enterprises.”

For information about the MT Summit, visit

Press release: