Insights on Quality and MT for Localizing User Generated Content 

The premise behind the Welocalize LocLeaders event series is to provide a forum where some of the localization industry’s most influential leaders can discuss new concepts and how these ideas will shape the future of global business. One challenge the industry must address is how to deal with the enormous volumes of source content, primarily network and user generated content (UGC), now flooding our lives daily.

There is great opportunity for businesses that can apply strategic, innovative approaches to localizing UGC and create greater engagement with customers all over the world. The risks of not localizing also warrant close examination, since ignoring them can cost a business a crucial edge over the competition.

The Welocalize LocLeaders Forum 2016 Montreal panel discussion, “Quality Validation for Network Generated Content,” brought together expertise from attendees who gave valuable insight into the challenges of dealing with evolving types of source content, including UGC.

Budgets, quality expectations and the sense of urgency for localization all differ depending on who will consume the localized content, and what, when, why and how. Source content needs to be categorized, with quality expectations defined for each category; a calculation of the potential return on investment then determines the priority for localization. UGC usually has a short shelf-life. However, for some content, such as a breaking news announcement, the instant impact and gratification of the message mean that localization must be done fast and accurately, or not at all.
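To make the categorize, define quality, then rank by ROI idea concrete, here is a minimal Python sketch. It is an illustration only, not a method presented at the panel: the category labels, quality tiers, field names and the simple ROI formula (estimated value minus cost, divided by cost) are all assumptions, and the figures are made up for the example.

```python
from dataclasses import dataclass


@dataclass
class ContentItem:
    # All fields and labels below are hypothetical, chosen for illustration.
    name: str
    category: str           # e.g. "breaking_news", "product_review"
    quality_tier: str       # e.g. "publishable", "gist"
    expected_value: float   # estimated benefit of localizing this content
    cost: float             # estimated cost of localizing it (must be > 0)


def roi(item: ContentItem) -> float:
    """Simple return on investment: net benefit relative to cost."""
    return (item.expected_value - item.cost) / item.cost


def prioritize(items: list[ContentItem]) -> list[ContentItem]:
    """Order content so the highest-ROI items are localized first."""
    return sorted(items, key=roi, reverse=True)


# Made-up example figures, not real data.
backlog = [
    ContentItem("forum thread", "community_forum", "gist", 120.0, 100.0),
    ContentItem("news alert", "breaking_news", "publishable", 900.0, 300.0),
    ContentItem("customer review", "product_review", "gist", 500.0, 200.0),
]

ranked = prioritize(backlog)
for item in ranked:
    print(f"{item.name}: ROI {roi(item):.2f} ({item.quality_tier})")
```

In practice the value and cost estimates are the hard part; the ranking itself is trivial once each content category has agreed quality expectations and figures attached.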

The localization budget plays an important role. To translate high volumes of UGC using a more traditional localization approach would be too expensive and time consuming.

There is a lot of buzz around community or crowd-sourcing models as a highly cost-effective and scalable way to both translate and validate UGC. For the most part, these models rely on the goodwill of their user base. However, closer evaluation reveals that a ‘crowd’ willing to offer its services for free cannot be expected to mobilize for just any content type. There needs to be a deep-rooted passion for a product or a movement, which, in itself, drives a desire to make sure that consumers in the target market are able to experience the product or message in their native language. If such a community doesn’t exist, then other options need to be explored.

Machine translation (MT) is quickly becoming a standard tool for localizing UGC. Our LocLeaders panelists were all able to provide examples of how MT has sped up time-to-market, increased efficiency and reduced costs for their business.

MT translates content types that would otherwise have been overlooked, or pushed through traditional localization methods that don’t suit next generation content types like UGC. With wider usage of MT, the role of the translator is shifting to that of a post-editor, with a focus on improving the raw MT output so that it can be reused and the quality of the MT engine gradually improves over time.

The debate over the optimal way to localize UGC is only just beginning. By definition, we expect that users will increasingly devise their own creative methods for rendering source content into target formats. Welocalize aims to stay at the forefront of these developments and we will keep the discussions flowing at future LocLeaders Forums to ensure we continue to drive unique and innovative solutions for our clients.


Samantha Henderson, Senior Client Services Director at Welocalize

Samantha was a featured host at LocLeaders Forum 2016 Montreal for the panel discussion, “Quality Validation for Network Generated Content,” with Loy Searle from Intuit, Hanna Kanabiajeuskaja from Box and Andrzej Poblocki from Veritas. Sam also joined Katie Belanger from Intuit at LocWorld Montreal the same week to present “Localization Models: The Search for the Optimal Linguistic Resource Model.” If you would like to reach Sam to learn more about these presentations, reach out to