Predictive analysis in the localization industry is swiftly becoming a key approach to improving quality and efficiency in the translation workflow. Extracting information from existing datasets to determine patterns and predict future outcomes can significantly help the translation automation process. In this blog, Welocalize Technology Solutions team members Dave Clarke and Dave Landan give us an update on how two new Welocalize tools are benefiting clients.
It is becoming increasingly important for language service providers (LSPs) to quickly determine the nature of content: how suitable it actually is for the envisaged localization outcomes and, subsequently, which processes and workflows it should be routed through to meet those expectations. At the same time, today’s clients have an ever-increasing volume, and often diversity, of content to be translated. The ability to analyze large quantities of source content quickly, accurately, and consistently is therefore imperative.
Welocalize has recently added two new tools to its language tool portfolio to help automate these analyses: TMTprime and StyleScorer.
TMTprime was developed through a collaboration between Welocalize and the Centre for Next Generation Localisation (CNGL, now ADAPT). TMTprime provides a way to predict which of several translation assistance systems, whether translation memories (TMs) and/or machine translation (MT) engines, would provide the best output for any given content set. Given TMs and/or MT training data and a “tuning set,” TMTprime learns to predict which of the systems it is trained on is best for different source content types. We are also currently researching the capabilities of TMTprime when applied to the task of predictive quality analysis, with a view to drastically reducing and, more often, replacing the running of costly and time-consuming human evaluations of multiple MT engines.
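To make the idea of learning from a tuning set more concrete, here is a minimal Python sketch. It is not TMTprime's actual method, which is not described in this post. It assumes a hypothetical tuning set of (source text, best system) pairs and simply picks the system whose tuning-set vocabulary overlaps most with the new source. The names `train_selector`, `predict_system`, `tm_ui`, and `mt_marketing` are invented for illustration.

```python
from collections import Counter, defaultdict


def tokens(text):
    """Lowercase whitespace tokenization (a deliberate simplification)."""
    return text.lower().split()


def train_selector(tuning_set):
    """Build one word-count profile per system.

    tuning_set is a list of (source_text, best_system) pairs, where
    best_system is the hypothetical label of the system that produced
    the best output for that source text.
    """
    profiles = defaultdict(Counter)
    for source, best_system in tuning_set:
        profiles[best_system].update(tokens(source))
    return profiles


def predict_system(profiles, source):
    """Return the system whose profile overlaps most with the source text."""
    words = tokens(source)

    def overlap(system):
        profile = profiles[system]
        total = sum(profile.values())
        # Normalize by profile size so larger profiles do not win by default.
        return sum(profile[w] for w in words) / total if total else 0.0

    # Sorting makes tie-breaking deterministic (alphabetical order).
    return max(sorted(profiles), key=overlap)


tuning_set = [
    ("click the save button to store the file", "tm_ui"),
    ("our new phone delivers stunning photos", "mt_marketing"),
    ("click cancel to close the dialog", "tm_ui"),
]
profiles = train_selector(tuning_set)
print(predict_system(profiles, "click the button to open the file"))
```

A production system would use far richer features and a properly trained model; the sketch only shows the shape of the workflow: train on labeled examples, then route new content to the predicted system.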
StyleScorer is a proprietary Welocalize tool that learns the content authoring style of a set of documents and then, through a scoring system, rates how well new content matches the style of the initial documents. Analytic tools like Welocalize StyleScorer can work with documents in any language and can be useful for analyzing both source and target content. Automated analysis of source content gives a fast, accurate impression of its suitability for translation, or of potential difficulties, at the very beginning of the production cycle, which is exactly the right time to be informed. Further through the cycle, analyzing target content gives us a way to automate certain tasks in linguistic quality analysis (LQA).
By running StyleScorer on raw MT output, we can use the scores to rank documents by how much post-editing (PE) they are likely to need to bring them in line with the style of known target documents. This is good news when time is precious because it allows us to focus PE work where it is needed.
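The ranking step can be sketched in a few lines of Python. StyleScorer is proprietary and its scoring method is not described in this post, so the example below swaps in a simple stand-in: cosine similarity between character trigram counts. The function names and the sample documents are assumptions made purely for illustration.

```python
from collections import Counter
import math


def char_ngrams(text, n=3):
    """Count overlapping character n-grams in lowercased, whitespace-normalized text."""
    text = " ".join(text.lower().split())
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (0.0 if either is empty)."""
    dot = sum(count * b[gram] for gram, count in a.items() if gram in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_style_profile(reference_docs):
    """Merge the n-gram counts of all known-good target documents."""
    profile = Counter()
    for doc in reference_docs:
        profile.update(char_ngrams(doc))
    return profile


def rank_for_post_editing(profile, mt_outputs):
    """Return (doc_id, score) pairs, lowest style score first."""
    scored = [(doc_id, cosine(profile, char_ngrams(text)))
              for doc_id, text in mt_outputs.items()]
    return sorted(scored, key=lambda pair: pair[1])


# Hypothetical data: one reference document and two raw MT outputs.
reference = ["Click Save to store your changes."]
mt_outputs = {
    "doc_a": "Click Save to store your changes.",
    "doc_b": "Storing of changes happens upon pressing.",
}
profile = build_style_profile(reference)
ranking = rank_for_post_editing(profile, mt_outputs)
for doc_id, score in ranking:
    print(doc_id, round(score, 3))
```

Documents at the top of the list match the reference style least and are the first candidates for post-editing attention.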
TMTprime and StyleScorer are just two examples of the cutting-edge tools that Welocalize uses to make sure that content gets translated as quickly as possible, to appropriate quality levels. More exciting innovation in the area of content analysis will be brought out later this year, so watch this space!
Welocalize Technology Solutions
Dave Clarke and Dave Landan
For further reading on StyleScorer, read Dave Landan’s blog: Welocalize StyleScorer helps MT and Linguistic Review Workflow