Interview by Louise Law, Welocalize Communications Manager
I recently met with John Moran, an experienced translator and programmer who is working on a PhD in Computer Science. John has worked closely with Welocalize and CNGL (The Centre for Global Intelligent Content) for many years. In 2011, Welocalize began its partnership with the Irish-based academia-industry body. Very shortly after Welocalize joined CNGL, conversations began between John Moran and Dave Clarke, Welocalize Principal Engineer.
John’s research idea was to gather real-time data from translators post-editing MT output compared with translating “from scratch” using an instrumented CAT tool that records how a translation is produced rather than just the final translation. This work has resulted in a joint development effort with Welocalize contributing its developments to the code base and the commercial licensing of the iOmegaT Translator Productivity Workbench from Trinity College Dublin. iOmegaT measures the impact of MT on translator speed cheaply and accurately in a professional grade CAT tool. You can read more about it in the March release, in which Welocalize announced the licensing of the iOmegaT technology in collaboration with CNGL. I caught up with John to find out his latest thoughts on MT and ask him how the iOmegaT project is progressing.
How long have you worked with Welocalize?
I worked in-house at the Welocalize Dublin office for nearly a year from 2011 to 2012, around the time Welocalize began their collaboration with CNGL. Since then Christian Saam, the second member of the iOmegaT team in CNGL, and I have been working with the Welocalize MT team and HP to test and refine the workbench. I have about ten years of commercial application development behind me so I am used to seeing software evolve; however, it is particularly satisfying when you can take something from proof of concept to a commercially viable solution. It’s definitely fair to say that this would not have been possible without the Welocalize team. I had touted the idea to a few translation companies and Dave Clarke at Welocalize spotted its potential right away. His engineering expertise complemented my own very well and Welocalize already had a very advanced MT program when I came on the scene, so there was post-editing work to test it on.
Can you tell me about your PhD work?
The problem I am trying to solve is how to accurately measure the impact of MT on a translator’s working speed using a technique we call Segment Level A/B testing. I had the idea for iOmegaT after I used MT for one of my own translation clients in OmegaT, a free open-source CAT tool I use whenever I can instead of Trados. At the end I could not tell if MT had helped me in terms of working speed, as I was so caught up in the translation itself. I suspected it had, as I was able to use a few sentences without changing them and the MT gave me a few ideas for terminology that might have taken me a few seconds longer to think of without it. I wanted hard data to support that intuition. Removing the MT from random sentences and measuring the speed ratio seemed like a good way of doing the measurement.
In order to do this, I adapted OmegaT to log user activity data as the translator translates some randomly chosen sentences from scratch (A) and post-edits other sentences (B). We call translation-from-scratch HT, shorthand for human translation.
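As a rough illustration of the random split described above, here is a minimal Python sketch. It is not iOmegaT's actual code: the function name `assign_conditions`, the `mt_fraction` parameter and the segment IDs are all hypothetical, and it simply assumes each segment is independently assigned to HT (A) or MT post-editing (B).

```python
import random


def assign_conditions(segment_ids, mt_fraction=0.5, seed=None):
    """Randomly label each segment as 'HT' (A) or 'MT' (B).

    Hypothetical helper for illustration only. mt_fraction is the
    probability that a segment is shown with an MT suggestion.
    """
    rng = random.Random(seed)  # a seed makes the split reproducible
    return {
        seg_id: ("MT" if rng.random() < mt_fraction else "HT")
        for seg_id in segment_ids
    }


# Usage: decide which of ten segments are shown with MT.
plan = assign_conditions(range(1, 11), mt_fraction=0.5, seed=42)
for seg_id, condition in plan.items():
    print(seg_id, condition)
```

Because the assignment is random at the segment level, both conditions are sampled from the same document, translator and working session, which is what makes the later speed comparison meaningful.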
This data is later analyzed to generate something we call an HT/MT SLAB score, as “Human Translation versus MT Post-edit Segment Level A/B” score is a bit of a mouthful. For example, a +54% HT/MT SLAB score indicates a particular translator was 54% faster using MT on a particular project. We also take the time a translator spends doing self-review into account. The system we developed to calculate this score is called iOmegaT. The “i” stands for instrumented. Others had thought of doing that using minimally functional web-applications; however, we were the first to do it in a professional grade CAT tool.
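The exact formula behind the score is not given here, so the following Python sketch rests on an assumption: that the score is the relative difference in words per hour between post-edited (MT) and from-scratch (HT) segments. The record layout (`condition`, `words`, `seconds`) and the function names are hypothetical, and `seconds` is assumed to already include any self-review time spent on the segment.

```python
def words_per_hour(records):
    """Throughput over a set of segment records, in words per hour."""
    total_words = sum(r["words"] for r in records)
    total_seconds = sum(r["seconds"] for r in records)
    if total_seconds <= 0:
        raise ValueError("no time recorded for these segments")
    return total_words / total_seconds * 3600


def slab_score(records):
    """Assumed HT/MT SLAB score: percent speed gain of MT over HT."""
    ht = [r for r in records if r["condition"] == "HT"]
    mt = [r for r in records if r["condition"] == "MT"]
    if not ht or not mt:
        raise ValueError("need at least one HT and one MT segment")
    return (words_per_hour(mt) / words_per_hour(ht) - 1) * 100


# Usage with made-up numbers: equal time spent in both conditions,
# but 154 source words post-edited versus 100 translated from scratch.
log = [
    {"condition": "HT", "words": 100, "seconds": 600},
    {"condition": "MT", "words": 154, "seconds": 600},
]
print(f"{slab_score(log):+.0f}%")  # +54%
```

Under this assumed definition, a positive score means the translator was faster with MT and a negative score means MT slowed them down.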
What do you think are the main barriers and challenges for companies looking to use MT?
I think the main barrier is that about three quarters of translators (in Europe at least) are freelancers and the vast majority use CAT tools like Trados, MemoQ and Wordfast. These CAT tools don’t report on how MT impacts on post-editing speed, so it makes it very hard to negotiate fair discounts when the translator is not working in-house.
One of the things we found by giving the same files to post-edit to different translators is that the MT utility can vary from person to person. You need to have some way of identifying people who can work well with MT. Edit distance does not really capture that directly. Because not everyone finds MT equally useful, as an agency you can very quickly find yourself in a situation where the discount you are asking for is unfair. Basically, the lack of hard data on speed ratios can lead to mistrust on both sides. We think this is a problem that can be reasonably and easily solved with SLAB scores.
Can you tell us about the iOmegaT project and where it’s heading?
Aside from some refinements, I think we are where we want to be in terms of measuring the impact of full-sentence MT on translation speed. On the research side, what we want to do next is look at the impact of automatic speech recognition using Dragon Naturally Speaking and various forms of predictive typing and auto-complete on productivity. On the commercial side, one really exciting development is that OmegaT has now been integrated with Welocalize’s GlobalSight, which is also free and open-source. That means you don’t need special workflows for productivity testing, so the testing process is much cheaper. This means we can gather speed data for longer periods to look at more gradual effects, like the impact of MT and/or speech recognition technology on translation speed in terms of words-per-hour over weeks and months.
What’s next on the horizon for CNGL and iOmegaT?
Right now, the iOmegaT CAT tool and the utilities and analysis software that go with it require a good deal of technical ability to use. For that reason we currently only engage with one new client at a time. Our focus has been on well-known corporate or enterprise end-buyers of translation with complex integration requirements, such as with SDL TMS and SDL WorldServer. This has worked well so our next aim is to develop the system into a suite that is easier to use to widen the user-base to smaller LSPs and even translators, while listening to our small core of enterprise clients to improve the software for them too.
Where do you think MT is heading in the future?
One problem is that research MT systems are being evaluated on the basis of automated evaluation metrics like BLEU. However, we know that small improvements in these scores mean little in terms of a translator’s working speed. It is an elephant in the room. There is some hope. If desktop-based CAT tools like Trados, MemoQ and Wordfast can take a page out of our book and implement iOmegaT’s Segment Level A/B testing technique, I think at least some user activity data could be shunted into research.
Researchers could collaborate with existing MT providers, who are already closely linked to publicly funded MT research. This might facilitate a tighter development and testing loop between translators who are using MT and researchers developing better MT systems for different languages and content types.
Also, we don’t have to limit ourselves to full-sentence MT. The testing technique behind SLAB scores can work just as well for other technologies like predictive typing, interactive MT, full-sentence MT and automatic speech recognition. It is going to be interesting to see how these technologies interact with each other. I think speech and MT are particularly well suited to benefit from each other, and I would like to see more industrially focused research done on that topic. Dictation using Dragon can have health benefits for translators by reducing the risk of repetitive strain injury, so it is important to shine a light on it to justify more research, even if it doesn’t really bring down the cost of translation for end buyers in the near term.
What’s one piece of advice you would offer to a global brand looking to deploy an MT program?
Don’t always believe what MT providers say about productivity improvements. Figure out how to test the impact of MT on translators first using cheaper systems like Microsoft Translator Hub and then work out how to improve on that MT baseline. Whether you use in-house translators who are closely monitored, TAUS’s DQF tools, iOmegaT or other productivity testing tools like PET, the important thing is to be able to accurately measure that which you wish to improve so you can see the impact of small changes. In software engineering, we call this test-first development. iOmegaT and Segment Level A/B testing make that testing process cheaper so you can do it more, or, indeed, all the time.
About John Moran: Since the 1990s, John has worked variously as a lecturer in translation at Trinity College Dublin, as a translator in his own LSP and as a consultant software engineer for leading global companies like Cap Gemini, Siemens and Telefonica. He is currently writing a PhD in Computer Science on the topic of CAT tool instrumentation. You can find him on LinkedIn at https://www.linkedin.com/profile/view?id=18681141.
Find out more about weMT here.
For more information on iOmegaT, check out Dave Clarke’s blog, Welocalize, CNGL and iOmegaT: Measuring the Impact of MT on Translator Speed.
All product and company names are trademarks™ or registered® trademarks of their respective holders. Use of them does not imply any affiliation with or endorsement by them.