When Can We Trust Generative AI? Podcast Episode 7 with Chris Grebisz

In the seventh episode of the Welocalize Presents podcast, host Louise Law is joined by Chris Grebisz, Chief Innovation Officer at Welocalize. Chris plays a critical role in Welocalize’s innovation strategy to ensure growth and scale for clients and to guide them through disruptive technologies such as generative AI and large language models (LLMs).

In the podcast, they explore the current state of accuracy and integrity in generative content and contemplate the future as generative AI tools look set to become as ubiquitous as the internet is now.  

About the Welocalize Podcast 

The Welocalize podcast is dedicated to exploring the world of multilingual communication, content creation, and cutting-edge technologies that enable brands to reach global audiences.

Each episode features guests who share their expertise and stories on language, localization, technology, and translation. Through these engaging conversations, listeners gain valuable insights to enhance customer experience and navigate the landscape of global communication.   

Transcript

Louise Law  

Hello everyone, and welcome to the Welocalize podcast, where we talk about all things related to multilingual communication, content creation, and the cutting-edge technologies that enable brands to reach global audiences in multiple languages. Today I’m joined by Chris Grebisz, who is Welocalize’s Chief Innovation Officer, and we’ll be exploring how much people can start to trust generative AI and content creation, whatever the language, and just have a chat about the current state of accuracy and integrity in generative content. Hi, Chris.  

Chris Grebisz  

Hey Louise. Thanks for having me.  

Louise Law  

Pleasure. Chris, you’ve been working with innovation strategy at Welocalize for well over a decade, and now we’re witnessing a new wave of disruptive technologies led by generative AI and large language models, or LLMs, as we commonly refer to them. And it’s fair to say most people have probably played about with generative AI. I know that I checked some of this podcast script using ChatGPT Plus. I’m certainly starting to use it in my work. And you’ve worked with innovation for ages, so you must be pretty excited at the moment about all the activity. It’s a bit like the 1990s all over again, when we started to see the rise of the Internet. So how are you feeling at the moment with the whole generative AI wave?

Chris Grebisz  

Yeah, this is more than a wave, I think. This is like a tsunami, and anybody who’s in content services is in the middle of the tsunami without understanding, you know, how big of a flotation device do we need? It will be interesting to see where this goes. And one thing that I always remind myself and the people that I’m speaking to is that it’s very new. This technology became publicly accessible last fall/winter, and there are lots of conclusions kind of being drummed up around it, but it’s still very much in its infancy, even with how exciting the features are today, and so there’s a lot of speculation as to where this might go. And yeah, absolutely, I remember receiving my first e-mail in the early 90s, and that was pretty cool. But I had no idea. Nobody knew what the Internet was going to be at that point in time.

Louise Law  

Now we can’t live without it. For the listeners out there, can you give your take on a basic understanding of the topic? What is generative AI to you?

Chris Grebisz  

So large language models are a form of artificial intelligence. We’ve all used chatbots in the past, right? And they’ve been somewhat limited, useful but limited. We’ve used Siri and Alexa and all those types of things, and this is another form of that. What’s unique about large language models as we know them now, whether it’s OpenAI or Bard, is the size of the model itself. I mean, in building and developing these models, they basically consume all digitally available data, train on that data, and make it accessible for a variety of use cases. That’s what’s been most shocking: just the size. And as a result of the size of those models, they create all sorts of capabilities with natural language. And that’s unique.

Louise Law  

So, the big question, and the reason we’re here to talk about the topic, is it’s centered around trust and whether individuals and brands can trust the integrity and the accuracy of generative output in any language. And you said before we’re dealing with a tsunami, and I tend to agree with that. I know anything we talk about today could be out of date, and it will be out of date within a couple of months because things are moving so quickly, but where are we today, Chris, in terms of the accuracy of the content? Can people trust generative AI output, and if not, when can we possibly expect to start being able to really, really trust the output in terms of content creation?  

Chris Grebisz  

In my experience and the experience in our own research at Welocalize, it is often trustworthy but unpredictably not trustworthy at certain times. Therefore, that unpredictability makes it not necessarily suitable for many enterprise environments out there.  

Louise Law  

Right.  

Chris Grebisz  

Trust has to be calibrated. Do we envision ever using this technology without humans in the loop? I think that’s a destination that’s pretty far away. We will continue to use humans, but the question will be how involved humans are in terms of evaluating the content that’s being produced by a language model for whatever use case there may be. We have started exploring things. For example, we have a plug-in for OpenAI, and it’s really just intended to be a tool that a human could use, just like a large language model, to inform, to give a signal as to what the quality of an output may be, and then that allows the human to work more efficiently. The companies out there that are producing productivity tools, whether it’s Microsoft or Google, will be injecting generative features into their productivity suites. The use case is to help the human be more productive. Microsoft’s release video mentioned many, many times removing the drudgery of work so we can become more efficient using tools like this. We will become even more efficient the higher the quality of the data that’s produced by the language model, but for all intents and purposes, there’s going to be a human monitoring that data and that result.

Louise Law  

And the tool you refer to, Microsoft Copilot, one of the everyday tools that we have on our laptops, will just be embedded. It’ll just be as natural as having a spellcheck, presumably, and that will accelerate. As you know, we’ve had chats before on this, Chris. That will kind of be the start of the next really big thing, where it just becomes part of our everyday life.

Chris Grebisz  

Yeah, within three or four months of the release of OpenAI, almost all major content authoring systems came out with announcements of how they would be integrating some sort of generative feature set into their environment, whether it’s Adobe, Salesforce, Canva, Microsoft, or Google. That means, and this is, you know, that tsunami and the adoption curve on this, that most people will have access to these tools in the next 12 to 18 months, so they’ll start using these tools, and it’ll become part of the everyday way that we do work.

Louise Law  

Yeah, basically turning content creation workflows on their head really because it’ll just be a different model, right?  

Chris Grebisz  

Yeah, many people suggest that we will become less like content creators and more content editors.  

Louise Law  

Exactly.  

Chris Grebisz  

So, we’re using a tool where the human is in the loop, and the notion of trust is a very valid topic that I think will be a speed bump to some extent in terms of adoption in enterprises. But it’ll be the efficiency of the human and the different types of tools besides the content. The models get better, and the tools get better. You and I will just become far more productive.  

Louise Law  

Yeah, absolutely.   

Let’s talk a little bit about machine translation. Obviously, that, especially neural machine translation, is used in a lot of our clients’ programs and all over the language industry. How would you say that the output from generative AI performs against NMT? And when I say NMT, I mean neural machine translation.

Chris Grebisz  

Yeah, I get asked that a lot, and there’s been research on this topic, and I don’t know if that’s the right question. I think it might be kind of like comparing a red apple to a green apple.  

Louise Law

OK.

Chris Grebisz  

What’s happening is there’s a machine that is generating translation, right? Machine translation means something very specific, but we can either use NMT technology or large language model technology. The outcome is machine-generated translation or machine-generated content. The way that they do it is very different and, depending on what your input is, could be different as well. On a large language model, you could put in a prompt and ask it to produce some content, and out comes content, and then you can say, please do that in Spanish. In machine translation, you have a source, and it’s typically processing that source against a trained engine. The tech itself is somewhat different, and it’s widely written that NMT engines are perceived to be more accurate, whereas a language model will have a much higher level of fluency. And, you know, the quality is the thing: we hear about hallucinations because it’s emphasizing fluency and quality. If it can’t figure out exactly what a specific detail is, it’ll just make something up that sounds good, and so it’ll read really well, but it could be completely wrong. Now, there’s a little bit of fear, uncertainty, and doubt, but as we get more sophisticated with the way that we fine-tune our own models, how we design the inputs, and the sequence of inputs, the hallucinations will become less and less frequent. I mean, that’s not an unsolvable problem over time. But I also think in the future, there are going to be use cases for both: for certain content types, LLMs will have adoption sooner rather than later, and for other content types, because of the importance of accuracy in comparing a source to a target, NMT will be a better technology stack.

Louise Law  

Yeah, I suppose for something like regulated content where accuracy is really important, like maybe in the legal industry, custom NMT is going to perform better than something on an LLM that’s better for a more generic marketplace.  

Chris Grebisz  

Yeah, there are regulations out there that require certain workflows and certain human validation steps, so that will continue to be the case where it may not be a suitable technology model.  

Louise Law  

Yeah, absolutely. So, what do you think the necessary steps are where we get to a point where people can really, truly rely on generative AI and LLMs for content creation and translation?  

Chris Grebisz  

I advise customers to think about their content ecosystem from the context of content types. Every company has different repositories and tech stacks for different content. You know, you have customer support content, you have technical documentation, marketing, and internal communications, and some of that content is less risky. Knowledge base articles are less risky and something that you could do. Maybe in the past, because of the expense of translating it, we haven’t translated it. But maybe with a language model, particularly a language model that has some fine-tuning to it, it might be OK to start testing that content in the languages that are supported. We’re seeing a lot of writing about marketing and blog posts and things like that. I think that’s going to be it, particularly if you are still going to have a human look at the content.

Louise Law  

Absolutely, yeah.  

Chris Grebisz  

But if you are writing a blog post, you could use a tool to generate something that sounds pretty good. It’s still a little stiff, it still reads a little computer-generated, but I can go in now and post-edit it and make it sound pretty good.

Louise Law  

We still need human expertise, but it’s just the way in which it’s applied is going to evolve, isn’t it?  

Chris Grebisz  

That’s the biggest thing. These are designed to be productivity tools, and they’ll be productivity tools for original source creation, and they’ll also be a productivity tool for a translator. Those are the types of use cases we should be looking at. How can we inject this technology in different places? And because of the power of the technology, it expands the possible opportunities.  

Louise Law  

Yeah, absolutely. Do you have any advice to pass on to any of the listeners out there who are dealing with global content and/or translation programs? Any one piece of advice?

Chris Grebisz  

The most important thing is to start using the technology, using it for personal purposes. I use it. It’s really good at summarizing. If there are long articles, I’ll have it summarize them in bullet points. It is kind of funny: I’ve listened to people talk about how this is going to exponentially increase the volume of content that we have, and I think it’s a bit ironic. We already have too much content, in my opinion. The LLMs are going to exponentially increase the volume of content, but now I’m going to be relying on an LLM to summarize all that content for me, because you just can’t process it all. I think the biggest thing for folks, and I tell customers this, is to identify little experiments, little use cases where you can test either content creation or content translation, and then just start becoming creative: How could I use this? Where is it suitable inside of my organization or my function? We know that the technology is going to change very quickly. In three months, it will be better, and then in six months, it’s going to be even better. And the more comfortable we are, and the more we start thinking about how we could use it and apply it to our business today, then as it does become enterprise-ready, or gets to the level where we’re comfortable with it, we’ll already be mature in our mental models of how to use it.

Louise Law  

Yeah, absolutely. I mean, to your point, it is great at summarizing large pieces of content, and even when it hallucinates, which I think is a great way of describing sometimes what happens with ChatGPT, you know it is definitely going towards being a productivity tool. It’s certainly not replacing humans.   

Chris Grebisz  

Absolutely, and I guess that we’ll see it on our desktops and most of our tool sets in the next 12 to 18 months, as those companies are building out safe integrations with whatever foundational model they’re using.  

Louise Law  

Yeah, absolutely. Lots of things to look forward to.  

Chris, thanks so much for joining us on the podcast. It’s great to hear your take on things, and I’d like to say good luck with the rest of this tsunami that you’re dealing with and working with the teams on the generative AI tsunami. Thanks very much for joining the podcast, Chris.  

Chris Grebisz  

All right. Thanks. 
