For months artificial intelligence was hailed as a salve for all that ails business, and travel companies were no exception in rushing to capitalize on its potential.
Now, even the CEO of OpenAI, the startup that released ChatGPT in November, has gone before Congress to warn of the need to regulate the technology being developed by his company and the likes of Google and Microsoft.
Will AI and the tools it powers help make connected trips as simple as ordering a pizza and strip the friction from travel? Or will they increase the spread of bad information, take jobs from humans – or, well, destroy humanity altogether?
As the director of data science for hospitality platform Placemakr, Jyotika Singh has a broad perspective on the good that AI and natural language processing tools can do. As a woman on the rise in a male-dominated field, she also understands how even unintended bias can be harmful to individuals and systems.
Singh — whose book, “Natural Language Processing in the Real World,” comes out July 3 — spoke with PhocusWire about the benefits and pitfalls the technology presents in a conversation that has been condensed and edited for clarity.
Setting aside the possibility that AI could destroy the world as we know it, you’ve spoken frequently about concerns that belong less to the realm of science fiction than, unfortunately, the world we live in. Tell us about the risks of bias in AI models.
The risks and concerns that come with AI are mostly divided into three different buckets. One of them is artificial general intelligence – AI taking over the world, the AI singularity and people being concerned with AI getting more powerful than human beings. The other is data privacy, how your data is used in actually building these models. The [third] one is bias, which is more related to the data or the model itself.
The bias in natural language processing or AI in general has two primary sources. One is the data used to build and train these models. The second would be assumptions used in developing models or who’s building them.
I’ll give you an example. In terms of data, as ChatGPT has become so popular, we are more aware of what these tools do. One thing to note is that these are built on a lot of data that’s taken from the internet: articles and social media and books and everything – a lot of different sources of data. Now there is bias in that data. On social media, one is free to say anything, right? There’s no filter, no bar. The issue there is that once something gets popular on social media, that is shared a lot more. It’s a little biased in what gets popular.
If that’s the data being used to train a model, the model will inevitably learn those patterns. When all of this data is fed to the model, the model is learning patterns from it, really ingesting it. That’s how ChatGPT knows about all these topics and is able to answer questions across a broad range of categories.
To illustrate the point, can you share an example of bias that sneaks into these language models?
There’s one that I see all the time. Go to Google Translate and just type in English, “She is a doctor. He is a nurse.” If you translate that into a language that’s more gender neutral, something like Hungarian or Punjabi, and then translate it back, it changes to “He’s a doctor. She’s a nurse.” That is a very popular example of this sort of bias, especially when it comes to gender bias. More concerning, there are examples of incorrect or unsafe information being exposed to people when it should not have been. The good thing is that when these reports come out, corrective actions are taken almost immediately.
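To make Singh’s test concrete, here is a minimal sketch of the round trip she describes. It assumes the open-source deep-translator package as one convenient wrapper around Google Translate; any translation client would work, and the exact output will vary as Google updates its models.

```python
# A minimal sketch of the round-trip translation test described above.
# Assumes the open-source deep-translator package (pip install deep-translator);
# any translation client would work the same way.
from deep_translator import GoogleTranslator

original = "She is a doctor. He is a nurse."

# Hungarian pronouns are gender neutral, so the English genders are lost
# on the way in and have to be guessed again on the way back out.
to_hungarian = GoogleTranslator(source="en", target="hu").translate(original)
back_to_english = GoogleTranslator(source="hu", target="en").translate(to_hungarian)

print(original)
print(back_to_english)  # Has often come back as: "He is a doctor. She is a nurse."
```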
After going to school in India, you got your master’s from UCLA in a male-dominated engineering field. Since then, you’ve been an advocate for promoting women in STEM. How did your experiences in school affect your awareness of issues such as systemic bias?
There is a skew of [there] being more males in [a STEM] class. I did observe that skew. [But] I’ve been lucky in that way in that I received a lot of support from my mentors, especially my father. He’s been my biggest cheerleader. He’s the CEO of a group of hospitals. I come from an educated family background. My grandmother did her master’s in arts at a time [when] it was very rare and pretty common [in India] for women to not even finish school. My grandfather was a lawyer. My mother had her bachelor’s as well. My family has always encouraged me to get an education.
That was not always the case in school. Most of my friends, their mothers were not working and were not necessarily that well educated either. I would say maybe 99% of my friends, their mothers were not working professionally. But while their mothers did not necessarily have that career choice, they all wanted their kids to study and work and have financial independence. It’s almost like a transformational generation where I come from, where everyone in my generation has had access to education and support from their families [that prior generations didn’t have].
A lot of the publicity with generative AI in travel has focused on how customers can use it to plan and book trips. While it might not get the same amount of attention, at Placemakr you’ve seen how effective it can be in analyzing and automating customer feedback.
That is a big part of how AI helps businesses in general. It’s a massive area: being able to understand your customers’ messages, maybe the questions they are asking, and to automate more of that so there’s less on actual humans. It not only reduces the load on human customer care representatives but also gives organizations the flexibility to allocate resources well.
This is not something new. It’s something we’re doing at Placemakr. We’re using multiple models, including large language models and other natural language processing techniques, to take customer feedback and understand what they’re really talking about. Do they like the arrival experience but not the technology or how the locks work? Maybe they had a maintenance issue that wasn’t resolved. We’re trying to capture that in an automated fashion.
When the scale of the comments you are getting is small, let’s say maybe 20 comments in a couple of weeks, you don’t need AI to help you with that. That’s something you can look at in a very reasonable amount of time, understand what’s going on and take corrective action. When that scales from 20 to, like, 500 or a thousand or even more than that, that’s when it’s extremely time-consuming for people to have to go through these user messages manually to understand the complaint areas and what action needs to happen. [At Placemakr] that so far has saved about 53% of the time it used to take, and we’re expecting a lot more savings with this.
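A minimal sketch of what this kind of automated feedback triage can look like, using Hugging Face’s open-source transformers library for zero-shot classification; the category labels and comments below are illustrative stand-ins, not Placemakr’s actual taxonomy or data.

```python
# A sketch of automated guest-feedback triage, not Placemakr's actual system.
# Uses Hugging Face's zero-shot classification pipeline; the categories below
# are hypothetical stand-ins for whatever taxonomy a business defines.
from transformers import pipeline

classifier = pipeline("zero-shot-classification",
                      model="facebook/bart-large-mnli")

CATEGORIES = [
    "arrival experience",
    "door locks and technology",
    "maintenance issue",
    "cleanliness",
]

comments = [
    "Check-in was a breeze, but the smart lock kept rejecting my code.",
    "Reported a leaky faucet on day one and nobody ever came to fix it.",
]

for comment in comments:
    result = classifier(comment, candidate_labels=CATEGORIES, multi_label=True)
    # Keep every category scoring above a threshold, since one comment
    # can touch several topics at once.
    topics = [label for label, score in zip(result["labels"], result["scores"])
              if score > 0.5]
    print(f"{comment!r} -> {topics}")
```

Zero-shot classification is convenient at this scale because the category list can be changed without retraining anything; in practice a team would tune the score threshold against a small hand-labeled sample.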
Getting back to the risks of bias in AI, what should businesses do to guard against this?
Number one is just awareness: Know this issue exists. It’s not that AI practitioners are building all the models from scratch. There are so many sources available today where you can take a pre-trained model and tune it on your data. You might be able to handle the biases in your own data if you’re really focused on building a bias-free system, but then there’s not much you can do about the underlying model you’re using, which is maybe open-sourced.
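One way to act on that awareness, sketched below, is to probe a pre-trained model’s masked-word predictions for skew before building on it. This is an illustrative technique of the interviewer’s choosing rather than one Singh prescribes, and the two test sentences are stand-ins for a real bias test suite.

```python
# A sketch of probing a pre-trained, open-source model for gender skew
# before building on it. Uses BERT's masked-word predictions; the two
# sentences are illustrative test cases, not a full bias test suite.
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")

for sentence in ("[MASK] is a doctor.", "[MASK] is a nurse."):
    predictions = fill(sentence, top_k=5)
    print(sentence, "->", [p["token_str"] for p in predictions])

# If "he" consistently outranks "she" for one profession and the reverse
# holds for the other, that skew ships with the pre-trained model itself,
# independent of any data you tune it on.
```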
What else can be done to guard against bias in AI?
When we think about AI bias, why it’s happening and what needs to be done, one of the main resolution factors is the diversity of the teams who are building these models. Everybody has different experiences and exposures, and so we are all carrying forward some bias in some ways. You need not only well-represented data sets but diverse AI teams that are actually building these models, that are looking at this data and determining how to test these models or what test cases should even exist. You need that representation in the people building these products as well.
Even after a company rolls out its AI model, the job’s not done, right?
That’s right. Over time, data changes as well. As you are updating models, does that bring in any new biases? How does that impact the outcome? There’s one common approach when you’re building something and you’re not sure of the level of bias you have: you’ve done your testing, so now what? A lot of companies open up their product to a limited set of individuals and collect feedback. With that feedback, some of the issues can be identified and corrected right away.