AI chatbots struggle to function beyond English: ‘They know a lot … but they miss the culture’

By
Cecilia Hult
July 25, 2025, 10:52 AM ET
Kalika Bali, senior principal researcher at Microsoft Research India, speaking at the Fortune Brainstorm AI Singapore conference on July 23. Graham Uden for Fortune

The world’s leading AI chatbots can now generate everything from emails to research papers—in English. But shift to a different language, and AI’s performance begins to slip.

Most large language models are “a bit like a Fulbright scholar who is interested in Asia as their area of study,” said Kalika Bali, a senior principal researcher at Microsoft Research India, at the Fortune Brainstorm AI Singapore conference on Wednesday. “They know a lot about the [subject], but they miss the culture. It’s an outsider’s gaze into the culture of a country.”

Bali pointed to a classic math question, “John and Mary have a key lime pie which they need to divide into five parts,” to show the trouble with using a culturally clueless AI.

Generic AI models will translate the prompt directly. But as Bali pointed out, “in a country like India, most people don’t know what a pie is, [let alone] a key lime pie.” 

To develop models that better understand local culture, more data is needed in local languages. But getting that data is not always simple. 

Roughly half of all web content is in English, meaning there’s no shortage of high-quality digital resources for LLMs to learn English from. For other languages that do not enjoy this same abundance, developers have to explore different methods of getting training data. 

Kasima Tharnpipitchai, head of AI strategy at SCB 10X, highlighted the foundational work by native speakers needed to build a training dataset. 

Tharnpipitchai led SCB 10X’s project to launch the Thai LLM Typhoon. To build a dataset in Thai, Tharnpipitchai said, native speakers had to sift through large open datasets by hand, determining which Thai data sources were high-quality and which were not.

“There are no tricks here, you really have to do the work,” he said. “It really is just effort. It’s almost brute force.” 

SCB 10X launched Typhoon a year and a half ago. Tharnpipitchai said Typhoon was able to outperform GPT-3.5 in Thai, a fact that “says more about how poorly GPT-3.5 was performing in Thai” than about their own work.

Yet scraping non-English web data is beginning to raise legal concerns.  

Khalil Nooh, cofounder and CEO of Malaysian startup Mesolitica, which is developing a Malay LLM, said that data owners have asked the company to remove their sources from its training dataset, which is available online because the model is open-source.

This has further shrunk the already small pool of high-quality Malay data available to the company. “The challenge for us is to work with private dataset owners,” Nooh said.

Both Nooh and Bali are exploring synthetic data generation to help create more high-quality data in their target languages. Machines can translate the abundant English content online into other languages to supplement their limited datasets. This is especially useful for LLMs trying to work in regional dialects that have almost no digital presence otherwise. 

“How we are able to capture all the 16 dialects in Malaysia is through synthetic [data],” said Nooh. 

But there are some obstacles to getting data that neither “brute force” nor machine generation can overcome. In many communities, researchers must balance getting a full picture with managing cultural sensitivities when collecting data in local languages. 

While “on the whole, India is very tech positive,” Bali noted, “there are things that you would not ask” when doing on-the-ground data collection. Local communities may not want to share information on certain topics, even if it is widely known among people in the region. 

Nooh added that in Malaysia, the three Rs—“race, religion, and royalty”—are all subjects of regional sensitivity. 

Although there are currently no regulations on what LLMs can “say” in Malaysia, Nooh said that Mesolitica has “gone ahead to prepare the components that are needed if ever that is required to be implemented.” 

To tackle cultural sensitivities in Thailand, Tharnpipitchai similarly explained that SCB 10X released a “safety model” for public sector use, in addition to their regular Typhoon model. 

About the Author
By Cecilia Hult

Cecilia Hult is an editorial intern based in Hong Kong.

