AI’s fluency in other languages hides a Western worldview that can mislead users − a scholar of Indonesian society explains
A friend in Indonesia recently told me about a conversation he had with ChatGPT. He had typed a question in Indonesian – Bahasa Indonesia – about how to handle a difficult family dispute. The chatbot responded fluently, in perfect Indonesian, with advice about communication strategies and conflict resolution. The grammar was flawless. The tone was appropriate. And yet something felt off.
What the AI offered was advice rooted in American cultural assumptions: prioritize your own preferences, communicate directly, and if family members don’t respect your boundaries, consider cutting them off.
The response was in Indonesian but shaped by values that centered individual autonomy over the consensus-building, social harmony and collective family dynamics that tend to matter more in Indonesian social life.
My friend was skeptical enough to notice the mismatch and mention it to me. Many users might not. That is what prompted my research, published in the International Review of Modern Sociology, into a pattern I found across major AI systems: Even when they were fluent in several languages, the language models retained their Western worldview. I call this “epistemological persistence.”
I have studied Indonesian society, media and culture for more than 30 years. That gives me a particular vantage point on a problem that reaches well beyond Indonesia: large language models – LLMs – like ChatGPT, Claude and Gemini can now speak dozens of languages with remarkable fluency. That fluency creates the impression that AI understands local cultures.
Producing grammatically correct Indonesian, Arabic, Swahili or Hindi, however, does not change the underlying worldview through which these systems reason. It does not alter how they think about people, relationships, responsibility or what counts as a good outcome.
Those assumptions are shaped by training data drawn predominantly from English-language sources based in the United States. Meta’s open-weight model LLaMA 2 was trained on approximately 89.7% English-language text; LLaMA 3 includes only about 5% non-English data. Major commercial models don’t publish equivalent breakdowns but draw heavily on the same sources. Arabic, the fifth-most-spoken language globally, accounts for under 1% of content in large training datasets. Languages with tens of millions of speakers, including Bengali and Hausa, barely appear.
Beneath the surface of these multilingual conversations, English functions as a hidden intermediary. A study by researchers at the University of Oxford found that LLMs routinely conduct their core reasoning in English, even when prompted in other languages. They translate the output at the final stage. A user receives flawless text in their preferred language, but the underlying logic originates elsewhere.
To examine how this plays out in practice, I ran experiments with ChatGPT, Claude and Gemini. I asked questions in both English and Indonesian about concepts such as education, responsibility, well-being and several Indonesian terms that resist direct translation into English. These included terms such as “gotong royong,” which describes a tradition of communal mutual assistance.
When I asked about education, for example, using the Indonesian word "pendidikan," the answers were consistently centered on individual development, personal autonomy, critical thinking and preparation for the labor market.
What largely disappeared were the dimensions of pendidikan that Indonesian educational traditions have historically emphasized. In Indonesia, education has long focused on ethical discipline. Scholars of Indonesian education such as Christopher Bjork and Robert Hefner have documented how distinct these traditions are from models that treat education primarily as a path to individual advancement and career preparation, which is the lens through which the AI tools viewed it.
The Indonesian concept of “malu” offers a starker example. Often translated as “shame” or “embarrassment,” malu has been analyzed by anthropologists Clifford Geertz and Tom Boellstorff as something closer to a shared social awareness.
A person might feel malu when speaking out of turn in front of elders, or when a family member’s behavior reflects poorly on the household. It regulates conduct and signals awareness of one’s position within a web of relationships. It is cultivated, not merely felt. It is a form of relational awareness rather than a private psychological event.
When asked directly to define malu, the models acknowledged its social dimensions. In scenario-based questions that simply used the word without asking for a definition, however, all three fell back on the English translation of shame, consistently framing it as an individual emotional experience.
One representative response framed malu as a normal emotional reaction to be managed through self-reflection and confidence-building – a personal psychological problem rather than a social one. The relational dimensions of the concept disappeared entirely, replaced by the language of individual emotional regulation.
A distinctly American worldview travels inside the translation, largely unannounced.
Why build AI this way? Developing separate, culturally grounded models for every market would be expensive. Translation is far cheaper: Train one model on the vast English-language web, then use multilingual output capabilities to serve global markets. As media scholar Safiya Umoja Noble argues about algorithmic systems more broadly, what looks like a technical outcome is actually a structural one, shaped by who has the wealth and infrastructure to build these systems.
The embedded worldview isn’t a mistake; it’s what happens when knowledge production is profit-seeking.
The main exceptions are Chinese models such as DeepSeek and Alibaba’s Qwen. They represent a genuine alternative to the U.S.-dominated pipeline, though research shows they operate through a distinctly Chinese cultural lens. Asked about a workplace disagreement, for instance, they tend to advise silence or indirect phrasing to preserve harmony rather than the direct, private correction that Western models recommend.
Other regional efforts, such as SEA-LION for Southeast Asia and Kan-LLaMA for the Indian language Kannada, use U.S. models as their foundation. They add additional vocabulary and cultural information related to local languages. But the core logic remains tied to the original U.S. training.
One might reasonably ask whether this is simply a limitation users can work around. Decades of media scholarship demonstrate how audiences interpret foreign media through their own cultural frameworks.
For example, anthropologist Brian Larkin documented how Muslim viewers in Kano, in northern Nigeria, reinterpreted Bollywood films through an Islamic moral lens, reading their narratives as reinforcing local values of propriety and ethical conduct. That dynamic, however, depends on encountering media as something with a visible origin: You need to know where your media is coming from.
Conversational AI is different. Research at Harvard Business School finds that people increasingly use AI systems for emotional support, advice and companionship. When a culturally specific worldview is delivered through a relationship that feels attentive and empathetic, in your own language, it arrives less as a claim to be evaluated and more as a shared premise within a dialogue. It becomes difficult to notice, and harder to contest.
The concern is that these perspectives become the new normal. Certain ways of reasoning about family life, education and responsibility may come to feel natural and self-evident. Linguistic diversity among AI systems is real and growing. Cultural worldview diversity, however, has not kept pace.
This article is republished from The Conversation, a nonprofit, independent news organization bringing you facts and trustworthy analysis to help you make sense of our complex world. It was written by: Gareth Barkin, University of Puget Sound
Gareth Barkin does not work for, consult, own shares in or receive funding from any company or organization that would benefit from this article, and has disclosed no relevant affiliations beyond their academic appointment.