‘Jailbreaks’ Threaten Low-Resource Languages

Winter 2025

The large language models (LLMs) that power many popular text-based artificial intelligence applications are vulnerable to jailbreaking attacks, in which a user enters a malicious prompt designed to bypass an application’s guardrails and trick it into generating inappropriate or harmful content.

New research by Johns Hopkins computer scientists has found that low-resource languages, such as Armenian and Māori, are more vulnerable to these attacks because only limited text data in those languages is available for AI model training.

The study, published in the proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics, highlights a significant issue with serious implications for multilingual applications, the researchers say.

“The discrepancies stem from the initial training stage, when the LLMs are exposed to only a small amount of data in these languages with limited resources,” says lead author Lingfeng Shen, MS ’24, now a research scientist at ByteDance. “This means the root issue is that there simply isn’t enough data available for less widely used languages during the model’s first training process.”

“Ensuring that LLMs can safely interact with users in various languages, including those with fewer resources, is critical for inclusivity and global applicability,” says team member Daniel Khashabi, an assistant professor of computer science and a member of the Center for Language and Speech Processing. “If these systems are not safe and reliable across all languages, it could lead to misinformation, harmful content dissemination, and overall decreased trust in AI technologies.”

The researchers encourage those training the next iterations of popular LLMs to include more data from low-resource languages such as Mongolian, Urdu, and Hausa. They also suggest developing new approaches for handling languages with limited training data.

“Our research advocates for more equitable AI development that considers the linguistic diversity of all users,” says Khashabi.
