What if I ask in alia lingua? Measuring Functional Similarity Across Languages

Ponnurangam Kumaraguru

Abstract

How similar are model outputs across languages? In this work, we study this question using a recently proposed model similarity metric, κ_p, applied to 20 languages and 47 subjects in GlobalMMLU. Our analysis reveals that a model's responses become increasingly consistent across languages as its size and capability grow. Interestingly, models exhibit greater cross-lingual consistency within themselves than agreement with other models prompted in the same language. These results highlight not only the value of κ_p as a practical tool for evaluating multilingual reliability, but also its potential to guide the development of more consistent multilingual systems.
