Interpretable Culture-aware Embedding Space


  • Culturally diverse natural language processing (NLP) offers advantages over the prevailing monolithic approach, which can introduce challenges related to cultural representation and diversity. Its benefits include the integration of a wide array of cultural perspectives and precise interpretation across various dialects and social groups. Beyond diversity, interpretability lets us gain deeper insight into how black-box models encode critical semantic information within the embedding space.
  • This endeavor is rooted in the belief that current AI models prioritize one cultural norm or perspective over others, and that existing attempts to “debias” models merely adjust the dominant norm rather than respond to endemic cultural diversity. We therefore aim to design an interpretable, culture-aware embedding space.
  • Specifically, we first identify a critical semantic space indexed by a corpus generated by a large language model. This semantic space is then transformed into an interpretable space whose first dimension correlates with the concept of interest. The interpretable space is subsequently fine-tuned toward different cultures using a corpus solicited through our user study. In contrast to prior studies that primarily focus on word-level or phrase-level associations, we propose incorporating an attention layer to capture the relevant tokens associated with each concept. We also tackle domain shifts in which concept-indicative words overlap.
  • Our study advocates exploring the feasibility of AI models that can adapt to, or be specifically tailored to align with, various cultural contexts while remaining highly interpretable. We believe that a comprehensive embrace of endemic cultural diversity can pave the way for AI systems that are more culturally aware, credible, and representative.
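The core idea of an interpretable space whose first dimension tracks a concept, combined with attention-style weighting over concept-relevant tokens, can be illustrated with a minimal sketch. Everything below is a hypothetical illustration, not the actual method: the functions `concept_aligned_basis` and `attention_pool`, and the way the concept direction is estimated from mean embedding differences, are assumptions chosen for clarity.

```python
import numpy as np

def concept_aligned_basis(embeddings, concept_ids, other_ids):
    """Return an orthonormal basis whose first vector points along the
    difference between mean concept and mean non-concept embeddings,
    so the first coordinate of the rotated space tracks the concept."""
    direction = embeddings[concept_ids].mean(0) - embeddings[other_ids].mean(0)
    direction /= np.linalg.norm(direction)
    # Complete the basis via QR decomposition of a matrix whose
    # first column is the concept direction.
    d = embeddings.shape[1]
    mat = np.random.default_rng(0).normal(size=(d, d))
    mat[:, 0] = direction
    q, _ = np.linalg.qr(mat)
    # QR may flip signs; keep the first basis vector aligned with the direction.
    if q[:, 0] @ direction < 0:
        q[:, 0] = -q[:, 0]
    return q

def attention_pool(token_vecs, concept_dir, temperature=1.0):
    """Weight tokens by softmax similarity to the concept direction,
    emphasizing concept-relevant tokens in the pooled representation."""
    scores = token_vecs @ concept_dir / temperature
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ token_vecs
```

In this sketch, projecting embeddings through the returned basis (`embeddings @ q`) yields coordinates whose first dimension separates concept-bearing items from the rest; the softmax pooling stands in for the proposed attention layer at the token level.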