A Gendered Approach to AI Translational Datasets: Examining Google Translate and ChatGPT's Arabic-English Processes
Abstract
The research presented in this paper rests on principles of dismantling bias, intersectionality, accountability, and the inclusive, responsible, and transparent use of power in social systems and algorithms. The paper examines how to disentangle the English-Arabic translational biases embedded within Google Translate, biases that have been adopted and emulated by many LLMs and AI systems, including ChatGPT and Bard. For example, when prompted to translate the gender-neutral English term “nurse” into Arabic, the result comes back in the feminine form (“ممرضة”) rather than the masculine (“ممرض”). Such translations embed a patriarchal understanding of the word in Arabic. The research builds on collaboration with a network of digital rights advocates, artists, journalists, civil society members, and academics who together define the challenges of analyzing bias between Arabic and English within Google Translate and, by extension, AI systems. In this paper, I will discuss different methods and analyses for testing Google Translate’s word embeddings and lexical biases. I will begin by formulating a feminist dataset of terms in English and Arabic, which I will use to compare the translational results of Google Translate and ChatGPT, the latter having been touted as a translation tool in addition to its generative AI capabilities.
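As a rough illustration of how the comparison described above might be tabulated, the following Python sketch labels each Arabic output as feminine or masculine and lists the terms on which two systems disagree. It is not the paper's method. The function names are hypothetical, and the gender check assumes that a final ta marbuta (ة) marks a feminine form, a heuristic that holds for many profession nouns but has exceptions. Apart from “ممرضة” for “nurse”, which the abstract reports, the sample entries are placeholders rather than recorded outputs of any system.

```python
import unicodedata
from collections import Counter

TA_MARBUTA = "\u0629"  # the Arabic letter ة


def guess_gender(arabic_word):
    """Heuristic (an assumption): a final ta marbuta marks a feminine form."""
    # Drop combining marks (short-vowel diacritics) so they do not hide the final letter.
    bare = "".join(ch for ch in arabic_word.strip() if not unicodedata.combining(ch))
    return "feminine" if bare.endswith(TA_MARBUTA) else "masculine"


def tally(translations):
    """Count guessed genders over a mapping of English term -> Arabic output."""
    return Counter(guess_gender(arabic) for arabic in translations.values())


def disagreements(first, second):
    """Return, sorted, the English terms whose guessed gender differs between two systems."""
    shared = first.keys() & second.keys()
    return sorted(t for t in shared if guess_gender(first[t]) != guess_gender(second[t]))


# Placeholder data for illustration only; not observed results.
sample_a = {"nurse": "ممرضة", "doctor": "طبيب", "engineer": "مهندس"}
sample_b = {"nurse": "ممرض", "doctor": "طبيب", "engineer": "مهندس"}

print(tally(sample_a))
print(disagreements(sample_a, sample_b))
```

In practice, manual annotation by Arabic speakers would be more reliable than a suffix check. The sketch only shows how per-term outputs from two tools could be lined up and counted.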
Discipline
Communications
Geographic Area
Arab States
Sub Area
None