Improving Machine Translation with Context-Aware Entity-Only Pre-translations Using GPT-4o

Status: Completed | Funding: Howard University

Our work presents a novel three-step GPT pipeline designed to improve the translation of named entities. The pipeline extracts entities from the source text, refines their translations with GPT using Wikidata context, and integrates these pre-translated names into the final translation. This approach addresses common mistranslations, such as confusing the name “Kwame Nkrumah” with the word “forest”, and produces more accurate, culturally sensitive translations. Evaluations across multiple languages show consistent performance gains over baseline models, particularly for languages with complex scripts or low resource availability.
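The three steps described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual implementation: the function names, prompt wording, and the `llm` and `lookup` callables are all assumptions introduced here. The model is treated as a plain "prompt in, text out" function and the Wikidata lookup as a "name in, description out" function, so any real client could be plugged in.

```python
from typing import Callable, Dict, List

# Hypothetical interfaces (assumptions, not from the project):
# an LLM is "prompt in, text out"; a lookup is "entity name in, description out".
LLM = Callable[[str], str]
Lookup = Callable[[str], str]


def extract_entities(text: str, llm: LLM) -> List[str]:
    """Step 1: ask the model for the named entities, one per line."""
    reply = llm(f"List the named entities in this text, one per line:\n{text}")
    # Drop blank lines and surrounding whitespace from the reply.
    return [line.strip() for line in reply.splitlines() if line.strip()]


def pretranslate_entities(
    entities: List[str], target_lang: str, llm: LLM, lookup: Lookup
) -> Dict[str, str]:
    """Step 2: translate each entity on its own, with knowledge-base context."""
    table: Dict[str, str] = {}
    for name in entities:
        context = lookup(name)  # e.g. a Wikidata description (assumed)
        prompt = (
            f"Translate the name '{name}' into {target_lang}. "
            f"Context: {context}\nReturn only the translated name."
        )
        table[name] = llm(prompt).strip()
    return table


def translate_with_entities(
    text: str, target_lang: str, table: Dict[str, str], llm: LLM
) -> str:
    """Step 3: translate the full text, supplying the entity pre-translations."""
    glossary = "\n".join(f"{src} -> {tgt}" for src, tgt in table.items())
    prompt = (
        f"Translate into {target_lang}, using these entity translations:\n"
        f"{glossary}\n\nText:\n{text}"
    )
    return llm(prompt).strip()


def run_pipeline(text: str, target_lang: str, llm: LLM, lookup: Lookup) -> str:
    """Chain the three steps: extract, pre-translate, then translate."""
    entities = extract_entities(text, llm)
    table = pretranslate_entities(entities, target_lang, llm, lookup)
    return translate_with_entities(text, target_lang, table, llm)
```

Passing the model and the lookup in as callables keeps the sketch independent of any particular API client and makes each step easy to test with stand-in functions.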

Contributors

Dr Saurav Aryal

Jabez Agyemang-Prempeh

Resources

Improving MT with Context-Aware (1).pdf

Progress Timeline

2024-2025
