Machine translation (MT) has advanced significantly thanks to developments in deep learning and neural networks, yet literary texts remain a major challenge because of their complexity, figurative language, and cultural nuance. For this reason, literary translation is often called the “last frontier of machine translation.”
Large language models (LLMs) have revolutionized AI. These models are pre-trained on extensive text data to predict the next word in a sequence, and are then fine-tuned on instructions, which helps them adapt their existing language knowledge to specific tasks. Multi-agent systems take this further: multiple intelligent agents perceive their environment, make informed decisions, and take appropriate actions. Machine translation itself has seen recent advances in general-purpose MT, low-resource MT, multilingual MT, and non-autoregressive MT.
A group of researchers from Monash University, the University of Macau, and Tencent AI Lab has developed TRANSAGENTS, a multi-agent system designed for translating literary works. Although the system scores poorly on d-BLEU, human evaluators and an LLM evaluator preferred its output over both human-written translations and GPT-4 translations. TRANSAGENTS generates translations that are detailed and varied, and it is 80 times less expensive than employing a professional human translator.
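d-BLEU (document-level BLEU) is computed by concatenating a document's sentences before matching n-grams, so matches can cross sentence boundaries. A minimal stdlib sketch of the idea (not the paper's exact scorer; real evaluations typically use a library such as sacreBLEU) might look like:

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-grams of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def d_bleu(hyp_sents, ref_sents, max_n=4):
    """Document-level BLEU: sentences are concatenated first, so n-gram
    matches may cross the original sentence boundaries."""
    hyp = " ".join(hyp_sents).split()
    ref = " ".join(ref_sents).split()
    log_precisions = []
    for n in range(1, max_n + 1):
        hyp_counts = Counter(ngrams(hyp, n))
        ref_counts = Counter(ngrams(ref, n))
        overlap = sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
        total = max(sum(hyp_counts.values()), 1)
        # Tiny floor avoids log(0) for very short documents.
        log_precisions.append(math.log(max(overlap, 1e-9) / total))
    # Brevity penalty punishes hypotheses shorter than the reference.
    bp = 1.0 if len(hyp) >= len(ref) else math.exp(1 - len(ref) / max(len(hyp), 1))
    return bp * math.exp(sum(log_precisions) / max_n)
```

Because d-BLEU rewards n-gram overlap with a single reference, a fluent but free literary translation can score low while still reading better to humans, which is exactly the gap the paper's preference-based evaluations are meant to expose.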
The researchers introduced two evaluation strategies: Monolingual Human Preference (MHP) and Bilingual LLM Preference (BLP). MHP prioritizes the translation’s impact on the target audience, emphasizing cultural suitability and fluency. BLP, in contrast, compares translations directly against the original texts using advanced LLMs. Despite the system’s strengths, the researchers identified limitations, notably content omission, in LLM-based translation systems such as GPT-4 and TRANSAGENTS.
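Both MHP and BLP are pairwise preference protocols: a judge (a human reader for MHP, an LLM for BLP) sees two anonymized translations of the same segment, picks one, and win rates are tallied. A hypothetical sketch of such a protocol (the names and details are illustrative, not the paper's implementation):

```python
import random
from collections import Counter

def preference_eval(pairs, judge):
    """pairs: list of (translation_a, translation_b) for the same segment.
    judge: callable taking two texts and returning 'A' (first shown) or
    'B' (second shown). Presentation order is shuffled per pair to control
    for position bias; votes are mapped back to the true systems."""
    tally = Counter()
    for a, b in pairs:
        if random.random() < 0.5:
            choice = judge(a, b)  # shown in original order
            tally["A" if choice == "A" else "B"] += 1
        else:
            choice = judge(b, a)  # shown swapped
            tally["A" if choice == "B" else "B"] += 1
    return tally
```

Under BLP the judge would be an LLM prompted with the source passage plus both candidate translations; under MHP it is a human annotator who sees only the target-language texts.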
Monolingual human preference evaluations compared the quality of TRANSAGENTS translations with that of other methods, and evaluators consistently preferred the TRANSAGENTS output. Under BLP, the preference for its detailed and varied translations was equally clear. The system also proved far more cost-effective: translating the entire test set cost $500, roughly 80 times cheaper than the reference human translation, which was billed at $168.48 per chapter.
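The reported figures allow a back-of-envelope sanity check of the cost claim; note that the chapter count below is inferred from the article's numbers, not stated in it:

```python
# Figures quoted in the article.
transagents_total = 500.00   # USD, TRANSAGENTS on the entire test set
human_per_chapter = 168.48   # USD, reference translator's per-chapter rate
reported_ratio = 80          # "80 times cheaper"

# Implied size of the test set if the ratio holds exactly
# (an inference for illustration, not a figure from the article).
implied_chapters = reported_ratio * transagents_total / human_per_chapter
print(f"implied chapters: {implied_chapters:.0f}")  # roughly 237 chapters
```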
In conclusion, the researchers presented TRANSAGENTS, a multi-agent virtual company for literary translation that mirrors the traditional translation publication process. Despite its lower d-BLEU scores, both human evaluators and language models preferred its translations over those of other methods, and its cost-effectiveness highlights its potential as a valuable tool for literary translation. Its limitations, however, underscore ongoing challenges in machine translation evaluation, including weak automatic metrics and the questionable reliability of reference translations.