Aggregate Automated MT Evaluation Scores (Zainaldin et al. · 2026)

Text  Model    BLEU-4        chrF++        METEOR        ROUGE-L       BERTScore     COMET         BLEURT
Mix   ChatGPT  31.4 (± 6.1)  53.4 (± 3.7)  46.4 (± 5.4)  50.9 (± 5.3)  91.0 (± 1.2)  79.9 (± 1.9)  49.8 (± 3.4)

1 Pith paper cites this work. Polarity classification is still indexing.


representative citing papers

Evaluating LLM-Based Translation of a Low-Resource Technical Language: The Medical and Philosophical Greek of Galen

cs.CL · 2026-02-27 · accept · novelty 8.0

LLMs achieve high-quality translations of Galen’s expository Greek (MQM 95.2/100) but lower and bimodal quality on pharmacological texts (79.9/100), with terminology rarity (corpus frequency) predicting failure at r = -0.97.

