Findings of the

Kocmi, Tom, Artemova, Ekaterina, Avramidis, Eleftherios, Bawden, Rachel, Bojar, Ond r ej, Dranch, Konstantin · 2025 · DOI 10.18653/v1/2025.wmt-1.22

9 Pith papers cite this work. Polarity classification is still indexing.

9 Pith papers citing it

open at publisher browse 9 citing papers

representative citing papers

Creativity Bias: How Machine Evaluation Struggles with Creativity in Literary Translations

cs.CL · 2026-05-13 · unverdicted · novelty 7.0

Automatic evaluation tools for literary translations correlate poorly with expert human judgments on creativity and exhibit bias favoring machine-translated texts.

What Does LLM Refinement Actually Improve? A Systematic Study on Document-Level Literary Translation

cs.CL · 2026-05-13 · accept · novelty 7.0

Document-level machine translation followed by segment-level LLM refinement provides the strongest and most stable improvements in literary translation quality, mainly enhancing fluency and style rather than adequacy.

Combining On-Policy Optimization and Distillation for Long-Context Reasoning in Large Language Models

cs.CL · 2026-05-12 · unverdicted · novelty 7.0

dGRPO merges outcome-based policy optimization with dense teacher guidance from on-policy distillation, yielding more stable long-context reasoning on the new LongBlocks synthetic dataset.

Dynamic Meta-Metrics: Source-Sentence Conditioned Weighting for MT Evaluation

cs.CL · 2026-05-09 · unverdicted · novelty 6.0

Dynamic Meta-Metrics learns source-sentence-conditioned combinations of MT metrics, with MLP-based hard and soft clustering versions outperforming static linear and Gaussian process ensembles on WMT data.

Misaligned by Reward: Socially Undesirable Preferences in LLMs

cs.CL · 2026-05-06 · unverdicted · novelty 6.0

Reward models for LLMs frequently select socially undesirable options across four social domains, show no overall best performer, and exhibit a bias-avoidance versus context-sensitivity trade-off.

Psychologically Potent, Computationally Invisible: LLMs Generate Social-Comparison Triggers They Fail to Detect

cs.CL · 2026-05-01 · unverdicted · novelty 6.0

LLMs generate social media posts that shift readers' perceived standing and comparison-related feelings but fail to reliably detect the same social-comparison triggers via prompt-based classification.

Syntax as a Rosetta Stone: Universal Dependencies for In-Context Coptic Translation

cs.CL · 2026-04-20 · unverdicted · novelty 6.0

Combining dictionary glosses with Universal Dependencies syntactic information in in-context learning produces new state-of-the-art Coptic-English translation results across model sizes.

Why Low-Resource NLP Needs More Than Cross-Lingual Transfer: Lessons Learned from Luxembourgish

cs.CL · 2026-05-11 · unverdicted · novelty 4.0

Cross-lingual transfer and language-specific data efforts are interdependent and complementary for effective low-resource NLP, as demonstrated through Luxembourgish case studies and synthesis.

FMI_SU_Yotkova_Kastreva at SemEval-2026 Task 13: Lightweight Detection of LLM-Generated Code via Stylometric Signals

cs.CL · 2026-05-05 · unverdicted · novelty 3.0

A feature-based decision tree with parsing-derived signals and heuristics detects LLM-generated code in a lightweight, CPU-only setup for SemEval-2026 Task 13.

citing papers explorer

Showing 9 of 9 citing papers.

Creativity Bias: How Machine Evaluation Struggles with Creativity in Literary Translations cs.CL · 2026-05-13 · unverdicted · none · ref 26
Automatic evaluation tools for literary translations correlate poorly with expert human judgments on creativity and exhibit bias favoring machine-translated texts.
What Does LLM Refinement Actually Improve? A Systematic Study on Document-Level Literary Translation cs.CL · 2026-05-13 · accept · none · ref 15
Document-level machine translation followed by segment-level LLM refinement provides the strongest and most stable improvements in literary translation quality, mainly enhancing fluency and style rather than adequacy.
Combining On-Policy Optimization and Distillation for Long-Context Reasoning in Large Language Models cs.CL · 2026-05-12 · unverdicted · none · ref 78
dGRPO merges outcome-based policy optimization with dense teacher guidance from on-policy distillation, yielding more stable long-context reasoning on the new LongBlocks synthetic dataset.
Dynamic Meta-Metrics: Source-Sentence Conditioned Weighting for MT Evaluation cs.CL · 2026-05-09 · unverdicted · none · ref 29
Dynamic Meta-Metrics learns source-sentence-conditioned combinations of MT metrics, with MLP-based hard and soft clustering versions outperforming static linear and Gaussian process ensembles on WMT data.
Misaligned by Reward: Socially Undesirable Preferences in LLMs cs.CL · 2026-05-06 · unverdicted · none · ref 153
Reward models for LLMs frequently select socially undesirable options across four social domains, show no overall best performer, and exhibit a bias-avoidance versus context-sensitivity trade-off.
Psychologically Potent, Computationally Invisible: LLMs Generate Social-Comparison Triggers They Fail to Detect cs.CL · 2026-05-01 · unverdicted · none · ref 153
LLMs generate social media posts that shift readers' perceived standing and comparison-related feelings but fail to reliably detect the same social-comparison triggers via prompt-based classification.
Syntax as a Rosetta Stone: Universal Dependencies for In-Context Coptic Translation cs.CL · 2026-04-20 · unverdicted · none · ref 10
Combining dictionary glosses with Universal Dependencies syntactic information in in-context learning produces new state-of-the-art Coptic-English translation results across model sizes.
Why Low-Resource NLP Needs More Than Cross-Lingual Transfer: Lessons Learned from Luxembourgish cs.CL · 2026-05-11 · unverdicted · none · ref 165
Cross-lingual transfer and language-specific data efforts are interdependent and complementary for effective low-resource NLP, as demonstrated through Luxembourgish case studies and synthesis.
FMI_SU_Yotkova_Kastreva at SemEval-2026 Task 13: Lightweight Detection of LLM-Generated Code via Stylometric Signals cs.CL · 2026-05-05 · unverdicted · none · ref 237
A feature-based decision tree with parsing-derived signals and heuristics detects LLM-generated code in a lightweight, CPU-only setup for SemEval-2026 Task 13.

Findings of the

fields

years

verdicts

representative citing papers

citing papers explorer