A Pocket Offline Model for Simultaneous Speech Translation as CUNI Submission to IWSLT 2026

Aziz Sharipov Ortega; Dominik Macháček

arxiv: 2606.03948 · v1 · pith:TNF2VMZ6new · submitted 2026-06-02 · 💻 cs.CL

keywords: translation, model, simultaneous, english, iwslt, offline, speech, alignatt

abstract

We add simultaneous translation capability to Canary, an offline direct speech-to-text translation model, using the state-of-the-art AlignAtt policy, and submit the resulting system to the IWSLT 2026 Simultaneous Speech Translation Shared Task for Czech-to-English, English-to-German, and English-to-Italian. The strengths of our system are: (1) high translation quality, outperforming similarly sized baselines in both low- and high-latency regimes in computationally unaware simulations; (2) low computational requirements, as the model has only 1B parameters; (3) multilinguality, with support for 25 source and 25 target languages.
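For readers unfamiliar with AlignAtt, the following is a minimal sketch of the emission rule as it is commonly described: a candidate token is emitted only if the audio frame it attends to most is not among the last f received frames; otherwise emission stops until more audio arrives. This is not the authors' implementation. The function name `alignatt_emit`, its parameters, and the plain-list attention format are hypothetical, chosen only for illustration.

```python
from typing import List, Sequence


def alignatt_emit(
    tokens: Sequence[str],
    attention: Sequence[Sequence[float]],
    num_frames: int,
    f: int,
) -> List[str]:
    """Return the prefix of `tokens` that is safe to emit (illustrative only).

    tokens     -- candidate output tokens from the offline model (hypothetical input)
    attention  -- attention[i][j]: cross-attention weight of token i on audio frame j
    num_frames -- number of audio frames received so far
    f          -- size of the "too recent" window at the end of the audio
    """
    emitted: List[str] = []
    for token, weights in zip(tokens, attention):
        # Index of the audio frame this token attends to most strongly.
        most_attended = max(range(num_frames), key=lambda j: weights[j])
        # If that frame lies within the last f frames, the token depends on
        # audio that may still be incomplete: stop and wait for more input.
        if most_attended >= num_frames - f:
            break
        emitted.append(token)
    return emitted


# Toy example: 6 audio frames, 3 candidate tokens, window f = 2.
toy_attention = [
    [0.7, 0.1, 0.1, 0.05, 0.03, 0.02],  # peaks at frame 0
    [0.1, 0.1, 0.1, 0.6, 0.05, 0.05],   # peaks at frame 3
    [0.0, 0.0, 0.1, 0.1, 0.2, 0.6],     # peaks at frame 5 (inside the window)
]
result = alignatt_emit(["Guten", "Morgen", "alle"], toy_attention, num_frames=6, f=2)
```

In this toy input the third token peaks inside the last two frames, so only the first two tokens are emitted. A larger f makes the rule more conservative, which generally trades higher latency for more stable output.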
