Transformer Transducer: A Streamable Speech Recognition Model

Ankita Pasad, Ju-Chieh Chou, Karen Livescu · 2021 · arXiv 1503.2021

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Beyond Decodability: Reconstructing Language Model Representations with an Encoding Probe

cs.CL · 2026-05-01 · unverdicted · novelty 6.0

An encoding probe reconstructs transformer representations from acoustic, phonetic, syntactic, lexical and speaker features, showing independent syntactic/lexical contributions and training-dependent speaker effects.

WhisperPipe: A Resource-Efficient Streaming Architecture for Real-Time Automatic Speech Recognition

cs.CL · 2026-04-28 · unverdicted · novelty 5.0

WhisperPipe delivers 89 ms median latency and 48% lower peak GPU memory than standard Whisper while keeping word error rate within 2% of the offline model.

citing papers explorer

Showing 2 of 2 citing papers.

Beyond Decodability: Reconstructing Language Model Representations with an Encoding Probe

cs.CL · 2026-05-01 · unverdicted · none · ref 35

WhisperPipe: A Resource-Efficient Streaming Architecture for Real-Time Automatic Speech Recognition

cs.CL · 2026-04-28 · unverdicted · none · ref 13
