Orevaoghene Ahia, Martijn Bartelds, Kabir Ahuja, Hila Gonen, Valentin Hofmann, Siddhant Arora, Shuyue Stella Li, Vishal Puttagunta, Mofetoluwa Adeyemi, Charishma Buchireddy, Ben Walls, Noah Bennett, Shinji Watanabe, Noah A. Smith, Yulia Tsvetkov, and Sachin Kumar · 2025 · arXiv 2505.03054

2 Pith papers cite this work. Polarity classification is still indexing.


representative citing papers

Listening with Time: Precise Temporal Awareness for Long-Form Audio Understanding

eess.AS · 2026-04-24 · unverdicted · novelty 7.0

LAT-Audio introduces a global-to-local reasoning approach with TWA-CoT that outperforms prior models on temporal tasks for audio up to 30 minutes.

Generating Synthetic Doctor-Patient Conversations for Long-form Audio Summarization

cs.SD · 2026-04-07 · conditional · novelty 6.0

A three-stage synthetic data pipeline generates 8800 doctor-patient conversations totaling 1.3k hours of audio and LLM-produced SOAP notes, with evaluation showing cascaded transcription-then-summarization models outperform end-to-end audio models.

citing papers explorer

Showing 2 of 2 citing papers.

Listening with Time: Precise Temporal Awareness for Long-Form Audio Understanding · eess.AS · 2026-04-24 · unverdicted · none · ref 1
Generating Synthetic Doctor-Patient Conversations for Long-form Audio Summarization · cs.SD · 2026-04-07 · conditional · none · ref 14
