Efficient and generalizable speaker diarization via structured pruning of self-supervised models
2 Pith papers cite this work. Polarity classification is still indexing.
fields: eess.AS (2)
years: 2026 (2)
verdicts: UNVERDICTED (2)
representative citing papers
Cross-lifespan evaluation shows that adult-trained speech foundation models degrade on child and older-adult data; joint multi-age training and targeted adaptation improve robustness, especially with the Whisper encoder.
citing papers explorer
- Audio-Mind: An Auditable Agentic Framework for Audio Understanding
Audio-Mind introduces a conditional, auditable agentic framework for audio understanding that preserves frontend judgment and acquires bounded external evidence only when needed, reporting 80.4% on MMAR and 82.8% on MSU-Bench.
- Exploring Speech Foundation Models for Speaker Diarization Across Lifespan
Cross-lifespan evaluation shows that adult-trained speech foundation models degrade on child and older-adult data; joint multi-age training and targeted adaptation improve robustness, especially with the Whisper encoder.