LoSATok proposes a compact 128-dimensional semantic-acoustic tokenizer with a semantic bottleneck, a time-relation loss, and dual-level supervision, and claims competitive understanding performance and improved DiT generation across audio domains.
1 Pith paper cites this work. Polarity classification is still being indexed.
Fields: eess.AS (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
- LoSATok: Low-dimensional Semantic-Acoustic Tokenizer for Cross-Domain Audio Understanding and Generation