Title resolution pending

10 Pith papers cite this work. Polarity classification is still indexing.

citation summary

years: 2026 (9), 2025 (1)

roles: background (2)

polarities: background (2)

representative citing papers

X-VC: Zero-shot Streaming Voice Conversion in Codec Space

eess.AS · 2026-04-14 · unverdicted · novelty 7.0

X-VC achieves zero-shot streaming voice conversion via one-step codec-space conversion, using a dual-conditioning acoustic converter and role-assignment training on generated paired data.

Multi-Faceted Interactivity Alignment in Full-Duplex Speech Models

cs.CL · 2026-06-09 · unverdicted · novelty 6.0

A multi-axis RL alignment technique improves pause handling, turn-taking, backchanneling, and interruption response in full-duplex spoken dialogue models by optimizing axis-specific rewards derived from human audio segments.

BareWave: Waveform-Native Flow-Matching Text-to-Speech

eess.AS · 2026-06-08 · unverdicted · novelty 6.0

BareWave develops a waveform-native flow-matching framework for direct text-to-waveform TTS using representation alignment, staged noise scheduling, and velocity-aware perceptual alignment to achieve strong zero-shot voice cloning results.

Quaternion Self-Attention with Shared Scores

cs.LG · 2026-05-24 · unverdicted · novelty 6.0

Shared-score quaternion self-attention reduces score multiplications by 75% and softmax operations from four to one while proving equivalence to component-wise attention under quaternion linear projections.

Voxtral TTS

cs.AI · 2026-03-26 · unverdicted · novelty 5.0

Voxtral TTS produces expressive multilingual speech from 3 seconds of reference audio with a hybrid autoregressive-plus-flow-matching architecture and a new VQ-FSQ tokenizer, achieving a 68.4% win rate over ElevenLabs in human evaluations.

AT-ADD: All-Type Audio Deepfake Detection Challenge Evaluation Plan

cs.SD · 2026-04-09 · unverdicted · novelty 3.0

AT-ADD introduces standardized tracks and datasets for evaluating audio deepfake detectors on speech under real-world conditions and on diverse unknown audio types to promote generalization beyond speech-centric methods.
