IndexTTS2: A Breakthrough in Emotionally Expressive and Duration-Controlled Auto-Regressive Zero-Shot Text-to-Speech

22 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 1 · baseline 1 · dataset 1

years

2026: 22

representative citing papers

CoSyncDiT: Cognitive Synchronous Diffusion Transformer for Movie Dubbing

cs.SD · 2026-04-14 · unverdicted · novelty 7.0

CoSyncDiT is a cognitive-inspired diffusion transformer that achieves state-of-the-art lip synchronization and naturalness in movie dubbing by guiding noise-to-speech generation through acoustic, visual, and contextual stages plus joint regularization.

UniVocal: Unified Speech-Singing Code-Switching Synthesis

cs.SD · 2026-06-01 · unverdicted · novelty 6.0

UniVocal presents a text-context-only framework for speech-singing code-switching synthesis via two-stage curriculum learning and a synthetic data pipeline, claiming SOTA on a new benchmark.

RTCFake: Speech Deepfake Detection in Real-Time Communication

cs.SD · 2026-04-26 · unverdicted · novelty 6.0

RTCFake is the first large-scale dataset of real-time communication speech deepfakes paired with their offline versions, accompanied by a phoneme-guided consistency learning method that improves cross-platform and noise-robust detection.

DeepSlide: From Artifacts to Presentation Delivery

cs.AI · 2026-04-01 · unverdicted · novelty 6.0

DeepSlide introduces a multi-agent system for full presentation preparation that matches baselines on slide quality but improves narrative flow, pacing, and script synergy via a new dual-scoreboard benchmark.
