pith. sign in

HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis

6 Pith papers cite this work. Polarity classification is still indexing.

6 Pith papers citing it

citation-role summary

method 2

citation-polarity summary

years

2026 5 2025 1

roles

method 2

polarities

use method 2

clear filters

representative citing papers

Step-Audio 2 Technical Report

cs.CL · 2025-07-22 · unverdicted · novelty 6.0

Step-Audio 2 integrates a latent audio encoder, reasoning-centric reinforcement learning, and discrete audio token generation into language modeling to deliver state-of-the-art performance on audio understanding and conversational benchmarks.

Woosh: A Sound Effects Foundation Model

cs.SD · 2026-04-02 · accept · novelty 5.0

Woosh is a new publicly released foundation model optimized for high-quality sound effect generation from text or video, showing competitive or better results than open alternatives like Stable Audio Open.

LTX-2: Efficient Joint Audio-Visual Foundation Model

cs.CV · 2026-01-06 · conditional · novelty 5.0

LTX-2 generates high-quality synchronized audiovisual content from text prompts via an asymmetric 14B-video / 5B-audio dual-stream transformer with cross-attention and modality-aware guidance.

citing papers explorer

Showing 2 of 2 citing papers after filters.

  • Woosh: A Sound Effects Foundation Model cs.SD · 2026-04-02 · accept · none · ref 20

    Woosh is a new publicly released foundation model optimized for high-quality sound effect generation from text or video, showing competitive or better results than open alternatives like Stable Audio Open.

  • LTX-2: Efficient Joint Audio-Visual Foundation Model cs.CV · 2026-01-06 · conditional · none · ref 13

    LTX-2 generates high-quality synchronized audiovisual content from text prompts via an asymmetric 14B-video / 5B-audio dual-stream transformer with cross-attention and modality-aware guidance.