Yonghui Wu

Identifiers

  • name variant Yonghui Wu 0.60 · backfill

Papers (52)

  1. MARCH: Multi-Agent Radiology Clinical Hierarchy for CT Report Generation cs.AI · 2026 · author #3
  2. Seedance 2.0: Advancing Video Generation for World Complexity cs.CV · 2026 · author #127
  3. Detecting HIV-Related Stigma in Clinical Narratives Using Large Language Models cs.CL · 2026 · author #10
  4. A Parameter-Efficient Transfer Learning Approach through Multitask Prompt Distillation and Decomposition for Clinical NLP cs.CL · 2026 · author #4
  5. Retrieval-Augmented LLMs for Evidence Localization in Clinical Trial Recruitment from Longitudinal EHR Narratives cs.CL · 2026 · author #4
  6. Improving Automatic Summarization of Radiology Reports through Mid-Training of Large Language Models cs.CL · 2026 · author #6
  7. Seedance 1.5 pro: A Native Audio-Visual Joint Generation Foundation Model cs.CV · 2025 · author #139
  8. Seedream 4.0: Toward Next-generation Multimodal Image Generation cs.CV · 2025 · author #34
  9. Seed Diffusion: A Large-Scale Diffusion Language Model with High-Speed Inference cs.CL · 2025 · author #21
  10. Gemini 2.5: Pushing the Frontier with Advanced Reasoning, Multimodality, Long Context, and Next Generation Agentic Capabilities cs.CL · 2025 · author #1454
  11. Seed1.5-VL Technical Report cs.CV · 2025 · author #170
  12. VAPO: Efficient and Reliable Reinforcement Learning for Advanced Reasoning Tasks cs.AI · 2025 · author #26
  13. DAPO: An Open-Source LLM Reinforcement Learning System at Scale cs.LG · 2025 · author #34
  14. Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context cs.CL · 2024 · author #1132
  15. Gemini: A Family of Highly Capable Multimodal Models cs.CL · 2023 · author #1347
  16. PaLM 2 Technical Report cs.CL · 2023 · author #128
  17. Scaling Autoregressive Models for Content-Rich Text-to-Image Generation cs.CV · 2022 · author #17
  18. CoCa: Contrastive Captioners are Image-Text Foundation Models cs.CV · 2022 · author #6
  19. Vector-quantized Image Modeling with Improved VQGAN cs.CV · 2021 · author #10
  20. GSPMD: General and Scalable Parallelization for ML Computation Graphs cs.DC · 2021 · author #15
  21. Massively Multilingual Neural Machine Translation in the Wild: Findings and Challenges cs.CL · 2019 · author #13
  22. Learning to Speak Fluently in a Foreign Language: Multilingual Speech Synthesis and Cross-Language Voice Cloning cs.CL · 2019 · author #4
  23. Gmail Smart Compose: Real-Time Assisted Writing cs.CL · 2019 · author #12
  24. Direct speech-to-speech translation with a sequence-to-sequence model cs.CL · 2019 · author #7
  25. LibriTTS: A Corpus Derived from LibriSpeech for Text-to-Speech cs.SD · 2019 · author #8
  26. Lingvo: a Modular and Scalable Framework for Sequence-to-Sequence Modeling cs.LG · 2019 · author #3
  27. Bytes are All You Need: End-to-End Multilingual Speech Recognition and Synthesis with Bytes eess.AS · 2018 · author #4
  28. GPipe: Efficient Training of Giant Neural Networks using Pipeline Parallelism cs.CV · 2018 · author #10
  29. Streaming End-to-end Speech Recognition For Mobile Devices cs.CL · 2018 · author #9
  30. Leveraging Weakly Supervised Data to Improve End-to-End Speech-to-Text Translation cs.CL · 2018 · author #9
  31. Hierarchical Generative Modeling for Controllable Speech Synthesis cs.CL · 2018 · author #5
  32. Training Deeper Neural Machine Translation Models with Transparent Attention cs.CL · 2018 · author #5
  33. A Comparison of Techniques for Language Model Integration in Encoder-Decoder Speech Recognition eess.AS · 2018 · author #4
  34. Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis cs.CL · 2018 · author #11
  35. The Best of Both Worlds: Combining Recent Advances in Neural Machine Translation cs.CL · 2018 · author #11
  36. Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions cs.CL · 2017 · author #13
  37. An analysis of incorporating an external language model into a sequence-to-sequence model eess.AS · 2017 · author #2
  38. No Need for a Lexicon? Evaluating the Value of the Pronunciation Lexica in End-to-End Models cs.CL · 2017 · author #10
  39. Minimum Word Error Rate Training for Attention-based Sequence-to-Sequence Models cs.CL · 2017 · author #3
  40. Improving the Performance of Online Neural Transducer Models cs.CL · 2017 · author #5
  41. State-of-the-art Speech Recognition With Sequence-to-Sequence Models cs.CL · 2017 · author #3
  42. Multi-Dialect Speech Recognition With A Single Sequence-To-Sequence Model eess.AS · 2017 · author #8
  43. Speech recognition for medical conversations cs.CL · 2017 · author #13
  44. Tacotron: Towards End-to-End Speech Synthesis cs.CL · 2017 · author #4
  45. Sequence-to-Sequence Models Can Directly Translate Foreign Speech cs.CL · 2017 · author #4
  46. Google's Multilingual Neural Machine Translation System: Enabling Zero-Shot Translation cs.CL · 2016 · author #5
  47. Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation cs.CL · 2016 · author #1
  48. Reward Augmented Maximum Likelihood for Neural Structured Prediction cs.LG · 2016 · author #6
  49. Exploring the Limits of Language Modeling cs.CL · 2016 · author #5
  50. Barcoding-free BAC Pooling Enables Combinatorial Selective Sequencing of the Barley Gene Space q-bio.GN · 2011 · author #7
  51. Prisoner's dilemma in structured scale-free networks physics.soc-ph · 2009 · author #2
  52. A unified model for Sierpinski networks with scale-free scaling and small-world effect cond-mat.dis-nn · 2009 · author #5

Mentions

  • 1112.4438 #7 · backfill · confidence 0.70 Yonghui Wu
  • 2105.04663 #15 · arxiv_oai · confidence 0.70 Yonghui Wu
  • 2110.04627 #10 · arxiv_oai · confidence 0.70 Yonghui Wu
  • 0905.2724 #2 · backfill · confidence 0.70 Yonghui Wu
  • 2512.13507 #139 · arxiv_oai · confidence 0.70 Yonghui Wu
  • 0903.3997 #5 · backfill · confidence 0.70 Yonghui Wu
  • 2508.02193 #21 · arxiv_oai · confidence 0.70 Yonghui Wu

Frequent Coauthors