Weizhu Chen
Identifiers
- name variant Weizhu Chen 0.60 · backfill
Papers (22)
- Shuffle the Context: RoPE-Perturbed Self-Distillation for Long-Context Adaptation · cs.CL · 2026 · author #6
- Rethinking Language Model Scaling under Transferable Hypersphere Optimization · cs.LG · 2026 · author #4
- ThetaEvolve: Test-time Learning on Open Problems · cs.LG · 2025 · author #13
- Reinforcement Learning for Reasoning in Large Language Models with One Training Example · cs.LG · 2025 · author #11
- Phi-4-Mini Technical Report: Compact yet Powerful Multimodal Language Models via Mixture-of-LoRAs · cs.CL · 2025 · author #14
- Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone · cs.CL · 2024 · author #20
- CRITIC: Large Language Models Can Self-Correct with Tool-Interactive Critiquing · cs.CL · 2023 · author #7
- AGIEval: A Human-Centric Benchmark for Evaluating Foundation Models · cs.CL · 2023 · author #8
- AdaLoRA: Adaptive Budget Allocation for Parameter-Efficient Fine-Tuning · cs.CL · 2023 · author #7
- Check Your Facts and Try Again: Improving Large Language Models with External Knowledge and Automated Feedback · cs.CL · 2023 · author #10
- CodeT: Code Generation with Generated Tests · cs.CL · 2022 · author #7
- DeBERTaV3: Improving DeBERTa using ELECTRA-Style Pre-Training with Gradient-Disentangled Embedding Sharing · cs.CL · 2021 · author #3
- LoRA: Low-Rank Adaptation of Large Language Models · cs.CL · 2021 · author #8
- DeBERTa: Decoding-enhanced BERT with Disentangled Attention · cs.CL · 2020 · author #4
- Lessons from Contextual Bandit Learning in a Customer Support Bot · cs.LG · 2019 · author #6
- Improving Multi-Task Deep Neural Networks via Knowledge Distillation for Natural Language Understanding · cs.CL · 2019 · author #3
- Multi-Task Deep Neural Networks for Natural Language Understanding · cs.CL · 2019 · author #3
- IncSQL: Training Incremental Text-to-SQL Parsers with Non-Deterministic Oracles · cs.CL · 2018 · author #6
- Learning to Attend On Essential Terms: An Enhanced Retriever-Reader Model for Open-domain Question Answering · cs.CL · 2018 · author #3
- FusionNet: Fusing via Fully-Aware Attention with Application to Machine Comprehension · cs.CL · 2017 · author #4
- DSCOVR: Randomized Primal-Dual Block Coordinate Algorithms for Asynchronous Distributed Optimization · math.OC · 2017 · author #4
- ReasoNet: Learning to Stop Reading in Machine Comprehension · cs.LG · 2016 · author #4
Mentions
- 2302.12813 #10 · arxiv_oai · confidence 0.70 · Weizhu Chen
- 2511.23473 #13 · arxiv_oai · confidence 0.70 · Weizhu Chen
- 2304.06364 #8 · arxiv_oai · confidence 0.70 · Weizhu Chen
- 2207.10397 #7 · arxiv_oai · confidence 0.70 · Weizhu Chen
- 2504.20571 #11 · arxiv_oai · confidence 0.70 · Weizhu Chen
Frequent Coauthors
- Yelong Shen 10 shared papers
- Jianfeng Gao 8 shared papers
- Pengcheng He 7 shared papers
- Liliang Ren 6 shared papers
- Hao Cheng 4 shared papers
- Shuohang Wang 4 shared papers
- XiaoDong Liu 4 shared papers
- Anh Nguyen 3 shared papers
- Baolin Peng 3 shared papers
- Chen Liang 3 shared papers
- Nikos Karampatziakis 3 shared papers
- Zeqi Lin 3 shared papers
- Abhishek Goswami 2 shared papers
- Alon Benhaim 2 shared papers
- Amin Saied 2 shared papers
- Amit Garg 2 shared papers
- Arindam Mitra 2 shared papers
- Chenguang Zhu 2 shared papers
- Chong Luo 2 shared papers
- Daniel Perez-Becker 2 shared papers