Qingpeng Cai

Identifiers

  • name variant Qingpeng Cai 0.60 · backfill

Papers (10)

  1. Reinforced Preference Optimization for Reasoning-Augmented Recommendations · cs.IR · 2026 · author #9
  2. Position-Aware Drafting for Inference Acceleration in LLM-Based Generative List-Wise Recommendation · cs.IR · 2026 · author #5
  3. Phase-Aware Mixture of Experts for Agentic Reinforcement Learning · cs.AI · 2026 · author #5
  4. When Importance Sampling Misallocates Credit: Asymmetric Ratios for Outcome-Supervised RL · cs.CL · 2025 · author #3
  5. Reinforcement Learning Driven Heuristic Optimization · cs.LG · 2019 · author #1
  6. Policy Optimization with Model-based Explorations · cs.LG · 2018 · author #2
  7. Deterministic Policy Gradients With General State Transitions · cs.LG · 2018 · author #1
  8. A Deep Reinforcement Learning Framework for Rebalancing Dockless Bike Sharing Systems · cs.AI · 2018 · author #2
  9. Policy Gradients for Contextual Recommendations · cs.LG · 2018 · author #2
  10. Reinforcement Mechanism Design for e-commerce · cs.MA · 2017 · author #1

Mentions

  • 2605.21967 #9 · arxiv_oai · confidence 0.70 Qingpeng Cai
  • 2602.17038 #5 · arxiv_oai · confidence 0.70 Qingpeng Cai
  • 2510.06062 #3 · arxiv_oai · confidence 0.70 Qingpeng Cai

Frequent Coauthors