Title resolution pending

Edward J Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang · 2022

9 Pith papers cite this work. Polarity classification is still indexing.

Title metadata for this work has not finished resolving. The hub is built from the citation graph; the title resolver retries DOI and OpenAlex on its next pass.

citation-role summary

method 1

citation-polarity summary

use method 1

representative citing papers

SWE-bench: Can Language Models Resolve Real-World GitHub Issues?

cs.CL · 2023-10-10 · unverdicted · novelty 8.0

SWE-bench reveals that even top language models like Claude 2 resolve only 1.96% of 2,294 real-world GitHub issues, highlighting a gap in practical coding capabilities.

HL-OutPaint: Coarse-to-Fine Video Outpainting for High-Resolution Long-Range Videos

cs.CV · 2026-05-17 · unverdicted · novelty 7.0

HL-OutPaint enables high-resolution outpainting of long video sequences via a coarse-to-fine pipeline that first builds Global Coarse Guidance through global-local frame swapping then synthesizes details.

Text-to-Distribution Prediction with Quantile Tokens and Neighbor Context

cs.CL · 2026-04-22 · unverdicted · novelty 7.0

Quantile tokens inserted into LLM inputs combined with neighbor retrieval enable direct prediction of full distributions, yielding lower MAPE and narrower intervals than baselines on Airbnb and StackSample tasks.

HORIZON: A Benchmark for In-the-wild User Behaviour Modeling

cs.IR · 2026-04-19 · unverdicted · novelty 7.0

HORIZON creates a cross-domain, long-horizon user modeling benchmark from Amazon Reviews that tests generalization across time, domains, and unseen users, exposing gaps in sequential and LLM-based recommendation models.

Always Learning, Always Mixing: Efficient and Simple Data Mixing All The Time

cs.CL · 2026-05-13 · conditional · novelty 6.0

OP-Mix is an on-policy data mixing method that uses low-rank adapter interpolation to find near-optimal data mixtures throughout language model training with reduced compute.

POETS: Uncertainty-Aware LLM Optimization via Compute-Efficient Policy Ensembles

cs.LG · 2026-05-08 · unverdicted · novelty 6.0

POETS uses compute-efficient LLM policy ensembles to implicitly perform KL-regularized Thompson sampling, delivering O(sqrt(T gamma_T)) regret bounds and state-of-the-art sample efficiency in scientific discovery tasks such as protein search and quantum circuit design.

BaLoRA: Bayesian Low-Rank Adaptation of Large Scale Models

cs.LG · 2026-04-27 · unverdicted · novelty 6.0

BaLoRA is a Bayesian LoRA variant with input-adaptive noise that improves accuracy over standard LoRA and supplies well-calibrated uncertainty estimates on language, vision, and scientific prediction tasks.

Representation-Guided Parameter-Efficient LLM Unlearning

cs.CL · 2026-04-19 · unverdicted · novelty 6.0

REGLU guides LoRA-based unlearning via representation subspaces and orthogonal regularization to outperform prior methods on forget-retain trade-off in LLM benchmarks.

Position: Agentic AI System Is a Foreseeable Pathway to AGI

cs.AI · 2026-05-13 · unverdicted · novelty 4.0

Agentic AI systems with DAG topologies are claimed to deliver exponentially superior generalization and sample efficiency compared to monolithic scaling for achieving AGI.

citing papers explorer

Showing 1 of 1 citing paper after filters.

HORIZON: A Benchmark for In-the-wild User Behaviour Modeling

cs.IR · 2026-04-19 · unverdicted · none · ref 44

HORIZON creates a cross-domain, long-horizon user modeling benchmark from Amazon Reviews that tests generalization across time, domains, and unseen users, exposing gaps in sequential and LLM-based recommendation models.
