VirtualME is a new infrastructure that continuously extracts and interprets in-IDE developer behaviors to build personalized personas, delivering 33.8% better performance on repository-level knowledge Q&A than generic baselines.
Search-based LLMs for code optimization
4 Pith papers cite this work. Polarity classification is still indexing.
years: 2026 (4)
verdicts: UNVERDICTED (4)
representative citing papers
AdverMCTS frames code generation as a minimax game where an attacker evolves tests to expose flaws in solver-generated code, yielding more robust outputs than static-test baselines.
AutoTrainess exposes training operations via agent-computer interfaces and outperforms CLI-only baselines on PostTrainBench with scores of 26.94 vs 23.21 for GPT-5.4 and similar gains on other models.
AutoPass uses evidence from compiler states and runtime feedback to guide LLM agents in tuning LLVM optimizations, delivering 1.043x and 1.117x geometric-mean speedups over -O3 on x86-64 and ARM64.
citing papers explorer
- AutoTrainess: Teaching Language Models to Improve Language Models Autonomously