Canonical reference

A Survey on In-context Learning

Dong, Qingxiu; Li, Lei; Dai, Damai; Zheng, Ce; Ma, Jingyuan; Li, Rui · 2024 · DOI 10.18653/v1/2024.emnlp-main.64

Canonical reference. 100% of citing Pith papers cite this work as background.

26 Pith papers citing it

Background 100% of classified citations

citation-role summary

background 6

citation-polarity summary

background 6

representative citing papers

When Context Returns: Toward Robust Internalization in On-Policy Distillation

cs.LG · 2026-06-10 · unverdicted · novelty 7.0

A stop-gradient consistency regularizer mitigates context-induced degradation in on-policy distillation, improving robustness across 12 configurations.

HDSL: A Hierarchical Domain-Specific Language for Structured 3D Indoor Scene Generation and Localized Editing with LLM Agents

cs.CV · 2026-06-08 · unverdicted · novelty 7.0

HDSL is a tree-structured DSL for 3D indoor scenes that lets LLM agents generate subtrees recursively and perform localized edits via hierarchical retrieval and deterministic merge.

Why Do Time Series Models Need Long Context Windows?

cs.LG · 2026-06-01 · unverdicted · novelty 7.0

Long input windows are required to identify the generative process in time series forecasting even for short-memory processes, and decoupling identification from forecasting improves scalability.

NL2SQLBench: A Modular Benchmarking Framework for LLM-Enabled NL2SQL Solutions

cs.DB · 2026-04-13 · conditional · novelty 7.0

NL2SQLBench is a new modular benchmarking framework that evaluates LLM NL2SQL methods across three core modules on existing datasets, exposing large accuracy gaps and computational inefficiency.

Measuring and Exploiting Contextual Bias in LLM-Assisted Security Code Review

cs.SE · 2026-03-19 · accept · novelty 7.0

LLM-based security code review is vulnerable to framing bias, with a novel iterative refinement attack achieving 100% success in reintroducing vulnerabilities across real projects.

Dissociative Identity: Language Model Agents Lack Grounding for Reputation Mechanisms

cs.CY · 2026-05-28 · unverdicted · novelty 6.0

LM agents' changeable modules prevent persistent identity and sanction sensitivity, making reputation mechanisms structurally inapplicable and requiring protocol-based behavioral harnesses instead.

How Many Different Outputs Can a Transformer Generate?

cs.LG · 2026-05-21 · unverdicted · novelty 6.0

The number of output sequences accessible to a Transformer grows only linearly with prompt length, and the accessible proportion decays exponentially beyond a critical point, even under unbounded context.

Always Learning, Always Mixing: Efficient and Simple Data Mixing All The Time

cs.CL · 2026-05-13 · conditional · novelty 6.0

OP-Mix is an on-policy data mixing method that uses low-rank adapter interpolation to find near-optimal data mixtures throughout language model training with reduced compute.

Predicting Decisions of AI Agents from Limited Interaction through Text-Tabular Modeling

cs.LG · 2026-05-12 · unverdicted · novelty 6.0

A tabular foundation model with LLM-as-Observer features predicts AI agent decisions in controlled games, outperforming baselines by 4 AUC points and 14% lower error at K=16 interactions.

Adversarial SQL Injection Generation with LLM-Based Architectures

cs.CR · 2026-05-11 · unverdicted · novelty 6.0

RADAGAS-GPT4o achieves a 22.73% bypass rate against 10 WAFs, succeeding more against AI/ML-based firewalls than rule-based ones.

Belief or Circuitry? Causal Evidence for In-Context Graph Learning

cs.AI · 2026-05-08 · conditional · novelty 6.0

Causal evidence from representation analysis and interventions shows LLMs use both genuine structure inference and induction circuits in parallel for in-context graph learning.

cs.SE · 2026-05-08 · unverdicted · novelty 6.0

SPARK improves LLM-based test code fault localization by retrieving similar past faults and selectively annotating suspicious lines in new failing tests.

GRaSp: Automatic Example Optimization for In-Context Learning in Low-Data Tasks

cs.CL · 2026-05-08 · unverdicted · novelty 6.0

GRaSp optimizes in-context examples for LLMs via synthetic generation, clustering, dimensionality reduction, and genetic algorithms with diversity-adaptive mutation, reaching 45.84% micro-F1 on financial NER with real data and outperforming zero-shot and random few-shot baselines.

Leveraging LLMs for Multi-File DSL Code Generation: An Industrial Case Study

cs.SE · 2026-04-27 · unverdicted · novelty 6.0

Fine-tuning 7B code LLMs on a custom multi-file DSL dataset achieves structural fidelity of 1.00, high exact-match accuracy, and practical utility validated by expert survey and execution checks.

Understanding the Mechanism of Altruism in Large Language Models

econ.GN · 2026-04-21 · unverdicted · novelty 6.0

A small set of sparse autoencoder features in LLMs drives shifts between generous and selfish allocations in dictator games, with causal patching and steering confirming their role and generalization to other social games.

Memory Transfer Learning: How Memories are Transferred Across Domains in Coding Agents

cs.AI · 2026-04-15 · conditional · novelty 6.0

Cross-domain memory transfer in coding agents boosts average performance by 3.7% via meta-knowledge like validation routines, with high-level abstractions transferring better than low-level traces.

Progress Ratio Embeddings: An Impatience Signal for Robust Length Control in Neural Text Generation

cs.CL · 2025-12-07 · unverdicted · novelty 6.0

Progress Ratio Embeddings use a trigonometric progress-ratio signal to deliver stable length control in transformers that generalizes to unseen target lengths.

How LLMs See Creativity: Zero-Shot Scoring of Visual Creativity with Interpretable Reasoning

cs.CL · 2026-06-29 · unverdicted · novelty 5.0 · 2 refs

Multimodal LLMs achieve moderate correlations with human visual creativity ratings in zero-shot evaluation across two datasets, with reasoning outputs providing interpretability but no accuracy gain.

Latent Bridges for Multi-Table Question Answering

cs.CL · 2026-06-27 · unverdicted · novelty 5.0

GRAB improves multi-table QA performance by encoding relational data as graphs and bridging structural signals to frozen LLMs through latent tokens.

Topic-to-Timestamp Alignment by Constrained Evidence Selection

cs.CL · 2026-06-18 · unverdicted · novelty 5.0

Constrained candidate selection from retrieved chunks raises Recall@5 from 31.9% to 50.0% and parseable outputs on 420 queries from 200 municipal meeting transcripts.

How LoRA Remembers? A Parametric Memory Law for LLM Finetuning

cs.CL · 2026-05-28 · unverdicted · novelty 5.0

Introduces Parametric Memory Law as power law for LoRA memory capacity and MemFT threshold-guided optimization for better memory fidelity.

Large Language Models as Virtual Survey Respondents: Evaluating Sociodemographic Response Generation

cs.AI · 2025-09-08 · conditional · novelty 5.0

Introduces PAS and FAS task abstractions plus the LLM-S^3 benchmark to evaluate LLMs on generating sociodemographic survey responses across 11 real datasets and multiple models.

Advancing Multi-Agent RAG Systems with Minimalist Reinforcement Learning

cs.CL · 2025-05-20 · unverdicted · novelty 5.0

Mujica-MyGo decomposes multi-turn RAG interactions via multi-agent workflows and applies minimalist policy gradient optimization to improve performance on QA benchmarks while avoiding long-context problems.

Are vision-language models ready to zero-shot replace supervised classification models in agriculture?

cs.CV · 2025-12-17 · unverdicted · novelty 4.0

Zero-shot VLMs reach at most 62% accuracy on agricultural classification tasks, while supervised models such as YOLO11 perform markedly better, indicating that VLMs are not ready to replace task-specific systems.

citing papers explorer

Showing 9 of 9 citing papers after filters.

Always Learning, Always Mixing: Efficient and Simple Data Mixing All The Time

cs.CL · 2026-05-13 · conditional · none · ref 8

OP-Mix is an on-policy data mixing method that uses low-rank adapter interpolation to find near-optimal data mixtures throughout language model training with reduced compute.

GRaSp: Automatic Example Optimization for In-Context Learning in Low-Data Tasks

cs.CL · 2026-05-08 · unverdicted · none · ref 2

GRaSp optimizes in-context examples for LLMs via synthetic generation, clustering, dimensionality reduction, and genetic algorithms with diversity-adaptive mutation, reaching 45.84% micro-F1 on financial NER with real data and outperforming zero-shot and random few-shot baselines.

Progress Ratio Embeddings: An Impatience Signal for Robust Length Control in Neural Text Generation

cs.CL · 2025-12-07 · unverdicted · none · ref 7

Progress Ratio Embeddings use a trigonometric progress-ratio signal to deliver stable length control in transformers that generalizes to unseen target lengths.

How LLMs See Creativity: Zero-Shot Scoring of Visual Creativity with Interpretable Reasoning

cs.CL · 2026-06-29 · unverdicted · none · ref 10 · 2 links

Multimodal LLMs achieve moderate correlations with human visual creativity ratings in zero-shot evaluation across two datasets, with reasoning outputs providing interpretability but no accuracy gain.

Latent Bridges for Multi-Table Question Answering

cs.CL · 2026-06-27 · unverdicted · none · ref 22

GRAB improves multi-table QA performance by encoding relational data as graphs and bridging structural signals to frozen LLMs through latent tokens.

Topic-to-Timestamp Alignment by Constrained Evidence Selection

cs.CL · 2026-06-18 · unverdicted · none · ref 26

Constrained candidate selection from retrieved chunks raises Recall@5 from 31.9% to 50.0% and parseable outputs on 420 queries from 200 municipal meeting transcripts.

How LoRA Remembers? A Parametric Memory Law for LLM Finetuning

cs.CL · 2026-05-28 · unverdicted · none · ref 10

Introduces Parametric Memory Law as power law for LoRA memory capacity and MemFT threshold-guided optimization for better memory fidelity.

Advancing Multi-Agent RAG Systems with Minimalist Reinforcement Learning

cs.CL · 2025-05-20 · unverdicted · none · ref 17

Mujica-MyGo decomposes multi-turn RAG interactions via multi-agent workflows and applies minimalist policy gradient optimization to improve performance on QA benchmarks while avoiding long-context problems.

An Empirical Study of Many-Shot In-Context Learning for Machine Translation of Low-Resource Languages

cs.CL · 2026-04-03 · unreviewed · ref 8
