A Survey on In-context Learning

Baobao Chang; Ce Zheng; Damai Dai; Heming Xia; Jingjing Xu; Jingyuan Ma; Lei Li; Qingxiu Dong; Rui Li; Tianyu Liu

arxiv: 2301.00234 · v6 · submitted 2022-12-31 · 💻 cs.CL · cs.AI

Qingxiu Dong, Lei Li, Damai Dai, Ce Zheng, Jingyuan Ma, Rui Li, Heming Xia, Jingjing Xu, Zhiyong Wu, Tianyu Liu, Baobao Chang, Xu Sun, Zhifang Sui

Pith reviewed 2026-05-12 12:53 UTC · model grok-4.3

classification 💻 cs.CL cs.AI

keywords in-context learning · large language models · prompt design · few-shot learning · natural language processing · survey

The pith

In-context learning allows large language models to make predictions from prompts that include a few task examples without parameter updates.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This survey defines in-context learning formally as a paradigm where large language models condition predictions on a small set of demonstration examples placed in the input. It organizes existing work into training strategies that prepare models for this behavior, prompt design methods that select and format examples, and analyses that probe why the approach succeeds. The review also examines concrete uses such as selecting high-quality data and injecting fresh knowledge into a fixed model. By collecting these threads the authors aim to clarify the current state of the field and point out obstacles that block wider adoption. The result is a map that lets researchers see how different pieces of ICL research fit together.
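As a rough illustration of what "select and format examples" can mean in practice, the sketch below picks the demonstrations most similar to a query and lays them out as a few-shot prompt. It is not a method taken from the survey: the bag-of-words cosine similarity, the `Review:`/`Sentiment:` template, the toy pool, and the helper names `select_demonstrations` and `build_prompt` are all assumptions made for this example, and no model is called.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Lowercased whitespace tokens with their counts (a deliberately crude representation)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def select_demonstrations(pool, query, k):
    """Hypothetical selector: return the k (text, label) pairs most similar to the query."""
    q = bag_of_words(query)
    ranked = sorted(pool, key=lambda ex: cosine(bag_of_words(ex[0]), q), reverse=True)
    return ranked[:k]


def build_prompt(instruction, demos, query):
    """Hypothetical template: instruction, then demonstrations, then the unanswered query."""
    lines = [instruction, ""]
    for text, label in demos:
        lines += [f"Review: {text}", f"Sentiment: {label}", ""]
    lines += [f"Review: {query}", "Sentiment:"]
    return "\n".join(lines)


# Toy demonstration pool, invented for this example.
POOL = [
    ("the plot was dull and the acting was flat", "negative"),
    ("a delightful film with a clever plot", "positive"),
    ("the soundtrack was loud", "negative"),
    ("great acting and a moving story", "positive"),
]

query = "a clever story with great acting"
demos = select_demonstrations(POOL, query, k=2)
prompt = build_prompt("Classify the sentiment of each review.", demos, query)
print(prompt)
```

The resulting string would be passed to a frozen language model, which is expected to continue the final `Sentiment:` line; no parameters are updated at any point. Real systems typically replace the bag-of-words similarity with learned embeddings or a trained retriever.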

Core claim

In-context learning has emerged as a new paradigm for natural language processing in which large language models make predictions based on contexts augmented with a few examples, and this survey organizes the techniques, applications, and open problems to facilitate further research into how such models extrapolate abilities from limited context.

What carries the argument

The formal definition of in-context learning as the process by which large language models generate outputs conditioned on a prompt that contains a small number of demonstration examples.
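One common way to write this definition down is sketched below. The symbols are assumptions made for illustration, since this page does not reproduce the paper's notation: $x$ is the query, $Y$ the set of candidate answers, $I$ an optional task instruction, $s(x_i, y_i)$ a demonstration rendered as text, $k$ the number of demonstrations, and $f_{\mathcal{M}}$ a scoring function computed by the frozen model $\mathcal{M}$.

```latex
\begin{align}
  C &= \{\, I,\ s(x_1, y_1),\ \dots,\ s(x_k, y_k) \,\} \\
  P(y_j \mid x) &\triangleq f_{\mathcal{M}}(y_j,\, C,\, x) \\
  \hat{y} &= \arg\max_{y_j \in Y} P(y_j \mid x)
\end{align}
```

Read this way, the demonstration set $C$ only changes the input on which the model conditions; the parameters of $\mathcal{M}$ are the same before and after prediction.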

Load-bearing premise

The rapidly expanding body of ICL literature can be usefully organized and summarized within the scope and selection criteria of a single survey paper.

What would settle it

Publication of a substantial set of subsequent papers that introduce major ICL techniques or challenges absent from the survey's categories would show the organization does not capture the field's current state.

Original abstract

With the increasing capabilities of large language models (LLMs), in-context learning (ICL) has emerged as a new paradigm for natural language processing (NLP), where LLMs make predictions based on contexts augmented with a few examples. It has been a significant trend to explore ICL to evaluate and extrapolate the ability of LLMs. In this paper, we aim to survey and summarize the progress and challenges of ICL. We first present a formal definition of ICL and clarify its correlation to related studies. Then, we organize and discuss advanced techniques, including training strategies, prompt designing strategies, and related analysis. Additionally, we explore various ICL application scenarios, such as data engineering and knowledge updating. Finally, we address the challenges of ICL and suggest potential directions for further research. We hope that our work can encourage more research on uncovering how ICL works and improving ICL.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

This is a standard, well-organized survey on in-context learning that maps the literature up to early 2023 but introduces no new results or mechanisms.

This survey on in-context learning gives a clear map of the area as it stood in late 2022. It defines ICL, breaks down techniques for training and prompting, lists applications, and flags open challenges. The structure follows the abstract closely: a formal definition first, then correlations to related work, followed by sections on training strategies, prompt design, analysis, applications such as data engineering and knowledge updating, and finally challenges with suggested directions.

The paper does a good job pulling together the main threads without overclaiming. The sections on prompt designing strategies and analysis feel particularly organized, and the discussion of applications is straightforward. Citations draw from major NLP venues and cover a reasonable range of the early ICL papers.

The main limitation is that any survey of a fast-moving topic like this will have gaps in coverage and will age quickly. The authors' choices about what to include and how to categorize are reasonable but ultimately editorial, and some later developments in scaling laws or more advanced prompting will already sit outside the scope. The analysis sections stay descriptive rather than resolving open questions, which is expected for this format. There are no experiments, code, or new derivations here.

This paper is useful for someone entering the ICL literature or needing a single reference to point to recent work. Experienced researchers already deep in the area might skim it for organization ideas but will not find new technical substance. I would send it to peer review. A solid survey helps the community even if it is not groundbreaking.

Referee Report

0 major / 3 minor

Summary. The paper surveys in-context learning (ICL) as an emerging paradigm for large language models (LLMs) in NLP. It begins with a formal definition of ICL and its relation to related concepts, then organizes advanced techniques (training strategies, prompt design, and analyses), reviews applications (e.g., data engineering and knowledge updating), and concludes by addressing challenges and suggesting future research directions.

Significance. A structured survey on ICL would be moderately significant given the field's rapid expansion, as it could consolidate literature on techniques, applications, and open problems to guide researchers. The logical organization from definition through techniques and applications to challenges supports its utility as a reference, though its impact depends on the depth and balance of coverage across the cited works.

minor comments (3)

[Introduction/Abstract] The abstract states that the paper clarifies ICL's correlation to related studies, but the provided outline does not specify how overlaps with few-shot prompting or meta-learning are delineated; adding a dedicated subsection or table comparing these would improve clarity.
The discussion of prompt designing strategies is listed as a core technique area, yet without explicit criteria for paper selection or inclusion of quantitative benchmarks across methods, the summary risks appearing selective; a methods section detailing search terms and coverage would strengthen the survey.
[Challenges and Future Directions] In the challenges section, the suggestions for future directions are high-level; grounding each with references to specific recent papers or open benchmarks would make the recommendations more actionable.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for their review and recommendation of minor revision. The positive assessment of the survey's logical organization, coverage of definitions, techniques, applications, and challenges is appreciated. As no specific major comments were provided, we have no point-by-point revisions to address.

Circularity Check

0 steps flagged

No significant circularity: literature survey without derivations or predictions

This is a survey paper whose sole claim is to organize and summarize the existing ICL literature. It presents a formal definition of ICL and discusses techniques, applications, and challenges drawn from cited works, but advances no original equations, fitted parameters, predictions, or derivations. No load-bearing step reduces to self-definition, self-citation chains, or renaming of results. Self-citations (if any) support only descriptive coverage and are not invoked to justify uniqueness theorems or force technical conclusions. The paper is self-contained as an editorial synthesis against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

As a survey paper the work aggregates and organizes prior publications; it introduces no free parameters, no new axioms, and no invented entities.

pith-pipeline@v0.9.0 · 5484 in / 1030 out tokens · 48351 ms · 2026-05-12T12:53:53.077073+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/LawOfExistence.lean law_of_existence unclear

Relation between the paper passage and the cited Recognition theorem.

The key idea of in-context learning is to learn from analogy. Figure 1 gives an example that describes how language models make decisions via ICL.

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Forward citations

Cited by 60 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

From Context to Skills: Can Language Models Learn from Context Skillfully?
cs.AI 2026-04 unverdicted novelty 8.0

Ctx2Skill lets language models autonomously evolve context-specific skills via multi-agent self-play, improving performance on context learning tasks without human supervision.
Gradient-Based Program Synthesis with Neurally Interpreted Languages
cs.LG 2026-04 unverdicted novelty 8.0

NLI autonomously discovers a vocabulary of primitive operations and interprets variable-length programs via a neural executor, allowing end-to-end training and gradient-based test-time adaptation that outperforms prio...
Pro$^2$Assist: Continuous Step-Aware Proactive Assistance with Multimodal Egocentric Perception for Long-Horizon Procedural Tasks
cs.AI 2026-05 unverdicted novelty 7.0

Pro²Assist uses multimodal egocentric perception from AR glasses to track fine-grained progress in long-horizon procedural tasks and deliver timely proactive assistance, outperforming baselines by over 21% in action u...
AnchorSeg: Language Grounded Query Banks for Reasoning Segmentation
cs.CV 2026-04 unverdicted novelty 7.0

AnchorSeg uses ordered query banks of latent reasoning tokens plus a spatial anchor token and a Token-Mask Cycle Consistency loss to achieve 67.7% gIoU and 68.1% cIoU on the ReasonSeg benchmark.
Deformation-based In-Context Learning for Point Cloud Understanding
cs.CV 2026-04 unverdicted novelty 7.0

DeformPIC deforms query point clouds under prompt guidance for in-context learning, outperforming prior methods with lower Chamfer Distance on reconstruction, denoising, and registration tasks.
Evaluating Code Reasoning Abilities of Large Language Models Under Real-World Settings
cs.SE 2025-12 unverdicted novelty 7.0

A new dataset and nine-metric majority-vote procedure show that existing code-reasoning benchmarks are dominated by lower-complexity problems that do not reflect real-world code.
ProAgent: Harnessing On-Demand Sensory Contexts for Proactive LLM Agent Systems in the Wild
cs.AI 2025-12 conditional novelty 7.0

ProAgent uses on-demand tiered perception and context-aware LLM reasoning to deliver proactive assistance on AR glasses, achieving up to 27.7% higher prediction accuracy and 20.5% lower false detections than baselines.
When Search Goes Wrong: Red-Teaming Web-Augmented Large Language Models
cs.CR 2025-10 unverdicted novelty 7.0

CREST-Search is a red-teaming framework that crafts seemingly benign search queries to induce unsafe citations from web-augmented LLMs, backed by a new WebSearch-Harm dataset for fine-tuning a specialized attacker model.
Soft Head Selection for Injecting ICL-Derived Task Embeddings
cs.CL 2025-07 conditional novelty 7.0

SITE applies soft gradient-based head selection to inject ICL-derived task embeddings, outperforming prior embedding adaptation and few-shot ICL across generation, reasoning, and NLU tasks on 12 LLMs from 4B to 70B pa...
Reinforcement Learning for Reasoning in Large Language Models with One Training Example
cs.LG 2025-04 accept novelty 7.0

One training example via RLVR boosts LLM math reasoning from 17.6% to 35.7% average across six benchmarks.
CodeMind: Evaluating Large Language Models for Code Reasoning
cs.SE 2024-02 unverdicted novelty 7.0

CodeMind evaluates ten LLMs on four benchmarks using three new code reasoning tasks, finding performance varies by model size and drops with complexity while showing no correlation with bug repair ability.
Set-of-Mark Prompting Unleashes Extraordinary Visual Grounding in GPT-4V
cs.CV 2023-10 accept novelty 7.0

Set-of-Mark prompting marks segmented image regions with alphanumerics and masks to let GPT-4V achieve state-of-the-art zero-shot results on referring expression comprehension and segmentation benchmarks like RefCOCOg.
Do as I Say, Not as I Do: Instruction-Induction Conflict in LLMs
cs.CL 2026-05 conditional novelty 6.0

Experiments reveal that LLMs follow instructions at rates from 1% to 99% when opposed by hardcoded conflicting patterns, with robustness tied to output diversity and alignment with model priors rather than general capability.
Internalizing Curriculum Judgment for LLM Reinforcement Fine-Tuning
cs.LG 2026-05 unverdicted novelty 6.0

METIS internalizes curriculum judgment in LLM reinforcement fine-tuning by predicting within-prompt reward variance via in-context learning and jointly optimizing with a self-judgment reward, yielding superior perform...
Personal Visual Context Learning in Large Multimodal Models
cs.CV 2026-05 unverdicted novelty 6.0

Introduces Personal VCL formalization and benchmark revealing LMM context gaps, plus an Agentic Context Bank baseline that boosts personalized visual reasoning.
Wasserstein-Aligned Localisation for VLM-Based Distributional OOD Detection in Medical Imaging
cs.CV 2026-05 unverdicted novelty 6.0

WALDO improves zero-shot anomaly localization in medical imaging by selecting reference distributions via entropy-weighted Sliced Wasserstein distances and Goldilocks zone sampling, yielding a 19% relative gain on bra...
Decompose and Recompose: Reasoning New Skills from Existing Abilities for Cross-Task Robotic Manipulation
cs.RO 2026-05 unverdicted novelty 6.0

Decompose and Recompose decomposes seen robotic demonstrations into skill-action alignments and recomposes them via visual-semantic retrieval and planning to enable zero-shot cross-task generalization.
RAQG-QPP: Query Performance Prediction with Retrieved Query Variants and Retrieval Augmented Query Generation
cs.IR 2026-04 unverdicted novelty 6.0

Retrieved query variants from logs combined with LLM-augmented generation improve unsupervised QPP accuracy by up to 30% for neural rankers on TREC DL'19 and DL'20.
Dual-Enhancement Product Bundling: Bridging Interactive Graph and Large Language Model
cs.CL 2026-04 unverdicted novelty 6.0

A graph-to-text paradigm with Dynamic Concept Binding Mechanism integrates interactive graphs and LLMs to recommend product bundles, yielding 6.3%-26.5% gains over baselines on POG, POG_dense, and Steam datasets.
Learning to Adapt: In-Context Learning Beyond Stationarity
cs.LG 2026-04 unverdicted novelty 6.0

Gated linear attention enables lower training and test errors in non-stationary in-context learning by adaptively modulating past inputs through a learnable recency bias under an autoregressive model of task evolution.
Bridging Natural Language and Microgrid Dynamics: A Context-Aware Simulator and Dataset
eess.SY 2026-04 unverdicted novelty 6.0

OpenCEM is the first open-source digital twin that integrates unstructured contextual information with quantitative microgrid dynamics to enable context-aware energy management.
Measuring Representation Robustness in Large Language Models for Geometry
cs.CL 2026-04 unverdicted novelty 6.0

LLMs display accuracy gaps of up to 14 percentage points on the same geometry problems solely due to representation choice, with vector forms consistently weakest and a convert-then-solve prompt helping only high-capa...
Memory in the LLM Era: Modular Architectures and Strategies in a Unified Framework
cs.CL 2026-04 unverdicted novelty 6.0

A unified framework for LLM agent memory is benchmarked, with a new hybrid method outperforming state-of-the-art on standard tasks.
A Rule-Aware Prompt Framework for Structured Numeric Reasoning in Cyber-Physical Systems
eess.SY 2025-12 unverdicted novelty 6.0

A rule-aware modular prompt framework enables LLMs to perform structured numeric reasoning on power grid data by separating rules from normalized deviations, improving anomaly detection consistency and reducing token ...
SnapAudit: Active Auditing of Differentially Private In-Context Learning via Snapshot-Based Simulation
cs.CR 2025-11 conditional novelty 6.0

SnapAudit decomposes DP-ICL into a deterministic snapshot stage and a stochastic noise stage, using bootstrap simulation to achieve 80-200x faster auditing and exposing privacy bound violations in existing Gaussian an...
Localizing Task Recognition and Task Learning in In-Context Learning via Attention Head Analysis
cs.CL 2025-09 unverdicted novelty 6.0

A new framework using Task Subspace Logit Attribution localizes attention heads specialized for task recognition and task learning in in-context learning, showing they align and rotate hidden states within a task subspace.
Artificial Phantasia: Emergent Mental Imagery in Large Language Models
cs.AI 2025-09 unverdicted novelty 6.0

LLMs achieve higher accuracy than humans on compositional imagery tasks previously argued to require pictorial representations, supporting emergent propositional mental imagery in AI.
Video models are zero-shot learners and reasoners
cs.LG 2025-09 unverdicted novelty 6.0

Generative video models exhibit emergent zero-shot capabilities across perception, manipulation, and basic reasoning tasks.
BugScope: Learn to Find Bugs Like Human
cs.SE 2025-07 conditional novelty 6.0

BugScope structures LLM bug detection into three human-mirroring steps and distills guidelines from examples, reaching 0.87 F1 on 33 real bugs while outperforming Claude and Cursor tools and uncovering 184 new issues ...
In-depth Analysis of Graph-based RAG in a Unified Framework
cs.IR 2025-03 unverdicted novelty 6.0

A unified framework and large-scale comparison of graph-based RAG methods on QA tasks yields new high-performing variants obtained by recombining existing components.
Judge a Book by its Cover: Investigating Multi-Modal LLMs for Multi-Page Handwritten Document Transcription
cs.LG 2025-02 unverdicted novelty 6.0

Introduces OCR+PAGE-1 and OCR+PAGE-N prompting strategies that improve zero-shot multi-page handwritten document transcription by sharing context across pages.
TabICL: A Tabular Foundation Model for In-Context Learning on Large Data
cs.LG 2025-02 unverdicted novelty 6.0

TabICL scales in-context learning to large tabular data via column-then-row attention for row embeddings followed by a transformer, matching TabPFNv2 speed and performance while outperforming it and CatBoost on datase...
LiveCodeBench: Holistic and Contamination Free Evaluation of Large Language Models for Code
cs.SE 2024-03 unverdicted novelty 6.0

LiveCodeBench collects 400 recent contest problems to create a contamination-free benchmark evaluating LLMs on code generation and related capabilities like self-repair and execution.
Large Language Models are not Fair Evaluators
cs.CL 2023-05 conditional novelty 6.0

LLMs show strong position bias when scoring model outputs, allowing easy manipulation of rankings, but calibration with multiple evidence, position balancing, and selective human input reduces this bias to better matc...
LLMs with in-context learning for Algorithmic Theoretical Physics
cs.LG 2026-05 unverdicted novelty 5.0

Frontier LLMs with in-context learning and CAS integration solve most algorithmic tasks in theoretical physics when supplied with worked examples.
When Context Sticks: Studying Interference in In-Context Learning
cs.LG 2026-04 unverdicted novelty 5.0

In-context learning shows persistent interference from prior examples, with more misleading linear examples degrading quadratic predictions and training curricula modulating recovery speed.
Consistency Analysis of Sentiment Predictions using Syntactic & Semantic Context Assessment Summarization (SSAS)
cs.CL 2026-04 unverdicted novelty 5.0

SSAS improves LLM sentiment prediction consistency and data quality by up to 30% on three review datasets via syntactic and semantic context assessment summarization.
Thinking with Video: Video Generation as a Promising Multimodal Reasoning Paradigm
cs.CV 2025-11 unverdicted novelty 5.0

Video generation models demonstrate competitive multimodal reasoning on a new benchmark, matching or exceeding VLMs on visual puzzles and achieving 92% on MATH and 69.2% on MMMU.
Context-Guided Decompilation: A Step Towards Re-executability
cs.SE 2025-11 unverdicted novelty 5.0

ICL4Decomp applies in-context learning to guide LLMs in generating re-executable decompiled code from binaries, reporting roughly 40% higher re-executability than prior methods across datasets and optimization levels.
Online In-Context Distillation for Low-Resource Vision Language Models
cs.CV 2025-10 unverdicted novelty 5.0

Online In-Context Distillation lets small VLMs gain up to 33% performance with as little as 4% teacher annotations by distilling knowledge through dynamic in-context demonstrations at inference.
SSA: Improving Performance With a Better Scoring Function
cs.CL 2025-08 unverdicted novelty 5.0

Replacing Softmax with Scaled Signed Averaging in transformer attention improves generalization under distribution shifts for in-context learning and boosts results on NLP benchmarks.
Diverse LLMs or Diverse Question Interpretations? That is the Ensembling Question
cs.CL 2025-07 unverdicted novelty 5.0

Question interpretation diversity outperforms model diversity for LLM ensembling on binary QA tasks using majority voting.
LENS: Multi-level Evaluation of Multimodal Reasoning with Large Language Models
cs.CV 2025-05 unverdicted novelty 5.0

LENS is a new multi-level benchmark dataset for evaluating MLLMs on perception-to-reasoning tasks using the same images across all levels with recent social media content.
Exploring Cross-lingual Latent Transplantation: Mutual Opportunities and Open Challenges
cs.CL 2024-12 unverdicted novelty 5.0

XTransplant empirically shows that cross-lingual latent transplantation yields mutual benefits for multilingual capability and cultural adaptability in LLMs, especially low-resource ones, while revealing underutilized...
E2LLM: Encoder Elongated Large Language Models for Long-Context Understanding and Reasoning
cs.CL 2024-09 unverdicted novelty 5.0

E2LLM uses encoder-based soft prompt compression for long contexts to improve LLM reasoning on tasks like summarization and QA while maintaining efficiency.
The Pedagogy of AI Mistakes: Fostering Higher-Order Thinking
cs.CY 2026-05 unverdicted novelty 4.0

AI mistakes can be structured into course activities to foster higher-order thinking, metacognition, and AI literacy in higher education.
UnAC: Adaptive Visual Prompting with Abstraction and Stepwise Checking for Complex Multimodal Reasoning
cs.CV 2026-05 unverdicted novelty 4.0

UnAC improves LMM performance on visual reasoning benchmarks by combining adaptive visual prompting, image abstraction, and gradual self-checking.
Beyond the Basics: Leveraging Large Language Model for Fine-Grained Medical Entity Recognition
cs.AI 2026-04 conditional novelty 4.0

Fine-tuned LLaMA3 with LoRA reaches 81.24% F1 on 18-category fine-grained medical entity recognition, beating zero-shot by 63.11% and few-shot by 35.63%.
Leveraging Weighted Syntactic and Semantic Context Assessment Summary (wSSAS) Towards Text Categorization Using LLMs
cs.CL 2026-04 unverdicted novelty 4.0

wSSAS is a two-phase deterministic framework that uses hierarchical text organization and SNR-based feature prioritization to improve clustering integrity, categorization accuracy, and reproducibility when applying LL...
LLMs Underperform Graph-Based Parsers on Supervised Relation Extraction for Complex Graphs
cs.CL 2026-04 unverdicted novelty 4.0

Graph-based parsers outperform LLMs on supervised relation extraction as linguistic graph complexity grows with more relations per document.
Combining Static Code Analysis and Large Language Models Improves Correctness and Performance of Algorithm Recognition
cs.SE 2026-04 conditional novelty 4.0

Hybrid LLM plus static analysis for algorithm recognition in code cuts required model calls by 72-97% and lifts F1-scores by as much as 12 points.
Position: LLM Watermarking Should Align Stakeholders' Incentives for Practical Adoption
cs.CR 2025-10 unverdicted novelty 4.0

LLM watermarking adoption is limited by misaligned stakeholder incentives; incentive-aligned approaches such as in-context watermarking can enable practical use in targeted domains like education and peer review.
A Survey of Self-Evolving Agents: What, When, How, and Where to Evolve on the Path to Artificial Super Intelligence
cs.AI 2025-07 accept novelty 4.0

The paper delivers the first systematic review of self-evolving agents, structured around what components evolve, when adaptation occurs, and how it is implemented.
Large Language Model-Brained GUI Agents: A Survey
cs.AI 2024-11 unverdicted novelty 4.0

A survey consolidating frameworks, data practices, large action models, benchmarks, applications, and research gaps in LLM-brained GUI agents.
Agent AI: Surveying the Horizons of Multimodal Interaction
cs.AI 2024-01 unverdicted novelty 4.0

The paper defines Agent AI as interactive multimodal systems that perceive grounded data and generate embodied actions, arguing this approach can mitigate hallucinations in foundation models.
The Dawn of LMMs: Preliminary Explorations with GPT-4V(ision)
cs.CV 2023-09 conditional novelty 4.0

GPT-4V processes interleaved image-text inputs generically and supports visual referring prompting for new human-AI interaction.
The Rise and Potential of Large Language Model Based Agents: A Survey
cs.AI 2023-09 accept novelty 4.0

The paper surveys the origins, frameworks, applications, and open challenges of AI agents built on large language models.
PaLI-X: On Scaling up a Multilingual Vision and Language Model
cs.CV 2023-05 unverdicted novelty 4.0

Scaling a multilingual vision-language model in size and training breadth yields new state-of-the-art results on over 25 benchmarks plus emerging abilities in counting and multilingual detection.
Network Edge Inference for Large Language Models: Principles, Techniques, and Opportunities
cs.DC 2026-04 unverdicted novelty 3.0

A survey synthesizing challenges, system architectures, model optimizations, deployment methods, and resource management techniques for large language model inference at the network edge.
Towards Large Reasoning Models: A Survey of Reinforced Reasoning with Large Language Models
cs.AI 2025-01 unverdicted novelty 3.0

The paper surveys reinforced reasoning techniques for LLMs, covering automated data construction, learning-to-reason methods, and test-time scaling as steps toward Large Reasoning Models.

Reference graph

Works this paper leans on

13 extracted references · 13 canonical work pages · cited by 71 Pith papers · 1 internal anchor

[1]

In Ad- vances in Neural Information Processing Systems 33: Annual Conference on Neural Information Process- ing Systems 2020, NeurIPS 2020, December 6-12, 2020, virtual

Language models are few-shot learners. In Ad- vances in Neural Information Processing Systems 33: Annual Conference on Neural Information Process- ing Systems 2020, NeurIPS 2020, December 6-12, 2020, virtual. Marc-Etienne Brunet, Ashton Anderson, and Richard S. Zemel. 2023. ICL markup: Structuring in- context learning using soft-token tags. CoRR, abs/2312...

work page arXiv 2020
[2]

Data distributional properties drive emergent in-context learning in transformers. In Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Sys- tems 2022, NeurIPS 2022, New Orleans, LA, USA, November 28 - December 9, 2022. Anwoy Chatterjee, Eshaan Tanwar, Subhabrata Dutta, and Tanmoy Chakraborty. 2024. L...

work page arXiv 2022
[3]

Transformers

Transformers learn higher-order optimization methods for in-context learning: A study with linear models. CoRR, abs/2310.17086. Yeqi Gao, Zhao Song, and Shenghao Xie. 2023. In- context learning for attention scheme: from single softmax regression to multiple softmax regression via a tensor trick. CoRR, abs/2307.02419. Shivam Garg, Dimitris Tsipras, Percy ...

work page arXiv 2023
[4]

arXiv:2303.07971

Association for Computational Linguistics. Michael Hahn and Navin Goyal. 2023. A theory of emergent in-context learning as implicit structure induction. CoRR, abs/2303.07971. Chi Han, Ziqi Wang, Han Zhao, and Heng Ji. 2023a. Explaining emergent in-context learning as kernel regression. Preprint, arXiv:2305.12766. Xiaochuang Han, Daniel Simig, Todor Mihayl...

work page arXiv 2023
[5]

CoRR, abs/2404.00884

Self-demos: Eliciting out-of-demonstration generalizability in large language models. CoRR, abs/2404.00884. 11 Clyde Highmore. 2024. In-context learning in large language models: A comprehensive survey. Or Honovich, Uri Shaham, Samuel R. Bowman, and Omer Levy. 2023. Instruction induction: From few examples to natural language task descriptions. In Proceed...

work page arXiv 2024
[6]

arXiv preprint arXiv:2304.09960 , year=

The dual form of neural networks revisited: Connecting test time predictions to training patterns via spotlights of attention. In International Confer- ence on Machine Learning, ICML 2022, 17-23 July 2022, Baltimore, Maryland, USA , volume 162 of Proceedings of Machine Learning Research, pages 9639–9659. PMLR. Srinivasan Iyer, Xi Victoria Lin, Ramakanth P...

work page arXiv 2022
[7]

One step of gradient descent is provably the optimal in-context learner with one layer of linear self-attention

Association for Computational Linguistics. Arvind Mahankali, Tatsunori B. Hashimoto, and Tengyu Ma. 2023. One step of gradient descent is provably the optimal in-context learner with one layer of linear self-attention. CoRR, abs/2307.03576. Costas Mavromatis, Balasubramaniam Srinivasan, Zhengyuan Shen, Jiani Zhang, Huzefa Rangwala, Christos Faloutsos, and...

work page arXiv 2023
[8]

Learning to retrieve prompts for in-context learning. In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Tech- nologies, pages 2655–2671, Seattle, United States. Association for Computational Linguistics. Abulhair Saparov and He He. 2023. Language models are greedy reasoners...

[9]

arXiv:2310.08540 [cs]

Do pretrained transformers learn in-context by gradient descent? Preprint, arXiv:2310.08540. Freda Shi, Mirac Suzgun, Markus Freitag, Xuezhi Wang, Suraj Srivats, Soroush Vosoughi, Hyung Won Chung, Yi Tay, Sebastian Ruder, Denny Zhou, et al. 2022. Language models are multilingual chain-of-thought reasoners. ArXiv preprint, abs/2210.03057. Weijia Shi, Sewo...

[10]

An information-theoretic approach to prompt engineering without ground truth labels. In Proc. of ACL, pages 819–862, Dublin, Ireland. Association for Computational Linguistics. Aarohi Srivastava, Abhinav Rastogi, Abhishek Rao, Abu Awal Md Shoeb, Abubakar Abid, Adam Fisch, Adam R Brown, Adam Santoro, Aditya Gupta, Adrià Garriga-Alonso, et al. 2022. Beyond ...

[11]

Pretraining data mixtures enable narrow model selection capabilities in transformer models. CoRR, abs/2311.00871. Jinghan Yang, Shuming Ma, and Furu Wei. 2023a. Auto-icl: In-context learning without human supervision. CoRR, abs/2311.09263. Zhe Yang, Damai Dai, Peiyi Wang, and Zhifang Sui. 2023b. Not all demonstration examples are equally beneficial: Rew...

[12]

OpenReview.net. Hattie Zhou, Azade Nova, Hugo Larochelle, Aaron C. Courville, Behnam Neyshabur, and Hanie Sedghi

[13]

Teaching algorithmic reasoning via in-context learning. arXiv preprint arXiv:2211.09066, 2022

Teaching algorithmic reasoning via in-context learning. CoRR, abs/2211.09066. Wangchunshu Zhou, Yuchen Eleanor Jiang, Ryan Cotterell, and Mrinmaya Sachan. 2023b. Efficient prompting via dynamic in-context learning. CoRR, abs/2305.11170. Yongchao Zhou, Andrei Ioan Muresanu, Ziwen Han, Keiran Paster, Silviu Pitis, Harris Chan, and Jimmy Ba. 2023c. Large l...


