pith. machine review for the scientific record.

arxiv: 2601.02902 · v2 · submitted 2026-01-06 · 💻 cs.AI · cs.CL · cs.LO

Logical Phase Transitions: Understanding Collapse in LLM Logical Reasoning

keywords: logical, reasoning, critical, phase, transitions, beyond, collapse, complexity

Symbolic logical reasoning is a critical yet underexplored capability of large language models (LLMs), providing reliable and verifiable decision-making in high-stakes domains such as mathematical reasoning and legal judgment. In this study, we present a systematic analysis of logical reasoning under controlled increases in logical complexity, and reveal a previously unrecognized phenomenon, which we term Logical Phase Transitions: rather than degrading smoothly, logical reasoning performance remains stable within a regime but collapses abruptly beyond a critical logical depth, mirroring physical phase transitions such as water freezing beyond a critical temperature threshold. Building on this insight, we propose Neuro-Symbolic Curriculum Tuning, a principled framework that adaptively aligns natural language with logical symbols to establish a shared representation, and reshapes training dynamics around phase-transition boundaries to progressively strengthen reasoning at increasing logical depths. Experiments on five benchmarks show that our approach effectively mitigates logical reasoning collapse at high complexity, yielding average accuracy gains of +1.26 in naive prompting and +3.95 in CoT, while improving generalization to unseen logical compositions. Code and data are available at https://github.com/AI4SS/Logical-Phase-Transitions.

Forward citations

Cited by 6 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. IntervenSim: Intervention-Aware Social Network Simulation for Opinion Dynamics

    cs.SI 2026-04 unverdicted novelty 7.0

    IntervenSim is an intervention-aware social network simulation that couples source interventions with crowd interactions in a feedback loop, improving MAPE by 41.6% and DTW by 66.9% over prior static frameworks on rea...

  2. OmniTrend: Content-Context Modeling for Scalable Social Popularity Prediction

    cs.CV 2026-04 unverdicted novelty 6.0

    OmniTrend predicts popularity by combining separate content attractiveness and contextual exposure predictors using cross-modal and exogenous signals.

  3. HotComment: A Benchmark for Evaluating Popularity of Online Comments

    cs.AI 2026-04 unverdicted novelty 6.0

    HotComment is a new multimodal benchmark that quantifies online comment popularity via content quality assessment, interaction-based prediction, and agent-simulated user engagement, accompanied by the StyleCmt stylist...

  4. Seeing Further and Wider: Joint Spatio-Temporal Enlargement for Micro-Video Popularity Prediction

    cs.MM 2026-04 unverdicted novelty 5.0

    A new joint spatio-temporal enlargement model for micro-video popularity prediction using frame scoring for long sequences and a topology-aware memory bank for unbounded historical associations.

  5. ActorMind: Emulating Human Actor Reasoning for Speech Role-Playing

    cs.SD 2026-04 unverdicted novelty 5.0

    ActorMind is a four-agent chain-of-thought framework that emulates human actors to produce spontaneous, emotion-infused speech responses for role-playing scenarios.

  6. CurEvo: Curriculum-Guided Self-Evolution for Video Understanding

    cs.CV 2026-04 unverdicted novelty 4.0

    CurEvo integrates curriculum guidance into self-evolution to structure autonomous improvement of video understanding models, yielding gains on VideoQA benchmarks.