{"total":14,"items":[{"citing_arxiv_id":"2605.19944","ref_index":9,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"A Measure-Theoretic Analysis of Reasoning: Structural Generalization and Approximation Limits","primary_cat":"cs.LG","submitted_at":"2026-05-19T15:00:51+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"Applies optimal transport to bound OOD generalization error in Transformers via Lipschitz continuity and TC^0 circuit depth lower bounds for Dyck-k backtracking, supported by evaluations on 54 configurations.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.08221","ref_index":57,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"NoisyCoconut: Counterfactual Consensus via Latent Space Reasoning","primary_cat":"cs.LG","submitted_at":"2026-05-06T13:58:55+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"Injecting noise into LLM latent trajectories creates diverse reasoning paths whose agreement acts as a confidence signal for selective abstention, cutting error rates from 40-70% to under 15% on math tasks.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.01373","ref_index":11,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Focus on the Core: Empowering Diffusion Large Language Models by Self-Contrast","primary_cat":"cs.CL","submitted_at":"2026-05-02T10:46:56+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"FoCore uses self-contrast on early-converging high-density tokens to boost diffusion LLM quality on reasoning benchmarks while cutting decoding steps by over 2x.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"1 Step-wise JS Divergence Upper-bounded by Conditional Entropy Proposition D.1(Entropic Upper Bound on JS Divergence).Let fθ be a sufficiently trained masked diffusion language model satisfying P (i) t ≈p(x i |x t). Let ∆t denote the contextual increment introduced at step t, i.e., the set of newly decoded tokens. Then the step-wise JS divergence at position isatisfies: D(i) t ≤ 1 2 I(x i; ∆t |x t−1)≤ 1 2 Hi (11) Proof.First inequality.By the standard relationship between JS divergence and KL divergence: D(i) t = JS(P (i) t ∥P (i) t−1)≤ 1 2 DKL(P (i) t ∥P (i) t−1)(12) Since the distributional shift between P (i) t and P (i) t−1 is induced solely by the contextual increment ∆t, applying the Data Processing Inequality yields: DKL(P (i) t ∥P (i) t−1)≤I(x i; ∆t |x t−1)(13)"},{"citing_arxiv_id":"2604.21027","ref_index":16,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"HypEHR: Hyperbolic Modeling of Electronic Health Records for Efficient Question Answering","primary_cat":"cs.AI","submitted_at":"2026-04-22T19:18:36+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"HypEHR is a hyperbolic embedding model for EHR data that uses Lorentzian geometry and hierarchy-aware pretraining to answer clinical questions nearly as well as large language models but with much smaller size.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"cnull if the answer is empty): Lconcept-1 =−logp j .(15) D.1.3 Float Value Head This head handles questions whose answers are single numeric values, typically derived from labo- ratory tests or scalar measurements. Event candidates.For a given variable (e.g., cre- atinine), we collect all matching events in the pa- tient record: E={e 1, . . . , eM },(16) where each event ej has a timestamp tj, a scalar value νj, and a hyperbolic embedding hval j ∈H d L (e.g., obtained from the corresponding visit state and variable identity). We additionally introduce a learned \"null event\"e null for no-answer cases. Input and scoring.We use the question- conditioned visit-level representation zvisit p|q and compute pairwise scores:"},{"citing_arxiv_id":"2604.19567","ref_index":4,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Multi-modal Reasoning with LLMs for Visual Semantic Arithmetic","primary_cat":"cs.AI","submitted_at":"2026-04-21T15:19:49+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"SAri-RFT applies GRPO-based reinforcement fine-tuning to LVLMs on novel two-term and three-term visual semantic arithmetic tasks, reaching SOTA on the new IRPD dataset and Visual7W-Telling.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2602.04476","ref_index":11,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Vision-aligned Latent Reasoning for Multi-modal Large Language Model","primary_cat":"cs.CV","submitted_at":"2026-02-04T12:04:02+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"VaLR generates vision-aligned latent tokens before each reasoning step to preserve perceptual cues, improving VSI-Bench accuracy from 33.0% to 52.9%.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2601.12538","ref_index":127,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Agentic Reasoning for Large Language Models","primary_cat":"cs.AI","submitted_at":"2026-01-18T18:58:23+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":4.0,"formal_verification":"none","one_line_summary":"The survey structures agentic reasoning for LLMs into foundational, self-evolving, and collective multi-agent layers while distinguishing in-context orchestration from post-training optimization and reviewing applications across domains.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"Besides that, MCTS is heavily explored in agentic research: [112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123] use MCTS or its variations for controlled exploration and improved reasoning fidelity. Beam search is leveraged in [124, 125, 126] to prune and prioritize reasoning trajectories efficiently. Other tree-search-inspired works include [127] which 11 Agentic Reasoning for Large Language Models Table 2: RepresentativeAgentic Planningsystems categorized byModality,Structure,Format, andTool. Method Structure Format Tool Modality I: Language Agents (e.g., Search Agents, Code Agents) ReWOO [71] Decomposed Natural Language None Reflexion [14] Sequential Natural Language None LLM+P [72] Sequential Formal Language None"},{"citing_arxiv_id":"2503.22693","ref_index":30,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Bridging Language Models and Financial Analysis","primary_cat":"q-fin.ST","submitted_at":"2025-03-14T01:35:20+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":2.0,"formal_verification":"none","one_line_summary":"A survey synthesizing recent LLM research and assessing its applicability to financial data analysis.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2502.17419","ref_index":255,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"From System 1 to System 2: A Survey of Reasoning Large Language Models","primary_cat":"cs.AI","submitted_at":"2025-02-24T18:50:52+00:00","verdict":"ACCEPT","verdict_confidence":"MODERATE","novelty_score":3.0,"formal_verification":"none","one_line_summary":"The survey organizes the shift of LLMs toward deliberate System 2 reasoning, covering model construction techniques, performance on math and coding benchmarks, and future research directions.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"agnostic training frameworks ( e.g., Vision-R1's [254] cold- start CoT data generation), and efficient parameter utiliza- tion ( e.g., MMR1 [249]'s 7B models rivaling larger pro- prietary counterparts). The work emphasizes transparency through explicit reasoning paths ( e.g., MedVLM-R1's [251] interpretable medical analysis) and open-source contribu- tions (code [255], data [250], benchmarks [256], [257]), foster- ing reproducibility and MLLM-community improvement. Despite the strengths of RFT, it still faces the following challenges: 1) Unclear Mechanism behind Reasoning: The underly- ing mechanisms driving the reasoning improvements in DeepSeek-R1 remain poorly understood. For example, while DeepSeek-R1 exhibits emergent properties ( e."},{"citing_arxiv_id":"2501.19201","ref_index":9,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Efficient Reasoning with Hidden Thinking","primary_cat":"cs.CL","submitted_at":"2025-01-31T15:10:29+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"Heima compresses verbose CoT into hidden thinking tokens via information-theoretic analysis and an adaptive interpreter, claiming maintained or improved zero-shot accuracy on reasoning benchmarks.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2501.09732","ref_index":20,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Inference-Time Scaling for Diffusion Models beyond Scaling Denoising Steps","primary_cat":"cs.CV","submitted_at":"2025-01-16T18:30:37+00:00","verdict":"CONDITIONAL","verdict_confidence":"MODERATE","novelty_score":6.0,"formal_verification":"none","one_line_summary":"Diffusion models improve generation quality via inference-time search over noise candidates guided by verifiers and algorithms, yielding gains beyond denoising step scaling on class- and text-conditioned benchmarks.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2412.18925","ref_index":23,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"HuatuoGPT-o1, Towards Medical Complex Reasoning with LLMs","primary_cat":"cs.CL","submitted_at":"2024-12-25T15:12:34+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"HuatuoGPT-o1 achieves superior medical complex reasoning by using a verifier to curate reasoning trajectories for fine-tuning and then applying RL with verifier-based rewards.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2412.06769","ref_index":9,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Training Large Language Models to Reason in a Continuous Latent Space","primary_cat":"cs.CL","submitted_at":"2024-12-09T18:55:56+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"Coconut lets LLMs perform reasoning directly in continuous latent space by recycling hidden states as inputs, outperforming standard chain-of-thought on search-intensive logical tasks with better accuracy-efficiency trade-offs.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2408.07199","ref_index":14,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Agent Q: Advanced Reasoning and Learning for Autonomous AI Agents","primary_cat":"cs.AI","submitted_at":"2024-08-13T20:52:13+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"Agent Q integrates MCTS-guided search, self-critique, and off-policy DPO to train LLM agents that outperform behavior cloning and reinforced fine-tuning baselines in WebShop and achieve up to 95.4% success in real-world booking scenarios.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null}],"limit":50,"offset":0}