hub

Logic-LM: Empowering Large Language Models With Symbolic Solvers for Faithful Logical Reasoning

· 2023 · arXiv 2305.12295

10 Pith papers cite this work. Polarity classification is still indexing.

10 Pith papers citing it

read on arXiv browse 10 citing papers

hub tools

JSON dossier citing papers JSON arXiv source

citation-role summary

background 2

citation-polarity summary

background 2

representative citing papers

NoisyCausal: A Benchmark for Evaluating Causal Reasoning Under Structured Noise

cs.CL · 2026-05-05 · unverdicted · novelty 7.0

NoisyCausal benchmark tests LLMs on causal reasoning with structured noise, and a modular LLM-plus-causal-graph framework outperforms baselines while generalizing to Cladder.

CodeClinic: Evaluating Automation of Coding Skills for Clinical Reasoning Agents

cs.AI · 2026-05-10 · unverdicted · novelty 6.0

CodeClinic benchmark demonstrates that LLM-generated Python skill libraries from clinical guidelines enhance consistency and reduce token consumption by up to 40% compared to zero-shot approaches on MIMIC-IV based tasks.

From Natural Language to Executable Narsese: A Neuro-Symbolic Benchmark and Pipeline for Reasoning with NARS

cs.AI · 2026-04-20 · unverdicted · novelty 6.0

A new benchmark and deterministic pipeline translate natural language reasoning into executable Narsese for NARS, with execution-based validation and initial LLM adaptation for three-label classification.

LAST: Leveraging Tools as Hints to Enhance Spatial Reasoning for Multimodal Large Language Models

cs.CV · 2026-04-08 · unverdicted · novelty 6.0

LAST augments MLLMs with a tool-abstraction sandbox and three-stage training to deliver around 20% gains on spatial reasoning tasks, outperforming closed-source models.

VERGE: Formal Refinement and Guidance Engine for Verifiable LLM Reasoning

cs.CL · 2026-01-27 · unverdicted · novelty 6.0

VERGE decomposes LLM outputs into atomic claims, autoformalizes them to first-order logic, verifies with SMT solvers and consensus, and refines via minimal correction subsets, yielding 18.7% average uplift on reasoning benchmarks.

LLM Reasoning Is Latent, Not the Chain of Thought

cs.AI · 2026-04-17 · unverdicted · novelty 5.0

LLM reasoning is primarily mediated by latent-state trajectories rather than by explicit surface chain-of-thought outputs.

VeriTrans: Fine-Tuned LLM-Assisted NL-to-PL Translation via a Deterministic Neuro-Symbolic Pipeline

cs.AI · 2026-04-11 · unverdicted · novelty 5.0

VeriTrans achieves 94.46% SAT/UNSAT correctness on SatBench via LLM translation gated by round-trip similarity and deterministic neuro-symbolic execution.

Semantic-Aware Logical Reasoning via a Semiotic Framework

cs.AI · 2025-09-29 · conditional · novelty 5.0

LogicAgent uses a semiotic-square-guided approach to enhance logical reasoning in LLMs on the new RepublicQA benchmark and others, reporting average gains of 6.25% and 7.05% respectively.

ORFS-agent: Tool-Using Agents for Chip Design Optimization

cs.AI · 2025-06-10 · unverdicted · novelty 5.0

ORFS-agent uses LLM agents to tune parameters in chip design flows, improving geometric-mean wirelength, clock period, and co-optimization objectives by up to 2.7% over OR-AutoTuner with 40% fewer iterations on ASAP7 and SKY130HD benchmarks.

LLM-Assisted Tool for Joint Generation of Formulas and Functions in Rule-Based Verification of Map Transformations

cs.SE · 2025-11-03 · unverdicted · novelty 4.0

LLM-assisted pipeline jointly generates logical formulas and executable predicates for rule-based verification of HD map transformations in CommonRoad, evaluated on synthetic bridge and slope scenarios.

citing papers explorer

Showing 6 of 6 citing papers after filters.

CodeClinic: Evaluating Automation of Coding Skills for Clinical Reasoning Agents cs.AI · 2026-05-10 · unverdicted · none · ref 29
CodeClinic benchmark demonstrates that LLM-generated Python skill libraries from clinical guidelines enhance consistency and reduce token consumption by up to 40% compared to zero-shot approaches on MIMIC-IV based tasks.
From Natural Language to Executable Narsese: A Neuro-Symbolic Benchmark and Pipeline for Reasoning with NARS cs.AI · 2026-04-20 · unverdicted · none · ref 8
A new benchmark and deterministic pipeline translate natural language reasoning into executable Narsese for NARS, with execution-based validation and initial LLM adaptation for three-label classification.
LLM Reasoning Is Latent, Not the Chain of Thought cs.AI · 2026-04-17 · unverdicted · none · ref 14
LLM reasoning is primarily mediated by latent-state trajectories rather than by explicit surface chain-of-thought outputs.
VeriTrans: Fine-Tuned LLM-Assisted NL-to-PL Translation via a Deterministic Neuro-Symbolic Pipeline cs.AI · 2026-04-11 · unverdicted · none · ref 17
VeriTrans achieves 94.46% SAT/UNSAT correctness on SatBench via LLM translation gated by round-trip similarity and deterministic neuro-symbolic execution.
Semantic-Aware Logical Reasoning via a Semiotic Framework cs.AI · 2025-09-29 · conditional · none · ref 27
LogicAgent uses a semiotic-square-guided approach to enhance logical reasoning in LLMs on the new RepublicQA benchmark and others, reporting average gains of 6.25% and 7.05% respectively.
ORFS-agent: Tool-Using Agents for Chip Design Optimization cs.AI · 2025-06-10 · unverdicted · none · ref 42
ORFS-agent uses LLM agents to tune parameters in chip design flows, improving geometric-mean wirelength, clock period, and co-optimization objectives by up to 2.7% over OR-AutoTuner with 40% fewer iterations on ASAP7 and SKY130HD benchmarks.

Logic-LM: Empowering Large Language Models With Symbolic Solvers for Faithful Logical Reasoning

hub tools

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer