In Proceedings of the AAAI Conference on Artificial Intelligence, volume 39, pages 24858–24866

Agnieszka Mensfelt and others · 2025 · arXiv 2509.09810

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Fixing FOLIO and MALLS: Verified Annotations and an LLM-assisted Framework to Focus Human Relabeling

cs.CL · 2026-06-01 · unverdicted · novelty 7.0

Audit finds 36-39% incorrect FOL labels in FOLIO and MALLS; corrections raise LLM accuracy 9-22 points and an LLM-guided review framework achieves 90% dataset quality after checking fewer than 24% of examples.

FormalScience: Scalable Human-in-the-Loop Autoformalisation of Science with Agentic Code Generation in Lean

cs.AI · 2026-04-24 · unverdicted · novelty 7.0

FormalScience provides a scalable human-in-the-loop system for autoformalising scientific reasoning into Lean, demonstrated on a new 200-problem physics dataset with perfect formal validity.

citing papers explorer

Fixing FOLIO and MALLS: Verified Annotations and an LLM-assisted Framework to Focus Human Relabeling

cs.CL · 2026-06-01 · unverdicted · none · ref 84

Audit finds 36-39% incorrect FOL labels in FOLIO and MALLS; corrections raise LLM accuracy 9-22 points and an LLM-guided review framework achieves 90% dataset quality after checking fewer than 24% of examples.

FormalScience: Scalable Human-in-the-Loop Autoformalisation of Science with Agentic Code Generation in Lean

cs.AI · 2026-04-24 · unverdicted · none · ref 3

FormalScience provides a scalable human-in-the-loop system for autoformalising scientific reasoning into Lean, demonstrated on a new 200-problem physics dataset with perfect formal validity.
