3 Pith papers cite this work. Polarity classification is still indexing.
verdicts
UNVERDICTED 3
representative citing papers
Forge pipeline combines LLM code generation with MDE transformations to produce verifiable artifacts in Dafny, CSP, and Isabelle, iterating on failures to generate standards-relevant evidence for Java code.
Case study shows that SPIN and DIVINE model checkers can uncover design flaws and code defects in a C++ framework missed by hundreds of hours of testing and can be integrated into the development workflow.
citing papers explorer
- Validating Causal Abstraction Metrics on Simulated Complex Systems
  Authors create a benchmark across discrete/continuous and static/dynamical systems and introduce the Causal Abstraction Error (CAE) metric that reliably distinguishes valid from invalid causal abstractions when it includes faithfulness testing.
- Formal-Method-Guided Vibe Coding: Closing the Verification Loop on AI-Generated Safety-Critical Software Through Model-Driven Engineering
  Forge pipeline combines LLM code generation with MDE transformations to produce verifiable artifacts in Dafny, CSP, and Isabelle, iterating on failures to generate standards-relevant evidence for Java code.