Design by contract for deep learning APIs

· 2023 · arXiv 1643.361624

13 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 3 · baseline 1

citation-polarity summary

background 3 · baseline 1

representative citing papers

Analyzing the Narration Gap in LLM-Solver Loops

cs.AI · 2026-06-17 · unverdicted · novelty 8.0

The narration step in LLM-solver loops is vulnerable to prompt injection that inverts verified solver conclusions, and hardened prompts reduce but do not eliminate the risk under adaptive attacks.

Event-B Agent: Towards LLM Agent for Formal Model Synthesis and Repair

cs.SE · 2026-05-17 · unverdicted · novelty 7.0

Event-B Agent is an LLM agent that synthesizes, refines, and repairs Event-B formal models from natural language requirements via iterative verification feedback loops.

Agentic Interpretation: Lattice-Structured Evidence for LLM-Based Program Analysis

cs.SE · 2026-05-12 · unverdicted · novelty 7.0

Agentic interpretation uses lattices to track LLM judgments on decomposed program claims during analysis.

Gleaner: A Semantically-Rich and Efficient Online Sampler for Microservice Diagnostics

cs.SE · 2026-04-18 · unverdicted · novelty 7.0

Gleaner replaces slow graph-based trace analysis with bag-of-edges set operations plus log semantics and alarm-driven diversity to deliver faster, higher-fidelity sampling that improves RCA accuracy even at 1% rates.

Evaluation-Strategy Gap in Fault Diagnosis of Deep Learning Programs

cs.SE · 2026-06-25 · unverdicted · novelty 6.0

Using a corpus of 5542 fault-injected traces from 38 DL programs, the study finds a 0.19 balanced accuracy gap in fault diagnosis between within-program and cross-program evaluation caused by program-specific feature structures.

Planning to Hammer: Difficulty-Aware Decomposition for Automating Rocq Proofs

cs.SE · 2026-06-16 · unverdicted · novelty 6.0

Quarry improves Rocq proof automation success rates by 7-13% under 10-minute budgets via LLM-planned decompositions ranked by a proof-state difficulty model for CoqHammer solvability.

Data-aware Static Analysis: Improving Detection of Semantic Faults in Machine Learning Code Using Data Characteristics

cs.SE · 2026-06-08 · unverdicted · novelty 6.0

The paper proposes a data-aware static analysis that combines data and control flow with API contracts to detect semantic faults in ML code early, and demonstrates it on sample real-world notebooks.

Are We Lost in the Woods? Detecting Silent Semantic Faults for Random Forest Classifiers with Data-informed Static Analysis

cs.SE · 2026-06-05 · unverdicted · novelty 6.0

dille detects silent semantic faults in random forest ML pipelines with 91% precision via data-informed static analysis on Kaggle notebooks, finding 12-18% of scripts affected.

AutoSOUP: Safety-Oriented Unit Proof Generation for Component-level Memory-Safety Verification

cs.SE · 2026-05-11 · unverdicted · novelty 6.0

AutoSOUP automates component-level memory-safety verification by generating Safety-Oriented Unit Proofs via three techniques and a hybrid LLM-plus-program-synthesis architecture called LLM-As-Function-Call.

PROMISE: Proof Automation as Structural Imitation of Human Reasoning

cs.LO · 2026-04-07 · unverdicted · novelty 6.0

PROMISE reframes automated proof generation as stateful search over structural embeddings of proof states, outperforming prior LLM-based systems by up to 26 points on the seL4 benchmark.

On Reasoning-Centric LLM-based Automated Theorem Proving

cs.SE · 2026-04-21 · unverdicted · novelty 5.0

ReCent-Prover achieves a 22.58% relative improvement over prior state-of-the-art in proved theorems on the CoqStoq benchmark by using reasoning-centric techniques under a fixed LLM invocation budget.

Reinforcement Learning with Negative Tests as Completeness Signal for Formal Specification Synthesis

cs.SE · 2026-04-07 · unverdicted · novelty 5.0

SpecRL uses the fraction of negative tests rejected by candidate specifications as a reward signal in RL training to produce stronger and more verifiable formal specifications than prior methods.

LogCopilot: Automating Log Aggregation Analysis through Large Language Models

cs.SE · 2026-06-13 · unverdicted · novelty 4.0

LogCopilot is an LLM framework that builds a hierarchical knowledge base from logs and generates/executes LogQL queries from natural language instructions, reporting 76.8% average accuracy across four datasets.

citing papers explorer

AutoSOUP: Safety-Oriented Unit Proof Generation for Component-level Memory-Safety Verification

cs.SE · 2026-05-11 · unverdicted · none · ref 38

AutoSOUP automates component-level memory-safety verification by generating Safety-Oriented Unit Proofs via three techniques and a hybrid LLM-plus-program-synthesis architecture called LLM-As-Function-Call.

PROMISE: Proof Automation as Structural Imitation of Human Reasoning

cs.LO · 2026-04-07 · unverdicted · none · ref 10

PROMISE reframes automated proof generation as stateful search over structural embeddings of proof states, outperforming prior LLM-based systems by up to 26 points on the seL4 benchmark.

Reinforcement Learning with Negative Tests as Completeness Signal for Formal Specification Synthesis

cs.SE · 2026-04-07 · unverdicted · none · ref 16

SpecRL uses the fraction of negative tests rejected by candidate specifications as a reward signal in RL training to produce stronger and more verifiable formal specifications than prior methods.
