In Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis (Vienna, Austria) (ISSTA 2024)

33 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 4

citation-polarity summary

years

2026: 29 · 2025: 4

polarities

background 4

representative citing papers

Understanding the (In)Security of Vibe-Coded Applications

cs.CR · 2026-06-22 · unverdicted · novelty 7.0

Empirical study of real-world vibe-coded apps finds recurring vulnerabilities like placeholder logic and secret exposure caused by AI agent limitations such as memory loss and insufficient security knowledge.

SWE-Explore: Benchmarking How Coding Agents Explore Repositories

cs.SE · 2026-06-05 · unverdicted · novelty 7.0

SWE-Explore is a new benchmark evaluating repository exploration by coding agents on 848 issues across 203 repositories, using line-level ground truth from successful agent trajectories and showing agentic methods outperform classical retrieval on coverage and ranking.

Synthesizing Multi-Agent Harnesses for Vulnerability Discovery

cs.CR · 2026-04-22 · unverdicted · novelty 7.0

AgentFlow uses a typed graph DSL covering roles, prompts, tools, topology and protocol plus a runtime-signal feedback loop to optimize multi-agent harnesses, reaching 84.3% on TerminalBench-2 and discovering ten new zero-days in Chrome including two critical sandbox escapes.

Certified Program Synthesis with a Multi-Modal Verifier

cs.SE · 2026-04-17 · unverdicted · novelty 7.0

LeetProof achieves higher rates of fully certified program synthesis from natural language by using a multi-modal verifier in Lean to validate specifications via randomized testing and delegate proofs to AI tools, outperforming single-mode baselines on benchmarks while uncovering defects in prior references.

Evaluating LLM Agents on Automated Software Analysis Tasks

cs.SE · 2026-04-13 · unverdicted · novelty 7.0

A custom LLM agent achieves 94% manually verified success on a new benchmark of 35 software analysis setups, outperforming baselines at 77%, but struggles with stage mixing, error localization, and overestimating its own success.

AgentBound: Securing Execution Boundaries of AI Agents

cs.CR · 2025-10-24 · conditional · novelty 7.0

AgentBound is the first declarative access control framework for Model Context Protocol servers that generates policies from source code at 80.9% accuracy and blocks most threats in malicious servers with negligible overhead.

Agentic Coding Needs Proactivity, Not Just Autonomy

cs.SE · 2026-05-07 · conditional · novelty 6.0

Coding agents require a three-level proactivity taxonomy (Reactive, Scheduled, Situation Aware) evaluated by insight policy quality using Insight Decision Quality, Context Grounding Score, and Learning Lift.
