First empirical study of correctness bugs in torch.compile characterizes their patterns and proposes AlignGuard, which found 23 confirmed new bugs via LLM-guided test mutation.
InProceedings of the 29th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering (Athens, Greece)(ESEC/FSE 2021)
6 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
roles
background 2polarities
background 2representative citing papers
XCheck extracts cross-layer constraints to generate test models and monitor behaviors, revealing 2,034 compiler-platform interaction bugs in three DL compilers.
SAGE uses sparse autoencoders to boost vulnerability signals in LLMs, raising internal SNR 12.7x and delivering up to 318% MCC gains on vulnerability detection benchmarks.
UntrustVul identifies untrustworthy vulnerability predictions by marking lines that neither match historical vulnerability patterns nor influence vulnerable lines through dependencies, reporting AUC 70-88% and F1 82-94% on 115K predictions.
HYDRA is a hybrid model that uses heuristics plus deep embeddings and a VAE to predict latent zero-day vulnerabilities in patched functions from Chrome, Android, and ImageMagick.
citing papers explorer
-
Demystifying the Silence of Correctness Bugs in PyTorch Compiler
First empirical study of correctness bugs in torch.compile characterizes their patterns and proposes AlignGuard, which found 23 confirmed new bugs via LLM-guided test mutation.
-
Finding Compiler-Platform Interaction Bugs in Deep Learning Pipelines via Cross-Layer Constraints
XCheck extracts cross-layer constraints to generate test models and monitor behaviors, revealing 2,034 compiler-platform interaction bugs in three DL compilers.
-
SAGE: Signal-Amplified Guided Embeddings for LLM-based Vulnerability Detection
SAGE uses sparse autoencoders to boost vulnerability signals in LLMs, raising internal SNR 12.7x and delivering up to 318% MCC gains on vulnerability detection benchmarks.
-
UntrustVul: An Automated Approach for Identifying Untrustworthy Alerts in Vulnerability Detection Models
UntrustVul identifies untrustworthy vulnerability predictions by marking lines that neither match historical vulnerability patterns nor influence vulnerable lines through dependencies, reporting AUC 70-88% and F1 82-94% on 115K predictions.
-
HYDRA: A Hybrid Heuristic-Guided Deep Representation Architecture for Predicting Latent Zero-Day Vulnerabilities in Patched Functions
HYDRA is a hybrid model that uses heuristics plus deep embeddings and a VAE to predict latent zero-day vulnerabilities in patched functions from Chrome, Android, and ImageMagick.
- Understanding Bugs in Modern Agentic Frameworks: A Study of Symptoms, Root Causes, and Triggering Conditions