Recognition: unknown
Automatic Detection of Reference Counting Bugs in Linux Kernel Drivers
Pith reviewed 2026-05-14 20:25 UTC · model grok-4.3
The pith
DrvHorn reduces reference counting verification in Linux kernel drivers to assertion checking through kernel modeling and program slicing.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
DrvHorn reduces reference counting verification to an assertion checking problem by efficiently modeling the Linux kernel and applying aggressive program slicing. When run on all platform drivers in Linux kernel version 6.6, it identified 545 bugs, of which 424 were previously unknown, with a false positive rate of 29.9 percent that is lower than rates reported in prior studies. Patches addressing the newly discovered bugs were submitted to the kernel, and 45 of them have been merged.
What carries the argument
The DrvHorn tool, which reduces reference counting verification to assertion checking by modeling the Linux driver interface and applying aggressive program slicing.
If this is right
- Reference counting bugs are common across platform drivers in the current Linux kernel.
- Automated reduction to assertion checking can locate these bugs at scale with a usable false positive rate.
- Patches generated from the tool's reports can be accepted and merged by kernel maintainers.
- The approach covers the entire set of platform drivers without requiring per-driver manual modeling.
Where Pith is reading between the lines
- The same interface-modeling reduction could be applied to detect other classes of driver bugs such as locking or memory ordering errors.
- Running DrvHorn as part of the kernel's continuous integration tests might prevent new reference counting bugs from reaching mainline.
- Lower false positive rates achieved through driver-specific modeling suggest that generic pointer analyses miss important domain patterns.
Load-bearing premise
The kernel modeling and aggressive program slicing preserve enough reference counting semantics to detect real bugs without excessive distortion or omission of relevant paths.
What would settle it
A manual audit of a random sample of the 424 newly reported bugs to determine how many are genuine reference counting errors, or re-running the tool on an older kernel version where the complete set of such bugs is already documented from prior reports.
Figures
read the original abstract
Reference counting bugs in Linux kernel drivers can lead to severe resource mismanagement and security vulnerabilities. We introduce DrvHorn, a novel automated tool to detect these bugs by reducing reference counting verification to an assertion checking problem leveraging the Linux driver interface. Through efficient modeling of the Linux kernel and aggressive program slicing, DrvHorn discovered 545 bugs, of which 424 were previously unknown, across all platform drivers in v6.6 Linux kernel, with a lower false positive rate of 29.9% compared to prior studies. To address the root causes of these newly discovered bugs, we submitted patches to the Linux kernel, and 45 of them were merged.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces DrvHorn, a static analysis tool that reduces reference-counting bug detection in Linux kernel drivers to assertion checking. It does so via custom modeling of kernel primitives and aggressive program slicing, then applies the tool to all platform drivers in Linux v6.6. The authors report 545 bugs found (424 previously unknown), a 29.9% false-positive rate, and 45 submitted patches that were merged upstream.
Significance. If the modeling and slicing preserve the relevant reference-counting semantics, the work would constitute a meaningful advance in automated kernel bug finding: it scales to the entire platform-driver tree, reports a large absolute number of issues, achieves a lower false-positive rate than prior studies, and demonstrates real-world impact through merged patches.
major comments (3)
- [§3] §3 (Kernel Modeling): The central claim that reference counting is reduced to assertion checking rests on 'efficient modeling of the Linux kernel,' yet the manuscript provides no concrete description of how kref/kobject_get/put, devm_* allocators, or RCU-protected paths are abstracted, nor how indirect calls through driver interfaces are resolved. Without these details the 545-bug count and 29.9% FP rate cannot be independently assessed.
- [Evaluation] Evaluation section: The false-positive rate of 29.9% is presented without stating the validation procedure (manual inspection criteria, sampling method, or inter-rater agreement). The merged-patch signal only confirms a subset of reports and does not address the soundness of paths that were sliced away.
- [§4] §4 (Slicing): The description of 'aggressive program slicing' does not specify the slicing criteria used to retain all relevant call sites for reference-counting operations or how module-boundary crossings are handled. If relevant paths are omitted, both the true-positive count and the reported FP rate become unreliable.
minor comments (2)
- [Abstract] The abstract states concrete bug counts and FP rate but the main text should cross-reference the exact table or figure that tabulates them.
- Notation for the modeled reference-counting primitives is introduced without a summary table; adding one would improve readability.
Simulated Author's Rebuttal
We thank the referee for the detailed and constructive review. We address each major comment below and will incorporate clarifications in a revised manuscript to improve reproducibility and transparency.
read point-by-point responses
-
Referee: [§3] §3 (Kernel Modeling): The central claim that reference counting is reduced to assertion checking rests on 'efficient modeling of the Linux kernel,' yet the manuscript provides no concrete description of how kref/kobject_get/put, devm_* allocators, or RCU-protected paths are abstracted, nor how indirect calls through driver interfaces are resolved. Without these details the 545-bug count and 29.9% FP rate cannot be independently assessed.
Authors: We agree that the current description in §3 is insufficient for independent assessment. In the revision we will expand §3 with concrete modeling details for kref/kobject_get/put, devm_* allocators, RCU-protected paths, and the resolution of indirect calls through driver interfaces, including any simplifying assumptions. revision: yes
-
Referee: [Evaluation] Evaluation section: The false-positive rate of 29.9% is presented without stating the validation procedure (manual inspection criteria, sampling method, or inter-rater agreement). The merged-patch signal only confirms a subset of reports and does not address the soundness of paths that were sliced away.
Authors: We acknowledge the gap in describing the validation procedure. The revised Evaluation section will specify the manual inspection criteria, sampling method, and inter-rater agreement measures used to compute the 29.9% false-positive rate. We will also add discussion of soundness for sliced paths and clarify the role of merged patches as supporting evidence rather than complete validation. revision: yes
-
Referee: [§4] §4 (Slicing): The description of 'aggressive program slicing' does not specify the slicing criteria used to retain all relevant call sites for reference-counting operations or how module-boundary crossings are handled. If relevant paths are omitted, both the true-positive count and the reported FP rate become unreliable.
Authors: We will revise §4 to specify the exact slicing criteria that retain reference-counting call sites and to describe how module-boundary crossings are handled. This will include justification that the retained paths preserve the relevant semantics for the bug-detection property. revision: yes
Circularity Check
No circularity: empirical static-analysis tool evaluated on real kernel code
full rationale
The paper presents DrvHorn as a practical static-analysis tool that reduces reference-count verification to assertion checking via kernel modeling and slicing. Its central results (545 bugs found, 424 new, 29.9% FP rate, 45 merged patches) are obtained by executing the tool on the v6.6 Linux kernel drivers and confirming a subset via external patch review. No equations, fitted parameters, or self-citations are used to derive the bug counts or FP rate; the modeling decisions are design choices whose soundness is assessed by the empirical outcomes rather than presupposed by them. This is a standard empirical software-engineering paper whose claims rest on observable detections and external validation, not on any reduction to its own inputs.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption Linux driver interface can be modeled sufficiently accurately for reference counting verification via assertion checking.
Reference graph
Works this paper leans on
-
[1]
Devicetree Kernel API,https://docs.kernel.org/devicetree/kernel-api. html
-
[2]
Linux and the Devicetree,https://docs.kernel.org/devicetree/usage-model. html
-
[3]
LLVM’s Analysis and Transform Passes,https://llvm.org/docs/Passes.html
-
[4]
CVE-2023-7192,https://cve.mitre.org/cgi-bin/cvename.cgi?name= CVE-2023-7192
MITRE. CVE-2023-7192,https://cve.mitre.org/cgi-bin/cvename.cgi?name= CVE-2023-7192
work page 2023
-
[5]
ACM SIGOPS Operating Systems Review40(4), 73–85 (2006)
Ball, T., Bounimova, E., Cook, B., Levin, V., Lichtenberg, J., McGarvey, C., On- drusek, B., Rajamani, S.K., Ustuner, A.: Thorough static analysis of device drivers. ACM SIGOPS Operating Systems Review40(4), 73–85 (2006)
work page 2006
-
[6]
Communications of the ACM54(7), 68–76 (2011)
Ball, T., Levin, V., Rajamani, S.K.: A decade of software model checking with SLAM. Communications of the ACM54(7), 68–76 (2011)
work page 2011
-
[7]
In: International Conference on Computer Aided Verification
Gurfinkel, A., Kahsai, T., Komuravelli, A., Navas, J.A.: The SeaHorn verification framework. In: International Conference on Computer Aided Verification. pp. 343–
-
[8]
Gurfinkel, A., Navas, J.A.: A context-sensitive memory model for verification of C/C++ programs. In: Static Analysis: 24th International Symposium, SAS 2017, New York, NY, USA, August 30–September 1, 2017, Proceedings 24. pp. 148–168. Springer (2017)
work page 2017
-
[9]
Henzinger, T.A., Jhala, R., Majumdar, R., Sutre, G.: Software verification with BLAST. In: Model Checking Software: 10th International SPIN Workshop Port- land, OR, USA, May 9–10, 2003 Proceedings 10. pp. 235–239. Springer (2003)
work page 2003
-
[10]
In: 31st USENIX Security Symposium (USENIX Security 22)
Liu, J., Yi, L., Chen, W., Song, C., Qian, Z., Yi, Q.: LinKRID: Vetting imbal- ance reference counting in linux kernel with symbolic execution. In: 31st USENIX Security Symposium (USENIX Security 22). pp. 125–142 (2022)
work page 2022
-
[11]
Mao, J., Chen, Y., Xiao, Q., Shi, Y.: RID: finding reference count bugs with in- consistent path pair checking. In: Proceedings of the Twenty-First International Conference on Architectural Support for Programming Languages and Operating Systems. pp. 531–544 (2016)
work page 2016
-
[12]
In: International Workshop on Parallel and Distributed Methods in Verification
Mühlberg, J.T., Lüttgen, G.: Blasting linux code. In: International Workshop on Parallel and Distributed Methods in Verification. pp. 211–226. Springer (2006)
work page 2006
-
[13]
In: International conference on integrated formal methods
Post, H., Küchlin, W.: Integrated static analysis for linux device driver verification. In: International conference on integrated formal methods. pp. 518–537. Springer (2007)
work page 2007
-
[14]
Software Testing, Verifi- cation and Reliability19(2), 155–172 (2009)
Post, H., Sinz, C., Küchlin, W.: Towards automatic software model checking of thousands of linux modules—a case study with Avinux. Software Testing, Verifi- cation and Reliability19(2), 155–172 (2009)
work page 2009
-
[15]
In: 2022 Formal Methods in Computer-Aided Design (FMCAD)
Priya, S., Su, Y., Bao, Y., Zhou, X., Vizel, Y., Gurfinkel, A.: Bounded model check- ing for LLVM. In: 2022 Formal Methods in Computer-Aided Design (FMCAD). pp. 214–224. TU Wien Academic Press (2022) Automatic Detection of Reference Counting Bugs in Linux Kernel Drivers 13
work page 2022
-
[16]
Tan,X.,Zhang,Y.,Yang,X.,Lu,K.,Yang,M.:Detectingkernelrefcountbugswith {Two-Dimensional}consistency checking. In: 30th USENIX Security Symposium (USENIX Security 21). pp. 2471–2488 (2021) 14 Joe Hattori , Naoki Kobayashi , and Ken Sakayori Appendix A Patch A.1 Commit 5d8e297 Related to Section 4.2 (Kernel Bug Fixes and Contributions). This example fixes a ...
work page 2021
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.