arxiv: 2605.13246 · v1 · submitted 2026-05-13 · 💻 cs.CR · cs.SE

Recognition: unknown

Automatic Detection of Reference Counting Bugs in Linux Kernel Drivers

Joe Hattori , Naoki Kobayashi , Ken Sakayori

Authors on Pith no claims yet

Pith reviewed 2026-05-14 20:25 UTC · model grok-4.3

classification 💻 cs.CR cs.SE

keywords reference countingLinux kernel driversbug detectionstatic analysisprogram slicingassertion checkingDrvHorn

0 comments

The pith

DrvHorn reduces reference counting verification in Linux kernel drivers to assertion checking through kernel modeling and program slicing.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents DrvHorn as an automated tool that finds reference counting bugs in Linux kernel drivers. These bugs produce resource leaks and security issues when drivers fail to balance increments and decrements on shared objects. By modeling the standard Linux driver interface and applying aggressive slicing to remove irrelevant code, DrvHorn turns the tracking problem into ordinary assertion checks that existing analyzers can run. When applied to every platform driver in kernel version 6.6, the tool reported 545 bugs, 424 of them previously unknown, at a false-positive rate of 29.9 percent. Developers reviewed the reports, submitted patches, and 45 were accepted into the kernel.

Core claim

DrvHorn reduces reference counting verification to an assertion checking problem by efficiently modeling the Linux kernel and applying aggressive program slicing. When run on all platform drivers in Linux kernel version 6.6, it identified 545 bugs, of which 424 were previously unknown, with a false positive rate of 29.9 percent that is lower than rates reported in prior studies. Patches addressing the newly discovered bugs were submitted to the kernel, and 45 of them have been merged.

What carries the argument

The DrvHorn tool, which reduces reference counting verification to assertion checking by modeling the Linux driver interface and applying aggressive program slicing.

If this is right

Reference counting bugs are common across platform drivers in the current Linux kernel.
Automated reduction to assertion checking can locate these bugs at scale with a usable false positive rate.
Patches generated from the tool's reports can be accepted and merged by kernel maintainers.
The approach covers the entire set of platform drivers without requiring per-driver manual modeling.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same interface-modeling reduction could be applied to detect other classes of driver bugs such as locking or memory ordering errors.
Running DrvHorn as part of the kernel's continuous integration tests might prevent new reference counting bugs from reaching mainline.
Lower false positive rates achieved through driver-specific modeling suggest that generic pointer analyses miss important domain patterns.

Load-bearing premise

The kernel modeling and aggressive program slicing preserve enough reference counting semantics to detect real bugs without excessive distortion or omission of relevant paths.

What would settle it

A manual audit of a random sample of the 424 newly reported bugs to determine how many are genuine reference counting errors, or re-running the tool on an older kernel version where the complete set of such bugs is already documented from prior reports.

Figures

Figures reproduced from arXiv: 2605.13246 by Joe Hattori, Ken Sakayori, Naoki Kobayashi.

**Figure 2.** Figure 2: Overall architecture of DrvHorn 3.1 Reduction to Assertion Checking Problem This section describes how we reduce refcount verification to an assertion checking problem by exploiting the structure of the Linux kernel driver interface. In general, static refcount bug detection in the Linux kernel tends to lack scalability, accuracy, or both due to its vastness and complexity. However, refcount invariants in … view at source ↗

**Figure 3.** Figure 3: Abstract overview of the verification target. The assertion checking problem is expressed in LLVM bitcode ( [PITH_FULL_IMAGE:figures/full_fig_p004_3.png] view at source ↗

**Figure 4.** Figure 4: Abstract implementation of usb_find_interface(). In this example, bus_find_device() returns a refcount-incremented struct device embedded in a struct usb_interface, and the driver recovers the enclosing interface structure via container_of. Device Resource Management Devres provides a device-managed layer of cleanup. __devm_add_action() is the API that registers a devres entry by associating a cleanup fun… view at source ↗

**Figure 5.** Figure 5: Simplified version of tpm_bios_measurements_open() before (left) and after (right) slicing. 3.4 CHC-based Verification As described in Section 3.1, it suffices to solve the assertion checking problem to verify the refcount invariant. Arbitrary off-the-shelf solvers can be used to solve this problem if they can accept LLVM bitcode as the input. We adopted SeaHorn [7] as our backend solver since it is suited… view at source ↗

**Figure 6.** Figure 6: Commit 5d8e297 tpm_bios_measurements_open() is registered as the open operation for tpm_bios_measurements_ops. It calls get_device() to increment the refcount of chip->dev, but the original code did not decrement it in the error path, leading to a memory leak. The patch adds put_device() in the error branch to balance the refcount. A.2 Commit 60e7c43e Related to Section 4.2 (Kernel Bug Fixes and Contributi… view at source ↗

**Figure 7.** Figure 7: Commit 60e7c43e node. This can lead to use-after-free bugs. The patch replaces the loop with for_each_child_of_node() and of_get_child_by_name(), and removes the extra of_node_put(). A.4 Commit 54a8cd0 Related to Section 4.2 (Kernel Bug Fixes and Contributions). This example removes a stored device link pointer that should only be used for a null check. With the DL_FLAG_AUTOREMOVE_SUPPLIER flag, the device… view at source ↗

**Figure 8.** Figure 8: Commit b9784e5cd struct qcom_ice { ... - struct device_link *link; }; struct qcom_ice *of_qcom_ice_get(struct device *dev) { struct qcom_ice *ice; + struct device_link *link; ... - ice->link = device_link_add(dev, ..., DL_FLAG_AUTOREMOVE_SUPPLIER); - if (!ice->link) + link = device_link_add(dev, ..., DL_FLAG_AUTOREMOVE_SUPPLIER); + if (!link) ... } [PITH_FULL_IMAGE:figures/full_fig_p016_8.png] view at source ↗

**Figure 9.** Figure 9: Commit 54a8cd0 pointer in a field struct device_link *link; of struct qcom_ice, which was unnecessary. The patch removes the field and uses a local variable solely for the null check [PITH_FULL_IMAGE:figures/full_fig_p016_9.png] view at source ↗

**Figure 10.** Figure 10: Implementation of acer_platform_probe. C Paper-Code Mapping This section briefly explains where to look in the codebase. As already mentioned, DrvHorn is availabe at github.com/joehattori/drvhorn. DrvHorn is developed as a fork of SeaHorn, and most of the modifications are made in the directory lib/Transforms/Kernel/. The following table gives a mapping describing in which file the contents of the paper … view at source ↗

read the original abstract

Reference counting bugs in Linux kernel drivers can lead to severe resource mismanagement and security vulnerabilities. We introduce DrvHorn, a novel automated tool to detect these bugs by reducing reference counting verification to an assertion checking problem leveraging the Linux driver interface. Through efficient modeling of the Linux kernel and aggressive program slicing, DrvHorn discovered 545 bugs, of which 424 were previously unknown, across all platform drivers in v6.6 Linux kernel, with a lower false positive rate of 29.9% compared to prior studies. To address the root causes of these newly discovered bugs, we submitted patches to the Linux kernel, and 45 of them were merged.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

DrvHorn finds 545 refcount bugs in Linux v6.6 drivers with 45 patches merged, but thin details on the kernel model and slicing leave the numbers hard to trust fully.

read the letter

The paper's core result is that DrvHorn detected 545 reference counting bugs across platform drivers in Linux v6.6, including 424 that were new, and 45 of the submitted patches got merged. That scale and the external signal from the kernel tree are the parts worth paying attention to first. The approach reduces the problem to assertion checking by building a model of the driver interfaces and applying aggressive slicing, which lets the tool run over the full set of drivers rather than small samples. This is more targeted than generic static analyzers and produces a reported false positive rate of 29.9 percent, lower than some earlier studies. The merged patches provide concrete evidence that at least some of the reports were real and actionable. The main gaps are in the modeling choices. The abstract mentions efficient modeling of the kernel and slicing but does not spell out how kref, kobject, devm, or RCU paths are abstracted or what criteria decide which call sites stay in the slice. If indirect calls or cross-module paths get dropped, the bug counts and false positive rate could shift. There is also no description of how the false positives were manually validated or whether the 29.9 percent figure includes error bars. Those omissions make the soundness claim rest more on the patch outcomes than on transparent methodology. This work is aimed at researchers and developers who build static tools for large systems codebases. Someone looking for practical bug-finding results on real kernel drivers would find the numbers and patch data useful. It is worth sending to peer review because the empirical footprint is large enough to justify checking the modeling details and validation steps in full.

Referee Report

3 major / 2 minor

Summary. The paper introduces DrvHorn, a static analysis tool that reduces reference-counting bug detection in Linux kernel drivers to assertion checking. It does so via custom modeling of kernel primitives and aggressive program slicing, then applies the tool to all platform drivers in Linux v6.6. The authors report 545 bugs found (424 previously unknown), a 29.9% false-positive rate, and 45 submitted patches that were merged upstream.

Significance. If the modeling and slicing preserve the relevant reference-counting semantics, the work would constitute a meaningful advance in automated kernel bug finding: it scales to the entire platform-driver tree, reports a large absolute number of issues, achieves a lower false-positive rate than prior studies, and demonstrates real-world impact through merged patches.

major comments (3)

[§3] §3 (Kernel Modeling): The central claim that reference counting is reduced to assertion checking rests on 'efficient modeling of the Linux kernel,' yet the manuscript provides no concrete description of how kref/kobject_get/put, devm_* allocators, or RCU-protected paths are abstracted, nor how indirect calls through driver interfaces are resolved. Without these details the 545-bug count and 29.9% FP rate cannot be independently assessed.
[Evaluation] Evaluation section: The false-positive rate of 29.9% is presented without stating the validation procedure (manual inspection criteria, sampling method, or inter-rater agreement). The merged-patch signal only confirms a subset of reports and does not address the soundness of paths that were sliced away.
[§4] §4 (Slicing): The description of 'aggressive program slicing' does not specify the slicing criteria used to retain all relevant call sites for reference-counting operations or how module-boundary crossings are handled. If relevant paths are omitted, both the true-positive count and the reported FP rate become unreliable.

minor comments (2)

[Abstract] The abstract states concrete bug counts and FP rate but the main text should cross-reference the exact table or figure that tabulates them.
Notation for the modeled reference-counting primitives is introduced without a summary table; adding one would improve readability.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the detailed and constructive review. We address each major comment below and will incorporate clarifications in a revised manuscript to improve reproducibility and transparency.

read point-by-point responses

Referee: [§3] §3 (Kernel Modeling): The central claim that reference counting is reduced to assertion checking rests on 'efficient modeling of the Linux kernel,' yet the manuscript provides no concrete description of how kref/kobject_get/put, devm_* allocators, or RCU-protected paths are abstracted, nor how indirect calls through driver interfaces are resolved. Without these details the 545-bug count and 29.9% FP rate cannot be independently assessed.

Authors: We agree that the current description in §3 is insufficient for independent assessment. In the revision we will expand §3 with concrete modeling details for kref/kobject_get/put, devm_* allocators, RCU-protected paths, and the resolution of indirect calls through driver interfaces, including any simplifying assumptions. revision: yes
Referee: [Evaluation] Evaluation section: The false-positive rate of 29.9% is presented without stating the validation procedure (manual inspection criteria, sampling method, or inter-rater agreement). The merged-patch signal only confirms a subset of reports and does not address the soundness of paths that were sliced away.

Authors: We acknowledge the gap in describing the validation procedure. The revised Evaluation section will specify the manual inspection criteria, sampling method, and inter-rater agreement measures used to compute the 29.9% false-positive rate. We will also add discussion of soundness for sliced paths and clarify the role of merged patches as supporting evidence rather than complete validation. revision: yes
Referee: [§4] §4 (Slicing): The description of 'aggressive program slicing' does not specify the slicing criteria used to retain all relevant call sites for reference-counting operations or how module-boundary crossings are handled. If relevant paths are omitted, both the true-positive count and the reported FP rate become unreliable.

Authors: We will revise §4 to specify the exact slicing criteria that retain reference-counting call sites and to describe how module-boundary crossings are handled. This will include justification that the retained paths preserve the relevant semantics for the bug-detection property. revision: yes

Circularity Check

0 steps flagged

No circularity: empirical static-analysis tool evaluated on real kernel code

full rationale

The paper presents DrvHorn as a practical static-analysis tool that reduces reference-count verification to assertion checking via kernel modeling and slicing. Its central results (545 bugs found, 424 new, 29.9% FP rate, 45 merged patches) are obtained by executing the tool on the v6.6 Linux kernel drivers and confirming a subset via external patch review. No equations, fitted parameters, or self-citations are used to derive the bug counts or FP rate; the modeling decisions are design choices whose soundness is assessed by the empirical outcomes rather than presupposed by them. This is a standard empirical software-engineering paper whose claims rest on observable detections and external validation, not on any reduction to its own inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The approach rests on domain assumptions about the fidelity of the Linux driver model and slicing; no free parameters or invented entities are described in the abstract.

axioms (1)

domain assumption Linux driver interface can be modeled sufficiently accurately for reference counting verification via assertion checking.
This modeling is the core reduction step that enables the automated detection.

pith-pipeline@v0.9.0 · 5402 in / 1237 out tokens · 29428 ms · 2026-05-14T20:25:57.625437+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

16 extracted references · 16 canonical work pages

[1]

Devicetree Kernel API,https://docs.kernel.org/devicetree/kernel-api. html

work page
[2]

Linux and the Devicetree,https://docs.kernel.org/devicetree/usage-model. html

work page
[3]

LLVM’s Analysis and Transform Passes,https://llvm.org/docs/Passes.html

work page
[4]

CVE-2023-7192,https://cve.mitre.org/cgi-bin/cvename.cgi?name= CVE-2023-7192

MITRE. CVE-2023-7192,https://cve.mitre.org/cgi-bin/cvename.cgi?name= CVE-2023-7192

work page 2023
[5]

ACM SIGOPS Operating Systems Review40(4), 73–85 (2006)

Ball, T., Bounimova, E., Cook, B., Levin, V., Lichtenberg, J., McGarvey, C., On- drusek, B., Rajamani, S.K., Ustuner, A.: Thorough static analysis of device drivers. ACM SIGOPS Operating Systems Review40(4), 73–85 (2006)

work page 2006
[6]

Communications of the ACM54(7), 68–76 (2011)

Ball, T., Levin, V., Rajamani, S.K.: A decade of software model checking with SLAM. Communications of the ACM54(7), 68–76 (2011)

work page 2011
[7]

In: International Conference on Computer Aided Verification

Gurfinkel, A., Kahsai, T., Komuravelli, A., Navas, J.A.: The SeaHorn verification framework. In: International Conference on Computer Aided Verification. pp. 343–

work page
[8]

In: Static Analysis: 24th International Symposium, SAS 2017, New York, NY, USA, August 30–September 1, 2017, Proceedings 24

Gurfinkel, A., Navas, J.A.: A context-sensitive memory model for verification of C/C++ programs. In: Static Analysis: 24th International Symposium, SAS 2017, New York, NY, USA, August 30–September 1, 2017, Proceedings 24. pp. 148–168. Springer (2017)

work page 2017
[9]

In: Model Checking Software: 10th International SPIN Workshop Port- land, OR, USA, May 9–10, 2003 Proceedings 10

Henzinger, T.A., Jhala, R., Majumdar, R., Sutre, G.: Software verification with BLAST. In: Model Checking Software: 10th International SPIN Workshop Port- land, OR, USA, May 9–10, 2003 Proceedings 10. pp. 235–239. Springer (2003)

work page 2003
[10]

In: 31st USENIX Security Symposium (USENIX Security 22)

Liu, J., Yi, L., Chen, W., Song, C., Qian, Z., Yi, Q.: LinKRID: Vetting imbal- ance reference counting in linux kernel with symbolic execution. In: 31st USENIX Security Symposium (USENIX Security 22). pp. 125–142 (2022)

work page 2022
[11]

In: Proceedings of the Twenty-First International Conference on Architectural Support for Programming Languages and Operating Systems

Mao, J., Chen, Y., Xiao, Q., Shi, Y.: RID: finding reference count bugs with in- consistent path pair checking. In: Proceedings of the Twenty-First International Conference on Architectural Support for Programming Languages and Operating Systems. pp. 531–544 (2016)

work page 2016
[12]

In: International Workshop on Parallel and Distributed Methods in Verification

Mühlberg, J.T., Lüttgen, G.: Blasting linux code. In: International Workshop on Parallel and Distributed Methods in Verification. pp. 211–226. Springer (2006)

work page 2006
[13]

In: International conference on integrated formal methods

Post, H., Küchlin, W.: Integrated static analysis for linux device driver verification. In: International conference on integrated formal methods. pp. 518–537. Springer (2007)

work page 2007
[14]

Software Testing, Verifi- cation and Reliability19(2), 155–172 (2009)

Post, H., Sinz, C., Küchlin, W.: Towards automatic software model checking of thousands of linux modules—a case study with Avinux. Software Testing, Verifi- cation and Reliability19(2), 155–172 (2009)

work page 2009
[15]

In: 2022 Formal Methods in Computer-Aided Design (FMCAD)

Priya, S., Su, Y., Bao, Y., Zhou, X., Vizel, Y., Gurfinkel, A.: Bounded model check- ing for LLVM. In: 2022 Formal Methods in Computer-Aided Design (FMCAD). pp. 214–224. TU Wien Academic Press (2022) Automatic Detection of Reference Counting Bugs in Linux Kernel Drivers 13

work page 2022
[16]

q6wcss"); qcom_add_pdm_subdev(rproc, &wcss->pdm_subdev); qcom_add_ssr_subdev(rproc, &wcss->ssr_subdev,

Tan,X.,Zhang,Y.,Yang,X.,Lu,K.,Yang,M.:Detectingkernelrefcountbugswith {Two-Dimensional}consistency checking. In: 30th USENIX Security Symposium (USENIX Security 21). pp. 2471–2488 (2021) 14 Joe Hattori , Naoki Kobayashi , and Ken Sakayori Appendix A Patch A.1 Commit 5d8e297 Related to Section 4.2 (Kernel Bug Fixes and Contributions). This example fixes a ...

work page 2021