pith. machine review for the scientific record. sign in

arxiv: 2605.13246 · v1 · submitted 2026-05-13 · 💻 cs.CR · cs.SE

Recognition: unknown

Automatic Detection of Reference Counting Bugs in Linux Kernel Drivers

Authors on Pith no claims yet

Pith reviewed 2026-05-14 20:25 UTC · model grok-4.3

classification 💻 cs.CR cs.SE
keywords reference countingLinux kernel driversbug detectionstatic analysisprogram slicingassertion checkingDrvHorn
0
0 comments X

The pith

DrvHorn reduces reference counting verification in Linux kernel drivers to assertion checking through kernel modeling and program slicing.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents DrvHorn as an automated tool that finds reference counting bugs in Linux kernel drivers. These bugs produce resource leaks and security issues when drivers fail to balance increments and decrements on shared objects. By modeling the standard Linux driver interface and applying aggressive slicing to remove irrelevant code, DrvHorn turns the tracking problem into ordinary assertion checks that existing analyzers can run. When applied to every platform driver in kernel version 6.6, the tool reported 545 bugs, 424 of them previously unknown, at a false-positive rate of 29.9 percent. Developers reviewed the reports, submitted patches, and 45 were accepted into the kernel.

Core claim

DrvHorn reduces reference counting verification to an assertion checking problem by efficiently modeling the Linux kernel and applying aggressive program slicing. When run on all platform drivers in Linux kernel version 6.6, it identified 545 bugs, of which 424 were previously unknown, with a false positive rate of 29.9 percent that is lower than rates reported in prior studies. Patches addressing the newly discovered bugs were submitted to the kernel, and 45 of them have been merged.

What carries the argument

The DrvHorn tool, which reduces reference counting verification to assertion checking by modeling the Linux driver interface and applying aggressive program slicing.

If this is right

  • Reference counting bugs are common across platform drivers in the current Linux kernel.
  • Automated reduction to assertion checking can locate these bugs at scale with a usable false positive rate.
  • Patches generated from the tool's reports can be accepted and merged by kernel maintainers.
  • The approach covers the entire set of platform drivers without requiring per-driver manual modeling.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same interface-modeling reduction could be applied to detect other classes of driver bugs such as locking or memory ordering errors.
  • Running DrvHorn as part of the kernel's continuous integration tests might prevent new reference counting bugs from reaching mainline.
  • Lower false positive rates achieved through driver-specific modeling suggest that generic pointer analyses miss important domain patterns.

Load-bearing premise

The kernel modeling and aggressive program slicing preserve enough reference counting semantics to detect real bugs without excessive distortion or omission of relevant paths.

What would settle it

A manual audit of a random sample of the 424 newly reported bugs to determine how many are genuine reference counting errors, or re-running the tool on an older kernel version where the complete set of such bugs is already documented from prior reports.

Figures

Figures reproduced from arXiv: 2605.13246 by Joe Hattori, Ken Sakayori, Naoki Kobayashi.

Figure 1
Figure 1. Figure 1: Commit e386209 3 Proposed Method [PITH_FULL_IMAGE:figures/full_fig_p003_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: Overall architecture of DrvHorn 3.1 Reduction to Assertion Checking Problem This section describes how we reduce refcount verification to an assertion checking problem by exploiting the structure of the Linux kernel driver interface. In general, static refcount bug detection in the Linux kernel tends to lack scalability, accuracy, or both due to its vastness and complexity. However, refcount invariants in … view at source ↗
Figure 3
Figure 3. Figure 3: Abstract overview of the verification target. The assertion checking problem is expressed in LLVM bitcode ( [PITH_FULL_IMAGE:figures/full_fig_p004_3.png] view at source ↗
Figure 4
Figure 4. Figure 4: Abstract implementation of usb_find_interface(). In this example, bus_find_device() returns a refcount-incremented struct device embedded in a struct usb_interface, and the driver recovers the enclosing interface structure via container_of. Device Resource Management Devres provides a device-managed layer of cleanup. __devm_add_action() is the API that registers a devres entry by asso￾ciating a cleanup fun… view at source ↗
Figure 5
Figure 5. Figure 5: Simplified version of tpm_bios_measurements_open() before (left) and after (right) slicing. 3.4 CHC-based Verification As described in Section 3.1, it suffices to solve the assertion checking problem to verify the refcount invariant. Arbitrary off-the-shelf solvers can be used to solve this problem if they can accept LLVM bitcode as the input. We adopted SeaHorn [7] as our backend solver since it is suited… view at source ↗
Figure 6
Figure 6. Figure 6: Commit 5d8e297 tpm_bios_measurements_open() is registered as the open operation for tpm_bios_measurements_ops. It calls get_device() to increment the refcount of chip->dev, but the original code did not decrement it in the error path, leading to a memory leak. The patch adds put_device() in the error branch to balance the refcount. A.2 Commit 60e7c43e Related to Section 4.2 (Kernel Bug Fixes and Contributi… view at source ↗
Figure 7
Figure 7. Figure 7: Commit 60e7c43e node. This can lead to use-after-free bugs. The patch replaces the loop with for_each_child_of_node() and of_get_child_by_name(), and removes the extra of_node_put(). A.4 Commit 54a8cd0 Related to Section 4.2 (Kernel Bug Fixes and Contributions). This example removes a stored device link pointer that should only be used for a null check. With the DL_FLAG_AUTOREMOVE_SUPPLIER flag, the device… view at source ↗
Figure 8
Figure 8. Figure 8: Commit b9784e5cd struct qcom_ice { ... - struct device_link *link; }; struct qcom_ice *of_qcom_ice_get(struct device *dev) { struct qcom_ice *ice; + struct device_link *link; ... - ice->link = device_link_add(dev, ..., DL_FLAG_AUTOREMOVE_SUPPLIER); - if (!ice->link) + link = device_link_add(dev, ..., DL_FLAG_AUTOREMOVE_SUPPLIER); + if (!link) ... } [PITH_FULL_IMAGE:figures/full_fig_p016_8.png] view at source ↗
Figure 9
Figure 9. Figure 9: Commit 54a8cd0 pointer in a field struct device_link *link; of struct qcom_ice, which was unnecessary. The patch removes the field and uses a local variable solely for the null check [PITH_FULL_IMAGE:figures/full_fig_p016_9.png] view at source ↗
Figure 10
Figure 10. Figure 10: Implementation of acer_platform_probe. C Paper-Code Mapping This section briefly explains where to look in the codebase. As already men￾tioned, DrvHorn is availabe at github.com/joehattori/drvhorn. DrvHorn is developed as a fork of SeaHorn, and most of the modifications are made in the directory lib/Transforms/Kernel/. The following table gives a mapping describing in which file the contents of the paper … view at source ↗
read the original abstract

Reference counting bugs in Linux kernel drivers can lead to severe resource mismanagement and security vulnerabilities. We introduce DrvHorn, a novel automated tool to detect these bugs by reducing reference counting verification to an assertion checking problem leveraging the Linux driver interface. Through efficient modeling of the Linux kernel and aggressive program slicing, DrvHorn discovered 545 bugs, of which 424 were previously unknown, across all platform drivers in v6.6 Linux kernel, with a lower false positive rate of 29.9% compared to prior studies. To address the root causes of these newly discovered bugs, we submitted patches to the Linux kernel, and 45 of them were merged.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper introduces DrvHorn, a static analysis tool that reduces reference-counting bug detection in Linux kernel drivers to assertion checking. It does so via custom modeling of kernel primitives and aggressive program slicing, then applies the tool to all platform drivers in Linux v6.6. The authors report 545 bugs found (424 previously unknown), a 29.9% false-positive rate, and 45 submitted patches that were merged upstream.

Significance. If the modeling and slicing preserve the relevant reference-counting semantics, the work would constitute a meaningful advance in automated kernel bug finding: it scales to the entire platform-driver tree, reports a large absolute number of issues, achieves a lower false-positive rate than prior studies, and demonstrates real-world impact through merged patches.

major comments (3)
  1. [§3] §3 (Kernel Modeling): The central claim that reference counting is reduced to assertion checking rests on 'efficient modeling of the Linux kernel,' yet the manuscript provides no concrete description of how kref/kobject_get/put, devm_* allocators, or RCU-protected paths are abstracted, nor how indirect calls through driver interfaces are resolved. Without these details the 545-bug count and 29.9% FP rate cannot be independently assessed.
  2. [Evaluation] Evaluation section: The false-positive rate of 29.9% is presented without stating the validation procedure (manual inspection criteria, sampling method, or inter-rater agreement). The merged-patch signal only confirms a subset of reports and does not address the soundness of paths that were sliced away.
  3. [§4] §4 (Slicing): The description of 'aggressive program slicing' does not specify the slicing criteria used to retain all relevant call sites for reference-counting operations or how module-boundary crossings are handled. If relevant paths are omitted, both the true-positive count and the reported FP rate become unreliable.
minor comments (2)
  1. [Abstract] The abstract states concrete bug counts and FP rate but the main text should cross-reference the exact table or figure that tabulates them.
  2. Notation for the modeled reference-counting primitives is introduced without a summary table; adding one would improve readability.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the detailed and constructive review. We address each major comment below and will incorporate clarifications in a revised manuscript to improve reproducibility and transparency.

read point-by-point responses
  1. Referee: [§3] §3 (Kernel Modeling): The central claim that reference counting is reduced to assertion checking rests on 'efficient modeling of the Linux kernel,' yet the manuscript provides no concrete description of how kref/kobject_get/put, devm_* allocators, or RCU-protected paths are abstracted, nor how indirect calls through driver interfaces are resolved. Without these details the 545-bug count and 29.9% FP rate cannot be independently assessed.

    Authors: We agree that the current description in §3 is insufficient for independent assessment. In the revision we will expand §3 with concrete modeling details for kref/kobject_get/put, devm_* allocators, RCU-protected paths, and the resolution of indirect calls through driver interfaces, including any simplifying assumptions. revision: yes

  2. Referee: [Evaluation] Evaluation section: The false-positive rate of 29.9% is presented without stating the validation procedure (manual inspection criteria, sampling method, or inter-rater agreement). The merged-patch signal only confirms a subset of reports and does not address the soundness of paths that were sliced away.

    Authors: We acknowledge the gap in describing the validation procedure. The revised Evaluation section will specify the manual inspection criteria, sampling method, and inter-rater agreement measures used to compute the 29.9% false-positive rate. We will also add discussion of soundness for sliced paths and clarify the role of merged patches as supporting evidence rather than complete validation. revision: yes

  3. Referee: [§4] §4 (Slicing): The description of 'aggressive program slicing' does not specify the slicing criteria used to retain all relevant call sites for reference-counting operations or how module-boundary crossings are handled. If relevant paths are omitted, both the true-positive count and the reported FP rate become unreliable.

    Authors: We will revise §4 to specify the exact slicing criteria that retain reference-counting call sites and to describe how module-boundary crossings are handled. This will include justification that the retained paths preserve the relevant semantics for the bug-detection property. revision: yes

Circularity Check

0 steps flagged

No circularity: empirical static-analysis tool evaluated on real kernel code

full rationale

The paper presents DrvHorn as a practical static-analysis tool that reduces reference-count verification to assertion checking via kernel modeling and slicing. Its central results (545 bugs found, 424 new, 29.9% FP rate, 45 merged patches) are obtained by executing the tool on the v6.6 Linux kernel drivers and confirming a subset via external patch review. No equations, fitted parameters, or self-citations are used to derive the bug counts or FP rate; the modeling decisions are design choices whose soundness is assessed by the empirical outcomes rather than presupposed by them. This is a standard empirical software-engineering paper whose claims rest on observable detections and external validation, not on any reduction to its own inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The approach rests on domain assumptions about the fidelity of the Linux driver model and slicing; no free parameters or invented entities are described in the abstract.

axioms (1)
  • domain assumption Linux driver interface can be modeled sufficiently accurately for reference counting verification via assertion checking.
    This modeling is the core reduction step that enables the automated detection.

pith-pipeline@v0.9.0 · 5402 in / 1237 out tokens · 29428 ms · 2026-05-14T20:25:57.625437+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

16 extracted references · 16 canonical work pages

  1. [1]

    Devicetree Kernel API,https://docs.kernel.org/devicetree/kernel-api. html

  2. [2]

    Linux and the Devicetree,https://docs.kernel.org/devicetree/usage-model. html

  3. [3]

    LLVM’s Analysis and Transform Passes,https://llvm.org/docs/Passes.html

  4. [4]

    CVE-2023-7192,https://cve.mitre.org/cgi-bin/cvename.cgi?name= CVE-2023-7192

    MITRE. CVE-2023-7192,https://cve.mitre.org/cgi-bin/cvename.cgi?name= CVE-2023-7192

  5. [5]

    ACM SIGOPS Operating Systems Review40(4), 73–85 (2006)

    Ball, T., Bounimova, E., Cook, B., Levin, V., Lichtenberg, J., McGarvey, C., On- drusek, B., Rajamani, S.K., Ustuner, A.: Thorough static analysis of device drivers. ACM SIGOPS Operating Systems Review40(4), 73–85 (2006)

  6. [6]

    Communications of the ACM54(7), 68–76 (2011)

    Ball, T., Levin, V., Rajamani, S.K.: A decade of software model checking with SLAM. Communications of the ACM54(7), 68–76 (2011)

  7. [7]

    In: International Conference on Computer Aided Verification

    Gurfinkel, A., Kahsai, T., Komuravelli, A., Navas, J.A.: The SeaHorn verification framework. In: International Conference on Computer Aided Verification. pp. 343–

  8. [8]

    In: Static Analysis: 24th International Symposium, SAS 2017, New York, NY, USA, August 30–September 1, 2017, Proceedings 24

    Gurfinkel, A., Navas, J.A.: A context-sensitive memory model for verification of C/C++ programs. In: Static Analysis: 24th International Symposium, SAS 2017, New York, NY, USA, August 30–September 1, 2017, Proceedings 24. pp. 148–168. Springer (2017)

  9. [9]

    In: Model Checking Software: 10th International SPIN Workshop Port- land, OR, USA, May 9–10, 2003 Proceedings 10

    Henzinger, T.A., Jhala, R., Majumdar, R., Sutre, G.: Software verification with BLAST. In: Model Checking Software: 10th International SPIN Workshop Port- land, OR, USA, May 9–10, 2003 Proceedings 10. pp. 235–239. Springer (2003)

  10. [10]

    In: 31st USENIX Security Symposium (USENIX Security 22)

    Liu, J., Yi, L., Chen, W., Song, C., Qian, Z., Yi, Q.: LinKRID: Vetting imbal- ance reference counting in linux kernel with symbolic execution. In: 31st USENIX Security Symposium (USENIX Security 22). pp. 125–142 (2022)

  11. [11]

    In: Proceedings of the Twenty-First International Conference on Architectural Support for Programming Languages and Operating Systems

    Mao, J., Chen, Y., Xiao, Q., Shi, Y.: RID: finding reference count bugs with in- consistent path pair checking. In: Proceedings of the Twenty-First International Conference on Architectural Support for Programming Languages and Operating Systems. pp. 531–544 (2016)

  12. [12]

    In: International Workshop on Parallel and Distributed Methods in Verification

    Mühlberg, J.T., Lüttgen, G.: Blasting linux code. In: International Workshop on Parallel and Distributed Methods in Verification. pp. 211–226. Springer (2006)

  13. [13]

    In: International conference on integrated formal methods

    Post, H., Küchlin, W.: Integrated static analysis for linux device driver verification. In: International conference on integrated formal methods. pp. 518–537. Springer (2007)

  14. [14]

    Software Testing, Verifi- cation and Reliability19(2), 155–172 (2009)

    Post, H., Sinz, C., Küchlin, W.: Towards automatic software model checking of thousands of linux modules—a case study with Avinux. Software Testing, Verifi- cation and Reliability19(2), 155–172 (2009)

  15. [15]

    In: 2022 Formal Methods in Computer-Aided Design (FMCAD)

    Priya, S., Su, Y., Bao, Y., Zhou, X., Vizel, Y., Gurfinkel, A.: Bounded model check- ing for LLVM. In: 2022 Formal Methods in Computer-Aided Design (FMCAD). pp. 214–224. TU Wien Academic Press (2022) Automatic Detection of Reference Counting Bugs in Linux Kernel Drivers 13

  16. [16]

    q6wcss"); qcom_add_pdm_subdev(rproc, &wcss->pdm_subdev); qcom_add_ssr_subdev(rproc, &wcss->ssr_subdev,

    Tan,X.,Zhang,Y.,Yang,X.,Lu,K.,Yang,M.:Detectingkernelrefcountbugswith {Two-Dimensional}consistency checking. In: 30th USENIX Security Symposium (USENIX Security 21). pp. 2471–2488 (2021) 14 Joe Hattori , Naoki Kobayashi , and Ken Sakayori Appendix A Patch A.1 Commit 5d8e297 Related to Section 4.2 (Kernel Bug Fixes and Contributions). This example fixes a ...