Descriptor: Certus Caliber Classification Gunshot Dataset (C3GD)

Ryan Quinn; Sinclair Gurny

arxiv: 2606.18135 · v1 · pith:WJKWAHMEnew · submitted 2026-06-16 · 💻 cs.SD · cs.AI

Pith reviewed 2026-06-26 22:33 UTC · model grok-4.3

keywords: gunshot dataset, caliber classification, field recordings, muzzle blast audio, audio dataset, firearm sound, machine learning data, signal processing

The pith

A new public dataset supplies over 8000 field recordings of 28 firearms across 16 calibers with detailed metadata for audio analysis.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents the Certus Caliber Classification Gunshot Dataset as a resource of muzzle blast sounds gathered outdoors rather than taken from the internet. It records more than 8000 samples from 28 distinct firearms spanning 16 calibers, along with metadata on cartridges, microphones, and recording positions that goes beyond typical releases. The authors argue that this field approach reduces label noise and supplies enough variety to train models that work in varied real settings. The primary use is caliber classification, but the same files also support gunshot detection, audio separation, and general signal processing work. The dataset is released publicly to give researchers a consistent, high-quality reference.

Core claim

The paper's central contribution is the release of the C3GD dataset, which contains more than 8000 field-collected audio recordings of muzzle blasts from 28 firearms across 16 calibers together with metadata on firearms, cartridges, microphones, and microphone locations that exceeds what is otherwise publicly available; the collection is positioned to improve caliber classification while also enabling gunshot detection, audio separation, and signal processing tasks.

What carries the argument

The C3GD dataset: a set of field-collected muzzle blast recordings equipped with firearm, caliber, cartridge, microphone, and location metadata.

If this is right

Classifiers for caliber identification can be trained without the label errors common in web-scraped audio.
Audio separation and detection algorithms gain a reference set that includes realistic microphone placement and environmental variation.
Studies can now isolate the effect of specific metadata variables such as microphone distance on classification performance.
Signal processing methods developed for firearm sounds can be validated against a single, documented collection rather than scattered sources.
The dataset supplies a benchmark that future work can use to measure progress in real-world gunshot audio tasks.
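
The metadata-driven analysis mentioned in the list above could look roughly like the following. This is a minimal sketch, not the authors' method: the record layout and the field names `caliber`, `predicted`, and `mic_distance_m` are assumptions made for illustration and are not taken from the C3GD release.

```python
from collections import defaultdict


def accuracy_by_group(records, key):
    """Return {group value: accuracy} for predictions grouped by one metadata field."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for rec in records:
        group = rec[key]
        total[group] += 1
        if rec["predicted"] == rec["caliber"]:
            correct[group] += 1
    return {group: correct[group] / total[group] for group in total}


# Hypothetical records: field names and values are illustrative only.
records = [
    {"caliber": "9mm", "predicted": "9mm", "mic_distance_m": 10},
    {"caliber": "9mm", "predicted": ".45 ACP", "mic_distance_m": 50},
    {"caliber": ".223", "predicted": ".223", "mic_distance_m": 10},
    {"caliber": ".223", "predicted": ".223", "mic_distance_m": 50},
]

by_distance = accuracy_by_group(records, "mic_distance_m")
print(by_distance)
```

Grouping by a single field keeps the comparison easy to read; separating the effect of distance from that of microphone model or firearm would require holding the other metadata fields fixed.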

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Researchers working on public-safety audio systems may find the metadata useful for building location-aware or microphone-aware models.
The emphasis on field collection suggests that similar costly but low-noise datasets could be created for other impulsive sounds such as explosions or industrial events.
If the diversity proves sufficient, the same files could serve as a test bed for domain-adaptation techniques that move models from controlled to uncontrolled environments.
The release lowers the barrier for academic groups that lack resources to perform their own field recordings.

Load-bearing premise

Field-collected recordings carry lower label noise than internet audio and the chosen mix of firearms, calibers, and conditions is broad enough for models to generalize to new real-world situations.

What would settle it

Training a caliber classifier on C3GD and finding that its accuracy on a fresh set of field recordings is no higher than the accuracy of the same model trained on internet-sourced gunshot audio would falsify the advantage claimed for the new dataset.
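
The comparison described above reduces to an accuracy gap measured on the same fresh field recordings. The sketch below assumes both models' predictions are already available as lists of caliber labels; the function names and the toy labels are hypothetical, and a real test would also need confidence intervals or a significance test.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("label and prediction lists must have equal length")
    if not y_true:
        raise ValueError("need at least one example")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def field_advantage(y_true, pred_c3gd, pred_web):
    """Accuracy of the C3GD-trained model minus that of the web-trained model."""
    return accuracy(y_true, pred_c3gd) - accuracy(y_true, pred_web)


# Toy placeholder labels, not real results.
y_true = ["9mm", "9mm", ".223", ".308"]
pred_c3gd = ["9mm", "9mm", ".223", ".223"]  # 3 of 4 correct
pred_web = ["9mm", ".223", ".223", ".223"]  # 2 of 4 correct

gap = field_advantage(y_true, pred_c3gd, pred_web)
print(gap)
```

A gap at or below zero on a sufficiently large fresh test set would count against the claimed advantage of field collection.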

Figures

Figures reproduced from arXiv: 2606.18135 by Ryan Quinn, Sinclair Gurny.

**Figure 1.** Locations of microphones for Ohio, New Jersey, and New York collection events, respectively. The red point …

Original abstract

In this work, we introduce the Certus Caliber Classification Gunshot Dataset (C3GD), a publicly accessible data set developed for the analysis of firearm muzzle blast sounds. The dataset aims to provide a wide variety of firearms, calibers, cartridges, microphones, and microphone locations with metadata detailed beyond what is currently otherwise available. It comprises more than 8000 field-collected data points from 28 firearms across 16 calibers. Because data collection in the field is costly, much of the existing research has been done using gunshot audio collected from the internet, which increases the risk of low-quality data and label noise. This dataset is primarily focused on caliber classification, but can also be used for gunshot detection, audio separation, and audio signal processing, providing a diversified and real-world reference. The dataset aims to provide enough diversity to be able to generalize to more real-world applications while also providing enough metadata for detailed academic analysis.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

Dataset release claiming field collection beats web audio on quality but supplies no methods or validation to show it.

The core of this paper is the release of the C3GD dataset: more than 8000 field-collected gunshot recordings from 28 firearms across 16 calibers, with an emphasis on detailed metadata for caliber classification and secondary uses like detection or signal processing. It correctly notes that scraping audio from the internet often brings label noise and quality problems, and it tries to improve on that by collecting in the field.

The work does a straightforward job of identifying a practical gap and offering a public resource that covers a range of firearms, calibers, cartridges, microphones, and locations. That setup could matter for people who need real-world audio examples rather than cleaned or synthetic ones.

The soft spot is the complete absence of supporting details. The text asserts advantages in quality, diversity, and generalization potential but gives no account of how labels were assigned, what the recording distances or environments were, what sampling rates or equipment specs applied, or what quality-control steps were taken. Without those, the claims about lower noise and sufficient coverage for real applications stay untestable, exactly as the stress-test note flags. If the full paper adds protocols and statistics, that would change the picture; based on what is here, the evidence is missing.

This is for a narrow group working on audio ML for security or forensic tasks who might download and experiment with the data. A reader already in that subfield could get incremental value from the resource itself if the release is clean and documented.

It does not look ready for serious peer review. The methodological transparency gap is central for a dataset paper, and fixing it would be needed before referees could usefully assess the contribution.

Referee Report

2 major / 1 minor

Summary. The manuscript introduces the Certus Caliber Classification Gunshot Dataset (C3GD), a publicly accessible collection of more than 8000 field-collected gunshot audio recordings from 28 firearms across 16 calibers, accompanied by metadata claimed to exceed what is otherwise available. It positions the dataset as superior to internet-sourced audio due to reduced label noise and greater diversity in firearms, calibers, cartridges, microphones, and locations, with primary utility for caliber classification and secondary support for gunshot detection, audio separation, and signal processing.

Significance. If the dataset's scale, diversity, and quality claims are substantiated, it would supply a valuable real-world reference for audio machine learning in forensic and security domains, enabling better generalization than web-scraped alternatives and supporting detailed metadata-driven analyses.

major comments (2)

[Abstract] The assertion that field collection produces higher-quality data with lower label noise than internet-sourced audio is unsupported by any description of the labeling protocol, quality-control procedures, or quantitative validation metrics, leaving the central quality advantage untestable.
[Abstract] No recording parameters (distances, environments, sampling rates), microphone specifications, or location details are supplied, which are required to assess whether the claimed diversity across 28 firearms and 16 calibers supports generalization claims.
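
The parameters named in these comments can be read as a completeness check on each recording's metadata. The sketch below is illustrative only: the field names in `REQUIRED_FIELDS` are assumptions drawn from the categories listed in the report, not the actual schema of the C3GD release.

```python
# Hypothetical field names based on the categories the report asks for.
REQUIRED_FIELDS = (
    "firearm",
    "caliber",
    "cartridge",
    "microphone",
    "mic_location",
    "distance_m",
    "environment",
    "sample_rate_hz",
)


def missing_fields(record):
    """Return the required metadata fields that are absent or empty in a record."""
    return [name for name in REQUIRED_FIELDS if record.get(name) in (None, "")]


# Illustrative record with two gaps: no distance and an empty environment.
example = {
    "firearm": "rifle A",
    "caliber": ".223",
    "cartridge": "55 gr FMJ",
    "microphone": "mic 1",
    "mic_location": "position 3",
    "environment": "",
    "sample_rate_hz": 48000,
}

gaps = missing_fields(example)
print(gaps)
```

Running such a check over every record would give a simple per-field coverage count, which is one way to make the diversity claim inspectable.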

minor comments (1)

[Abstract] 'data set' appears inconsistently; standardize to 'dataset'.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments highlighting areas where the abstract requires additional support. We address each major comment below and have revised the manuscript to incorporate the requested details.

Referee: [Abstract] The assertion that field collection produces higher-quality data with lower label noise than internet-sourced audio is unsupported by any description of the labeling protocol, quality-control procedures, or quantitative validation metrics, leaving the central quality advantage untestable.

Authors: We agree that the abstract's claim regarding reduced label noise would benefit from explicit supporting information. We have revised the manuscript to include a description of the labeling protocol, on-site verification steps, and quality-control procedures used during field collection. Revision: yes.

Referee: [Abstract] No recording parameters (distances, environments, sampling rates), microphone specifications, or location details are supplied, which are required to assess whether the claimed diversity across 28 firearms and 16 calibers supports generalization claims.

Authors: We acknowledge that these parameters are necessary to evaluate the diversity and generalization claims. We have revised the manuscript to supply the recording parameters, including distances, environments, sampling rates, microphone specifications, and location details. Revision: yes.

Circularity Check

0 steps flagged

No circularity in dataset descriptor paper

This is a descriptive dataset release paper with no equations, derivations, predictions, fitted parameters, or self-referential logic. The central claims concern the composition of the C3GD dataset (>8000 field-collected recordings from 28 firearms across 16 calibers) and its intended uses. No load-bearing steps reduce by construction to inputs, self-citations, or ansatzes. Assertions about field collection yielding lower label noise than web audio are presented as motivations, not as derived results. The paper is self-contained against external benchmarks as a data descriptor.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

This is a dataset release paper; the contribution is empirical data collection and public sharing rather than a theoretical or mathematical claim. No free parameters, axioms, or invented entities are involved.

pith-pipeline@v0.9.1-grok · 5687 in / 1246 out tokens · 51686 ms · 2026-06-26T22:33:25.844738+00:00 · methodology

