An Extended Evaluation Split for DeepSpaceYoloDataset
Pith reviewed 2026-05-07 08:57 UTC · model grok-4.3
The pith
DeepSpaceYoloDataset gains a new test2026 split to evaluate models on more varied deep sky images.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The paper claims that adding the test2026 split to DeepSpaceYoloDataset supplies a more diverse set of images for evaluating YOLO-based models that detect deep sky objects, extending the original 2023 dataset for use in accessible astronomy applications.
What carries the argument
The test2026 split, an added evaluation set of annotated images selected to increase diversity over prior test data.
If this is right
- Detection models can now be assessed against a broader range of deep sky object appearances and imaging conditions.
- Performance numbers obtained on test2026 are intended to reflect real-world use with smart telescopes more closely than earlier splits.
- The updated dataset supports continued development of publicly accessible detection tools for deep sky objects.
Where Pith is reading between the lines
- Future model comparisons could adopt test2026 as a standard benchmark to reduce overfitting to older test distributions.
- If diversity gains hold, the split may highlight failure modes that prior evaluations missed, guiding targeted improvements in YOLO architectures for astronomy.
- The approach of releasing incremental evaluation splits could be applied to other domain-specific detection datasets to track progress more reliably.
Load-bearing premise
The new test2026 images actually deliver greater diversity and share no overlap or unintended similarity with the training data.
What would settle it
Inspection or similarity metrics showing that many test2026 images match or closely resemble images already in the training or validation sets would falsify the claim of increased diversity.
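One way such a similarity check could look is sketched below. It is an illustration under stated assumptions, not the paper's method: images are assumed to be already decoded and downscaled to small grayscale grids (the decoding step is omitted), and the names `average_hash`, `find_near_duplicates`, and the `max_bits` threshold are hypothetical. The sketch uses a simple average hash and flags test images whose hash lies within a few bits of a training image's hash.

```python
from typing import Dict, List, Sequence, Tuple

Grid = Sequence[Sequence[float]]  # small grayscale image, e.g. 8x8 after downscaling


def average_hash(pixels: Grid) -> int:
    """Pack a grid into an integer: one bit per pixel, set where pixel > mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


def find_near_duplicates(
    test: Dict[str, Grid], train: Dict[str, Grid], max_bits: int = 2
) -> List[Tuple[str, str]]:
    """Return (test_name, train_name) pairs whose hashes differ by <= max_bits."""
    train_hashes = {name: average_hash(grid) for name, grid in train.items()}
    pairs = []
    for test_name, grid in test.items():
        test_hash = average_hash(grid)
        for train_name, train_hash in train_hashes.items():
            if hamming(test_hash, train_hash) <= max_bits:
                pairs.append((test_name, train_name))
    return pairs


# Toy data (hypothetical): "x" is a slightly brightened copy of "a", "y" is unrelated.
train = {"a": [[0, 10], [20, 30]]}
test = {"x": [[1, 11], [21, 31]], "y": [[30, 20], [10, 0]]}
suspects = find_near_duplicates(test, train, max_bits=0)
```

A long list of suspect pairs would count against the diversity claim. An empty list is weaker evidence, because a hash this coarse misses crops, rotations, and reprocessed versions of the same field.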
Original abstract
Recent technological advances in astronomy, particularly the growing popularity of smart telescopes for the general public, make it possible to develop highly effective detection solutions that are accessible to a wide audience, rather than being reserved for major scientific observatories. Published in 2023, DeepSpaceYoloDataset is a collection of annotated images created to train YOLO-based models for detecting Deep Sky Objects, particularly suited for Electronically Assisted Astronomy. In this paper, we present an update to DeepSpaceYoloDataset with the addition of a new split, test2026, designed to evaluate detection models with a greater diversity of images.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper updates DeepSpaceYoloDataset (originally published in 2023) by introducing a new evaluation split called test2026. The stated purpose is to enable more robust testing of YOLO-based deep-sky-object detection models by supplying a collection of images with greater diversity than prior splits, in support of Electronically Assisted Astronomy applications.
Significance. A properly validated, more diverse held-out split would be a useful community resource for assessing generalization in astronomical object detection, particularly for consumer-grade smart-telescope imagery. The contribution is modest in scope but directly addresses a practical need in the field; its value, however, hinges on demonstrating that the claimed diversity increase and absence of leakage have actually been achieved.
major comments (1)
- Abstract and main dataset-description section: the manuscript asserts that test2026 supplies “greater diversity of images” yet supplies no quantitative evidence—such as metadata histograms (instrument, exposure, sky conditions), class-balance statistics, feature-space distances, or explicit overlap checks against the training and original test sets—to substantiate the claim. Without these, the central design goal remains an unverified assertion rather than a demonstrated property.
minor comments (1)
- The paper would be strengthened by the addition of a small table or figure showing basic statistics (number of images, object counts, instrument distribution) for test2026 alongside the existing splits.
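The image and object counts for such a table could be gathered along the following lines. This is a minimal sketch that assumes annotations follow the common YOLO text format (one object per line: class id, then normalized x-center, y-center, width, height). The function name `split_stats` and the sample labels are hypothetical, and instrument distribution is left out because it cannot be recovered from label files alone.

```python
from collections import Counter
from typing import Dict


def split_stats(labels: Dict[str, str]) -> dict:
    """Summarize one split from a mapping of label file name -> file contents."""
    class_counts: Counter = Counter()
    for text in labels.values():
        for line in text.splitlines():
            parts = line.split()
            if len(parts) >= 5:  # skip blank or malformed lines
                class_counts[int(parts[0])] += 1
    return {
        "images": len(labels),
        "objects": sum(class_counts.values()),
        "per_class": dict(class_counts),
    }


# Toy labels (hypothetical): one image with two objects, one image with none.
example = {
    "img_001.txt": "0 0.50 0.50 0.10 0.10\n1 0.20 0.20 0.05 0.05\n",
    "img_002.txt": "",
}
stats = split_stats(example)
```

Running the same function over each split gives rows that can be placed side by side for comparison.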
Simulated Author's Rebuttal
We thank the referee for the constructive review and the recommendation for major revision. We address the single major comment below and will update the manuscript to incorporate quantitative evidence supporting the diversity claim for test2026.
Referee: Abstract and main dataset-description section: the manuscript asserts that test2026 supplies “greater diversity of images” yet supplies no quantitative evidence—such as metadata histograms (instrument, exposure, sky conditions), class-balance statistics, feature-space distances, or explicit overlap checks against the training and original test sets—to substantiate the claim. Without these, the central design goal remains an unverified assertion rather than a demonstrated property.
Authors: We agree that the current manuscript does not provide quantitative substantiation for the increased diversity of test2026. The split was constructed by drawing from a wider pool of publicly available astronomical images with varied instruments, exposure settings, and observing conditions, but this was described only qualitatively. In the revised manuscript we will add the requested analyses, including: (i) comparative histograms of metadata (instrument type, exposure time, sky brightness/conditions); (ii) class-balance statistics across splits; (iii) feature-space distance metrics (e.g., average cosine distance between image embeddings from a pre-trained astronomical model); and (iv) explicit overlap checks confirming zero image or source leakage with the training and original test sets. These additions will convert the diversity claim from an assertion into a demonstrated property.
Revision: yes
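The feature-space metric in item (iii) could be computed roughly as follows. This is a sketch under assumptions: embeddings are taken as already extracted, non-zero vectors (the embedding model is not specified here), and the function names are hypothetical. A higher mean pairwise cosine distance within a split suggests a more spread-out set of images in that feature space.

```python
import math
from itertools import combinations
from typing import List, Sequence

Vector = Sequence[float]


def cosine_distance(u: Vector, v: Vector) -> float:
    """1 - cosine similarity; assumes both vectors are non-zero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def mean_pairwise_distance(embeddings: List[Vector]) -> float:
    """Average cosine distance over all unordered pairs (needs >= 2 vectors)."""
    pairs = list(combinations(embeddings, 2))
    return sum(cosine_distance(u, v) for u, v in pairs) / len(pairs)


# Toy embeddings (hypothetical): a tightly clustered split and a spread-out one.
clustered = [[1.0, 0.0], [2.0, 0.0]]
spread = [[1.0, 0.0], [0.0, 1.0]]
d_clustered = mean_pairwise_distance(clustered)
d_spread = mean_pairwise_distance(spread)
```

Comparing this value between test2026 and the earlier test split gives one number per split. It depends heavily on the choice of embedding model, so it would complement the metadata histograms rather than replace them.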
Circularity Check
No circularity: straightforward dataset update with no derivations or fitted quantities
The paper is a data-release description that announces the addition of a new test split (test2026) to DeepSpaceYoloDataset. It contains no equations, no parameter fitting, no predictions, no uniqueness theorems, and no self-citation chains that could reduce any claim to its own inputs. The central statement is simply that the new split is “designed to evaluate detection models with a greater diversity of images,” which is an assertion of intent rather than a derived result. Because no derivation chain exists, no step can be circular by construction.