arxiv: 2604.27593 · v1 · submitted 2026-04-30 · 🌌 astro-ph.IM · cs.CV

An Extended Evaluation Split for DeepSpaceYoloDataset

Olivier Parisot

Pith reviewed 2026-05-07 08:57 UTC · model grok-4.3

keywords: DeepSpaceYoloDataset · YOLO · deep sky objects · object detection · astronomy dataset · test split · electronically assisted astronomy

The pith

DeepSpaceYoloDataset gains a new test2026 split to evaluate models on more varied deep sky images.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper updates the 2023 DeepSpaceYoloDataset, a collection of annotated images for training YOLO models to detect deep sky objects in images from smart telescopes and electronically assisted astronomy. It adds one new evaluation split named test2026, chosen specifically to increase the variety of images seen during testing. A sympathetic reader cares because reliable detection tools for these objects could make advanced sky observation practical for non-professional users rather than only large observatories. The update rests on the premise that greater image diversity in the test set produces more trustworthy model performance numbers.

Core claim

The paper states that the addition of the test2026 split to DeepSpaceYoloDataset supplies a more diverse collection of images for evaluating YOLO-based detection models of deep sky objects, extending the original 2023 dataset for use in accessible astronomy applications.

What carries the argument

The test2026 split, an added evaluation set of annotated images selected to increase diversity over prior test data.

If this is right

Detection models can now be assessed against a broader range of deep sky object appearances and imaging conditions.
Performance numbers obtained on test2026 are intended to reflect real-world use with smart telescopes more closely than earlier splits.
The updated dataset supports continued development of publicly accessible detection tools for deep sky objects.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Future model comparisons could adopt test2026 as a standard benchmark to reduce overfitting to older test distributions.
If diversity gains hold, the split may highlight failure modes that prior evaluations missed, guiding targeted improvements in YOLO architectures for astronomy.
The approach of releasing incremental evaluation splits could be applied to other domain-specific detection datasets to track progress more reliably.

Load-bearing premise

The new test2026 images actually deliver greater diversity and share no overlap or unintended similarity with the training data.

What would settle it

Inspection or similarity metrics showing that many test2026 images match or closely resemble images already in the training or validation sets would falsify the claim of increased diversity.
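
One way such a similarity check could be run is sketched below. It is a minimal illustration, not the author's procedure: it assumes each image has already been reduced to a small grayscale thumbnail (here a toy 4x4 grid of integers), computes a simple average hash, and flags test/train pairs whose hashes differ in only a few bits. The image names, pixel values, and the `max_distance` threshold are all hypothetical.

```python
from itertools import product


def average_hash(pixels):
    """Return a tuple of bits: 1 where a pixel is brighter than the image mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    return tuple(1 if value > mean else 0 for value in flat)


def hamming(a, b):
    """Count the positions at which two equal-length bit tuples differ."""
    return sum(x != y for x, y in zip(a, b))


def find_near_duplicates(test_images, train_images, max_distance=2):
    """Return (test_name, train_name, distance) for pairs within max_distance bits."""
    test_hashes = {name: average_hash(px) for name, px in test_images.items()}
    train_hashes = {name: average_hash(px) for name, px in train_images.items()}
    found = []
    for (t_name, t_hash), (r_name, r_hash) in product(
        test_hashes.items(), train_hashes.items()
    ):
        distance = hamming(t_hash, r_hash)
        if distance <= max_distance:
            found.append((t_name, r_name, distance))
    return found


# Hypothetical toy thumbnails (4x4 grayscale values), for illustration only.
train = {
    "train_a": [[10, 10, 200, 200]] * 4,
}
test = {
    # Slightly noisy copy of train_a: should be flagged.
    "test_copy": [
        [12, 9, 198, 203],
        [11, 10, 201, 199],
        [9, 12, 202, 200],
        [10, 11, 199, 201],
    ],
    # Different layout (bright top half, dark bottom half): should not be flagged.
    "test_new": [[200] * 4, [200] * 4, [10] * 4, [10] * 4],
}

matches = find_near_duplicates(test, train)
print(matches)
```

An empty result on the real splits would be consistent with the no-overlap premise but would not prove it, since an average hash only catches near-identical frames and misses crops, rotations, or different captures of the same target.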

Figures

Figures reproduced from arXiv: 2604.27593 by Olivier Parisot.

**Figure 1.** Distribution of image dimensions (width, height) in the ...

**Figure 2.** Composite image of M1 supernova remnant, captured with a Stellina and ...

**Figure 3.** Composite image of M4 globular cluster, captured with a Stellina and ...

**Figure 4.** Composite image of the NGC4490 spiral galaxy, captured with a Stel...

**Figure 5.** Composite image of the NGC6781 planetary nebula, captured with a ...

Original abstract

Recent technological advances in astronomy, particularly the growing popularity of smart telescopes for the general public, make it possible to develop highly effective detection solutions that are accessible to a wide audience, rather than being reserved for major scientific observatories. Published in 2023, DeepSpaceYoloDataset is a collection of annotated images created to train YOLO-based models for detecting Deep Sky Objects, particularly suited for Electronically Assisted Astronomy. In this paper, we present an update to DeepSpaceYoloDataset with the addition of a new split, test2026, designed to evaluate detection models with a greater diversity of images.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 1 minor

Summary. The paper updates DeepSpaceYoloDataset (originally published in 2023) by introducing a new evaluation split called test2026. The stated purpose is to enable more robust testing of YOLO-based deep-sky-object detection models by supplying a collection of images with greater diversity than prior splits, in support of Electronically Assisted Astronomy applications.

Significance. A properly validated, more diverse held-out split would be a useful community resource for assessing generalization in astronomical object detection, particularly for consumer-grade smart-telescope imagery. The contribution is modest in scope but directly addresses a practical need in the field; its value, however, hinges on demonstrating that the claimed diversity increase and absence of leakage have actually been achieved.

major comments (1)

Abstract and main dataset-description section: the manuscript asserts that test2026 supplies “greater diversity of images” yet supplies no quantitative evidence—such as metadata histograms (instrument, exposure, sky conditions), class-balance statistics, feature-space distances, or explicit overlap checks against the training and original test sets—to substantiate the claim. Without these, the central design goal remains an unverified assertion rather than a demonstrated property.

minor comments (1)

The paper would be strengthened by the addition of a small table or figure showing basic statistics (number of images, object counts, instrument distribution) for test2026 alongside the existing splits.
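
The kind of per-split summary the referee asks for could be produced along the following lines. This is a sketch under assumptions: it relies on the common YOLO label convention of one object per line (`class x_center y_center width height`, normalized coordinates), and the file names, class ids, and label contents are made up for illustration rather than taken from the dataset.

```python
from collections import Counter


def summarize_split(labels_by_image):
    """Summarize one split.

    labels_by_image maps a label file name to the text of that YOLO label file.
    An empty string means the image has no annotated object.
    """
    class_counts = Counter()
    for text in labels_by_image.values():
        for line in text.splitlines():
            parts = line.split()
            if len(parts) != 5:
                continue  # skip blank or malformed lines
            class_counts[int(parts[0])] += 1
    return {
        "images": len(labels_by_image),
        "objects": sum(class_counts.values()),
        "per_class": dict(class_counts),
    }


# Hypothetical label contents for three images of one split.
example_split = {
    "img1.txt": "0 0.5 0.5 0.2 0.2\n0 0.1 0.1 0.05 0.05\n",
    "img2.txt": "",
    "img3.txt": "1 0.3 0.7 0.1 0.1\n",
}

summary = summarize_split(example_split)
print(summary)
```

Running the same function over each split and placing the results side by side would give the number of images and object counts the comment mentions; instrument distribution would need metadata that label files do not carry.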

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive review and the recommendation for major revision. We address the single major comment below and will update the manuscript to incorporate quantitative evidence supporting the diversity claim for test2026.

Point-by-point responses

Referee: Abstract and main dataset-description section: the manuscript asserts that test2026 supplies “greater diversity of images” yet supplies no quantitative evidence—such as metadata histograms (instrument, exposure, sky conditions), class-balance statistics, feature-space distances, or explicit overlap checks against the training and original test sets—to substantiate the claim. Without these, the central design goal remains an unverified assertion rather than a demonstrated property.

Authors: We agree that the current manuscript does not provide quantitative substantiation for the increased diversity of test2026. The split was constructed by drawing from a wider pool of publicly available astronomical images with varied instruments, exposure settings, and observing conditions, but this was described only qualitatively. In the revised manuscript we will add the requested analyses, including: (i) comparative histograms of metadata (instrument type, exposure time, sky brightness/conditions); (ii) class-balance statistics across splits; (iii) feature-space distance metrics (e.g., average cosine distance between image embeddings from a pre-trained astronomical model); and (iv) explicit overlap checks confirming zero image or source leakage with the training and original test sets. These additions will convert the diversity claim from an assertion into a demonstrated property.

Revision: yes
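
The feature-space metric named in item (iii) can be illustrated with a short sketch. It assumes embeddings have already been extracted by some model and are available as plain lists of floats; the two-dimensional vectors and the split names below are invented, and mean pairwise cosine distance is only one of several possible diversity measures.

```python
import math
from itertools import combinations


def cosine_distance(u, v):
    """Return 1 minus the cosine similarity of two nonzero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def mean_pairwise_distance(embeddings):
    """Average cosine distance over all unordered pairs of embeddings."""
    pairs = list(combinations(embeddings, 2))
    if not pairs:
        raise ValueError("need at least two embeddings")
    return sum(cosine_distance(u, v) for u, v in pairs) / len(pairs)


# Hypothetical embeddings: a tightly clustered split and a more spread-out one.
clustered_split = [[1.0, 0.0], [1.0, 0.1], [0.9, 0.0]]
spread_split = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

clustered_score = mean_pairwise_distance(clustered_split)
spread_score = mean_pairwise_distance(spread_split)
print(clustered_score, spread_score)
```

A higher score for the new split than for the earlier test split would support the diversity claim, with the caveat that the result depends on the embedding model chosen.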

Circularity Check

0 steps flagged

No circularity: straightforward dataset update with no derivations or fitted quantities

Full rationale

The paper is a data-release description that announces the addition of a new test split (test2026) to DeepSpaceYoloDataset. It contains no equations, no parameter fitting, no predictions, no uniqueness theorems, and no self-citation chains that could reduce any claim to its own inputs. The central statement is simply that the new split is “designed to evaluate detection models with a greater diversity of images,” which is an assertion of intent rather than a derived result. Because no derivation chain exists, no step can be circular by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

No free parameters, axioms, or invented entities are required; the paper is a dataset extension without mathematical modeling or new theoretical constructs.
