
arxiv: 2503.01605 · v2 · submitted 2025-03-03 · 💻 cs.CV

A Leaf-Level Dataset for Soybean-Cotton Detection and Segmentation

Pith reviewed 2026-05-23 01:39 UTC · model grok-4.3

classification 💻 cs.CV
keywords soybean · cotton · leaf detection · instance segmentation · agricultural dataset · YOLO · weed management · precision agriculture

The pith

A new dataset of 640 field images provides leaf-level annotations for soybean and cotton detection amid overlaps.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper collects 640 high-resolution images from a commercial farm across growth stages, weed pressures, and lighting conditions, then annotates them with bounding boxes and segmentation masks for 7,221 soybean leaves and 5,190 cotton leaves (12,411 instances in total). This fills gaps in prior datasets that miss real-world complexities such as overlapping foliage and similar leaf shapes. Validation with YOLOv11 shows strong detection and segmentation results on these images. Readers would care because the data directly supports tools for selective spraying and pest control in mixed soybean-cotton systems.
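As a quick sanity check on the figures quoted above, the following sketch derives the total instance count, the class balance, and the mean number of annotated leaves per image. Only the three input numbers come from the paper; the function and key names are illustrative.

```python
# Leaf and image counts as reported in the summary above.
SOYBEAN_LEAVES = 7221
COTTON_LEAVES = 5190
NUM_IMAGES = 640


def dataset_summary(soy, cotton, images):
    """Return total instances, per-class share, and mean instances per image."""
    total = soy + cotton
    return {
        "total_leaves": total,
        "soybean_share": soy / total,
        "cotton_share": cotton / total,
        "leaves_per_image": total / images,
    }


summary = dataset_summary(SOYBEAN_LEAVES, COTTON_LEAVES, NUM_IMAGES)
print(summary["total_leaves"])                 # 12411
print(round(summary["soybean_share"], 3))      # 0.582
print(round(summary["leaves_per_image"], 1))   # 19.4
```

The roughly 58/42 split suggests a moderate class imbalance toward soybean, which is worth keeping in mind when reading per-class error rates.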

Core claim

The authors create and release a leaf-instance dataset drawn from actual commercial fields that records individual soybean and cotton leaves with both boxes and masks, explicitly including overlaps, small sizes, and morphological similarities, and they show that YOLOv11 trained on it achieves state-of-the-art identification and segmentation performance.

What carries the argument

Leaf-instance annotations consisting of bounding boxes and segmentation masks applied to 640 high-resolution images collected across multiple growth stages and field conditions.

If this is right

  • Enables training of models for selective herbicide application that targets only volunteer plants and weeds.
  • Supports automated pest monitoring systems that operate at the leaf level in complex canopies.
  • Provides a public benchmark for comparing future detection and segmentation algorithms in soybean-cotton settings.
  • Facilitates data-driven crop management strategies that reduce unnecessary chemical use.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same annotation style could be extended to other row crops that face volunteer-plant problems.
  • Pairing the dataset with temporal sequences from the same fields would allow testing of growth-stage tracking models.
  • Performance gains from this data may translate to lower overall herbicide volumes when deployed on sprayers with leaf-level targeting.

Load-bearing premise

The manual leaf annotations are accurate and the 640 images sufficiently represent the variability of real commercial soybean-cotton fields across growth stages and conditions.

What would settle it

If a model trained on this dataset shows large drops in accuracy when tested on images from different farms, seasons, or equipment that were not represented in the original collection, the dataset's claimed coverage of real-world variability would be undermined.
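One way to run such a test is a group-wise hold-out, where every image from a given site (or season, or camera) lands entirely in either the training or the test split. The sketch below is a minimal illustration under assumptions: the `site` field, the file names, and the split fraction are hypothetical and do not come from the paper or its released metadata.

```python
import random
from collections import defaultdict


def split_by_group(records, group_key, test_fraction=0.25, seed=0):
    """Split records so that no group appears in both train and test."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec[group_key]].append(rec)

    # Shuffle group names (not individual images) to pick held-out groups.
    names = sorted(groups)
    random.Random(seed).shuffle(names)
    n_test = max(1, round(len(names) * test_fraction))
    test_names = set(names[:n_test])

    train = [r for g in names if g not in test_names for r in groups[g]]
    test = [r for g in names if g in test_names for r in groups[g]]
    return train, test


# Hypothetical metadata: image file names tagged with a collection site.
records = [
    {"image": "img_001.jpg", "site": "field_a"},
    {"image": "img_002.jpg", "site": "field_a"},
    {"image": "img_003.jpg", "site": "field_b"},
    {"image": "img_004.jpg", "site": "field_c"},
    {"image": "img_005.jpg", "site": "field_d"},
]
train, test = split_by_group(records, "site")
print(len(train), len(test))
```

A large gap between accuracy on a random split and accuracy on a group-wise split of this kind would be the signal described above.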

Figures

Figures reproduced from arXiv: 2503.01605 by João Manoel Herrera Pinheiro, Juliano Negri, Marcelo Becker, Paulo H. Polegato, Ricardo V. Godoy, Thiago H. Segreto.

Figure 1
Ground-truth annotations include detection bounding boxes, shown in the first row, and segmentation masks, shown in the second row.
Despite the importance of robust training data, comprehensive real-world agricultural datasets, particularly those supporting instance segmentation, remain scarce. For example, the Moving Fields Weed Dataset¹⁹ provides numerous annotations but is constrained to controlled indo…
Figure 2
Growth stage variations in soybean and cotton fields. A 3×3 grid of raw images illustrates early (a–c), middle (d–f), and dense (g–i) canopy stages. In the early stage (1–3 weeks), sparse foliage and minimal leaf overlap simplify segmentation but offer limited complexity. The middle stage (4–7 weeks) introduces denser coverage, partial occlusions, and moderate weed presence. The dense stage (8–10 weeks) ex…
Figure 3
Satellite imagery of the data collection site in Jaboticabal, São Paulo, Brazil. The red polygon delineates the farm’s specific boundaries where image acquisition took place. Maps data: Google and Airbus 2025.
This choice ensured a realistic context where the target crops coexisted with non-target vegetation. To further illustrate the progression of crop development across the ten-week span, …
Figure 4
Illustration of the dataset creation and annotation workflow. Field images are first acquired under near-vertical perspectives and filtered to remove near-identical samples. Experts and a reviewer then generate initial segmentation masks and bounding boxes in CVAT, assisted by the SAM. Connected component analysis eliminates small “blob” artifacts in the masks, and any duplicate labels are merged using a 9…
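The blob-removal step in this workflow can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes 4-connectivity and a hypothetical `min_area` threshold, neither of which is stated in the excerpt, and it uses plain Python lists rather than whatever image library the authors used.

```python
from collections import deque


def remove_small_blobs(mask, min_area):
    """Zero out 4-connected foreground components smaller than min_area pixels.

    mask is a non-empty nested list of 0/1 values; min_area is a hypothetical
    threshold chosen for illustration only.
    """
    height, width = len(mask), len(mask[0])
    seen = [[False] * width for _ in range(height)]
    cleaned = [row[:] for row in mask]

    for y in range(height):
        for x in range(width):
            if mask[y][x] == 0 or seen[y][x]:
                continue
            # Breadth-first search collects one connected component.
            component = []
            queue = deque([(y, x)])
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                component.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < height and 0 <= nx < width:
                        if mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            # Components below the area threshold are treated as artifacts.
            if len(component) < min_area:
                for cy, cx in component:
                    cleaned[cy][cx] = 0
    return cleaned


mask = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0],
]
cleaned = remove_small_blobs(mask, min_area=3)
for row in cleaned:
    print(row)
```

In this toy mask the 2×2 block survives while the two isolated pixels are removed. In practice the threshold would need tuning so that genuinely small leaves, which the dataset explicitly includes, are not discarded along with the artifacts.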
Figure 5
Detection comparison. The left image (GT) depicts ground-truth bounding boxes for soybean (yellow) and cotton (purple). The right image (Pred) shows the model’s bounding-box outputs with confidence scores (blue).
Technical Validation
This section outlines the procedures and results used to validate the dataset for both object detection and instance segmentation tasks. We first detail the rationale for sele…
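Comparing predicted boxes against ground truth, as in this figure, usually comes down to an intersection-over-union (IoU) test. The sketch below shows one common greedy matching scheme; it assumes boxes in `(x1, y1, x2, y2)` format and an IoU threshold of 0.5, and it is not taken from the paper's evaluation code.

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def match_detections(gt_boxes, predictions, iou_threshold=0.5):
    """Greedily match (box, score) predictions to ground truth.

    Predictions are visited in order of decreasing confidence; each ground-truth
    box can be matched at most once. Returns (tp, fp, fn).
    """
    unmatched = list(range(len(gt_boxes)))
    tp = fp = 0
    for box, _score in sorted(predictions, key=lambda p: p[1], reverse=True):
        best_idx, best_iou = None, iou_threshold
        for idx in unmatched:
            iou = box_iou(box, gt_boxes[idx])
            if iou >= best_iou:
                best_idx, best_iou = idx, iou
        if best_idx is None:
            fp += 1
        else:
            tp += 1
            unmatched.remove(best_idx)
    return tp, fp, len(unmatched)


# Illustrative boxes only; not drawn from the dataset.
gt = [(0, 0, 10, 10), (20, 20, 30, 30)]
preds = [((1, 1, 10, 10), 0.9), ((50, 50, 60, 60), 0.8)]
print(match_detections(gt, preds))  # (1, 1, 1)
```

Here the first prediction overlaps a ground-truth box with IoU 0.81 and counts as a true positive, the second overlaps nothing and is a false positive, and one ground-truth leaf goes undetected.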
Figure 6
Segmentation comparison. The left image (GT) illustrates manual annotations, with soybean leaves in yellow and cotton leaves in purple, whereas the right image (Pred) presents the model-generated segmentation masks.
… fraction of ground-truth objects or regions successfully identified. High recall implies fewer missed detections (i.e., fewer false negatives). The F1-score is the harmonic mean of precision an…
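The metrics named in this excerpt follow the standard definitions: precision is TP / (TP + FP), recall is TP / (TP + FN), and the F1-score is their harmonic mean. A short sketch, using made-up counts rather than any result from the paper:

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall, and F1 from true/false positive and false negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # Harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts for illustration: 8 correct detections,
# 2 spurious detections, 2 missed leaves.
p, r, f1 = precision_recall_f1(tp=8, fp=2, fn=2)
print(round(p, 2), round(r, 2), round(f1, 2))  # 0.8 0.8 0.8
```

Because the harmonic mean is dominated by the smaller of its two inputs, a model cannot reach a high F1 by maximizing recall at the expense of precision, or the reverse.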
Figure 7
Detection and segmentation across soybean and cotton growth stages. This 2×3 grid displays YOLO11m outputs: the top row shows predicted bounding boxes, and the bottom row presents corresponding segmentation masks for early, mid, and dense leaf maturity (left to right). Early-stage predictions exhibit clear leaf separation, while dense-stage outputs reflect challenges posed by occlusions and overlapping fol…
Figure 8
Effect of dataset size on YOLOv11 performance for detection (yellow for S, orange for M, red for X) and segmentation (green for S, dark green for M, dark blue for X). The x-axis indicates the total number of annotated objects used in training, and the y-axis shows mAP50–95. Detection appears to plateau near 7,000 objects, while segmentation gains persist until roughly 9,000–11,000 objects. An exponential c…
Figure 9
Discretized error analysis across growth stages. The bar charts compare normalized error rates for Object Detection (left) and Instance Segmentation (right). While misclassification (purple) remains negligible, Instance Segmentation suffers from higher false positive rates (light blue/orange), particularly for Soybean in the Late stage (22.4%), likely due to its morphological similarity to background weeds…
Figure 10
Qualitative analysis of critical failure modes. The top row illustrates the most significant false positive and false negative cases for soybean, while the bottom row displays the corresponding errors for cotton. Consistent with the quantitative findings, errors are primarily driven by severe leaf-on-leaf occlusion in dense canopies and morphological similarities between soybean foliage and background wee…

Original abstract

Soybean and cotton are major drivers of many countries' agricultural sectors, offering substantial economic returns but also facing persistent challenges from volunteer plants and weeds that hamper sustainable management. Effectively controlling volunteer plants and weeds demands advanced recognition strategies that can identify these amidst complex crop canopies. While deep learning methods have demonstrated promising results for leaf-level detection and segmentation, existing datasets often fail to capture the complexity of real-world agricultural fields. To address this, we collected 640 high-resolution images from a commercial farm spanning multiple growth stages, weed pressures, and lighting variations. Each image is annotated at the leaf-instance level, with 7,221 soybean and 5,190 cotton leaves labeled via bounding boxes and segmentation masks, capturing overlapping foliage, small leaf size, and morphological similarities. We validate this dataset using YOLOv11, demonstrating state-of-the-art performance in accurately identifying and segmenting overlapping foliage. Our publicly available dataset supports advanced applications such as selective herbicide spraying and pest monitoring and can foster more robust, data-driven strategies for soybean-cotton management.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 0 minor

Summary. The paper introduces a publicly available leaf-level dataset of 640 high-resolution images collected from commercial soybean-cotton fields across multiple growth stages, weed pressures, and lighting conditions. The dataset provides instance-level annotations consisting of 7,221 soybean and 5,190 cotton leaves with bounding boxes and segmentation masks that capture overlaps, small sizes, and morphological similarities. The authors validate the dataset by training YOLOv11 and claim state-of-the-art performance for detection and segmentation of overlapping foliage.

Significance. If the ground-truth annotations are shown to be reliable and the images adequately sample real-field variability, the dataset would provide a useful benchmark for precision-agriculture computer-vision tasks such as selective spraying and volunteer-plant monitoring. Public release of the data is a clear positive.

major comments (3)
  1. [Abstract] Abstract: the claim that YOLOv11 validation demonstrates 'state-of-the-art performance' is unsupported because no quantitative metrics, baseline comparisons, train/test protocol, or error analysis are supplied. This directly undermines the central empirical claim of the work.
  2. [Dataset Collection and Annotation] Dataset Collection and Annotation (implied section): no inter-annotator agreement, annotation protocol for ambiguous overlapping boundaries, or multi-expert review is reported for the 12,411 leaf masks. Because these masks constitute the sole ground truth for the YOLOv11 experiments, their unquantified quality is load-bearing for any performance interpretation.
  3. [Dataset description] Dataset description: the statement that the 640 images 'sufficiently represent the variability of real commercial soybean-cotton fields across growth stages and conditions' is asserted without supporting statistics on growth-stage distribution, weed-pressure coverage, or lighting variation.
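The inter-annotator agreement requested in the second comment is often reported as the mean mask IoU between two annotators' outlines of the same leaves. The sketch below shows that computation on toy masks; it assumes masks are already paired leaf-by-leaf and stored as equal-sized 0/1 grids, and it is an illustration of the metric rather than anything the authors report.

```python
def mask_iou(mask_a, mask_b):
    """IoU between two binary masks given as equal-sized nested lists of 0/1."""
    inter = union = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            inter += 1 if (a and b) else 0
            union += 1 if (a or b) else 0
    # Two empty masks agree perfectly by convention.
    return inter / union if union else 1.0


def mean_agreement(mask_pairs):
    """Average mask IoU over (annotator_1_mask, annotator_2_mask) pairs."""
    if not mask_pairs:
        raise ValueError("need at least one pair of masks")
    scores = [mask_iou(a, b) for a, b in mask_pairs]
    return sum(scores) / len(scores)


# Toy example: two annotators outline the same leaf with a one-column offset.
annotator_1 = [[1, 1, 0], [1, 1, 0]]
annotator_2 = [[0, 1, 1], [0, 1, 1]]
print(round(mask_iou(annotator_1, annotator_2), 3))  # 0.333
```

Reporting this separately for sparse and dense canopies would show whether agreement degrades where leaves overlap, which is where the referee's concern is sharpest.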

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive and detailed comments. We address each major point below and commit to revisions that strengthen the empirical support, annotation transparency, and dataset characterization without overstating current content.

Point-by-point responses
  1. Referee: [Abstract] Abstract: the claim that YOLOv11 validation demonstrates 'state-of-the-art performance' is unsupported because no quantitative metrics, baseline comparisons, train/test protocol, or error analysis are supplied. This directly undermines the central empirical claim of the work.

    Authors: We agree the abstract claim is not supported by the quantitative details listed. The manuscript presents YOLOv11 results primarily to illustrate dataset usability rather than to establish rigorous SOTA benchmarks. In revision we will either remove or qualify the 'state-of-the-art' phrasing and, where feasible, add the requested metrics, baselines, protocol description, and error analysis. revision: yes

  2. Referee: [Dataset Collection and Annotation] Dataset Collection and Annotation (implied section): no inter-annotator agreement, annotation protocol for ambiguous overlapping boundaries, or multi-expert review is reported for the 12,411 leaf masks. Because these masks constitute the sole ground truth for the YOLOv11 experiments, their unquantified quality is load-bearing for any performance interpretation.

    Authors: The concern is valid; the current text does not report inter-annotator agreement or explicit protocols for overlap cases. We will add a detailed description of the annotation workflow and guidelines used for ambiguous boundaries. If agreement statistics or additional expert review can be obtained or computed from existing records, they will be included; otherwise the limitations will be stated explicitly. revision: partial

  3. Referee: [Dataset description] Dataset description: the statement that the 640 images 'sufficiently represent the variability of real commercial soybean-cotton fields across growth stages and conditions' is asserted without supporting statistics on growth-stage distribution, weed-pressure coverage, or lighting variation.

    Authors: We will revise the dataset section to include summary statistics (e.g., counts or percentages) on growth-stage distribution, weed-pressure levels, and lighting conditions across the 640 images to substantiate the representativeness statement. revision: yes
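The growth-stage statistics promised here could be produced with a simple tally. The sketch below uses the week ranges given in the Figure 2 caption (early 1–3, middle 4–7, dense 8–10 weeks); the per-image week values are hypothetical, since the excerpt does not say how capture dates are recorded.

```python
from collections import Counter

STAGES = ("early", "middle", "dense")


def stage_for_week(week):
    """Map a week number to the growth stage ranges given in the Figure 2 caption."""
    if 1 <= week <= 3:
        return "early"
    if 4 <= week <= 7:
        return "middle"
    if 8 <= week <= 10:
        return "dense"
    raise ValueError(f"week {week} is outside the ten-week collection span")


def stage_distribution(weeks):
    """Return {stage: (count, share)} for a list of per-image week numbers."""
    counts = Counter(stage_for_week(w) for w in weeks)
    total = sum(counts.values())
    return {
        stage: (counts[stage], counts[stage] / total if total else 0.0)
        for stage in STAGES
    }


# Hypothetical per-image capture weeks, for illustration only.
example_weeks = [1, 2, 5, 6, 7, 9, 10, 10]
for stage, (count, share) in stage_distribution(example_weeks).items():
    print(f"{stage}: {count} images ({share:.1%})")
```

The same tally applies to weed-pressure levels or lighting conditions once each image carries the corresponding label.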

Circularity Check

0 steps flagged

No circularity: dataset paper with external validation

Full rationale

This is a dataset collection and empirical validation paper with no derivations, equations, fitted parameters, or load-bearing self-citations. The central claim rests on manual annotations of 640 images validated against the external YOLOv11 model, which is independent of the authors' prior work. No step reduces by construction to its inputs; the work is self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

This is an empirical dataset paper; no mathematical modeling, free parameters, axioms, or invented entities are present.

pith-pipeline@v0.9.0 · 5736 in / 1040 out tokens · 46478 ms · 2026-05-23T01:39:40.106931+00:00 · methodology


Reference graph

Works this paper leans on

36 extracted references · 36 canonical work pages · 3 internal anchors

  1. [1]

    Companhia Nacional de Abastecimento (Conab). Primeira estimativa para safra de grãos 2024/25 indica produção de 322,47 milhões de toneladas (2024). Accessed: 28/11/2024

  2. [2]

    Food and Agriculture Organization of the United Nations. World Food and Agriculture – Statistical Yearbook 2023 (FAO, Rome, 2023). 3. Embrapa Algodão. Cultura do Algodão no Cerrado (Embrapa, Brasília, Brasil, 2017). 4. Agrolink. Como manejar as tigueras no algodoeiro (2023). Accessed: 28/11/2024

  3. [3]

    Instituto Mato-Grossense do Algodão. Circular técnica n 41: Destruição de soqueiras do algodoeiro (2019). Accessed: 28/11/2024

  4. [4]

    Cavenaghi, A. L., Antuniassi, U. R., Correia, N. M. & Belapart, D. Manejo químico de plantas voluntárias de algodão rr. In XXVIII Congresso Brasileiro da Ciência das Plantas Daninhas (Sociedade Brasileira da Ciência das Plantas Daninhas, Campo Grande, Brasil, 2012)

  5. [5]

    Ikeda, F. S., Silva, C. P., Lima, R. S. & Oliveira, R. A. Comunidade de plantas daninhas e tiguera de algodão após a aplicação de herbicidas e adubação nitrogenada de cobertura em diferentes épocas no consórcio de milho com *brachiaria ruziziensis*. In XXVII Congresso Brasileiro da Ciência das Plantas Daninhas (Sociedade Brasileira da Ciência das Plantas Da...

  6. [6]

    Belot, J.-L. Manual de Boas Práticas de Manejo do Algodoeiro em Mato Grosso. Cuiabá, MT, 1ª edição edn. (2012). ISBN: 978-85-66457-00-1

  7. [7]

    Wu, Z., Chen, Y., Zhao, B., Kang, X. & Ding, Y. Review of weed detection methods based on computer vision. Sensors 21, 3647, 10.3390/s21113647 (2021)

  8. [8]

    Silva, J. A. O. S. et al. Deep learning for weed detection and segmentation in agricultural crops using images captured by an unmanned aerial vehicle. Remote. Sens. 16, 4394, 10.3390/rs16234394 (2024)

  9. [9]

    C, S. G. et al. Field-based multispecies weed and crop detection using ground robots and advanced yolo models: A data and model-centric approach. Smart Agric. Technol. 9, 100538, 10.1016/j.atech.2024.100538 (2024)

  10. [10]

    Champ, J. et al. Instance segmentation for the fine detection of crop and weed plants by precision agricultural robots. Appl. Plant Sci. 8, e11373, https://doi.org/10.1002/aps3.11373 (2020)

  11. [11]

    Czymmek, V., Harders, L. O., Knoll, F. J. & Hussmann, S. Vision-based deep learning approach for real-time detection of weeds in organic farming. In 2019 IEEE International Instrumentation and Measurement Technology Conference (I2MTC), 1–5 (IEEE, 2019)

  12. [12]

    Chen, D., Lu, Y., Li, Z. & Young, S. Performance evaluation of deep transfer learning on multi-class identification of common weed species in cotton production systems. Comput. Electron. Agric. 198, 107091 (2022)

  13. [13]

    Yano, I. H., Alves, J. R., Santiago, W. E. & Mederos, B. J. Identification of weeds in sugarcane fields through images taken by uav and random forest classifier. IFAC-PapersOnLine 49, 415–420 (2016)

  14. [14]

    Madec, S. et al. Vegann, vegetation annotation of multi-crop rgb images acquired under diverse conditions for segmentation. Sci. Data 10, 302 (2023)

  15. [15]

    Genze, N. et al. Manually annotated and curated dataset of diverse weed species in maize and sorghum for computer vision. Sci. Data 11, 109, 10.1038/s41597-024-02945-6 (2024)

  16. [16]

    Minervini, M., Fischbach, A., Scharr, H. & Tsaftaris, S. A. Finely-grained annotated datasets for image-based plant phenotyping. Pattern Recognit. Lett. 81, 80–89, https://doi.org/10.1016/j.patrec.2015.10.013 (2016)

  17. [17]

    Zenkl, R. et al. Outdoor plant segmentation with deep learning for high-throughput field phenotyping on a diverse wheat dataset. Front. Plant Sci. 12, 774068, 10.3389/fpls.2021.774068 (2022)

  18. [18]

    Fan, X., Zhou, R., Tjahjadi, T., Choudhury, S. D. & Ye, Q. A segmentation-guided deep learning framework for leaf counting. Front. Plant Sci. 13, 844522, 10.3389/fpls.2022.844522 (2022)

  19. [19]

    Woebbecke, D. M., Meyer, G. E., Von Bargen, K. & Mortensen, D. A. Color indices for weed identification under various soil, residue, and lighting conditions. Transactions ASAE 38, 259–269 (1995)

  20. [20]

    Gongal, A., Amatya, S., Karkee, M., Zhang, Q. & Lewis, K. Sensors and systems for fruit detection and localization: A review. Comput. Electron. Agric. 116, 8–19 (2015). 25. Corporation, C. Computer vision annotation tool (cvat), 10.5281/zenodo.4009388 (2023)

  21. [21]

    Kirillov, A. et al. Segment anything. In 2023 IEEE/CVF International Conference on Computer Vision (ICCV), 3992–4003, 10.1109/ICCV51070.2023.00371 (2023). 27. Bradski, G. The OpenCV Library. Dr. Dobb’s J. Softw. Tools (2000)

  22. [22]

    Tan, Z. et al. Large language models for data annotation and synthesis: A survey. arXiv preprint arXiv:2402.13446 (2024). 29. Carion, N. et al. Sam 3: Segment anything with concepts (2025). 2511.16719

  23. [23]

    Sapkota, R., Paudel, A. & Karkee, M. Zero-shot automatic annotation and instance segmentation using llm-generated datasets: Eliminating field imaging and manual annotation for deep learning model development (2025). 2411.11285

  24. [24]

    Silva, T. H. S. A leaf-level dataset for soybean–cotton detection and segmentation (2025). 10.6084/m9.figshare.28466636. v3. 32. Pedregosa, F. et al. Scikit-learn: Machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011)

  25. [25]

    He, K., Gkioxari, G., Dollár, P. & Girshick, R. Mask r-cnn. In Proceedings of the IEEE international conference on computer vision, 2961–2969 (2017)

  26. [26]

    Kotthapalli, M., Ravipati, D. & Bhatia, R. YOLOv1 to YOLOv11: A Comprehensive Survey of Real-Time Object Detection Innovations and Challenges. arXiv preprint arXiv:2508.02067 (2025). 2508.02067

  27. [27]

    Jeghamh, N., Koh, C. Y., Abdelatti, M. & Hendawi, A. YOLO Evolution: A Comprehensive Benchmark and Architectural Review of YOLOv12, YOLO11, and Their Previous Versions. arXiv preprint arXiv:2411.00201 (2025). 2411.00201v4

  28. [28]

    Varghese, R. & M., S. Yolov8: A novel object detection algorithm with enhanced performance and robustness. In 2024 International Conference on Advances in Data Engineering and Intelligent Computing Systems (ADICS), 1–6, 10.1109/ADICS58448.2024.10533619 (2024). 37. Wang, A. et al. YOLOv10: Real-Time End-to-End Object Detection (2024). ArXiv:2405.14458

  29. [29]

    Khanam, R. & Hussain, M. YOLOv11: An Overview of the Key Architectural Enhancements. arXiv preprint arXiv:2410.17725 (2024). 2410.17725

  30. [30]

    Sapkota, R. & Karkee, M. Ultralytics Yolo Evolution: An Overview Of YOLO26, YOLO11, YOLOv8, And YOLOv5 Object Detectors For Computer Vision And Pattern Recognition. arXiv preprint arXiv:2510.09653 (2025). 2510.09653. 40. Jocher, G. & Qiu, J. Ultralytics yolo11 (2024)

  31. [31]

    Liu, L. et al. Deep Learning for Generic Object Detection: A Survey, 10.48550/arXiv.1809.02165 (2019). ArXiv:1809.02165 [cs]

  32. [32]

    Bergstra, J., Yamins, D. & Cox, D. Making a science of model search: Hyperparameter optimization in hundreds of dimensions for vision architectures. In Dasgupta, S. & McAllester, D. (eds.) Proceedings of the 30th International Conference on Machine Learning, vol. 28 of Proceedings of Machine Learning Research, 115–123 (PMLR, Atlanta, Georgia, USA, 2013)

  33. [33]

    Watanabe, S. Tree-Structured Parzen Estimator: Understanding Its Algorithm Components and Their Roles for Better Empirical Performance, 10.48550/arXiv.2304.11127 (2023). ArXiv:2304.11127 [cs]

  34. [34]

    Ozaki, Y., Tanigaki, Y., Watanabe, S., Nomura, M. & Onishi, M. Multiobjective Tree-Structured Parzen Estimator. J. Artif. Intell. Res. 73, 1209–1250, 10.1613/jair.1.13188 (2022)

  35. [35]

    Akiba, T., Sano, S., Yanase, T., Ohta, T. & Koyama, M. Optuna: A Next-generation Hyperparameter Optimization Framework, 10.48550/arXiv.1907.10902 (2019). ArXiv:1907.10902 [cs]

  36. [36]

    ClearML. Clearml - your entire mlops stack in one open-source tool (2024). Software available from http://github.com/clearml/clearml. 47. Kingma, D. P. & Ba, J. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)