arxiv: 2606.31772 · v1 · pith:YUP44JDTnew · submitted 2026-06-30 · 💻 cs.RO

Autonomous UAV Navigation for Individual Wildlife Re-Identification

Pith reviewed 2026-07-01 05:18 UTC · model grok-4.3

classification 💻 cs.RO
keywords UAV navigation · wildlife re-identification · individual animal tracking · drone image capture · pose classification · ecological data collection · real-time control · lateral flank

The pith

An autonomous drone system uses real-time detection and pose classification to capture images that meet the needs of individual wildlife re-identification.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents an active navigation method for UAVs that detects animals, orients to show the lateral flank, and approaches until the subject fills enough of the frame for downstream re-ID models. This differs from earlier drone work that captured general behavioral video of groups. The approach relies on YOLOv11 to locate animals and a DINOv2 classifier to confirm the desired viewpoint in real time. A case study with zebra footage from Kenya demonstrates the pipeline, and the authors state that the same logic applies to giraffes, tigers, and elephants. If the method holds, it would reduce reliance on manual image collection for population and behavioral studies.

Core claim

The paper presents a closed-loop UAV controller that fuses YOLOv11 detection with DINOv2 pose classification to drive flight decisions: locate the animal, rotate to expose the lateral flank, and close distance until a minimum bounding-box size is reached. The resulting images are meant to satisfy the input requirements of pattern-based individual re-identification models rather than to optimize for group-level observation.
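The locate, rotate, and approach sequence described above can be read as a small decision rule over per-frame perception output. The following is a minimal sketch under stated assumptions, not the authors' implementation: the `Observation` structure, the phase names, and both threshold values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional


class Phase(Enum):
    DETECT = auto()
    ORIENT = auto()
    APPROACH = auto()
    CAPTURE = auto()


@dataclass
class Observation:
    """One frame's perception output (hypothetical structure)."""
    box_area_fraction: Optional[float]  # box area / frame area, None if no animal found
    lateral_prob: float = 0.0           # classifier probability of a lateral-flank view


# Illustrative thresholds only; these are not values reported in the paper.
LATERAL_THRESHOLD = 0.8
MIN_BOX_FRACTION = 0.10


def next_phase(obs: Observation) -> Phase:
    """Choose the phase for the current frame from perception output alone."""
    if obs.box_area_fraction is None:
        return Phase.DETECT      # nothing in view: keep searching
    if obs.lateral_prob < LATERAL_THRESHOLD:
        return Phase.ORIENT      # animal seen, flank not exposed: reposition
    if obs.box_area_fraction < MIN_BOX_FRACTION:
        return Phase.APPROACH    # flank exposed, subject too small: close distance
    return Phase.CAPTURE         # both requirements met: save the frame
```

The rule is deliberately stateless, so a lost detection or a turning animal sends the drone back to an earlier phase on the next frame.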

What carries the argument

YOLOv11 object detector paired with a DINOv2-based lateral-flank pose classifier that supplies real-time control signals for orientation and approach distance.
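One way a detection box and a pose probability could become control signals is a simple proportional mapping. The sketch below is an assumption made for illustration; the page does not describe the paper's control law. `Box`, `control_signals`, the normalized command ranges, and the default thresholds are all hypothetical names and values.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Box:
    """Axis-aligned detection box in pixels (x1, y1 top-left; x2, y2 bottom-right)."""
    x1: float
    y1: float
    x2: float
    y2: float


def clamp(value: float, low: float, high: float) -> float:
    return max(low, min(high, value))


def control_signals(
    box: Box,
    frame_w: float,
    frame_h: float,
    lateral_prob: float,
    min_fraction: float = 0.10,      # illustrative bounding-box threshold
    lateral_threshold: float = 0.8,  # illustrative pose threshold
) -> Tuple[float, float, float]:
    """Return (yaw_rate, forward_speed, orbit_speed) as normalized commands."""
    # Horizontal offset of the box center from the image center, in [-1, 1].
    center_x = (box.x1 + box.x2) / 2.0
    offset = (center_x - frame_w / 2.0) / (frame_w / 2.0)
    yaw_rate = clamp(offset, -1.0, 1.0)  # turn toward the animal

    fraction = ((box.x2 - box.x1) * (box.y2 - box.y1)) / (frame_w * frame_h)
    flank_visible = lateral_prob >= lateral_threshold

    # Orbit sideways while the flank is hidden; slow down as confidence rises.
    orbit_speed = 0.0 if flank_visible else 1.0 - lateral_prob
    # Advance only once the flank is visible, scaled by how far the box is
    # below the minimum frame fraction.
    shortfall = max(0.0, min_fraction - fraction) / min_fraction
    forward_speed = clamp(shortfall, 0.0, 1.0) if flank_visible else 0.0
    return yaw_rate, forward_speed, orbit_speed
```

Separating orbit from approach keeps the drone from closing distance on a viewpoint that would be discarded anyway.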

If this is right

  • Wildlife monitoring programs can shift from labor-intensive manual photography to automated collection of re-ID-ready images.
  • Drone missions can be tuned to the viewpoint and scale needs of specific re-ID algorithms instead of generic video recording.
  • The same detection-plus-pose pipeline applies across multiple species that carry diagnostic surface patterns.
  • Task-driven perception replaces passive sensing as the design goal for ecological UAV platforms.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The loop could be closed with an on-board re-ID model that decides when an individual has been sufficiently documented before the drone moves to the next animal.
  • Flight paths that prioritize the lateral flank may incidentally reduce total flight time and therefore animal disturbance compared with random circling.
  • The same viewpoint-control logic could be transferred to other platforms such as ground robots or camera traps that must acquire specific angles.

Load-bearing premise

The DINOv2 pose classifier can correctly identify the lateral flank under uncontrolled field lighting and motion, and images that meet the bounding-box size threshold will produce better re-identification results than images collected without that constraint.

What would settle it

A side-by-side field trial in which re-identification accuracy is measured on two matched sets of images from the same animals: one set captured by the autonomous system and one set captured by a passive drone following standard flight paths.
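The comparison in such a trial reduces to computing the same re-identification metric on both image sets against a shared gallery. The sketch below shows rank-1 accuracy with cosine similarity. It assumes that each image has already been turned into an embedding by some re-ID model; the function names and the `(identity, embedding)` tuple layout are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Embedding = Sequence[float]
Labeled = Tuple[str, Embedding]  # (individual identity, image embedding)


def cosine(a: Embedding, b: Embedding) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank1_accuracy(queries: List[Labeled], gallery: List[Labeled]) -> float:
    """Fraction of queries whose most similar gallery image shares their identity."""
    hits = 0
    for query_id, query_vec in queries:
        best_id, _ = max(gallery, key=lambda item: cosine(query_vec, item[1]))
        hits += best_id == query_id
    return hits / len(queries)
```

Calling `rank1_accuracy` once with the autonomously captured queries and once with the passively captured queries, against the same gallery of known individuals, gives the two numbers the trial would compare.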

Figures

Figures reproduced from arXiv: 2606.31772 by Claire Sun, Jenna Kline, Tanya Berger-Wolf.

Figure 1: Navigation pipeline for individual zebra identification employing the detect-orient-approach-capture (DOAC) approach. The …
Figure 2: Example images from the pose training dataset. The left …
Original abstract

Reliable individual re-identification (re-ID) of wildlife is essential for population monitoring, behavioral tracking, and conservation policy evaluation, yet large-scale data collection remains labor-intensive, relying on manual efforts by ecologists or citizen scientists. We propose an autonomous drone navigation system that actively optimizes image capture for downstream re-ID, moving beyond passive aerial sensing. The system combines YOLOv11 object detection with a DINOv2-based pose classifier to guide real-time flight decisions: detecting animals, orienting to expose the lateral flank (the surface of interest for pattern-based re-ID), and approaching until the subject meets a minimum bounding-box threshold. Unlike prior drone systems that optimize for group-level behavioral video, ours targets the specific image-quality requirements of individual-identification models. We demonstrate feasibility through a case study on zebra using footage collected in Kenya, and show the approach generalizes to other species with diagnostic surface patterns, including giraffes, tigers, and elephants. Our work establishes a framework for task-aware embodied AI for ecological data collection, in which downstream re-ID requirements drive real-time perception and control.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper proposes an autonomous UAV navigation system for wildlife re-identification that integrates YOLOv11 object detection with a DINOv2-based pose classifier to detect animals, orient to expose the lateral flank, and approach until a minimum bounding-box size is reached. It claims this actively optimizes image capture for downstream individual re-ID (unlike passive sensing), demonstrates feasibility via a zebra case study in Kenya, and asserts generalization to giraffes, tigers, and elephants.

Significance. If the central claim holds, the work could provide a practical framework for task-aware embodied AI in ecological monitoring, potentially reducing labor-intensive manual data collection for pattern-based re-ID in conservation. However, the manuscript supplies no quantitative re-ID performance metrics, baselines, or comparisons, so the significance cannot be assessed from the provided evidence.

major comments (2)
  1. [Abstract] The central claim that the system 'actively optimizes image capture for downstream re-ID' and 'targets the specific image-quality requirements of individual-identification models' is unsupported, as the manuscript reports no re-ID evaluation (e.g., rank-1 accuracy, mAP) on images captured by the proposed pipeline versus passive aerial views or controls.
  2. [Case study / Abstract] Case study description (implied in abstract and full text): The zebra demonstration shows detection, pose classification, and approach to bounding-box threshold but supplies no error metrics, success rates, real-time latency figures, or validation against field conditions, leaving the feasibility claim unquantified.
minor comments (1)
  1. [Abstract] The generalization claim to other species (giraffes, tigers, elephants) is stated without supporting details or examples in the provided text.
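For reference, the mAP metric named in the first major comment can be computed from ranked retrieval results as sketched below. This is a generic retrieval formulation, not code from the manuscript. It assumes each query's ranked list covers the whole gallery, so the number of matches in the list equals the number of relevant images.

```python
from typing import List, Sequence


def average_precision(ranked_matches: Sequence[bool]) -> float:
    """AP for one query.

    ranked_matches[i] is True when the gallery image at rank i + 1
    shows the same individual as the query.
    """
    hits = 0
    precision_sum = 0.0
    for rank, is_match in enumerate(ranked_matches, start=1):
        if is_match:
            hits += 1
            precision_sum += hits / rank  # precision at this recall point
    return precision_sum / hits if hits else 0.0


def mean_average_precision(all_ranked: List[Sequence[bool]]) -> float:
    """Mean of per-query AP values."""
    return sum(average_precision(r) for r in all_ranked) / len(all_ranked)
```

Unlike rank-1 accuracy, this score rewards placing every image of the correct individual near the top of the ranking, which matters when an animal appears several times in the gallery.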

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback highlighting the need for clearer claims and better quantification. We address each major comment below.

  1. Referee: [Abstract] The central claim that the system 'actively optimizes image capture for downstream re-ID' and 'targets the specific image-quality requirements of individual-identification models' is unsupported, as the manuscript reports no re-ID evaluation (e.g., rank-1 accuracy, mAP) on images captured by the proposed pipeline versus passive aerial views or controls.

    Authors: We agree that the manuscript provides no quantitative re-ID metrics comparing the proposed pipeline to passive capture. The system is engineered around established re-ID requirements (lateral flank exposure and minimum resolution) drawn from prior literature, and the navigation loop is designed to achieve these targets. However, we do not present empirical re-ID accuracy results. In revision we will rephrase the abstract and introduction to state that the navigation targets these known requirements, with the demonstrated contribution being successful real-time control rather than measured downstream re-ID gains. revision: yes

  2. Referee: [Case study / Abstract] Case study description (implied in abstract and full text): The zebra demonstration shows detection, pose classification, and approach to bounding-box threshold but supplies no error metrics, success rates, real-time latency figures, or validation against field conditions, leaving the feasibility claim unquantified.

    Authors: The case study is presented as a real-world feasibility demonstration rather than a comprehensive benchmark. We will add quantitative details available from the experiments, including detection and pose-classification accuracies, approach success rates, processing latencies, and notes on field conditions, to better support the feasibility claim. revision: yes
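The quantities promised in this response (approach success rates and processing latencies) could be summarized from per-attempt logs along the following lines. This is a sketch only: the `Trial` record and its fields are hypothetical, and no figures from the paper are reproduced here.

```python
import math
import statistics
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Trial:
    """One approach attempt (hypothetical log record)."""
    reached_threshold: bool          # flank visible and minimum box size reached
    frame_latencies_ms: List[float]  # per-frame perception latency in milliseconds


def summarize(trials: List[Trial]) -> Dict[str, float]:
    """Report success rate plus median and 95th-percentile frame latency."""
    latencies = sorted(ms for t in trials for ms in t.frame_latencies_ms)
    successes = sum(t.reached_threshold for t in trials)
    # Nearest-rank percentile: smallest value with at least 95% of samples at or below it.
    p95_index = max(0, math.ceil(0.95 * len(latencies)) - 1)
    return {
        "success_rate": successes / len(trials),
        "median_latency_ms": statistics.median(latencies),
        "p95_latency_ms": latencies[p95_index],
    }
```

Reporting a tail percentile alongside the median matters for a closed-loop controller, because occasional slow frames affect flight stability more than the typical frame does.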

Circularity Check

0 steps flagged

No circularity; engineering system description with no derivations or self-referential predictions

The manuscript presents a UAV navigation pipeline that composes off-the-shelf detectors (YOLOv11) and classifiers (DINOv2) to achieve lateral-pose framing and distance thresholds. No equations, fitted parameters, uniqueness theorems, or predictions appear; the case study simply demonstrates that the combined system can execute the described behaviors on zebra footage. The downstream re-ID benefit is stated as motivation rather than a derived or fitted result, so no reduction to inputs occurs. This is the normal non-circular outcome for an applied systems paper.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

Review based on abstract only; no explicit parameters or new entities described. System depends on pre-existing models.

axioms (1)
  • domain assumption: YOLOv11 and DINOv2 can be applied directly to real-time animal detection and flank pose classification in wildlife footage; the abstract gives no additional training details.
    Abstract invokes these models as the core of the navigation pipeline.

