pith. sign in

arxiv: 2606.00990 · v2 · pith:ZKD3WABAnew · submitted 2026-05-31 · 💻 cs.RO

OSCAR: Obstacle Survival Curves for Adaptive Robot Navigation

Pith reviewed 2026-06-28 17:27 UTC · model grok-4.3

classification 💻 cs.RO
keywords adaptive navigationsurvival modelingobstacle clearancegraph planningonline learningtemporary obstaclesrobot navigation
0
0 comments X

The pith

A robot learns class-specific obstacle clearance distributions from experience to set optimal patience thresholds for waiting or rerouting.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes learning class-conditioned survival models for how long obstacles block paths, using both full observations and censored data from reroutes. These models integrate into a graph planner that decides how long to wait at each blockage before choosing an alternate route. This matters because fixed heuristics either delay too long behind movable obstacles or reroute too soon around persistent ones, while the learned approach adapts over episodes. In simulation the method reaches within 1 percent of an oracle with perfect knowledge after under 20 observations per class and beats baselines. Real robot tests in an atrium confirm online improvement across multiple episodes.

Core claim

The central discovery is that online learning of residual clearance-time distributions conditioned on obstacle class, incorporating right-censored observations, enables a planner to compute dynamic patience thresholds that balance waiting and rerouting costs more effectively than fixed rules or heuristics.

What carries the argument

Class-conditioned residual clearance-time distributions learned from online experience with right-censored data, used to set patience thresholds in a time-dependent graph planner.

If this is right

  • The policy converges to within 1% of oracle time-to-goal after fewer than 20 observations per obstacle class in simulation.
  • It outperforms all heuristic baselines in simulation.
  • Real-world tests show the policy improves its patience thresholds across 50 navigation episodes.
  • The planner uses the models to decide patience at each blocked edge while maintaining obstacle memory.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The method could be extended to cases where obstacle classes must be inferred from sensor data rather than provided at encounter.
  • Similar survival modeling might improve decisions in other robotics tasks involving uncertain event durations such as task scheduling.
  • Shared clearance models across multiple robots could accelerate learning for the group in collaborative settings.

Load-bearing premise

Obstacle class labels must be available at the time each blockage is encountered.

What would settle it

If the learned policy in simulation does not approach the oracle's time-to-goal within 1% after 20 observations per class, or fails to improve over heuristics, the central claim would be falsified.

Figures

Figures reproduced from arXiv: 2606.00990 by Aoran Jiao, Hshmat Sahak, Nicholas Rhinehart, Tim Barfoot.

Figure 1
Figure 1. Figure 1: Motivating example showing a robot navigating on a graph where it might be blocked by [PITH_FULL_IMAGE:figures/full_fig_p002_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: Our proposed pipeline. The robot follows the planned path till a blockage is detected. The [PITH_FULL_IMAGE:figures/full_fig_p003_2.png] view at source ↗
Figure 3
Figure 3. Figure 3: Example wait-or-reroute decision at a blocked edge. After identifying a chair, the planner uses the learned survival model to compute Aclear(c), Aavoid(W), and J(W). The minimizer W⋆ gives the patience threshold; here the obstacle persists, so the robot reroutes. During planning, the robot must estimate how costly each edge will be if reached at a fu￾ture time t. We write this cost as bc(e, t) = w(e) + ∆( … view at source ↗
Figure 4
Figure 4. Figure 4: Race to 200. Oracle comes first place. Our method is best amongst alternatives (198). [PITH_FULL_IMAGE:figures/full_fig_p006_4.png] view at source ↗
Figure 5
Figure 5. Figure 5: Simulation performance and data-efficiency results. (a) The learned policy improves with [PITH_FULL_IMAGE:figures/full_fig_p006_5.png] view at source ↗
Figure 6
Figure 6. Figure 6: Simulation analysis of the learned KM policy. Left: comparison of navigation strategies [PITH_FULL_IMAGE:figures/full_fig_p007_6.png] view at source ↗
Figure 7
Figure 7. Figure 7: Real-world deployment setup. (a) Graph and allowable obstacle spawn locations used in [PITH_FULL_IMAGE:figures/full_fig_p007_7.png] view at source ↗
Figure 8
Figure 8. Figure 8: Real-world experiment results. (a) Time-to-goal over 50 navigation episodes, showing [PITH_FULL_IMAGE:figures/full_fig_p008_8.png] view at source ↗
Figure 9
Figure 9. Figure 9: Graph used for simulation experiments. The graph contains multiple alternate routes, [PITH_FULL_IMAGE:figures/full_fig_p015_9.png] view at source ↗
Figure 10
Figure 10. Figure 10: Ground-truth obstacle clearance distributions used in simulation. Top: clearance-time [PITH_FULL_IMAGE:figures/full_fig_p016_10.png] view at source ↗
Figure 11
Figure 11. Figure 11: Example obstacle-classification log. The VLM receives a LiDAR reflectivity image and [PITH_FULL_IMAGE:figures/full_fig_p020_11.png] view at source ↗
Figure 12
Figure 12. Figure 12: Unorchestrated person–chair real-world experiment. Unlike the orchestrated real-world [PITH_FULL_IMAGE:figures/full_fig_p021_12.png] view at source ↗
read the original abstract

A mobile robot following a graph of known routes can make costly navigation errors when a temporary obstacle blocks a critical edge: waiting too long behind a parked cart wastes time, but immediately rerouting around a person who would move in a few seconds is also inefficient. Standard reactive obstacle avoidance addresses local motion around obstacles, while fixed wait-or-reroute rules ignore how long different obstacle types tend to persist. We propose OSCAR: an adaptive survival-modeling framework for graph-based navigation with temporary blockages. Assuming obstacle class labels are available at encounter time, the robot learns class-conditioned residual clearance-time distributions from online experience, including right-censored observations when it reroutes before observing clearance. These survival models are integrated into a time-dependent graph planner that maintains obstacle memory and computes a patience threshold at each blocked edge: how long to wait before taking an alternate route. The method continuously updates its clearance estimates across episodes and uses them to balance waiting against rerouting. We evaluate the approach in simulation and on a real mobile robot in a university atrium with obstacles including people, chairs, bins, and tubes. In simulation, the learned policy's time-to-goal converges to within 1% of an oracle with access to ground-truth clearance distributions after fewer than 20 observations per obstacle class, outperforming all heuristic baselines. Real-world deployment confirms that the policy improves online, adapting its patience thresholds from experience across 50 navigation episodes.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript proposes OSCAR, a framework for graph-based robot navigation that learns class-conditioned residual clearance-time distributions online from navigation episodes (including right-censored observations when rerouting occurs before clearance) and integrates these survival models into a time-dependent planner that computes patience thresholds for waiting versus rerouting at blocked edges. It assumes obstacle class labels are available at encounter and claims that, in simulation, the learned policy's time-to-goal converges to within 1% of an oracle with ground-truth distributions after fewer than 20 observations per class while outperforming heuristic baselines; real-robot experiments in a university atrium with people, chairs, bins, and tubes are reported to show online improvement across 50 episodes.

Significance. If the convergence result holds after addressing potential biases in the survival estimation, the work would offer a data-driven alternative to fixed wait-or-reroute heuristics for temporary obstacles, with the online adaptation and explicit handling of censored clearance times as notable technical contributions. The simulation-to-real transfer and explicit comparison to an oracle provide a clear benchmark for assessing practical utility.

major comments (2)
  1. [Abstract] Abstract: the headline simulation claim (convergence to within 1% of oracle after <20 observations per class) rests on class-conditioned clearance distributions converging to ground truth, yet the censoring mechanism is policy-dependent—the patience threshold is computed from the current running model, so early inaccurate estimates directly determine the censoring times. This violates the non-informative censoring assumption required by standard survival estimators (Kaplan-Meier or parametric), creating a feedback loop whose effect on convergence is not analyzed.
  2. [Abstract] Abstract: no information is supplied on the concrete survival model (parametric family, non-parametric estimator, or mixture), the exact fitting procedure across episodes, how right-censored observations are incorporated, the statistical test or confidence interval used to assert “within 1%” convergence, or the precise definition of the heuristic baselines and their parameter settings.
minor comments (1)
  1. [Abstract] The assumption that obstacle class labels are available at encounter time is stated but its impact on real-world applicability is not quantified.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for highlighting important methodological clarifications needed in the abstract and for raising the issue of policy-dependent censoring. We address both points below and will revise the manuscript accordingly to strengthen the presentation of the survival modeling approach and its assumptions.

read point-by-point responses
  1. Referee: [Abstract] Abstract: the headline simulation claim (convergence to within 1% of oracle after <20 observations per class) rests on class-conditioned clearance distributions converging to ground truth, yet the censoring mechanism is policy-dependent—the patience threshold is computed from the current running model, so early inaccurate estimates directly determine the censoring times. This violates the non-informative censoring assumption required by standard survival estimators (Kaplan-Meier or parametric), creating a feedback loop whose effect on convergence is not analyzed.

    Authors: The referee correctly identifies that the online patience threshold introduces dependence between the censoring time and the current survival estimate, which can violate the standard non-informative censoring assumption. While our empirical results demonstrate convergence to oracle performance, we did not provide a dedicated analysis of this feedback loop. We will add a new subsection in the methods and an additional simulation experiment that isolates the effect of policy-dependent censoring (e.g., by comparing against an oracle that uses fixed thresholds) and discuss the conditions under which the bias remains limited in practice. revision: partial

  2. Referee: [Abstract] Abstract: no information is supplied on the concrete survival model (parametric family, non-parametric estimator, or mixture), the exact fitting procedure across episodes, how right-censored observations are incorporated, the statistical test or confidence interval used to assert “within 1%” convergence, or the precise definition of the heuristic baselines and their parameter settings.

    Authors: We agree that these implementation details are essential and were omitted from the abstract. The full manuscript uses the Kaplan-Meier estimator updated incrementally per class, incorporating right-censored observations at the current patience threshold; the 1% figure is the relative difference in mean time-to-goal averaged over 100 simulation trials; baselines are fixed-wait policies at 5 s / 10 s / 20 s and a random-reroute policy. We will expand the abstract with a concise methods sentence and ensure the supplementary material or methods section contains the exact update rule, convergence metric definition, and baseline parameter values. revision: yes

Circularity Check

0 steps flagged

No significant circularity detected

full rationale

The paper describes an online learning procedure that fits class-conditioned survival models to observed clearance times (including right-censored cases produced by the current policy). The headline simulation result is an empirical statement that the resulting policy's time-to-goal approaches an external oracle after <20 observations per class. No equations, self-citations, or ansatzes are shown that reduce this convergence claim to a tautology, a fitted parameter renamed as a prediction, or a self-referential definition. The method remains self-contained against external benchmarks (ground-truth distributions and heuristic baselines).

Axiom & Free-Parameter Ledger

1 free parameters · 1 axioms · 0 invented entities

Central claim rests on availability of class labels and standard survival analysis assumptions for learning clearance distributions from experience including censored cases.

free parameters (1)
  • class-conditioned survival distribution parameters
    Parameters of residual clearance-time distributions are fitted online from observations per obstacle class.
axioms (1)
  • domain assumption Obstacle class labels are available at encounter time
    Required to condition the survival models on obstacle type.

pith-pipeline@v0.9.1-grok · 5789 in / 1090 out tokens · 25553 ms · 2026-06-28T17:27:01.118267+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

33 extracted references · 13 canonical work pages

  1. [1]

    Lacerda, F

    B. Lacerda, F. Faruq, D. Parker, and N. Hawes. Probabilistic planning with formal performance guarantees for mobile service robots.The International Journal of Robotics Research, 38 (9):1098–1123, 2019. doi:10.1177/0278364919856695. URLhttps://doi.org/10.1177/ 0278364919856695

  2. [2]

    Nardi and C

    L. Nardi and C. Stachniss. Long-term robot navigation in indoor environments estimating patterns in traversability changes.2020 IEEE International Conference on Robotics and Automation (ICRA), pages 300–306, 2019. URLhttps://api.semanticscholar.org/ CorpusID:203591939

  3. [3]

    B. Marthi. Navigation in partially observed dynamic roadmaps. 2010. URLhttps://api. semanticscholar.org/CorpusID:12577833

  4. [4]

    Wurman, R

    P. Wurman, R. D’Andrea, and M. Mountz. Coordinating hundreds of cooperative, autonomous vehicles in warehouses.AI Magazine, 29:9–20, 03 2008

  5. [5]

    Fragapane, H.-H

    G. Fragapane, H.-H. Hvolby, F. Sgarbossa, and J. O. Strandhagen. Autonomous mobile robots in hospital logistics. InAdvances in Production Management Systems. The Path to Digital Transformation and Innovation of Production Management Systems, pages 672–679, Cham,

  6. [6]

    Springer International Publishing

  7. [7]

    Hawes, C

    N. Hawes, C. Burbridge, F. Jovan, L. Kunze, B. Lacerda, L. Mudrova, J. Young, J. Wyatt, D. Hebesberger, T. Kortner, et al. The strands project: Long-term autonomy in everyday environments.IEEE Robotics & Automation Magazine, 24(3):146–156, 2017

  8. [8]

    A. R. Di ´eguez, R. Sanz, and J. L. Fern ´andez. Integrating time performance in global path planning for autonomous mobile robots. In X.-J. Jing, editor,Motion Planning, chapter 24. IntechOpen, London, 2008. doi:10.5772/5999. URLhttps://doi.org/10.5772/5999

  9. [9]

    S. Bing. Online canadian traveler problem with stochastic blockages recovery time.Sys- tems Engineering - Theory & Practice, 2005. URLhttps://api.semanticscholar.org/ CorpusID:156857270

  10. [10]

    E. L. Kaplan and P. Meier. Nonparametric estimation from incomplete observations.Jour- nal of the American Statistical Association, 53:457–481, 1958. URLhttps://api. semanticscholar.org/CorpusID:18549513

  11. [11]

    D. R. Cox. Regression models and life tables (with discussion. 1972. URLhttps://api. semanticscholar.org/CorpusID:116197768

  12. [12]

    J. P. Klein and M. L. Moeschberger.Survival Analysis: Techniques for Censored and Truncated Data. Statistics for Biology and Health. Springer, New York, 2 edition, 2003. ISBN 978-0- 387-95399-1

  13. [13]

    Fujimura

    K. Fujimura. Time-minimum routes in time-dependent networks.IEEE Transactions on Robotics and Automation, 11(3):343–351, 1995. doi:10.1109/70.388776

  14. [14]

    M. P. Wellman, M. Ford, and K. Larson. Path planning under time-dependent uncer- tainty. InConference on Uncertainty in Artificial Intelligence, 1995. URLhttps://api. semanticscholar.org/CorpusID:12281821

  15. [15]

    Gao and I

    S. Gao and I. Chabini. Optimal routing policy problems in stochastic time-dependent net- works.Transportation Research Part B-methodological, 40:93–122, 2006. URLhttps: //api.semanticscholar.org/CorpusID:121870726

  16. [16]

    D. M. Rosen, J. Mason, and J. J. Leonard. Towards lifelong feature-based mapping in semi- static environments. In2016 IEEE International Conference on Robotics and Automation (ICRA), pages 1063–1070, 2016. doi:10.1109/ICRA.2016.7487237. 9

  17. [17]

    Tsamis, I

    G. Tsamis, I. Kostavelis, D. Giakoumis, and D. Tzovaras. Towards life-long mapping of dynamic environments using temporal persistence modeling.2020 25th International Con- ference on Pattern Recognition (ICPR), pages 10480–10485, 2021. URLhttps://api. semanticscholar.org/CorpusID:233877760

  18. [18]

    D. Fox, W. Burgard, and S. Thrun. The dynamic window approach to collision avoidance. IEEE Robotics & Automation Magazine, 4(1):23–33, 1997

  19. [19]

    Kruse, A

    T. Kruse, A. K. Pandey, R. Alami, and A. Kirsch. Human-aware robot navigation: A survey. Robotics and Autonomous Systems, 61(12):1726–1743, 2013. ISSN 0921-8890. doi:https:// doi.org/10.1016/j.robot.2013.05.007. URLhttps://www.sciencedirect.com/science/ article/pii/S0921889013001048

  20. [20]

    Trautman, J

    P. Trautman, J. Ma, R. M. Murray, and A. Krause. Robot navigation in dense human crowds: Statistical models and experimental studies of human–robot cooperation.The International Journal of Robotics Research, 34(3):335–356, 2015. doi:10.1177/0278364914557874. URL https://doi.org/10.1177/0278364914557874

  21. [21]

    J. Han, M. Vanniasinghe, H. Sahak, N. Rhinehart, and T. Barfoot. Ratatouille: Imitation learning ingredients for real-world social robot navigation.arXiv preprint arXiv:2509.17204. doi:10.48550/arXiv.2509.17204

  22. [22]

    E. W. Dijkstra. A note on two problems in connexion with graphs.Numerische Mathematik, 1:269–271, 1959. doi:10.1007/BF01386390

  23. [23]

    P. E. Hart, N. J. Nilsson, and B. Raphael. A formal basis for the heuristic determination of minimum cost paths.IEEE Transactions on Systems Science and Cybernetics, 4(2):100–107,

  24. [24]

    doi:10.1109/TSSC.1968.300136

  25. [25]

    Papadimitriou and M

    C. Papadimitriou and M. Yannakakis. Shortest paths without a map. pages 610–620, 01 1989. ISBN 978-3-540-51371-1. doi:10.1007/BFb0035787

  26. [26]

    Nikolova

    E. Nikolova. Exact algorithms for the canadian traveller problem on paths and trees. 01 2008

  27. [27]

    Eyerich, T

    P. Eyerich, T. Keller, and M. Helmert. High-quality policies for the canadian traveler’s prob- lem.Proceedings of the AAAI Conference on Artificial Intelligence, 24(1):51–58, Jul. 2010. doi:10.1609/aaai.v24i1.7542. URLhttps://ojs.aaai.org/index.php/AAAI/article/ view/7542

  28. [28]

    Guo and T

    H. Guo and T. D. Barfoot. The robust canadian traveler problem applied to robot routing. In 2019 International Conference on Robotics and Automation (ICRA), page 5523–5529. IEEE Press, 2019. doi:10.1109/ICRA.2019.8794252. URLhttps://doi.org/10.1109/ICRA. 2019.8794252

  29. [29]

    J. Halpern. Shortest route with time dependent length of edges and limited delay possi- bilities in nodes.Zeitschrift f ¨ur Operations Research, 21:117–124, 1977. URLhttps: //api.semanticscholar.org/CorpusID:12304537

  30. [30]

    R. W. Hall. The fastest path through a network with random time-dependent travel times. Transp. Sci., 20:182–188, 1986. URLhttps://api.semanticscholar.org/CorpusID: 207246327

  31. [31]

    Orda and R

    A. Orda and R. Rom. Minimum weight paths in time-dependent networks.Networks, 21: 295–319, 1991. URLhttps://api.semanticscholar.org/CorpusID:15025216

  32. [32]

    Furgale and T

    P. Furgale and T. Barfoot. Visual teach and repeat for long-range rover autonomy.J. Field Robotics, 27:534–560, 09 2010. doi:10.1002/rob.20342

  33. [33]

    Gpt-5.3 instant: Smoother, more useful everyday conversations.https://openai

    OpenAI. Gpt-5.3 instant: Smoother, more useful everyday conversations.https://openai. com/index/gpt-5-3-instant/, 2026. Accessed: 2026-05-27. 10 Appendix A Functional Time-Dependent Planner and Proofs This appendix gives the functional planner used to computeA clear andA avoid, and proves the prop- erties claimed in Sec. 3. Throughout, times are measured ...