pith. machine review for the scientific record. sign in

arxiv: 2604.16741 · v1 · submitted 2026-04-17 · 💻 cs.RO

Recognition: unknown

LiDAR-based Crowd Navigation with Visible Edge Group Representation

Authors on Pith no claims yet

Pith reviewed 2026-05-10 07:38 UTC · model grok-4.3

classification 💻 cs.RO
keywords crowd navigationLiDARgroup representationvisible edgespedestrian groupsrobot navigationdense crowdssocial navigation
0
0 comments X

The pith

Group prediction accuracy affects robot navigation performance only marginally in crowded environments, enabling a simplified visible edge-based group representation from LiDAR.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that precise knowledge of pedestrian groups is not essential for a robot to navigate crowded spaces safely and in a socially appropriate manner. This finding supports replacing detailed individual tracking with a simpler visible edge-based group representation derived directly from LiDAR sensor data, which avoids the errors that arise from occlusions and external detection modules. Simulation results show that navigation using this reduced representation achieves comparable safety and social compliance to more complex approaches while running faster. Real-robot experiments confirm that the method can be deployed practically in actual dense crowds without additional sensing infrastructure. A sympathetic reader would care because it lowers the sensing and computation barriers to deploying group-aware robots in everyday crowded settings where perfect data is unavailable.

Core claim

We show that group prediction accuracy affects navigation performance only marginally in crowded environments. Based on this observation, we propose the visible edge-based group representation. We additionally demonstrate via simulation experiments that our navigation framework, integrated with the simplified group representation, performs comparatively in terms of safety and socialness in dense crowds, while achieving faster computation speed. Finally, we deploy our navigation framework on a real robot to explore the benefits of practically deploying group-based representations in the real world.

What carries the argument

visible edge-based group representation from LiDAR, which captures pedestrian groups via their detectable outer boundaries rather than tracking every individual

If this is right

  • Navigation safety and social compliance hold steady even when group assignments contain errors common in occluded crowds.
  • Computation time for planning decreases, supporting real-time control in faster-moving or larger crowds.
  • Real-world robot deployment becomes feasible without auxiliary sensors or trackers that fail under occlusion.
  • Group-aware behavior can be added to existing LiDAR pipelines with minimal added complexity.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The marginal impact of group accuracy could allow even coarser approximations, such as treating all nearby pedestrians as one loose group, in extremely dense conditions.
  • The approach might transfer to other range sensors if visible boundary extraction can be adapted, broadening sensor options for group navigation.
  • Testing with rapidly forming or dissolving groups would show whether the visible-edge focus handles dynamic collective motion better than identity-based methods.
  • Coordinating multiple robots could leverage shared visible-edge maps to improve collective flow through the same crowd.

Load-bearing premise

A visible edge-based group representation extracted from LiDAR can be obtained reliably enough in real dense crowds to support the claimed navigation performance without external detection modules.

What would settle it

Run the navigation system in a dense crowd while deliberately degrading or removing the group representation input and measure whether safety metrics such as minimum distance to pedestrians or social metrics such as path deviation increase substantially.

Figures

Figures reproduced from arXiv: 2604.16741 by Aaron Steinfeld, Allan Wang.

Figure 1
Figure 1. Figure 1: The top row is how an example group evolves in time. The middle [PITH_FULL_IMAGE:figures/full_fig_p001_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: Demonstration of the construction of our visible edge-based [PITH_FULL_IMAGE:figures/full_fig_p003_2.png] view at source ↗
Figure 3
Figure 3. Figure 3: Qualitative examples of group-conv (top) vs. edge-MPC (bottom). [PITH_FULL_IMAGE:figures/full_fig_p006_3.png] view at source ↗
Figure 4
Figure 4. Figure 4: Left: The robot used in the real world experiment. Middle: The crossing scenario. Yellow trajectories illustrate edge-MPC’s behavior. Blue [PITH_FULL_IMAGE:figures/full_fig_p007_4.png] view at source ↗
Figure 5
Figure 5. Figure 5: Ratings from the participants about the naturalness and comfort of [PITH_FULL_IMAGE:figures/full_fig_p007_5.png] view at source ↗
read the original abstract

Robot navigation in crowded pedestrian environments is a well-known challenge and we explore the practical deployment of group-based representations in this setting. Pedestrian groups have been empirically shown to enable a mobile robot's navigation behavior to be safer and more social. However, existing approaches either explored groups only in limited scenarios with no high-density crowds or depended on external detection modules to track individuals, which are prone to noise and errors due to occlusions in crowds. We show that group prediction accuracy affects navigation performance only marginally in crowded environments. Based on this observation, we propose the visible edge-based group representation. We additionally demonstrate via simulation experiments that our navigation framework, integrated with the simplified group representation, performs comparatively in terms of safety and socialness in dense crowds, while achieving faster computation speed. Finally, we deploy our navigation framework on a real robot to explore the benefits of practically deploying group-based representations in the real world.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The manuscript presents a LiDAR-based crowd navigation framework that employs a visible edge group representation extracted directly from point clouds, avoiding external pedestrian detectors. The central claim is that group prediction accuracy affects navigation performance only marginally in crowded environments. Simulation experiments are used to show that the simplified representation yields comparable safety and socialness to more detailed group models while improving computation speed, and a real-robot deployment is reported to explore practical benefits.

Significance. If the marginal-impact claim is substantiated with quantitative validation, the work could simplify perception requirements for social navigation in dense crowds, enabling faster and more deployable systems that rely solely on LiDAR without fragile external modules. The direct use of visible edges addresses occlusion challenges common in real crowds, and the real-robot demonstration provides initial evidence of practicality.

major comments (3)
  1. [Abstract and simulation experiments] Abstract and simulation experiments section: the central claim that group prediction accuracy affects navigation performance only marginally rests on simulation ablations that inject controlled errors, yet no quantitative metrics (e.g., success rates, collision counts, or socialness scores with error bars), baselines, or statistical analysis are provided to establish the size or robustness of the marginal effect.
  2. [Real-robot deployment] Real-robot deployment section: the deployment is presented qualitatively without any reported quantitative group-extraction accuracy (e.g., precision/recall against ground-truth groups), navigation performance metrics, or ablation that substitutes perfect groups for the visible-edge output under identical real-world conditions, leaving the weakest assumption about reliability in dense crowds untested.
  3. [Simulation experiments] Simulation experiments section: the synthetic perturbations used to model group-label errors do not demonstrably replicate the structured, non-random errors (fragmented edges, merged clusters) that arise from LiDAR occlusions, partial views, and sensor noise in dense crowds, undermining the transferability of the marginal-effect result to the proposed visible-edge representation.
minor comments (2)
  1. [Method] The visible-edge group representation would benefit from an explicit algorithmic description or pseudocode to clarify how edges are extracted and grouped from raw LiDAR points.
  2. [Figures] Figure captions and axis labels in the simulation results could be expanded to include exact metric definitions and the number of trials per condition for improved reproducibility.

Simulated Author's Rebuttal

3 responses · 1 unresolved

We thank the referee for the constructive feedback on our manuscript. We address each major comment below with point-by-point responses and indicate planned revisions where appropriate.

read point-by-point responses
  1. Referee: [Abstract and simulation experiments] Abstract and simulation experiments section: the central claim that group prediction accuracy affects navigation performance only marginally rests on simulation ablations that inject controlled errors, yet no quantitative metrics (e.g., success rates, collision counts, or socialness scores with error bars), baselines, or statistical analysis are provided to establish the size or robustness of the marginal effect.

    Authors: We agree that additional quantitative details would strengthen the presentation of the central claim. In the revised manuscript, we will expand the simulation experiments section to include success rates, collision counts, and socialness scores reported with error bars across multiple randomized trials. We will also incorporate comparisons against relevant baselines and apply statistical tests to quantify the marginal effect and assess its robustness. revision: yes

  2. Referee: [Real-robot deployment] Real-robot deployment section: the deployment is presented qualitatively without any reported quantitative group-extraction accuracy (e.g., precision/recall against ground-truth groups), navigation performance metrics, or ablation that substitutes perfect groups for the visible-edge output under identical real-world conditions, leaving the weakest assumption about reliability in dense crowds untested.

    Authors: The real-robot deployment was included primarily to demonstrate initial practicality rather than to serve as a full quantitative validation. We will revise the section to report available navigation performance metrics from the experiments, such as traversal time and observed safety behaviors. However, computing precision/recall for group extraction requires ground-truth labels that are difficult to obtain in real dense crowds without additional sensors or manual annotation, and we did not conduct an ablation replacing visible-edge groups with perfect groups under the same real-world conditions. We will explicitly note these limitations and the underlying assumptions about reliability. revision: partial

  3. Referee: [Simulation experiments] Simulation experiments section: the synthetic perturbations used to model group-label errors do not demonstrably replicate the structured, non-random errors (fragmented edges, merged clusters) that arise from LiDAR occlusions, partial views, and sensor noise in dense crowds, undermining the transferability of the marginal-effect result to the proposed visible-edge representation.

    Authors: We acknowledge that the synthetic perturbations represent a controlled simplification and may not capture every structured error pattern observed in real LiDAR data. In the revision, we will add a dedicated discussion explaining the design choices for the perturbation model, including how it approximates common effects such as edge fragmentation and cluster merging. We will also include a brief comparison of error statistics derived from sample real LiDAR point clouds versus the synthetic perturbations to better support transferability of the marginal-effect finding. revision: yes

standing simulated objections not resolved
  • The request for a real-world ablation study substituting perfect groups for the visible-edge representation under identical conditions, as no such experiment was performed and obtaining reliable ground-truth group labels in dense real crowds would require substantial additional instrumentation not available in the original deployment.

Circularity Check

0 steps flagged

No significant circularity; claims rest on empirical experiments

full rationale

The paper's central claim—that group prediction accuracy affects navigation performance only marginally—is presented as an empirical observation from simulation experiments and real-robot deployment, not as a mathematical derivation. No equations, fitted parameters renamed as predictions, self-definitional constructs, or load-bearing self-citations are evident in the abstract or described structure. The visible-edge representation is proposed based on the experimental finding rather than defined in terms of it, and the work does not invoke uniqueness theorems or ansatzes from prior self-citations to force results. The derivation chain is self-contained through direct experimental validation.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Only the abstract is available; no explicit free parameters, axioms, or invented entities are described in the provided text.

pith-pipeline@v0.9.0 · 5444 in / 1019 out tokens · 51230 ms · 2026-05-10T07:38:42.370851+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

63 extracted references · 2 canonical work pages · 1 internal anchor

  1. [1]

    A survey on socially aware robot navigation: Taxonomy and future challenges,

    P. T. Singamaneni, P. Bachiller-Burgos, L. J. Manso, A. Garrell, A. Sanfeliu, A. Spalanzani, and R. Alami, “A survey on socially aware robot navigation: Taxonomy and future challenges,”Int. J. Robot. Res., vol. 43, no. 10, pp. 1533–1572, 2024

  2. [2]

    Unfreezing the robot: Navigation in dense, interacting crowds,

    P. Trautman and A. Krause, “Unfreezing the robot: Navigation in dense, interacting crowds,” inProc. IEEE/RSJ Int. Conf. Intell. Robots Syst. (IROS). IEEE, 2010, pp. 797–803

  3. [3]

    Core challenges of social robot navigation: A survey,

    C. Mavrogiannis, F. Baldini, A. Wang, D. Zhao, P. Trautman, A. Stein- feld, and J. Oh, “Core challenges of social robot navigation: A survey,” ACM Trans. Hum.-Robot Interact., vol. 12, no. 3, pp. 1–39, 2023

  4. [4]

    Tbd pedestrian data collection: Towards rich, portable, and large- scale natural pedestrian data,

    A. Wang, D. Sato, Y . Corzo, S. Simkin, A. Biswas, and A. Steinfeld, “Tbd pedestrian data collection: Towards rich, portable, and large- scale natural pedestrian data,” inProc. IEEE Int. Conf. Robot. Autom. (ICRA). IEEE, 2024, pp. 637–644

  5. [5]

    Bytetrack: Multi-object tracking by associating every detection box,

    Y . Zhang, P. Sun, Y . Jiang, D. Yu, F. Weng, Z. Yuan, P. Luo, W. Liu, and X. Wang, “Bytetrack: Multi-object tracking by associating every detection box,” inProc. Eur. Conf. Comput. Vis. (ECCV). Springer, 2022, pp. 1–21

  6. [6]

    Proactive model predictive control with multi-modal human motion prediction in cluttered dynamic environments,

    L. Heuer, L. Palmieri, A. Rudenko, A. Mannucci, M. Magnusson, and K. O. Arras, “Proactive model predictive control with multi-modal human motion prediction in cluttered dynamic environments,” inProc. IEEE/RSJ Int. Conf. Intell. Robots Syst. (IROS). IEEE, 2023, pp. 229–236

  7. [7]

    Social navigation in crowded environments with model predictive control and deep learning-based human trajectory prediction,

    V .-A. Le, B. Chalaki, V . Tadiparthi, H. N. Mahjoub, J. D’sa, and E. Moradi-Pari, “Social navigation in crowded environments with model predictive control and deep learning-based human trajectory prediction,” inProc. IEEE/RSJ Int. Conf. Intell. Robots Syst. (IROS). IEEE, 2024, pp. 4793–4799

  8. [8]

    Socially competent navigation planning by deep learning of multi-agent path topologies,

    C. I. Mavrogiannis, V . Blukis, and R. A. Knepper, “Socially competent navigation planning by deep learning of multi-agent path topologies,” inProc. IEEE/RSJ Int. Conf. Intell. Robots Syst. (IROS). IEEE, 2017, pp. 6817–6824

  9. [9]

    Robot navigation in dense human crowds: Statistical models and experimental studies of human–robot cooperation,

    P. Trautman, J. Ma, R. M. Murray, and A. Krause, “Robot navigation in dense human crowds: Statistical models and experimental studies of human–robot cooperation,”Int. J. Robot. Res., vol. 34, no. 3, pp. 335–356, 2015

  10. [10]

    Socially aware motion planning with deep reinforcement learning,

    Y . F. Chen, M. Everett, M. Liu, and J. P. How, “Socially aware motion planning with deep reinforcement learning,” inProc. IEEE/RSJ Int. Conf. Intell. Robots Syst. (IROS). IEEE, 2017, pp. 1343–1350

  11. [11]

    Intention aware robot crowd navigation with attention-based interaction graph,

    S. Liu, P. Chang, Z. Huang, N. Chakraborty, K. Hong, W. Liang, D. L. McPherson, J. Geng, and K. Driggs-Campbell, “Intention aware robot crowd navigation with attention-based interaction graph,” inProc. IEEE Int. Conf. Robot. Autom. (ICRA). IEEE, 2023, pp. 12 015– 12 021

  12. [12]

    Navistar: Socially aware robot navigation with hybrid spatio-temporal graph transformer and preference learning,

    W. Wang, R. Wang, L. Mao, and B.-C. Min, “Navistar: Socially aware robot navigation with hybrid spatio-temporal graph transformer and preference learning,” inProc. IEEE/RSJ Int. Conf. Intell. Robots Syst. (IROS). IEEE, 2023, pp. 11 348–11 355

  13. [13]

    Risk- sensitive mobile robot navigation in crowded environment via offline reinforcement learning,

    J. Wu, Y . Wang, H. Asama, Q. An, and A. Yamashita, “Risk- sensitive mobile robot navigation in crowded environment via offline reinforcement learning,” inProc. IEEE/RSJ Int. Conf. Intell. Robots Syst. (IROS). IEEE, 2023, pp. 7456–7462

  14. [14]

    Understanding dynamic social grouping behaviors of pedestrians,

    L. Feng and B. Bhanu, “Understanding dynamic social grouping behaviors of pedestrians,”IEEE J. Sel. Topics Signal Process., vol. 9, no. 2, pp. 317–329, 2014

  15. [15]

    Group-based motion prediction for navigation in crowded environments,

    A. Wang, C. Mavrogiannis, and A. Steinfeld, “Group-based motion prediction for navigation in crowded environments,” inProc. Conf. Robot Learn. (CoRL). PMLR, 2022, pp. 871–882

  16. [16]

    Learning a group-aware policy for robot navigation,

    K. Katyal, Y . Gao, J. Markowitz, S. Pohland, C. Rivera, I.-J. Wang, and C.-M. Huang, “Learning a group-aware policy for robot navigation,” inProc. IEEE/RSJ Int. Conf. Intell. Robots Syst. (IROS). IEEE, 2022, pp. 11 328–11 335

  17. [17]

    Human-aware navigation in crowded environments using adaptive proxemic area and group detection,

    C. Medina-S ´anchez, S. Janzon, M. Zella, J. Capit ´an, and P. J. Marr ´on, “Human-aware navigation in crowded environments using adaptive proxemic area and group detection,” inProc. IEEE/RSJ Int. Conf. Intell. Robots Syst. (IROS). IEEE, 2023, pp. 6741–6748

  18. [18]

    Gson: a group-based social navigation framework with large multimodal model,

    S. Luo, P. Sun, J. Zhu, Y . Deng, C. Yu, A. Xiao, and X. Wang, “Gson: a group-based social navigation framework with large multimodal model,”IEEE Robot. Autom. Lett., 2025

  19. [19]

    Group- aware robot navigation in crowds using spatio-temporal graph attention network with deep reinforcement learning,

    X. Lu, A. Faragasso, Y . Wang, A. Yamashita, and H. Asama, “Group- aware robot navigation in crowds using spatio-temporal graph attention network with deep reinforcement learning,”IEEE Robot. Autom. Lett., 2025

  20. [20]

    Koffka,Principles of Gestalt psychology

    K. Koffka,Principles of Gestalt psychology. Routledge, 2013

  21. [21]

    Social force model for pedestrian dynam- ics,

    D. Helbing and P. Molnar, “Social force model for pedestrian dynam- ics,”Phys. Rev. E, vol. 51, no. 5, p. 4282, 1995

  22. [22]

    Reciprocal velocity obstacles for real-time multi-agent navigation,

    J. Van den Berg, M. Lin, and D. Manocha, “Reciprocal velocity obstacles for real-time multi-agent navigation,” inProc. IEEE Int. Conf. Robot. Autom. (ICRA). IEEE, 2008, pp. 1928–1935

  23. [23]

    Human-aware naviga- tion planner for diverse human-robot interaction contexts,

    P. T. Singamaneni, A. Favier, and R. Alami, “Human-aware naviga- tion planner for diverse human-robot interaction contexts,” inProc. IEEE/RSJ Int. Conf. Intell. Robots Syst. (IROS). IEEE, 2021, pp. 5817–5824

  24. [24]

    Mixed strategy nash equilibrium for crowd navigation,

    M. Muchen Sun, F. Baldini, K. Hughes, P. Trautman, and T. Murphey, “Mixed strategy nash equilibrium for crowd navigation,”Int. J. Robot. Res., vol. 44, no. 7, pp. 1156–1185, 2025

  25. [25]

    Probabilistic dynamic crowd prediction for social navigation,

    S. H. Kiss, K. Katuwandeniya, A. Alempijevic, and T. Vidal-Calleja, “Probabilistic dynamic crowd prediction for social navigation,” in Proc. IEEE Int. Conf. Robot. Autom. (ICRA). IEEE, 2021, pp. 9269– 9275

  26. [26]

    You’ll never walk alone: Modeling social behavior for multi-target tracking,

    S. Pellegrini, A. Ess, K. Schindler, and L. Van Gool, “You’ll never walk alone: Modeling social behavior for multi-target tracking,” in Proc. IEEE Int. Conf. Comput. Vis. (ICCV). IEEE, 2009, pp. 261– 268

  27. [27]

    Crowds by example,

    A. Lerner, Y . Chrysanthou, and D. Lischinski, “Crowds by example,” inComput. Graph. Forum, vol. 26, no. 3. Wiley Online Library, 2007, pp. 655–664

  28. [28]

    Jrdb: A dataset and benchmark of egocentric robot visual perception of humans in built environments,

    R. Martin-Martin, M. Patel, H. Rezatofighi, A. Shenoi, J. Gwak, E. Frankel, A. Sadeghian, and S. Savarese, “Jrdb: A dataset and benchmark of egocentric robot visual perception of humans in built environments,”IEEE Trans. Pattern Anal. Mach. Intell., vol. 45, no. 6, pp. 6748–6765, 2021

  29. [29]

    Socially compliant mobile robot navigation via inverse reinforcement learning,

    H. Kretzschmar, M. Spies, C. Sprunk, and W. Burgard, “Socially compliant mobile robot navigation via inverse reinforcement learning,” Int. J. Robot. Res., vol. 35, no. 11, pp. 1289–1307, 2016

  30. [30]

    Rethinking social robot navigation: Leveraging the best of two worlds,

    A. H. Raj, Z. Hu, H. Karnan, R. Chandra, A. Payandeh, L. Mao, P. Stone, J. Biswas, and X. Xiao, “Rethinking social robot navigation: Leveraging the best of two worlds,” inProc. IEEE Int. Conf. Robot. Autom. (ICRA). IEEE, 2024, pp. 16 330–16 337

  31. [31]

    Crowd-robot interaction: Crowd-aware robot navigation with attention-based deep reinforce- ment learning,

    C. Chen, Y . Liu, S. Kreiss, and A. Alahi, “Crowd-robot interaction: Crowd-aware robot navigation with attention-based deep reinforce- ment learning,” inProc. IEEE Int. Conf. Robot. Autom. (ICRA). IEEE, 2019, pp. 6015–6022

  32. [32]

    Sacson: Scalable autonomous control for social navigation,

    N. Hirose, D. Shah, A. Sridhar, and S. Levine, “Sacson: Scalable autonomous control for social navigation,”IEEE Robot. Autom. Lett., vol. 9, no. 1, pp. 49–56, 2023

  33. [33]

    Selfi: Autonomous self-improvement with reinforcement learning for social navigation,

    N. Hirose, D. Shah, K. Stachowicz, A. Sridhar, and S. Levine, “Selfi: Autonomous self-improvement with reinforcement learning for social navigation,” inProc. Conf. Robot Learn. (CoRL). PMLR, 2024, p. TBD

  34. [34]

    Spatiotemporal attention enhances lidar-based robot navigation in dynamic environments,

    J. de Heuvel, X. Zeng, W. Shi, T. Sethuraman, and M. Bennewitz, “Spatiotemporal attention enhances lidar-based robot navigation in dynamic environments,”IEEE Robot. Autom. Lett., 2024

  35. [35]

    Realtime collision avoidance for mobile robots in dense crowds using implicit multi-sensor fusion and deep reinforcement learning,

    J. Liang, U. Patel, A. J. Sathyamoorthy, and D. Manocha, “Realtime collision avoidance for mobile robots in dense crowds using implicit multi-sensor fusion and deep reinforcement learning,”arXiv preprint arXiv:2004.03089, 2020

  36. [36]

    Drl-vo: Learning to navigate through crowded dynamic scenes using velocity obstacles,

    Z. Xie and P. Dames, “Drl-vo: Learning to navigate through crowded dynamic scenes using velocity obstacles,”IEEE Trans. Robot., vol. 39, no. 4, pp. 2700–2719, 2023

  37. [37]

    Who knows who-inverting the social force model for finding groups,

    J. ˇSochman and D. C. Hogg, “Who knows who-inverting the social force model for finding groups,” inProc. IEEE Int. Conf. Comput. Vis. Workshops (ICCVW). IEEE, 2011, pp. 830–837

  38. [38]

    Improving data association by joint modeling of pedestrian trajectories and groupings,

    S. Pellegrini, A. Ess, and L. Van Gool, “Improving data association by joint modeling of pedestrian trajectories and groupings,” inProc. Eur. Conf. Comput. Vis. (ECCV). Springer, 2010, pp. 452–465

  39. [39]

    Probabilistic group-level motion analysis and scenario recognition,

    M.-C. Chang, N. Krahnstoever, and W. Ge, “Probabilistic group-level motion analysis and scenario recognition,” inProc. IEEE Int. Conf. Comput. Vis. (ICCV). IEEE, 2011, pp. 747–754

  40. [40]

    Social group discovery from surveillance videos: A data-driven approach with attention-based cues

    I. Chamveha, Y . Sugano, Y . Sato, and A. Sugimoto, “Social group discovery from surveillance videos: A data-driven approach with attention-based cues.” inProc. Brit. Mach. Vis. Conf. (BMVC), 2013

  41. [41]

    Detection of social groups in pedestrian crowds using computer vision,

    S. D. Khan, G. Vizzari, S. Bandini, and S. Basalamah, “Detection of social groups in pedestrian crowds using computer vision,” inProc. Int. Conf. Adv. Concepts Intell. Vis. Syst. (ACIVS). Springer, 2015, pp. 249–260

  42. [42]

    Real-time trajectory- based social group detection,

    S. Jahangard, M. Hayat, and H. Rezatofighi, “Real-time trajectory- based social group detection,” inProc. IEEE/RSJ Int. Conf. Intell. Robots Syst. (IROS). IEEE, 2023, pp. 1901–1908

  43. [43]

    Socially constrained structural learning for groups detection in crowd,

    F. Solera, S. Calderara, and R. Cucchiara, “Socially constrained structural learning for groups detection in crowd,”IEEE Trans. Pattern Anal. Mach. Intell., vol. 38, no. 5, pp. 995–1008, 2015

  44. [44]

    Robot-centric perception of human groups,

    A. Taylor, D. M. Chan, and L. D. Riek, “Robot-centric perception of human groups,”ACM Trans. Hum.-Robot Interact., vol. 9, no. 3, pp. 1–21, 2020

  45. [45]

    Performance of a low-cost, human- inspired perception approach for dense moving crowd navigation,

    I. Chatterjee and A. Steinfeld, “Performance of a low-cost, human- inspired perception approach for dense moving crowd navigation,” in Proc. IEEE Int. Symp. Robot Hum. Interact. Commun. (RO-MAN). IEEE, 2016, pp. 578–585

  46. [46]

    Dynamic-group-aware networks for multi-agent trajectory prediction with relational reasoning,

    C. Xu, Y . Wei, B. Tang, S. Yin, Y . Zhang, S. Chen, and Y . Wang, “Dynamic-group-aware networks for multi-agent trajectory prediction with relational reasoning,”Neural Netw., vol. 170, pp. 564–577, 2024

  47. [47]

    Learning pedestrian group representations for multi-modal trajectory prediction,

    I. Bae, J.-H. Park, and H.-G. Jeon, “Learning pedestrian group representations for multi-modal trajectory prediction,” inProc. Eur. Conf. Comput. Vis. (ECCV). Springer, 2022, pp. 270–289

  48. [48]

    The walking behaviour of pedestrian social groups and its impact on crowd dynamics,

    M. Moussa ¨ıd, N. Perozo, S. Garnier, D. Helbing, and G. Theraulaz, “The walking behaviour of pedestrian social groups and its impact on crowd dynamics,”PLoS One, vol. 5, no. 4, p. e10047, 2010

  49. [49]

    Social-aware navigation in crowds with static and dynamic groups,

    F. Yang and C. Peters, “Social-aware navigation in crowds with static and dynamic groups,” inProc. Int. Conf. Virtual Worlds Games Serious Appl. (VS-Games). IEEE, 2019, pp. 1–4

  50. [50]

    Vlm-social-nav: Socially aware robot navigation through scoring using vision-language models,

    D. Song, J. Liang, A. Payandeh, A. H. Raj, X. Xiao, and D. Manocha, “Vlm-social-nav: Socially aware robot navigation through scoring using vision-language models,”IEEE Robot. Autom. Lett., 2024

  51. [51]

    Comet: Modeling group cohesion for socially compliant robot navigation in crowded scenes,

    A. J. Sathyamoorthy, U. Patel, M. Paul, N. K. S. Kumar, Y . Savle, and D. Manocha, “Comet: Modeling group cohesion for socially compliant robot navigation in crowded scenes,”IEEE Robot. Autom. Lett., vol. 7, no. 2, pp. 1008–1015, 2021

  52. [52]

    Flow4d: Leveraging 4d voxel network for lidar scene flow estimation,

    J. Kim, J. Woo, U. Shin, J. Oh, and S. Im, “Flow4d: Leveraging 4d voxel network for lidar scene flow estimation,”IEEE Robot. Autom. Lett., 2025

  53. [53]

    A density-based algorithm for discovering clusters in large spatial databases with noise,

    M. Ester, H.-P. Kriegel, J. Sander, and X. Xu, “A density-based algorithm for discovering clusters in large spatial databases with noise,” inProc. ACM SIGKDD Int. Conf. Knowl. Discov. Data Min. (KDD), 1996, pp. 226–231

  54. [54]

    Social GAN: Socially acceptable trajectories with generative adversarial networks,

    A. Gupta, J. Johnson, L. Fei-Fei, S. Savarese, and A. Alahi, “Social GAN: Socially acceptable trajectories with generative adversarial networks,” inProc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit. (CVPR), 2018, pp. 2255–2264

  55. [55]

    Sophie: An attentive gan for predicting paths compliant to social and physical constraints,

    A. Sadeghian, V . Kosaraju, A. Sadeghian, N. Hirose, H. Rezatofighi, and S. Savarese, “Sophie: An attentive gan for predicting paths compliant to social and physical constraints,” inProc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit. (CVPR), 2019, pp. 1349–1358

  56. [56]

    From goals, way- points & paths to long term human trajectory forecasting,

    K. Mangalam, Y . An, H. Girase, and J. Malik, “From goals, way- points & paths to long term human trajectory forecasting,” inProc. IEEE/CVF Int. Conf. Comput. Vis. (ICCV), 2021, pp. 15 233–15 242

  57. [57]

    Agentformer: Agent- aware transformers for socio-temporal multi-agent forecasting,

    Y . Yuan, X. Weng, Y . Ou, and K. M. Kitani, “Agentformer: Agent- aware transformers for socio-temporal multi-agent forecasting,” in Proc. IEEE/CVF Int. Conf. Comput. Vis. (ICCV), 2021, pp. 9813– 9823

  58. [58]

    Moflow: One-step flow matching for human trajectory forecasting via implicit maximum likelihood estimation based distillation,

    Y . Fu, Q. Yan, L. Wang, K. Li, and R. Liao, “Moflow: One-step flow matching for human trajectory forecasting via implicit maximum likelihood estimation based distillation,” inProc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit. (CVPR), 2025, pp. 17 282–17 293

  59. [59]

    From crowd motion prediction to robot navigation in crowds,

    S. Poddar, C. Mavrogiannis, and S. S. Srinivasa, “From crowd motion prediction to robot navigation in crowds,” inProc. IEEE/RSJ Int. Conf. Intell. Robots Syst. (IROS). IEEE, 2023, pp. 6765–6772

  60. [60]

    YOLOv4: Optimal Speed and Accuracy of Object Detection

    A. Bochkovskiy, C.-Y . Wang, and H.-Y . M. Liao, “Yolov4: Op- timal speed and accuracy of object detection,”arXiv preprint arXiv:2004.10934, 2020

  61. [61]

    Simple online and realtime tracking,

    A. Bewley, Z. Ge, L. Ott, F. Ramos, and B. Upcroft, “Simple online and realtime tracking,” inProc. IEEE Int. Conf. Image Process. (ICIP). IEEE, 2016, pp. 3464–3468

  62. [62]

    Observer-aware legibility for social navigation,

    A. V . Taylor, E. Mamantov, and H. Admoni, “Observer-aware legibility for social navigation,” inProc. IEEE Int. Conf. Robot Human Interact. Commun. (RO-MAN). IEEE, 2022, pp. 1115–1122

  63. [63]

    Social navigation with pedestrian groups,

    A. Wang, “Social navigation with pedestrian groups,” Ph.D. disserta- tion, Carnegie Mellon Univ., 2023