pith. machine review for the scientific record.

arxiv: 2604.11705 · v1 · submitted 2026-04-13 · 💻 cs.AI · cs.CL · cs.RO · cs.SY · eess.SY

Recognition: unknown

Agentic Driving Coach: Robustness and Determinism of Agentic AI-Powered Human-in-the-Loop Cyber-Physical Systems

Pith reviewed 2026-05-10 15:19 UTC · model grok-4.3

classification 💻 cs.AI · cs.CL · cs.RO · cs.SY · eess.SY
keywords agentic AI · human-in-the-loop · cyber-physical systems · determinism · reactor model of computation · driving coach · robustness · nondeterminism

The pith

A reactor model of computation can reintroduce determinism and robustness into AI-powered human-in-the-loop cyber-physical systems.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper tries to establish that a reactor-model-of-computation approach can manage the nondeterminism arising from unpredictable human users, variable AI agent behavior, and changing physical environments in agentic AI-powered HITL CPS. A sympathetic reader would care because foundation models are entering real-world interactive applications where safety and consistency matter, yet their inherent variability threatens reliable operation. The authors evaluate this through a concrete agentic driving coach case study, identify the resulting practical challenges, and outline pathways to address them.

Core claim

The paper proposes a reactor-model-of-computation-based approach for agentic AI-powered human-in-the-loop cyber-physical systems and demonstrates it on a concrete agentic driving coach application. The aim is to reintroduce determinism and robustness despite nondeterminism from human users, AI agents, and the physical environment. Evaluating the application reveals practical challenges in doing so, and the paper presents pathways to overcome them.

What carries the argument

The reactor-model-of-computation-based approach, which organizes system timing and component interactions to enforce deterministic behavior across human inputs, AI agents, and physical dynamics.
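
To make the idea of organizing interactions by timing concrete, here is a minimal Python sketch of logical-time ordering. It is not Lingua Franca code and not the authors' implementation: the `Event` and `ToyScheduler` names, the integer priorities, and the sample events are all assumptions made for illustration. The sketch shows only one property, namely that events tagged with a logical time are handled in the same order no matter when they physically arrive.

```python
import heapq
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(order=True)
class Event:
    # Ordering uses (logical_time, priority) only; the payload is ignored.
    logical_time: int
    priority: int
    payload: str = field(compare=False)


class ToyScheduler:
    """Hypothetical scheduler: handles events in logical-time order."""

    def __init__(self) -> None:
        self._queue: List[Event] = []
        self.log: List[Tuple[int, str]] = []

    def schedule(self, logical_time: int, priority: int, payload: str) -> None:
        heapq.heappush(self._queue, Event(logical_time, priority, payload))

    def run(self) -> List[Tuple[int, str]]:
        # Pop events by (logical_time, priority), not by arrival order.
        while self._queue:
            event = heapq.heappop(self._queue)
            self.log.append((event.logical_time, event.payload))
        return self.log


def run_with_arrival_order(events):
    scheduler = ToyScheduler()
    for logical_time, priority, payload in events:
        scheduler.schedule(logical_time, priority, payload)
    return scheduler.run()


# Illustrative events; each (logical_time, priority) pair is unique so that
# the processing order is fully determined by the tags.
EVENTS = [
    (20, 0, "driver: brake pressed"),
    (10, 1, "agent: advise slowing down"),
    (10, 0, "sensor: stop sign detected"),
]

in_order = run_with_arrival_order(EVENTS)
shuffled = run_with_arrival_order(list(reversed(EVENTS)))
print(in_order == shuffled)
```

The design choice worth noting is that the order depends only on the tags. If two events shared the same logical time and priority, this toy version would fall back on arrival order, which is exactly the kind of hidden nondeterminism a reactor-style runtime is meant to rule out.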

If this is right

  • The agentic driving coach can maintain consistent coaching behavior despite varying driver actions and road conditions.
  • Practical challenges to determinism in agentic HITL CPS become identifiable and addressable through targeted pathways.
  • Other interactive AI systems with human and physical elements can achieve improved reliability by adopting the same structured computation model.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same structure might support safer adaptive assistance in related real-time domains such as vehicle control or home automation.
  • Implementing the proposed pathways with actual human participants could provide measurable data on consistency gains.
  • Connections to timing analysis in other reactive systems could strengthen verification methods for these applications.

Load-bearing premise

That a reactor model of computation can impose enough structure to control outcomes even when humans, AI agents, and physical environments introduce variability.

What would settle it

An experiment with the agentic driving coach in which system outputs or response timings differ across repeated similar human driving scenarios and environmental conditions after the reactor-based approach is applied.
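
One way such a repeated-scenario experiment could be scored is sketched below in Python. This is an assumption about how the comparison might be done, not a procedure taken from the paper: the trace format (a list of records with a logical timestamp and an instruction) and the function names are hypothetical. Each run's trace is reduced to a fingerprint, and the number of distinct fingerprints across runs indicates how many different behaviors were observed.

```python
import hashlib
import json
from typing import Dict, Sequence

# Hypothetical trace format: one record per coach output.
Trace = Sequence[Dict[str, object]]


def trace_fingerprint(trace: Trace) -> str:
    """Hash a trace in a canonical form so key order does not matter."""
    canonical = json.dumps(list(trace), sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def distinct_behaviors(runs: Sequence[Trace]) -> int:
    """Count how many different traces appear across repeated runs."""
    return len({trace_fingerprint(trace) for trace in runs})


def is_repeatable(runs: Sequence[Trace]) -> bool:
    """True when every run produced exactly the same trace."""
    return distinct_behaviors(runs) == 1


# Illustrative data only; these are not results from the paper.
baseline = [
    {"t_ms": 0, "instruction": "stop sign ahead"},
    {"t_ms": 500, "instruction": "begin braking"},
]
late_variant = [
    {"t_ms": 0, "instruction": "stop sign ahead"},
    {"t_ms": 900, "instruction": "begin braking"},
]

stable_runs = [baseline, list(baseline), [dict(r) for r in baseline]]
drifting_runs = [baseline, baseline, late_variant]

print(is_repeatable(stable_runs), distinct_behaviors(drifting_runs))
```

Exact matching is a strict criterion. A real study would likely need a tolerance on physical timing and a separate comparison on logical timestamps, which this sketch does not attempt.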

Figures

Figures reproduced from arXiv: 2604.11705 by Daniel Fan, Deeksha Prahlad, Hokeun Kim.

Figure 1: Challenges faced by an agentic driving coach as an agentic ...
Figure 3: Our proposed reactor model implemented using Lingua Franca, targeting the stop sign scenario, consists of four key reactors: (a) ...
Figure 4: The Planner reactor as a modal model where transitions are triggered by control signals defined in TABLE III
Figure 6: Three scenarios to evaluate the implementation of the proposed ...
Figure 7: Evaluation results illustrating velocity vs. displacement and time for three Llama 3 models (1B, 8B, 70B) in a stop sign scenario. Blue ...
Figure 8: Evaluation results with Llama 3 (8B vs. 70B) in speed and lane change scenarios, illustrated by velocity vs. displacement and time.
Original abstract

Foundation models, including large language models (LLMs), are increasingly used for human-in-the-loop (HITL) cyber-physical systems (CPS) because foundation model-based AI agents can potentially interact with both the physical environments and human users. However, the unpredictable behavior of human users and AI agents, in addition to the dynamically changing physical environments, leads to uncontrollable nondeterminism. To address this urgent challenge of enabling agentic AI-powered HITL CPS, we propose a reactor-model-of-computation (MoC)-based approach, realized by the open-source Lingua Franca (LF) framework. We also carry out a concrete case study using the agentic driving coach as an application of HITL CPS. By evaluating the LF-based agentic HITL CPS, we identify practical challenges in reintroducing determinism into such agentic HITL CPS and present pathways to address them.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 2 minor

Summary. The manuscript proposes a reactor-model-of-computation approach realized via the open-source Lingua Franca (LF) framework to address nondeterminism arising from unpredictable human users, AI agents, and dynamic physical environments in agentic AI-powered human-in-the-loop cyber-physical systems (HITL CPS). It presents a concrete case study of an agentic driving coach application, uses this to identify practical challenges in reintroducing determinism and robustness, and outlines pathways to address them.

Significance. If the LF-based approach can be shown to manage nondeterminism effectively, the work would offer a structured, potentially reproducible method for designing safer agentic HITL CPS in domains such as driving assistance. The decision to treat reintroduction of determinism as an open problem rather than a solved claim, combined with reliance on an established open-source framework, strengthens the paper as an exploratory contribution that surfaces real implementation issues.

major comments (1)
  1. [Case Study] The abstract states that the authors 'evaluate the LF-based agentic HITL CPS' to identify challenges, yet the manuscript contains no quantitative results, error analysis, implementation details, determinism metrics, or comparison against non-LF baselines. This is load-bearing because the central claim rests on evaluation-driven challenge identification rather than pure proposal.
minor comments (2)
  1. [Introduction] Key terms such as 'agentic', 'determinism', and 'reactor model of computation' are used without early, self-contained definitions that would allow readers unfamiliar with LF to follow the argument.
  2. [Pathways] The pathways section would be strengthened by explicitly mapping each suggested mitigation back to concrete LF language features (e.g., specific reactor coordination or timing constructs).
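
As an illustration of the kind of mapping the second minor comment asks for, the Python sketch below mimics a deadline with a fallback, a pattern that Lingua Franca supports through its deadline construct. It deliberately does not use Lingua Franca syntax, and it is not drawn from the manuscript: `AgentReply`, `select_instruction`, the latency values, and the instruction strings are hypothetical. Latency is passed in as data rather than measured, so the example behaves the same on every run.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentReply:
    advice: str
    latency_ms: int  # measured or simulated time the agent took to answer


def select_instruction(reply: AgentReply, deadline_ms: int, fallback: str) -> str:
    """Use the agent's advice only if it arrived within the deadline.

    A late reply is discarded in favor of a fixed fallback, so the output
    at the deadline no longer depends on how slow the agent happened to be.
    """
    if reply.latency_ms <= deadline_ms:
        return reply.advice
    return fallback


# Illustrative values only.
FALLBACK = "Brake now"
on_time = AgentReply(advice="Ease off the accelerator", latency_ms=120)
too_late = AgentReply(advice="Ease off the accelerator", latency_ms=900)

print(select_instruction(on_time, deadline_ms=200, fallback=FALLBACK))
print(select_instruction(too_late, deadline_ms=200, fallback=FALLBACK))
```

The trade-off is that the fallback bounds response timing but not the content of on-time replies, so it addresses only one of the nondeterminism sources the paper lists.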

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the thoughtful summary and recommendation. We agree that the current manuscript is exploratory and that the case study evaluation requires more concrete details to support the challenge identification. We will revise the paper to address this.

Point-by-point responses
  1. Referee: [Case Study] The abstract states that the authors 'evaluate the LF-based agentic HITL CPS' to identify challenges, yet the manuscript contains no quantitative results, error analysis, implementation details, determinism metrics, or comparison against non-LF baselines. This is load-bearing because the central claim rests on evaluation-driven challenge identification rather than pure proposal.

    Authors: We acknowledge that the evaluation presented is qualitative and centers on using the agentic driving coach case study to surface practical challenges in reintroducing determinism, rather than providing quantitative benchmarks. The manuscript does not include error analysis, determinism metrics, or non-LF baselines because the primary goal was to demonstrate the reactor MoC approach via LF and outline open issues. In revision, we will expand the case study with additional implementation details on the LF reactor configuration, any observed timing and determinism behaviors from our prototype, and a clearer statement that the evaluation is challenge-identification oriented. We will also add a limitations section and pathways for future quantitative comparisons. revision: yes

Circularity Check

0 steps flagged

No significant circularity detected

The paper proposes an approach based on the established open-source Lingua Franca framework for the reactor model of computation to manage nondeterminism in agentic HITL CPS, using a driving-coach case study to surface challenges and outline pathways. No load-bearing derivations, equations, fitted predictions, or self-citations reduce the central claims to their own inputs by construction; the work is explicitly exploratory and self-limiting rather than asserting resolved determinism.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

Abstract-only review prevents identification of concrete free parameters or invented entities. The central premise is treated as a domain assumption.

axioms (1)
  • domain assumption: A reactor model of computation can reintroduce determinism into agentic HITL CPS despite unpredictable human and AI behavior.
    This is the load-bearing premise stated in the abstract without supporting derivation or evidence.

pith-pipeline@v0.9.0 · 5475 in / 1218 out tokens · 55443 ms · 2026-05-10T15:19:08.589473+00:00 · methodology
