Managed Autonomy at Runtime: Gear-Based Safety and Governance for Single- and Multi-Agent Cyber-Physical Systems

Srini Ramaswamy; Wang Miaosheng

arXiv: 2607.00334 · v1 · pith:IXSIYNZ7new · submitted 2026-07-01 · 💻 cs.AI

Pith reviewed 2026-07-02 13:11 UTC · model grok-4.3

classification 💻 cs.AI

keywords: managed autonomy · execution gears · cyber-physical systems · runtime safety · multi-agent systems · stability proofs · anomaly detection · governance states

The pith

Five execution gears deliver monotonic stability, safety, and zero-collision guarantees for single- and multi-agent cyber-physical systems.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a discrete-time control system that pairs five execution gears with utility-gated dispatch and event-driven fallback to prevent safety violations, instability, and continuity loss in autonomous agents. For single agents it establishes formal properties of monotonic stability, execution safety, eventual stabilization, fallback completeness, and equivalence to a gear-constrained Markov decision process. For multi-agent cyber-physical systems, runtime evidence is mapped into four governance states, supported by consensus gating, swarm-level Lyapunov analysis, per-agent gear authority, and rendezvous control to deliver distributed safety, including zero collision under the stated assumptions. Evaluation on a three-agent UR5 assembly cell using NIST-calibrated faults across 10,000 Monte Carlo episodes reports 99.6 percent anomaly detection (versus 2.1 percent for the single-agent baseline), 3.5 times lower detection latency than that baseline, and a formal physical-workspace safety certificate.

Core claim

The system combines five execution gears with utility-gated dispatch and event-driven fallback to achieve monotonic stability, execution safety, eventual stabilization, fallback completeness, and equivalence to a gear-constrained Markov decision process in the single-agent case. In multi-agent settings, consensus gating, swarm-level Lyapunov analysis, per-agent gear authority, and rendezvous control mapped to four governance states provide distributed safety and stability guarantees, including zero collision under the stated assumptions.
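As a rough illustration of consensus gating, the sketch below dispatches a joint action only when enough per-agent gates accept it. The summary does not say which voting rule the paper uses, so the unanimity default, the optional `quorum` parameter, and the function name `consensus_gate` are assumptions made for this example.

```python
from typing import Callable, Optional, Sequence

# A local gate answers: does this agent accept the joint action in this state?
LocalGate = Callable[[object, object], bool]


def consensus_gate(local_gates: Sequence[LocalGate], state: object,
                   joint_action: object, quorum: Optional[int] = None) -> bool:
    """Accept the joint action only if at least `quorum` local gates accept it.

    quorum=None means unanimity, which is an assumed default, not a rule
    taken from the paper.
    """
    if not local_gates:
        return False  # no agents means nothing can vouch for the action
    needed = len(local_gates) if quorum is None else quorum
    if not 1 <= needed <= len(local_gates):
        raise ValueError("quorum must be between 1 and the number of agents")
    votes = sum(1 for g in local_gates if g(state, joint_action))
    return votes >= needed


# Hypothetical three-agent cell in which one agent rejects the action.
accept = lambda s, a: True
reject = lambda s, a: False
print(consensus_gate([accept, accept, reject], "s0", "move"))            # False
print(consensus_gate([accept, accept, reject], "s0", "move", quorum=2))  # True
```

Unanimity is the conservative choice for a safety gate: a single dissenting agent blocks dispatch, at the cost of more rejected actions.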

What carries the argument

Five execution gears (observation, suggestion, planning, execution, intervention), combined with utility-gated dispatch and event-driven fallback, function as micro-level permissions beneath the higher-level governance states.
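A minimal sketch of gears acting as micro-level permissions follows. Only the five gear names come from the paper; the action names, the `ALLOWED_ACTIONS` table, and the choice of observation as the fallback gear are hypothetical and stand in for whatever the authors actually specify.

```python
from enum import Enum


class Gear(Enum):
    """The five execution gears named in the paper."""
    OBSERVATION = "observation"
    SUGGESTION = "suggestion"
    PLANNING = "planning"
    EXECUTION = "execution"
    INTERVENTION = "intervention"


# Hypothetical permission table: which actions each gear may dispatch.
ALLOWED_ACTIONS = {
    Gear.OBSERVATION: {"read_sensor"},
    Gear.SUGGESTION: {"read_sensor", "propose_move"},
    Gear.PLANNING: {"read_sensor", "propose_move", "build_plan"},
    Gear.EXECUTION: {"read_sensor", "propose_move", "build_plan", "move_arm"},
    Gear.INTERVENTION: {"read_sensor", "halt"},
}

# Assumed fallback target; the summary does not state which gear is used.
FALLBACK_GEAR = Gear.OBSERVATION


def is_permitted(gear: Gear, action: str) -> bool:
    """An action may run only if the current gear lists it."""
    return action in ALLOWED_ACTIONS[gear]


def dispatch(gear: Gear, action: str, gate_accepts: bool) -> bool:
    """Utility-gated dispatch: the gate and the gear must both allow the action."""
    return gate_accepts and is_permitted(gear, action)


def on_fault_event(gear: Gear) -> Gear:
    """Event-driven fallback: any fault event moves the agent to the fallback gear."""
    return FALLBACK_GEAR
```

Keeping the permission check separate from the gate makes the two failure paths distinguishable: an action can be rejected because its utility is too low or because the current gear never allows it.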

If this is right

Single-agent case yields monotonic stability, execution safety, eventual stabilization, and fallback completeness.
Multi-agent case supplies zero-collision guarantees via consensus gating and swarm-level Lyapunov analysis.
Runtime evidence maps into four governance states to separate action control from autonomy oversight.
Evaluation achieves 99.6 percent anomaly detection and 3.5 times lower latency than the single-agent baseline.
The approach supplies a formal physical-workspace safety certificate for the robotic cell.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The gear structure could transfer to domains such as autonomous vehicles where runtime permissions must be enforced without constant human input.
Mapping gears to governance states offers a modular pattern that might combine with large-language-model agents for hybrid oversight.
Extending the Monte Carlo setup to physical trials with uncalibrated or time-varying faults would test robustness beyond the reported conditions.
The separation of micro-level gears from macro-level states suggests applicability to mixed human-robot teams where authority levels change dynamically.

Load-bearing premise

The guarantees rest on the assumptions under which zero collision and equivalence to the gear-constrained Markov decision process hold, including accurate fault calibration from the NIST dataset and Monte Carlo episodes that represent real operating conditions.

What would settle it

Observing even one collision in the three-agent UR5 robotic assembly cell under the paper's stated assumptions, or failing to demonstrate the claimed equivalence to the gear-constrained Markov decision process, would falsify the central guarantees.
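Simulation alone can bound a collision probability but cannot show it is zero, which is why the formal certificate carries the weight here. The sketch below computes the exact one-sided upper confidence bound on a per-episode event probability after n event-free episodes (the basis of the "rule of three"). It assumes independent episodes and, hypothetically, that all 10,000 reported episodes were collision-free; the summary does not state the empirical collision count.

```python
def zero_event_upper_bound(n: int, confidence: float = 0.95) -> float:
    """Largest p consistent with n event-free trials: solves (1 - p)**n = 1 - confidence."""
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must lie strictly between 0 and 1")
    return 1.0 - (1.0 - confidence) ** (1.0 / n)


bound = zero_event_upper_bound(10_000)
print(f"{bound:.2e}")  # about 3.0e-04, close to the rule-of-three value 3/n
```

Under those assumptions, 10,000 clean episodes would still be compatible with roughly three collisions per 10,000 episodes at 95 percent confidence.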

Figures

Figures reproduced from arXiv: 2607.00334 by Srini Ramaswamy, Wang Miaosheng.

**Figure 1.** The EntropyRuntime control loop. Definition 3 (Utility Gate). The utility gate $\mathrm{GATE}(s, a)$ is a binary predicate: $\mathrm{GATE}(s, a) = 1$ if $U(s, a) \ge \theta$ and $0$ otherwise, where $\theta \ge 0$ is the safety threshold. Definition 4 (Runtime State). The runtime state at cycle $t$ is the tuple $\rho_t = (s_t, g_t, \sigma_t, \epsilon_t)$, where $s_t \in S$ is the environment state, $g_t \in G$ is the current gear, $\sigma_t \in \mathbb{R}_{\ge 0}$ is the accumulated instability measure, an…
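Definitions 3 and 4 quoted in the caption translate directly into code. The sketch below is illustrative only: `toy_utility` is a made-up utility function, gears are stored as plain strings, and the fourth state component is left untyped because its definition is cut off in the excerpt.

```python
from dataclasses import dataclass
from typing import Callable, Hashable


def gate(utility: Callable[[Hashable, float], float],
         s: Hashable, a: float, theta: float) -> int:
    """GATE(s, a) = 1 if U(s, a) >= theta, else 0 (Definition 3)."""
    if theta < 0:
        raise ValueError("theta must be non-negative")
    return 1 if utility(s, a) >= theta else 0


@dataclass(frozen=True)
class RuntimeState:
    """rho_t = (s_t, g_t, sigma_t, epsilon_t) from Definition 4."""
    s: Hashable      # environment state s_t
    g: str           # current gear g_t
    sigma: float     # accumulated instability measure, sigma_t >= 0
    epsilon: object  # fourth component; not defined in the truncated excerpt

    def __post_init__(self) -> None:
        if self.sigma < 0:
            raise ValueError("sigma must be non-negative")


def toy_utility(s: Hashable, a: float) -> float:
    """Hypothetical utility for illustration: smaller moves score higher."""
    return 1.0 - abs(a)


print(gate(toy_utility, "s0", 0.2, theta=0.5))  # 1: utility 0.8 clears the threshold
print(gate(toy_utility, "s0", 0.9, theta=0.5))  # 0: utility of about 0.1 is rejected
```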

Original abstract

Autonomous agents, whether LLM-driven software agents or robotic physical agents, face a common class of failure modes when operating without continuous human oversight: safety violations from unverified actions, behavioral instability from unconstrained loops, and continuity loss from unhandled error states. We develop \system{}, a discrete-time control system that combines five execution gears (\Gobs{}, \Gsug{}, \Gplan{}, \Gexec{}, \Gint{}) with utility-gated dispatch and event-driven fallback. For the single-agent case, we prove monotonic stability, execution safety, eventual stabilization, fallback completeness, and equivalence to a gear-constrained Markov decision process. For multi-agent cyber-physical systems (CPS), we apply the established \smart{} managed-autonomy lifecycle and map runtime evidence into its four governance states (\Stable{}/\Meta{}/\Assisted{}/\Regulated{}). Consensus gating, swarm-level Lyapunov analysis, per-agent gear authority, and rendezvous control provide distributed safety and stability guarantees, including zero collision under the stated assumptions. We evaluate the resulting runtime on a three-agent UR5 robotic assembly cell using fault magnitudes calibrated from the NIST \emph{Degradation Measurement of Robot Arm Position Accuracy} dataset across 10,000 Monte Carlo episodes. It achieves a 99.6\% anomaly detection rate versus 2.1\% for the single-agent baseline, reduces detection latency by $3.5\times$, and supplies a formal physical-workspace safety certificate. The execution gears act as micro-level permissions beneath the \smart{} runtime governance states, separating action control from autonomy governance.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper introduces five execution gears for runtime safety in single- and multi-agent CPS with formal single-agent proofs and solid robot simulation numbers, but the zero-collision and stability claims rest on assumptions whose scope is not shown to be tight.

The core contribution is a discrete-time system that runs agents through five gears—observation, suggestion, planning, execution, and intervention—using utility-gated dispatch and event-driven fallback. For the single-agent case it proves monotonic stability, execution safety, eventual stabilization, fallback completeness, and equivalence to a gear-constrained MDP. For multi-agent CPS it maps the gears onto the four SMART governance states and adds consensus gating, swarm Lyapunov analysis, per-agent authority, and rendezvous control to claim zero collisions under the stated assumptions.

The evaluation on a three-agent UR5 assembly cell, with faults taken from the NIST dataset and run across 10,000 Monte Carlo episodes, reports 99.6% anomaly detection (versus 2.1% for the single-agent baseline) and 3.5× lower detection latency, plus a formal physical-workspace safety certificate. The separation of micro-level gear permissions from macro governance states is a clean architectural move.

The soft spots sit in the assumptions. The stability, safety, and zero-collision results are conditional on fault magnitudes being accurately calibrated and on the Monte Carlo episodes being representative. The abstract does not show how tightly those conditions match realistic sensor correlations, actuator delays, or nondeterministic LLM behavior. If the assumptions turn out narrow, both the formal certificate and the headline detection number become conditional. The multi-agent part also leans on the existing SMART lifecycle, so the incremental novelty there is mainly the gear integration rather than a new governance model.

This is for researchers and engineers working on runtime safety layers for robotic or LLM-driven agents in CPS. A reader who needs concrete gear designs plus formal single-agent results and robot-scale numbers will find usable material. It deserves a serious referee because the combination of proofs and empirical work on a practical problem is worth detailed checking, even if the assumption boundaries need clearer mapping in revision.

Referee Report

3 major / 2 minor

Summary. The paper introduces \system{}, a discrete-time control framework using five execution gears (Gobs, Gsug, Gplan, Gexec, Gint) combined with utility-gated dispatch and event-driven fallback. For single agents it claims proofs of monotonic stability, execution safety, eventual stabilization, fallback completeness, and equivalence to a gear-constrained MDP. For multi-agent CPS it maps runtime evidence into the four SMART governance states and asserts distributed safety via consensus gating, swarm Lyapunov analysis, per-agent gear authority, and rendezvous control, including zero collision under stated assumptions. Evaluation on a three-agent UR5 assembly cell with NIST-calibrated faults across 10,000 Monte Carlo episodes reports 99.6% anomaly detection (vs. 2.1% baseline), 3.5× lower latency, and a formal physical-workspace safety certificate.

Significance. If the formal claims hold under explicitly enumerated and realistic assumptions, the work supplies a concrete micro-level permission mechanism (gears) beneath macro governance states that could be adopted in safety-critical robotic and autonomous systems. The combination of per-agent stability proofs with swarm-level guarantees and empirical anomaly detection rates would represent a useful engineering contribution to runtime safety for LLM-driven or physical agents.

major comments (3)

[§3 / Abstract] §3 (single-agent proofs) and abstract: the claims of monotonic stability, execution safety, fallback completeness, and equivalence to a gear-constrained MDP are stated to hold only under unspecified assumptions; without an enumerated list of those assumptions and a demonstration that they remain valid under realistic sensor/actuator correlations or LLM nondeterminism, the central formal results cannot be assessed for scope.
[§4 / Evaluation] §4 (multi-agent CPS) and evaluation: the zero-collision guarantee via consensus gating, swarm Lyapunov analysis, and rendezvous control is asserted only under the same unexamined assumptions; the NIST fault magnitudes and Monte Carlo episode fidelity are load-bearing for both the safety certificate and the 99.6% detection figure, yet no sensitivity analysis or justification of representativeness is supplied.
[Evaluation] Evaluation section: the single-agent baseline achieving only 2.1% anomaly detection is used to highlight the 99.6% result, but the implementation details of that baseline (gear usage, dispatch policy, fault injection) are not provided, preventing verification that the comparison isolates the contribution of the multi-agent governance layer.

minor comments (2)

[Abstract] Notation for the five gears and the four SMART states is introduced in the abstract without a compact reference table; adding one would improve readability.
[Abstract / Introduction] The manuscript uses \system{} and \smart{} macros without an initial expansion or acronym list.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive and detailed report. We address each major comment below and will revise the manuscript accordingly to improve clarity and completeness.

Referee: [§3 / Abstract] §3 (single-agent proofs) and abstract: the claims of monotonic stability, execution safety, fallback completeness, and equivalence to a gear-constrained MDP are stated to hold only under unspecified assumptions; without an enumerated list of those assumptions and a demonstration that they remain valid under realistic sensor/actuator correlations or LLM nondeterminism, the central formal results cannot be assessed for scope.

Authors: We agree that the assumptions require explicit enumeration for proper assessment of scope. While the proofs reference assumptions throughout §3 (discrete-time dynamics, bounded disturbances, and deterministic intra-gear execution), they are not collected in one location. In the revision we will insert a dedicated subsection at the start of §3 that lists every assumption verbatim. We will also add a short discussion paragraph addressing sensor/actuator correlations and LLM nondeterminism, stating that the current proofs assume uncorrelated error terms and that extensions to correlated or nondeterministic cases remain future work. revision: yes
Referee: [§4 / Evaluation] §4 (multi-agent CPS) and evaluation: the zero-collision guarantee via consensus gating, swarm Lyapunov analysis, and rendezvous control is asserted only under the same unexamined assumptions; the NIST fault magnitudes and Monte Carlo episode fidelity are load-bearing for both the safety certificate and the 99.6% detection figure, yet no sensitivity analysis or justification of representativeness is supplied.

Authors: We accept that both the formal multi-agent guarantees and the empirical results rest on the same assumptions and on the specific NIST-calibrated fault model. The revision will (1) add an explicit enumerated list of multi-agent assumptions in §4 that cross-references the single-agent list and (2) include a new sensitivity analysis subsection that varies fault magnitudes around the NIST values and reports the resulting changes in detection rate, latency, and safety-certificate validity. This will supply the requested justification of representativeness. revision: yes
Referee: [Evaluation] Evaluation section: the single-agent baseline achieving only 2.1% anomaly detection is used to highlight the 99.6% result, but the implementation details of that baseline (gear usage, dispatch policy, fault injection) are not provided, preventing verification that the comparison isolates the contribution of the multi-agent governance layer.

Authors: We agree that the baseline implementation details are insufficient and that this prevents verification of the comparison. In the revised evaluation section we will add a dedicated paragraph describing the single-agent baseline, specifying the exact gear set used, the dispatch policy, and the precise fault-injection procedure applied during the 10,000 Monte Carlo episodes. revision: yes
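The sensitivity analysis promised in the second response could take roughly the following shape: scale the fault magnitude around its nominal value and record how the detection rate moves. Everything in this sketch is a placeholder, including the threshold detector, the Gaussian noise model, and all numeric values; none of it is drawn from the paper or from the NIST dataset.

```python
import random
from typing import Dict, Iterable


def detection_rate(fault_mm: float, noise_mm: float, threshold_mm: float,
                   episodes: int, seed: int = 0) -> float:
    """Fraction of episodes in which a noisy reading of the fault exceeds the threshold."""
    rng = random.Random(seed)  # fixed seed: the same noise draws at every fault level
    hits = sum(
        1 for _ in range(episodes)
        if abs(fault_mm + rng.gauss(0.0, noise_mm)) > threshold_mm
    )
    return hits / episodes


def sweep(nominal_fault_mm: float, scales: Iterable[float],
          **detector_kwargs) -> Dict[float, float]:
    """Detection rate at each multiple of the nominal fault magnitude."""
    return {k: detection_rate(nominal_fault_mm * k, **detector_kwargs) for k in scales}


# Placeholder values chosen only to make the sweep visible.
rates = sweep(1.0, [0.5, 0.75, 1.0, 1.25, 1.5],
              noise_mm=0.2, threshold_mm=0.8, episodes=2000)
for scale, rate in rates.items():
    print(f"fault x{scale:.2f}: detection rate {rate:.3f}")
```

Reusing one seed across fault levels (common random numbers) keeps the comparison between levels from being blurred by sampling noise.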

Circularity Check

0 steps flagged

Minor self-citation to the prior SMART lifecycle; core proofs and evaluations remain independent

full rationale

The paper states formal proofs for single-agent monotonic stability, execution safety, fallback completeness and MDP equivalence, plus multi-agent guarantees via consensus gating and Lyapunov analysis. These are presented as derived within the current manuscript. The sole self-reference is the phrase 'apply the established \smart{} managed-autonomy lifecycle', which is not shown to be the sole justification for any theorem; the evaluation uses external NIST calibration and Monte Carlo episodes rather than any fitted parameter renamed as a prediction. No self-definitional equations, ansatz smuggling, or uniqueness theorems imported from the same authors appear in the provided text. The derivation chain is therefore self-contained against external benchmarks, warranting only the minimal score for a non-load-bearing self-citation.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 1 invented entity

Abstract does not specify numerical free parameters or mathematical axioms; the gears represent the primary new components of the method.

invented entities (1)

five execution gears (Gobs, Gsug, Gplan, Gexec, Gint) · no independent evidence
purpose: Provide discrete control levels for safety and governance
Core of the proposed system, introduced to combine with utility-gated dispatch.

pith-pipeline@v0.9.1-grok · 5822 in / 1368 out tokens · 41076 ms · 2026-07-02T13:11:15.810573+00:00 · methodology

