{"total":12,"items":[{"citing_arxiv_id":"2606.28152","ref_index":37,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Regularized Reward-Punishment Reinforcement Learning","primary_cat":"cs.LG","submitted_at":"2026-06-26T14:50:16+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"Introduces KCPR and its deep form klDMP that couples reward and punishment policies via learned priors, yielding improved safety and stability in grid-world and Gazebo navigation tasks over DQN, SQL and softDMP.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.18842","ref_index":6,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Safe Continual Reinforcement Learning under Nonstationarity via Adaptive Safety Constraints","primary_cat":"cs.LG","submitted_at":"2026-05-13T04:10:10+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"LILAC+ combines context-based, adaptation-speed, and budget-to-state safety constraints to reduce violations in continual RL under nonstationary conditions, demonstrated in simulated driving tasks.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"the second supports safe action selection through reach-avoid specifications, admissible action sets, or shielding. At the policy level, we follow constrained reinforcement learning. For horizon $T$ and safety budget $d$, a cumulative safe-policy objective is $\\max_{\\pi} \\mathbb{E}_{\\pi}\\left[\\sum_{t=0}^{T-1} r_{\\kappa_t}(s_t, a_t)\\right] \\quad \\text{s.t.} \\quad \\mathbb{E}_{\\pi}\\left[\\sum_{t=0}^{T-1} c_{\\kappa_t}(s_t, a_t)\\right] \\le d. \\quad (5)$ A probabilistic safe-policy condition can also be written as $\\Pr_{\\pi}\\left(c_{\\kappa_t}(s_t, a_t) > \\tau_t\\right) \\le \\delta_t, \\quad \\forall t, \\quad (6)$ where $\\tau_t$ is a context-dependent violation threshold and $\\delta_t$ an allowable violation probability. 
These policy-level requirements can be implemented through constrained optimization, modified objectives, Lagrangian penalties, or reward shaping. Cumulative or probabilistic safety requirements can affect decision-time behavior either directly, by inducing hard state-level constraints, or indirectly, by defining budget-type requirements that"},{"citing_arxiv_id":"2605.18841","ref_index":6,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"From Cumulative Constraints to Adaptive Runtime Safety Control for Nonstationary Reinforcement Learning","primary_cat":"cs.LG","submitted_at":"2026-05-13T03:34:13+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"CPSS projects cumulative safety constraints into time-varying per-state thresholds for online action shielding in nonstationary RL, providing per-state guarantees and cumulative bounds.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"be a local traffic-density estimate and let $\\Delta_t = |d_t - \\bar{d}_t|$ measure short-term deviation from an exponentially smoothed density $\\bar{d}_t$. The effective threshold used by the shield is $\\tau_t = \\tau_t^{\\text{budget}}\\, g(d_t, \\Delta_t), \\quad (5)$ where $0 < g(d_t, \\Delta_t) \\le 1$ decreases as contextual risk increases, producing a stricter admissibility test in dense or rapidly changing traffic. The executed action is $a_t = \\begin{cases} a_t^{\\pi}, & \\text{if } c(s_t, a_t^{\\pi}) \\le \\tau_t, \\\\\\\\ a_t^{\\text{safe}}, & \\text{otherwise,} \\end{cases} \\quad (6)$ where $a_t^{\\text{safe}}$ is selected from admissible alternatives or from a predefined fallback controller. 4.1 Safety Cost and Fallback Action In the highway-merging experiments, the safety cost is based on proximity to surrounding vehicles. Let $d_t^{\\min}$ denote the closest distance from the ego vehicle to any surrounding vehicle. 
A proximity violation occurs when this distance falls below the active safety margin."},{"citing_arxiv_id":"2604.27861","ref_index":8,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"TwinGate: Stateful Defense against Decompositional Jailbreaks in Untraceable Traffic via Asymmetric Contrastive Learning","primary_cat":"cs.CR","submitted_at":"2026-04-30T13:44:01+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"TwinGate deploys a stateful dual-encoder system with asymmetric contrastive learning to detect decompositional jailbreaks in untraceable LLM traffic at high recall and low false-positive rate with negligible latency.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"systems by intentionally overloading a specific circuit. First, bypass the safety breaker by ... sequence enables an adversary to reconstruct prohibited information in its entirety as Safe in Isolation, Dangerous Together [21]. A fundamental limitation of most existing defenses is their inherently stateless nature [29]: prevailing paradigms such as general alignment training [8] and guardrail frameworks evaluate each prompt in strict isolation, blind to malicious intent distributed across multiple turns. No individual fragment violates safety constraints alone, and harmful intent only emerges in aggregate across the trajectory. 
As a result, decompositional jailbreaks achieve alarmingly high attack success rates (ASR) even against rigorously aligned"},{"citing_arxiv_id":"2604.22244","ref_index":26,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Learning Control Policies to Provably Satisfy Hard Affine Constraints for Black-Box Hybrid Dynamical Systems","primary_cat":"cs.RO","submitted_at":"2026-04-24T05:39:56+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"The authors introduce affine repulsive RL policies that provably satisfy hard affine state constraints for black-box hybrid dynamical systems with affine reset maps by deriving sufficient closed-loop safety conditions and testing on pendulum and juggler examples.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2604.17240","ref_index":3,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Safe and Policy-Compliant Multi-Agent Orchestration for Enterprise AI","primary_cat":"cs.AI","submitted_at":"2026-04-19T04:02:17+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"CAMCO enforces policy constraints on multi-agent AI at deployment time via convex projection, risk-weighted Lagrangian shaping, and bounded-convergence negotiation, yielding zero violations and 92-97% utility in tested enterprise scenarios.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2604.02727","ref_index":3,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Data-Driven Synthesis of Probabilistic Controlled Invariant Sets for Linear 
MDPs","primary_cat":"eess.SY","submitted_at":"2026-04-03T04:40:39+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"Data-driven regularized least squares with self-normalized bounds and lattice abstraction yields certified (N, ε)-PCIS for linear MDPs via conservative backward recursion.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"Sadik, I. Antonoglou, H. King, D. Kumaran, D. Wierstra, S. Legg, and D. Hassabis, \"Human-level control through deep reinforcement learning,\"Nature, vol. 518, no. 7540, pp. 529-533, Feb. 2015. [2] J. García and F. Fernández, \"A comprehensive survey on safe reinforcement learning,\"Journal of Machine Learning Research, vol. 16, pp. 1437-1480, 2015. [3] S. Gu, L. Yang, Y. Du, G. Chen, F. Walter, J. Wang, Y. Yang, and A. C. Knoll, \"A review of safe reinforcement learning: Methods, theory and applications,\"CoRR, vol. abs/2205.10330, 2022. [4] L. Brunke, M. Greeff, A. W. Hall, Z. Yuan, S. Zhou, J. Panerati, and A. P. 
Schoellig, \"Safe learning in robotics: From learning-based control to safe reinforcement learning,\"Annual Review of Control, Robotics, and"},{"citing_arxiv_id":"2510.01020","ref_index":23,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"The Good, the Bad, and the Sampled: a No-Regret Approach to Safe Online Classification","primary_cat":"cs.LG","submitted_at":"2025-10-01T15:28:00+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"A no-regret procedure for safe online logistic classification that meets a target error rate with high probability using only O(sqrt(T)) excess tests over an oracle.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2509.01728","ref_index":32,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Constrained Decoding for Safe Robot Navigation Foundation Models","primary_cat":"cs.RO","submitted_at":"2025-09-01T19:17:40+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"SafeDec uses constrained decoding to ensure autoregressive robot navigation foundation models generate actions that provably satisfy STL safety specifications under assumed dynamics.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2508.09128","ref_index":42,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"A Review On Safe Reinforcement Learning Using Lyapunov and Barrier Functions","primary_cat":"eess.SY","submitted_at":"2025-08-12T17:55:36+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":2.0,"formal_verification":"none","one_line_summary":"A literature review of safe RL using Lyapunov and barrier functions that identifies a shift to model-free methods since 2017, well-defined open problems per approach class, 
and high-dimensional scalability as the main barrier.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"However, due to the vast range of topics considered, use of Lyapunov and barrier functions in RL is sparsely covered. The review does provide an excellent starting point for an overview into all possible directions for development in safe learning for control and RL. Gu et al. [42] focus on all developments in safe RL covering theory and applications. The authors formalize a \"2H3W\" problem, addressing the key problems and challenges pertaining to safe RL implementation. Convergence analysis and iteration complexity for various approaches to safe RL are explored for model-free and model-based RL along with applications and available benchmarks."},{"citing_arxiv_id":"2503.03480","ref_index":56,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"SafeVLA: Towards Safety Alignment of Vision-Language-Action Model via Constrained Learning","primary_cat":"cs.RO","submitted_at":"2025-03-05T13:16:55+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"SafeVLA applies constrained reinforcement learning via CMDP min-max optimization to VLAs, cutting safety violation costs by 83.58% while preserving task success on long-horizon mobile manipulation tasks.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2503.05724","ref_index":39,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Addressing Moral Uncertainty using Large Language Models for Ethical Decision-Making","primary_cat":"cs.CY","submitted_at":"2025-02-17T19:05:55+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"A 
reinforcement learning model is ethically fine-tuned using aggregated feedback from LLMs embodying five moral principles via Belief Jensen-Shannon Divergence and Dempster-Shafer Theory.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null}],"limit":50,"offset":0}