From judgment to interference: Early stopping LLM harmful outputs via streaming content monitoring

Yang Li, Qiang Sheng, Yehan Yang, Xueyao Zhang, Juan Cao · 2025 · arXiv 2506.09996

3 Pith papers cite this work. Polarity classification is still being indexed.

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

TwinGate: Stateful Defense against Decompositional Jailbreaks in Untraceable Traffic via Asymmetric Contrastive Learning

cs.CR · 2026-04-30 · unverdicted · novelty 6.0

TwinGate deploys a stateful dual-encoder system with asymmetric contrastive learning to detect decompositional jailbreaks in untraceable LLM traffic at high recall and low false-positive rate with negligible latency.

Beyond Linear Probes: Dynamic Safety Monitoring for Language Models

cs.LG · 2025-09-30 · unverdicted · novelty 6.0

TPCs allow term-by-term progressive polynomial evaluation on LLM activations for flexible safety monitoring that supports both stronger guardrails and low-cost adaptive cascades.

From Static Inference to Dynamic Interaction: A Survey of Streaming Large Language Models

cs.CL · 2026-03-04 · unverdicted · novelty 5.0

The paper supplies a unified definition based on data flow and dynamic interaction plus a systematic taxonomy to organize fragmented work on streaming large language models.

citing papers explorer

Showing 3 of 3 citing papers.

TwinGate: Stateful Defense against Decompositional Jailbreaks in Untraceable Traffic via Asymmetric Contrastive Learning

cs.CR · 2026-04-30 · unverdicted · none · ref 16

TwinGate deploys a stateful dual-encoder system with asymmetric contrastive learning to detect decompositional jailbreaks in untraceable LLM traffic at high recall and low false-positive rate with negligible latency.

Beyond Linear Probes: Dynamic Safety Monitoring for Language Models

cs.LG · 2025-09-30 · unverdicted · none · ref 39

TPCs allow term-by-term progressive polynomial evaluation on LLM activations for flexible safety monitoring that supports both stronger guardrails and low-cost adaptive cascades.

From Static Inference to Dynamic Interaction: A Survey of Streaming Large Language Models

cs.CL · 2026-03-04 · unverdicted · none · ref 7

The paper supplies a unified definition based on data flow and dynamic interaction plus a systematic taxonomy to organize fragmented work on streaming large language models.
