From judgment to interference: Early stopping LLM harmful outputs via streaming content monitoring
3 Pith papers cite this work. Polarity classification is still indexing.
verdicts: UNVERDICTED 3
roles: background 1
polarities: background 1
citing papers explorer
- TwinGate: Stateful Defense against Decompositional Jailbreaks in Untraceable Traffic via Asymmetric Contrastive Learning
  TwinGate deploys a stateful dual-encoder system with asymmetric contrastive learning to detect decompositional jailbreaks in untraceable LLM traffic at high recall and low false-positive rate with negligible latency.
- Beyond Linear Probes: Dynamic Safety Monitoring for Language Models
  TPCs allow term-by-term progressive polynomial evaluation on LLM activations for flexible safety monitoring that supports both stronger guardrails and low-cost adaptive cascades.
- From Static Inference to Dynamic Interaction: A Survey of Streaming Large Language Models
  The paper supplies a unified definition based on data flow and dynamic interaction plus a systematic taxonomy to organize fragmented work on streaming large language models.