A behavioral monitoring technique using HTTP, lexical, and timing signals detects guardrail presence with 100% accuracy and distinguishes guardrail blocks from LLM rejections with 98% average F1 on unseen prompts.
Frontiers of Computer Science
5 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (5)
Verdicts: UNVERDICTED (5)
Representative citing papers
TP-TopK uses private warm-up to select k coordinates for DP-SGD, with a stationarity bound showing noise scales with k (not d) under a given criterion, and experiments on image datasets showing learned supports retain more gradient energy than random ones.
Federated personalization of foundation models creates hard-to-detect trustworthiness failures due to privacy constraints, and existing benchmarks cannot adequately evaluate them.
LLM-powered conversational voice sleep diaries achieved higher adherence and richer contextual reports than text-based diaries, with a noted trade-off in structured field completeness.
Presents a four-module LLM framework for text-to-SQL on the ALeRCE astro database, evaluated on 110 NL/SQL pairs across 13 models with perfect-match metrics.
citing papers explorer
- When Do Fewer Coordinates Suffice in DP-SGD?
TP-TopK uses private warm-up to select k coordinates for DP-SGD, with a stationarity bound showing noise scales with k (not d) under a given criterion, and experiments on image datasets showing learned supports retain more gradient energy than random ones.
- Silent Failures in Federated Personalization of Foundation Models
Federated personalization of foundation models creates hard-to-detect trustworthiness failures due to privacy constraints, and existing benchmarks cannot adequately evaluate them.