Assessing hidden risks of llms: An empirical study on robustness, consistency, and credibility

Wentao Ye, Mingfeng Ou, Tianyi Li, Xuetao Ma, Yifan Yanggong, Sai Wu, Jie Fu, Gang Chen, Junbo Zhao, et al · 2023 · arXiv 2305.10235

3 Pith papers cite this work. Polarity classification is still indexing.

3 Pith papers citing it

read on arXiv browse 3 citing papers

citation-role summary

background 2

citation-polarity summary

background 2

representative citing papers

CASCADE: Detecting Inconsistencies between Code and Documentation with Automatic Test Generation

cs.SE · 2026-04-21 · unverdicted · novelty 7.0

CASCADE finds code-documentation mismatches by running LLM-generated tests from docs and confirming failure only when documentation-derived code succeeds on the same test.

Pop Quiz Attack: Black-box Membership Inference Attacks Against Large Language Models

cs.CR · 2026-05-07 · unverdicted · novelty 6.0

PopQuiz Attack infers LLM training data membership by turning examples into quiz questions and measuring answer accuracy, reaching 0.873 average ROC-AUC across six models and outperforming prior methods by 20.6%.

TrustLLM: Trustworthiness in Large Language Models

cs.CL · 2024-01-10 · unverdicted · novelty 5.0

TrustLLM defines eight trustworthiness principles, creates a six-dimension benchmark, and evaluates 16 LLMs showing proprietary models generally lead but some open-source ones are close while over-calibration can hurt utility.

citing papers explorer

Showing 3 of 3 citing papers.

CASCADE: Detecting Inconsistencies between Code and Documentation with Automatic Test Generation cs.SE · 2026-04-21 · unverdicted · none · ref 47
CASCADE finds code-documentation mismatches by running LLM-generated tests from docs and confirming failure only when documentation-derived code succeeds on the same test.
Pop Quiz Attack: Black-box Membership Inference Attacks Against Large Language Models cs.CR · 2026-05-07 · unverdicted · none · ref 50
PopQuiz Attack infers LLM training data membership by turning examples into quiz questions and measuring answer accuracy, reaching 0.873 average ROC-AUC across six models and outperforming prior methods by 20.6%.
TrustLLM: Trustworthiness in Large Language Models cs.CL · 2024-01-10 · unverdicted · none · ref 266
TrustLLM defines eight trustworthiness principles, creates a six-dimension benchmark, and evaluates 16 LLMs showing proprietary models generally lead but some open-source ones are close while over-calibration can hurt utility.

Assessing hidden risks of llms: An empirical study on robustness, consistency, and credibility

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer