XtrAIn shifts occlusion from input space to parameter space along the training trajectory to produce cleaner feature attributions than standard methods.
Explainable artificial intelligence (XAI): From inherent explainability to large language models. arXiv preprint arXiv:2501.09967
5 Pith papers cite this work. Polarity classification is still indexing.
Citing papers by year: 2026 (5)
Representative citing papers
The paper introduces the Agentic Risk Standard (ARS) as a payment settlement framework that delivers predefined compensation for AI agent execution failures, misalignment, or unintended outcomes.
GRAPHIC interprets confusion matrices from linear classifiers on intermediate layers as graphs to visualize and quantify class confusion dynamics in deep learning.
ERPPO adds a DSA-based ambiguity estimator to MAPPO and switches between L1 and L2 entropy regularization to improve exploration and stability in non-stationary multi-dimensional observations.