{"total":57,"items":[{"citing_arxiv_id":"2605.22169","ref_index":38,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"Balancing Uncertainty and Diversity of Samples: Leveraging Diversity of Least, High Confidence Samples for Effective Active Learning","primary_cat":"cs.CV","submitted_at":"2026-05-21T08:39:51+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":4.0,"formal_verification":"none","one_line_summary":"Proposes LCD and three other hybrid uncertainty-diversity sampling methods for active learning that outperform prior approaches by selecting uncertain yet diverse samples.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.21938","ref_index":25,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"Optimal Guarantees for Auditing Rényi Differentially Private Machine Learning","primary_cat":"cs.LG","submitted_at":"2026-05-21T03:18:09+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"A hypothesis-testing framework with class-restricted Donsker-Varadhan estimators provides optimal non-asymptotic confidence intervals and minimax lower bounds for black-box auditing of Rényi DP guarantees.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.21060","ref_index":20,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"Divide et Calibra: Multiclass Local Calibration via Vector Quantization","primary_cat":"cs.LG","submitted_at":"2026-05-20T11:44:55+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"Vector quantization induces a structured partition of the representation space for composing heterogeneous multiclass calibration maps via shared codeword-dependent Dirichlet 
factors.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.20534","ref_index":42,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"Axiomatizing Neural Networks via Pursuit of Subspaces","primary_cat":"cs.LG","submitted_at":"2026-05-19T22:12:58+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"Authors introduce the Pursuit of Subspaces (PoS) hypothesis, an axiomatic geometric framework that unifies explanations for representation, computation, and generalization in shallow and deep neural networks.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.20005","ref_index":30,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"Fine-Tuning Without Forgetting via Loss-Adaptive Learning Rates","primary_cat":"cs.LG","submitted_at":"2026-05-19T15:36:52+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"FINCH is a loss-adaptive learning-rate schedule that reduces forgetting by 93% on average during LLM fine-tuning while matching standard task performance across several benchmarks.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.18202","ref_index":86,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"Concise and Logically Consistent Conformal Sets for Neuro-Symbolic Concept-Based Models","primary_cat":"cs.LG","submitted_at":"2026-05-18T10:43:02+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"COCOCO is a conformal framework for NeSy-CBMs that jointly conformalizes concepts and labels, reconciles them via deduction-abduction revision, and satisfies consistency, coverage, and conciseness 
while retaining distribution-free guarantees.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.18020","ref_index":21,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"Federated Learning by Utility-Constrained Stochastic Aggregation for Improving Rational Participation","primary_cat":"cs.LG","submitted_at":"2026-05-18T08:12:36+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":4.0,"formal_verification":"none","one_line_summary":"FedUCA formalizes the server as an optimizer that uses utility-constrained stochastic aggregation to maximize client retention and global performance in heterogeneous federated learning.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.17373","ref_index":25,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"FML-bench: A Controlled Study of AI Research Agent Strategies from the Perspective of Search Dynamics","primary_cat":"cs.LG","submitted_at":"2026-05-17T10:30:38+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"FML-Bench shows a simple greedy hill-climber nearly matches tree search on dense-opportunity tasks while an adaptive agent that broadens search on stagnation outperforms six baselines across 18 tasks.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.16834","ref_index":21,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"Learning Relative Representations for Fine-Grained Multimodal Alignment with Limited Data","primary_cat":"cs.CV","submitted_at":"2026-05-16T06:33:38+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"A new post-hoc alignment technique uses learnable anchors to capture 
token-level relative similarities between modalities, outperforming global alignment baselines on zero-shot classification, retrieval, and segmentation with scarce paired examples.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.18868","ref_index":28,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"DarkLLM: Learning Language-Driven Adversarial Attacks with Large Language Models","primary_cat":"cs.CR","submitted_at":"2026-05-15T12:28:16+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"DarkLLM trains an LLM to generate language-driven adversarial perturbations that unify targeted, untargeted, segmentation, and multi-model attacks on foundation models.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.15675","ref_index":32,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"Interaction-Aware Influence Functions for Group Attribution","primary_cat":"cs.LG","submitted_at":"2026-05-15T06:54:15+00:00","verdict":"CONDITIONAL","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"Extends influence functions with a second-order pairwise interaction term that improves group attribution accuracy over simple summation on multiple model-dataset pairs and instruction-tuning selection tasks.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.15154","ref_index":25,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"RoSHAP: A Distributional Framework and Robust Metric for Stable Feature Attribution","primary_cat":"stat.ML","submitted_at":"2026-05-14T17:51:09+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"RoSHAP is a 
robust feature-ranking metric that summarizes the distributional properties of SHAP values via bootstrap resampling and asymptotic normality to reward active, strong, and stable features.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.14413","ref_index":14,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"MahaVar: OOD Detection via Class-wise Mahalanobis Distance Variance under Neural Collapse","primary_cat":"cs.LG","submitted_at":"2026-05-14T05:58:19+00:00","verdict":"CONDITIONAL","verdict_confidence":"MODERATE","novelty_score":5.0,"formal_verification":"none","one_line_summary":"MahaVar augments the Mahalanobis OOD score with class-wise distance variance, which is theoretically higher for in-distribution samples under relaxed Neural Collapse geometry.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.13835","ref_index":25,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"Unlocking Patch-Level Features for CLIP-Based Class-Incremental Learning","primary_cat":"cs.CV","submitted_at":"2026-05-13T17:56:23+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"SPA unlocks patch-level features in CLIP for class-incremental learning via semantic-guided selection and optimal transport alignment with class descriptions, plus projectors and pseudo-feature replay to reduce forgetting.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.13214","ref_index":21,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"Backdoor Channels Hidden in Latent Space: Cryptographic Undetectability in Modern Neural 
Networks","primary_cat":"cs.CR","submitted_at":"2026-05-13T09:06:25+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"Backdoors can be embedded in ResNet and ViT models as statistically indistinguishable latent directions, reducing cryptographic undetectability to an intractable hypothesis test over parameter distributions.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"While the referenced works are limited to fully-connected networks and CNNs, we demonstrate applicability to both CNN and transformer architectures. Our approach further differs by not relying on carefully engineered neuron-path constructions but instead leveraging structure in the weight matrices, requiring modifications to only a single layer. While Lamparth and Reuel [21] also analyse internal representations to identify network components important for a backdooring mechanism, the backdoor originates from data poisoning and lacks formal cryptographic assurances. Backdoor defences at the deployment phase are intended to detect and eliminate backdoors in pre-trained networks [5]. 
Removal strategies often exploit the fact that backdoor functionality is"},{"citing_arxiv_id":"2605.13148","ref_index":20,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"Understanding Generalization through Decision Pattern Shift","primary_cat":"cs.LG","submitted_at":"2026-05-13T08:14:59+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"DPS quantifies deviation of per-sample decision patterns from class averages and shows linear correlation with generalization gaps while unifying degradation scenarios into a continuous trajectory.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.12942","ref_index":39,"ref_count":2,"confidence":0.55,"is_internal_anchor":false,"paper_title":"From Compression to Accountability: Harmless Copyright Protection for Dataset Distillation","primary_cat":"cs.CR","submitted_at":"2026-05-13T03:23:35+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"SubPopMark embeds verifiable subpopulation biases into distilled datasets via CVM and USTM optimization stages, allowing provenance inference through comparison of model output signatures against a reference behavior bank.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.11558","ref_index":36,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"A Composite Activation Function for Learning Stable Binary Representations","primary_cat":"cs.LG","submitted_at":"2026-05-12T05:41:36+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"HTAF is a sigmoid-tanh composite that approximates the Heaviside function to allow stable gradient training of binary activation networks, yielding ICBMs with stable 
discretization and competitive performance on image tasks.","context_count":1,"top_context_role":"dataset","top_context_polarity":"use_dataset","context_text":"[34] Insung Kong, Juntong Chen, Sophie Langer, and Johannes Schmidt-Hieber. On the expressivity of deep heaviside networks. arXiv preprint arXiv:2505.00110, 2025. [35] Insung Kong and Yongdai Kim. Posterior concentrations of fully-connected bayesian neural networks with general priors on the weights. Journal of Machine Learning Research, 26(94):1-60, 2025. [36] Alex Krizhevsky, Geoffrey Hinton, et al. Learning multiple layers of features from tiny images. 2009. [37] Akshay Kulkarni, Ge Yan, Chung-En Sun, Tuomas Oikarinen, and Tsui-Wei Weng. Interpretable generative models through post-hoc concept bottlenecks. In Proceedings of the Computer Vision and Pattern Recognition Conference, pages 8162-8171, 2025."},{"citing_arxiv_id":"2605.10756","ref_index":27,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"TINS: Test-time ID-prototype-separated Negative Semantics Learning for OOD Detection","primary_cat":"cs.CV","submitted_at":"2026-05-11T15:54:34+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"TINS improves OOD detection by learning negative semantics at test time with ID-prototype separation, cutting average FPR95 from 14.04% to 6.72% on the Four-OOD benchmark with ImageNet-1K.","context_count":1,"top_context_role":"other","top_context_polarity":"unclear","context_text":"[25] Xue Jiang, Feng Liu, Zhen Fang, Hong Chen, Tongliang Liu, Feng Zheng, and Bo Han. Negative label guided ood detection with pretrained vision-language models. In The Twelfth International Conference on Learning Representations, 2024. [26] Shu Kong and Deva Ramanan. Opengan: Open-set recognition via open data generation. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 813-822, 2021. 
[27] Alex Krizhevsky, Geoffrey Hinton, et al. Learning multiple layers of features from tiny images. 2009. [28] Ya Le and Xuan Yang. Tiny imagenet visual recognition challenge. CS 231N, 7(7):3, 2015. [29] Kimin Lee, Kibok Lee, Honglak Lee, and Jinwoo Shin. A simple unified framework for detecting out-of-distribution samples and adversarial attacks. Advances in neural information"},{"citing_arxiv_id":"2605.09963","ref_index":1,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"Learning to Perceive \"Where\": Spatial Pretext Tasks for Robust Self-Supervised Learning","primary_cat":"cs.CV","submitted_at":"2026-05-11T04:15:34+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"Spatial Prediction pretext task learns spatial structure in self-supervised learning by regressing relative position and scale between image views, yielding more structured representations and better generalization.","context_count":1,"top_context_role":"dataset","top_context_polarity":"use_dataset","context_text":"The left panel illustrates the problem setting: two cropped and resized views (orange and blue) are sampled from the same image, and the SSL model predicts the relative position and scale of the orange patch with respect to the blue reference. The right panel summarizes the evaluation tasks for benchmarking SSL models. (a) Image recognition includes in-domain classification on C100 [1] and IN-1K [2], out-of-distribution robustness on IN-C [3], IN-R [4], Sketch [5], and Occlusion, and cross-dataset transfer learning on Flowers [6], C100 [1], DTD [7], and Food [8]. (b) Dense prediction tasks include semantic segmentation on PASCAL VOC [9] and depth estimation on NYU [10]. 
(c) Spatial prediction evaluates the ability of SSL models to estimate relative position and scale between"},{"citing_arxiv_id":"2605.07756","ref_index":19,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"When Losses Align: Gradient-Based Composite Loss Weighting for Efficient Pretraining","primary_cat":"cs.LG","submitted_at":"2026-05-08T13:59:19+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"A bilevel method learns composite pretraining loss weights online via gradient alignment with a downstream objective, matching tuned baselines at roughly 30% extra cost over one training run.","context_count":1,"top_context_role":"dataset","top_context_polarity":"use_dataset","context_text":"The mean and standard deviation across 4 seeds are reported. The best method for each dataset-backbone combination is shown in bold. 6 4.2 Computer vision Experiment design. We evaluate our method in the computer vision domain using the solo-learn [18], a standardized framework for self-supervised representation learning. We consider experiments on CIFAR-10, CIFAR-100 [19], and ImageNet-100, a subset of ImageNet [20]. We apply our approach to All4One [5], a recent self-supervised method that combines multiple objectives inspired by Barlow Twins [21], BYOL [2], and NNCLR [22]. 
Model quality is evaluated following the benchmark protocol via online training of a linear classifier on top of detached representations, reporting validation top-1 accuracy."},{"citing_arxiv_id":"2605.07634","ref_index":15,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"Robust stochastic first order methods in heavy-tailed noise via medoid mini-batch gradient sampling","primary_cat":"math.OC","submitted_at":"2026-05-08T12:01:25+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"R-SGD-Mini achieves O(1/T) convergence of expected squared gradient norm to a noise-dependent neighborhood in heavy-tailed settings by selecting the medoid gradient from M data chunks.","context_count":1,"top_context_role":"other","top_context_polarity":"unclear","context_text":"Dusan Stamenkovic. Nonlinear gradient mappings and stochastic optimization: A general framework with applications to heavy-tail noise. SIAM Journal on Optimization, 33(2):394-423, 2023. doi: 10.1137/21M145896X. URL https://doi.org/10.1137/21M145896X. [14] Alex Krizhevsky, Geoffrey Hinton, et al. Learning multiple layers of features from tiny images. 2009. [15] Zijian Liu, Jiawei Zhang, and Zhengyuan Zhou. Breaking the lower bound with (little) structure: Acceleration in non-convex stochastic optimization with heavy-tailed noise. In The Thirty Sixth Annual Conference on Learning Theory, pages 2266-2290. PMLR, 2023. [16] Gábor Lugosi and Shahar Mendelson. 
Mean estimation and regression under heavy-tailed distributions: A survey."},{"citing_arxiv_id":"2605.07512","ref_index":27,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"Hierarchical Dual-Subspace Decoupling for Continual Learning in Vision-Language Models","primary_cat":"cs.CV","submitted_at":"2026-05-08T09:42:05+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"HDSD decouples parameter subspaces in vision-language models via a Feature Modulation Module, General Fusion Module with adaptive thresholds, and Hierarchical Learning Module with SVD scaling to minimize cross-task interference and achieve state-of-the-art class-incremental learning performance.","context_count":1,"top_context_role":"dataset","top_context_polarity":"use_dataset","context_text":"A = diag(s_1 σ_1, s_2 σ_2, ..., s_i σ_i) (13) The parameter matrix is then reconstructed as: W_test = U A V^T (14) Notably, the scaling factors introduced during training are preserved at inference time, resulting in a hierarchical weighting over task-specific components. 4 Experiment 4.1 Datasets and Metrics We conduct experiments on three datasets: CIFAR-100 [27], ImageNet-100, and ImageNet-R [28]. The CIFAR-100 dataset consists of 100 categories, each containing 600 color images with a resolution of 32×32 pixels. In this dataset, 500 images are designated for the training set, while 100 images are reserved for the test set. 
The ImageNet-100 dataset, a subset of the larger ImageNet-1k dataset [29], comprises 100 categories."},{"citing_arxiv_id":"2605.07119","ref_index":23,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"Classification Fields: Arbitrarily Fine Recursive Hierarchical Clustering From Few Examples","primary_cat":"stat.ML","submitted_at":"2026-05-08T01:50:49+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"Classification fields are infinite recursive hierarchical cluster structures generated by a local refinement rule, and a ReLU network predictor learned from finite prefixes can approximate the generator and extend it to deeper levels with exponential convergence in the completed cell metric.","context_count":1,"top_context_role":"dataset","top_context_polarity":"use_dataset","context_text":"refinement windows needed for such approximations, yielding a finite neural rollout mechanism for infinite-depth hierarchical completion. Empirically, we evaluate this viewpoint in three settings of increasing mismatch from the theory: matched CFG-generated hierarchies, out-of-family IFS fractal hierarchies, and image-induced recursive clustering hierarchies built from CLIP [34] embeddings of CIFAR datasets [23]. Across these settings, CFP rollouts preserve ordered child slots, unordered geometry, and hierarchy-level path metrics better than simpler baselines. Overall, this paper makes three contributions. 
First, we introduce classification fields as a metric-geometric model for infinite-depth hierarchical clustering: recursively generated DAGs whose nodes"},{"citing_arxiv_id":"2605.06357","ref_index":30,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"Memory Efficient Full-gradient Attacks (MEFA) Framework for Adversarial Defense Evaluations","primary_cat":"cs.LG","submitted_at":"2026-05-07T14:35:04+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"MEFA enables exact full-gradient white-box attacks on iterative stochastic purification defenses like diffusion and Langevin EBMs by trading recomputation for lower memory, revealing vulnerabilities missed by approximate-gradient methods.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.05871","ref_index":13,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"Retain-Neutral Surrogates for Min-Max Unlearning","primary_cat":"cs.LG","submitted_at":"2026-05-07T08:38:09+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"ROSU derives a closed-form retain-neutral perturbation for min-max unlearning that bounds retain damage via curvature and improves performance when gradients are aligned.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.05769","ref_index":43,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"Adaptive Selection of LoRA Components in Privacy-Preserving Federated Learning","primary_cat":"cs.LG","submitted_at":"2026-05-07T07:01:00+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"AS-LoRA adaptively chooses which LoRA factor to update per layer and round using a curvature-aware 
second-order score, eliminating reconstruction error floors and improving performance in DP federated learning.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.04209","ref_index":27,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"Undetectable Backdoors in Model Parameters: Hiding Sparse Secrets in High Dimensions","primary_cat":"cs.CR","submitted_at":"2026-05-05T18:48:09+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"Sparse Backdoor plants a provably undetectable backdoor in neural network weights via structured sparse perturbations and isotropic Gaussian dithering, with detection hardness reduced to Sparse PCA.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.02109","ref_index":17,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"Detecting Adversarial Data via Provable Adversarial Noise Amplification","primary_cat":"cs.LG","submitted_at":"2026-05-04T00:08:05+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"A provable adversarial noise amplification theorem under sufficient conditions enables a custom-trained detector that identifies adversarial examples at inference time using enhanced layer-wise noise signals.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2604.18867","ref_index":24,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"Hierarchically Robust Zero-shot Vision-language Models","primary_cat":"cs.CV","submitted_at":"2026-04-20T21:42:14+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"A hierarchical adversarial fine-tuning method for VLMs aligns image 
and text embeddings at multiple hierarchy depths with theoretical margin connections to boost robustness to leaf and superclass attacks while using multiple trees for semantic variety.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2604.14621","ref_index":22,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"Differentially Private Conformal Prediction","primary_cat":"stat.ML","submitted_at":"2026-04-16T05:08:02+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"DPCP delivers end-to-end differentially private conformal prediction sets that are tighter than split-based private methods under the same privacy budget while maintaining coverage under regularity conditions.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2604.14587","ref_index":15,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"CLion: Efficient Cautious Lion Optimizer with Enhanced Generalization","primary_cat":"cs.LG","submitted_at":"2026-04-16T03:32:48+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"CLion achieves O(1/N) generalization error and O(√d / T^{1/4}) convergence for nonconvex stochastic optimization, improving on Lion's O(1/(N τ^T)) bound.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2604.03154","ref_index":21,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"DSBD: Dual-Aligned Structural Basis Distillation for Graph Domain Adaptation","primary_cat":"cs.LG","submitted_at":"2026-04-03T16:23:59+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"DSBD distills a dual-aligned structural basis to adapt GNNs 
across graphs with structural distribution shifts, outperforming prior methods on benchmarks.","context_count":1,"top_context_role":"dataset","top_context_polarity":"use_dataset","context_text":"ployed on unseen target graphs (Line 24). The complexity analysis can be found in Appendix E. 4 Experiments Datasets. To evaluate the effectiveness of DSBD, we conduct experiments on both graph and image benchmarks under three representative types of domain shifts. (1) Structure-based shifts: Structural shifts are simulated on MNIST [23], CIFAR10 [21], PROTEINS [9], Mutagenicity [19], NCI1 [45], FRANKENSTEIN [35], and ogbg-molhiv [15] by partitioning graphs into multiple domains according to node or edge densities [67, 68]. (2) Feature-based shifts: We further evaluate on DD, PROTEINS, BZR, BZR_MD, COX2, and COX2_MD, where source and target domains share identical global structures but exhibit significant shifts in node feature distributions."},{"citing_arxiv_id":"2603.13864","ref_index":63,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"Inevitable Encounters: Backdoor Attacks Involving Lossy Compression","primary_cat":"cs.CR","submitted_at":"2026-03-14T09:45:59+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"ROI coding enables backdoor triggers to survive lossy compression by embedding malicious information into binary bitstreams via sample-specific or customized masks for both learned and traditional codecs.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2602.07200","ref_index":60,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"BadSNN: Backdoor Attacks on Spiking Neural Networks via Adversarial Spiking 
Neuron","primary_cat":"cs.CR","submitted_at":"2026-02-06T21:20:41+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"BadSNN injects backdoors into spiking neural networks by adversarially tuning LIF neuron hyperparameters and optimizing triggers, achieving higher attack success than prior data-poisoning methods while remaining robust to common defenses.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2512.10275","ref_index":39,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"Sample-wise Adaptive Weighting for Transfer Consistency in Adversarial Distillation","primary_cat":"cs.CV","submitted_at":"2025-12-11T04:31:04+00:00","verdict":"CONDITIONAL","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"SAAD adaptively weights adversarial training samples by their transferability to the teacher, yielding higher AutoAttack robustness than prior distillation methods on CIFAR and Tiny-ImageNet without extra compute.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2511.17634","ref_index":23,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"Efficient Score Pre-computation for Diffusion Models via Cross-Matrix Krylov Projection","primary_cat":"cs.CV","submitted_at":"2025-11-19T07:21:49+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"Cross-matrix Krylov projection reuses shared subspaces from seed matrices to accelerate score pre-computation in diffusion models, delivering 15.8-43.7% time savings and up to 115x speedup versus DDPM 
baselines.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2510.21366","ref_index":25,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"BADiff: Bandwidth Adaptive Diffusion Model","primary_cat":"cs.CV","submitted_at":"2025-10-24T11:50:03+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"BADiff introduces joint training of diffusion models with quality conditioning derived from bandwidth to enable adaptive early-stop sampling that preserves appropriate perceptual quality.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2510.01608","ref_index":29,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"NPN: Non-Linear Projections of the Null-Space for Imaging Inverse Problems","primary_cat":"cs.CV","submitted_at":"2025-10-02T02:45:06+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"NPN introduces a neural-network-based regularization that promotes reconstructions lying in a low-dimensional projection of the sensing operator's null-space, with claimed theoretical guarantees and improved empirical performance across compressive sensing, deblurring, super-resolution, CT, and MRI.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2509.25003","ref_index":23,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"Score-based Membership Inference on Diffusion Models","primary_cat":"cs.LG","submitted_at":"2025-09-29T16:28:55+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"Presents SimA, a score-based single-query membership inference attack for diffusion models and LDMs that uses denoiser output norm 
to reveal training set proximity and outperforms multi-query baselines on eight datasets.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2508.14255","ref_index":41,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"Graph Concept Bottleneck Models","primary_cat":"cs.LG","submitted_at":"2025-08-19T20:23:18+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"GraphCBMs extend concept bottleneck models by building latent concept graphs to model correlations between concepts, yielding better image classification accuracy, more informative structure for interpretability, and stronger intervention results.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2506.13763","ref_index":28,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"Diagnosing and Improving Diffusion Models by Estimating the Optimal Loss Value","primary_cat":"cs.LG","submitted_at":"2025-06-16T17:59:54+00:00","verdict":"CONDITIONAL","verdict_confidence":"MODERATE","novelty_score":6.0,"formal_verification":"none","one_line_summary":"Derives closed-form optimal loss for unified diffusion models, provides variance-controlled estimators, and shows improved diagnosis, training schedules, and power-law scaling after subtracting the optimal value.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2506.05398","ref_index":22,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"2ndMatch: Finetuning Pruned Diffusion Models via Second-Order Jacobian Matching","primary_cat":"cs.GR","submitted_at":"2025-06-03T20:04:53+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"2ndMatch finetunes pruned diffusion models via 
second-order Jacobian matching inspired by Finite-Time Lyapunov Exponents to reduce the quality gap with dense models on image generation tasks.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2506.02671","ref_index":21,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"Test-Time Distillation for Continual Model Adaptation","primary_cat":"cs.CV","submitted_at":"2025-06-03T09:16:51+00:00","verdict":"CONDITIONAL","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"CoDiRe blends VLM and target model predictions via MSP-based weighting and Optimal Transport rectification to enable stable continual test-time adaptation, outperforming CoTTA by 10.55% on ImageNet-C at 48% of the compute cost.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2506.01247","ref_index":46,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"Beyond Interpretability: When, Why, and How Sparse Autoencoders Enable Label-Free Visual Steering","primary_cat":"cs.CV","submitted_at":"2025-06-02T01:51:20+00:00","verdict":null,"verdict_confidence":null,"novelty_score":null,"formal_verification":null,"one_line_summary":null,"context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2411.05183","ref_index":38,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"Why CNN Features Are not Gaussian: A Statistical Anatomy of Deep Representations","primary_cat":"cs.CV","submitted_at":"2024-11-07T21:04:58+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"CNN feature activations follow long-tailed Weibull-like distributions with increasing tail dependence by depth rather than Gaussian, indicating a Matthew process that concentrates signal in 
tails.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2410.04941","ref_index":14,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"TOAST: Transformer Optimization using Adaptive and Simple Transformations","primary_cat":"cs.LG","submitted_at":"2024-10-07T11:35:24+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"TOAST approximates full transformer blocks in pretrained models via lightweight closed-form mappings to cut parameters and FLOPs without retraining or finetuning.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2410.03000","ref_index":19,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"Towards Generalized Certified Robustness with Multi-Norm Training","primary_cat":"cs.LG","submitted_at":"2024-10-03T21:20:46+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"CURE is the first multi-norm certified training method that improves union robustness across l_p norms and unseen perturbations on MNIST, CIFAR-10 and TinyImagenet.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2409.01633","ref_index":25,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"SleepNet and DreamNet: Enriching and Reconstructing Representations for Consolidated Visual Classification","primary_cat":"cs.LG","submitted_at":"2024-09-03T06:04:39+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":3.0,"formal_verification":"none","one_line_summary":"SleepNet and DreamNet enrich visual features via supervised pre-trained encoders and reconstruct hidden states with encoder-decoder frameworks to outperform prior state-of-the-art 
classifiers.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2408.11338","ref_index":8,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"Automatic Dataset Construction (ADC): Sample Collection, Data Curation, and Beyond","primary_cat":"cs.AI","submitted_at":"2024-08-21T04:45:12+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"The ADC method automates the creation of large image classification datasets using LLMs and search engines, achieving 79% human agreement and reducing label noise on a 1 million image clothing dataset, while also releasing benchmarks for noise and bias issues.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null}],"limit":50,"offset":0}