{"total":15,"items":[{"citing_arxiv_id":"2606.26712","ref_index":2,"ref_count":1,"confidence":0.88,"is_internal_anchor":false,"paper_title":"MLFFM-SegDiff: A Multi-Level Feature Fusion Diffusion Model for Skin Lesion Segmentation","primary_cat":"eess.IV","submitted_at":"2026-06-25T07:45:28+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"MLFFM-SegDiff adds a multi-level feature fusion module and dual-path encoder to a diffusion U-Net, reporting improved Jaccard (0.8546) and Dice (0.9207) scores over baselines on three skin lesion datasets.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.22892","ref_index":30,"ref_count":1,"confidence":0.88,"is_internal_anchor":false,"paper_title":"IViT: A Novel Interpretable Visual Transformer for Skin Disease Detection","primary_cat":"eess.IV","submitted_at":"2026-06-22T06:06:02+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":4.0,"formal_verification":"none","one_line_summary":"IViT applies quadratic programming to a pre-trained Vision Transformer with a multi-objective loss, achieving 93.80% accuracy on six skin disease datasets (0.21% below baseline) while reducing feature redundancy by 29.5% and producing clinically consistent activations.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.22309","ref_index":20,"ref_count":1,"confidence":0.88,"is_internal_anchor":false,"paper_title":"The $\\alpha$-Index: A Penalized Authorship-Integrity Framework for Position-Weighted Scientific Contribution","primary_cat":"cs.DL","submitted_at":"2026-06-21T02:32:03+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"The α-index is a conserved position-weighted authorship framework with a senior-author penalty that decreases credit as 
the number of middle authors increases.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.13135","ref_index":1,"ref_count":1,"confidence":0.88,"is_internal_anchor":false,"paper_title":"Cascade Classification of Dermoscopic Images of Skin Neoplasms with Controllable Sensitivity and External Clinical Validation","primary_cat":"cs.CV","submitted_at":"2026-06-11T09:55:57+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":4.0,"formal_verification":"none","one_line_summary":"Cascade classification improves macro F1 over single-stage for some models by allowing sensitivity control but reveals a large generalization gap on external clinical data.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.11827","ref_index":2,"ref_count":1,"confidence":0.88,"is_internal_anchor":false,"paper_title":"Jaguar: Fast Private CNN Inference with Power-of-Two Homomorphic Arithmetic","primary_cat":"cs.CR","submitted_at":"2026-06-10T09:04:46+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"Jaguar replaces prime-modulus HE with power-of-two arithmetic to enable coefficient-domain convolution and local-shift truncation, reporting 2-3.7x lower latency than Cheetah and Rhombus on ResNet-18/50 and MobileNetV2.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.10735","ref_index":6,"ref_count":1,"confidence":0.88,"is_internal_anchor":false,"paper_title":"Patient-Level Diagnosis of Acute Myeloid Leukemia via Deep Learning Analysis of Bone Marrow Smear","primary_cat":"cs.CV","submitted_at":"2026-06-09T11:45:19+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":4.0,"formal_verification":"none","one_line_summary":"YOLO segmentation plus EfficientNet classification aggregates cell 
predictions to patient-level CBLC ratios, reporting weighted F1 scores of 0.87-0.91 on three external center cohorts from 89 patients.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.04326","ref_index":23,"ref_count":1,"confidence":0.88,"is_internal_anchor":false,"paper_title":"Measuring What Matters: Synthetic Benchmarks for Concept Bottleneck Models","primary_cat":"cs.LG","submitted_at":"2026-06-03T01:01:05+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"Introduces synthetic benchmarks for concept bottleneck models that control data modality, concept choice, annotation quality, and completeness to evaluate performance in decision support and automation.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.17885","ref_index":6,"ref_count":1,"confidence":0.88,"is_internal_anchor":false,"paper_title":"Multi-agent AI systems outperform human teams in creativity","primary_cat":"cs.CL","submitted_at":"2026-05-18T05:52:11+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"Multi-agent LLM teams outperform human teams in creativity (d=1.50) across tasks by producing more novel ideas, with distinct semantic exploration patterns predicting success for each group.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.18878","ref_index":283,"ref_count":1,"confidence":0.88,"is_internal_anchor":false,"paper_title":"Prognostic Value of Lung Ultrasound Biomarkers for Readmission Risk in Congestive Heart Failure: A Pilot Data-Driven Analysis","primary_cat":"eess.SP","submitted_at":"2026-05-16T02:49:12+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"Pilot 
study uses pretrained video encoder features from lung ultrasound to predict 30-day CHF readmission, finding lower-lung views and temporal differences most informative with top MLP F1 of 0.80.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.14403","ref_index":6,"ref_count":1,"confidence":0.88,"is_internal_anchor":false,"paper_title":"DermAgent: A Self-Reflective Agentic System for Dermatological Image Analysis with Multi-Tool Reasoning and Traceable Decision-Making","primary_cat":"cs.CV","submitted_at":"2026-05-14T05:41:11+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"DermAgent orchestrates seven vision-language tools in a Plan-Execute-Reflect loop with dual-modality retrieval from 413k cases and a critic module to outperform GPT-4o by 17.6% in zero-shot dermatological diagnosis accuracy.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.11111","ref_index":1,"ref_count":1,"confidence":0.88,"is_internal_anchor":false,"paper_title":"ShardTensor: Domain Parallelism for Scientific Machine Learning","primary_cat":"cs.DC","submitted_at":"2026-05-11T18:20:10+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"ShardTensor is a domain-parallelism system for SciML that enables flexible scaling of extreme-resolution spatial datasets by removing the constraint of batch size one per device.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"Index Terms-HPC for ML; Parallel and distributed learning algorithms; Model, pipeline, and data parallelism I. INTRODUCTION Scientific machine learning applications have become a vehicle for accelerated simulation, scientific discovery, and industrial design. 
Machine learning has found applications in an incredible breadth of domains: healthcare and medicine [1], [2], industrial design [3]-[5], fluid dynamics [6] and aerodynamics [7], weather and climate forecasting [8], [9], fundamental sciences [10]-[12], and many, many more [13]-[15]. It is not an overstatement to say that machine learning methods are fundamentally changing scientific research, all the way from early development to end user and industrial"},{"citing_arxiv_id":"2604.28053","ref_index":45,"ref_count":1,"confidence":0.88,"is_internal_anchor":false,"paper_title":"To Build or Not to Build? Factors that Lead to Non-Development or Abandonment of AI Systems","primary_cat":"cs.CY","submitted_at":"2026-04-30T16:00:52+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"A scoping review and empirical analysis produce a six-category taxonomy of factors driving AI non-development and abandonment, showing that practical issues like resource limits and organizational dynamics often outweigh ethical concerns in real decisions.","context_count":1,"top_context_role":"background","top_context_polarity":"support","context_text":"Inappropriate model selection [99, 130] Model training too technically challenging [32, 99] Model training too resource intensive [41, 170] Undefined or inappropriate success/evaluation criteria [19, 99] Model performance not sufficient [3, 130] Difficult to integrate into pipelines [13, 137] Inability to conduct evaluation/pilot [130] Low adoption in pilot deployment [45] Changing incentives/priorities [74] Not indicated/scoped well [20, 97, 130] Not aligned with organizational strategy [131] Insufficient leadership sponsorship [96, 106] Clients/customers did not want it [55, 147] Concerns over keeping company assets private [32] Lacked technical expertise to build [39] Lacked technical expertise to maintain 
[67]"},{"citing_arxiv_id":"2604.19323","ref_index":8,"ref_count":1,"confidence":0.88,"is_internal_anchor":false,"paper_title":"Concept Inconsistency in Dermoscopic Concept Bottleneck Models: A Rough-Set Analysis of the Derm7pt Dataset","primary_cat":"cs.LG","submitted_at":"2026-04-21T10:45:50+00:00","verdict":"CONDITIONAL","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"Rough-set analysis finds 16.4% of 305 concept profiles in Derm7pt inconsistent (306 images), capping hard CBM accuracy at 92.1%; symmetric filtering produces a 705-image consistent benchmark where EfficientNet-B5 reaches 0.90 label accuracy.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"For each inconsistent concept profile $e_k \\in \\Pi$, the classifier $h$ must assign a single label to all images in $e_k$, so it correctly classifies at most $n_k^{\\mathrm{maj}}$ of them. Summing over all 50 inconsistent concept profiles and using the explicit enumeration of the Derm7pt boundary region, we get the following expression: $|\\{x \\in \\mathrm{BND}_C(d) : h(x) = d(x)\\}| \\le \\sum_{k \\in \\Pi} n_k^{\\mathrm{maj}} = 226$. (8) Adding both regions: $|\\{x \\in U : h(x) = d(x)\\}| \\le 705 + 226 = 931$, and dividing by $|U| = 1011$ gives the stated bound (7). Tightness follows from the majority-vote classifier, which assigns the unique label on each consistent profile and the majority label on each inconsistent profile, achieving exactly 931 correct predictions and attaining the bound. 
Corollary."},{"citing_arxiv_id":"2604.08502","ref_index":3,"ref_count":1,"confidence":0.88,"is_internal_anchor":false,"paper_title":"Quantifying Explanation Consistency: The C-Score Metric for CAM-Based Explainability in Medical Image Classification","primary_cat":"cs.CV","submitted_at":"2026-04-09T17:47:31+00:00","verdict":"UNVERDICTED","verdict_confidence":"UNKNOWN","novelty_score":7.0,"formal_verification":"none","one_line_summary":"The C-Score quantifies intra-class explanation consistency for CAM methods via confidence-weighted pairwise soft IoU and detects AUC-consistency dissociation as an early warning for model instability on chest X-ray classification.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"explanation quality rather than predictive ranking alone. 1 Introduction 1.1 The Clinical Deployment Gap in Medical AI Deep learning models for medical image analysis have achieved, and in some domains exceeded, expert-level discriminative performance [1, 2]. Convolutional neural networks trained on chest X-ray datasets now achieve AUC values exceeding 0.99 for pneumonia detection [3], and similar performance has been reported across ophthalmology, dermatology, and radiology [4]. However, a fundamental tension exists between classification performance and clinical trustworthiness. High AUC certifies that a model correctly ranks pathological cases above normal ones in terms of predicted probability. 
It does not certify what image features the model is using to"},{"citing_arxiv_id":"2602.00821","ref_index":1,"ref_count":1,"confidence":0.88,"is_internal_anchor":false,"paper_title":"Zero-Shot Generative De-identification: Inversion-Free Flow for Privacy-Preserving Skin Image Analysis","primary_cat":"cs.CV","submitted_at":"2026-01-31T17:19:23+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"Zero-shot inversion-free flow method de-identifies skin images in under 20 seconds while preserving pathological features with IoU stability exceeding 0.67 using segment-by-synthesis and CIELAB decoupling.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null}],"limit":50,"offset":0}