DermAgent orchestrates seven vision-language tools in a Plan-Execute-Reflect loop with dual-modality retrieval from 413k cases and a critic module to outperform GPT-4o by 17.6% in zero-shot dermatological diagnosis accuracy.
and Ko, Justin and Swetter, Susan M
18 Pith papers cite this work.
years: 2026 (18)
roles: background (4)
representative citing papers
Rough-set analysis finds 16.4% of 305 concept profiles in Derm7pt inconsistent (306 images), capping hard CBM accuracy at 92.1%; symmetric filtering produces a 705-image consistent benchmark where EfficientNet-B5 reaches 0.90 label accuracy.
The C-Score quantifies intra-class explanation consistency for CAM methods via confidence-weighted pairwise soft IoU and detects AUC-consistency dissociation as an early warning for model instability on chest X-ray classification.
The α-index is a conserved position-weighted authorship framework with a senior-author penalty that decreases credit as the number of middle authors increases.
Jaguar replaces prime-modulus HE with power-of-two arithmetic to enable coefficient-domain convolution and local-shift truncation, reporting 2-3.7x lower latency than Cheetah and Rhombus on ResNet-18/50 and MobileNetV2.
Introduces synthetic benchmarks for concept bottleneck models that control data modality, concept choice, annotation quality, and completeness to evaluate performance in decision support and automation.
Multi-agent LLM teams outperform human teams in creativity (d=1.50) across tasks by producing more novel ideas, with distinct semantic exploration patterns predicting success for each group.
Pilot study uses pretrained video encoder features from lung ultrasound to predict 30-day CHF readmission, finding lower-lung views and temporal differences most informative with top MLP F1 of 0.80.
ShardTensor is a domain-parallelism system for SciML that enables flexible scaling of extreme-resolution spatial datasets by removing the constraint of batch size one per device.
A scoping review and empirical analysis produce a six-category taxonomy of factors driving AI non-development and abandonment, showing that practical issues like resource limits and organizational dynamics often outweigh ethical concerns in real decisions.
Zero-shot inversion-free flow method de-identifies skin images in under 20 seconds while preserving pathological features with IoU stability exceeding 0.67 using segment-by-synthesis and CIELAB decoupling.
MLFFM-SegDiff adds a multi-level feature fusion module and dual-path encoder to a diffusion U-Net, reporting improved Jaccard (0.8546) and Dice (0.9207) scores over baselines on three skin lesion datasets.
IViT applies quadratic programming to a pre-trained Vision Transformer with a multi-objective loss, achieving 93.80% accuracy on six skin disease datasets (0.21% below baseline) while reducing feature redundancy by 29.5% and producing clinically consistent activations.
Cascade classification improves macro F1 over single-stage for some models by allowing sensitivity control but reveals a large generalization gap on external clinical data.
YOLO segmentation plus EfficientNet classification aggregates cell predictions to patient-level CBLC ratios, reporting weighted F1 scores of 0.87-0.91 on three external center cohorts from 89 patients.
Describes a methodology and the resulting dataset of 1,026 dermoscopic images with structured metadata and verified diagnostic labels for medical informatics research.
Prospective single-center validation of a cascade deep learning dermoscopy CDSS found no false negatives for five malignant lesions and 88.3% specificity, with quantitative IoU assessment of attention maps.
Benchmark of twelve models finds hybrid CNN-transformer architectures and a SigLIP vision-language model deliver the strongest overall performance on skin cancer detection using the PAD-UFES-20 dataset.
citing papers explorer
- Concept Inconsistency in Dermoscopic Concept Bottleneck Models: A Rough-Set Analysis of the Derm7pt Dataset
Rough-set analysis finds 16.4% of 305 concept profiles in Derm7pt inconsistent (306 images), capping hard CBM accuracy at 92.1%; symmetric filtering produces a 705-image consistent benchmark where EfficientNet-B5 reaches 0.90 label accuracy.
- Measuring What Matters: Synthetic Benchmarks for Concept Bottleneck Models
Introduces synthetic benchmarks for concept bottleneck models that control data modality, concept choice, annotation quality, and completeness to evaluate performance in decision support and automation.