Title resolution pending

Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun · 2016

28 Pith papers cite this work. Polarity classification is still indexing.


Title metadata for this work has not finished resolving. The hub is built from the citation graph; the title resolver retries DOI and OpenAlex on its next pass.

citation-role summary

method: 2 · background: 1 · baseline: 1

citation-polarity summary

background: 2 · baseline: 1 · use method: 1

representative citing papers

Systematic Discovery of Semantic Attacks in Online Map Construction through Conditional Diffusion

cs.CV · 2026-05-14 · unverdicted · novelty 8.0

MIRAGE discovers semantic attacks on online HD map construction via conditional diffusion, enabling boundary removal and injection that degrade AV performance while passing as realistic environmental changes.

Towards Generalized Image Manipulation Localization via Score-based Model

cs.CV · 2026-05-16 · conditional · novelty 7.0

DiffIML applies score-based generative modeling to image manipulation localization, recovering coherent masks iteratively from noise to improve generalization on unseen manipulation types.

SoK: Unlearnability and Unlearning for Model Dememorization

cs.LG · 2026-05-12 · conditional · novelty 7.0

The SoK presents the first integrated taxonomy, an empirical study of interplay and shallow dememorization, and a theoretical guarantee on dememorization depth for certified unlearning.

TENNOR: Trustworthy Execution for Neural Networks through Obliviousness and Retrievals

cs.CR · 2026-05-08 · unverdicted · novelty 7.0

TENNOR enables efficient private training of wide neural networks in TEEs by recasting sparsification as doubly oblivious LSH retrievals and introducing MP-WTA to cut hash table memory by 50x while preserving accuracy.

PACO: Proxy-Task Alignment and Online Calibration for On-the-Fly Category Discovery

cs.CV · 2026-04-13 · unverdicted · novelty 7.0

PACO provides a hierarchical online decision system with proxy-simulated initial thresholds and adaptive updates from mature prototypes to enable consistent category discovery in streaming sequences.

Towards Green Wearable Computing: A Physics-Aware Spiking Neural Network for Energy-Efficient IMU-based Human Activity Recognition

cs.LG · 2026-04-12 · unverdicted · novelty 7.0

PAS-Net is a fully multiplier-free spiking neural network that enforces human joint constraints spatially and uses causal neuromodulation temporally to achieve state-of-the-art accuracy on IMU HAR with up to 98% lower dynamic energy via early-exit.

OVS-DINO: Open-Vocabulary Segmentation via Structure-Aligned SAM-DINO with Language Guidance

cs.CV · 2026-04-09 · unverdicted · novelty 7.0

OVS-DINO structurally aligns DINO with SAM to revitalize attenuated boundary features, achieving SOTA gains of 2.1% average and 6.3% on Cityscapes in weakly-supervised open-vocabulary segmentation.

Lost in the Hype: Revealing and Dissecting the Performance Degradation of Medical Multimodal Large Language Models in Image Classification

cs.CV · 2026-04-09 · unverdicted · novelty 7.0

Medical MLLMs degrade on image classification due to four failure modes in visual representation quality, connector projection fidelity, LLM comprehension, and semantic mapping alignment, quantified by feature probing on 14 models across 3 datasets.

DynLP: Parallel Dynamic Batch Update for Label Propagation in Semi-Supervised Learning

cs.DC · 2026-04-08 · unverdicted · novelty 7.0

DynLP is a parallel dynamic batch update algorithm for label propagation that achieves significant speedups by updating only relevant parts of the graph on GPUs.

Satellite-Free Training for Drone-View Geo-Localization

cs.CV · 2026-04-02 · conditional · novelty 7.0

A satellite-free training framework reconstructs 3D drone scenes via Gaussian splatting, generates geometry-normalized pseudo-orthophotos, and aggregates DINOv3 features with a Fisher vector model trained only on drone data to enable cross-view retrieval.

Mixture of Predefined Experts: Maximizing Data Usage on Vertical Federated Learning

cs.LG · 2026-02-13 · unverdicted · novelty 7.0

Split-MoPE integrates split learning with predefined-expert routing to maximize usable data in vertical federated learning under sample misalignment, delivering state-of-the-art accuracy in one communication round plus built-in robustness and per-sample contribution scores.

OCCAM: Open-set Causal Concept explAnation and Ontology induction for black-box vision Models

cs.AI · 2026-05-18 · unverdicted · novelty 6.0

OCCAM discovers open-set visual concepts, estimates causal contributions via object-level interventions on black-box vision models, and induces a global concept ontology from aggregated dataset evidence.

LBFTI: Layer-Based Facial Template Inversion for Identity-Preserving Fine-Grained Face Reconstruction

cs.CV · 2026-04-20 · unverdicted · novelty 6.0

LBFTI decomposes faces into three layers with dedicated generators and a three-stage training process to invert templates into fine-grained, identity-preserving images, claiming 25.3% better TAR than prior methods.

AnchorRefine: Synergy-Manipulation Based on Trajectory Anchor and Residual Refinement for Vision-Language-Action Models

cs.RO · 2026-04-20 · unverdicted · novelty 6.0

AnchorRefine factorizes VLA action generation into a trajectory anchor for coarse planning and residual refinement for local corrections, improving success rates by up to 7.8% in simulation and 18% on real robots across LIBERO, CALVIN, and physical tasks.

Neural Adjoint Method for Meta-optics: Accelerating Volumetric Inverse Design via Fourier Neural Operators

cs.LG · 2026-04-19 · unverdicted · novelty 6.0

A stage-wise Fourier Neural Operator surrogate predicts per-voxel adjoint gradients to accelerate 3D meta-optics inverse design, replacing expensive FDTD solves with fast inference.

Cross-Modal Generation: From Commodity WiFi to High-Fidelity mmWave and RFID Sensing

cs.LG · 2026-04-17 · unverdicted · novelty 6.0

RF-CMG synthesizes high-quality mmWave and RFID signals from WiFi using a diffusion model with Modality-Guided Embedding for high-frequency details and Low-Frequency Modality Consistency to preserve physical structure.

Rethinking Video Human-Object Interaction: Set Prediction over Time for Unified Detection and Anticipation

cs.CV · 2026-04-12 · unverdicted · novelty 6.0

A pair-centric set-prediction model unifies present HOI detection and multi-horizon anticipation in video by modeling future interactions as residual transitions from current pair states, backed by a temporally corrected benchmark.

User-Aware Conditional Generative Total Correlation Learning for Multi-Modal Recommendation

cs.IR · 2026-04-03 · unverdicted · novelty 6.0

GTC improves multi-modal recommendation by using user-conditional diffusion-based feature filtering and total correlation optimization, achieving up to 28.3% gains in NDCG@5 on benchmarks.

TIQA: Human-Aligned Perceptual Text Quality Assessment in Generated Images

cs.CV · 2026-03-07 · unverdicted · novelty 6.0

TIQA introduces datasets and a model that predict human perceptual quality of rendered text in AI images, achieving PLCC 0.942 on crops and improving selected image text quality by 0.36 MOS.

AEG: A Baremetal Framework for AI Acceleration via Direct Hardware Access in Heterogeneous Accelerators

cs.DC · 2026-02-15 · unverdicted · novelty 6.0

AEG baremetal framework achieves 9.2x higher compute efficiency, 3-7x less data movement, and near-zero latency variance for ResNet-18 on 28 AIE tiles versus Linux Vitis AI on 304 tiles while maintaining 68.78% ImageNet accuracy.

VeriOS: Query-Driven Proactive Human-Agent-GUI Interaction for Trustworthy OS Agents

cs.CL · 2025-09-09 · unverdicted · novelty 6.0

VeriOS-Agent is an OS agent that proactively queries humans in untrustworthy scenarios via a query-driven framework and three-stage training, achieving 19.72% higher step-wise success rate over baselines while preserving normal performance.

GenHAR: Generalizing Cross-domain Human Activity Recognition for Last-mile Delivery

cs.CV · 2026-05-21 · unverdicted · novelty 5.0

GenHAR improves cross-domain human activity recognition accuracy by 9.97% with 6.4x fewer FLOPs via tokenized sensor data, frequency channel correlations, selective masking, and efficient attention, with deployment detecting 2.15 billion activities.

SAIL: Structure-Aware Interpretable Learning for Anatomy-Aligned Post-hoc Explanations in OCT

cs.CV · 2026-05-04 · unverdicted · novelty 5.0

SAIL integrates anatomical priors at the representation level with semantic features via fusion to produce more anatomically aligned attribution maps in OCT without altering existing explainability techniques.

RACANet: Reliability-Aware Crowd Anchor Network for RGB-T Crowd Counting

cs.CV · 2026-04-27 · unverdicted · novelty 5.0

RACANet proposes a reliability-aware two-stage fusion network with cross-modal pretraining and local anchor modules that outperforms prior RGB-T crowd counting methods on standard benchmarks.

citing papers explorer

Showing 28 of 28 citing papers.

Systematic Discovery of Semantic Attacks in Online Map Construction through Conditional Diffusion cs.CV · 2026-05-14 · unverdicted · none · ref 20
MIRAGE discovers semantic attacks on online HD map construction via conditional diffusion, enabling boundary removal and injection that degrade AV performance while passing as realistic environmental changes.
Towards Generalized Image Manipulation Localization via Score-based Model cs.CV · 2026-05-16 · conditional · none · ref 10
DiffIML applies score-based generative modeling to image manipulation localization, recovering coherent masks iteratively from noise to improve generalization on unseen manipulation types.
SoK: Unlearnability and Unlearning for Model Dememorization cs.LG · 2026-05-12 · conditional · none · ref 72
The SoK presents the first integrated taxonomy, an empirical study of interplay and shallow dememorization, and a theoretical guarantee on dememorization depth for certified unlearning.
TENNOR: Trustworthy Execution for Neural Networks through Obliviousness and Retrievals cs.CR · 2026-05-08 · unverdicted · none · ref 48
TENNOR enables efficient private training of wide neural networks in TEEs by recasting sparsification as doubly oblivious LSH retrievals and introducing MP-WTA to cut hash table memory by 50x while preserving accuracy.
PACO: Proxy-Task Alignment and Online Calibration for On-the-Fly Category Discovery cs.CV · 2026-04-13 · unverdicted · none · ref 18
PACO provides a hierarchical online decision system with proxy-simulated initial thresholds and adaptive updates from mature prototypes to enable consistent category discovery in streaming sequences.
Towards Green Wearable Computing: A Physics-Aware Spiking Neural Network for Energy-Efficient IMU-based Human Activity Recognition cs.LG · 2026-04-12 · unverdicted · none · ref 13
PAS-Net is a fully multiplier-free spiking neural network that enforces human joint constraints spatially and uses causal neuromodulation temporally to achieve state-of-the-art accuracy on IMU HAR with up to 98% lower dynamic energy via early-exit.
OVS-DINO: Open-Vocabulary Segmentation via Structure-Aligned SAM-DINO with Language Guidance cs.CV · 2026-04-09 · unverdicted · none · ref 18
OVS-DINO structurally aligns DINO with SAM to revitalize attenuated boundary features, achieving SOTA gains of 2.1% average and 6.3% on Cityscapes in weakly-supervised open-vocabulary segmentation.
Lost in the Hype: Revealing and Dissecting the Performance Degradation of Medical Multimodal Large Language Models in Image Classification cs.CV · 2026-04-09 · unverdicted · none · ref 13
Medical MLLMs degrade on image classification due to four failure modes in visual representation quality, connector projection fidelity, LLM comprehension, and semantic mapping alignment, quantified by feature probing on 14 models across 3 datasets.
DynLP: Parallel Dynamic Batch Update for Label Propagation in Semi-Supervised Learning cs.DC · 2026-04-08 · unverdicted · none · ref 14
DynLP is a parallel dynamic batch update algorithm for label propagation that achieves significant speedups by updating only relevant parts of the graph on GPUs.
Satellite-Free Training for Drone-View Geo-Localization cs.CV · 2026-04-02 · conditional · none · ref 10
A satellite-free training framework reconstructs 3D drone scenes via Gaussian splatting, generates geometry-normalized pseudo-orthophotos, and aggregates DINOv3 features with a Fisher vector model trained only on drone data to enable cross-view retrieval.
Mixture of Predefined Experts: Maximizing Data Usage on Vertical Federated Learning cs.LG · 2026-02-13 · unverdicted · none · ref 11
Split-MoPE integrates split learning with predefined-expert routing to maximize usable data in vertical federated learning under sample misalignment, delivering state-of-the-art accuracy in one communication round plus built-in robustness and per-sample contribution scores.
OCCAM: Open-set Causal Concept explAnation and Ontology induction for black-box vision Models cs.AI · 2026-05-18 · unverdicted · none · ref 21
OCCAM discovers open-set visual concepts, estimates causal contributions via object-level interventions on black-box vision models, and induces a global concept ontology from aggregated dataset evidence.
LBFTI: Layer-Based Facial Template Inversion for Identity-Preserving Fine-Grained Face Reconstruction cs.CV · 2026-04-20 · unverdicted · none · ref 13
LBFTI decomposes faces into three layers with dedicated generators and a three-stage training process to invert templates into fine-grained, identity-preserving images, claiming 25.3% better TAR than prior methods.
AnchorRefine: Synergy-Manipulation Based on Trajectory Anchor and Residual Refinement for Vision-Language-Action Models cs.RO · 2026-04-20 · unverdicted · none · ref 21
AnchorRefine factorizes VLA action generation into a trajectory anchor for coarse planning and residual refinement for local corrections, improving success rates by up to 7.8% in simulation and 18% on real robots across LIBERO, CALVIN, and physical tasks.
Neural Adjoint Method for Meta-optics: Accelerating Volumetric Inverse Design via Fourier Neural Operators cs.LG · 2026-04-19 · unverdicted · none · ref 11
A stage-wise Fourier Neural Operator surrogate predicts per-voxel adjoint gradients to accelerate 3D meta-optics inverse design, replacing expensive FDTD solves with fast inference.
Cross-Modal Generation: From Commodity WiFi to High-Fidelity mmWave and RFID Sensing cs.LG · 2026-04-17 · unverdicted · none · ref 14
RF-CMG synthesizes high-quality mmWave and RFID signals from WiFi using a diffusion model with Modality-Guided Embedding for high-frequency details and Low-Frequency Modality Consistency to preserve physical structure.
Rethinking Video Human-Object Interaction: Set Prediction over Time for Unified Detection and Anticipation cs.CV · 2026-04-12 · unverdicted · none · ref 11
A pair-centric set-prediction model unifies present HOI detection and multi-horizon anticipation in video by modeling future interactions as residual transitions from current pair states, backed by a temporally corrected benchmark.
User-Aware Conditional Generative Total Correlation Learning for Multi-Modal Recommendation cs.IR · 2026-04-03 · unverdicted · none · ref 8
GTC improves multi-modal recommendation by using user-conditional diffusion-based feature filtering and total correlation optimization, achieving up to 28.3% gains in NDCG@5 on benchmarks.
TIQA: Human-Aligned Perceptual Text Quality Assessment in Generated Images cs.CV · 2026-03-07 · unverdicted · none · ref 25
TIQA introduces datasets and a model that predict human perceptual quality of rendered text in AI images, achieving PLCC 0.942 on crops and improving selected image text quality by 0.36 MOS.
AEG: A Baremetal Framework for AI Acceleration via Direct Hardware Access in Heterogeneous Accelerators cs.DC · 2026-02-15 · unverdicted · none · ref 18
AEG baremetal framework achieves 9.2x higher compute efficiency, 3-7x less data movement, and near-zero latency variance for ResNet-18 on 28 AIE tiles versus Linux Vitis AI on 304 tiles while maintaining 68.78% ImageNet accuracy.
VeriOS: Query-Driven Proactive Human-Agent-GUI Interaction for Trustworthy OS Agents cs.CL · 2025-09-09 · unverdicted · none · ref 17
VeriOS-Agent is an OS agent that proactively queries humans in untrustworthy scenarios via a query-driven framework and three-stage training, achieving 19.72% higher step-wise success rate over baselines while preserving normal performance.
GenHAR: Generalizing Cross-domain Human Activity Recognition for Last-mile Delivery cs.CV · 2026-05-21 · unverdicted · none · ref 21
GenHAR improves cross-domain human activity recognition accuracy by 9.97% with 6.4x fewer FLOPs via tokenized sensor data, frequency channel correlations, selective masking, and efficient attention, with deployment detecting 2.15 billion activities.
SAIL: Structure-Aware Interpretable Learning for Anatomy-Aligned Post-hoc Explanations in OCT cs.CV · 2026-05-04 · unverdicted · none · ref 34
SAIL integrates anatomical priors at the representation level with semantic features via fusion to produce more anatomically aligned attribution maps in OCT without altering existing explainability techniques.
RACANet: Reliability-Aware Crowd Anchor Network for RGB-T Crowd Counting cs.CV · 2026-04-27 · unverdicted · none · ref 7
RACANet proposes a reliability-aware two-stage fusion network with cross-modal pretraining and local anchor modules that outperforms prior RGB-T crowd counting methods on standard benchmarks.
Observe Less, Understand More: Cost-aware Cross-scale Observation for Remote Sensing Understanding cs.CV · 2026-04-13 · unverdicted · none · ref 13
A unified cost-aware formulation couples fine-grained high-resolution sampling decisions with cross-patch representation prediction to achieve superior performance-cost trade-offs on remote sensing recognition and retrieval tasks using a new 10M-image benchmark.
MATCHA: Efficient Deployment of Deep Neural Networks on Multi-Accelerator Heterogeneous Edge SoCs cs.DC · 2026-04-10 · unverdicted · none · ref 15
MATCHA optimizes DNN deployment on heterogeneous multi-accelerator edge SoCs via constraint programming for memory and scheduling plus pattern matching for parallel execution, cutting latency up to 35% versus the MATCH compiler on MLPerf Tiny.
WRF4CIR: Weight-Regularized Fine-Tuning Network for Composed Image Retrieval cs.CV · 2026-04-07 · unverdicted · none · ref 22
WRF4CIR uses weight-regularized fine-tuning with adversarial perturbations to mitigate overfitting in composed image retrieval and narrows the generalization gap on benchmarks.
SatReg: Regression-based Neural Architecture Search for Lightweight Satellite Image Segmentation cs.CV · 2026-04-11 · unverdicted · none · ref 15
SatReg uses regression surrogates on two width variables from CM-UNet students to select near-optimal lightweight segmentation architectures for edge satellite deployment without exhaustive search.
