Delving Deep into Rectifiers

Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun · 2015 · 2015 IEEE International Conference on Computer Vision (ICCV) · DOI 10.1109/iccv.2015.123

Tool reference. 71% of classified Pith citations use this work as a method, library, or software dependency, not as a substantive claim.

23 Pith papers citing it

12.9k external citations · Crossref

citation-role summary

method 4 · background 1 · dataset 1 · other 1

citation-polarity summary

use method 4 · background 1 · unclear 1 · use dataset 1

authors

Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun

representative citing papers

DELOS: Detecting Shallow Transits in Kepler Photometry Using a Contrastive-Learning Framework

astro-ph.EP · 2026-05-28 · conditional · novelty 7.0

DELOS applies contrastive learning to phase-folded light curves to detect shallow intermediate-to-long period transits, reporting 15.5% and 11.25% gains in combined precision-recall over BLS and TLS in low-SNR tests plus 3-80x speedups.

Learning Dynamic Stability Landscapes in Synchronization Networks

cs.LG · 2026-05-22 · unverdicted · novelty 7.0

Introduces graph-to-image prediction of per-node dynamic stability landscapes in oscillator networks from topology, releases two 10k-graph datasets, and shows GNN-CNN models achieve good accuracy with cross-size generalization.

Human face perception reflects inverse-generative and naturalistic discriminative objectives

q-bio.NC · 2026-05-12 · unverdicted · novelty 7.0

Human face perception aligns with neural networks trained on inverse-generative and naturalistic discriminative tasks, as these best predict human dissimilarity judgments on controversial and random face pairs.

MASCing: Configurable Mixture-of-Experts Behavior via Activation Steering Masks

cs.CR · 2026-04-30 · unverdicted · novelty 7.0

MASCing uses an LSTM surrogate and optimized steering masks to enable flexible, inference-time control over MoE expert routing for safety objectives, improving jailbreak defense and content generation success rates substantially across multiple models.

Multipolar Magnetic-Field Inference for PSR J0740+6620 with Neural-Network-Accelerated NICER Pulse-Profile Modeling

astro-ph.HE · 2026-06-29 · unverdicted · novelty 6.0

Neural-network surrogate accelerated MCMC infers multipolar magnetic field parameters for PSR J0740+6620 from NICER data, finding broad multimodal posteriors and disfavoring a zero-offset model.

A Stochastic-Geometric Theory of Scaling Laws in Grokking

stat.ML · 2026-06-29 · unverdicted · novelty 6.0

A stochastic-geometric model of solution-space topology under Adam derives explicit scaling laws for grokking transition time as a function of learning rate, batch size, and L2 coefficient.

TACK: A Statistical Evaluation of Degradation Activity on a Novel TArgeting Chimeras Knowledge Dataset

q-bio.QM · 2026-05-19 · unverdicted · novelty 6.0 · 2 refs

A new aggregated PROTAC dataset shows potency is more predictable than maximum degradation by ML, with classical methods outperforming a specialized graph neural network.

Multi-agent AI systems outperform human teams in creativity

cs.CL · 2026-05-18 · unverdicted · novelty 6.0

Multi-agent LLM teams outperform human teams in creativity (d=1.50) across tasks by producing more novel ideas, with distinct semantic exploration patterns predicting success for each group.

Learning Large-Scale Modular Addition with an Auxiliary Modulus

cs.LG · 2026-05-08 · unverdicted · novelty 6.0

An auxiliary modulus during training reduces wrap-around issues and preserves train-test input distributions, enabling better accuracy and sample efficiency for large N and q in modular addition learning.

Amortized Variational Inference for Joint Posterior and Predictive Distributions in Bayesian Uncertainty Quantification

stat.ML · 2026-05-05 · unverdicted · novelty 6.0

An amortized variational framework jointly targets the posterior and posterior-predictive distributions via a KL upper bound and moment regularization, yielding more accurate predictions at lower online cost than two-stage variational inference.

LTBs-KAN: Linear-Time B-splines Kolmogorov-Arnold Networks

cs.LG · 2026-04-23 · unverdicted · novelty 6.0

LTBs-KAN delivers linear-time B-spline evaluation in KANs plus parameter reduction via product-of-sums factorization, with competitive results on MNIST, Fashion-MNIST, and CIFAR-10.

TalkLoRA: Communication-Aware Mixture of Low-Rank Adaptation for Large Language Models

cs.LG · 2026-04-07 · unverdicted · novelty 6.0

TalkLoRA equips MoE-LoRA experts with a communication module that smooths routing dynamics and improves performance on language tasks under similar parameter budgets.

A new initialisation to Control Gradients in Sinusoidal Neural network

cs.LG · 2025-12-06 · unverdicted · novelty 6.0

A closed-form initialization for SIREN networks based on pre-activation fixed points and Jacobian variance sequences improves gradient scaling, training dynamics via NTK, and generalization on reconstruction tasks over the original scheme.

Deep Slice Interpolation for Reducing Through-Plane Anisotropy and Noise in Head CT

eess.IV · 2026-06-08 · unverdicted · novelty 5.0

Deep learning system synthesizes intermediate head CT slices to halve through-plane anisotropy while providing implicit denoising, outperforming baselines on structural metrics.

Quadratic integrate-and-fire neurons exhibit less fragmented loss landscapes and outperform leaky integrate-and-fire neurons in spike-based gradient descent

cs.NE · 2026-06-02 · unverdicted · novelty 5.0

QIF neurons outperform LIF neurons in spike-based gradient descent training of spiking neural networks by avoiding discontinuities that fragment the loss landscape.

Rethinking Federated Unlearning via the Lens of Memorization

cs.LG · 2026-05-23 · unverdicted · novelty 5.0

Introduces Grouped Memorization Evaluation and FedMemPrune to remove unique memorized information in federated unlearning while preserving overlapping knowledge.

Enhancing Event Reconstruction in Hyper-Kamiokande with Machine Learning: A ResNet Implementation

hep-ex · 2026-04-15 · conditional · novelty 5.0

ResNet models classify four particle types and regress vertex, direction, and momentum in Hyper-Kamiokande with resolutions matching likelihood methods but at 30,000-50,000x faster inference on GPU.

Gamma-Ray Burst Light Curve Reconstruction: A Comparative Machine and Deep Learning Analysis

astro-ph.HE · 2024-12-28 · unverdicted · novelty 5.0

MLP and Attention U-Net outperform other models in reconstructing GRB light curves on 521 events, cutting plateau parameter uncertainties by 37-41% versus the Willingale baseline while achieving low MSE.

A Resource-Efficient Hybrid CNN-LSTM network for image-based bean leaf disease classification

cs.CV · 2026-04-15 · unverdicted · novelty 4.0

A lightweight hybrid CNN-LSTM network classifies bean leaf diseases at 94.38% accuracy and 1.86 MB size on the ibean dataset, with reported state-of-the-art F1 scores using EfficientNet-B7+LSTM.

TwinLiteNet+: An Enhanced Multi-Task Segmentation Model for Autonomous Driving

cs.CV · 2024-03-25 · unverdicted · novelty 4.0

TwinLiteNet+ is a hybrid-encoder multi-task segmentation model with new UCB, USB, and PCAA modules that reports 92.9% mIoU on drivable area and 34.2% IoU on lane segmentation on BDD100K while using 11x fewer FLOPs than prior models.

The Mathematics of AI Winters: The mathematical Taxonomy of Paradigm Fragility in AI Winter

cs.LG · 2026-06-10 · unverdicted · novelty 3.0

Established mathematical bottlenecks in representation, optimization, complexity, and high-dimensional learning aligned with the central disappointments of early AI research periods.

A Variational Kolosov-Muskhelishvili Network for Elasticity and Fracture

cs.CE · 2026-05-04

A Deep Ritz Method for High-Dimensional Steady States of the Cahn-Hilliard Equation

math.NA · 2026-04-20

citing papers explorer

Showing 20 of 20 citing papers after filters.

DELOS: Detecting Shallow Transits in Kepler Photometry Using a Contrastive-Learning Framework

astro-ph.EP · 2026-05-28 · conditional · none · ref 33

DELOS applies contrastive learning to phase-folded light curves to detect shallow intermediate-to-long period transits, reporting 15.5% and 11.25% gains in combined precision-recall over BLS and TLS in low-SNR tests plus 3-80x speedups.

Learning Dynamic Stability Landscapes in Synchronization Networks

cs.LG · 2026-05-22 · unverdicted · none · ref 270

Introduces graph-to-image prediction of per-node dynamic stability landscapes in oscillator networks from topology, releases two 10k-graph datasets, and shows GNN-CNN models achieve good accuracy with cross-size generalization.

Human face perception reflects inverse-generative and naturalistic discriminative objectives

q-bio.NC · 2026-05-12 · unverdicted · none · ref 76

Human face perception aligns with neural networks trained on inverse-generative and naturalistic discriminative tasks, as these best predict human dissimilarity judgments on controversial and random face pairs.

MASCing: Configurable Mixture-of-Experts Behavior via Activation Steering Masks

cs.CR · 2026-04-30 · unverdicted · none · ref 18

MASCing uses an LSTM surrogate and optimized steering masks to enable flexible, inference-time control over MoE expert routing for safety objectives, improving jailbreak defense and content generation success rates substantially across multiple models.

Multipolar Magnetic-Field Inference for PSR J0740+6620 with Neural-Network-Accelerated NICER Pulse-Profile Modeling

astro-ph.HE · 2026-06-29 · unverdicted · none · ref 15

Neural-network surrogate accelerated MCMC infers multipolar magnetic field parameters for PSR J0740+6620 from NICER data, finding broad multimodal posteriors and disfavoring a zero-offset model.

A Stochastic-Geometric Theory of Scaling Laws in Grokking

stat.ML · 2026-06-29 · unverdicted · none · ref 30

A stochastic-geometric model of solution-space topology under Adam derives explicit scaling laws for grokking transition time as a function of learning rate, batch size, and L2 coefficient.

TACK: A Statistical Evaluation of Degradation Activity on a Novel TArgeting Chimeras Knowledge Dataset

q-bio.QM · 2026-05-19 · unverdicted · none · ref 20 · 2 links

A new aggregated PROTAC dataset shows potency is more predictable than maximum degradation by ML, with classical methods outperforming a specialized graph neural network.

Multi-agent AI systems outperform human teams in creativity

cs.CL · 2026-05-18 · unverdicted · none · ref 5

Multi-agent LLM teams outperform human teams in creativity (d=1.50) across tasks by producing more novel ideas, with distinct semantic exploration patterns predicting success for each group.

Learning Large-Scale Modular Addition with an Auxiliary Modulus

cs.LG · 2026-05-08 · unverdicted · none · ref 12

An auxiliary modulus during training reduces wrap-around issues and preserves train-test input distributions, enabling better accuracy and sample efficiency for large N and q in modular addition learning.

Amortized Variational Inference for Joint Posterior and Predictive Distributions in Bayesian Uncertainty Quantification

stat.ML · 2026-05-05 · unverdicted · none · ref 25

An amortized variational framework jointly targets the posterior and posterior-predictive distributions via a KL upper bound and moment regularization, yielding more accurate predictions at lower online cost than two-stage variational inference.

LTBs-KAN: Linear-Time B-splines Kolmogorov-Arnold Networks

cs.LG · 2026-04-23 · unverdicted · none · ref 7

LTBs-KAN delivers linear-time B-spline evaluation in KANs plus parameter reduction via product-of-sums factorization, with competitive results on MNIST, Fashion-MNIST, and CIFAR-10.

TalkLoRA: Communication-Aware Mixture of Low-Rank Adaptation for Large Language Models

cs.LG · 2026-04-07 · unverdicted · none · ref 9

TalkLoRA equips MoE-LoRA experts with a communication module that smooths routing dynamics and improves performance on language tasks under similar parameter budgets.

Deep Slice Interpolation for Reducing Through-Plane Anisotropy and Noise in Head CT

eess.IV · 2026-06-08 · unverdicted · none · ref 31

Deep learning system synthesizes intermediate head CT slices to halve through-plane anisotropy while providing implicit denoising, outperforming baselines on structural metrics.

Quadratic integrate-and-fire neurons exhibit less fragmented loss landscapes and outperform leaky integrate-and-fire neurons in spike-based gradient descent

cs.NE · 2026-06-02 · unverdicted · none · ref 50

QIF neurons outperform LIF neurons in spike-based gradient descent training of spiking neural networks by avoiding discontinuities that fragment the loss landscape.

Rethinking Federated Unlearning via the Lens of Memorization

cs.LG · 2026-05-23 · unverdicted · none · ref 19

Introduces Grouped Memorization Evaluation and FedMemPrune to remove unique memorized information in federated unlearning while preserving overlapping knowledge.

Enhancing Event Reconstruction in Hyper-Kamiokande with Machine Learning: A ResNet Implementation

hep-ex · 2026-04-15 · conditional · none · ref 26

ResNet models classify four particle types and regress vertex, direction, and momentum in Hyper-Kamiokande with resolutions matching likelihood methods but at 30,000-50,000x faster inference on GPU.

A Resource-Efficient Hybrid CNN-LSTM network for image-based bean leaf disease classification

cs.CV · 2026-04-15 · unverdicted · none · ref 35

A lightweight hybrid CNN-LSTM network classifies bean leaf diseases at 94.38% accuracy and 1.86 MB size on the ibean dataset, with reported state-of-the-art F1 scores using EfficientNet-B7+LSTM.

The Mathematics of AI Winters: The mathematical Taxonomy of Paradigm Fragility in AI Winter

cs.LG · 2026-06-10 · unverdicted · none · ref 14

Established mathematical bottlenecks in representation, optimization, complexity, and high-dimensional learning aligned with the central disappointments of early AI research periods.

A Variational Kolosov-Muskhelishvili Network for Elasticity and Fracture

cs.CE · 2026-05-04 · unreviewed · ref 77

A Deep Ritz Method for High-Dimensional Steady States of the Cahn-Hilliard Equation

math.NA · 2026-04-20 · unreviewed · ref 28