ImageNet classification with deep convolutional neural networks
19 Pith papers cite this work, alongside 31,060 external citations. Polarity classification is still indexing.
hub tools
claims ledger
- dataset 0388-18.2018. URL https://www.jneurosci.org/lookup/doi/10.1523/JNEUROSCI.0388-18.2018. [15] Alex Krizhevsky, Ilya Sutskever, and Geoffrey E. Hinton. ImageNet classification with deep convolutional neural networks. Communications of the ACM, 60(6):84-90, May 2017. ISSN 0001-0782, 1557-7317. doi:10.1145/3065386. URL https://dl.acm.org/doi/10.1145/3065386. [16] Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, and Li Fei-Fei. ImageNet: A large-scale hierarchical image database. In 2009 IEEE Co
- background Science Studies (2024) 1-21. doi:10.1162/qss_a_00294. [28] S. Hochreiter, J. Schmidhuber, Long Short-Term Memory, Neural Computation 9 (1997) 1735-1780. doi:10.1162/neco.1997.9.8.1735. [29] A. Krizhevsky, I. Sutskever, G. E. Hinton, ImageNet Classification with Deep Convolutional Neural Networks, Communications of the ACM 60 (2017) 84-90. doi:10.1145/3065386. [30] Y. LeCun, et al., Gradient-based Learning Applied to Document Recognition, Proceedings of the IEEE 86 (1998) 2278-2324. doi:10.1109/5
- background terization (IISWC). 191-202. doi:10.1109/IISWC.2018.8573483 [61] Bingyao Li, Jieming Yin, Anup Holey, Youtao Zhang, Jun Yang, and Xulong Tang. 2023. Trans-FW: Short Circuiting Page Table Walk in Multi-GPU Systems via Remote Forwarding. In 2023 IEEE International Symposium on High-Performance Computer Architecture (HPCA). 456-470. doi:10.1109/HPCA56546.2023.10071054 [62] Shen Li, Yanli Zhao, Rohan Varma, Omkar Salpekar, Pieter Noordhuis, Teng Li, Adam Paszke, Jeff Smith, Brian Vaughan, Pritam Dama
- background (2025). Llms as meta-reviewers' assistants: A case study. Proc. Conf. North Am. Chapter Assoc. Comput. Linguist. Hum. Lang. Technol. 2025:7763-7803. DOI:10.18653/v1/2025.naacl-long.395 [53] Krizhevsky A., Sutskever I. and Hinton G.E. (2017). ImageNet classification with deep convolutional neural networks. Commun. ACM 60:84-90. DOI:10.1145/3065386 [54] Hinton G., Deng L., Yu D., et al. (2012). Deep neural networks for acoustic modeling in speech recognition: The shared views of four research
citing papers explorer
- Analyzing Reverse Address Translation Overheads in Multi-GPU Scale-Up Pods
  Simulation study shows cold TLB misses in reverse address translation dominate latency for small collectives in multi-GPU pods, causing up to 1.4x degradation, while larger ones see diminishing returns.
- Reachability In Simple Neural Networks
  Reachability for neural networks is NP-hard for single-hidden-layer networks with output dimension 1 and weights restricted to {-1,0,1}.
- SMR: Scheduler with Multi-Channel Map-Encoded Reinforcement Learning for Radio Telescopes
  SMR uses multi-channel map-encoded reinforcement learning to achieve roughly 10% better time utilization than greedy baselines for single-dish radio telescope scheduling.
- Expanding SPHERE-JEPA: A Family of Statistical Regularizers for the Hypersphere
  Derives deterministic MMD, KSD, and KL objectives with rotationally invariant kernels on the hypersphere, yielding more stable SSL training and dataset-dependent geometry in learned representations.
- Deep Psychovisual Image Representations
  Proposes a psychovisual-inspired deep learning method that encodes images in learned frequency sub-bands for interpretable semantic structures and reduced depth dependence.
- On the Equivariant Learning of the $Q$-tensor Order Parameter
  Equivariant neural networks for 2D Q-tensor prediction in nematic liquid crystals achieve lower errors and better generalization than non-equivariant models while satisfying symmetry constraints.
- When AI Meets Science: Research Diversity, Interdisciplinarity, Visibility, and Retractions across Disciplines in a Global Surge
  AI use in science has grown exponentially since 2015 but stays confined to computer science and statistics topics, shows higher retraction rates and citations, and follows distinct global adoption patterns.
- When AI reviews science: Can we trust the referee?
  AI peer review systems are vulnerable to prompt injections, prestige biases, assertion strength effects, and contextual poisoning, as demonstrated by a new attack taxonomy and causal experiments on real conference submissions.
- AIBuildAI: An AI Agent for Automatically Building AI Models
  AIBuildAI uses a manager agent and three LLM sub-agents to fully automate AI model development and achieves a 63.1% medal rate on MLE-Bench, matching experienced human engineers.
- Zero-shot World Models Are Developmentally Efficient Learners
  A zero-shot visual world model trained on one child's experience achieves broad competence on physical understanding benchmarks while matching developmental behavioral patterns.
- The Montparnasse Algorithm for RNA Design
  Montparnasse Monte Carlo framework solves all Eterna100 V1 puzzles faster than DesiRNA and finds mRNA sequences with more paired bases than LinearDesign's MFE solution.
- Parameter-Efficient Architectural Modifications for Translation-Invariant CNNs
  Strategic insertion of Global Average Pooling layers in VGG-16 reduces trainable parameters by 98%, maintains 66.4% ImageNet Top-1 accuracy, doubles translation robustness, and yields superior Spearman correlations in perceptual IQA tasks.
- General Inverse Design of Thin-Film Metamaterials With Convolutional Neural Networks
  Convolutional neural networks are shown to perform inverse design of thin-film metamaterial stacks by learning the mapping from structure to ellipsometric and reflectance/transmittance spectra, with efficiency gains over traditional optimization as layer count increases.
- Beyond Feedforward Networks: Reentry Neural Systems as the Fundamental Basis of Subjecthood and Intrinsic Safety of Next-Generation AGI
  A cycle-based reentry architecture is proposed to guarantee self-model emergence, self-preservation, and prompt-injection immunity in AGI via a D-I loop and a new S-measure of integrated information.
- Biological Sex Determination in Cadavers Using Deep Learning Algorithms from Computed Tomography Images of Pelvis and Skull
  Deep learning models on standardized 2D CT projections of pelvis and skull from 141 cadavers reach 95.65% patient-level accuracy for biological sex determination.
- The Neuromorphic Supremacy
  Hybrid neuromorphic-ANN models outperform standard deep learning on few-shot benchmarks and under occlusion/impulse noise via astrocytic modulation and spiking dynamics.
- Can Open-Source LLM Agents Replace Static Application Security Testing Tools? An Empirical Assessment
  Empirical tests show open-source LLM agents underperform the Bandit SAST tool and are not ready to replace it for security scanning.
- Transfer learning-based method for automated ewaste recycling in smart cities
  Transfer learning with fine-tuned AlexNet achieves 98% accuracy classifying smartphone e-waste into 12 classes on a small dataset via hyperparameter tuning and augmentation.
- Venus-DeFakerOne: Unified Fake Image Detection & Localization