citation dossier

E., Srivastava, N., Krizhevsky, A., Sutskever, I., and Salakhutdinov, R

stub title-quality gate blocked hub promotion · 11 Pith inbound

G · 2012 · arXiv 1207.0580

11 Pith papers citing it

11 reference links

cs.LG top field · 4 papers

ACCEPT top verdict bucket · 5 papers

This arXiv-backed work is queued for full Pith review when it crosses the high-inbound sweep. That review runs reader · skeptic · desk-editor · referee · rebuttal · circularity · lean confirmation · RS check · pith extraction.

why this work matters in Pith

Pith has found this work in 11 reviewed papers. Its strongest current cluster is cs.LG (4 papers). The largest review-status bucket among citing papers is ACCEPT (5 papers). For highly cited works, this page shows a dossier first and a bounded explorer second; it never tries to render every citing paper at once.

representative citing papers

Generative Adversarial Networks

stat.ML · 2014-06-10 · accept · novelty 9.0

A generative model is trained to match a data distribution by competing in a minimax game against a discriminator, reaching an equilibrium where the generator recovers the true distribution and the discriminator outputs 1/2 everywhere.

Deep Residual Learning for Image Recognition

cs.CV · 2015-12-10 · accept · novelty 8.0

Residual networks reformulate layers to learn residual functions, enabling effective training of up to 152-layer models that achieve 3.57% error on ImageNet and win ILSVRC 2015.

Conditional Generative Adversarial Nets

cs.LG · 2014-11-06 · accept · novelty 8.0

Conditional GANs generate samples matching a given condition by supplying the condition to both generator and discriminator.

Adam: A Method for Stochastic Optimization

cs.LG · 2014-12-22 · accept · novelty 7.5

A first-order stochastic optimizer that maintains bias-corrected exponential moving averages of the gradient and its square, dividing the former by the square root of the latter to set per-parameter step sizes.
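The update summarized above can be sketched in a few lines. This is a minimal illustration, not the paper's reference implementation: the function name `adam_step` is hypothetical, parameters are plain Python lists of floats, and the default values for `lr`, `beta1`, `beta2`, and `eps` are assumed here rather than taken from this page.

```python
import math


def adam_step(params, grads, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """Apply one Adam-style update; t is the 1-based step count.

    Returns new (params, m, v) lists and leaves the inputs unmodified.
    """
    new_params, new_m, new_v = [], [], []
    for p, g, m_i, v_i in zip(params, grads, m, v):
        # Exponential moving averages of the gradient and of its square.
        m_i = beta1 * m_i + (1.0 - beta1) * g
        v_i = beta2 * v_i + (1.0 - beta2) * g * g
        # Bias correction compensates for the zero initialization of m and v.
        m_hat = m_i / (1.0 - beta1 ** t)
        v_hat = v_i / (1.0 - beta2 ** t)
        # Per-parameter step: corrected mean over root of corrected second moment.
        new_params.append(p - lr * m_hat / (math.sqrt(v_hat) + eps))
        new_m.append(m_i)
        new_v.append(v_i)
    return new_params, new_m, new_v


if __name__ == "__main__":
    # Toy usage: minimize f(x) = (x - 3)^2, whose gradient is 2 * (x - 3).
    params, m, v = [0.0], [0.0], [0.0]
    for t in range(1, 501):
        grads = [2.0 * (params[0] - 3.0)]
        params, m, v = adam_step(params, grads, m, v, t, lr=0.05)
    print(params[0])
```

On the very first step the bias-corrected ratio is close to the sign of the gradient, so the parameter moves by roughly `lr` regardless of the gradient's scale.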

Simultaneous measurements of $N$-subjettiness observables in jets from gluons and light-flavour quarks, and in decays of boosted W bosons and top quarks

hep-ex · 2026-04-28 · unverdicted · novelty 7.0

CMS reports a simultaneous measurement of 25 N-subjettiness observables in 1-, 2-, and 3-prong jets, unfolded to stable particles with particle-level correlations for QCD modeling.

Improved Regularization of Convolutional Neural Networks with Cutout

cs.CV · 2017-08-15 · accept · novelty 7.0

Randomly masking square regions of input images during CNN training yields new state-of-the-art test errors of 2.56% on CIFAR-10, 15.20% on CIFAR-100, and 1.30% on SVHN.
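The masking step described above can be illustrated as follows. This is a sketch under stated assumptions, not the authors' code: the function name `cutout` is hypothetical, the image is a 2D list of floats, the square's center is drawn uniformly over the image, the square is clipped at the borders, and masked pixels are set to a `fill` value of 0.0. None of these details are given on this page.

```python
import random


def cutout(image, size, rng=None, fill=0.0):
    """Return a copy of a 2D image with one randomly placed square masked."""
    rng = rng or random
    height, width = len(image), len(image[0])
    # Assumed sampling scheme: pick the square's center anywhere in the image.
    cy = rng.randrange(height)
    cx = rng.randrange(width)
    # Clip the square to the image, so it may be partly cut off at a border.
    top = max(0, cy - size // 2)
    bottom = min(height, cy - size // 2 + size)
    left = max(0, cx - size // 2)
    right = min(width, cx - size // 2 + size)
    out = [row[:] for row in image]  # copy so the input stays untouched
    for y in range(top, bottom):
        for x in range(left, right):
            out[y][x] = fill
    return out


if __name__ == "__main__":
    img = [[1.0] * 8 for _ in range(8)]
    for row in cutout(img, 4, rng=random.Random(0)):
        print("".join("#" if val else "." for val in row))
```

Passing a seeded `random.Random` instance makes the mask position reproducible, which is convenient when debugging an augmentation pipeline.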

Estimating or Propagating Gradients Through Stochastic Neurons for Conditional Computation

cs.LG · 2013-08-15 · conditional · novelty 7.0

The paper introduces and compares gradient estimators for stochastic binary neurons, notably a decomposition approach and the straight-through estimator, to support sparse conditional computation in deep networks.

Explicit Dropout: Deterministic Regularization for Transformer Architectures

cs.LG · 2026-04-22 · unverdicted · novelty 6.0

Explicit dropout reformulates stochastic dropout as deterministic loss penalties for Transformers, matching or exceeding standard performance with independent control per component.

Language models recognize dropout and Gaussian noise applied to their activations

cs.AI · 2026-04-19 · unverdicted · novelty 6.0

Language models detect, localize, and distinguish dropout from Gaussian noise applied to their activations, often with high accuracy.

Robust Deepfake Detection: Mitigating Spatial Attention Drift via Calibrated Complementary Ensembles

cs.CV · 2026-04-28 · unverdicted · novelty 4.0

A multi-stream ensemble using DINOv2 and CLIP backbones trained with extreme degradations achieves stable deepfake detection and fourth place in the NTIRE 2026 challenge.

Quantum memory and scrambling from the perspective of a classical neural network

quant-ph · 2026-04-28 · unverdicted · novelty 4.0

Time-dependent quantum memory oscillates faster than OTOC, does not equilibrate, and is more sensitive to symmetry breaking, as shown by neural-network predictions on helical spin chains.

citing papers explorer

Showing 11 of 11 citing papers.

Generative Adversarial Networks · stat.ML · 2014-06-10 · accept · none · ref 17
Deep Residual Learning for Image Recognition · cs.CV · 2015-12-10 · accept · none · ref 14
Conditional Generative Adversarial Nets · cs.LG · 2014-11-06 · accept · none · ref 9
Adam: A Method for Stochastic Optimization · cs.LG · 2014-12-22 · accept · none · ref 8
Simultaneous measurements of $N$-subjettiness observables in jets from gluons and light-flavour quarks, and in decays of boosted W bosons and top quarks · hep-ex · 2026-04-28 · unverdicted · none · ref 100
Improved Regularization of Convolutional Neural Networks with Cutout · cs.CV · 2017-08-15 · accept · none · ref 6
Estimating or Propagating Gradients Through Stochastic Neurons for Conditional Computation · cs.LG · 2013-08-15 · conditional · none · ref 10
Explicit Dropout: Deterministic Regularization for Transformer Architectures · cs.LG · 2026-04-22 · unverdicted · none · ref 15
Language models recognize dropout and Gaussian noise applied to their activations · cs.AI · 2026-04-19 · unverdicted · none · ref 6
Robust Deepfake Detection: Mitigating Spatial Attention Drift via Calibrated Complementary Ensembles · cs.CV · 2026-04-28 · unverdicted · none · ref 14
Quantum memory and scrambling from the perspective of a classical neural network · quant-ph · 2026-04-28 · unverdicted · none · ref 103