9 Pith papers cite this work. Polarity classification is still indexing.
Citing papers explorer
- TCP-SSM: Efficient Vision State Space Models with Token-Conditioned Poles
  TCP-SSM conditions stable poles on visual tokens to explicitly control memory decay and oscillation in SSMs, cutting computation by up to 44% while matching or exceeding accuracy on classification, segmentation, and detection.
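The summary above does not spell out the parameterization, so the following is only a minimal scalar sketch of the core idea, under the assumption that each token produces a pole squashed into (0, 1): a real pole near 0 forces fast memory decay, a pole near 1 preserves long-range context (modeling oscillation would additionally require complex-valued poles, omitted here). The function name `tcp_ssm_scan` and the gating form are hypothetical, not taken from the paper.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def tcp_ssm_scan(x, w_pole, b):
    """Scan a 1-D diagonal SSM whose pole is conditioned on each token.

    x      : (T,) input token sequence (one scalar channel for clarity)
    w_pole : scalar weight producing the per-token pole magnitude
    b      : scalar input projection
    The pole a_t = sigmoid(w_pole * x_t) lies in (0, 1), so the
    recurrence h_t = a_t * h_{t-1} + b * x_t is stable by construction:
    tokens driving a_t toward 0 erase memory quickly, while a_t near 1
    keeps long-range context alive.
    """
    h = 0.0
    hs = []
    for x_t in x:
        a_t = sigmoid(w_pole * x_t)  # token-conditioned pole in (0, 1)
        h = a_t * h + b * x_t
        hs.append(h)
    return np.array(hs)
```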
- LAION-5B: An open large-scale dataset for training next generation image-text models
  LAION-5B is an openly released dataset of 5.85 billion CLIP-filtered image-text pairs that enables replication of foundational vision-language models.
- Elastic Attention Cores for Scalable Vision Transformers
  VECA learns effective visual representations using core-periphery attention, where patches interact exclusively via a resolution-invariant set of learned core embeddings, achieving O(N) complexity while maintaining competitive performance.
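To make the linear-complexity claim concrete, here is a hedged sketch of two-stage attention through a small set of core embeddings, assuming a standard scaled-dot-product form (the function name and the exact routing are illustrative, not VECA's actual architecture): with M cores fixed, the cost is O(N·M·d) rather than the O(N²·d) of full patch-to-patch self-attention.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def core_periphery_attention(patches, cores):
    """Patches interact only through a small set of core embeddings.

    patches : (N, d) patch tokens
    cores   : (M, d) learned core embeddings, with M fixed and M << N
    Stage 1: cores gather information from all patches (M x N scores).
    Stage 2: patches read back from the updated cores (N x M scores).
    """
    d = patches.shape[1]
    # Stage 1: cores attend to patches.
    attn_cp = softmax(cores @ patches.T / np.sqrt(d), axis=-1)       # (M, N)
    cores_upd = attn_cp @ patches                                     # (M, d)
    # Stage 2: patches attend to the updated cores.
    attn_pc = softmax(patches @ cores_upd.T / np.sqrt(d), axis=-1)    # (N, M)
    return attn_pc @ cores_upd                                        # (N, d)
```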
- Birds of a Feather Flock Together: Background-Invariant Representations via Linear Structure in VLMs
  Exploiting linear structure in VLM embeddings, a synthetic-data pre-training method yields background-invariant representations that exceed 90% worst-group accuracy on Waterbirds, even under 100% spurious correlation with no minority examples in training.
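One way such linear structure can be exploited, sketched here purely as an assumption rather than the paper's actual method: if background information occupies a roughly linear direction in the embedding space (estimated, say, from the difference of text embeddings for the two backgrounds), projecting it out leaves a background-invariant representation. `remove_direction` is a hypothetical helper.

```python
import numpy as np

def remove_direction(embeddings, direction):
    """Project out a spurious 'background' direction from embeddings.

    embeddings : (N, d) image embeddings
    direction  : (d,) estimated background direction
    Subtracting each embedding's component along the (unit-normalized)
    direction leaves vectors orthogonal to it, i.e. invariant to
    movement along that background axis.
    """
    u = direction / np.linalg.norm(direction)
    return embeddings - np.outer(embeddings @ u, u)
```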
- What Matters for Diffusion-Friendly Latent Manifold? Prior-Aligned Autoencoders for Latent Diffusion
  Prior-Aligned Autoencoders shape latent manifolds with spatial coherence, local continuity, and global semantics to improve latent diffusion, achieving SOTA gFID 1.03 on ImageNet 256x256 with up to 13x faster convergence.
- UniISP: A Unified ISP Framework for Both Human and Machine Vision
  UniISP unifies ISP processing with a Hybrid Attention Module and Feature Adapter to produce images that are both visually pleasing for humans and informative for computer vision models.
- SoftSAE: Dynamic Top-K Selection for Adaptive Sparse Autoencoders
  SoftSAE replaces fixed-K sparsity in autoencoders with a learned, input-dependent number of active features via a soft top-k operator.
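The summary does not define the soft top-k operator, so the following is a hypothetical sketch of one plausible form: a tiny gate predicts a per-input threshold, and pre-activations pass through a steep sigmoid around that threshold, so the number of active features varies with the input instead of being pinned to a fixed K. The function name `soft_topk_encode` and the softplus threshold are assumptions, not SoftSAE's published formulation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def soft_topk_encode(x, W_enc, w_tau, temp=0.1):
    """Encode with soft, input-dependent sparsity instead of a fixed top-K.

    x     : (d,) input vector
    W_enc : (F, d) encoder dictionary of F features
    w_tau : (d,) weights of a small gate predicting a per-input threshold
    """
    a = np.maximum(W_enc @ x, 0.0)      # (F,) ReLU pre-activations
    tau = np.log1p(np.exp(w_tau @ x))   # softplus keeps the threshold >= 0
    gate = sigmoid((a - tau) / temp)    # soft membership in the active set
    return a * gate
```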
- TINS: Test-time ID-prototype-separated Negative Semantics Learning for OOD Detection
  TINS improves OOD detection by learning negative semantics at test time with ID-prototype separation, cutting average FPR95 from 14.04% to 6.72% on the Four-OOD benchmark with ImageNet-1K.
- Micro-Defects Expose Macro-Fakes: Detecting AI-Generated Images via Local Distributional Shifts
  MDMF detects AI-generated images by learning patch-level forensic signatures and quantifying their distributional discrepancies with MMD, yielding larger separation than global methods when micro-defects are present.
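For readers unfamiliar with the MMD statistic mentioned above, here is a minimal sketch of the standard (biased) squared maximum mean discrepancy with an RBF kernel between two sets of patch features; the feature extractor itself and the bandwidth choice are MDMF's own and are not reproduced here.

```python
import numpy as np

def rbf_mmd2(X, Y, sigma=1.0):
    """Biased squared MMD between two feature sets under an RBF kernel.

    X : (n, d) patch features from one image or distribution
    Y : (m, d) patch features from another
    Returns E[k(X,X)] + E[k(Y,Y)] - 2 E[k(X,Y)]: near 0 when the two
    distributions match, and large when they diverge.
    """
    def k(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2 * sigma ** 2))
    return k(X, X).mean() + k(Y, Y).mean() - 2 * k(X, Y).mean()
```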