hub Mixed citations

mixup: Beyond Empirical Risk Minimization

Hongyi Zhang, Moustapha Cisse, Yann N. Dauphin, David Lopez-Paz · 2017 · cs.LG · arXiv 1710.09412

Mixed citation behavior. Most common role is background (47%).

86 Pith papers citing it

Background 47% of classified citations


abstract

Large deep neural networks are powerful, but exhibit undesirable behaviors such as memorization and sensitivity to adversarial examples. In this work, we propose mixup, a simple learning principle to alleviate these issues. In essence, mixup trains a neural network on convex combinations of pairs of examples and their labels. By doing so, mixup regularizes the neural network to favor simple linear behavior in-between training examples. Our experiments on the ImageNet-2012, CIFAR-10, CIFAR-100, Google commands and UCI datasets show that mixup improves the generalization of state-of-the-art neural network architectures. We also find that mixup reduces the memorization of corrupt labels, increases the robustness to adversarial examples, and stabilizes the training of generative adversarial networks.
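The convex-combination step described in the abstract can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the helper names `mixup_pair` and `mixup_batch` are hypothetical, labels are assumed to be one-hot lists, and the default `alpha=0.2` is only an illustrative value. Drawing the mixing weight from a Beta(alpha, alpha) distribution follows the paper; the abstract itself does not specify how the weight is chosen.

```python
import random


def mixup_pair(x_i, y_i, x_j, y_j, lam):
    """Return the convex combination of two examples and of their labels.

    x_* are feature vectors and y_* are one-hot label vectors (plain lists).
    lam is the mixing weight in [0, 1].
    """
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x_i, x_j)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y_i, y_j)]
    return x, y


def mixup_batch(xs, ys, alpha=0.2, rng=None):
    """Mix every example with a randomly chosen partner from the same batch.

    alpha=0.2 is an illustrative default (an assumption, not a recommendation).
    One mixing weight is drawn per batch from Beta(alpha, alpha).
    """
    rng = rng or random.Random()
    lam = rng.betavariate(alpha, alpha)
    partners = list(range(len(xs)))
    rng.shuffle(partners)  # partner index for each example
    mixed = [
        mixup_pair(xs[i], ys[i], xs[j], ys[j], lam)
        for i, j in enumerate(partners)
    ]
    mixed_xs = [m[0] for m in mixed]
    mixed_ys = [m[1] for m in mixed]
    return mixed_xs, mixed_ys, lam
```

Because the labels are mixed with the same weight as the inputs, each mixed label remains a valid probability vector, so the mixed batch can be fed to a standard soft-label cross-entropy loss in place of the original batch.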


citation-role summary

background 6 · method 6 · baseline 2 · other 1

citation-polarity summary

background 7 · use method 6 · baseline 2


representative citing papers

Swin Transformer: Hierarchical Vision Transformer using Shifted Windows

cs.CV · 2021-03-25 · accept · novelty 8.0

Swin Transformer reaches 87.3% ImageNet accuracy and sets new records on COCO detection and ADE20K segmentation by replacing global self-attention with shifted-window local attention inside a hierarchical pyramid.

TCP-SSM: Efficient Vision State Space Models with Token-Conditioned Poles

cs.CV · 2026-05-12 · unverdicted · novelty 7.0

TCP-SSM conditions stable poles on visual tokens to explicitly control memory decay and oscillation in SSMs, cutting computation up to 44% while matching or exceeding accuracy on classification, segmentation, and detection.

Efficient and provably convergent end-to-end training of deep neural networks with linear constraints

math.OC · 2026-05-12 · unverdicted · novelty 7.0

An efficiently computable HS-Jacobian acts as a conservative mapping for projections onto polyhedral sets, supporting provably convergent Adam-based end-to-end training of linearly constrained deep neural networks.

LLM-guided Semi-Supervised Approaches for Social Media Crisis Data Classification

cs.AI · 2026-05-08 · conditional · novelty 7.0

LG-CoTrain, an LLM-guided co-training method, outperforms classical semi-supervised baselines for crisis tweet classification in low-resource settings with 5-25 labeled examples per class.

LookWhen? Fast Video Recognition by Learning When, Where, and What to Compute

cs.CV · 2026-05-07 · conditional · novelty 7.0

LookWhen factorizes video recognition into learning when, where, and what to compute via uniqueness-based token selection and dual-teacher distillation, achieving better accuracy-FLOPs trade-offs than baselines on multiple datasets.

Domain Generalization through Spatial Relation Induction over Visual Primitives

cs.CV · 2026-05-07 · unverdicted · novelty 7.0

PARSE improves domain generalization accuracy by factoring recognition into visual primitives and their spatial relational compositions learned end-to-end with differentiable predicates.

LEGO: LoRA-Enabled Generator-Oriented Framework for Synthetic Image Detection

cs.CV · 2026-05-06 · unverdicted · novelty 7.0

LEGO uses multiple generator-specific LoRA modules modulated by an MLP and fused with attention to detect synthetic images, achieving better performance than prior methods while using under 10% of the training data.

SignMAE: Segmentation-Driven Self-Supervised Learning for Sign Language Recognition

cs.CV · 2026-05-03 · unverdicted · novelty 7.0

SignMAE uses segmentation-driven masking in a mask-and-reconstruct self-supervised task to learn fine-grained sign representations, achieving state-of-the-art accuracy on WLASL, NMFs-CSL, and Slovo with fewer frames and modalities.

Direct Discrepancy Replay: Distribution-Discrepancy Condensation and Manifold-Consistent Replay for Continual Face Forgery Detection

cs.CV · 2026-04-14 · unverdicted · novelty 7.0

A replay method for continual face forgery detection condenses real-fake distribution discrepancies into compact maps and synthesizes compatible samples from current real faces to reduce forgetting under tight memory budgets without storing historical images.

Is your algorithm unlearning or untraining?

cs.LG · 2026-04-09 · conditional · novelty 7.0

Machine unlearning conflates reversing the influence of specific training examples (untraining) with removing the full underlying distribution or behavior (unlearning).

Unifying Contrastive and Generative Objectives for Visual Understanding and Text-to-Image Generation

cs.CV · 2026-03-03 · unverdicted · novelty 7.0

DREAM introduces Masking Warmup and Semantically Aligned Decoding to let a single encoder handle both contrastive alignment and masked generation, yielding gains over CLIP and FLUID on understanding and generation benchmarks.

ST-BCP: Tightening Coverage Bound for Backward Conformal Prediction via Non-Conformity Score Transformation

stat.ML · 2026-02-02 · conditional · novelty 7.0

ST-BCP tightens the coverage bound in Backward Conformal Prediction by applying a computable data-dependent transformation to nonconformity scores, reducing the average gap from 4.20% to 1.12% on benchmarks while proving superiority over the identity baseline.

Chronos: Learning the Language of Time Series

cs.LG · 2024-03-12 · conditional · novelty 7.0

Chronos pretrains transformer models on tokenized time series to deliver strong zero-shot forecasting across diverse domains.

The DeepFake Detection Challenge (DFDC) Dataset

cs.CV · 2020-06-12 · accept · novelty 7.0

The DFDC dataset is the largest public collection of face-swapped videos and supports detectors that generalize to in-the-wild deepfakes.

Lightweight Vision-Aided Beam Tracking for Cross-Environment mmWave Communications

eess.SP · 2026-07-01 · unverdicted · novelty 6.0

Lightweight CNN with separable convolutions, hierarchical augmentation and power-based label smoothing reaches 84% cross-environment beam prediction accuracy on two real DeepSense 6G scenarios while cutting parameters by 52x and complexity by 79x versus ResNet.

MedDiffuseMix: Preserving Diagnostic Evidence with Saliency-Aware Diffusion Medical Image Data Augmentation

cs.CV · 2026-06-25 · unverdicted · novelty 6.0

MedDiffuseMix uses classifier saliency maps to restrict diffusion-based mixing to non-diagnostic areas of medical images, yielding accuracy, F1, and AUC gains over standard, Mixup, and diffusion baselines on four public datasets.

Blind Recovery of Latent Domains via Unsupervised Symmetry Discovery

cs.LG · 2026-06-16 · unverdicted · novelty 6.0

Unsupervised symmetry discovery via shallow group-convolutional networks recovers latent domains from linear measurements of random fields by learning symmetry actions under stationarity and locality constraints.

MindAlign: Decoding Inner Speech from fMRI Signals via Multimodal Embedding Alignment under Limited Data

cs.CL · 2026-06-15 · unverdicted · novelty 6.0

MindAlign decodes inner speech from fMRI via subject-specific neural-semantic alignment into a multimodal space followed by prompting of a frozen LM, outperforming baselines and generalizing across subjects.

Demystifying Training-Time Augmentation for Data-Constrained Language Model Pretraining

cs.LG · 2026-06-15 · unverdicted · novelty 6.0

Training-time augmentations in token noise, permutation, and offset categories reduce overfitting and improve minimum validation loss in multi-epoch autoregressive pretraining on fixed corpora.

SNR-ST-Mix: Sample-specific Neighborhood Regression Mixup for Augmented Spatial Transcriptomics Imputation with Deep Neural Network

cs.LG · 2026-06-07 · unverdicted · novelty 6.0

SNR-ST-Mix is a geometry- and expression-aware mixup augmentation that constrains interpolation to k-nearest spatial neighbors and weights by transcriptomic similarity for spatial transcriptomics imputation.

Mitigating Stethoscope-Induced Shortcuts in Respiratory Sound Classification under Federated Domain Generalization with Causality-Inspired Interventions

eess.AS · 2026-05-28 · unverdicted · novelty 6.0

A causality-inspired FedDG framework with device style intervention network, counterfactual text augmentation, and gradient alignment outperforms baselines on leave-one-device-out validation for RSC on ICBHI and SPRSound datasets.

Representation-Conditioned Diffusion Models for Guided Training Data Generation

cs.CV · 2026-05-26 · unverdicted · novelty 6.0

Representation-conditioned diffusion models generate synthetic ImageNet data that trains classifiers to higher top-1 accuracy than class-conditioned generation (+10.76 pp) or real data (+2.0 pp when scaled).

GAMR: Geometric-Aware Manifold Regularization with Virtual Outlier Synthesis for Learning with Noisy Labels

cs.CV · 2026-05-20 · unverdicted · novelty 6.0

GAMR introduces geometric-aware manifold regularization via virtual outlier synthesis to enhance intra-class compactness and inter-class separation, improving robustness to noisy labels beyond passive sample filtering.

HamBR: Active Decision Boundary Restoration Based on Hamiltonian Dynamics for Learning with Noisy Labels

cs.CV · 2026-05-12 · unverdicted · novelty 6.0

HamBR uses Spherical HMC to probe ambiguous regions and synthesize virtual outliers with energy-based repulsion to restore decision boundaries degraded by noisy labels, achieving SOTA on CIFAR and real-world benchmarks.

citing papers explorer

Showing 36 of 86 citing papers.

Personalized Generative Models for Contextual Debiasing cs.CV · 2026-05-25 · unverdicted · none · ref 63 · internal anchor
DecoupleGen personalizes diffusion models to create images with uncommon contexts for debiasing object recognition, yielding consistent gains on scene classification tasks.
Noise-Robust Financial Numerical Entity Attribute Tagging cs.AI · 2026-05-24 · unverdicted · none · ref 23 · internal anchor
NORA applies task-aware weighting and NPK filtering to handle label noise in multi-attribute tagging of financial numerical entities, outperforming baselines on a new 6.6M-instance benchmark.
FDDet: Achieving Data-Efficient Food Defect Detection Under Real-World Scenarios cs.CV · 2026-05-23 · unverdicted · none · ref 18 · internal anchor
FDDet is a semi-supervised object detection framework with BBoxMixUp and CGPC that outperforms standard detectors on the new FDD-48 food defect dataset under data-limited real-world conditions.
Holistic Reliability Propagation: Decoupling Annotation and Prediction for Robust Noisy-Label cs.CV · 2026-05-20 · unverdicted · none · ref 37 · internal anchor
HRP decouples annotation reliability (alpha) and pseudo-label reliability (beta) via bilevel meta-learning and routes them to distinct objectives in reliability-aware Mixup and contrastive learning for improved noisy-label robustness.
Axiomatizing Neural Networks via Pursuit of Subspaces cs.LG · 2026-05-19 · unverdicted · none · ref 78 · internal anchor
Authors introduce the Pursuit of Subspaces (PoS) hypothesis, an axiomatic geometric framework that unifies explanations for representation, computation, and generalization in shallow and deep neural networks.
Graph Transductive Sharpening: Leveraging Unlabeled Predictions in Node Classification cs.LG · 2026-05-18 · unverdicted · none · ref 54 · internal anchor
Transductive Sharpening adds an entropy-minimization term on unlabeled-node predictions to the training objective for graph node classification.
CAST: Channel-Aware Spatial Transfer Learning with Pseudo-Image Radar for Sign Language Recognition cs.CV · 2026-05-09 · unverdicted · none · ref 31 · internal anchor
CAST achieves 80.5% Top-1 accuracy on radar-only sign language recognition by fusing physics-aware CVD and RTM representations through channel-aware spatial attention and asymmetric cross-attention.
Agentic AIs Are the Missing Paradigm for Out-of-Distribution Generalization in Foundation Models cs.LG · 2026-05-07 · unverdicted · none · ref 38 · internal anchor
Agentic AI systems are required to overcome the parameter coverage ceiling that prevents foundation models from handling certain out-of-distribution cases.
HiMix: Hierarchical Artifact-aware Mixup for Generalized Synthetic Image Detection cs.CV · 2026-04-30 · unverdicted · none · ref 52 · internal anchor
HiMix combines mixup augmentation to create transitional real-fake samples with hierarchical global-local artifact feature fusion to achieve better generalization in detecting AI-generated images from unseen generators.
Investigating Bias and Fairness in Appearance-based Gaze Estimation cs.CV · 2026-04-12 · unverdicted · none · ref 79 · internal anchor
First large-scale fairness audit of gaze estimators reveals sizable accuracy disparities by ethnicity and gender, with existing mitigation methods providing only marginal fairness gains.
Beyond Surface Artifacts: Capturing Shared Latent Forgery Knowledge Across Modalities cs.CV · 2026-04-09 · unverdicted · none · ref 64 · internal anchor
Introduces MAF framework and DeepModal-Bench to capture universal cross-modal forgery traces for better generalization in multimodal deepfake detection.
Multi-Aspect Knowledge Distillation for Language Model with Low-rank Factorization cs.CL · 2026-04-03 · unverdicted · none · ref 4 · internal anchor
MaKD distills pre-trained language models by deeply mimicking self-attention and feed-forward modules across aspects using low-rank factorization, matching strong baselines at the same parameter budget and extending to auto-regressive models.
Why Invariance is Not Enough for Biomedical Domain Generalization and How to Fix It eess.IV · 2026-04-02 · unverdicted · none · ref 69 · internal anchor
MaskGen improves domain generalization for biomedical image segmentation by using source intensities plus domain-stable foundation model representations with minimal added complexity.
Ordinal Adaptive Correction: A Data-Centric Approach to Ordinal Image Classification with Noisy Labels cs.CV · 2025-09-02 · unverdicted · none · ref 12 · internal anchor
ORDAC adaptively corrects noisy ordinal labels via dynamic label distribution adjustments, yielding lower error and higher recall on noisy Adience and Diabetic Retinopathy benchmarks.
Two-Stage Framework for Efficient UAV-Based Wildfire Video Analysis with Adaptive Compression and Fire Source Detection cs.CV · 2025-08-22 · unverdicted · none · ref 39 · internal anchor
A two-stage UAV framework prunes redundant wildfire video clips via a policy network with station point mechanism and detects fire sources in real time using an improved YOLOv8 model.
i-WiViG: Interpretable Window Vision GNN cs.CV · 2025-03-11 · unverdicted · none · ref 37 · internal anchor
i-WiViG is an interpretable window vision GNN that constrains nodes to disjoint local windows and applies learnable sparse attention to identify relevant subgraphs, delivering competitive performance on scene classification and regression with natural and remote-sensing images.
Automatic Dataset Construction (ADC): Sample Collection, Data Curation, and Beyond cs.AI · 2024-08-21 · unverdicted · none · ref 49 · internal anchor
The ADC method automates the creation of large image classification datasets using LLMs and search engines, achieving 79% human agreement and reducing label noise on a 1 million image clothing dataset, while also releasing benchmarks for noise and bias issues.
YOLOv4: Optimal Speed and Accuracy of Object Detection cs.CV · 2020-04-23 · unverdicted · none · ref 92 · internal anchor
YOLOv4 achieves 43.5% AP (65.7% AP50) on MS COCO at ~65 FPS on Tesla V100 by integrating WRC, CSP, CmBN, SAT, Mish activation, Mosaic augmentation, DropBlock, and CIoU loss.
Annotation-Free Cardiac Vessel Segmentation via Knowledge Transfer from Retinal Images eess.IV · 2019-07-26 · unverdicted · none · ref 15 · internal anchor
SC-GAN performs annotation-free coronary artery segmentation by transferring shape-consistent knowledge from retinal vessel annotations via a GAN trained on 1092 DSA images.
The Receptive Field as a Regularizer in Deep Convolutional Neural Networks for Acoustic Scene Classification cs.LG · 2019-07-03 · unverdicted · none · ref 19 · internal anchor
Tuning receptive field sizes in ResNet and DenseNet enables them to outperform VGG models on acoustic scene classification across three datasets.
Efficient data augmentation using graph imputation neural networks stat.ML · 2019-06-20 · unverdicted · none · ref 17 · internal anchor
Graph imputation neural networks augment semi-supervised datasets up to 10x by reconstructing heavily damaged samples on a similarity graph, improving over fully-supervised baselines on benchmarks.
PRISM: Prioritized Channel Importance with Semi-supervised Domain Adaptation for Cross-Subject EEG Emotion Recognition cs.LG · 2026-07-01 · unverdicted · none · ref 32 · internal anchor
PRISM combines data-dependent channel weighting via expert ensemble and confidence-filtered pseudo-label domain adaptation to outperform prior methods on cross-subject EEG emotion tasks in DEAP, DREAMER, and SEED.
Improving Combined Detection and Classification of TEM Defects via Mask-Conditioned Latent Diffusion Augmentation cs.CV · 2026-06-01 · unverdicted · none · ref 28 · internal anchor
Mask-conditioned LDM generates synthetic TEM defect image-mask pairs that augment small experimental sets and produce up to 0.02 gain in harmonic-mean F1 for combined detection and classification with Mask R-CNN.
an interpretable vision transformer framework for automated brain tumor classification cs.CV · 2026-04-23 · unverdicted · none · ref 15 · internal anchor
Vision Transformer with CLAHE preprocessing, two-stage fine-tuning, MixUp/CutMix, EMA, TTA, and attention rollout achieves 99.29% accuracy and 99.25% macro F1 on four-class brain tumor MRI classification from 7023 scans.
A Wasserstein GAN-based climate scenario generator for risk management and insurance: the case of soil subsidence cs.LG · 2026-04-22 · unverdicted · none · ref 67 · internal anchor
A conditional Wasserstein GAN generates plausible future SWI drought trajectories for French insurance risk management under climate change.
PR3DICTR: A modular AI framework for medical 3D image-based detection and outcome prediction cs.CV · 2026-04-03 · unverdicted · none · ref 14 · internal anchor
PR3DICTR is a new open-access modular framework for 3D medical image classification and outcome prediction that works with as little as two lines of code.
CLIP the Landscape: Automated Tagging of Crowdsourced Landscape Images cs.CV · 2025-06-13 · unverdicted · none · ref 18 · internal anchor
A lightweight multi-modal CLIP pipeline predicts exact-match geographical tags on a Kaggle subset of the Geograph crowdsourced image archive by fusing image, location, and title embeddings.
Attention based Convolutional Recurrent Neural Network for Environmental Sound Classification cs.SD · 2019-07-04 · unverdicted · none · ref 26 · internal anchor
A CRNN model with frame-level attention achieves state-of-the-art accuracy on ESC-10 and ESC-50 environmental sound classification datasets.
Rethinking Text-to-Image as Semantic-Aware Data Augmentation for Indoor Scene Recognition cs.CV · 2026-06-17 · unverdicted · none · ref 23 · internal anchor
Stable Diffusion augments limited indoor scene datasets for better recognition models, and DIRE detects the generated images with 100% accuracy using lightweight classifiers.
CellNet -- Localizing Cells using Sparse and Noisy Point Annotations cs.CV · 2026-06-10 · unverdicted · none · ref 43 · internal anchor
CellNet applies regression-based deep learning to count cells from sparse point annotations in microscopy images and claims better performance than zero-shot methods in low-data regimes.
Optimizing 2D Input Representations and Sub-phase Fusion Strategies for Differential Diagnosis of Asthma and COPD Using CNN- and GRU-Based Networks eess.AS · 2026-06-09 · unverdicted · none · ref 53 · internal anchor
MFCC matrices with 13 coefficients and adaptive windowing plus direct concatenation outperform log-mel spectrograms and VAR models for asthma-COPD classification, reaching F1 scores of 0.877 (cycle) and 0.855 (subject).
The General Theory of Localization Methods cs.LG · 2026-05-20 · unverdicted · none · ref 147 · 2 links · internal anchor
The localization method is presented as a unifying framework connecting kernel methods, MeanShift, Hopfield networks, LLE, fuzzy inference, denoising autoencoders, and Transformers via local models and the localization trick.
SleepNet and DreamNet: Enriching and Reconstructing Representations for Consolidated Visual Classification cs.LG · 2024-09-03 · unverdicted · none · ref 46 · internal anchor
SleepNet and DreamNet enrich visual features via supervised pre-trained encoders and reconstruct hidden states with encoder-decoder frameworks to outperform prior state-of-the-art classifiers.
HODGEPODGE: Sound event detection based on ensemble of semi-supervised learning methods cs.SD · 2019-07-17 · unverdicted · none · ref 11 · internal anchor
An ensemble of CRNNs trained with consistency regularization and MixUp on mixed labeled/unlabeled data reaches 42.0% event-based F-measure on DCASE 2019 Task 4, beating the 25.8% baseline.
Image-Based Malware Type Classification on MalNet-Image Tiny: Effects of Multi-Scale Fusion, Transfer Learning, Data Augmentation, and Schedule-Free Optimization cs.CR · 2026-04-22 · unverdicted · none · ref 18 · internal anchor
Pretraining plus Mixup/TrivialAugment and a feature pyramid network lift macro-F1 from 0.65 to 0.69 on 43-class malware image classification while cutting training epochs from 96 to 10.
Know Yourself Better: Diverse Object-Related Features Improve Open Set Recognition cs.CV · 2024-04-16 · unreviewed · ref 52 · internal anchor
