mixup: Beyond Empirical Risk Minimization
38 Pith papers cite this work. Polarity classification is still indexing.
abstract
Large deep neural networks are powerful, but exhibit undesirable behaviors such as memorization and sensitivity to adversarial examples. In this work, we propose mixup, a simple learning principle to alleviate these issues. In essence, mixup trains a neural network on convex combinations of pairs of examples and their labels. By doing so, mixup regularizes the neural network to favor simple linear behavior in-between training examples. Our experiments on the ImageNet-2012, CIFAR-10, CIFAR-100, Google commands and UCI datasets show that mixup improves the generalization of state-of-the-art neural network architectures. We also find that mixup reduces the memorization of corrupt labels, increases the robustness to adversarial examples, and stabilizes the training of generative adversarial networks.
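The convex-combination step the abstract describes can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's reference implementation: the function name `mixup_batch` and the default `alpha=0.2` are illustrative choices (the paper tunes alpha per task), and labels are assumed to be one-hot encoded.

```python
import numpy as np

def mixup_batch(x, y, alpha=0.2, rng=None):
    """Return convex combinations of a batch with a shuffled copy of itself.

    x: (n, ...) float array of inputs
    y: (n, k) one-hot label array
    """
    rng = rng or np.random.default_rng()
    lam = rng.beta(alpha, alpha)       # mixing weight lambda ~ Beta(alpha, alpha)
    perm = rng.permutation(len(x))     # pair each example with a random partner
    x_mix = lam * x + (1 - lam) * x[perm]
    y_mix = lam * y + (1 - lam) * y[perm]  # labels are mixed with the same weight
    return x_mix, y_mix, lam
```

Because labels are mixed with the same coefficient as inputs, training on `(x_mix, y_mix)` with a standard cross-entropy loss pushes the network toward the linear in-between behavior the abstract mentions.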
hub tools
claims ledger
co-cited works
representative citing papers
citing papers explorer
-
TCP-SSM: Efficient Vision State Space Models with Token-Conditioned Poles
TCP-SSM conditions stable poles on visual tokens to explicitly control memory decay and oscillation in SSMs, cutting computation up to 44% while matching or exceeding accuracy on classification, segmentation, and detection.
-
Efficient and provably convergent end-to-end training of deep neural networks with linear constraints
An efficiently computable HS-Jacobian acts as a conservative mapping for projections onto polyhedral sets, supporting provably convergent Adam-based end-to-end training of linearly constrained deep neural networks.
-
LLM-guided Semi-Supervised Approaches for Social Media Crisis Data Classification
LG-CoTrain, an LLM-guided co-training method, outperforms classical semi-supervised baselines for crisis tweet classification in low-resource settings with 5-25 labeled examples per class.
-
LookWhen? Fast Video Recognition by Learning When, Where, and What to Compute
LookWhen factorizes video recognition into learning when, where, and what to compute via uniqueness-based token selection and dual-teacher distillation, achieving better accuracy-FLOPs trade-offs than baselines on multiple datasets.
-
Domain Generalization through Spatial Relation Induction over Visual Primitives
PARSE improves domain generalization accuracy by factoring recognition into visual primitives and their spatial relational compositions learned end-to-end with differentiable predicates.
-
LEGO: LoRA-Enabled Generator-Oriented Framework for Synthetic Image Detection
LEGO uses multiple generator-specific LoRA modules modulated by an MLP and fused with attention to detect synthetic images, achieving better performance than prior methods while using under 10% of the training data.
-
SignMAE: Segmentation-Driven Self-Supervised Learning for Sign Language Recognition
SignMAE uses segmentation-driven masking in a mask-and-reconstruct self-supervised task to learn fine-grained sign representations, achieving state-of-the-art accuracy on WLASL, NMFs-CSL, and Slovo with fewer frames and modalities.
-
Direct Discrepancy Replay: Distribution-Discrepancy Condensation and Manifold-Consistent Replay for Continual Face Forgery Detection
A replay method for continual face forgery detection condenses real-fake distribution discrepancies into compact maps and synthesizes compatible samples from current real faces to reduce forgetting under tight memory budgets without storing historical images.
-
Is your algorithm unlearning or untraining?
Machine unlearning conflates reversing the influence of specific training examples (untraining) with removing the full underlying distribution or behavior (unlearning).
-
Chronos: Learning the Language of Time Series
Chronos pretrains transformer models on tokenized time series to deliver strong zero-shot forecasting across diverse domains.
-
The DeepFake Detection Challenge (DFDC) Dataset
The DFDC dataset is the largest public collection of face-swapped videos and supports detectors that generalize to in-the-wild deepfakes.
-
HamBR: Active Decision Boundary Restoration Based on Hamiltonian Dynamics for Learning with Noisy Labels
HamBR uses Spherical HMC to probe ambiguous regions and synthesize virtual outliers with energy-based repulsion to restore decision boundaries degraded by noisy labels, achieving SOTA on CIFAR and real-world benchmarks.
-
LiBaGS: Lightweight Boundary Gap Synthesis for Targeted Synthetic Data Selection
LiBaGS scores and selects synthetic data near decision boundaries using proximity, uncertainty, density, and validity, with boundary-gap allocation and marginal stopping to improve training accuracy.
-
Cross-Sample Relational Fusion: Unifying Domain Generalization and Class-Incremental Learning
CORF unifies domain generalization and class-incremental learning via selective sample refinement with spatial maps and confidence weighting plus cascaded relational distillation.
-
ICDAR 2026 Competition on Writer Identification and Pen Classification from Hand-Drawn Circles
A new dataset of hand-drawn circles from 66 writers and 8 pens yields competition results of 64.8% top-1 accuracy for open-set writer identification and 92.7% for pen classification.
-
Leveraging Image Generators to Address Training Data Scarcity: The Gen4Regen Dataset for Forest Regeneration Mapping
Mixing real UAV imagery with 2101 AI-generated image-mask pairs improves semantic segmentation F1 scores for fine-grained forest species by over 15 percentage points overall and up to 30 points for rare classes.
-
Cheeger-Hodge Contrastive Learning for Structurally Robust Graph Representation Learning
CHCL aligns a Cheeger-Hodge joint signature across graph augmentations to produce embeddings that remain stable under local structural changes.
-
Beyond Binary Contrast: Modeling Continuous Skeleton Action Spaces with Transitional Anchors
TranCLR models continuous skeleton action spaces with transitional anchors and multi-level manifold calibration, yielding smoother and more accurate representations than binary contrastive methods.
-
PAC-Bayes Bounds for Gibbs Posteriors via Singular Learning Theory
PAC-Bayes bounds for Gibbs posteriors are obtained via singular learning theory, producing explicit and tighter posterior-averaged risk bounds that adapt to data structure in overparameterized models.
-
Human Gaze-based Dual Teacher Guidance Learning for Semi-Supervised Medical Image Segmentation
HG-DTGL integrates human gaze as an extra teacher in mean-teacher learning via GazeMix, MGP module and Gaze Loss, reporting superior segmentation across ten organs on multiple modalities.
-
Feature-Aware Anisotropic Local Differential Privacy for Utility-Preserving Graph Representation Learning in Metal Additive Manufacturing
FI-LDP-HGAT applies feature-importance-aware anisotropic local differential privacy to a hierarchical graph attention network, recovering 81.5% utility at epsilon=4 and 0.762 defect recall at epsilon=2 on a DED porosity dataset while outperforming standard LDP and DP-SGD baselines.
-
OASIC: Occlusion-Agnostic and Severity-Informed Classification
OASIC uses anomaly-based masking and severity estimation to select occlusion-matched models, improving AUC on occluded images by up to 23.7 points.
-
Can LLMs Learn to Reason Robustly under Noisy Supervision?
Online Label Refinement lets LLMs learn robust reasoning from noisy supervision by correcting labels when majority answers show rising rollout success and stable history, delivering 3-4% gains on math and reasoning benchmarks even at high noise levels.
-
Rényi Attention Entropy for Patch Pruning
Rényi entropy of attention maps serves as a tunable criterion for pruning redundant patches in vision transformers, reducing compute with preserved accuracy on image recognition.
-
YOLOv12: Attention-Centric Real-Time Object Detectors
YOLOv12 is a new attention-based real-time object detector that reports higher accuracy than YOLOv10, YOLOv11, and RT-DETR variants at comparable or better speed and efficiency.
-
Revisiting Feature Prediction for Learning Visual Representations from Video
V-JEPA models trained only on feature prediction from 2 million public videos achieve 81.9% on Kinetics-400, 72.2% on Something-Something-v2, and 77.9% on ImageNet-1K using frozen ViT-H/16 backbones.
-
CAST: Channel-Aware Spatial Transfer Learning with Pseudo-Image Radar for Sign Language Recognition
CAST achieves 80.5% Top-1 accuracy on radar-only sign language recognition by fusing physics-aware CVD and RTM representations through channel-aware spatial attention and asymmetric cross-attention.
-
Agentic AIs Are the Missing Paradigm for Out-of-Distribution Generalization in Foundation Models
Agentic AI systems are required to overcome the parameter coverage ceiling that prevents foundation models from handling certain out-of-distribution cases.
-
HiMix: Hierarchical Artifact-aware Mixup for Generalized Synthetic Image Detection
HiMix combines mixup augmentation to create transitional real-fake samples with hierarchical global-local artifact feature fusion to achieve better generalization in detecting AI-generated images from unseen generators.
-
Investigating Bias and Fairness in Appearance-based Gaze Estimation
The first large-scale fairness audit of gaze estimators reveals sizable accuracy disparities by ethnicity and gender, with existing mitigation methods providing only marginal fairness gains.
-
Beyond Surface Artifacts: Capturing Shared Latent Forgery Knowledge Across Modalities
Introduces MAF framework and DeepModal-Bench to capture universal cross-modal forgery traces for better generalization in multimodal deepfake detection.
-
Multi-Aspect Knowledge Distillation for Language Model with Low-rank Factorization
MaKD distills pre-trained language models by deeply mimicking self-attention and feed-forward modules across aspects using low-rank factorization, matching strong baselines at the same parameter budget and extending to auto-regressive models.
-
Why Invariance is Not Enough for Biomedical Domain Generalization and How to Fix It
MaskGen improves domain generalization for biomedical image segmentation by using source intensities plus domain-stable foundation model representations with minimal added complexity.
-
YOLOv4: Optimal Speed and Accuracy of Object Detection
YOLOv4 achieves 43.5% AP (65.7% AP50) on MS COCO at ~65 FPS on Tesla V100 by integrating WRC, CSP, CmBN, SAT, Mish activation, Mosaic augmentation, DropBlock, and CIoU loss.
-
An Interpretable Vision Transformer Framework for Automated Brain Tumor Classification
Vision Transformer with CLAHE preprocessing, two-stage fine-tuning, MixUp/CutMix, EMA, TTA, and attention rollout achieves 99.29% accuracy and 99.25% macro F1 on four-class brain tumor MRI classification from 7023 scans.
-
A Wasserstein GAN-based climate scenario generator for risk management and insurance: the case of soil subsidence
A conditional Wasserstein GAN generates plausible future SWI drought trajectories for French insurance risk management under climate change.
-
PR3DICTR: A modular AI framework for medical 3D image-based detection and outcome prediction
PR3DICTR is a new open-access modular framework for 3D medical image classification and outcome prediction that works with as few as two lines of code.
-
Image-Based Malware Type Classification on MalNet-Image Tiny: Effects of Multi-Scale Fusion, Transfer Learning, Data Augmentation, and Schedule-Free Optimization
Pretraining plus Mixup/TrivialAugment and a feature pyramid network lift macro-F1 from 0.65 to 0.69 on 43-class malware image classification while cutting training epochs from 96 to 10.