pith. machine review for the scientific record.

arxiv: 2604.18759 · v1 · submitted 2026-04-20 · 💻 cs.CL

Recognition: unknown

Model-Agnostic Meta Learning for Class Imbalance Adaptation

Authors on Pith · no claims yet

Pith reviewed 2026-05-10 04:44 UTC · model grok-4.3

classification 💻 cs.CL
keywords: class imbalance · meta-learning · NLP · instance weighting · resampling · bi-level optimization · hard examples

The pith

HAMR uses bi-level optimization to dynamically weight hard minority instances and their neighbors in NLP tasks.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents Hardness-Aware Meta-Resample, or HAMR, as a method to address class imbalance in natural language processing by prioritizing difficult samples from rare classes. It relies on bi-level optimization to learn instance weights and then applies neighborhood-aware resampling to emphasize those hard examples along with similar ones. This matters for applications such as biomedical text analysis, disaster response classification, and sentiment detection, where minority classes often cause models to fail. If the approach holds, it offers a way to improve performance on imbalanced data without relying on fixed balancing rules tailored to each dataset.

Core claim

HAMR employs bi-level optimization to dynamically estimate instance-level weights that prioritize genuinely challenging samples and minority classes, while a neighborhood-aware resampling mechanism amplifies training focus on hard examples and their semantically similar neighbors. The framework is tested on six imbalanced datasets across biomedical, disaster response, and sentiment domains, where it yields substantial gains for minority classes and outperforms strong baselines.
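
For concreteness, here is a minimal sketch of one bi-level step of the kind this claim describes, in the familiar learning-to-reweight style: a virtual (inner) update with pre-meta weights, a weighting-network update from meta-validation feedback, then the actual model update with post-meta weights. Everything below is illustrative rather than the paper's code: `WeightNet`, `meta_step`, and the assumption of a single linear classifier are ours.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class WeightNet(nn.Module):
    """Hypothetical weighting network: per-example loss -> weight in (0, 1)."""
    def __init__(self, hidden=16):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(1, hidden), nn.ReLU(),
                                 nn.Linear(hidden, 1), nn.Sigmoid())

    def forward(self, losses):                      # losses: (B,)
        return self.net(losses.unsqueeze(1)).squeeze(1)

def meta_step(model, weight_net, opt, w_opt, xb, yb, x_meta, y_meta, lr=0.1):
    # Inner loop: virtual model update using pre-meta weights.
    losses = F.cross_entropy(model(xb), yb, reduction="none")
    w_pre = weight_net(losses.detach())
    grads = torch.autograd.grad((w_pre * losses).mean(),
                                model.parameters(), create_graph=True)
    fast = [p - lr * g for p, g in zip(model.parameters(), grads)]
    # Outer loop: update the weighting network from meta-validation feedback.
    # (assumes `model` is a single nn.Linear, so fast = [weight, bias])
    meta_loss = F.cross_entropy(F.linear(x_meta, fast[0], fast[1]), y_meta)
    w_opt.zero_grad(); meta_loss.backward(); w_opt.step()
    # Actual model update with post-meta weights.
    losses = F.cross_entropy(model(xb), yb, reduction="none")
    w_post = weight_net(losses.detach()).detach()
    opt.zero_grad(); (w_post * losses).mean().backward(); opt.step()
```

The load-bearing detail is `create_graph=True`: it keeps the virtual update differentiable, so the meta-validation loss can flow back into the weighting network before the real update is taken.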

What carries the argument

The Hardness-Aware Meta-Resample (HAMR) framework, which uses bi-level optimization to compute dynamic instance weights and pairs it with neighborhood-aware resampling to focus training on difficult minority examples.
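
The resampling half admits an equally compact sketch: propagate each example's hardness score to its k nearest neighbors in a fixed embedding space, then normalize the result into a sampling distribution. The propagation rule and its parameters (`k`, `spread`) are assumptions for illustration, not the paper's exact mechanism.

```python
import numpy as np

def neighborhood_sample_weights(embeddings, hardness, k=5, spread=0.5):
    """embeddings: (n, d) fixed sentence vectors; hardness: (n,) scores >= 0."""
    z = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sims = z @ z.T                             # cosine similarity
    np.fill_diagonal(sims, -np.inf)            # exclude self-matches
    nn_idx = np.argsort(-sims, axis=1)[:, :k]  # k nearest neighbors per example
    w = hardness.astype(float).copy()
    for i in range(len(hardness)):
        # each hard example shares part of its hardness with its neighbors
        w[nn_idx[i]] += spread * hardness[i] / k
    return w / w.sum()                         # resampling distribution

# usage: draw one resampled epoch of training indices
# p = neighborhood_sample_weights(E, h)
# idx = np.random.default_rng(0).choice(len(h), size=len(h), p=p)
```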

If this is right

  • Minority class performance improves substantially across the tested datasets.
  • HAMR outperforms strong baselines consistently in biomedical, disaster response, and sentiment tasks.
  • The bi-level optimization and neighborhood resampling modules contribute synergistically to the observed gains.
  • The method functions as a flexible, generalizable adaptation for class imbalance in varied NLP settings.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The weighting mechanism could reduce reliance on manual class balancing steps when class difficulty shifts within a single domain.
  • The same bi-level structure might transfer to non-text classification problems if an analogous definition of hardness is available.
  • Checking performance on larger transformer models would clarify whether the added optimization steps remain practical at scale.

Load-bearing premise

The bi-level optimization will produce instance weights that generalize beyond the six evaluated datasets without overfitting to their particular difficulty distributions or domain characteristics.

What would settle it

Running HAMR on a seventh imbalanced dataset drawn from a different domain and checking whether minority-class F1 scores remain higher than those achieved by standard oversampling or reweighting baselines.
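
The comparison itself is mechanical once such a seventh dataset exists; in this hypothetical check, `hamr_pred`, `baseline_pred`, and the minority label set are placeholders for the experiment described above.

```python
import numpy as np
from sklearn.metrics import f1_score

def minority_f1(y_true, y_pred, minority_labels):
    """Mean F1 restricted to the designated minority classes."""
    scores = f1_score(y_true, y_pred, labels=minority_labels, average=None)
    return float(np.mean(scores))

# y_true, hamr_pred, baseline_pred: labels on the held-out seventh dataset
# gain = minority_f1(y_true, hamr_pred, rare) - minority_f1(y_true, baseline_pred, rare)
```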

Figures

Figures reproduced from arXiv: 2604.18759 by Guangzeng Han, Hanshu Rao, Xiaolei Huang.

Figure 1
Figure 1: Framework of Hardness-Aware Meta-Resample (HAMR). HAMR employs a bi-level optimization: an inner loop performs an intermediate model update using pre-meta weights, and an outer loop updates the weighting network from meta-validation feedback and applies its post-meta weights for the actual model update. Embedding-based neighborhoods guide resampling toward clusters of hard examples, complementing adaptive …
Figure 2
Figure 2: Model performance on majority and minority classes, grouped by quartiles, with Q1 denoting the rarest …
read the original abstract

Class imbalance is a widespread challenge in NLP tasks, significantly hindering robust performance across diverse domains and applications. We introduce Hardness-Aware Meta-Resample (HAMR), a unified framework that adaptively addresses both class imbalance and data difficulty. HAMR employs bi-level optimizations to dynamically estimate instance-level weights that prioritize genuinely challenging samples and minority classes, while a neighborhood-aware resampling mechanism amplifies training focus on hard examples and their semantically similar neighbors. We validate HAMR on six imbalanced datasets covering multiple tasks and spanning biomedical, disaster response, and sentiment domains. Experimental results show that HAMR achieves substantial improvements for minority classes and consistently outperforms strong baselines. Extensive ablation studies demonstrate that our proposed modules synergistically contribute to performance gains and highlight HAMR as a flexible and generalizable approach for class imbalance adaptation. Code is available at https://github.com/trust-nlp/ImbalanceLearning.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper introduces Hardness-Aware Meta-Resample (HAMR), a model-agnostic framework for class imbalance in NLP that uses bi-level optimization to learn instance-level weights prioritizing hard minority samples, combined with a neighborhood-aware resampling step that amplifies focus on difficult examples and their semantically similar neighbors. It reports consistent outperformance over strong baselines on six imbalanced datasets spanning biomedical, disaster response, and sentiment domains, with ablations showing synergistic contributions from the proposed modules.

Significance. If the empirical results hold under rigorous controls, HAMR provides a flexible, generalizable approach to class imbalance adaptation that integrates meta-learning with adaptive resampling. The public code release supports reproducibility, and the multi-domain evaluation strengthens the case for broader applicability beyond the tested tasks.

major comments (2)
  1. [Method] Method section (bi-level optimization and neighborhood construction): the description does not specify whether neighborhood similarity is computed from fixed external embeddings or from the evolving model parameters/predictions during training. If the latter, the resampling decisions become dependent on the same parameters being optimized, creating a potential circular feedback loop that could inflate gains on minority classes without providing an independent hardness signal.
  2. [Experiments] Experiments section (results and ablations): the central claim of 'substantial improvements' and 'consistent outperformance' is presented without reported quantitative details on baseline performance levels, statistical significance (e.g., p-values or confidence intervals across runs), or controls for post-hoc dataset selection, which are load-bearing for assessing whether the gains generalize or reflect particular dataset characteristics.
minor comments (2)
  1. [Method] Clarify the exact bi-level optimization formulation, including how the inner and outer loops interact with the resampling weights, to improve reproducibility.
  2. [Discussion] Add explicit discussion of limitations, such as computational overhead of the bi-level optimization and potential sensitivity to hyperparameter choices in the neighborhood construction.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on our manuscript. We address each major comment below with specific responses and commit to revisions that improve clarity and rigor without altering the core contributions.

read point-by-point responses
  1. Referee: [Method] Method section (bi-level optimization and neighborhood construction): the description does not specify whether neighborhood similarity is computed from fixed external embeddings or from the evolving model parameters/predictions during training. If the latter, the resampling decisions become dependent on the same parameters being optimized, creating a potential circular feedback loop that could inflate gains on minority classes without providing an independent hardness signal.

    Authors: We appreciate the referee highlighting this potential ambiguity. In the HAMR framework, neighborhood similarity is computed using fixed external embeddings from a pre-trained Sentence-BERT model, which remain unchanged during training and are independent of the bi-level optimization process for instance weights. This design ensures the resampling step receives an external hardness signal rather than relying on evolving model predictions. We will revise the method section to explicitly state this choice, include implementation details on the embedding model, and add a brief discussion of how this separation prevents circular feedback (a sketch of this fixed-embedding setup follows these responses). revision: yes

  2. Referee: [Experiments] Experiments section (results and ablations): the central claim of 'substantial improvements' and 'consistent outperformance' is presented without reported quantitative details on baseline performance levels, statistical significance (e.g., p-values or confidence intervals across runs), or controls for post-hoc dataset selection, which are load-bearing for assessing whether the gains generalize or reflect particular dataset characteristics.

    Authors: We agree that stronger statistical reporting and transparency on dataset choices are needed to support the claims. The current manuscript reports performance metrics across six datasets but lacks explicit standard deviations, p-values, and a dedicated discussion of selection criteria. In revision, we will expand the experiments section to include mean results with standard deviations over multiple runs, paired statistical tests with p-values, confidence intervals, and a paragraph explaining that the datasets were selected based on established benchmarks in the class imbalance literature (biomedical, disaster, sentiment) rather than post-hoc filtering. These changes will be added without modifying the existing results. revision: yes
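
Both commitments are straightforward to make concrete. The first response's fixed-embedding setup amounts to encoding the corpus once, before training, with a frozen encoder; the model name below is a common Sentence-BERT default, an assumption rather than the paper's documented choice.

```python
from sentence_transformers import SentenceTransformer
import numpy as np

encoder = SentenceTransformer("all-MiniLM-L6-v2")     # frozen, pre-trained
texts = ["a minority-class example", "a majority-class example"]
E = encoder.encode(texts, normalize_embeddings=True)  # computed once, before training
np.save("train_embeddings.npy", E)                    # reused for all neighborhood queries
```

The second response's statistical reporting is equally routine; the five-seed scores below are hypothetical stand-ins for the per-run minority-F1 values the revision would report.

```python
import numpy as np
from scipy import stats

hamr = np.array([0.61, 0.63, 0.60, 0.64, 0.62])  # minority F1 per seed (made up)
base = np.array([0.55, 0.58, 0.54, 0.57, 0.56])  # same seeds, baseline (made up)

t_stat, p = stats.ttest_rel(hamr, base)          # paired t-test across seeds
diffs = hamr - base
boot = np.random.default_rng(0).choice(diffs, size=(10_000, diffs.size)).mean(axis=1)
lo, hi = np.percentile(boot, [2.5, 97.5])        # bootstrap 95% CI on the gain
print(f"gain={diffs.mean():.3f}  p={p:.4f}  95% CI=[{lo:.3f}, {hi:.3f}]")
```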

Circularity Check

0 steps flagged

No circularity: empirical framework with held-out evaluation

full rationale

The paper introduces HAMR via bi-level optimization for instance weights and a neighborhood-aware resampling step, then reports performance gains on six held-out test sets across domains. No equations, predictions, or uniqueness claims reduce the reported results to quantities defined solely by the same fitted parameters or by self-citation chains. The derivation chain is self-contained as an algorithmic proposal whose validity is assessed externally via standard train/test splits rather than by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The framework rests on standard supervised-learning assumptions plus the domain assumption that semantic neighborhoods in embedding space reliably indicate label similarity for resampling.

axioms (2)
  • domain assumption: Bi-level optimization can stably estimate instance weights that reflect both class rarity and example difficulty.
    Invoked in the description of the meta-optimization loop.
  • domain assumption: Neighborhoods defined by embedding similarity contain useful additional training signal for minority classes.
    Basis for the neighborhood-aware resampling mechanism.

pith-pipeline@v0.9.0 · 5447 in / 1197 out tokens · 30908 ms · 2026-05-10T04:44:51.212585+00:00 · methodology

discussion (0)


Reference graph

Works this paper leans on

76 extracted references · 26 canonical work pages · 1 internal anchor
