pith. machine review for the scientific record.

arxiv: 2605.12340 · v1 · submitted 2026-05-12 · 📊 stat.ML · cs.LG

Recognition: no theorem link

Online Learning-to-Defer with Varying Experts

Axel Carlier, Dang Hoang Duy, Lai Xing Ng, Maxime Meyer, Wei Tsang Ooi, Yannis Montreuil

Pith reviewed 2026-05-13 03:58 UTC · model grok-4.3

classification 📊 stat.ML cs.LG
keywords online learning · learning-to-defer · bandit feedback · regret bounds · multiclass classification · varying experts · H-consistency bounds

The pith

An online algorithm for learning to defer routes queries to a changing pool of experts and achieves sublinear regret in multiclass classification with bandit feedback.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces the first online Learning-to-Defer method that handles streaming data, bandit feedback, and a dynamically varying set of experts in multiclass classification. It provides regret bounds that are sublinear in the time horizon T: O((n + n_e) T^{2/3}) in general and O((n + n_e) sqrt(T)) when a low-noise condition holds. A sympathetic reader would care because real deployments involve continuous data arrival and experts that come and go, unlike static batch settings. The work shows this extension is possible while maintaining theoretical performance guarantees through new consistency bounds.

Core claim

The paper establishes an online L2D algorithm for multiclass classification with bandit feedback that accommodates a dynamically varying pool of experts. This algorithm attains regret guarantees of O((n + n_e) T^{2/3}) in the general case and O((n + n_e) sqrt(T)) under a low-noise condition, where n is the number of classes, n_e the number of distinct experts seen, and T the horizon. The analysis relies on novel H-consistency bounds in the online setting paired with first-order online convex optimization methods. Experiments confirm the method works on both synthetic and real-world data.
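The routing mechanism in that claim can be sketched in a few lines. This is a hypothetical rendering, not the paper's algorithm: the names `scores` and `loss_of` stand in for the learned surrogate hypothesis and the environment. Each round the learner picks either the model or one currently available expert, and only the chosen action's loss is revealed.

```python
def deferral_round(scores, available_experts, loss_of):
    """One round of a hypothetical online learning-to-defer loop.

    scores: dict mapping 'model' or an expert id to the surrogate's score.
    available_experts: the experts present this round (the pool may vary).
    loss_of: callable returning the loss of the *chosen* action only,
             modelling bandit feedback.
    """
    actions = ["model"] + list(available_experts)
    # Defer to the highest-scoring action that is actually available.
    choice = max(actions, key=lambda a: scores.get(a, float("-inf")))
    observed_loss = loss_of(choice)  # no other action's loss is observed
    return choice, observed_loss
```

In the paper the scores come from a surrogate updated by first-order online convex optimization steps; here they are simply given.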

What carries the argument

An online L2D algorithm that couples novel H-consistency bounds with first-order methods for online convex optimization, routing each query either to the model or to one of the currently available experts.

If this is right

  • Standard batch Learning-to-Defer methods can be extended to handle streaming data and changing expert availability.
  • The regret scales linearly with the number of classes and distinct experts observed.
  • Improved sqrt(T) regret holds when the low-noise condition is satisfied.
  • Empirical performance on synthetic and real datasets supports the theoretical guarantees.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The approach may support applications like real-time decision systems where expert availability fluctuates over time.
  • Similar techniques could apply to other online decision problems involving deferral or routing.
  • Relaxing the bandit feedback assumption might lead to even tighter bounds in future work.

Load-bearing premise

The regret analysis depends on novel H-consistency bounds holding in the online framework and on the low-noise condition being satisfied; if either fails, the stated regret rates may not be achieved.

What would settle it

Observe the actual regret growth in a setting where the low-noise condition is satisfied; if regret grows faster than sqrt(T) while the other assumptions hold, the improved bound would be falsified.
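One way to run such a check is to estimate the empirical regret exponent alpha in regret(T) ≈ C · T^alpha from logged cumulative regret with a log-log least-squares fit; a fitted slope well above 1/2 would indicate the sqrt(T) regime is not operative. A sketch, not code from the paper:

```python
import math

def regret_exponent(cumulative_regret):
    """Least-squares slope of log(regret) against log(t).

    cumulative_regret[t-1] is the cumulative regret after round t;
    the slope estimates alpha in regret(T) ~ C * T**alpha.
    """
    xs = [math.log(t) for t in range(2, len(cumulative_regret) + 1)]
    ys = [math.log(r) for r in cumulative_regret[1:]]  # skip round 1 (regret may be 0)
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var
```

On a synthetic trace with regret exactly t^{1/2} the fit returns 0.5; on t^{2/3} it returns 2/3, so the two claimed regimes are distinguishable from logs.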

Figures

Figures reproduced from arXiv: 2605.12340 by Axel Carlier, Dang Hoang Duy, Lai Xing Ng, Maxime Meyer, Wei Tsang Ooi, Yannis Montreuil.

Figure 1. Expert Accuracies by Regions on Reuters4.
Figure 2. Results of experiment on Reuters4 for setting 3: drifting availability and drifting expertise. Whiskers denote standard deviations computed over 5 independent runs.
Figure 3. Results of the synthetic experiment for setting 1: fixed availability and expertise. Error bars represent standard deviations across five independent runs.
Figure 4. Results of the synthetic experiment for setting 2: fixed expert availability and expertise. Error bars represent standard deviations across five independent runs.
Figure 5. Evolution of the Expert Accuracies by Regions.
Figure 6. Results of the synthetic experiment for setting 3: drifting availability and drifting expertise. Whiskers denote standard deviations computed over 5 independent runs.
Figure 7. Results of experiment on Reuters4 for setting 1: fixed availability and expertise. Whiskers denote standard deviations computed over 5 independent runs.
Figure 8. Results of experiment on Reuters4 for setting 2: drifting availability and fixed expertise. Whiskers denote standard deviations computed over 5 independent runs.
Figure 9. Expert Accuracies by Regions on Reuters4.
Figure 10. Results of experiment on Reuters4 for setting 3: drifting availability and drifting expertise. Whiskers denote standard deviations computed over 5 independent runs.
Figure 11. Results of experiment on CIFAR10H with image corruption from CIFAR10C. Results are averaged over 5 independent runs.
read the original abstract

Learning-to-Defer (L2D) methods route each query either to a predictive model or to external experts. While existing work studies this problem in batch settings, real-world deployments require handling streaming data, changing expert availability, and shifting expert distribution. We introduce the first online L2D algorithm for multiclass classification with bandit feedback and a dynamically varying pool of experts. Our method achieves regret guarantees of $O((n+n_e)T^{2/3})$ in general and $O((n+n_e)\sqrt{T})$ under a low-noise condition, where $T$ is the time horizon, $n$ is the number of labels, and $n_e$ is the number of distinct experts observed across rounds. The analysis builds on novel $\mathcal{H}$-consistency bounds for the online framework, combined with first-order methods for online convex optimization. Experiments on synthetic and real-world datasets demonstrate that our approach effectively extends standard Learning-to-Defer to settings with varying expert availability and reliability.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript introduces the first online Learning-to-Defer (L2D) algorithm for multiclass classification with bandit feedback and a dynamically varying pool of experts. It claims regret bounds of O((n + n_e) T^{2/3}) in general and O((n + n_e) √T) under a low-noise condition, derived via novel H-consistency bounds combined with first-order online convex optimization methods. Experiments on synthetic and real-world datasets are presented to demonstrate effectiveness in settings with changing expert availability.

Significance. If the regret analysis holds, this constitutes a solid contribution by extending batch L2D to realistic online streaming regimes with varying experts, providing the first such guarantees in the bandit multiclass setting. The technical approach of adapting H-consistency bounds to the online framework is a clear strength, as is the improved √T rate under low noise, which parallels standard results in online learning. The empirical validation supports applicability, though the overall impact would benefit from explicit verification of the low-noise regime in experiments.

major comments (2)
  1. [§4 (Regret Analysis)] The novel H-consistency bounds are central to both the general T^{2/3} and the improved √T claims; the manuscript must explicitly derive or state the precise conditions (including any dependence on the varying expert pool and bandit feedback) under which these bounds apply, as the current high-level description leaves open whether they hold without additional assumptions.
  2. [Experiments section] The low-noise condition is required for the O((n + n_e) √T) rate, yet no verification, estimation, or ablation is reported on whether the synthetic or real-world datasets satisfy it; without this, the experiments do not provide evidence that the improved rate is attained or that the general bound is the relevant one.
minor comments (2)
  1. [Introduction] The quantity n_e (number of distinct experts observed across rounds) should be defined at first use with a brief remark on how it is determined in the online stream, to avoid ambiguity for readers.
  2. [Notation and preliminaries] Ensure the bandit feedback model (e.g., loss observation only for the chosen action) is restated consistently when transitioning from the L2D deferral decision to the regret definition.
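The bandit feedback model in that last comment is usually handled with an inverse-propensity (importance-weighted) estimator: dividing the one observed loss by the probability of having chosen that action yields an unbiased estimate of the full loss vector, which is what lets first-order OCO methods run on bandit feedback. A generic sketch of the construction, not code from the paper:

```python
def ips_loss_estimate(chosen, observed_loss, probs, actions):
    """Unbiased loss-vector estimate from bandit feedback.

    Only the chosen action's loss is observed; weighting it by
    1 / probs[chosen] makes E[estimate[a]] equal the true loss of a
    when actions are sampled according to probs.
    """
    return {
        a: observed_loss / probs[chosen] if a == chosen else 0.0
        for a in actions
    }
```

Averaging the estimate over the sampling distribution recovers the true loss of every action, which is the unbiasedness property the regret analysis needs.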

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the positive assessment and recommendation for minor revision. The comments on the regret analysis and experiments are constructive, and we address each point below with plans for clarification in the revised manuscript.

read point-by-point responses
  1. Referee: [§4 (Regret Analysis)] The novel H-consistency bounds are central to both the general T^{2/3} and the improved √T claims; the manuscript must explicitly derive or state the precise conditions (including any dependence on the varying expert pool and bandit feedback) under which these bounds apply, as the current high-level description leaves open whether they hold without additional assumptions.

    Authors: We agree that the conditions underlying the H-consistency bounds merit a more explicit statement. In the revised manuscript we will expand Section 4 with a dedicated paragraph that derives the bounds from first principles: the surrogate loss is assumed convex and H-consistent with the multiclass 0-1 loss (standard in the batch L2D literature), the online adaptation proceeds via first-order online convex optimization with unbiased loss estimates under bandit feedback, and the dependence on the expert pool appears only through the aggregate quantity n_e (total distinct experts observed). No further assumptions on the arrival process of experts or on the feedback mechanism are required beyond boundedness of the loss and the usual online-convex-optimization step-size conditions. This addition will remove any ambiguity while preserving the stated regret rates. revision: yes

  2. Referee: [Experiments section] The low-noise condition is required for the O((n + n_e) √T) rate, yet no verification, estimation, or ablation is reported on whether the synthetic or real-world datasets satisfy it; without this, the experiments do not provide evidence that the improved rate is attained or that the general bound is the relevant one.

    Authors: We acknowledge that the experiments section currently contains no explicit check of the low-noise condition. In the revision we will add a short subsection (or appendix paragraph) that reports a simple empirical proxy for noise level on each dataset—for example, the average disagreement rate between the model and the best expert, or a Tsybakov-style noise estimate on the synthetic data. This will allow readers to judge whether the improved √T regime is plausibly active in the reported runs or whether the general T^{2/3} bound is the operative guarantee. The main experimental claims (practical superiority over baselines under varying expert availability) remain valid independently of the noise condition. revision: yes
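The proxy proposed in that response can be computed directly from logged predictions: pick the expert with the best hindsight accuracy and report its disagreement rate with the model. A minimal sketch under those assumptions; the paper's formal low-noise condition is distributional, so this is only a diagnostic:

```python
def low_noise_proxy(labels, model_preds, expert_preds):
    """Disagreement rate between the model and the best expert in hindsight.

    labels: true labels per round.
    model_preds: the model's predictions per round.
    expert_preds: dict mapping expert id -> that expert's predictions.
    A small value suggests the near-realizable regime is plausible.
    """
    def accuracy(preds):
        return sum(p == y for p, y in zip(preds, labels)) / len(labels)

    best = max(expert_preds, key=lambda e: accuracy(expert_preds[e]))
    best_preds = expert_preds[best]
    disagreements = sum(m != p for m, p in zip(model_preds, best_preds))
    return disagreements / len(labels)
```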

Circularity Check

0 steps flagged

No significant circularity detected in derivation chain

full rationale

The provided abstract and description state that the regret bounds are obtained by building novel H-consistency bounds for the online setting and then applying standard first-order online convex optimization methods. No equations, self-definitions, fitted parameters renamed as predictions, or load-bearing self-citations are exhibited that would reduce the claimed O((n+n_e)T^{2/3}) or O((n+n_e)√T) guarantees to the inputs by construction. The analysis is presented as extending prior L2D work with new bounds plus off-the-shelf OCO tools; absent any quoted reduction (e.g., a bound that is tautological with its own fitting procedure), the derivation remains self-contained and independent of the target result.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

Review performed on abstract only; full derivation and any hidden parameters or assumptions cannot be audited.

axioms (2)
  • domain assumption Novel H-consistency bounds exist for the online multiclass deferral setting
    Invoked to obtain the regret guarantees; location implied in the analysis description.
  • domain assumption Low-noise condition holds for the improved O(sqrt(T)) bound
    Explicitly required for the faster rate; no verification details in abstract.

pith-pipeline@v0.9.0 · 5483 in / 1351 out tokens · 63448 ms · 2026-05-13T03:58:21.515926+00:00 · methodology


Reference graph

Works this paper leans on

221 extracted references · 221 canonical work pages · 11 internal anchors
