Evidential Transformation Network: Turning Pretrained Models into Evidential Models for Post-hoc Uncertainty Estimation
Pith reviewed 2026-05-10 16:50 UTC · model grok-4.3
The pith
A lightweight post-hoc module turns any pretrained model into an evidential model by learning a sample-dependent affine transform on its logits.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The Evidential Transformation Network converts a pretrained classifier into an evidential model by learning a sample-dependent affine transformation of the logits and interpreting the transformed outputs directly as the parameters of a Dirichlet distribution, thereby enabling reliable uncertainty estimation for both in-distribution and out-of-distribution inputs without access to internal model states or full retraining.
What carries the argument
The sample-dependent affine transformation applied to logits, which produces the concentration parameters of a Dirichlet distribution for uncertainty quantification.
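In concentration-parameter terms, the mechanism can be sketched as follows. This is a minimal illustration, not the paper's implementation: the `scale`/`shift` arguments stand in for the per-sample outputs of the lightweight module (its exact parameterization is not given here), and the softplus-plus-one link and the subjective-logic vacuity u = C/S are standard EDL conventions assumed for the sketch.

```python
import numpy as np

def softplus(x):
    # numerically stable softplus; keeps concentrations positive
    return np.logaddexp(0.0, x)

def etn_uncertainty(logits, scale, shift):
    """Hypothetical ETN-style step: a sample-dependent affine map on the
    logits whose outputs are read as Dirichlet concentration parameters.
    `scale` and `shift` are placeholders for the lightweight module's
    per-sample predictions."""
    alpha = softplus(scale * logits + shift) + 1.0  # alpha_k > 1
    strength = alpha.sum()                          # total evidence S
    probs = alpha / strength                        # Dirichlet mean
    vacuity = len(alpha) / strength                 # u = C / S (subjective logic)
    return probs, vacuity

# toy example: confident vs. ambiguous logits under an identity transform
p1, u1 = etn_uncertainty(np.array([6.0, 0.5, 0.2]), scale=1.0, shift=0.0)
p2, u2 = etn_uncertainty(np.array([0.4, 0.5, 0.2]), scale=1.0, shift=0.0)
assert u2 > u1  # flatter logits -> less total evidence -> higher vacuity
```

Flatter transformed logits yield lower total evidence S and hence higher vacuity, which is the quantity evidential models of this kind read off as predictive uncertainty.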
If this is right
- Any pretrained image or language classifier can receive evidential uncertainty estimates without modifying its weights or architecture.
- Accuracy on the original task is preserved because the base model remains frozen during ETN training.
- Computational cost stays low because only a lightweight module is added at inference time.
- The same procedure applies across vision and language benchmarks under both in-distribution and out-of-distribution conditions.
Where Pith is reading between the lines
- The success of a simple affine map on logits implies that much of the information needed for evidential uncertainty can be recovered from output logits without internal activations.
- Practitioners could retrofit existing deployed models with ETN to support safer rejection or deferral decisions in high-stakes settings.
- The approach might extend to other output distributions beyond Dirichlet if analogous lightweight transformations prove effective.
Load-bearing premise
A learned sample-dependent affine transformation of the logits alone is sufficient to yield Dirichlet parameters that accurately quantify uncertainty for both in-distribution and out-of-distribution cases.
What would settle it
A decisive negative test: apply ETN to a held-out pretrained model on a new out-of-distribution benchmark. If the resulting uncertainty scores correlate with actual errors no better than temperature scaling or other logit-only baselines, the core claim fails.
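Settling this empirically reduces to comparing error-detection AUROC across uncertainty scores. A minimal rank-based AUROC via the Mann-Whitney U identity (helper name illustrative; ties are left in arbitrary order, which is fine for a sketch):

```python
import numpy as np

def auroc(scores, labels):
    """Probability that a random error (label 1) receives a higher
    uncertainty score than a random correct prediction (label 0),
    via the rank-sum (Mann-Whitney U) identity."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    order = np.argsort(scores)
    ranks = np.empty(len(scores))
    ranks[order] = np.arange(1, len(scores) + 1)  # ranks start at 1
    n_pos = int(labels.sum())
    n_neg = len(labels) - n_pos
    u = ranks[labels == 1].sum() - n_pos * (n_pos + 1) / 2
    return u / (n_pos * n_neg)

# perfectly ranked uncertainties: errors always scored higher
print(auroc([0.1, 0.2, 0.8, 0.9], [0, 0, 1, 1]))  # -> 1.0
```

Computing this AUROC for an ETN-style vacuity score and for a temperature-scaled max-probability baseline on the same error indicators is exactly the comparison described above.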
Original abstract
Pretrained models have become standard in both vision and language, yet they typically do not provide reliable measures of confidence. Existing uncertainty estimation methods, such as deep ensembles and MC dropout, are often too computationally expensive to deploy in practice. Evidential Deep Learning (EDL) offers a more efficient alternative, but it requires models to be trained to output evidential quantities from the start, which is rarely true for pretrained networks. To enable EDL-style uncertainty estimation in pretrained models, we propose the Evidential Transformation Network (ETN), a lightweight post-hoc module that converts a pretrained predictor into an evidential model. ETN operates in logit space: it learns a sample-dependent affine transformation of the logits and interprets the transformed outputs as parameters of a Dirichlet distribution for uncertainty estimation. We evaluate ETN on image classification and large language model question-answering benchmarks under both in-distribution and out-of-distribution settings. ETN consistently improves uncertainty estimation over post-hoc baselines while preserving accuracy and adding only minimal computational overhead.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces the Evidential Transformation Network (ETN), a lightweight post-hoc module that applies a learned sample-dependent affine transformation to the logits of a pretrained model and interprets the transformed values as concentration parameters of a Dirichlet distribution. This enables evidential-style uncertainty estimation for both in-distribution and out-of-distribution inputs on image classification and LLM question-answering benchmarks without retraining the base model or accessing internal activations. The central claim is that ETN improves uncertainty metrics over post-hoc baselines while preserving accuracy and incurring only minimal overhead.
Significance. If the empirical gains hold under rigorous validation, ETN would provide a practical, low-cost route to reliable uncertainty quantification for deployed pretrained models in vision and language, filling a gap between expensive methods like ensembles and the limitations of standard post-hoc calibration. The post-hoc, logit-only design is a strength for compatibility with existing networks.
major comments (2)
- [Method (ETN definition and Dirichlet interpretation)] The load-bearing assumption that a per-sample affine transform in logit space alone can produce trustworthy Dirichlet parameters for OOD uncertainty (without internal states or retraining) is not adequately supported. When OOD and ID logit distributions overlap—a common regime—the transform has no additional signal to differentiate evidence, yet the paper treats the resulting Dirichlet as reliable for both regimes. An ablation or analysis demonstrating robustness in overlapping-logit cases is required.
- [Abstract and Evaluation] No quantitative results, specific metrics (e.g., AUROC, ECE), training details for the ETN parameters, loss functions, or error bars appear in the abstract or high-level description, making it impossible to verify whether the data support the claim of consistent improvement. The full evaluation section must supply these with statistical significance tests.
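The requested overlapping-logit ablation presupposes some way to quantify ID/OOD logit overlap. One simple proxy (an illustrative choice, not taken from the paper) is the histogram intersection of the max-logit distributions:

```python
import numpy as np

def max_logit_overlap(id_logits, ood_logits, bins=50):
    """Histogram intersection of max-logit distributions for two
    (n_samples, n_classes) logit arrays: 1.0 means the distributions
    coincide, 0.0 means they are fully separable."""
    a = id_logits.max(axis=1)
    b = ood_logits.max(axis=1)
    lo, hi = min(a.min(), b.min()), max(a.max(), b.max())
    ha, _ = np.histogram(a, bins=bins, range=(lo, hi))
    hb, _ = np.histogram(b, bins=bins, range=(lo, hi))
    ha = ha / ha.sum()  # normalize counts to probabilities
    hb = hb / hb.sum()
    return np.minimum(ha, hb).sum()

rng = np.random.default_rng(0)
x = rng.normal(size=(500, 10))
print(max_logit_overlap(x, x))          # identical -> 1.0
print(max_logit_overlap(x, x + 100.0))  # disjoint ranges -> 0.0
```

Values near 1 mark the regime the referee worries about, where a logit-only transform has little remaining signal to separate ID from OOD evidence.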
minor comments (2)
- [Method] Clarify the exact parameterization of the sample-dependent affine transform (e.g., whether scale and shift are class-specific or shared, and how they are optimized).
- [Experiments] Add a direct comparison table against recent logit-based post-hoc methods (e.g., temperature scaling variants or Dirichlet calibration) to strengthen the baseline claims.
Simulated Author's Rebuttal
We thank the referee for the thoughtful and constructive report. We address each major comment below and describe the revisions we will implement to improve the manuscript.
Point-by-point responses
-
Referee: [Method (ETN definition and Dirichlet interpretation)] The load-bearing assumption that a per-sample affine transform in logit space alone can produce trustworthy Dirichlet parameters for OOD uncertainty (without internal states or retraining) is not adequately supported. When OOD and ID logit distributions overlap—a common regime—the transform has no additional signal to differentiate evidence, yet the paper treats the resulting Dirichlet as reliable for both regimes. An ablation or analysis demonstrating robustness in overlapping-logit cases is required.
Authors: We appreciate the referee's emphasis on this core assumption. The ETN predicts sample-specific affine parameters via a lightweight network operating on the input (consistent with its post-hoc but input-aware design), which in principle allows it to modulate evidence assignment even when raw logit vectors exhibit overlap. Our empirical results on standard ID/OOD benchmarks demonstrate improved uncertainty metrics, indicating that sufficient differentiating signal is captured in practice. Nevertheless, we agree that dedicated analysis of the overlapping-logit regime is needed. In the revision we will add an ablation that (i) quantifies logit overlap between ID and OOD samples, (ii) visualizes the corresponding Dirichlet parameters and uncertainty estimates, and (iii) reports performance relative to baselines under high-overlap conditions. This analysis will be placed in the experimental section. revision: yes
-
Referee: [Abstract and Evaluation] No quantitative results, specific metrics (e.g., AUROC, ECE), training details for the ETN parameters, loss functions, or error bars appear in the abstract or high-level description, making it impossible to verify whether the data support the claim of consistent improvement. The full evaluation section must supply these with statistical significance tests.
Authors: We agree that greater quantitative transparency is warranted. While abstracts are conventionally concise, we will revise the abstract to explicitly state the key improvements (e.g., AUROC gains for OOD detection and ECE reductions). In the evaluation section we will add: (i) explicit training details for ETN (optimizer, learning rate, number of epochs, and the evidential loss formulation), (ii) error bars computed over multiple random seeds, and (iii) statistical significance tests (paired t-tests or Wilcoxon signed-rank tests) comparing ETN against each baseline. These additions will directly support the claim of consistent improvement. revision: yes
Circularity Check
No circularity: ETN is an independent post-hoc module trained and evaluated on external benchmarks
full rationale
The paper proposes ETN as a lightweight, separately trained module that applies a learned sample-dependent affine transform to the logits of a frozen pretrained model and treats the outputs as Dirichlet concentration parameters. This construction is defined explicitly as an add-on component with its own parameters optimized on held-out data; no equation reduces the claimed uncertainty estimates to the pretrained model's outputs by definition, and no central premise is justified solely by self-citation. All reported improvements are measured against standard external ID/OOD benchmarks rather than internal consistency checks, so the derivation chain remains self-contained.
Axiom & Free-Parameter Ledger
free parameters (1)
- parameters of the sample-dependent affine transformation
axioms (1)
- domain assumption: Transformed logits can be directly interpreted as parameters of a Dirichlet distribution for uncertainty estimation.
invented entities (1)
- Evidential Transformation Network (ETN): no independent evidence.
Forward citations
Cited by 1 Pith paper
- Rethinking Vacuity for OOD Detection in Evidential Deep Learning: Vacuity-based OOD detection in evidential deep learning is highly sensitive to class cardinality differences between ID and OOD, which can artificially inflate AUROC and AUPR without any change in model predictions.
Reference graph
Works this paper leans on
- [1] Yasaman Bahri, Ethan Dyer, Jared Kaplan, Jaehoon Lee, and Utkarsh Sharma. Explaining neural scaling laws. Proceedings of the National Academy of Sciences, 121(27), 2024.
- [2] Guy Bar-Shalom, Fabrizio Frasca, Derek Lim, Yoav Gelberg, Yftah Ziser, Ran El-Yaniv, Gal Chechik, and Haggai Maron. Beyond next token probabilities: Learnable, fast detection of hallucinations and data contamination on LLM output distributions, 2025.
- [3] Viktor Bengs, Eyke Hüllermeier, and Willem Waegeman. On second-order scoring rules for epistemic uncertainty quantification, 2023.
- [4] Bertrand Charpentier, Daniel Zügner, and Stephan Günnemann. Posterior network: Uncertainty estimation without OOD samples via density-based pseudo-counts. Advances in Neural Information Processing Systems, 33:1356–1367, 2020.
- [5] Mengyuan Chen, Junyu Gao, and Changsheng Xu. R-EDL: Relaxing nonessential settings of evidential deep learning. In The Twelfth International Conference on Learning Representations, 2024.
- [6] Wenhu Chen, Yilin Shen, Hongxia Jin, and William Wang. A variational Dirichlet framework for out-of-distribution detection. arXiv preprint arXiv:1811.07308, 2018.
- [7] Erik Daxberger, Agustinus Kristiadi, Alexander Immer, Runa Eschenhagen, Matthias Bauer, and Philipp Hennig. Laplace redux: effortless Bayesian deep learning. Advances in Neural Information Processing Systems, 34:20089–20103, 2021.
- [8] Danruo Deng, Guangyong Chen, Yang Yu, Furui Liu, and Pheng-Ann Heng. Uncertainty estimation by Fisher information-based evidential deep learning, 2023.
- [9] Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, and Li Fei-Fei. ImageNet: A large-scale hierarchical image database. In 2009 IEEE Conference on Computer Vision and Pattern Recognition, pages 248–255, 2009.
- [10] M.J. Evans and J.S. Rosenthal. Probability and Statistics: The Science of Uncertainty. W. H. Freeman, 2004.
- [11] Yarin Gal and Zoubin Ghahramani. Dropout as a Bayesian approximation: Representing model uncertainty in deep learning. In Proceedings of The 33rd International Conference on Machine Learning, pages 1050–1059, New York, USA, 2016. PMLR.
- [12] Jakob Gawlikowski, Cedrique Rovile Njieutcheu Tassi, Mohsin Ali, Jongseok Lee, Matthias Humt, Jianxiang Feng, Anna Kruspe, Rudolph Triebel, Peter Jung, Ribana Roscher, et al. A survey of uncertainty in deep neural networks. Artificial Intelligence Review, 56(Suppl 1):1513–1589, 2023.
- [13] Aaron Grattafiori, Abhimanyu Dubey, Abhinav Jauhri, Abhinav Pandey, Abhishek Kadian, Ahmad Al-Dahle, Aiesha Letman, Akhil Mathur, Alan Schelten, Alex Vaughan, et al. The Llama 3 herd of models. arXiv preprint arXiv:2407.21783.
- [14] Chuan Guo, Geoff Pleiss, Yu Sun, and Kilian Q Weinberger. On calibration of modern neural networks. In International Conference on Machine Learning, pages 1321–1330. PMLR.
- [15] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition, 2015.
- [16] Dan Hendrycks and Kevin Gimpel. A baseline for detecting misclassified and out-of-distribution examples in neural networks. arXiv preprint arXiv:1610.02136, 2016.
- [17] Dan Hendrycks, Steven Basart, Norman Mu, Saurav Kadavath, Frank Wang, Evan Dorundo, Rahul Desai, Tyler Zhu, Samyak Parajuli, Mike Guo, et al. The many faces of robustness: A critical analysis of out-of-distribution generalization. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 8340–8349, 2021.
- [18] Dan Hendrycks, Collin Burns, Steven Basart, Andy Zou, Mantas Mazeika, Dawn Song, and Jacob Steinhardt. Measuring massive multitask language understanding. Proceedings of the International Conference on Learning Representations (ICLR), 2021.
- [19] Dan Hendrycks, Kevin Zhao, Steven Basart, Jacob Steinhardt, and Dawn Song. Natural adversarial examples. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 15262–15271, 2021.
- [20] Joel Hestness, Sharan Narang, Newsha Ardalani, Gregory Diamos, Heewoo Jun, Hassan Kianinejad, Md. Mostofa Ali Patwary, Yang Yang, and Yanqi Zhou. Deep learning scaling is predictable, empirically, 2017.
- [21] Gaurush Hiranandani, Haolun Wu, Subhojyoti Mukherjee, and Sanmi Koyejo. Logits are all we need to adapt closed models, 2025.
- [22] Taejong Joo, Uijung Chung, and Min-Gwan Seo. Being Bayesian about categorical probability. In International Conference on Machine Learning, pages 4950–4961. PMLR, 2020.
- [23] Audun Jøsang. Subjective Logic: A Formalism for Reasoning Under Uncertainty. Springer Publishing Company, Incorporated, 1st edition, 2016.
- [24] Tom Joy, Francesco Pinto, Ser-Nam Lim, Philip HS Torr, and Puneet K Dokania. Sample-dependent adaptive temperature scaling for improved calibration. In Proceedings of the AAAI Conference on Artificial Intelligence, pages 14919–14926.
- [25] Mira Juergens, Nis Meinert, Viktor Bengs, Eyke Hüllermeier, and Willem Waegeman. Is epistemic uncertainty faithfully represented by evidential deep learning methods? In International Conference on Machine Learning, pages 22624–22642. PMLR, 2024.
- [26] Jared Kaplan, Sam McCandlish, Tom Henighan, Tom B. Brown, Benjamin Chess, Rewon Child, Scott Gray, Alec Radford, Jeffrey Wu, and Dario Amodei. Scaling laws for neural language models, 2020.
- [27] Alex Krizhevsky. Learning multiple layers of features from tiny images. University of Toronto, 2012.
- [28] Meelis Kull, Miquel Perello Nieto, Markus Kängsepp, Telmo Silva Filho, Hao Song, and Peter Flach. Beyond temperature scaling: Obtaining well-calibrated multi-class probabilities with Dirichlet calibration. Advances in Neural Information Processing Systems, 32, 2019.
- [29] Guokun Lai, Qizhe Xie, Hanxiao Liu, Yiming Yang, and Eduard Hovy. RACE: Large-scale ReAding Comprehension dataset from Examinations. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pages 785–794, Copenhagen, Denmark, 2017. Association for Computational Linguistics.
- [30] Balaji Lakshminarayanan, Alexander Pritzel, and Charles Blundell. Simple and scalable predictive uncertainty estimation using deep ensembles. In Advances in Neural Information Processing Systems. Curran Associates, Inc., 2017.
- [31] Kimin Lee, Kibok Lee, Honglak Lee, and Jinwoo Shin. A simple unified framework for detecting out-of-distribution samples and adversarial attacks. Advances in Neural Information Processing Systems, 31, 2018.
- [32] Yawei Li, David Rügamer, Bernd Bischl, and Mina Rezaei. Calibrating LLMs with information-theoretic evidential deep learning. arXiv preprint arXiv:2502.06351, 2025.
- [33] Shiyu Liang, Yixuan Li, and R Srikant. Enhancing the reliability of out-of-distribution image detection in neural networks. In International Conference on Learning Representations.
- [34] Jeremiah Liu, Zi Lin, Shreyas Padhy, Dustin Tran, Tania Bedrax Weiss, and Balaji Lakshminarayanan. Simple and principled uncertainty estimation with deterministic deep learning via distance awareness. Advances in Neural Information Processing Systems, 33:7498–7512, 2020.
- [35] Weiyang Liu, Yandong Wen, Zhiding Yu, and Meng Yang. Large-margin softmax loss for convolutional neural networks. arXiv preprint arXiv:1612.02295, 2016.
- [36] Andrey Malinin and Mark Gales. Predictive uncertainty estimation via prior networks. Advances in Neural Information Processing Systems, 31, 2018.
- [37] Andrey Malinin and Mark Gales. Reverse KL-divergence training of prior networks: Improved uncertainty and adversarial robustness. In Advances in Neural Information Processing Systems. Curran Associates, Inc., 2019.
- [38] Todor Mihaylov, Peter Clark, Tushar Khot, and Ashish Sabharwal. Can a suit of armor conduct electricity? A new dataset for open book question answering. In EMNLP, 2018.
- [39] Matthias Minderer, Josip Djolonga, Rob Romijnders, Frances Hubis, Xiaohua Zhai, Neil Houlsby, Dustin Tran, and Mario Lucic. Revisiting the calibration of modern neural networks. Advances in Neural Information Processing Systems, 34:15682–15694, 2021.
- [40] Yuval Netzer, Tao Wang, Adam Coates, Alessandro Bissacco, Bo Wu, and Andrew Y. Ng. Reading digits in natural images with unsupervised feature learning. In NIPS Workshop on Deep Learning and Unsupervised Feature Learning, 2011.
- [41] Alexandru Niculescu-Mizil and Rich Caruana. Predicting good probabilities with supervised learning. In Proceedings of the 22nd International Conference on Machine Learning, pages 625–632, 2005.
- [42] Deep Pandey and Qi Yu. Learn to accumulate evidence from all training samples: theory and practice. In Proceedings of the 40th International Conference on Machine Learning, pages 26963–26989, 2023.
- [43] John Platt et al. Probabilistic outputs for support vector machines and comparisons to regularized likelihood methods. Advances in Large Margin Classifiers, 10(3):61–74, 1999.
- [44] Murat Sensoy, Lance Kaplan, and Melih Kandemir. Evidential deep learning to quantify classification uncertainty, 2018.
- [45] Maohao Shen, Yuheng Bu, Prasanna Sattigeri, Soumya Ghosh, Subhro Das, and Gregory Wornell. Post-hoc uncertainty learning using a Dirichlet meta-model, 2022.
- [46] Maohao Shen, Subhro Das, Kristjan Greenewald, Prasanna Sattigeri, Gregory Wornell, and Soumya Ghosh. Thermometer: towards universal calibration for large language models. In Proceedings of the 41st International Conference on Machine Learning. JMLR.org, 2024.
- [47] Maohao Shen, Jongha Jon Ryu, Soumya Ghosh, Yuheng Bu, Prasanna Sattigeri, Subhro Das, and Gregory Wornell. Are uncertainty quantification capabilities of evidential deep learning a mirage? Advances in Neural Information Processing Systems, 37:107830–107864, 2024.
- [48] K Simonyan and A Zisserman. Very deep convolutional networks for large-scale image recognition. In 3rd International Conference on Learning Representations (ICLR 2015). Computational and Biological Learning Society, 2015.
- [49] Johan AK Suykens and Joos Vandewalle. Least squares support vector machine classifiers. Neural Processing Letters, 9(3):293–300, 1999.
- [50] Joost Van Amersfoort, Lewis Smith, Yee Whye Teh, and Yarin Gal. Uncertainty estimation using a single deep deterministic neural network. In International Conference on Machine Learning, pages 9690–9700. PMLR, 2020.
- [51] Haohan Wang, Songwei Ge, Zachary Lipton, and Eric P Xing. Learning robust global representations by penalizing local predictive power. In Advances in Neural Information Processing Systems, pages 10506–10518, 2019.
- [52] John Wishart. The generalised product moment distribution in samples from a normal multivariate population. Biometrika, 20(1/2):32–52, 1928.
- [53] Adam X Yang, Maxime Robeyns, Xi Wang, and Laurence Aitchison. Bayesian low-rank adaptation for large language models. In The Twelfth International Conference on Learning Representations.
- [54] Taeseong Yoon and Heeyoung Kim. Uncertainty estimation by density aware evidential deep learning. arXiv preprint arXiv:2409.08754, 2024.
Supplementary material excerpts
- Limitations. While ETN improves the uncertainty estimation performance of pretrained models without harming accuracy and with only minimal additional computational cost, it also has several limitations. First, the benefits of ETN are largely empirical rather than theoretical. Recent works have raised concerns about EDL from a theoretical standpoint, argu...
- Proofs and Derivations. In this section, we analyze the behavior of logits produced by models trained with cross-entropy and EDL losses. We first define the per-sample softmax cross-entropy loss as $L_{\mathrm{CE}}(z, y) = -\log \frac{e^{z_y}}{\sum_{j=1}^{C} e^{z_j}} = \log\big(1 + \sum_{j \neq y} e^{z_j - z_y}\big)$. Then we define the inter-class margin of a sample as $\gamma(z, y) = z_y - \max_{j \neq y} z_j$. Given these def...
- Modeling Transformation Parameterizations. In this section, we describe how the transformation parameter A is modeled when defined as a scalar, vector, or matrix. Specifically, we explain (1) how the variational distribution over A is constructed, and (2) how the prior term b is handled. For clarity, we denote the scalar case by a, the vector case by a, an...
- Training Details (Experimental Setting, 11.1). The hyperparameters used for training ETN are summarized in Table 3. For LLM experiments, we employ cosine learning-rate scheduling with warm-up steps. All experiments are performed using three different random seeds, and we report the mean along with 95% confidence intervals. For post-hoc uncertainty estima... ... for all experimental settings, and train the additional parameters using the reverse KL formulation of $L_{\mathrm{EDL}}$.
- Additional Experiments (OOD-Detection Baselines, 12.1). In this section, we compare ETN against ODIN [33] and the Mahalanobis distance method (MD) [31]. Although neither ODIN nor MD is strictly an uncertainty estimation method, we include them as they both work in a post-hoc manner. [Flattened results table, truncated: Method × {CIFAR10→CIFAR10-OOD, ImageNet→ImageNet-OOD, OBQA→MMLU, RACE→MMLU}; MD 45.4...]
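The cross-entropy identity and the inter-class margin quoted in the Proofs and Derivations excerpt admit a quick numerical sanity check; a self-contained sketch with illustrative logits (not from the paper):

```python
import numpy as np

def ce_loss(z, y):
    # -log softmax_y(z), computed stably via log-sum-exp
    return -(z[y] - np.logaddexp.reduce(z))

def ce_loss_margin_form(z, y):
    # equivalent form: log(1 + sum_{j != y} exp(z_j - z_y))
    others = np.delete(z, y)
    return np.log1p(np.exp(others - z[y]).sum())

def margin(z, y):
    # inter-class margin gamma(z, y) = z_y - max_{j != y} z_j
    return z[y] - np.delete(z, y).max()

z = np.array([2.0, -1.0, 0.5])
assert np.isclose(ce_loss(z, 0), ce_loss_margin_form(z, 0))
assert np.isclose(margin(z, 0), 1.5)  # 2.0 - 0.5
```

The two loss forms agree term by term because dividing numerator and denominator of the softmax by $e^{z_y}$ leaves exactly the $e^{z_j - z_y}$ terms inside the log.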