When Bits Break Recourse: Counterfactual-Faithful Quantization
Pith reviewed 2026-05-20 14:39 UTC · model grok-4.3
The pith
Counterfactual-Faithful Quantization keeps recourse valid and low-cost after bit reduction while matching full-precision accuracy.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Quantization perturbs model outputs enough to invalidate many recourse actions that worked on the full-precision model. Counterfactual-Faithful Quantization (CFQ) addresses this by jointly learning quantization parameters and bit allocations, under a fixed total bit budget, so that the quantized model still assigns the target outcome at the recourse points computed on the full-precision teacher. A margin-based sufficient condition supports the approach: recourse transfers whenever the quantization perturbation stays below the decision margin at the recourse point.
What carries the argument
Counterfactual-Faithful Quantization (CFQ), which enforces the target outcome at teacher recourse points during quantizer training and mixed-precision allocation.
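The kind of objective described above can be sketched for a toy linear classifier. This is a minimal illustration under stated assumptions, not the paper's loss: the function names, the hinge form of the recourse term, the penalty form of the bit budget, and the weights `lam` and `mu` are all hypothetical. A real training loop would also need a gradient surrogate for the rounding step, which has zero gradient almost everywhere.

```python
import math

def quantize(value, bits, scale):
    """Symmetric uniform quantizer: snap to the nearest level, then clip."""
    qmax = 2 ** (bits - 1) - 1
    level = max(-qmax, min(qmax, round(value / scale)))
    return level * scale

def logit(weights, bias, x):
    """Score of a linear classifier; a positive score is the target outcome."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias

def cfq_style_loss(weights, bias, bits, scales, data, recourse_points,
                   bit_budget, lam=1.0, mu=1.0, margin=0.1):
    """Hypothetical objective: task loss + recourse hinge + bit-budget penalty."""
    w_q = [quantize(w, b, s) for w, b, s in zip(weights, bits, scales)]
    # Task term: mean logistic loss, labels y in {-1, +1}.
    task = sum(math.log1p(math.exp(-y * logit(w_q, bias, x)))
               for x, y in data) / len(data)
    # Recourse term: hinge that is zero once every teacher recourse point
    # scores at least `margin` on the target side of the quantized model.
    recourse = sum(max(0.0, margin - logit(w_q, bias, xc))
                   for xc in recourse_points) / len(recourse_points)
    # Budget term: penalise total bits above the global budget.
    over_budget = max(0, sum(bits) - bit_budget)
    return task + lam * recourse + mu * over_budget

# Toy usage with made-up numbers (not from the paper).
weights, bias = [0.5, -0.25], 0.0
data = [([1.0, 0.0], 1), ([0.0, 1.0], -1)]
recourse_points = [[2.0, 0.0]]
loss = cfq_style_loss(weights, bias, bits=[4, 4], scales=[0.125, 0.125],
                      data=data, recourse_points=recourse_points, bit_budget=8)
print(round(loss, 4))
```

Per-weight `bits` stands in for mixed-precision allocation; lowering one entry frees budget for another at the cost of a coarser grid for that weight.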
If this is right
- Accuracy can stay comparable to full-precision models while validity drop and counterfactual recourse gap improve across bit budgets.
- Recourse stability improves on the tested tabular datasets (Adult, German Credit, COMPAS) when the global bit constraint is enforced during training.
- Mixed-precision allocation guided by counterfactual fidelity outperforms uniform accuracy-focused quantization on stability metrics.
Where Pith is reading between the lines
- Deployers of quantized models in lending or criminal justice may need to adopt counterfactual-aware training to keep explanations actionable after compression.
- The same bounded-perturbation logic could apply to other compression methods such as pruning if analogous margin conditions can be derived.
- Future low-bit training pipelines might treat recourse metrics as first-class objectives alongside accuracy.
Load-bearing premise
The margin analysis assumes quantization perturbations remain bounded so that recourse transfers from the full-precision teacher to the quantized student.
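Written out, the premise is a one-line sign-preservation argument. The notation is assumed here rather than taken from the paper: f is the full-precision score, f_q the quantized score, x' a teacher recourse point, ε a bound on the score perturbation, and a positive score means the target outcome.

```latex
\[
\bigl| f_q(x') - f(x') \bigr| \le \varepsilon
\quad\text{and}\quad
f(x') > \varepsilon
\;\Longrightarrow\;
f_q(x') \;\ge\; f(x') - \varepsilon \;>\; 0 .
\]
```

When the margin f(x') is at most ε, the implication says nothing either way, which is why the condition is sufficient but not necessary.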
What would settle it
Measure whether recourse actions valid on the full-precision model remain valid on the quantized version once the observed quantization error exceeds the margin bound used in the proof.
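A minimal sketch of such a measurement for a toy linear teacher follows. Everything here is an assumption made for illustration: the per-tensor symmetric quantizer, the weights, and the recourse point are invented, and the paper's models and quantizers may differ.

```python
def quantize_weights(weights, bits):
    """Symmetric uniform quantization with one per-tensor scale (assumed scheme).

    Requires bits >= 2 and at least one non-zero weight.
    """
    qmax = 2 ** (bits - 1) - 1
    scale = max(abs(w) for w in weights) / qmax
    return [max(-qmax, min(qmax, round(w / scale))) * scale for w in weights]

def score(weights, bias, x):
    """Linear score; a positive value is the target outcome."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias

def check_transfer(weights, bias, recourse_point, bits):
    """Compare the teacher margin with the observed score perturbation."""
    margin = score(weights, bias, recourse_point)      # teacher score at x'
    quantized = score(quantize_weights(weights, bits), bias, recourse_point)
    perturbation = abs(quantized - margin)
    return {
        "bits": bits,
        "margin": margin,
        "perturbation": perturbation,
        "condition_holds": perturbation < margin,      # sufficient condition
        "still_valid": quantized > 0,                  # observed validity
    }

# Made-up teacher and recourse point, swept over a few bit-widths.
weights, bias = [0.9, -0.4, 0.15], -0.2
recourse_point = [0.5, 0.3, 0.4]
for bits in (2, 3, 4, 8):
    print(check_transfer(weights, bias, recourse_point, bits))
```

Rows where `condition_holds` is false but `still_valid` is true are the cases the sufficient condition does not cover; rows where both are false are the failures the test above is looking for.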
Original abstract
Quantization can preserve predictive accuracy under low-bit deployment while silently breaking algorithmic recourse: an actionable change that flips a decision before quantization may fail after quantization, or become substantially more costly. We formalize counterfactual sensitivity under quantization through validity, cost, and direction stability, and introduce two metrics: Validity Drop (VD) and Counterfactual Recourse Gap (CRG) that reveal recourse failures invisible to accuracy. We propose Counterfactual-Faithful Quantization (CFQ), which trains quantizer parameters and mixed-precision bit allocation to preserve counterfactual behavior by enforcing the target outcome at teacher recourse points under a global bit budget. A margin-based analysis gives a sufficient condition for recourse transfer under bounded quantization perturbations. Experiments on Adult, German Credit, and COMPAS show that accuracy-matched baselines can significantly degrade recourse stability, while CFQ maintains accuracy and substantially improves VD and CRG across bit budgets.
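The abstract names VD and CRG without giving formulas, so the sketch below uses assumed definitions chosen only to show what such metrics could measure: VD as the teacher's validity rate minus the quantized model's validity rate on the same recourse points, and CRG as the mean relative increase in recourse cost after quantization. The function names and both formulas are hypothetical and may not match the paper.

```python
def validity_drop(teacher_valid, student_valid):
    """Assumed VD: teacher validity rate minus quantized-model validity rate."""
    n = len(teacher_valid)
    return sum(teacher_valid) / n - sum(student_valid) / n

def recourse_gap(teacher_costs, student_costs):
    """Assumed CRG: mean relative increase in recourse cost after quantization."""
    gaps = [(s - t) / t for t, s in zip(teacher_costs, student_costs)]
    return sum(gaps) / len(gaps)

# Made-up example: four recourse points, one of which stops working
# after quantization, and two individuals whose recourse cost is re-measured.
teacher_valid = [True, True, True, True]
student_valid = [True, True, False, True]
print(validity_drop(teacher_valid, student_valid))

teacher_costs = [1.0, 2.0]
student_costs = [1.5, 2.0]
print(recourse_gap(teacher_costs, student_costs))
```

Both quantities can be non-zero while test accuracy is unchanged, which is the sense in which the abstract calls these failures invisible to accuracy.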
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper formalizes how quantization can degrade algorithmic recourse (validity, cost, and direction stability of counterfactuals), introduces Validity Drop (VD) and Counterfactual Recourse Gap (CRG) metrics, proposes Counterfactual-Faithful Quantization (CFQ) that jointly optimizes quantizer parameters and mixed-precision bit allocation to enforce target outcomes at teacher recourse points under a global bit budget, supplies a margin-based sufficient condition for recourse transfer under bounded perturbations, and reports experiments on Adult, German Credit, and COMPAS showing that accuracy-matched baselines degrade recourse stability while CFQ preserves accuracy and improves VD/CRG across bit budgets.
Significance. If the empirical gains hold under the stated conditions, the work identifies a practically relevant failure mode for quantized models in recourse-sensitive applications and supplies a targeted mitigation that does not sacrifice predictive accuracy. The introduction of VD and CRG provides concrete, falsifiable ways to measure the phenomenon beyond accuracy, and the margin analysis offers a starting point for theoretical guarantees. The experiments across three standard datasets strengthen the case that the issue is not isolated.
major comments (1)
- [Margin-based analysis (abstract and §4)] The sufficient condition for validity/cost/direction stability requires quantization perturbations to remain strictly smaller than the decision margin at each teacher recourse point. For the 4-bit regime tested, worst-case ||q(x)-x|| can exceed typical margins on Adult/German Credit/COMPAS near boundaries or in low-bit regions; when this occurs the formal transfer guarantee does not apply, so the reported VD/CRG improvements rest on the empirical objective rather than the stated analysis. The manuscript should either verify the bound holds on the evaluated points or qualify the analysis as applying only above a minimum bit-width.
minor comments (3)
- [Results] Report dataset-specific numerical values for VD and CRG (with standard deviations over runs) rather than qualitative statements of 'substantial improvement'; include the exact bit-allocation schedules chosen by CFQ versus baselines.
- [Method] Notation: define the teacher recourse point generation procedure and the precise form of the CFQ loss (including how the global bit budget is enforced) before the margin analysis; the current abstract-level description leaves the optimization target ambiguous.
- [Experiments] Figure clarity: ensure recourse stability plots distinguish between validity drop and cost/direction components; add a table summarizing per-dataset accuracy, VD, and CRG at each bit-width for direct comparison.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback and for highlighting the distinction between the sufficient condition in our margin analysis and the empirical performance of CFQ. We address the major comment below and have prepared revisions to qualify the analysis appropriately while preserving the core contributions.
- Referee: Margin-based analysis (abstract and §4): the sufficient condition for validity/cost/direction stability requires quantization perturbations to remain strictly smaller than the decision margin at each teacher recourse point. For the 4-bit regime tested, worst-case ||q(x)-x|| can exceed typical margins on Adult/German Credit/COMPAS near boundaries or in low-bit regions; when this occurs the formal transfer guarantee does not apply, so the reported VD/CRG improvements rest on the empirical objective rather than the stated analysis. The manuscript should either verify the bound holds on the evaluated points or qualify the analysis as applying only above a minimum bit-width.
  Authors: We agree that the margin-based result is a sufficient (not necessary) condition and that, for 4-bit quantization, the worst-case perturbation norm can exceed the decision margin at some recourse points near boundaries. In such cases the formal transfer guarantee does not apply, and the reported gains in VD and CRG are attributable to the joint optimization objective that directly enforces the teacher recourse outcome under the global bit budget. We will revise the abstract and §4 to explicitly qualify the analysis as holding only when the quantization perturbation bound is strictly smaller than the margin at each evaluated recourse point. We will also add a short discussion (with illustrative margin-versus-error estimates on the three datasets) clarifying the bit-width regimes in which the sufficient condition is expected to be satisfied. This change does not alter the empirical claims or the practical utility of CFQ.
  Revision: yes
Circularity Check
No circularity: optimization targets external teacher recourse points and metrics are independently defined
The paper defines CFQ as training quantizer parameters and bit allocation to enforce the target outcome specifically at recourse points obtained from a separate full-precision teacher model, under a global bit budget. Validity Drop (VD) and Counterfactual Recourse Gap (CRG) are defined directly from the stability of validity, cost, and direction between teacher and student models. The margin-based analysis supplies only a sufficient condition under an explicit bounded-perturbation assumption and is not used to derive the training objective or the reported empirical gains. No step reduces a claimed result to a self-fit, self-citation chain, or renaming of the input; the central experimental comparison against accuracy-matched baselines therefore remains independent of the method's own outputs.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: quantization perturbations remain bounded.
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean, theorem washburn_uniqueness_aczel (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: A margin-based analysis gives a sufficient condition for recourse transfer under bounded quantization perturbations.
- IndisputableMonolith/Foundation/AlphaCoordinateFixation.lean, theorem J_uniquely_calibrated_via_higher_derivative (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: Proposition 5.1 (Margin robustness at the recourse point)
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.