Model Compression vs. Adversarial Robustness: An Empirical Study on Language Models for Code
Pith reviewed 2026-05-18 23:57 UTC · model grok-4.3
The pith
Compressing language models for code preserves task performance but sharply reduces resistance to adversarial attacks.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim is that model compression techniques such as pruning, quantization, and knowledge distillation applied to language models for code produce versions that maintain comparable performance to uncompressed models on standard tasks, yet exhibit significantly reduced robustness when exposed to classical adversarial attacks. This trade-off holds across the tested models, tasks, attacks, and metrics, indicating that size reduction comes at the expense of adversarial resilience in code-related applications.
What carries the argument
Empirical evaluation comparing uncompressed and compressed variants (via pruning, quantization, and knowledge distillation) of code language models under four adversarial attacks using six performance metrics on three software analytics tasks.
If this is right
- Deploying compressed code models in security-sensitive applications requires extra robustness safeguards beyond standard compression.
- Compression choices must be assessed jointly on efficiency and adversarial performance rather than efficiency alone.
- New compression methods should target preservation of robustness alongside size reduction.
- Task-specific robustness testing becomes necessary when moving from uncompressed to compressed code models.
Where Pith is reading between the lines
- The observed trade-off could apply to language models outside the code domain if similar compression and attack protocols are used.
- Post-compression fine-tuning or ensemble defenses might mitigate the robustness loss without sacrificing efficiency gains.
- Different attack strengths or adaptive attacks could reveal whether the robustness drop is attack-specific or fundamental.
Load-bearing premise
The four classical adversarial attacks and six metrics are representative enough to establish a general robustness trade-off across compression strategies and code tasks.
What would settle it
An experiment showing that at least one compression strategy preserves or improves robustness scores under the same four attacks and six metrics on the same models and tasks would falsify the reported trade-off.
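One way to make this falsification test concrete is to compare how often an attack flips predictions for the uncompressed and the compressed model. The sketch below assumes attack success rate (the fraction of originally correct predictions that the attack flips) as the robustness measure. The source does not name its six metrics, so the metric, the function names, and the toy predictions are illustrative assumptions, not the paper's protocol.

```python
def attack_success_rate(labels, clean_preds, adv_preds):
    """Fraction of originally correct predictions that the attack flips.

    Assumed metric: the source does not list its six metrics by name.
    """
    attacked = 0  # samples the model classified correctly before the attack
    flipped = 0   # of those, samples it misclassifies after the attack
    for y, clean, adv in zip(labels, clean_preds, adv_preds):
        if clean == y:
            attacked += 1
            if adv != y:
                flipped += 1
    return flipped / attacked if attacked else 0.0


def robustness_gap(labels, base, compressed):
    """ASR(compressed) minus ASR(uncompressed).

    `base` and `compressed` are (clean_preds, adv_preds) pairs.
    """
    return (attack_success_rate(labels, *compressed)
            - attack_success_rate(labels, *base))


# Toy predictions, invented for illustration only.
labels = [1, 0, 1, 1, 0, 1]
base = ([1, 0, 1, 1, 0, 0], [1, 0, 0, 1, 0, 0])        # (clean, adversarial)
compressed = ([1, 0, 1, 1, 0, 0], [0, 1, 0, 1, 0, 0])  # (clean, adversarial)
gap = robustness_gap(labels, base, compressed)
```

Under this assumed metric, a gap of zero or below for some compression strategy, under the same attacks, models, and tasks, would be the kind of result that counts against the reported trade-off.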
Original abstract
Transformer-based language models for code have shown remarkable performance in various software analytics tasks, but their adoption is hindered by high computational costs, slow inference speeds, and substantial environmental impact. Model compression techniques such as pruning, quantization, and knowledge distillation have gained traction in addressing these challenges. However, the impact of these strategies on the robustness of compressed language models for code in adversarial scenarios remains poorly understood. Understanding how these compressed models behave under adversarial attacks is essential for their safe and effective deployment in real-world applications. To bridge this knowledge gap, we conduct a comprehensive evaluation of how common compression strategies affect the adversarial robustness of compressed models. We assess the robustness of compressed versions of three widely used language models for code across three software analytics tasks, using six evaluation metrics and four commonly used classical adversarial attacks. Our findings indicate that compressed models generally maintain comparable performance to their uncompressed counterparts. However, when subjected to adversarial attacks, compressed models exhibit significantly reduced robustness. These results reveal a trade-off between model size reduction and adversarial robustness, underscoring the need for careful consideration when deploying compressed models in security-critical software applications. Our study highlights the need for further research into compression strategies that strike a balance between computational efficiency and adversarial robustness, which is essential for deploying reliable language models for code in real-world software applications.
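For readers unfamiliar with the techniques the abstract names, the sketch below illustrates two of them on a flat list of weights: unstructured magnitude pruning and symmetric 8-bit quantization. It is a generic textbook illustration, not the paper's pipeline. The function names, the 50% sparsity level, and the toy weights are assumptions, and knowledge distillation is left out because it requires a training loop.

```python
def magnitude_prune(weights, sparsity):
    """Zero the `sparsity` fraction of weights with the smallest magnitude."""
    n_drop = int(len(weights) * sparsity)
    # Indices sorted by |w|; the first n_drop of them are pruned.
    by_magnitude = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(by_magnitude[:n_drop])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]


def quantize_int8(weights):
    """Symmetric per-tensor quantization to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    return [round(w / scale) for w in weights], scale


def dequantize(q_weights, scale):
    """Map quantized integers back to approximate float weights."""
    return [q * scale for q in q_weights]


weights = [0.02, -1.27, 0.5, -0.03, 0.9, 0.0]  # toy values, not from the paper
pruned = magnitude_prune(weights, sparsity=0.5)
q_weights, scale = quantize_int8(weights)
restored = dequantize(q_weights, scale)
```

Both operations perturb the weights slightly: pruning removes small weights outright, and quantization rounds each weight to the nearest multiple of `scale`. The study asks whether such perturbations, harmless for clean accuracy, make models easier to attack.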
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript presents an empirical study evaluating the effects of model compression techniques (pruning, quantization, and knowledge distillation) on the adversarial robustness of transformer-based language models for code. Using three widely used code models across three software analytics tasks, six evaluation metrics, and four classical adversarial attacks, the authors report that compressed models maintain comparable task performance to their uncompressed counterparts but exhibit significantly reduced robustness under adversarial attacks, revealing a trade-off between compression and robustness that has implications for security-critical deployments.
Significance. If the central empirical findings hold after addressing validity concerns with the attacks, this work is significant for software engineering and AI security research. It fills a gap in understanding robustness implications of compression for code models and provides a broad multi-model, multi-task evaluation that can guide practitioners. The study correctly identifies the need for balanced compression strategies, though its impact depends on demonstrating that the observed robustness drop reflects genuine model vulnerabilities rather than artifacts of the attack methods.
major comments (2)
- [Abstract and Section on Adversarial Attacks] The abstract and methods description of the four classical adversarial attacks provide no indication of adaptations, post-attack filtering, or checks to ensure generated examples preserve code syntax and semantics (e.g., compiler acceptance or functional equivalence). This is load-bearing for the central claim of 'significantly reduced robustness' because standard gradient-based or substitution attacks from NLP frequently yield syntactically invalid or semantically altered code; without explicit validation steps, the robustness gap could be an artifact of attacking malformed inputs rather than a compression-induced vulnerability.
- [Evaluation and Results] The evaluation lacks detail on statistical testing (e.g., significance tests for the 'significantly reduced' robustness claim) or explicit baseline comparisons beyond uncompressed models. This weakens assessment of the trade-off finding across compression strategies and tasks, as noted in the low-confidence soundness assessment.
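The validity check requested in the first major comment could look roughly like the following filter, which keeps an adversarial candidate only if it parses and agrees with the original on a set of test inputs. This is a minimal sketch under several assumptions: the code under attack is Python (the source does not say which languages the tasks use), equivalence is approximated by a handful of test inputs rather than proven, and every function name here is hypothetical.

```python
import ast


def is_syntactically_valid(source):
    """True if the snippet parses as Python."""
    try:
        ast.parse(source)
        return True
    except SyntaxError:
        return False


def load_function(source, name):
    """Execute a snippet and return the function it defines.

    Uses exec, so it should only be run on trusted or sandboxed code.
    """
    namespace = {}
    exec(source, namespace)
    return namespace[name]


def behaves_the_same(original, candidate, name, test_inputs):
    """Approximate functional equivalence by comparing outputs on test inputs."""
    f = load_function(original, name)
    g = load_function(candidate, name)
    return all(f(*args) == g(*args) for args in test_inputs)


def filter_adversarial(original, candidates, name, test_inputs):
    """Keep candidates that parse and match the original on every test input."""
    return [
        cand for cand in candidates
        if is_syntactically_valid(cand)
        and behaves_the_same(original, cand, name, test_inputs)
    ]


original = "def add(a, b):\n    return a + b\n"
candidates = [
    "def add(x, y):\n    return x + y\n",  # renamed variables, same behavior
    "def add(a, b):\n    return a - b\n",  # semantics changed
    "def add(a, b)\n    return a + b\n",   # missing colon, does not parse
]
kept = filter_adversarial(original, candidates, "add", [(1, 2), (0, 0), (-3, 5)])
retention_rate = len(kept) / len(candidates)
```

Reporting a retention rate of this kind alongside the robustness numbers would let readers judge how much of the measured gap rests on valid adversarial examples.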
minor comments (2)
- [Evaluation Metrics] Clarify the exact implementation details of the six metrics and how they align with standard practices in code model evaluation.
- [Experimental Setup] Ensure all model variants, compression hyperparameters, and attack parameters are fully specified in a reproducibility section or appendix.
Simulated Author's Rebuttal
We thank the referee for their constructive and detailed feedback. We address each major comment below, providing clarifications and committing to revisions that strengthen the manuscript without altering its core empirical findings.
Point-by-point responses
-
Referee: [Abstract and Section on Adversarial Attacks] The abstract and methods description of the four classical adversarial attacks provide no indication of adaptations, post-attack filtering, or checks to ensure generated examples preserve code syntax and semantics (e.g., compiler acceptance or functional equivalence). This is load-bearing for the central claim of 'significantly reduced robustness' because standard gradient-based or substitution attacks from NLP frequently yield syntactically invalid or semantically altered code; without explicit validation steps, the robustness gap could be an artifact of attacking malformed inputs rather than a compression-induced vulnerability.
Authors: We agree that the current description lacks sufficient detail on validation procedures, which is important for substantiating the robustness claims. Our experiments did apply code-specific adaptations of the four attacks (e.g., syntax-preserving substitutions and variable renaming that respect AST structure) along with post-attack filtering to retain only examples that compile successfully and pass functional equivalence checks via provided test suites. However, these steps were not explicitly documented in the methods. We will add a dedicated subsection to the methods describing the adaptations, filtering criteria, compiler acceptance rates, and the fraction of generated examples retained after validation. This revision will make clear that the reported robustness reductions are not artifacts of malformed inputs. revision: yes
-
Referee: [Evaluation and Results] The evaluation lacks detail on statistical testing (e.g., significance tests for the 'significantly reduced' robustness claim) or explicit baseline comparisons beyond uncompressed models. This weakens assessment of the trade-off finding across compression strategies and tasks, as noted in the low-confidence soundness assessment.
Authors: We concur that adding statistical tests and clearer baseline framing will improve the rigor of the trade-off analysis. We will incorporate paired statistical tests (e.g., Wilcoxon signed-rank or t-tests with multiple random seeds) on the robustness metrics to support the 'significantly reduced' statements. We will also expand the presentation of baseline comparisons by more explicitly tabulating results against the uncompressed models and, space permitting, across compression intensities. These changes directly address the soundness concerns while preserving the multi-model, multi-task scope of the study. revision: yes
Circularity Check
No significant circularity in this empirical measurement study
Full rationale
The paper conducts a direct empirical evaluation by applying standard compression techniques (pruning, quantization, distillation) to three code language models, then measuring performance and robustness on three software analytics tasks using six metrics and four classical adversarial attacks. No equations, parameter fitting, derivations, or self-citation chains appear in the provided text. All claims rest on observed measurement differences rather than any reduction of outputs to inputs by construction. The study is therefore self-contained against external benchmarks, with the central robustness trade-off claim arising from experimental results rather than definitional or fitted equivalence.
Lean theorems connected to this paper
-
IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · unclear
Relation between the paper passage and the cited Recognition theorem: unclear.
We assess the robustness of compressed versions of three widely used language models for code across three software analytics tasks, using six evaluation metrics and four commonly used classical adversarial attacks.
-
IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
Relation between the paper passage and the cited Recognition theorem: unclear.
Our findings indicate that compressed models generally maintain comparable performance to their uncompressed counterparts. However, when subjected to adversarial attacks, compressed models exhibit significantly reduced robustness.
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.