Program-as-Weights: A Programming Paradigm for Fuzzy Functions
Pith reviewed 2026-07-03 16:15 UTC · model grok-4.3
The pith
Natural-language specs for fuzzy tasks compile into small neural adapters that let a 0.6B model match a 32B model's accuracy at 1/50th the memory cost.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
PAW reframes the foundation model from a per-input problem solver into a tool builder: invoked once per function definition, it produces a small reusable artifact whose subsequent calls per function application are cheap and offline. A 4B compiler trained on FuzzyBench emits parameter-efficient adapters for a frozen 0.6B Qwen3 interpreter; programs executed this way match the performance of direct prompting of Qwen3-32B while using roughly one fiftieth of the inference memory and running at 30 tokens per second on a MacBook M3.
What carries the argument
Program-as-Weights (PAW), the mechanism in which a compiler emits parameter-efficient adapters that are loaded into a fixed lightweight interpreter to execute the compiled fuzzy function.
If this is right
- Common fuzzy tasks can be defined once in natural language and then executed repeatedly with low memory and no API dependency.
- The large model is invoked only at function-definition time rather than at every input.
- Fuzzy functions become reproducible local artifacts rather than black-box API responses.
- A single 10M-example dataset suffices to train a compiler that produces usable adapters across multiple task types.
Where Pith is reading between the lines
- The same compiler-plus-interpreter split could be applied to other base models or adapter families if the training distribution is broadened.
- Multiple PAW artifacts could be composed at runtime to build higher-level fuzzy pipelines without retraining the compiler.
- Production systems that currently route every fuzzy decision through an API might replace those routes with one-time compilation plus local execution.
Load-bearing premise
A compiler trained on the FuzzyBench dataset can reliably produce adapters that generalize to arbitrary natural-language specifications of fuzzy functions.
What would settle it
Run the PAW compiler on a new fuzzy-function specification absent from FuzzyBench, execute the resulting adapter on the 0.6B interpreter, and check whether accuracy drops substantially below that of direct prompting on the 32B model for the same inputs.
Figures
read the original abstract
Many everyday programming tasks resist clean rule-based implementation, such as alerting on important log lines, repairing malformed JSON, or ranking search results by intent, and are increasingly outsourced to large language model APIs at the cost of locality, reproducibility, and price. We propose fuzzy-function programming: compiling such a function from a natural-language specification into a compact, locally-executable neural artifact. We instantiate this paradigm with Program-as-Weights (PAW), in which a 4B compiler trained on FuzzyBench, a 10M-example dataset we release, emits parameter-efficient adapters for a frozen, lightweight interpreter. A 0.6B Qwen3 interpreter executing PAW programs matches the performance of direct prompting of Qwen3-32B, while using roughly one fiftieth of the inference memory and running at 30 tokens/s on a MacBook M3. PAW reframes the foundation model from a per-input problem solver into a tool builder: invoked once per function definition, it produces a small reusable artifact whose subsequent calls per function application are cheap and offline.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes Program-as-Weights (PAW), a paradigm for 'fuzzy functions' (e.g., log alerting, JSON repair, intent ranking) that resist clean rule-based implementation. A 4B compiler trained on the released FuzzyBench dataset (10M examples) emits parameter-efficient adapters for a frozen 0.6B Qwen3 interpreter. The central claim is that this 0.6B setup matches the performance of direct prompting on Qwen3-32B while using ~1/50th the inference memory and running at 30 tokens/s on a MacBook M3, reframing foundation models as one-time tool builders that produce small reusable offline artifacts.
Significance. If the performance equivalence and generalization hold, the work could enable efficient local execution of tasks currently routed to large APIs, improving locality, reproducibility, and cost. The release of FuzzyBench is a concrete positive contribution. The paradigm shift from per-input solving to artifact generation is conceptually interesting for the ML systems community.
major comments (2)
- [Abstract] Abstract: the headline claim of performance equivalence between the 0.6B Qwen3+PAW interpreter and direct Qwen3-32B prompting is stated without any evaluation details, dataset construction rules, baseline comparisons, or metrics, so the central claim cannot be assessed from the provided text.
- [Evaluation (implied)] The manuscript supplies no experiments or ablations testing whether the 4B compiler produces effective adapters for natural-language fuzzy-function specifications that lie outside the FuzzyBench training distribution; this generalization is load-bearing for the claim that PAW yields reusable artifacts for arbitrary specifications.
Simulated Author's Rebuttal
We thank the referee for the constructive review and for highlighting the potential impact of the work and the value of the FuzzyBench release. We respond to each major comment below.
read point-by-point responses
-
Referee: [Abstract] Abstract: the headline claim of performance equivalence between the 0.6B Qwen3+PAW interpreter and direct Qwen3-32B prompting is stated without any evaluation details, dataset construction rules, baseline comparisons, or metrics, so the central claim cannot be assessed from the provided text.
Authors: We agree that the abstract, constrained by length, does not include evaluation specifics. The manuscript body details the 10M-example FuzzyBench dataset, the metrics used, and the direct comparison to Qwen3-32B prompting. We will revise the abstract to incorporate a concise reference to the key metrics and dataset scale so the central claim is more readily assessable. revision: yes
-
Referee: [Evaluation (implied)] The manuscript supplies no experiments or ablations testing whether the 4B compiler produces effective adapters for natural-language fuzzy-function specifications that lie outside the FuzzyBench training distribution; this generalization is load-bearing for the claim that PAW yields reusable artifacts for arbitrary specifications.
Authors: The reported results use the held-out test split of FuzzyBench. We acknowledge that explicit evaluation on specifications outside this distribution would strengthen the generalization claim. We will add targeted experiments and ablations on novel out-of-distribution natural-language specifications in the revised manuscript. revision: yes
Circularity Check
No circularity; empirical result on new dataset
full rationale
The paper introduces FuzzyBench (10M examples) and trains a 4B compiler to emit adapters for a frozen 0.6B interpreter. The headline performance equivalence (matching Qwen3-32B direct prompting) is presented as an empirical outcome of this training and evaluation procedure, not as a quantity derived by definition, by renaming a fitted parameter, or by a self-citation chain. No equations or steps reduce the claimed result to its own inputs; the derivation chain is self-contained against the external benchmark of direct prompting.
Axiom & Free-Parameter Ledger
free parameters (2)
- Compiler parameter count (4B)
- Interpreter parameter count (0.6B)
axioms (1)
- domain assumption Adapter modules can capture the behavior of fuzzy functions when emitted by a trained compiler
invented entities (2)
-
Fuzzy function
no independent evidence
-
PAW compiler
no independent evidence
Reference graph
Works this paper leans on
-
[1]
Langley , title =
P. Langley , title =. Proceedings of the 17th International Conference on Machine Learning (ICML 2000) , address =. 2000 , pages =
2000
-
[2]
T. M. Mitchell. The Need for Biases in Learning Generalizations. 1980
1980
-
[3]
M. J. Kearns , title =
-
[4]
Machine Learning: An Artificial Intelligence Approach, Vol. I. 1983
1983
-
[5]
R. O. Duda and P. E. Hart and D. G. Stork. Pattern Classification. 2000
2000
-
[6]
Suppressed for Anonymity , author=
-
[7]
Newell and P
A. Newell and P. S. Rosenbloom. Mechanisms of Skill Acquisition and the Law of Practice. Cognitive Skills and Their Acquisition. 1981
1981
-
[8]
A. L. Samuel. Some Studies in Machine Learning Using the Game of Checkers. IBM Journal of Research and Development. 1959
1959
-
[9]
FANT o M : A Benchmark for Stress-testing Machine Theory of Mind in Interactions
Kim, Hyunwoo and Sclar, Melanie and Zhou, Xuhui and Bras, Ronan and Kim, Gunhee and Choi, Yejin and Sap, Maarten. FANT o M : A Benchmark for Stress-testing Machine Theory of Mind in Interactions. Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing. 2023. doi:10.18653/v1/2023.emnlp-main.890
-
[10]
Prefix-Tuning: Optimizing Continuous Prompts for Generation
Li, Xiang Lisa and Liang, Percy. Prefix-Tuning: Optimizing Continuous Prompts for Generation. Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). 2021. doi:10.18653/v1/2021.acl-long.353
-
[11]
Proceedings of the 38th International Conference on Machine Learning , pages =
Learning Transferable Visual Models From Natural Language Supervision , author =. Proceedings of the 38th International Conference on Machine Learning , pages =. 2021 , editor =
2021
-
[12]
2021 , eprint=
Evaluating Large Language Models Trained on Code , author=. 2021 , eprint=
2021
-
[13]
Yujia Li and David Choi and Junyoung Chung and Nate Kushman and Julian Schrittwieser and Rémi Leblond and Tom Eccles and James Keeling and Felix Gimeno and Agustin Dal Lago and Thomas Hubert and Peter Choy and Cyprien de Masson d’Autume and Igor Babuschkin and Xinyun Chen and Po-Sen Huang and Johannes Welbl and Sven Gowal and Alexey Cherepanov and James M...
-
[14]
Locating and Editing Factual Associations in
Kevin Meng and David Bau and Alex J Andonian and Yonatan Belinkov , booktitle=. Locating and Editing Factual Associations in. 2022 , url=
2022
-
[15]
Edward J Hu and yelong shen and Phillip Wallis and Zeyuan Allen-Zhu and Yuanzhi Li and Shean Wang and Lu Wang and Weizhu Chen , booktitle=. Lo. 2022 , url=
2022
-
[16]
Text-to-Lo
Rujikorn Charakorn and Edoardo Cetin and Yujin Tang and Robert Tjarko Lange , booktitle=. Text-to-Lo. 2025 , url=
2025
-
[17]
The Thirteenth International Conference on Learning Representations , year=
Generative Adapter: Contextualizing Language Models in Parameters with A Single Forward Pass , author=. The Thirteenth International Conference on Learning Representations , year=
-
[18]
International Conference on Learning Representations , year=
Continual learning with hypernetworks , author=. International Conference on Learning Representations , year=
-
[19]
Parameter-efficient Multi-task Fine-tuning for Transformers via Shared Hypernetworks
Karimi Mahabadi, Rabeeh and Ruder, Sebastian and Dehghani, Mostafa and Henderson, James. Parameter-efficient Multi-task Fine-tuning for Transformers via Shared Hypernetworks. Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Pap...
-
[20]
2020 , eprint=
Language Models are Few-Shot Learners , author=. 2020 , eprint=
2020
-
[21]
2014 , eprint=
Neural Turing Machines , author=. 2014 , eprint=
2014
-
[22]
2016 , eprint=
Neural Programmer-Interpreters , author=. 2016 , eprint=
2016
-
[23]
2021 , eprint=
Thinking Like Transformers , author=. 2021 , eprint=
2021
-
[24]
Frédéric Gruau and Jean-Yves Ratajszczak and Gilles Wiber , abstract =. A neural compiler , journal =. 1995 , issn =. doi:https://doi.org/10.1016/0304-3975(94)00200-3 , url =
-
[25]
2025 , eprint=
Small Language Models are the Future of Agentic AI , author=. 2025 , eprint=
2025
-
[26]
AI Commun
Rubio Manzano, Clemente , title =. AI Commun. , month = oct, pages =. 2012 , issue_date =
2012
-
[27]
The ALCHEmist: Automated Labeling 500x CHEaper than LLM Data Annotators , url =
Huang, Tzu-Heng and Cao, Catherine and Bhargava, Vaishnavi and Sala, Frederic , booktitle =. The ALCHEmist: Automated Labeling 500x CHEaper than LLM Data Annotators , url =. doi:10.52202/079017-2003 , editor =
-
[28]
, title =
Deng, Yuntian and Kanervisto, Anssi and Ling, Jeffrey and Rush, Alexander M. , title =. Proceedings of the 34th International Conference on Machine Learning - Volume 70 , pages =. 2017 , publisher =
2017
-
[29]
and Hajishirzi, Hannaneh and Girshick, Ross and Farhadi, Ali and Kembhavi, Aniruddha , title =
Deitke, Matt and Clark, Christopher and Lee, Sangho and Tripathi, Rohun and Yang, Yue and Park, Jae Sung and Salehi, Mohammadreza and Muennighoff, Niklas and Lo, Kyle and Soldaini, Luca and Lu, Jiasen and Anderson, Taira and Bransom, Erin and Ehsani, Kiana and Ngo, Huong and Chen, YenSung and Patel, Ajay and Yatskar, Mark and Callison-Burch, Chris and Hea...
2025
-
[30]
2025 , eprint=
MCA-Bench: A Multimodal Benchmark for Evaluating CAPTCHA Robustness Against VLM-based Attacks , author=. 2025 , eprint=
2025
-
[31]
2025 , eprint=
Qwen3 Technical Report , author=. 2025 , eprint=
2025
-
[32]
2025 , eprint=
AndesVL Technical Report: An Efficient Mobile-side Multimodal Large Language Model , author=. 2025 , eprint=
2025
-
[33]
2025 , eprint=
Qwen2.5 Technical Report , author=. 2025 , eprint=
2025
-
[34]
2025 , eprint=
Teach2Eval: An Indirect Evaluation Method for LLM by Judging How It Teaches , author=. 2025 , eprint=
2025
-
[35]
The Eleventh International Conference on Learning Representations , year=
Large Language Models are Human-Level Prompt Engineers , author=. The Eleventh International Conference on Learning Representations , year=
-
[36]
Guo, Daya and Yang, Dejian and Zhang, Haowei and Song, Junxiao and Wang, Peiyi and Zhu, Qihao and Xu, Runxin and Zhang, Ruoyu and Ma, Shirong and Bi, Xiao and Zhang, Xiaokang and Yu, Xingkai and Wu, Yu and Wu, Z. F. and Gou, Zhibin and Shao, Zhihong and Li, Zhuoshu and Gao, Ziyi and Liu, Aixin and Xue, Bing and Wang, Bingxuan and Wu, Bochao and Feng, Bei ...
-
[37]
2024 , eprint=
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models , author=. 2024 , eprint=
2024
-
[38]
Williams, Ronald J. , title =. Mach. Learn. , month = may, pages =. 1992 , issue_date =. doi:10.1007/BF00992696 , abstract =
-
[39]
Policy Gradient Methods for Reinforcement Learning with Function Approximation , url =
Sutton, Richard S and McAllester, David and Singh, Satinder and Mansour, Yishay , booktitle =. Policy Gradient Methods for Reinforcement Learning with Function Approximation , url =
-
[40]
2025 , eprint=
SimpleRL-Zoo: Investigating and Taming Zero Reinforcement Learning for Open Base Models in the Wild , author=. 2025 , eprint=
2025
-
[41]
2021 , url=
Jieyu Zhang and Yue Yu and Yinghao Li and Yujing Wang and Yaming Yang and Mao Yang and Alexander Ratner , booktitle=. 2021 , url=
2021
-
[42]
Scaling Text-Rich Image Understanding via Code-Guided Synthetic Multimodal Data Generation , author=. arXiv preprint arXiv:2502.14846 , year=
-
[43]
Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Vision-Language Models
Molmo and PixMo: Open weights and open data for state-of-the-art vision-language models , author=. arXiv preprint arXiv:2409.17146 , year=
work page internal anchor Pith review Pith/arXiv arXiv
-
[44]
2025 , eprint=
Olmo 3 , author=. 2025 , eprint=
2025
-
[45]
Python 3.14.2 , howpublished =
-
[46]
Proceedings of the 34th International Conference on Machine Learning , pages =
Image-to-Markup Generation with Coarse-to-Fine Attention , author =. Proceedings of the 34th International Conference on Machine Learning , pages =. 2017 , editor =
2017
-
[47]
The Eleventh International Conference on Learning Representations , year=
Markup-to-Image Diffusion Models with Scheduled Sampling , author=. The Eleventh International Conference on Learning Representations , year=
-
[48]
2024 , journal =
HybridFlow: A Flexible and Efficient RLHF Framework , author =. 2024 , journal =
2024
-
[49]
2025 , eprint=
HyperSteer: Activation Steering at Scale with Hypernetworks , author=. 2025 , eprint=
2025
-
[50]
The Eleventh International Conference on Learning Representations (ICLR) , year =
Binding Language Models in Symbolic Languages , author =. The Eleventh International Conference on Learning Representations (ICLR) , year =
-
[51]
Learning to Generate Task-Specific Adapters from Task Description
Ye, Qinyuan and Ren, Xiang. Learning to Generate Task-Specific Adapters from Task Description. Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 2: Short Papers). 2021. doi:10.18653/v1/2021.acl-short.82
-
[52]
HINT : Hypernetwork Instruction Tuning for Efficient Zero- and Few-Shot Generalisation
Ivison, Hamish and Bhagia, Akshita and Wang, Yizhong and Hajishirzi, Hannaneh and Peters, Matthew. HINT : Hypernetwork Instruction Tuning for Efficient Zero- and Few-Shot Generalisation. Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 2023. doi:10.18653/v1/2023.acl-long.631
-
[53]
2023 , volume =
Phang, Jason and Mao, Yi and He, Pengcheng and Chen, Weizhu , booktitle =. 2023 , volume =
2023
-
[54]
Advances in Neural Information Processing Systems , year =
Learning to Compress Prompts with Gist Tokens , author =. Advances in Neural Information Processing Systems , year =
-
[55]
2024 , url =
Li, Yichuan and Ma, Xiyao and Lu, Sixing and Lee, Kyumin and Liu, Xiaohu and Guo, Chenlei , booktitle =. 2024 , url =
2024
-
[56]
2019 , publisher =
Language Models are Unsupervised Multitask Learners , author =. 2019 , publisher =
2019
-
[57]
gpt-oss-120b & gpt-oss-20b Model Card
OpenAI , year =. 2508.10925 , archivePrefix =
work page internal anchor Pith review Pith/arXiv arXiv
-
[58]
2023 , url =
Gerganov, Georgi and. 2023 , url =
2023
-
[59]
2025 , eprint=
Qwen3-VL Technical Report , author=. 2025 , eprint=
2025
-
[60]
Singh, Amanpreet and Natarajan, Vivek and Shah, Meet and Jiang, Yu and Chen, Xinlei and Batra, Dhruv and Parikh, Devi and Rohrbach, Marcus , booktitle =. Towards
-
[61]
and Le, Quoc V
Ha, David and Dai, Andrew M. and Le, Quoc V. , booktitle =. 2017 , url =
2017
-
[62]
Parameter-Efficient Transfer Learning for
Houlsby, Neil and Giurgiu, Andrei and Jastrzebski, Stanislaw and Morrone, Bruna and de Laroussilhe, Quentin and Gesmundo, Andrea and Attariyan, Mona and Gelly, Sylvain , booktitle =. Parameter-Efficient Transfer Learning for. 2019 , url =
2019
-
[63]
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP) , year =
The Power of Scale for Parameter-Efficient Prompt Tuning , author =. Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP) , year =
2021
-
[64]
2022 , url =
Liu, Xiao and Ji, Kaixuan and Fu, Yicheng and Tam, Weng Lam and Du, Zhengxiao and Yang, Zhilin and Tang, Jie , booktitle =. 2022 , url =
2022
-
[65]
Advances in Neural Information Processing Systems , year =
Few-Shot Parameter-Efficient Fine-Tuning is Better and Cheaper than In-Context Learning , author =. Advances in Neural Information Processing Systems , year =
-
[66]
Advances in Neural Information Processing Systems , year =
Compacter: Efficient Low-Rank Hypercomplex Adapter Layers , author =. Advances in Neural Information Processing Systems , year =
-
[67]
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics (EACL) , year =
Pfeiffer, Jonas and Kamath, Aishwarya and R. Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics (EACL) , year =
-
[68]
2024 , url =
Liu, Shih-Yang and Wang, Chien-Yi and Yin, Hongxu and Molchanov, Pavlo and Wang, Yu-Chiang Frank and Cheng, Kwang-Ting and Chen, Min-Hung , booktitle =. 2024 , url =
2024
-
[69]
2024 , url =
Huang, Chengsong and Liu, Qian and Lin, Bill Yuchen and Pang, Tianyu and Du, Chao and Lin, Min , booktitle =. 2024 , url =
2024
-
[70]
2023 , url =
Zhang, Qingru and Chen, Minshuo and Bukharin, Alexander and Karampatziakis, Nikos and He, Pengcheng and Cheng, Yu and Chen, Weizhu and Zhao, Tuo , booktitle =. 2023 , url =
2023
-
[71]
2023 , url =
Dettmers, Tim and Pagnoni, Artidoro and Holtzman, Ari and Zettlemoyer, Luke , booktitle =. 2023 , url =
2023
-
[72]
2023 , url =
Frantar, Elias and Ashkboos, Saleh and Hoefler, Torsten and Alistarh, Dan , booktitle =. 2023 , url =
2023
-
[73]
2024 , url =
Lin, Ji and Tang, Jiaming and Tang, Haotian and Yang, Shang and Chen, Wei-Ming and Wang, Wei-Chen and Xiao, Guangxuan and Dang, Xingyu and Gan, Chuang and Han, Song , booktitle =. 2024 , url =
2024
-
[74]
Charakorn, Rujikorn and Cetin, Edoardo and Uesaka, Shinnosuke and Lange, Robert Tjarko , year =. 2602.15902 , archivePrefix =
-
[75]
2026 , eprint =
Latent Context Compilation: Distilling Long Context into Compact Portable Memory , author =. 2026 , eprint =
2026
-
[76]
2026 , eprint =
Trojan, Bartosz and G. 2026 , eprint =
2026
-
[77]
SHINE: A Scalable In-Context Hypernetwork for Mapping Context to LoRA in a Single Pass
Liu, Yewei and Wang, Xiyuan and Mao, Yansheng and Gelberg, Yoav and Maron, Haggai and Zhang, Muhan , year =. 2602.06358 , archivePrefix =
work page internal anchor Pith review Pith/arXiv arXiv
-
[78]
The Tenth International Conference on Learning Representations (ICLR) , year =
Finetuned Language Models are Zero-Shot Learners , author =. The Tenth International Conference on Learning Representations (ICLR) , year =
-
[79]
The Tenth International Conference on Learning Representations (ICLR) , year =
Multitask Prompted Training Enables Zero-Shot Task Generalization , author =. The Tenth International Conference on Learning Representations (ICLR) , year =
-
[80]
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (ACL) , pages =
Cross-Task Generalization via Natural Language Crowdsourcing Instructions , author =. Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (ACL) , pages =. 2022 , url =
2022
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.