CodeCytos: AI-assisted spatial molecular imaging analysis via code-augmented agent action space
Pith reviewed 2026-06-28 19:09 UTC · model grok-4.3
The pith
CodeCytos lets AI agents write and run custom code to explore spatial features in molecular tissue images.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
CodeCytos is a coding-based reasoning agent framework that enables dynamic, programmable interaction with spatial molecular imaging data to improve automation and customization of cellular analysis. It supports exploration of custom spatial cellular features by letting the agent generate and execute code in response to minimal user prompts. With LLM backbones that have coding capabilities, it is reported to outperform baseline approaches on expert-curated datasets from frontal cortex, non-small-cell lung cancer, pancreas, and tonsil tissues.
What carries the argument
The code-augmented agent action space, which lets the agent generate executable code to query and compute custom spatial features on the imaging data rather than selecting from a fixed menu of operations.
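As a rough illustration of what a code-augmented action space means in practice, the sketch below runs a snippet of agent-written code against in-memory cell data and returns the printed output as the observation. It is not the paper's implementation: `stub_model`, `run_code_action`, and the cell schema are hypothetical names chosen for this example, and the "model" returns a fixed snippet rather than calling an LLM.

```python
import contextlib
import io


def run_code_action(code: str, namespace: dict) -> str:
    """Execute agent-written code and return what it printed (the observation)."""
    buffer = io.StringIO()
    try:
        with contextlib.redirect_stdout(buffer):
            exec(code, namespace)  # illustration only: no sandboxing is applied
    except Exception as exc:
        # Errors come back as text so an agent could attempt a repair.
        return f"{type(exc).__name__}: {exc}"
    return buffer.getvalue().strip()


def stub_model(question: str, history: list) -> str:
    """Hypothetical stand-in for an LLM call; always returns the same snippet."""
    return (
        "n_tumor = sum(1 for c in cells if c['type'] == 'tumor')\n"
        "print(n_tumor / len(cells))"
    )


# Toy data with an assumed schema: centroid coordinates plus a cell-type label.
cells = [
    {"x": 0.0, "y": 0.0, "type": "tumor"},
    {"x": 1.0, "y": 0.0, "type": "immune"},
    {"x": 5.0, "y": 5.0, "type": "tumor"},
    {"x": 6.0, "y": 5.0, "type": "stroma"},
]

namespace = {"cells": cells}
code = stub_model("What fraction of cells are tumor cells?", history=[])
observation = run_code_action(code, namespace)
print(observation)
```

The contrast with a fixed menu of operations is that nothing in `run_code_action` knows about tumor fractions; the analysis lives entirely in the generated code. A real deployment would need isolation around `exec`, which this sketch omits.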
If this is right
- Bioscientists can request custom spatial analyses using only simple natural-language questions without task-specific instructions.
- Performance on custom feature tasks improves substantially when the agent receives a small number of domain-agnostic coding examples.
- The same framework can be applied across different tissue types without retraining or expert-crafted demonstrations for each new study.
- Custom biomarker exploration becomes more scalable because the agent adapts to new questions by writing fresh code rather than depending on pre-implemented functions.
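To make "custom spatial feature" concrete, here is the kind of short function an agent might write when asked how often one cell type sits close to another. The feature, the function name `fraction_near`, the tuple layout, and the toy coordinates are all assumptions made for illustration; none of them come from the paper.

```python
import math


def fraction_near(cells, source, target, radius):
    """Fraction of `source` cells with at least one `target` cell within `radius`.

    `cells` is a list of (x, y, label) tuples (an assumed layout).
    """
    src = [c for c in cells if c[2] == source]
    tgt = [c for c in cells if c[2] == target]
    if not src or not tgt:
        return 0.0
    hits = sum(
        1
        for sx, sy, _ in src
        if any(math.hypot(sx - tx, sy - ty) <= radius for tx, ty, _ in tgt)
    )
    return hits / len(src)


# Toy centroids in arbitrary units.
cells = [
    (0, 0, "tumor"),
    (1, 0, "immune"),
    (10, 10, "immune"),
    (0, 2, "immune"),
    (20, 20, "tumor"),
]
result = fraction_near(cells, "immune", "tumor", radius=2.0)
print(result)
```

The brute-force pairwise search keeps the example dependency-free; on whole-slide data with many cells a spatial index would be the more realistic choice.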
Where Pith is reading between the lines
- The same code-generation approach could be tested on other imaging modalities such as multiplexed immunofluorescence or spatial transcriptomics data.
- Integration with interactive environments might allow iterative refinement where a user corrects an initial code output and the agent continues from there.
- If the generated code can be automatically verified against ground-truth statistics on public datasets, the framework could support higher-stakes biomarker pipelines.
Load-bearing premise
Large language models can reliably produce correct and useful analysis code for spatial cellular tasks from only minimal prompts and a few examples drawn from unrelated domains.
What would settle it
Run the agent on a held-out spatial analysis task with a minimal prompt, then compare the accuracy and correctness of its generated code output against expert-written reference code for the same task.
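One way such a comparison could be scored, assuming each task reduces to a single numeric answer, is to count how many agent outputs match the expert reference within a tolerance. The task names and values below are invented for the example, and `score_outputs` is a hypothetical helper rather than the paper's evaluation protocol.

```python
import math


def score_outputs(agent_outputs, reference_outputs, rel_tol=1e-6):
    """Return the fraction of reference tasks the agent answered correctly."""
    matches = 0
    for task, expected in reference_outputs.items():
        got = agent_outputs.get(task)  # a missing answer counts as a miss
        if got is not None and math.isclose(
            got, expected, rel_tol=rel_tol, abs_tol=1e-9
        ):
            matches += 1
    return matches / len(reference_outputs)


# Invented values: outputs of expert-written reference code per task.
reference = {
    "tumor_fraction": 0.5,
    "mean_nn_distance": 3.2,
    "immune_near_tumor": 0.25,
}
# Invented agent outputs: one correct, one wrong, one task unanswered.
agent = {"tumor_fraction": 0.5, "mean_nn_distance": 3.5}

accuracy = score_outputs(agent, reference)
print(accuracy)
```

Comparing outputs rather than source text allows for the fact that correct code can be written in many ways; it would not catch code that reaches the right number for the wrong reason.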
Original abstract
Conventional tissue image analysis software provides foundational capabilities for cellular analysis, including segmentation, basic morphological feature extraction, and spatial organization analysis. However, these tools often require manual intervention and are not well integrated with code-driven automation, limiting efficiency and scalability for complex spatial tissue studies. In addition, they offer limited flexibility for custom analyses, as they typically support only a fixed set of pre-implemented spatial cellular features. To address these limitations, we propose CodeCytos, a coding-based reasoning agent framework that enables dynamic, programmable interaction with spatial molecular imaging data to improve automation and customization. CodeCytos is designed to streamline the exploration of custom spatial cellular features and adapt to diverse research needs. We demonstrate its utility through case studies on four expert-curated datasets from distinct tissue types: frontal cortex, non-small-cell lung cancer, pancreas, and tonsil. We evaluate CodeCytos under a realistic minimal prompt setting, where bioscientists pose simple questions without task-specific instructions or contextual information about spatial cellular analysis, and benchmark multiple LLM backbones with strong coding capabilities. We further show that incorporating tailored, domain-agnostic few-shot in-context coding-reasoning examples (randomly sampled demonstrations outside the spatial analysis domain) can substantially improve performance without requiring costly, expert-crafted in-domain demonstrations. Overall, CodeCytos outperforms baseline approaches, highlighting the potential of code-action agents to assist with custom feature exploration in spatial molecular imaging and to accelerate biomarker discovery.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces CodeCytos, a coding-based reasoning agent framework that augments LLMs with a code-generation action space to enable dynamic, programmable analysis of spatial molecular imaging data. It targets limitations in conventional tools by supporting custom spatial cellular feature exploration without fixed pre-implemented feature sets. The approach is evaluated via case studies on four expert-curated datasets (frontal cortex, non-small-cell lung cancer, pancreas, tonsil) under a minimal-prompt setting using domain-agnostic few-shot in-context examples, with the central claim that CodeCytos outperforms baseline approaches and can accelerate biomarker discovery.
Significance. If the reported outperformance is substantiated with quantitative results, the framework offers a practical route to greater flexibility and automation in spatial tissue analysis, reducing reliance on manual intervention and enabling researchers to define custom analyses via natural language prompts. The use of domain-agnostic few-shot examples without task-specific or in-domain expert demonstrations is a pragmatic strength that could lower barriers for bioscientists.
minor comments (3)
- [Abstract] The claim that CodeCytos 'outperforms baseline approaches' is stated without naming the baselines, providing any performance metrics, or describing the evaluation protocol; the results section should explicitly define these to allow readers to assess the comparison.
- [Abstract] The four datasets are described only as 'expert-curated' from distinct tissue types; including accession numbers, imaging modalities, or cell-type annotations would improve reproducibility and context for the case studies.
- [Abstract] The phrase 'domain-agnostic few-shot in-context coding-reasoning examples (randomly sampled demonstrations outside the spatial analysis domain)' is introduced without an example or citation to the prompting strategy; a brief illustration or reference would clarify the method.
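For readers unfamiliar with the prompting strategy named in the last comment, the sketch below shows one plausible reading of it: randomly sample a few question-and-code demonstrations from outside the spatial analysis domain and prepend them to the user's question. The demonstration pool, the prompt layout, and `build_prompt` are assumptions for illustration; the abstract does not specify how the paper formats its examples.

```python
import random

# Hypothetical out-of-domain demonstrations as (question, code) pairs.
DEMO_POOL = [
    ("How many words are in the sentence?", "print(len(sentence.split()))"),
    ("What is the mean of the list?", "print(sum(values) / len(values))"),
    ("Which key has the largest value?", "print(max(d, key=d.get))"),
    ("Is the string a palindrome?", "print(s == s[::-1])"),
]


def build_prompt(question, k=2, seed=0):
    """Prepend k randomly sampled demonstrations to a minimal user question."""
    rng = random.Random(seed)  # seeded so the sampled demos are reproducible
    demos = rng.sample(DEMO_POOL, k)
    parts = [f"Question: {q}\nCode:\n{code}\n" for q, code in demos]
    # The final block is left open for the model to complete with code.
    parts.append(f"Question: {question}\nCode:\n")
    return "\n".join(parts)


prompt = build_prompt("How many tumor cells have an immune neighbor?", k=2)
print(prompt)
```

None of the demonstrations mention cells or tissue, which is the point of the setting: they show the question-to-code pattern without supplying in-domain expertise.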
Simulated Author's Rebuttal
We thank the referee for the positive assessment of CodeCytos, the recognition of its practical strengths in using domain-agnostic few-shot examples, and the recommendation for minor revision. No specific major comments were raised in the report.
Circularity Check
No significant circularity
The paper is an empirical demonstration of an applied agent framework evaluated via case studies on four datasets, with performance reported against baselines under a minimal-prompt setting. No derivations, equations, fitted parameters, or self-citations appear in the provided text that reduce any claimed result to a quantity defined by the authors' own prior choices or inputs. The central claims rest on observed outperformance in the described experiments rather than any self-referential construction.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Large language models with strong coding capabilities can generate accurate code for custom spatial cellular feature extraction when given minimal prompts and out-of-domain few-shot examples.
invented entities (1)
- CodeCytos: no independent evidence