AROMA: Augmented Reasoning Over a Multimodal Architecture for Virtual Cell Genetic Perturbation Modeling
Pith reviewed 2026-05-09 23:04 UTC · model grok-4.3
The pith
AROMA combines text, graphs, and sequences with staged training to predict genetic perturbation effects on cells more accurately and interpretably.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
AROMA integrates textual evidence, graph-topology information, and protein sequence features to model perturbation-target dependencies, and is trained with a two-stage optimization strategy to yield predictions that are both accurate and interpretable. The work also supplies two knowledge graphs and the PerturbReason dataset of more than 498k samples as reusable resources. The reported experiments show better performance than existing methods across multiple cell lines, robustness under zero-shot evaluation on an unseen cell line, and robustness in knowledge-sparse long-tail scenarios.
What carries the argument
The AROMA multimodal architecture that augments reasoning by fusing textual evidence, graph topology, and protein sequences, then applies two-stage optimization to align outputs with regulatory relationships.
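The text here does not say how the three modalities are fused, so the following is only a generic illustration of what "fusing" embeddings can mean, not AROMA's actual mechanism. It sketches concatenation-based late fusion in plain Python; the function names `l2_normalize` and `fuse_modalities` and the toy vectors are all hypothetical.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length; return a copy unchanged if it is all zeros."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def fuse_modalities(text_emb, graph_emb, seq_emb):
    """Concatenate per-modality embeddings after normalizing each one.

    Normalizing first keeps a modality with large raw magnitudes from
    dominating the fused vector. This is generic late fusion, shown for
    illustration only; it is not taken from the AROMA paper.
    """
    fused = []
    for emb in (text_emb, graph_emb, seq_emb):
        fused.extend(l2_normalize(emb))
    return fused


# Hypothetical toy embeddings for one perturbation-target pair.
fused = fuse_modalities([3.0, 4.0], [0.0, 2.0], [1.0, 0.0, 0.0])
```

A real system would more likely learn the fusion (for example with attention), but the concatenation form makes it easy to see which slice of the fused vector came from which evidence source.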
If this is right
- Virtual cell simulations become more usable for studying how genetic changes drive molecular outcomes.
- Predictions gain interpretability so researchers can trace outputs back to specific evidence sources.
- The supplied knowledge graphs and PerturbReason dataset become shared tools for other virtual cell work.
- Performance in zero-shot and long-tail settings indicates the method can handle realistic gaps in biological data.
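One way a reader could probe the long-tail claim in the last bullet is to report accuracy separately for genes with few knowledge-graph edges and genes with many. The sketch below assumes a hypothetical `degree` lookup (edges per gene), placeholder gene names and labels, and an arbitrary sparsity threshold; none of these come from the paper.

```python
from collections import defaultdict


def accuracy_by_knowledge_bucket(examples, degree, sparse_max=2):
    """Report accuracy separately for knowledge-sparse and knowledge-rich genes.

    examples: iterable of (gene, predicted_label, true_label) tuples.
    degree: dict mapping gene -> number of knowledge-graph edges (assumed).
    sparse_max: genes with at most this many edges count as "sparse"
                (an arbitrary illustrative threshold).
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for gene, predicted, actual in examples:
        bucket = "sparse" if degree.get(gene, 0) <= sparse_max else "rich"
        totals[bucket] += 1
        hits[bucket] += int(predicted == actual)
    return {bucket: hits[bucket] / totals[bucket] for bucket in totals}


# Hypothetical toy data: gene names and labels are placeholders.
degree = {"GENE_A": 40, "GENE_B": 1, "GENE_C": 0}
examples = [
    ("GENE_A", "up", "up"),
    ("GENE_A", "down", "down"),
    ("GENE_B", "up", "down"),
    ("GENE_C", "none", "none"),
]
report = accuracy_by_knowledge_bucket(examples, degree)
```

A large gap between the two buckets would suggest the model leans on retrieved knowledge rather than generalizing to sparsely annotated genes.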
Where Pith is reading between the lines
- Researchers could use the model to prioritize which gene edits to test first in actual lab work.
- The same fusion of text and graph signals might extend to predicting effects in other biological systems such as metabolic pathways.
- If the interpretability holds outside the authors' tests, it opens a route to hybrid human-AI validation loops where biologists review the model's reasoning steps directly.
Load-bearing premise
The knowledge graphs and PerturbReason dataset must supply signals that truly match real biological regulatory connections, and the two-stage training must improve genuine understanding instead of just matching the particular data splits used.
What would settle it
Run AROMA on an independent collection of genetic perturbation experiments in a fresh cell line never seen in training, then check whether its accuracy stays higher than baselines and whether its reasoning steps line up with established experimental biology.
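The split this test depends on can be sketched as a leave-one-cell-line-out partition. The record schema, the cell-line labels, and the function name below are hypothetical placeholders, not the paper's data format.

```python
def leave_one_cell_line_out(records, held_out):
    """Split records so the held-out cell line never appears in training.

    records: list of dicts with a "cell_line" key (hypothetical schema).
    held_out: name of the cell line reserved for zero-shot evaluation.
    """
    train = [r for r in records if r["cell_line"] != held_out]
    test = [r for r in records if r["cell_line"] == held_out]
    if not test:
        raise ValueError(f"no records found for cell line {held_out!r}")
    return train, test


# Toy records; cell-line and perturbation names are placeholders.
records = [
    {"cell_line": "LINE_1", "perturbation": "GENE_A"},
    {"cell_line": "LINE_2", "perturbation": "GENE_A"},
    {"cell_line": "LINE_3", "perturbation": "GENE_B"},
    {"cell_line": "LINE_4", "perturbation": "GENE_C"},
]
train, test = leave_one_cell_line_out(records, held_out="LINE_4")
```

For the check to be independent in the sense described above, the held-out records would also need to come from experiments that played no part in building the knowledge graphs.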
Original abstract
Virtual cell modeling predicts molecular state changes under genetic perturbations in silico, which is essential for biological mechanism studies. However, existing approaches suffer from unconstrained reasoning, uninterpretable predictions, and retrieval signals that are weakly aligned with regulatory topology. To address these limitations, we propose AROMA, an Augmented Reasoning Over a Multimodal Architecture for virtual cell genetic perturbation modeling. AROMA integrates textual evidence, graph-topology information, and protein sequence features to model perturbation-target dependencies, and is trained with a two-stage optimization strategy to yield predictions that are both accurate and interpretable. We also construct two knowledge graphs and a perturbation reasoning dataset, PerturbReason, containing more than 498k samples, as reusable resources for the virtual cell domain. Experiments show that AROMA outperforms existing methods across multiple cell lines, and remains robust under zero-shot evaluation on an unseen cell line, as well as in knowledge-sparse, long-tail scenarios. Overall, AROMA demonstrates that combining knowledge-driven multimodal modeling with evidence retrieval provides a promising pathway toward more reliable and interpretable virtual cell perturbation prediction. Model weights are available at https://huggingface.co/blazerye/AROMA. Code is available at https://github.com/blazerye/AROMA.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces AROMA, a multimodal architecture for virtual cell genetic perturbation modeling that fuses textual evidence, graph-topology signals, and protein sequence features. It employs a two-stage optimization procedure and contributes two new knowledge graphs plus the PerturbReason dataset (more than 498k samples). The central empirical claim is that AROMA outperforms prior methods on multiple cell lines, remains robust in zero-shot transfer to an unseen cell line, and handles knowledge-sparse long-tail cases while producing interpretable predictions.
Significance. If the performance and robustness claims are substantiated by independent quantitative evidence, the work would supply reusable resources (KGs and dataset) and demonstrate a concrete route toward more reliable, knowledge-aligned virtual-cell models. The public release of model weights and code is a clear strength that facilitates follow-up.
major comments (2)
- Abstract: the statements that AROMA 'outperforms existing methods across multiple cell lines' and 'remains robust under zero-shot evaluation on an unseen cell line' are presented without any numerical metrics, baseline names, error bars, or ablation results. Because these empirical claims are load-bearing for the paper's contribution, their absence prevents assessment of whether the two-stage strategy actually delivers the asserted gains.
- Dataset and knowledge-graph construction (implicit in §3 and Experiments): both the PerturbReason dataset and the two author-constructed KGs are built specifically for this study. The central claim of genuine alignment with regulatory topology therefore requires explicit checks (e.g., leakage analysis, topology-independent hold-outs, or external validation) that are not supplied in the provided text; without them the reported advantages risk being artifacts of the authors' own data-construction choices rather than improved perturbation modeling.
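A leakage analysis of the kind the second comment asks for can be as simple as intersecting the perturbation targets and knowledge-graph edges seen in each partition. The sketch below assumes a hypothetical partition schema (`"targets"` and `"edges"` keys) and placeholder gene names.

```python
def find_leakage(train_items, test_items):
    """Return items that occur in both partitions, sorted for stable output."""
    return sorted(set(train_items) & set(test_items))


def leakage_report(train, test):
    """Check two kinds of overlap between train and test partitions.

    train, test: dicts with "targets" (gene names) and "edges"
    ((source, target) tuples); this schema is assumed for illustration.
    """
    return {
        "shared_targets": find_leakage(train["targets"], test["targets"]),
        "shared_edges": find_leakage(train["edges"], test["edges"]),
    }


# Hypothetical toy partitions; gene names are placeholders.
train = {
    "targets": ["GENE_A", "GENE_B"],
    "edges": [("GENE_A", "GENE_X"), ("GENE_B", "GENE_Y")],
}
test = {
    "targets": ["GENE_B", "GENE_C"],
    "edges": [("GENE_C", "GENE_Z")],
}
report = leakage_report(train, test)
```

Here `GENE_B` appears in both partitions, so the report flags it as a shared target while the edge sets stay disjoint.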
minor comments (1)
- The abstract mentions availability of model weights and code but does not indicate whether the released repository contains the exact data-construction scripts and hyper-parameter settings used for the reported experiments.
Simulated Author's Rebuttal
We thank the referee for the detailed and constructive report. We address each major comment point by point below. Where the comments identify gaps in the current manuscript, we have prepared revisions to incorporate the requested evidence and clarifications.
- Referee: Abstract: the statements that AROMA 'outperforms existing methods across multiple cell lines' and 'remains robust under zero-shot evaluation on an unseen cell line' are presented without any numerical metrics, baseline names, error bars, or ablation results. Because these empirical claims are load-bearing for the paper's contribution, their absence prevents assessment of whether the two-stage strategy actually delivers the asserted gains.
  Authors: We agree that the abstract would be strengthened by quantitative support for these central claims. In the revised manuscript we have updated the abstract to include specific performance metrics (e.g., average AUC improvements and robustness scores across cell lines), the names of the primary baselines, references to error bars from the main experiments, and a brief note on the ablation results that isolate the contribution of the two-stage optimization. These additions are drawn directly from the results already reported in Sections 4 and 5 and fit within the abstract length constraints.
  Revision: yes
- Referee: Dataset and knowledge-graph construction (implicit in §3 and Experiments): both the PerturbReason dataset and the two author-constructed KGs are built specifically for this study. The central claim of genuine alignment with regulatory topology therefore requires explicit checks (e.g., leakage analysis, topology-independent hold-outs, or external validation) that are not supplied in the provided text; without them the reported advantages risk being artifacts of the authors' own data-construction choices rather than improved perturbation modeling.
  Authors: We acknowledge that the current text does not provide the explicit validation checks requested. In the revised manuscript we have added a dedicated subsection (now §3.4) that reports: (i) a leakage analysis confirming no shared perturbation targets or regulatory edges between training and test partitions, (ii) performance under topology-independent hold-out splits that remove entire regulatory subgraphs, and (iii) external validation of the constructed KGs against independent sources (STRING, Reactome, and curated perturbation databases). These checks support that the observed gains arise from the multimodal architecture and two-stage training rather than from data-construction artifacts.
  Revision: yes
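A hold-out that removes an entire regulatory subgraph, as item (ii) of the second response describes, could look roughly like the sketch below. It simplifies "subgraph" to the connected component around a seed gene, found with breadth-first search. The function names, the undirected-edge assumption, and the toy genes are hypothetical, and real regulatory graphs often form one giant component, so a practical version would need a different way to cut out a subgraph.

```python
from collections import deque


def connected_component(adjacency, start):
    """Collect every node reachable from start with breadth-first search."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen


def subgraph_holdout(edges, seed_gene):
    """Hold out the whole connected component containing seed_gene.

    edges: undirected (gene, gene) pairs. Returns (train_edges, test_edges)
    with no node shared between the two sets.
    """
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    held_out = connected_component(adjacency, seed_gene)
    # A component is closed under adjacency, so checking one endpoint suffices.
    test_edges = [e for e in edges if e[0] in held_out]
    train_edges = [e for e in edges if e[0] not in held_out]
    return train_edges, test_edges


# Hypothetical toy graph with two disconnected components.
edges = [("GENE_A", "GENE_B"), ("GENE_B", "GENE_C"), ("GENE_D", "GENE_E")]
train_edges, test_edges = subgraph_holdout(edges, seed_gene="GENE_A")
```

Because the held-out genes share no edges with the training genes, good test performance under this split cannot come from memorized neighborhood structure.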
Circularity Check
No significant circularity in derivation or evaluation chain
The paper constructs two knowledge graphs and the PerturbReason dataset (more than 498k samples) explicitly as reusable resources for the virtual cell domain, then trains AROMA via multimodal integration (textual evidence, graph topology, protein sequences) and two-stage optimization. Outperformance claims, zero-shot robustness on an unseen cell line, and long-tail handling are presented as experimental results on held-out splits of these resources. No equation or step reduces a prediction to the construction inputs by definition, no fitted parameter is relabeled as an independent prediction, and no load-bearing self-citation or uniqueness theorem is invoked. Standard practice for introducing a new benchmark and method does not constitute circularity when the architecture choices and optimization strategy retain independent content, and public code/weights enable external checks.
Axiom & Free-Parameter Ledger
free parameters (1)
- Two-stage optimization hyperparameters
axioms (1)
- Domain assumption: the textual evidence, graph-topology information, and protein sequence features are complementary and accurately aligned with biological regulatory topology.
invented entities (2)
- PerturbReason dataset: no independent evidence
- Two knowledge graphs: no independent evidence