GenoMAS: A Multi-Agent Framework for Scientific Discovery via Code-Driven Gene Expression Analysis
Pith reviewed 2026-05-22 00:33 UTC · model grok-4.3
The pith
A team of six LLM agents uses typed messaging and guided action units to automate gene expression analysis from raw transcriptomic files.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Six LLM-based agents coordinate through typed message-passing on a shared canvas. Programming agents convert high-level guidelines into Action Units and choose at each step to advance, revise, bypass, or backtrack, which preserves overall coherence while adapting to the particular demands of large semi-structured transcriptomic datasets. The reported result is higher benchmark performance than prior automation methods, together with gene-phenotype links that match literature reports after latent confounders are taken into account.
What carries the argument
The guided-planning framework that decomposes tasks into Action Units and supplies explicit decision points for agents to advance, revise, bypass, or backtrack.
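The control flow described above can be pictured with a minimal Python sketch. Everything in it is hypothetical: `Decision`, `ActionUnit`, `run_plan`, and the `choose` callback are illustrative names, not the paper's API, and the LLM agent that would make each decision is stubbed as a plain function.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class Decision(Enum):
    ADVANCE = "advance"      # accept the result and move to the next unit
    REVISE = "revise"        # discard the result and retry the same unit
    BYPASS = "bypass"        # skip the unit, keeping the previous state
    BACKTRACK = "backtrack"  # return to the previous unit and its prior state


@dataclass
class ActionUnit:
    name: str
    run: Callable[[dict], dict]  # takes the shared state, returns the updated state


def run_plan(units, choose, max_steps=50):
    """Execute units in order, letting `choose` pick one of four moves per step."""
    snapshots = [{}]  # snapshots[k] is the shared state before unit k runs
    trace = []
    i = 0
    for _ in range(max_steps):  # hard cap so REVISE/BACKTRACK cannot loop forever
        if i >= len(units):
            break
        unit = units[i]
        candidate = unit.run(dict(snapshots[i]))  # run on a copy of the state
        decision = choose(unit, candidate, trace)
        trace.append((unit.name, decision))
        if decision is Decision.ADVANCE:
            snapshots = snapshots[: i + 1] + [candidate]
            i += 1
        elif decision is Decision.BYPASS:
            snapshots = snapshots[: i + 1] + [dict(snapshots[i])]
            i += 1
        elif decision is Decision.BACKTRACK:
            i = max(i - 1, 0)
            snapshots = snapshots[: i + 1]
        # Decision.REVISE: leave i unchanged so the unit is attempted again
    return snapshots[-1], trace


# Toy plan: the second unit only succeeds on its second attempt.
attempts = {"n": 0}


def load(state):
    state["rows"] = 3
    return state


def normalize(state):
    attempts["n"] += 1
    state["normalized"] = attempts["n"] >= 2
    return state


def choose(unit, candidate, trace):
    # Stand-in for the agent's judgment: retry normalization until it succeeds.
    if unit.name == "normalize" and not candidate["normalized"]:
        return Decision.REVISE
    return Decision.ADVANCE


final_state, trace = run_plan(
    [ActionUnit("load", load), ActionUnit("normalize", normalize)], choose
)
```

Keeping a state snapshot per unit is one simple way to make backtracking well defined; whether GenoMAS stores state this way is not stated in the material reviewed here.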
If this is right
- The agents surface gene-phenotype associations that align with published findings while adjusting for latent confounders.
- The method processes multiple large semi-structured files without the breakdowns typical of fixed workflows.
- Benchmark gains of roughly ten points in preprocessing correlation and sixteen points in gene identification F1 follow directly from the collaborative structure.
- Logical coherence is maintained across steps even when individual agents act with some autonomy.
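The third bullet above reads the reported margins (10.61% and 16.85%) as absolute percentage points, but the abstract does not say whether they are absolute or relative. The short sketch below shows what prior-best score each reading would imply; the function name `implied_baseline` is an illustrative assumption, and only the four reported numbers come from the abstract.

```python
def implied_baseline(score, margin, relative=False):
    """Prior-best score (in percent) implied by a reported score and margin."""
    if relative:
        # Margin read as a relative improvement: score = baseline * (1 + margin/100)
        return score / (1 + margin / 100)
    # Margin read as absolute percentage points: score = baseline + margin
    return score - margin


# (reported score, reported margin) taken from the abstract
REPORTED = {"CSC": (89.13, 10.61), "F1": (60.48, 16.85)}

implied = {
    name: (
        round(implied_baseline(score, margin), 2),
        round(implied_baseline(score, margin, relative=True), 2),
    )
    for name, (score, margin) in REPORTED.items()
}
print(implied)
```

Under the absolute reading the implied prior bests are 78.52% (CSC) and 43.63% (F1); under the relative reading they are about 80.58% and 51.76%. The two readings differ enough that the paper's tables should be consulted before quoting the gains as "points".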
Where Pith is reading between the lines
- The same agent-collaboration pattern could be tested on other high-dimensional biological datasets such as proteomics or single-cell RNA profiles.
- Adding explicit checks for biological plausibility at backtrack points might further lower the chance of downstream errors.
- If the planning decisions generalize, the approach could shorten the time between raw data arrival and publishable biological insight in many labs.
Load-bearing premise
The LLM agents will generate correct analysis code and keep the multi-step process logically consistent without introducing errors that produce invalid biological conclusions.
What would settle it
Execute the code produced by the agents on a publicly available gene-expression dataset whose correct preprocessing steps and gene-phenotype associations have already been established by independent expert analysis, then check whether the outputs match those established results within expected tolerance.
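A check of this kind could be scripted roughly as follows. This is a sketch under stated assumptions: plain Pearson correlation stands in for the benchmark's Composite Similarity Correlation (whose definition is not given here), gene identification is scored with set-based F1, and the thresholds, function names, values, and gene labels are all invented placeholders.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(vx * vy)


def gene_set_f1(predicted, reference):
    """F1 between a predicted gene set and an expert reference set."""
    predicted, reference = set(predicted), set(reference)
    tp = len(predicted & reference)
    if tp == 0:
        return 0.0
    precision = tp / len(predicted)
    recall = tp / len(reference)
    return 2 * precision * recall / (precision + recall)


def matches_reference(agent_values, ref_values, agent_genes, ref_genes,
                      min_corr=0.95, min_f1=0.8):
    """True if both preprocessing and gene identification clear the tolerances."""
    return (pearson(agent_values, ref_values) >= min_corr
            and gene_set_f1(agent_genes, ref_genes) >= min_f1)


# Placeholder data, not taken from any real dataset.
agent_expr = [1.0, 2.0, 3.0, 4.0]
ref_expr = [1.1, 2.0, 2.9, 4.2]
agent_genes = {"GENE_A", "GENE_B", "GENE_C"}
ref_genes = {"GENE_A", "GENE_B", "GENE_D"}

corr = pearson(agent_expr, ref_expr)
f1 = gene_set_f1(agent_genes, ref_genes)
ok = matches_reference(agent_expr, ref_expr, agent_genes, ref_genes)
```

In this toy case the preprocessing correlation is high but the gene-set F1 is 2/3, so the combined check fails at the default thresholds. That illustrates why the two stages should be judged separately.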
Original abstract
Gene expression analysis holds the key to many biomedical discoveries, yet extracting insights from raw transcriptomic data remains formidable due to the complexity of multiple large, semi-structured files and the need for extensive domain expertise. Current automation approaches are often limited by either inflexible workflows that break down in edge cases or by fully autonomous agents that lack the necessary precision for rigorous scientific inquiry. GenoMAS charts a different course by presenting a team of LLM-based scientists that integrates the reliability of structured workflows with the adaptability of autonomous agents. GenoMAS orchestrates six specialized LLM agents through typed message-passing protocols, each contributing complementary strengths to a shared analytic canvas. At the heart of GenoMAS lies a guided-planning framework: programming agents unfold high-level task guidelines into Action Units and, at each juncture, elect to advance, revise, bypass, or backtrack, thereby maintaining logical coherence while bending gracefully to the idiosyncrasies of genomic data. On the GenoTEX benchmark, GenoMAS reaches a Composite Similarity Correlation of 89.13% for data preprocessing and an F$_1$ of 60.48% for gene identification, surpassing the best prior art by 10.61% and 16.85% respectively. Beyond metrics, GenoMAS surfaces biologically plausible gene-phenotype associations corroborated by the literature, all while adjusting for latent confounders. Code is available at https://github.com/Liu-Hy/GenoMAS.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces GenoMAS, a multi-agent framework with six specialized LLM agents that collaborate via typed message-passing protocols to perform gene expression analysis on raw transcriptomic data. Central to the system is a guided-planning framework in which programming agents decompose high-level guidelines into Action Units and dynamically choose to advance, revise, bypass, or backtrack. On the GenoTEX benchmark the framework reports 89.13% Composite Similarity Correlation for data preprocessing and 60.48% F1 for gene identification, exceeding the best prior art by 10.61% and 16.85% respectively, while surfacing biologically plausible gene-phenotype associations after latent-confounder adjustment. Code is released at the cited GitHub repository.
Significance. If the performance claims hold under rigorous controls, the work offers a practical middle path between rigid pipelines and fully autonomous agents for scientific code generation in genomics. The explicit code release constitutes a clear strength for reproducibility and community scrutiny.
major comments (2)
- [§4] §4 (Experimental evaluation): the reported 89.13% CSC and 60.48% F1 scores are presented without any description of baseline re-implementations, hyper-parameter settings for the compared methods, statistical significance tests, or error bars; these omissions render the claimed margins (10.61% and 16.85%) impossible to assess for robustness.
- [§3.2] §3.2 (Guided-planning framework): the Action Unit mechanism and the four-way decision rule (advance/revise/bypass/backtrack) are described only at a high level; without pseudocode, formal invariants, or concrete traces showing how hallucinations are detected and corrected, it is unclear whether the framework actually guarantees logical coherence across multi-step genomic analyses.
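The error bars and significance tests requested in the first major comment could be produced along the following lines. This is a standard-library sketch, not the authors' procedure: it uses an exact paired sign-flip permutation test as a stand-in for the paired t-test or Wilcoxon signed-rank test one would normally take from a statistics package, and the per-dataset scores are made-up placeholders, not results from the paper.

```python
import itertools
import statistics


def mean_and_std(scores):
    """Mean and sample standard deviation, e.g. for error bars over runs."""
    return statistics.mean(scores), statistics.stdev(scores)


def paired_sign_flip_pvalue(a, b):
    """Exact two-sided paired permutation test on per-item score differences.

    Enumerates all 2**n sign assignments, so it is only practical for small n;
    with more items one would sample sign flips at random instead.
    """
    if len(a) != len(b):
        raise ValueError("paired samples must have equal length")
    diffs = [x - y for x, y in zip(a, b)]
    observed = abs(sum(diffs))
    hits = 0
    total = 0
    for signs in itertools.product((1, -1), repeat=len(diffs)):
        total += 1
        stat = abs(sum(s * d for s, d in zip(signs, diffs)))
        if stat >= observed - 1e-12:  # small tolerance for float rounding
            hits += 1
    return hits / total


# Placeholder per-dataset scores for a system and a baseline (invented values).
system = [0.62, 0.58, 0.65, 0.60, 0.63, 0.59]
baseline = [0.45, 0.47, 0.44, 0.49, 0.43, 0.46]

sys_mean, sys_std = mean_and_std(system)
p_value = paired_sign_flip_pvalue(system, baseline)
```

With six paired items and every difference in the same direction, the smallest attainable two-sided p-value is 2/64, which is what this toy input yields. A margin reported without the number of paired items therefore says little about robustness.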
minor comments (2)
- [Abstract] The abstract states that associations are 'corroborated by the literature' yet provides no citation list or overlap statistics; a supplementary table mapping discovered genes to supporting PubMed IDs would strengthen the biological-plausibility claim.
- [§3.1] Notation for the typed message-passing protocol is introduced without an explicit schema or example message; adding a small table of message types and their fields would improve clarity.
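To make the second minor comment concrete, a message schema of the kind the referee asks for might look like the sketch below. The message types, field names, and agent role names are invented for illustration; the paper's actual protocol is not reproduced in this review.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, Dict


class MessageType(Enum):
    TASK_REQUEST = "task_request"
    CODE_SUBMISSION = "code_submission"
    REVIEW_FEEDBACK = "review_feedback"


# Payload fields each message type must carry (hypothetical schema).
REQUIRED_FIELDS = {
    MessageType.TASK_REQUEST: {"guideline"},
    MessageType.CODE_SUBMISSION: {"action_unit", "code"},
    MessageType.REVIEW_FEEDBACK: {"action_unit", "verdict"},
}


@dataclass(frozen=True)
class Message:
    sender: str
    receiver: str
    kind: MessageType
    payload: Dict[str, Any] = field(default_factory=dict)

    def __post_init__(self):
        # Reject messages whose type or payload does not fit the schema.
        if not isinstance(self.kind, MessageType):
            raise TypeError("kind must be a MessageType")
        missing = REQUIRED_FIELDS[self.kind] - set(self.payload)
        if missing:
            raise ValueError(
                f"{self.kind.value} message is missing fields: {sorted(missing)}"
            )


# A well-formed request is accepted.
request = Message(
    sender="lead_agent",
    receiver="programming_agent",
    kind=MessageType.TASK_REQUEST,
    payload={"guideline": "preprocess the expression matrix"},
)

# A code submission without its code is rejected at construction time.
try:
    Message("programming_agent", "reviewer_agent",
            MessageType.CODE_SUBMISSION, {"action_unit": "load_data"})
    rejected = False
except ValueError:
    rejected = True
```

Validating at construction time means a malformed message never reaches another agent, which is one plausible reading of what "typed" buys in such a protocol.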
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed feedback. We address each major comment below, acknowledging where additional information is warranted, and indicate the revisions planned for the next version of the manuscript.
- Referee: [§4] §4 (Experimental evaluation): the reported 89.13% CSC and 60.48% F1 scores are presented without any description of baseline re-implementations, hyper-parameter settings for the compared methods, statistical significance tests, or error bars; these omissions render the claimed margins (10.61% and 16.85%) impossible to assess for robustness.
  Authors: We agree that the current presentation of results in Section 4 lacks sufficient implementation details to allow independent assessment of robustness. In the revised manuscript we will add: (i) explicit descriptions of how each baseline was re-implemented (including any adaptations required for the GenoTEX benchmark), (ii) the hyper-parameter values and search ranges used for all compared methods, (iii) results of statistical significance tests (e.g., paired t-tests or Wilcoxon signed-rank tests with reported p-values), and (iv) error bars or standard deviations obtained from multiple independent runs. These additions will make the reported margins (10.61% and 16.85%) directly evaluable.
  Revision: yes
- Referee: [§3.2] §3.2 (Guided-planning framework): the Action Unit mechanism and the four-way decision rule (advance/revise/bypass/backtrack) are described only at a high level; without pseudocode, formal invariants, or concrete traces showing how hallucinations are detected and corrected, it is unclear whether the framework actually guarantees logical coherence across multi-step genomic analyses.
  Authors: Section 3.2 currently emphasizes the conceptual design to keep the exposition accessible. We acknowledge that this leaves the operational details underspecified. In the revision we will insert: (i) pseudocode for Action Unit decomposition and the four-way decision procedure, (ii) the key invariants the framework is designed to maintain (e.g., type consistency of messages and non-regression of data-preprocessing state), and (iii) two or three concrete execution traces drawn from our GenoTEX runs that illustrate how the revise or backtrack actions detect and mitigate hallucinations or logical inconsistencies. These additions will clarify how coherence is preserved without claiming formal guarantees beyond the empirical behavior.
  Revision: yes
Circularity Check
No significant circularity in derivation chain
The manuscript describes a multi-agent LLM framework for gene expression analysis and reports empirical performance on the external GenoTEX benchmark (89.13% CSC for preprocessing, 60.48% F1 for gene identification). No equations, derivations, or first-principles claims appear that could reduce to self-definition, fitted inputs renamed as predictions, or load-bearing self-citations. The guided-planning and typed message-passing components are presented as design choices whose validity is assessed via external benchmark comparison and released code, not by internal construction. This is the most common honest finding for applied systems papers whose central results are benchmark gains rather than closed-form derivations.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: LLM agents can be specialized via prompting and coordinated through typed messages to perform rigorous scientific data analysis without introducing critical errors
invented entities (1)
- Guided-planning framework with Action Units: no independent evidence
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · unclear
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: GenoMAS orchestrates six specialized LLM agents through typed message-passing protocols... guided-planning framework: programming agents unfold high-level task guidelines into Action Units and, at each juncture, elect to advance, revise, bypass, or backtrack
- IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: On the GenoTEX benchmark, GenoMAS reaches a Composite Similarity Correlation of 89.13% for data preprocessing and an F1 of 60.48% for gene identification
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Forward citations
Cited by 2 Pith papers
- The Moltbook Files: A Harmless Slopocalypse or Humanity's Last Experiment
  An AI-agent social platform generated mostly neutral content whose use in fine-tuning reduced model truthfulness comparably to human Reddit data, suggesting limited unique harm but flagging tail risks like secret leaks.
- Heterogeneous Scientific Foundation Model Collaboration
  Eywa enables language-based agentic AI systems to collaborate with specialized scientific foundation models for improved performance on structured data tasks.