arxiv: 2606.27413 · v1 · pith:WJJUTM37new · submitted 2026-06-25 · 🧬 q-bio.GN · cs.AI

GRAFT: Biological Graph and Hypergraph Benchmarks for Linked Gene Expression and Phenotypic Trait Prediction in Arabidopsis thaliana

Pith reviewed 2026-06-29 01:26 UTC · model grok-4.3

classification 🧬 q-bio.GN cs.AI
keywords GRAFT dataset · Arabidopsis thaliana · gene expression · phenotypic traits · genome-to-phenome · graph learning · hypergraph · phenotype prediction

The pith

The GRAFT dataset is presented as the first to link gene expression profiles with phenotypic trait measurements from the same Arabidopsis thaliana plants.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces the GRAFT dataset to address the genome-to-phenome challenge by connecting gene expression data with trait measurements from the same Arabidopsis thaliana specimens. According to the authors, existing datasets neither link these two data types nor cover a broad range of traits, which limits the gene-trait correlations that can be studied. GRAFT supports phenotype prediction and graph-based learning on gene-trait associations, and it is benchmarked with conventional regression baselines and a hypergraph baseline. The aim is to enable research into how genes control traits using structured, multimodal data from multiple sources.

Core claim

The GRAFT dataset provides multimodal gene information and heterogeneous trait or phenotype data for the same Arabidopsis thaliana specimens, serving as the first such resource to support tasks like phenotype prediction and interpretable graph learning on gene-trait associations.

What carries the argument

The GRAFT dataset, a curated multi-modal collection that pairs gene expression profiles with phenotypic trait measurements from the same individual plants.

If this is right

  • Researchers can now perform phenotype prediction using linked gene and trait data.
  • Graph and hypergraph learning methods can be applied to uncover gene-trait relationships with higher-order interactions.
  • The accompanying benchmarks, including a biologically informed hypergraph baseline, give reference points for validating gene-trait associations.
  • The dataset fosters broader research on genotype-phenotype mapping across multiple data sources.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Similar paired datasets could be created for other model organisms to expand G2P research.
  • Hypergraph approaches might identify complex multi-gene interactions affecting traits that pairwise graphs miss.
  • Integration with other omics data could further enhance prediction accuracy.
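
To make the contrast between pairwise graphs and hypergraphs concrete, here is a minimal Python sketch. The gene identifiers and the grouping of genes into hyperedges are made-up placeholders, not GRAFT content, and the paper's actual hyperedge definition is not described on this page. The sketch only shows what a hyperedge can record that a set of pairwise edges cannot.

```python
from itertools import combinations

# Hypothetical gene IDs and hyperedges (e.g. genes sharing an annotation).
# These are illustrative placeholders, not taken from GRAFT.
genes = ["G1", "G2", "G3", "G4"]
hyperedges = {
    "e1": {"G1", "G2", "G3"},  # one higher-order group of three genes
    "e2": {"G3", "G4"},        # an ordinary pairwise relation
}


def incidence_matrix(nodes, edges):
    """Return H with H[i][j] = 1 if node i belongs to hyperedge j."""
    edge_names = sorted(edges)
    return [[1 if n in edges[e] else 0 for e in edge_names] for n in nodes]


def clique_expansion(edges):
    """Flatten each hyperedge into all pairs of its members.

    The result is a plain graph; it no longer records which pairs
    came from the same higher-order group.
    """
    pairs = set()
    for members in edges.values():
        pairs.update(combinations(sorted(members), 2))
    return pairs


H = incidence_matrix(genes, hyperedges)
pairwise = clique_expansion(hyperedges)
```

The incidence matrix keeps e1 as a single three-gene unit, whereas the clique expansion turns it into three separate edges. That grouping is the information a pairwise graph would lose.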

Load-bearing premise

The curation process correctly pairs gene expression profiles with phenotypic measurements from the exact same individual plants without mismatches, batch effects, or integration errors.

What would settle it

Evidence that the paired data contain mismatches, with gene expression and trait measurements originating from different plants, or that batch effects are large enough to invalidate the reported associations.
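
A first-pass version of such a check can be sketched in a few lines of Python. The function name, the report keys, and the specimen identifiers below are all hypothetical; nothing here reflects how GRAFT stores its metadata. The sketch assumes only that each modality carries some specimen identifier.

```python
def audit_pairing(expression_ids, trait_ids):
    """Compare specimen IDs from two tables.

    Returns IDs missing on either side and IDs that occur more than once,
    any of which would break a one-to-one specimen-level pairing.
    """
    def duplicates(ids):
        seen, dup = set(), set()
        for i in ids:
            # First sighting goes to `seen`, any repeat goes to `dup`.
            (dup if i in seen else seen).add(i)
        return dup

    expr, trait = set(expression_ids), set(trait_ids)
    return {
        "paired": sorted(expr & trait),
        "expression_only": sorted(expr - trait),
        "trait_only": sorted(trait - expr),
        "duplicated": sorted(duplicates(expression_ids) | duplicates(trait_ids)),
    }


# Hypothetical specimen identifiers, for illustration only.
report = audit_pairing(
    expression_ids=["plant_01", "plant_02", "plant_03", "plant_03"],
    trait_ids=["plant_01", "plant_02", "plant_04"],
)
```

A check like this catches only identifier-level problems. It cannot tell whether an identifier was assigned to the wrong plant upstream, and it says nothing about batch effects, which need separate analysis.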

Figures

Figures reproduced from arXiv: 2606.27413 by Alexander Bucksch, Aranyak Goswami, Fiona L. Goggin, Jiamei Li, Khoa Luu, Manuel Serna-Aguilera, Suxing Liu, Vanshika Jindal.

Figure 1: The problem that non-heterogeneous phenomics data poses to downstream tasks.
Figure 2: The gene-level features are provided by GRAFT. Genes have identifiers and text descrip…
Figure 3: A visualization of translating the biological functions of the genes of the thale cress to…
Original abstract

Understanding which genes control which traits in an organism remains one of the central challenges in biology. Despite significant advances in data collection technology, our ability to map genes to traits is still limited. This genome-to-phenome (G2P) challenge spans several problem domains, including plant breeding, and requires methods capable of reasoning over high-dimensional, heterogeneous, and biologically structured data. Current datasets and data repositories, however, are not well-equipped for this task. Current studies do not link gene expression and trait data, and most focus on very specific traits, limiting the breadth of possible correlations. To address this gap, we present the novel Gene-Graph Regression for Arabidopsis Functional Traits (GRAFT) dataset, a curated multi-modal dataset linking gene expression profiles with phenotypic trait measurements in Arabidopsis thaliana, a model organism in plant biology. GRAFT supports tasks such as phenotype prediction and interpretable graph learning. In addition, we benchmark conventional regression and explanatory baselines, including a biologically-informed hypergraph baseline, to validate gene-trait associations. To the best of our knowledge, this is the first dataset to provide multimodal gene information and heterogeneous trait or phenotype data for the same Arabidopsis thaliana specimens. With GRAFT, we aim to foster research to accurately understand the relationship between genotypes and phenotypes using gene information, higher-order gene pairings, and trait data from multiple sources.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript introduces the GRAFT dataset, a curated multi-modal collection that links gene expression profiles with phenotypic trait measurements from the same Arabidopsis thaliana specimens. It supports phenotype prediction and interpretable graph learning tasks and reports benchmarks using conventional regression plus a biologically-informed hypergraph baseline, claiming to be the first such dataset providing multimodal gene information and heterogeneous traits for identical specimens.

Significance. If the specimen-level pairing can be verified as accurate, the dataset would address a genuine gap in genome-to-phenome resources by enabling joint analysis of high-dimensional gene data and heterogeneous traits; the decision to release baselines alongside the data is a constructive element that aids immediate usability.

major comments (2)
  1. [Abstract and Dataset Construction] Abstract and Dataset Construction section: the claim that the data come from 'the same Arabidopsis thaliana specimens' is load-bearing for the novelty assertion yet is unsupported by any description of source repositories, metadata alignment procedure, sample-overlap checks, batch-effect mitigation, or one-to-one correspondence validation; without these the reported gene-trait associations risk capturing spurious correlations.
  2. [Benchmarks] Benchmarks section: no quantitative validation of pairing fidelity is supplied, prediction results lack error bars or statistical tests, and the hypergraph baseline implementation and evaluation protocol are not detailed, rendering the utility demonstration difficult to assess or reproduce.
minor comments (1)
  1. [Methods] Clarify in the methods whether trait data are continuous or categorical and how missing values are handled across modalities.
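
To illustrate why the missing-value question matters for evaluation, here is a minimal Python sketch of one common option: scoring continuous traits only where a measurement exists. This is an assumption for illustration, not the authors' method. The function name and the use of None as the missing-value marker are hypothetical.

```python
def masked_mse(predictions, targets):
    """Mean squared error over observed targets only.

    `targets` uses None for a missing trait measurement; those
    entries contribute nothing to the error.
    """
    total, count = 0.0, 0
    for pred_row, target_row in zip(predictions, targets):
        for pred, target in zip(pred_row, target_row):
            if target is None:
                continue  # skip traits not measured for this specimen
            total += (pred - target) ** 2
            count += 1
    if count == 0:
        raise ValueError("no observed targets to score")
    return total / count


# Two specimens, three hypothetical continuous traits; None marks a gap.
preds = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
truth = [[1.0, None, 5.0], [None, 5.0, 6.0]]
loss = masked_mse(preds, truth)
```

Categorical traits would need a different loss, which is one reason the referee's request to state trait types explicitly affects how results should be read.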

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive comments, which highlight important areas for improving clarity and reproducibility. We address each major comment below and will incorporate the suggested revisions.

Point-by-point responses
  1. Referee: [Abstract and Dataset Construction] Abstract and Dataset Construction section: the claim that the data come from 'the same Arabidopsis thaliana specimens' is load-bearing for the novelty assertion yet is unsupported by any description of source repositories, metadata alignment procedure, sample-overlap checks, batch-effect mitigation, or one-to-one correspondence validation; without these the reported gene-trait associations risk capturing spurious correlations.

    Authors: We agree that additional documentation is required to substantiate the specimen-level pairing. In the revised manuscript we will expand the Dataset Construction section with explicit details on the source repositories (GEO accessions for expression data and the phenotypic trait databases), the metadata alignment procedure, sample-overlap verification steps, batch-effect correction methods applied, and any quantitative checks performed to confirm one-to-one correspondence. These additions will directly address the concern about potential spurious correlations.
    Revision: yes

  2. Referee: [Benchmarks] Benchmarks section: no quantitative validation of pairing fidelity is supplied, prediction results lack error bars or statistical tests, and the hypergraph baseline implementation and evaluation protocol are not detailed, rendering the utility demonstration difficult to assess or reproduce.

    Authors: We acknowledge these omissions limit reproducibility. The revised version will include quantitative pairing-fidelity metrics (e.g., overlap statistics or correlation checks), error bars together with appropriate statistical tests (t-tests or Wilcoxon tests with p-values) on all reported prediction results, and a detailed description of the hypergraph baseline (including feature construction, hyperedge definition, training protocol, and evaluation metrics). We will also release the corresponding code and configuration files.
    Revision: yes
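
As a rough illustration of what error bars and a paired significance test could look like, here is a minimal Python sketch using only the standard library. It uses an exact paired sign-flip permutation test as a stand-in for the t-tests or Wilcoxon tests named in the response. The per-fold scores and model names are invented for the example and are not results from the paper.

```python
from itertools import product
from statistics import mean, stdev


def mean_and_std(scores):
    """Mean and sample standard deviation, usable as an error bar."""
    return mean(scores), stdev(scores)


def paired_sign_flip_pvalue(scores_a, scores_b):
    """Exact two-sided paired sign-flip permutation test.

    Under the null hypothesis the sign of each per-fold difference is
    arbitrary, so every sign assignment is enumerated (2**n cases).
    """
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    observed = abs(sum(diffs))
    hits = total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        flipped = abs(sum(s * d for s, d in zip(signs, diffs)))
        if flipped >= observed - 1e-12:  # small tolerance for float noise
            hits += 1
    return hits / total


# Hypothetical per-fold scores for two models on the same five folds.
baseline = [0.50, 0.52, 0.48, 0.51, 0.49]
hypergraph = [0.55, 0.56, 0.53, 0.57, 0.54]

baseline_mean, baseline_std = mean_and_std(baseline)
p_value = paired_sign_flip_pvalue(hypergraph, baseline)
```

With n paired folds the smallest p-value this test can return is 2 / 2**n, so five folds can never go below 0.0625. That is a practical argument for reporting the number of folds or seeds alongside any p-value.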

Circularity Check

0 steps flagged

Dataset curation and benchmarking paper shows no circularity in derivation chain


The paper presents a new multimodal dataset (GRAFT) linking gene expression profiles with phenotypic traits for Arabidopsis thaliana specimens, plus benchmark results on regression and graph baselines. No equations, fitted parameters, or derivations are described whose outputs reduce by construction to the authors' own inputs or self-citations. The central claim is the existence and utility of the curated pairing itself; no 'prediction' or 'first-principles result' is asserted that could be tautological. Self-citations, if any, are not load-bearing for any mathematical claim. This is the expected non-finding for a data-release paper.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The paper contributes a curated dataset rather than new theoretical entities or derivations; it rests on standard practices of gene expression profiling and trait measurement in plant biology.

axioms (1)
  • [domain assumption] Arabidopsis thaliana serves as a representative model organism whose gene-trait relationships generalize to broader plant biology questions.
    Invoked implicitly when positioning the dataset as relevant to the genome-to-phenome challenge.

pith-pipeline@v0.9.1-grok · 5813 in / 1118 out tokens · 44749 ms · 2026-06-29T01:26:56.931975+00:00 · methodology

