Recognition: unknown
HopRank: Self-Supervised LLM Preference-Tuning on Graphs for Few-Shot Node Classification
Pith reviewed 2026-05-10 05:59 UTC · model grok-4.3
The pith
HopRank lets LLMs classify nodes on text graphs without labels by turning the task into link prediction on homophilic structure.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
HopRank reformulates node classification as a link prediction task and presents a fully self-supervised LLM-tuning framework for text-attributed graphs. It constructs preference data via hierarchical hop-based sampling and employs adaptive preference learning to prioritize informative signals without any class labels. At inference, nodes are classified by predicting their connection preferences to labeled anchors with an adaptive early-exit voting scheme.
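The summary does not specify the preference objective itself. As a rough illustration only, a DPO-style logistic loss (reference [19] below) over hop-sampled pairs might look like the sketch that follows; the per-pair `weight` standing in for the "adaptive" prioritization, and every name here, are our assumptions rather than the paper's.

```python
import torch
import torch.nn.functional as F

def hoprank_style_loss(logp_near, logp_far, weight, beta=0.1):
    """Sketch of a DPO-style preference loss over hop-sampled pairs,
    NOT the paper's exact objective.

    logp_near / logp_far: the LLM's log-probabilities of linking the
    query node to the nearer-hop vs. farther-hop candidate.
    weight: a per-pair importance term standing in for the paper's
    adaptive prioritization (our assumption).
    """
    margin = beta * (logp_near - logp_far)
    return (weight * -F.logsigmoid(margin)).mean()
```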
What carries the argument
Hierarchical hop-based sampling that generates preference pairs from multi-hop neighborhoods to encode class structure via homophily without labels.
If this is right
- The method achieves accuracy comparable to fully supervised GNNs on three TAG benchmarks.
- It substantially outperforms prior graph-LLM approaches that require labels.
- Classification runs with zero labeled training examples.
- Adaptive early-exit voting reduces the number of model calls during inference.
Where Pith is reading between the lines
- The same preference construction could be applied to other structure-rich tasks such as community detection or link prediction itself.
- Performance on graphs with weak or negative homophily would reveal the boundary of the approach.
- Combining the hop-sampling preference signal with other unsupervised LLM objectives might further reduce any remaining supervision needs.
Load-bearing premise
Edges in text-attributed graphs predominantly connect nodes of the same class according to the homophily principle.
What would settle it
A demonstration that HopRank falls short of supervised GNN accuracy on a text-attributed graph benchmark where connected nodes tend to have different classes.
Original abstract
Node classification on text-attributed graphs (TAGs) is a fundamental task with broad applications in citation analysis, social networks, and recommendation systems. Current GNN-based approaches suffer from shallow text encoding and heavy dependence on labeled data, limiting their effectiveness in label-scarce settings. While large language models (LLMs) naturally address the text understanding gap with deep semantic reasoning, existing LLM-for-graph methods either still require abundant labels during training or fail to exploit the rich structural signals freely available in graph topology. Our key observation is that, in many real-world TAGs, edges predominantly connect similar nodes under the homophily principle, meaning graph topology inherently encodes class structure without any labels. Building on this insight, we reformulate node classification as a link prediction task and present HopRank, a fully self-supervised LLM-tuning framework for TAGs. HopRank constructs preference data via hierarchical hop-based sampling and employs adaptive preference learning to prioritize informative training signals without any class labels. At inference, nodes are classified by predicting their connection preferences to labeled anchors, with an adaptive early-exit voting scheme to improve efficiency. Experiments on three TAG benchmarks show that HopRank matches fully-supervised GNNs and substantially outperforms prior graph-LLM methods, despite using zero labeled training data.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript presents HopRank, a self-supervised LLM preference-tuning framework for few-shot node classification on text-attributed graphs (TAGs). It reformulates node classification as link prediction by constructing preference pairs via hierarchical hop-based sampling from graph topology (treating closer hops as positive under the homophily assumption), performs adaptive preference learning on an LLM without any class labels, and classifies nodes at inference by predicting preferences to a set of labeled anchors combined with early-exit voting. Experiments on three TAG benchmarks claim that HopRank matches the performance of fully-supervised GNNs while substantially outperforming prior graph-LLM methods despite using zero labeled training data.
Significance. If the results hold under scrutiny, this work is significant for demonstrating that graph topology alone can supply self-supervised signals sufficient for LLM preference tuning to achieve competitive node classification in label-scarce TAG settings. The approach bridges GNN structural reasoning with LLM semantic depth without requiring labeled training data, and the adaptive preference and early-exit mechanisms are practical contributions that could reduce annotation costs in applications such as citation networks and social graphs.
major comments (3)
- [§4] §4 (Experiments) and Table 2: The central claim that HopRank matches fully-supervised GNNs with zero training labels rests on the unquantified homophily assumption, yet no homophily ratio, edge-class correlation, or class-homophily metric is reported for the three benchmarks; without these, it is impossible to determine whether the sampled preferences are aligned with true classes or whether the results are benchmark-specific.
- [§3.2] §3.2 (Hop sampling and preference construction): The hierarchical hop-based sampling procedure for generating positive/negative pairs is load-bearing for the self-supervised signal, but the manuscript provides insufficient detail on the exact sampling probabilities, maximum hop distance, and how ties or low-homophily edges are handled; this directly affects whether the preference data reliably encodes class structure.
- [§4.1] §4.1 (Inference procedure): The method uses a set of labeled anchors at inference, making it few-shot rather than zero-label overall, but the number of anchors, their selection strategy, and sensitivity analysis to anchor count are not reported; this undermines direct comparison to standard few-shot GNN baselines that also use limited labels.
minor comments (2)
- [Abstract] The abstract and §1 should explicitly name the three TAG benchmarks (presumably Cora, CiteSeer, PubMed) rather than referring to them generically.
- [Figure 2] Figure 2 (method overview) would benefit from clearer labeling of the adaptive preference loss components and the early-exit condition.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed feedback. The comments identify key areas where additional quantification and clarification will improve the manuscript's rigor and reproducibility. We address each major comment point by point below and commit to revisions that directly respond to the concerns raised.
Point-by-point responses
-
Referee: [§4] §4 (Experiments) and Table 2: The central claim that HopRank matches fully-supervised GNNs with zero training labels rests on the unquantified homophily assumption, yet no homophily ratio, edge-class correlation, or class-homophily metric is reported for the three benchmarks; without these, it is impossible to determine whether the sampled preferences are aligned with true classes or whether the results are benchmark-specific.
Authors: We agree that reporting homophily metrics will strengthen the interpretation of our results. In the revised manuscript we will add a dedicated paragraph and supplementary table that computes and reports, for each of the three TAG benchmarks: (1) the overall homophily ratio (fraction of intra-class edges), (2) edge-class correlation, and (3) per-class homophily scores. These quantities will be calculated directly from the ground-truth labels that exist only for evaluation, thereby quantifying how well the hierarchical hop sampling aligns with class structure without altering the zero-label training regime. revision: yes
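For concreteness, the edge homophily ratio promised in this response is simply the fraction of intra-class edges. A minimal sketch of the computation (ours, not the authors'):

```python
def edge_homophily(edges, labels):
    """Edge homophily ratio: fraction of edges whose endpoints share a
    class label (one of the metrics promised in the revision).

    edges: list of (u, v) pairs; labels: dict mapping node -> class.
    """
    same = sum(1 for u, v in edges if labels[u] == labels[v])
    return same / len(edges) if edges else 0.0

# Toy example: one of three edges is intra-class.
edges = [(0, 1), (1, 2), (0, 2)]
labels = {0: "A", 1: "A", 2: "B"}
print(edge_homophily(edges, labels))  # 0.333...
```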
-
Referee: [§3.2] §3.2 (Hop sampling and preference construction): The hierarchical hop-based sampling procedure for generating positive/negative pairs is load-bearing for the self-supervised signal, but the manuscript provides insufficient detail on the exact sampling probabilities, maximum hop distance, and how ties or low-homophily edges are handled; this directly affects whether the preference data reliably encodes class structure.
Authors: We accept that the current description of the sampling procedure is insufficiently precise. In the revision we will expand §3.2 with an explicit algorithmic description that states: sampling probabilities decay exponentially with hop distance (p(h) ∝ 2^{-h}), the maximum hop distance is capped at 3, and edges connecting nodes whose hop distance exceeds this cap or that fall below a minimum homophily threshold are discarded rather than used as negative samples. We will also include pseudocode and a small illustrative example to make the construction of preference pairs fully reproducible. revision: yes
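A minimal sketch of the sampler as this response describes it, with BFS hop distances, the p(h) ∝ 2^{-h} decay, and the hop cap of 3. The function names are ours, and the rule that the nearer candidate is preferred is our reading of the construction:

```python
import random
from collections import deque

def hop_distances(adj, source, max_hop=3):
    """BFS hop distances from `source`, truncated at `max_hop`."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if dist[u] == max_hop:
            continue
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def sample_preference_pair(adj, anchor, max_hop=3, rng=random):
    """One (preferred, rejected) candidate pair for `anchor`.

    Candidates are drawn with weight 2**(-h) over hop distance h,
    mirroring the p(h) ∝ 2^{-h} decay in the rebuttal; nodes beyond
    `max_hop` are discarded, and the nearer candidate is preferred.
    """
    dist = hop_distances(adj, anchor, max_hop)
    candidates = [v for v in dist if dist[v] > 0]
    if len({dist[v] for v in candidates}) < 2:
        return None  # neighborhood spans one hop; no ordered pair exists
    weights = [2.0 ** -dist[v] for v in candidates]
    a, b = rng.choices(candidates, weights=weights, k=2)
    while dist[a] == dist[b]:  # resample ties so the pair stays ordered
        b = rng.choices(candidates, weights=weights, k=1)[0]
    return (a, b) if dist[a] < dist[b] else (b, a)
```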
-
Referee: [§4.1] §4.1 (Inference procedure): The method uses a set of labeled anchors at inference, making it few-shot rather than zero-label overall, but the number of anchors, their selection strategy, and sensitivity analysis to anchor count are not reported; this undermines direct comparison to standard few-shot GNN baselines that also use limited labels.
Authors: The title and abstract already characterize the setting as few-shot node classification; the zero-label claim applies strictly to the training phase. At inference we indeed rely on a small set of labeled anchors. In the revised experiments section we will report the exact anchor count used (10 per class), the selection procedure (stratified random sampling from the labeled pool), and a sensitivity plot showing accuracy as a function of anchor count (1–20 per class). These additions will enable direct, apples-to-apples comparison with few-shot GNN baselines while preserving the central contribution that no labels are required for the preference-tuning stage. revision: yes
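One plausible reading of the early-exit voting scheme, sketched below. Here `link_score` is a hypothetical stand-in for the tuned LLM's connection-preference query (assumed to return a score in [0, 1]), and the exit rule (stop once the runner-up can no longer catch the leader) is our guess at what "adaptive" means:

```python
def classify_by_anchors(node, anchors_by_class, link_score):
    """Anchor voting with an early exit; a sketch, not the paper's scheme.

    anchors_by_class: dict mapping class -> list of labeled anchor nodes
    (10 per class in the revised experiments).
    link_score(node, anchor) -> float in [0, 1], higher meaning the tuned
    LLM prefers a link (a hypothetical interface).
    """
    votes = {c: 0.0 for c in anchors_by_class}
    rounds = min(len(a) for a in anchors_by_class.values())
    for r in range(rounds):
        for c, anchors in anchors_by_class.items():
            votes[c] += link_score(node, anchors[r])
        ranked = sorted(votes.values(), reverse=True)
        remaining = rounds - r - 1  # max score any class can still gain
        if len(ranked) > 1 and ranked[0] - ranked[1] > remaining:
            break  # the runner-up can no longer catch the leader
    return max(votes, key=votes.get)
```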
Circularity Check
No significant circularity; the self-supervised signal is derived from graph topology alone, independently of the class labels used at evaluation
Full rationale
The paper's core derivation reformulates node classification as link prediction by sampling preference pairs from hop distances on the graph. This construction uses only the adjacency structure, with the homophily premise as an external modeling choice; the resulting preference data and tuning objective are not defined in terms of the target class labels or any fitted quantity that is later re-used as a 'prediction.' No equation, self-citation, or uniqueness theorem collapses the claimed performance back onto the inputs by construction. The reported benchmark results therefore constitute an independent empirical test rather than a tautology.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption (homophily principle): edges predominantly connect similar nodes, allowing graph topology to encode class structure without labels
Reference graph
Works this paper leans on
- [1] R. Chen, T. Zhao, A. Jaiswal, N. Shah, and Z. Wang, LLaGA: Large language and graph assistant, in ICML, 2024.
- [2] Z. Chen, H. Mao, H. Li, W. Jin, H. Wen, X. Wei, S. Wang, D. Yin, W. Fan, H. Liu, et al., Exploring the potential of large language models (LLMs) in learning on graphs, in SIGKDD, 2024.
- [3] K. Ding, J. Wang, J. Li, D. Li, and H. Liu, Be more with less: Hypergraph attention networks for inductive text classification, in EMNLP, 2020.
- [4] M. Fey and J. E. Lenssen, Fast graph representation learning with PyTorch Geometric, arXiv preprint arXiv:1903.02428, 2019.
- [5] C. L. Giles, K. D. Bollacker, and S. Lawrence, CiteSeer: An automatic citation indexing system, in Proceedings of the Third ACM Conference on Digital Libraries, 1998, pp. 89–98.
- [6] W. Hamilton, Z. Ying, and J. Leskovec, Inductive representation learning on large graphs, Advances in Neural Information Processing Systems, 30 (2017).
- [7] X. He, X. Bresson, T. Laurent, A. Perold, Y. LeCun, and B. Hooi, Harnessing explanations: LLM-to-LM interpreter for enhanced text-attributed graph representation learning, in ICLR, 2023.
- [9] Z. Hou, X. Liu, Y. Cen, Y. Dong, H. Yang, C. Wang, and J. Tang, GraphMAE: Self-supervised masked graph autoencoders, in Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2022, pp. 594–604.
- [10] E. J. Hu, Y. Shen, P. Wallis, Z. Allen-Zhu, Y. Li, S. Wang, L. Wang, W. Chen, et al., LoRA: Low-rank adaptation of large language models, in ICLR, 2022.
- [11] K. Huang and M. Zitnik, Graph meta learning via local subgraphs, Advances in Neural Information Processing Systems, 33 (2020), pp. 5862–5874.
- [12] B. Jin, G. Liu, C. Han, M. Jiang, H. Ji, and J. Han, Large language models on graphs: A comprehensive survey, IEEE Transactions on Knowledge and Data Engineering, 2024.
- [13] D. P. Kingma and J. Ba, Adam: A method for stochastic optimization, arXiv preprint arXiv:1412.6980, 2014.
- [14] T. N. Kipf and M. Welling, Semi-supervised classification with graph convolutional networks, arXiv preprint arXiv:1609.02907, 2016.
- [16] M. McPherson, L. Smith-Lovin, and J. M. Cook, Birds of a feather: Homophily in social networks, Annual Review of Sociology, 27 (2001), pp. 415–444.
- [17] Y. Meng, M. Xia, and D. Chen, SimPO: Simple preference optimization with a reference-free reward, Advances in Neural Information Processing Systems, 2024.
- [18] L. Ouyang, J. Wu, X. Jiang, D. Almeida, C. Wainwright, P. Mishkin, C. Zhang, S. Agarwal, K. Slama, A. Ray, et al., Training language models to follow instructions with human feedback, Advances in Neural Information Processing Systems, 35 (2022), pp. 27730–27744.
- [19] R. Rafailov, A. Sharma, E. Mitchell, C. D. Manning, S. Ermon, and C. Finn, Direct preference optimization: Your language model is secretly a reward model, Advances in Neural Information Processing Systems, 36 (2023), pp. 53728–53741.
- [20] P. Sen, G. Namata, M. Bilgic, L. Getoor, B. Galligher, and T. Eliassi-Rad, Collective classification in network data, AI Magazine, 29 (2008), pp. 93–93.
- [22] J. Tang, Y. Yang, W. Wei, L. Shi, L. Su, S. Cheng, D. Yin, and C. Huang, GraphGPT: Graph instruction tuning for large language models, in Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2024, pp. 491–500.
- [23] S. Thakoor, C. Tallec, M. G. Azar, M. Azabou, E. L. Dyer, R. Munos, P. Veličković, and M. Valko, Large-scale representation learning on graphs via bootstrapping, in International Conference on Learning Representations, 2022.
- [24] P. Veličković, G. Cucurull, A. Casanova, A. Romero, P. Liò, Y. Bengio, et al., Graph attention networks, arXiv preprint arXiv:1710.10903, 2017.
- [25] P. Veličković, W. Fedus, W. L. Hamilton, P. Liò, Y. Bengio, and R. D. Hjelm, Deep Graph Infomax, in International Conference on Learning Representations, 2019.
- [26] J. Wang, J. Wu, Y. Hou, Y. Liu, M. Gao, and J. McAuley, InstructGraph: Boosting large language models via graph-centric instruction tuning and preference alignment, in Findings of ACL, 2024.
- [27] Z. Wang and K. Ding, Remol: LLM-guided molecular optimization with reinforcement learning, 2018.
- [29] Z. Wang, Y. Wen, A. Pandy, H. Liu, and K. Ding, MolMem: Memory-augmented agentic reinforcement learning for sample-efficient molecular optimization, arXiv preprint arXiv:2604.12237, 2026.
- [32] F. Wu, A. Souza, T. Zhang, C. Fifty, T. Yu, and K. Weinberger, Simplifying graph convolutional networks, in International Conference on Machine Learning, PMLR, 2019, pp. 6861–6871.
- [34] R. Xu and K. Ding, GNN-as-Judge: Unleashing the power of LLMs for graph semi-supervised learning with GNN feedback, in Machine Learning on Graphs in the Era of Generative Artificial Intelligence, 2025.
- [35] Z. Yang, W. Cohen, and R. Salakhutdinov, Revisiting semi-supervised learning with graph embeddings, in International Conference on Machine Learning, PMLR, 2016, pp. 40–48.
- [36] R. Ye, C. Zhang, R. Wang, S. Xu, and Y. Zhang, Language is all a graph needs, in Findings of the Association for Computational Linguistics: EACL 2024, 2024, pp. 1955–1973.
- [37] J. Zhao, M. Qu, C. Li, H. Yan, Q. Liu, R. Li, X. Xie, and J. Tang, Learning on large-scale text-attributed graphs via variational inference, in International Conference on Learning Representations, 2023.
- [38] F. Zhou, C. Cao, K. Zhang, G. Trajcevski, T. Zhong, and J. Geng, Meta-GNN: On few-shot node classification in graph meta-learning, in Proceedings of the 28th ACM International Conference on Information and Knowledge Management, 2019, pp. 2357–2360.
- [39] Y. Zhu, Y. Xu, F. Yu, Q. Liu, S. Wu, and L. Wang, Deep graph contrastive representation learning, in ICML Workshop on Graph Representation Learning and Beyond, 2020.