Hierarchical Contrastive Learning for Multi-Domain Protein-Ligand Binding
Pith reviewed 2026-05-20 06:39 UTC · model grok-4.3
The pith
Hierarchical contrastive learning decouples geometric pre-training from affinity prediction to handle multi-domain protein flexibility.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
HCLBind is a self-supervised framework that decouples geometric representation learning from affinity regression through general-to-specific pre-training on the Q-BioLiP database. It uses a hierarchical decoy strategy (coordinate perturbation for single-domain proteins, inter-domain rotation for multi-domain complexes) together with a domain-gated graph attention network and cross-modal attention to learn discriminative interface features and robust uncertainty estimates.
What carries the argument
Hierarchical decoy strategy that generates local physicochemical constraints via single-domain coordinate perturbation and global conformational geometry via multi-domain inter-domain rotation.
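The two decoy types can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the Gaussian noise model, the noise scale `sigma`, and the choice of pivot and rotation axis are all assumptions made for illustration.

```python
import numpy as np

def perturb_coordinates(coords, sigma=0.5, rng=None):
    """Local decoy (assumed form): add isotropic Gaussian noise to every atom."""
    rng = np.random.default_rng() if rng is None else rng
    return coords + rng.normal(scale=sigma, size=coords.shape)

def rotation_matrix(axis, angle_rad):
    """Rodrigues' formula for a rotation about a (normalised) axis."""
    axis = np.asarray(axis, dtype=float)
    axis = axis / np.linalg.norm(axis)
    k = np.array([[0.0, -axis[2], axis[1]],
                  [axis[2], 0.0, -axis[0]],
                  [-axis[1], axis[0], 0.0]])
    return np.eye(3) + np.sin(angle_rad) * k + (1.0 - np.cos(angle_rad)) * (k @ k)

def rotate_domain(coords, domain_mask, pivot, axis, angle_deg):
    """Global decoy (assumed form): rigidly rotate one domain about an axis
    passing through a pivot point, leaving all other atoms untouched."""
    rot = rotation_matrix(axis, np.deg2rad(angle_deg))
    decoy = coords.copy()
    decoy[domain_mask] = (coords[domain_mask] - pivot) @ rot.T + pivot
    return decoy
```

A rigid rotation keeps every distance inside the rotated domain fixed and changes only the inter-domain geometry, whereas the perturbation decoy distorts local atomic environments. That difference is what lets the two decoy types probe global and local structure separately.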
If this is right
- The model learns local constraints separately from global geometry, reducing the impact of rigid-body assumptions on flexible proteins.
- Domain-gated attention and cross-modal fusion explicitly prioritize interface regions over the rest of the complex.
- LoRA adaptation preserves evolutionary knowledge from foundation models while allowing efficient task-specific training.
- Uncertainty estimates become more reliable because the contrastive pre-training is decoupled from the final regression head.
- Performance gains appear on the PDBBind benchmark for multi-domain cases where prior supervised methods degrade.
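The LoRA point in the list above can be made concrete with a small NumPy sketch of a low-rank adapted linear layer. It follows the general LoRA formulation (frozen weight plus a scaled low-rank update); the class name, rank, and scaling values are illustrative assumptions and say nothing about how HCLBind configures its adapters.

```python
import numpy as np

class LoRALinear:
    """Linear layer y = x W^T + (alpha / r) * x A^T B^T with W frozen.

    Only A and B would be trained; W stands in for a pretrained
    foundation-model weight. Names and defaults are hypothetical.
    """

    def __init__(self, weight, rank=4, alpha=8.0, rng=None):
        rng = np.random.default_rng() if rng is None else rng
        d_out, d_in = weight.shape
        self.weight = weight                                 # frozen pretrained weight
        self.a = rng.normal(scale=0.01, size=(rank, d_in))   # small random init
        self.b = np.zeros((d_out, rank))                     # zero init: no change at start
        self.scale = alpha / rank

    def __call__(self, x):
        base = x @ self.weight.T
        update = self.scale * (x @ self.a.T) @ self.b.T
        return base + update
```

Because `b` starts at zero, the adapted layer reproduces the pretrained layer exactly before any training. This is the sense in which the pretrained knowledge is preserved while only `rank * (d_in + d_out)` extra parameters are optimised.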
Where Pith is reading between the lines
- The same hierarchical decoy pattern could be tested on protein-protein or protein-RNA interfaces where domain motion also governs recognition.
- Representations learned this way may transfer more readily to downstream tasks such as ligand design for flexible multi-domain targets.
- If the decoy strategy truly captures physical grammar, similar contrastive objectives might improve conformational ensemble modeling in structural biology.
Load-bearing premise
The coordinate perturbations and inter-domain rotations create training signals that match real biological binding dynamics instead of non-physical artifacts.
What would settle it
Compare affinity prediction error on a held-out set of multi-domain proteins whose crystal structures show large verified domain rotations against the same model trained without the inter-domain rotation decoy. A consistently lower error for the model trained with the rotation decoy would support the claim; no measurable gap would undercut it.
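One way such a comparison could be scored is a paired bootstrap over the held-out complexes. The sketch below is an assumed evaluation protocol, not one described in the paper; the function names, the RMSE metric, and the 95% percentile interval are illustrative choices.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root-mean-square error between measured and predicted affinities."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def paired_bootstrap_rmse_gap(y_true, pred_full, pred_ablated, n_boot=2000, seed=0):
    """RMSE gap (ablated minus full) with a 95% percentile bootstrap interval.

    The same resampled complexes are used for both models, so the
    interval reflects the paired difference rather than two separate errors.
    """
    y_true, pred_full, pred_ablated = map(np.asarray, (y_true, pred_full, pred_ablated))
    rng = np.random.default_rng(seed)
    n = len(y_true)
    gaps = np.empty(n_boot)
    for i in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample complexes with replacement
        gaps[i] = rmse(y_true[idx], pred_ablated[idx]) - rmse(y_true[idx], pred_full[idx])
    point = rmse(y_true, pred_ablated) - rmse(y_true, pred_full)
    low, high = np.percentile(gaps, [2.5, 97.5])
    return point, (float(low), float(high))
```

A positive gap whose interval excludes zero would indicate that removing the rotation decoy hurts accuracy on this subset. An interval straddling zero would leave the question open.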
Original abstract
Predicting protein-ligand binding affinity remains intractable for multi-domain proteins, where inter-domain dynamics govern molecular recognition. Existing geometric deep learning methods typically treat proteins as monolithic static graphs, suffering from rigid-body assumptions and aleatoric noise in flexible regions. To address this, we introduced HCLBind, a self-supervised framework that decouples geometric representation learning from affinity regression. HCLBind leverages a general-to-specific pre-training paradigm on the Q-BioLiP database to learn a robust physical grammar of binding. We propose a novel hierarchical decoy strategy: the model learns local physicochemical constraints through protein coordinate perturbation in single-domain proteins and global conformational geometry through inter-domain rotation in multi-domain complexes. Our hybrid architecture integrates a domain-gated graph attention network and cross-modal attention to explicitly prioritize domain interfaces. Furthermore, we employ LoRA on protein and ligand foundation models, ensuring efficient optimization while preserving evolutionary knowledge. Experiments on PDBBind demonstrate that HCLBind effectively learns discriminative interface features and provides robust uncertainty estimation, overcoming the limitations of standard supervised learning. The code is available at https://github.com/jiankliu/HCLBind.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces HCLBind, a self-supervised hierarchical contrastive learning framework for protein-ligand binding affinity prediction in multi-domain proteins. It decouples geometric representation learning from affinity regression via general-to-specific pre-training on Q-BioLiP, employing a novel hierarchical decoy strategy (coordinate perturbation on single-domain proteins and inter-domain rotation on multi-domain complexes), a domain-gated graph attention network with cross-modal attention, and LoRA adaptation of foundation models. The central claim is that this approach learns discriminative interface features and yields robust uncertainty estimates, as shown by experiments on PDBBind that overcome limitations of standard supervised learning.
Significance. If the central claims are substantiated, the work could meaningfully advance geometric deep learning for flexible multi-domain systems by providing contrastive pre-training signals that capture both local physicochemical constraints and global conformational geometry. The public code release at https://github.com/jiankliu/HCLBind and the use of LoRA to preserve evolutionary knowledge while enabling efficient optimization are concrete strengths that support reproducibility and practical adoption.
major comments (2)
- [Abstract] Abstract: The assertion that 'Experiments on PDBBind demonstrate that HCLBind effectively learns discriminative interface features and provides robust uncertainty estimation' is unsupported by any quantitative metrics, baselines, error bars, ablation studies, or statistical tests. Without these, the central claim that the method overcomes limitations of supervised learning cannot be evaluated.
- [Hierarchical Decoy Strategy] Hierarchical Decoy Strategy section: The claim that coordinate perturbation (single-domain) and inter-domain rotation (multi-domain) supply faithful contrastive signals for learning a 'robust physical grammar of binding' rests on an unverified assumption. These operations can produce non-physical artifacts such as steric clashes or unphysical domain separations; no validation, energy analysis, or comparison to molecular dynamics trajectories is provided to show that the domain-gated GAT and cross-modal attention encode real binding dynamics rather than these artifacts. This assumption is load-bearing for the pre-training paradigm and downstream affinity/uncertainty results on PDBBind.
minor comments (2)
- The equations defining the contrastive loss and uncertainty estimation should be presented explicitly with all variables defined, rather than described only in prose.
- [Experiments] Figure captions and axis labels for any PDBBind performance plots should include error bars, baseline names, and dataset split details for clarity.
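For readers unfamiliar with contrastive objectives, a widely used form is InfoNCE, sketched below for a single anchor. Whether HCLBind uses this exact loss is not stated in the material reviewed here; the function name, cosine similarity, and temperature value are assumptions.

```python
import numpy as np

def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE for one anchor embedding.

    loss = -log( exp(s_pos / t) / (exp(s_pos / t) + sum_k exp(s_neg_k / t)) )
    where s is cosine similarity and t is the temperature.
    `negatives` has shape (k, d); here they would be decoy embeddings.
    """
    def unit(v):
        return v / np.linalg.norm(v, axis=-1, keepdims=True)

    anchor, positive, negatives = unit(anchor), unit(positive), unit(negatives)
    logits = np.concatenate(([anchor @ positive], negatives @ anchor)) / temperature
    logits = logits - logits.max()  # subtract the max for numerical stability
    return float(-(logits[0] - np.log(np.exp(logits).sum())))
```

In a decoy-based setup, the native complex would play the role of the positive and the perturbed or rotated structures the role of the negatives, so the loss falls as the native embedding becomes easier to tell apart from its decoys.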
Simulated Author's Rebuttal
We thank the referee for their constructive and detailed feedback on our manuscript. We address each major comment point by point below, outlining our responses and planned revisions where appropriate.
- Referee: [Abstract] Abstract: The assertion that 'Experiments on PDBBind demonstrate that HCLBind effectively learns discriminative interface features and provides robust uncertainty estimation' is unsupported by any quantitative metrics, baselines, error bars, ablation studies, or statistical tests. Without these, the central claim that the method overcomes limitations of supervised learning cannot be evaluated.
  Authors: We agree that the abstract would benefit from more explicit reference to supporting quantitative evidence. The full manuscript presents these details in the Experiments and Results sections, including baseline comparisons on PDBBind, ablation studies isolating the hierarchical pre-training and attention components, error bars across multiple random seeds, and statistical tests for performance differences. To directly address the concern, we will revise the abstract to briefly highlight key quantitative outcomes (e.g., affinity prediction improvements and uncertainty calibration metrics) while preserving its length and focus.
  Revision: yes
- Referee: [Hierarchical Decoy Strategy] Hierarchical Decoy Strategy section: The claim that coordinate perturbation (single-domain) and inter-domain rotation (multi-domain) supply faithful contrastive signals for learning a 'robust physical grammar of binding' rests on an unverified assumption. These operations can produce non-physical artifacts such as steric clashes or unphysical domain separations; no validation, energy analysis, or comparison to molecular dynamics trajectories is provided to show that the domain-gated GAT and cross-modal attention encode real binding dynamics rather than these artifacts. This assumption is load-bearing for the pre-training paradigm and downstream affinity/uncertainty results on PDBBind.
  Authors: We acknowledge the importance of verifying that the decoy generation produces physically plausible signals. The hierarchical strategy is motivated by established practices in geometric deep learning for proteins, with coordinate perturbation targeting local atomic environments and inter-domain rotation capturing global flexibility relevant to multi-domain binding. In the revised manuscript, we will add a dedicated validation subsection that quantifies potential artifacts (e.g., steric clash scores via standard structural validation tools and interface RMSD preservation) and discusses how the domain-gated graph attention and cross-modal mechanisms help the model prioritize biologically meaningful features. Full molecular dynamics comparisons remain computationally intensive and are noted as a valuable direction for future work, but the added analyses will strengthen the current claims.
  Revision: partial
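The artifact checks mentioned in this exchange (steric clashes and interface RMSD) could be approximated with a few lines of NumPy. This is a rough sketch under stated assumptions: the function names and the 3.0 angstrom distance cutoff are hypothetical, and a real validation would rely on an established structure-validation tool rather than a bare distance threshold.

```python
import numpy as np

def inter_domain_clashes(coords, domain_mask, min_dist=3.0):
    """Count atom pairs, one from each domain, closer than min_dist angstroms.

    The cutoff is an illustrative assumption, not a validated clash criterion.
    """
    a, b = coords[domain_mask], coords[~domain_mask]
    dist = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return int((dist < min_dist).sum())

def rmsd(reference, decoy):
    """Root-mean-square deviation between matched atoms, without superposition."""
    return float(np.sqrt(np.mean(np.sum((reference - decoy) ** 2, axis=1))))
```

Applied to the interface atoms only, `rmsd` gives a crude measure of how far a decoy has moved the binding site. Decoys with many inter-domain clashes could then be filtered out or reported separately before pre-training.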
Circularity Check
No significant circularity in derivation chain
The paper describes a self-supervised pre-training framework on the distinct Q-BioLiP database using a proposed hierarchical decoy strategy (coordinate perturbation for single-domain and inter-domain rotation for multi-domain), followed by evaluation on PDBBind. No equations, fitted parameters, or self-citations are presented that reduce the reported performance, uncertainty estimates, or interface feature learning directly to quantities defined by the same inputs or by construction. The central claims rest on the novel architecture (domain-gated GAT and cross-modal attention with LoRA) and the decoy generation process, which are independent methodological contributions rather than reductions to prior fitted values or self-referential definitions. Pre-training and test distributions are explicitly separated, rendering the derivation self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
free parameters (1)
- LoRA rank and scaling factors
axioms (1)
- Domain assumption: Q-BioLiP contains sufficient structural diversity to learn a general physical grammar of binding that transfers to PDBBind.