
arxiv: 2605.19902 · v1 · pith:6I5MQFVUnew · submitted 2026-05-19 · 💻 cs.LG · q-bio.QM

Hierarchical Contrastive Learning for Multi-Domain Protein-Ligand Binding

Pith reviewed 2026-05-20 06:39 UTC · model grok-4.3

classification 💻 cs.LG q-bio.QM
keywords hierarchical contrastive learning · protein-ligand binding · multi-domain proteins · self-supervised learning · graph attention network · binding affinity prediction · uncertainty estimation · decoy strategy

The pith

Hierarchical contrastive learning decouples geometric pre-training from affinity prediction to handle multi-domain protein flexibility.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces HCLBind as a self-supervised framework that learns binding representations for multi-domain proteins by separating local and global geometric signals. Standard geometric deep learning treats proteins as rigid static structures, which fails when inter-domain movements control recognition and introduces noise in flexible parts. HCLBind instead pre-trains on a large database using a hierarchical decoy strategy: coordinate perturbations teach single-domain physicochemical rules while inter-domain rotations teach global conformational constraints. A domain-gated graph attention network combined with cross-modal attention then focuses on interface regions, and LoRA keeps foundation-model knowledge intact during fine-tuning. On PDBBind this yields more discriminative interface features and better-calibrated uncertainty estimates than direct supervised regression.

Core claim

HCLBind is a self-supervised framework that decouples geometric representation learning from affinity regression through general-to-specific pre-training on the Q-BioLiP database; it employs a hierarchical decoy strategy of coordinate perturbation for single-domain proteins and inter-domain rotation for multi-domain complexes, integrated with a domain-gated graph attention network and cross-modal attention, to learn discriminative interface features and robust uncertainty estimates.

What carries the argument

Hierarchical decoy strategy that generates local physicochemical constraints via single-domain coordinate perturbation and global conformational geometry via multi-domain inter-domain rotation.
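
The two decoy types described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the Gaussian noise model, the hinge pivot, and the rotation axis are all assumptions chosen for illustration, and the page does not state how HCLBind parameterizes either operation.

```python
import math
import random

Vec = tuple[float, float, float]

def perturb_coordinates(coords: list[Vec], sigma: float, rng: random.Random) -> list[Vec]:
    """Local decoy (assumed form): add isotropic Gaussian noise with std `sigma` to every atom."""
    return [tuple(c + rng.gauss(0.0, sigma) for c in atom) for atom in coords]

def rotate_about_axis(point: Vec, pivot: Vec, axis: Vec, angle_rad: float) -> Vec:
    """Rotate `point` around the line through `pivot` along `axis` (Rodrigues' formula)."""
    norm = math.sqrt(sum(a * a for a in axis))
    k = tuple(a / norm for a in axis)
    v = tuple(p - q for p, q in zip(point, pivot))
    cos_t, sin_t = math.cos(angle_rad), math.sin(angle_rad)
    k_dot_v = sum(a * b for a, b in zip(k, v))
    k_cross_v = (
        k[1] * v[2] - k[2] * v[1],
        k[2] * v[0] - k[0] * v[2],
        k[0] * v[1] - k[1] * v[0],
    )
    rotated = tuple(
        v[i] * cos_t + k_cross_v[i] * sin_t + k[i] * k_dot_v * (1.0 - cos_t)
        for i in range(3)
    )
    return tuple(r + q for r, q in zip(rotated, pivot))

def rotate_domain(coords, domain_ids, moving_domain, pivot, axis, angle_deg):
    """Global decoy (assumed form): rigidly rotate one domain, leave the others fixed."""
    angle = math.radians(angle_deg)
    return [
        rotate_about_axis(atom, pivot, axis, angle) if d == moving_domain else atom
        for atom, d in zip(coords, domain_ids)
    ]

if __name__ == "__main__":
    # Toy four-atom "protein" with two domains; all values are hypothetical.
    native = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (5.0, 0.0, 0.0), (6.0, 0.0, 0.0)]
    domains = [0, 0, 1, 1]
    print(perturb_coordinates(native, sigma=0.5, rng=random.Random(0)))
    print(rotate_domain(native, domains, 1, (3.0, 0.0, 0.0), (0.0, 0.0, 1.0), 30.0))
```

The rotation is rigid, so distances inside the moved domain are preserved while distances across the domain interface change. That is what separates the global decoy from the local one in this sketch, and it is also why unchecked rotations can produce the steric clashes the referee report raises.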

If this is right

  • The model learns local constraints separately from global geometry, reducing the impact of rigid-body assumptions on flexible proteins.
  • Domain-gated attention and cross-modal fusion explicitly prioritize interface regions over the rest of the complex.
  • LoRA adaptation preserves evolutionary knowledge from foundation models while allowing efficient task-specific training.
  • Uncertainty estimates become more reliable because the contrastive pre-training is decoupled from the final regression head.
  • Performance gains appear on the PDBBind benchmark for multi-domain cases where prior supervised methods degrade.
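
The reliability claim in the list above is usually checked with an error-retention curve, the diagnostic named in the Figure 2 caption. The sketch below shows one common way to compute such a curve; the function name, the choice of RMSE, and the retention fractions are assumptions, not details taken from the paper.

```python
import math

def error_retention_curve(abs_errors, uncertainties, fractions=(0.25, 0.5, 0.75, 1.0)):
    """RMSE over the most-confident fraction of samples, for each retention fraction.

    Samples are ranked by predicted uncertainty (lowest first). If the
    uncertainty is informative, RMSE should fall as fewer samples are retained.
    """
    if len(abs_errors) != len(uncertainties) or not abs_errors:
        raise ValueError("inputs must be non-empty and of equal length")
    order = sorted(range(len(abs_errors)), key=lambda i: uncertainties[i])
    curve = []
    for frac in fractions:
        n_keep = max(1, math.ceil(frac * len(order)))
        kept = [abs_errors[i] for i in order[:n_keep]]
        curve.append((frac, math.sqrt(sum(e * e for e in kept) / n_keep)))
    return curve

if __name__ == "__main__":
    # Hypothetical absolute errors (pK units) and uncertainty scores.
    errors = [0.1, 0.2, 1.0, 2.0]
    scores = [0.05, 0.1, 0.8, 0.9]
    for frac, value in error_retention_curve(errors, scores):
        print(f"retain {frac:.0%}: RMSE = {value:.3f}")
```

A curve that stays flat, or rises as samples are discarded, would indicate that the uncertainty scores do not track the actual errors.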

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same hierarchical decoy pattern could be tested on protein-protein or protein-RNA interfaces where domain motion also governs recognition.
  • Representations learned this way may transfer more readily to downstream tasks such as ligand design for flexible multi-domain targets.
  • If the decoy strategy truly captures physical grammar, similar contrastive objectives might improve conformational ensemble modeling in structural biology.

Load-bearing premise

The coordinate perturbations and inter-domain rotations create training signals that match real biological binding dynamics instead of non-physical artifacts.

What would settle it

Compare affinity prediction error on a held-out set of multi-domain proteins whose crystal structures show large verified domain rotations against the same model trained without the inter-domain rotation decoy; a large performance gap would support the claim.
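
One way to quantify the gap this test calls for is a paired bootstrap over the held-out complexes, sketched below. The function names, the resample count, and the use of RMSE are assumptions made for illustration; the paper's own evaluation protocol is not described on this page.

```python
import math
import random

def rmse(y_true, y_pred):
    """Root-mean-square error between two equally long sequences."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def paired_bootstrap_rmse_gap(y_true, pred_full, pred_ablated, n_resamples=2000, seed=0):
    """Return (observed gap, (low, high)) for RMSE(ablated) - RMSE(full).

    The same resampled indices are used for both models, so the interval
    reflects the paired difference rather than two independent estimates.
    """
    rng = random.Random(seed)
    n = len(y_true)
    observed = rmse(y_true, pred_ablated) - rmse(y_true, pred_full)
    gaps = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]
        truth = [y_true[i] for i in idx]
        gap = rmse(truth, [pred_ablated[i] for i in idx]) - rmse(truth, [pred_full[i] for i in idx])
        gaps.append(gap)
    gaps.sort()
    low = gaps[int(0.025 * n_resamples)]
    high = gaps[max(0, int(0.975 * n_resamples) - 1)]
    return observed, (low, high)
```

If the interval for the gap excludes zero on the large-rotation subset, the inter-domain rotation decoy is doing measurable work; an interval straddling zero would leave the claim unsupported.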

Figures

Figures reproduced from arXiv: 2605.19902 by Huifeng Zhang, Jian K. Liu, Rongqi Hong, Shuo Zhang.

Figure 1: The HCLBind Framework. (A) Input Data (B) Pre-training Phase (C) Fine-tuning Phase. We propose HCLBind (Hierarchical Contrastive Learning for Binding), a unified framework designed to capture the complex, adaptive geometry of multi-…
Figure 2: Uncertainty quantification and reliability analysis. Left: Error-retention curves. Middle: Density distribution of uncertainty scores. Right: Epistemic uncertainty comparison. Reliable uncertainty estimation is important for high-stakes drug discovery. HCLBind employs evidential deep learning to quantify epistemic uncertainty, enabling the detection of structurally anomalous inputs and ambiguous binding i…
Figure 3: Relative performance gain (%) of HCLBind compared to variants across binding topologies. As mentioned in Supplementary Section S1.3, we stratified the test set into three categories: Single-Domain Binders (SDB), Interface Binders (IB), and Linker Binders (LB)
Original abstract

Predicting protein-ligand binding affinity remains intractable for multi-domain proteins, where inter-domain dynamics govern molecular recognition. Existing geometric deep learning methods typically treat proteins as monolithic static graphs, suffering from rigid-body assumptions and aleatoric noise in flexible regions. To address this, we introduce HCLBind, a self-supervised framework that decouples geometric representation learning from affinity regression. HCLBind leverages a general-to-specific pre-training paradigm on the Q-BioLiP database to learn a robust physical grammar of binding. We propose a novel hierarchical decoy strategy: the model learns local physicochemical constraints through protein coordinate perturbation in single-domain proteins and global conformational geometry through inter-domain rotation in multi-domain complexes. Our hybrid architecture integrates a domain-gated graph attention network and cross-modal attention to explicitly prioritize domain interfaces. Furthermore, we employ LoRA on protein and ligand foundation models, ensuring efficient optimization while preserving evolutionary knowledge. Experiments on PDBBind demonstrate that HCLBind effectively learns discriminative interface features and provides robust uncertainty estimation, overcoming the limitations of standard supervised learning. The code is available at https://github.com/jiankliu/HCLBind.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript introduces HCLBind, a self-supervised hierarchical contrastive learning framework for protein-ligand binding affinity prediction in multi-domain proteins. It decouples geometric representation learning from affinity regression via general-to-specific pre-training on Q-BioLiP, employing a novel hierarchical decoy strategy (coordinate perturbation on single-domain proteins and inter-domain rotation on multi-domain complexes), a domain-gated graph attention network with cross-modal attention, and LoRA adaptation of foundation models. The central claim is that this approach learns discriminative interface features and yields robust uncertainty estimates, as shown by experiments on PDBBind that overcome limitations of standard supervised learning.

Significance. If the central claims are substantiated, the work could meaningfully advance geometric deep learning for flexible multi-domain systems by providing contrastive pre-training signals that capture both local physicochemical constraints and global conformational geometry. The public code release at https://github.com/jiankliu/HCLBind and the use of LoRA to preserve evolutionary knowledge while enabling efficient optimization are concrete strengths that support reproducibility and practical adoption.

major comments (2)
  1. [Abstract] The assertion that 'Experiments on PDBBind demonstrate that HCLBind effectively learns discriminative interface features and provides robust uncertainty estimation' is unsupported by any quantitative metrics, baselines, error bars, ablation studies, or statistical tests. Without these, the central claim that the method overcomes limitations of supervised learning cannot be evaluated.
  2. [Hierarchical Decoy Strategy] The claim that coordinate perturbation (single-domain) and inter-domain rotation (multi-domain) supply faithful contrastive signals for learning a 'robust physical grammar of binding' rests on an unverified assumption. These operations can produce non-physical artifacts such as steric clashes or unphysical domain separations; no validation, energy analysis, or comparison to molecular dynamics trajectories is provided to show that the domain-gated GAT and cross-modal attention encode real binding dynamics rather than these artifacts. This assumption is load-bearing for the pre-training paradigm and downstream affinity/uncertainty results on PDBBind.
minor comments (2)
  1. The equations defining the contrastive loss and uncertainty estimation should be presented explicitly with all variables defined, rather than described only in prose.
  2. [Experiments] Figure captions and axis labels for any PDBBind performance plots should include error bars, baseline names, and dataset split details for clarity.
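
For reference, the kind of explicit statement the first minor comment asks for might look like the following. This is a sketch under stated assumptions: the contrastive term is a generic InfoNCE-style objective, with the symbols s^+, s^-_k, K, and τ introduced here for illustration, and the uncertainty terms follow the standard deep evidential regression formulation of Amini et al. (2020). The page does not confirm that HCLBind uses exactly these forms.

```latex
% Assumed InfoNCE-style objective: s^{+} is the model's score for the native
% complex, s^{-}_{k} the score for the k-th decoy, \tau a temperature.
\mathcal{L}_{\mathrm{con}}
  = -\log \frac{\exp(s^{+}/\tau)}
               {\exp(s^{+}/\tau) + \sum_{k=1}^{K} \exp(s^{-}_{k}/\tau)}

% Deep evidential regression: the head outputs Normal-Inverse-Gamma
% parameters (\gamma, \nu, \alpha, \beta) with \alpha > 1.
\hat{y} = \mathbb{E}[\mu] = \gamma, \qquad
\underbrace{\mathbb{E}[\sigma^{2}] = \frac{\beta}{\alpha - 1}}_{\text{aleatoric}}, \qquad
\underbrace{\mathrm{Var}[\mu] = \frac{\beta}{\nu(\alpha - 1)}}_{\text{epistemic}}
```

Writing the terms this way makes the split visible: the epistemic term shrinks as the evidence parameter ν grows, while the aleatoric term does not depend on ν.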

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive and detailed feedback on our manuscript. We address each major comment point by point below, outlining our responses and planned revisions where appropriate.

  1. Referee: [Abstract] The assertion that 'Experiments on PDBBind demonstrate that HCLBind effectively learns discriminative interface features and provides robust uncertainty estimation' is unsupported by any quantitative metrics, baselines, error bars, ablation studies, or statistical tests. Without these, the central claim that the method overcomes limitations of supervised learning cannot be evaluated.

    Authors: We agree that the abstract would benefit from more explicit reference to supporting quantitative evidence. The full manuscript presents these details in the Experiments and Results sections, including baseline comparisons on PDBBind, ablation studies isolating the hierarchical pre-training and attention components, error bars across multiple random seeds, and statistical tests for performance differences. To directly address the concern, we will revise the abstract to briefly highlight key quantitative outcomes (e.g., affinity prediction improvements and uncertainty calibration metrics) while preserving its length and focus. revision: yes

  2. Referee: [Hierarchical Decoy Strategy] The claim that coordinate perturbation (single-domain) and inter-domain rotation (multi-domain) supply faithful contrastive signals for learning a 'robust physical grammar of binding' rests on an unverified assumption. These operations can produce non-physical artifacts such as steric clashes or unphysical domain separations; no validation, energy analysis, or comparison to molecular dynamics trajectories is provided to show that the domain-gated GAT and cross-modal attention encode real binding dynamics rather than these artifacts. This assumption is load-bearing for the pre-training paradigm and downstream affinity/uncertainty results on PDBBind.

    Authors: We acknowledge the importance of verifying that the decoy generation produces physically plausible signals. The hierarchical strategy is motivated by established practices in geometric deep learning for proteins, with coordinate perturbation targeting local atomic environments and inter-domain rotation capturing global flexibility relevant to multi-domain binding. In the revised manuscript, we will add a dedicated validation subsection that quantifies potential artifacts (e.g., steric clash scores via standard structural validation tools and interface RMSD preservation) and discusses how the domain-gated graph attention and cross-modal mechanisms help the model prioritize biologically meaningful features. Full molecular dynamics comparisons remain computationally intensive and are noted as a valuable direction for future work, but the added analyses will strengthen the current claims. revision: partial
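
The artifact checks promised in this response can be illustrated with a minimal sketch. It is an assumption-laden stand-in, not the validation the authors plan: the 2.0 Å clash cutoff is a hypothetical threshold, the function names are invented here, and the RMSD is computed without superposition, whereas real structural validation tools use element-specific radii and alignment.

```python
import math

def count_steric_clashes(atoms_a, atoms_b, min_distance=2.0):
    """Count atom pairs across two groups closer than `min_distance` (hypothetical cutoff, in Å)."""
    return sum(1 for a in atoms_a for b in atoms_b if math.dist(a, b) < min_distance)

def rmsd(coords_ref, coords_alt):
    """RMSD between two equally ordered coordinate lists, with no superposition step."""
    if len(coords_ref) != len(coords_alt) or not coords_ref:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = sum(math.dist(a, b) ** 2 for a, b in zip(coords_ref, coords_alt))
    return math.sqrt(total / len(coords_ref))

if __name__ == "__main__":
    # Toy coordinates; a decoy would be rejected or down-weighted if it clashes.
    domain_a = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
    domain_b_decoy = [(2.5, 0.0, 0.0), (8.0, 0.0, 0.0)]
    print("clashes:", count_steric_clashes(domain_a, domain_b_decoy))
    print("interface RMSD:", rmsd([(0.0, 0.0, 0.0)], [(3.0, 4.0, 0.0)]))
```

Reporting the clash count and interface RMSD per decoy would let readers judge how far the generated negatives stray from physically plausible conformations.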

Circularity Check

0 steps flagged

No significant circularity in derivation chain


The paper describes a self-supervised pre-training framework on the distinct Q-BioLiP database using a proposed hierarchical decoy strategy (coordinate perturbation for single-domain and inter-domain rotation for multi-domain), followed by evaluation on PDBBind. No equations, fitted parameters, or self-citations are presented that reduce the reported performance, uncertainty estimates, or interface feature learning directly to quantities defined by the same inputs or by construction. The central claims rest on the novel architecture (domain-gated GAT and cross-modal attention with LoRA) and the decoy generation process, which are independent methodological contributions rather than reductions to prior fitted values or self-referential definitions. Pre-training and test distributions are explicitly separated, rendering the derivation self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The central claim rests on the assumption that the chosen decoy perturbations generate useful contrastive signals and that the Q-BioLiP database supplies representative binding examples; no new physical entities are postulated.

free parameters (1)
  • LoRA rank and scaling factors
    Hyperparameters controlling the low-rank adaptation of the protein and ligand foundation models.
axioms (1)
  • domain assumption: Q-BioLiP contains sufficient structural diversity to learn a general physical grammar of binding that transfers to PDBBind.
    The general-to-specific pre-training paradigm depends on this database being representative of real binding physics.
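
To make the free parameter in the ledger concrete, the sketch below shows where the LoRA rank and scaling factor enter a single adapted linear layer, following the standard formulation h = W0 x + (alpha / r) B A x from the LoRA paper. The toy matrices and the function names are assumptions; the page does not report the rank or scaling values HCLBind uses.

```python
def matvec(matrix, vector):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def lora_forward(w0, a, b, x, alpha):
    """Adapted layer output: W0 x + (alpha / r) * B (A x).

    w0: frozen weights, shape (d_out, d_in)
    a:  trainable down-projection, shape (r, d_in)
    b:  trainable up-projection, shape (d_out, r)
    The rank r is read from the number of rows in `a`.
    """
    rank = len(a)
    base = matvec(w0, x)
    update = matvec(b, matvec(a, x))
    scale = alpha / rank
    return [h + scale * d for h, d in zip(base, update)]

def trainable_parameter_count(d_in, d_out, rank):
    """Parameters trained by LoRA for one layer, versus d_in * d_out for full fine-tuning."""
    return rank * (d_in + d_out)

if __name__ == "__main__":
    w0 = [[1.0, 0.0], [0.0, 1.0]]      # frozen identity weights (toy values)
    a = [[1.0, 1.0]]                   # rank r = 1
    b_zero = [[0.0], [0.0]]            # B starts at zero, so the layer is unchanged
    print(lora_forward(w0, a, b_zero, [1.0, 2.0], alpha=2.0))
    print(trainable_parameter_count(d_in=1024, d_out=1024, rank=8))
```

Because B is initialized to zero in the standard recipe, the adapted model starts out identical to the frozen foundation model, which is the mechanism behind the claim that pre-trained knowledge is preserved.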


