WideDTA: prediction of drug-target binding affinity

Hakime Öztürk, Elif Ozkirimli, Arzucan Özgür

classification q-bio.QM cs.LG stat.ML

keywords affinity, binding, information, protein, sequence, prediction, widedta, deep

Motivation: Prediction of the interaction affinity between proteins and compounds is a major challenge in the drug discovery process. WideDTA is a deep-learning-based prediction model that employs chemical and biological textual sequence information to predict binding affinity.

Results: WideDTA uses four text-based information sources to predict binding affinity: the protein sequence, ligand SMILES, protein domains and motifs, and maximum common substructure words. WideDTA outperformed DeepDTA, one of the state-of-the-art deep learning methods for drug-target binding affinity prediction, on the KIBA dataset with statistical significance. This indicates that the word-based sequence representation adopted by WideDTA is a promising alternative to the character-based sequence representation used in deep learning models for binding affinity prediction, such as DeepDTA. In addition, the results showed that, given the protein sequence and ligand SMILES, the inclusion of protein domain and motif information and of ligand maximum common substructure words does not provide additional useful information for the deep learning model. Interestingly, however, using only domain and motif information to represent proteins achieved performance similar to using the full protein sequence, suggesting that important binding-relevant information is contained within the protein motifs and domains.
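The contrast between character-based and word-based sequence representations can be illustrated with a minimal Python sketch. The abstract does not say how words are formed, so the overlapping fixed-length windows, the word lengths (3 for the protein, 8 for the SMILES string), the function names, and the example sequences below are all assumptions chosen for illustration, not the authors' implementation.

```python
def char_tokens(seq: str) -> list[str]:
    """Character-based representation: one token per symbol."""
    return list(seq)


def word_tokens(seq: str, k: int) -> list[str]:
    """Word-based representation: overlapping windows of length k.

    Assumption: words are formed with a sliding window of stride 1.
    Sequences shorter than k yield a single shorter word.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    if len(seq) < k:
        return [seq] if seq else []
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


# Hypothetical inputs: a toy protein fragment and the SMILES of aspirin.
protein = "MKVLAA"
smiles = "CC(=O)OC1=CC=CC=C1C(=O)O"

protein_chars = char_tokens(protein)
protein_words = word_tokens(protein, k=3)   # assumed word length
ligand_words = word_tokens(smiles, k=8)     # assumed word length

print(protein_chars)
print(protein_words)
print(ligand_words[:3])
```

In a sketch like this, each distinct word would then be mapped to an integer index and fed to an embedding layer, in the same way individual characters are in a character-based model.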

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

SPADE: Faster Drug Discovery by Learning from Sparse Data
cs.LG 2026-05 unverdicted novelty 5.0

SPADE selects ligands more efficiently than deep learning or Bayesian optimization, needing fewer tests on average to identify high-quality drug candidates for novel proteins.

HBGSA: Hydrogen Bond Graph with Self-Attention for Drug-Target Binding Affinity Prediction
cs.LG 2026-04 unverdicted novelty 4.0

HBGSA uses a hydrogen bond graph with self-attention in a GNN plus Pearson correlation loss to outperform baselines on binding affinity prediction for drug discovery.