pith. machine review for the scientific record.

arxiv: 1902.04166 · v1 · submitted 2019-02-04 · 🧬 q-bio.QM · cs.LG · stat.ML

Recognition: unknown

WideDTA: prediction of drug-target binding affinity

classification 🧬 q-bio.QM cs.LG stat.ML
keywords affinity binding information protein sequence prediction widedta deep

Motivation: Predicting the interaction affinity between proteins and compounds is a major challenge in the drug discovery process. WideDTA is a deep-learning-based prediction model that uses chemical and biological textual sequence information to predict binding affinity. Results: WideDTA uses four text-based information sources to predict binding affinity: the protein sequence, ligand SMILES, protein domains and motifs, and maximum common substructure words. WideDTA outperformed DeepDTA, one of the state-of-the-art deep learning methods for drug-target binding affinity prediction, on the KIBA dataset with statistical significance. This indicates that the word-based sequence representation adopted by WideDTA is a promising alternative to the character-based sequence representation used in deep learning models for binding affinity prediction such as DeepDTA. In addition, the results showed that, given the protein sequence and ligand SMILES, including protein domain and motif information and ligand maximum common substructure words does not provide additional useful information for the deep learning model. Interestingly, however, using only domain and motif information to represent proteins achieved performance similar to using the full protein sequence, suggesting that important binding-relevant information is contained within the protein motifs and domains.
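
The contrast the abstract draws between character-based and word-based sequence representations can be illustrated with a minimal sketch. The abstract does not say how words are formed, so the overlapping fixed-length windows, the word lengths, the example sequences, and all function names below are illustrative assumptions, not the authors' implementation.

```python
from typing import Dict, List


def char_tokens(seq: str) -> List[str]:
    """Character-based view: every symbol is its own token."""
    return list(seq)


def word_tokens(seq: str, k: int) -> List[str]:
    """Word-based view (assumed scheme): overlapping windows of length k."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def build_vocab(token_lists: List[List[str]]) -> Dict[str, int]:
    """Assign integer ids in order of first appearance; 0 is reserved."""
    vocab: Dict[str, int] = {}
    for tokens in token_lists:
        for tok in tokens:
            vocab.setdefault(tok, len(vocab) + 1)
    return vocab


def encode(tokens: List[str], vocab: Dict[str, int], max_len: int) -> List[int]:
    """Map tokens to ids, truncate to max_len, and right-pad with 0.

    Unknown tokens also map to 0 in this sketch.
    """
    ids = [vocab.get(tok, 0) for tok in tokens][:max_len]
    return ids + [0] * (max_len - len(ids))


# Hypothetical inputs: a short peptide fragment and the SMILES of acetic acid.
protein = "MKTAYIAK"
smiles = "CC(=O)O"

protein_words = word_tokens(protein, k=3)  # k=3 is an assumed word length
smiles_words = word_tokens(smiles, k=4)    # k=4 is an assumed word length

protein_vocab = build_vocab([protein_words])
protein_ids = encode(protein_words, protein_vocab, max_len=10)
```

Either token stream can be fed to an embedding layer as a fixed-length integer vector; the word-based variant trades a larger vocabulary for tokens that each span several adjacent symbols.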

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. SPADE: Faster Drug Discovery by Learning from Sparse Data

    cs.LG 2026-05 unverdicted novelty 5.0

    SPADE selects ligands more efficiently than deep learning or Bayesian optimization, needing fewer tests on average to identify high-quality drug candidates for novel proteins.

  2. HBGSA: Hydrogen Bond Graph with Self-Attention for Drug-Target Binding Affinity Prediction

    cs.LG 2026-04 unverdicted novelty 4.0

    HBGSA uses a hydrogen bond graph with self-attention in a GNN plus Pearson correlation loss to outperform baselines on binding affinity prediction for drug discovery.