How to Hallucinate Functional Proteins

Hector Garcia Martin; Zak Costello

arxiv: 1903.00458 · v1 · pith:YKG2GZMKnew · submitted 2019-03-01 · 🧬 q-bio.QM

classification 🧬 q-bio.QM

keywords protein, sequences, design, BioSeqVAE, hallucinate, model, space, valid

Here we present a novel approach to protein design and phenotypic inference using a generative model for protein sequences. BioSeqVAE, a variational autoencoder variant, can hallucinate syntactically valid protein sequences that are likely to fold and function. BioSeqVAE is trained on the entire known protein sequence space and learns to generate valid examples of protein sequences in an unsupervised manner. The model is validated by showing that its latent feature space is useful and that it accurately reconstructs sequences. Its usefulness is demonstrated with a selection of relevant downstream design tasks. This work is intended to serve as a computational first step towards a general-purpose, structure-free protein design tool.
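
The abstract describes a variational autoencoder over protein sequences but gives no implementation detail. As a rough illustration only, the sketch below shows the generic pieces any sequence VAE relies on: one-hot encoding of amino acids, the reparameterization trick for sampling a latent vector, the KL term against a standard normal prior, and argmax decoding of per-position logits. It is not BioSeqVAE's architecture. All function names, the alphabet ordering, and the latent size of 8 are assumptions made for this example, and the encoder and decoder networks themselves are omitted.

```python
import numpy as np

# The 20 standard amino acids as one-letter codes.
# The ordering is an arbitrary choice for this sketch, not taken from the paper.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def one_hot(seq):
    """Encode a protein sequence as a (length, 20) one-hot matrix."""
    x = np.zeros((len(seq), len(AMINO_ACIDS)))
    x[np.arange(len(seq)), [AA_INDEX[aa] for aa in seq]] = 1.0
    return x


def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps, with eps drawn from N(0, I)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps


def kl_to_standard_normal(mu, log_var):
    """KL(N(mu, sigma^2) || N(0, I)), summed over latent dimensions."""
    return 0.5 * np.sum(np.exp(log_var) + mu ** 2 - 1.0 - log_var)


def decode_to_sequence(logits):
    """Map per-position logits of shape (length, 20) to a sequence by argmax."""
    return "".join(AMINO_ACIDS[i] for i in np.argmax(logits, axis=1))


# Hypothetical usage: a real model would produce mu and log_var from an
# encoder network and the logits from a decoder network.
rng = np.random.default_rng(0)
x = one_hot("MKTAYIAK")
mu, log_var = np.zeros(8), np.zeros(8)  # assumed latent size of 8
z = reparameterize(mu, log_var, rng)
```

In a trained model, "hallucinating" a sequence would amount to drawing z from the prior and passing it through the decoder before the argmax step shown here.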

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

DeepRHP: A Hybrid Variational Autoencoder for Designing Random Heteropolymers as Protein Mimics
cs.LG 2026-06 unverdicted novelty 5.0

DeepRHP is a semi-supervised hybrid VAE that learns RHP sequences and chemical features to propose monomer compositions stabilizing membrane proteins, validated against published results.