Contrastive fine-tuning of protein language models on Pfam, structural, interaction, and mutational datasets produces embeddings that improve kNN performance on 15-16 of 23 downstream tasks including remote homology detection and structural retrieval.
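The kNN-over-embeddings evaluation protocol described above can be sketched as follows. This is a minimal NumPy illustration only: the embeddings and family labels are synthetic stand-ins, not ProtSent outputs, and `knn_predict` is a hypothetical helper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins for contrastively fine-tuned protein embeddings:
# 100 "training" proteins over 4 hypothetical family labels, 10 queries.
train = rng.normal(size=(100, 128))
labels = rng.integers(0, 4, size=100)
queries = rng.normal(size=(10, 128))

def knn_predict(queries, train, labels, k=5):
    """Majority-vote kNN by cosine similarity, as in embedding retrieval."""
    q = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    t = train / np.linalg.norm(train, axis=1, keepdims=True)
    sims = q @ t.T                           # (n_queries, n_train)
    nn = np.argsort(-sims, axis=1)[:, :k]    # indices of the k nearest neighbors
    votes = labels[nn]                       # (n_queries, k) neighbor labels
    return np.array([np.bincount(v).argmax() for v in votes])

pred = knn_predict(queries, train, labels)   # one predicted family per query
```

Cosine similarity is a common distance choice for contrastively trained embeddings, since the training objective typically normalizes or angularly separates them.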
MMseqs2 enables sensitive protein sequence searching for the analysis of massive data sets
4 Pith papers cite this work.
4 representative citing papers
STOMP extends direct preference optimization to the multi-objective setting via smooth Tchebysheff scalarization and standardization of observed rewards, achieving highest hypervolume in eight of nine protein engineering evaluations.
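The scalarization named in the STOMP summary can be sketched as below: standardize each reward dimension, then score candidates by a smooth (logsumexp-based) Tchebysheff distance to the per-objective ideal point. This is an illustrative scoring function under assumed conventions, not STOMP's actual training objective; the function name and the temperature `mu` are hypothetical.

```python
import numpy as np
from scipy.special import logsumexp

def smooth_tchebysheff(rewards, weights, mu=0.1):
    """Smooth Tchebysheff scalarization of standardized multi-objective rewards.

    rewards: (n, k) raw rewards for n candidates over k objectives.
    weights: (k,) positive preference weights.
    mu: smoothing temperature; as mu -> 0 this approaches the hard max.
    """
    # Standardize each objective, as the summary's "standardization of
    # observed rewards" suggests.
    z = (rewards - rewards.mean(axis=0)) / (rewards.std(axis=0) + 1e-8)
    # Ideal point: best standardized reward per objective.
    ideal = z.max(axis=0)
    # Hard Tchebysheff score is max_i w_i * (ideal_i - z_i);
    # mu * logsumexp(. / mu) is its standard smooth surrogate.
    shortfall = weights * (ideal - z)
    return mu * logsumexp(shortfall / mu, axis=1)

rewards = np.random.default_rng(1).normal(size=(50, 3))
w = np.array([0.5, 0.3, 0.2])
scores = smooth_tchebysheff(rewards, w)
best = int(scores.argmin())  # lower scalarized shortfall = closer to the ideal
```

The logsumexp surrogate keeps the objective differentiable everywhere, which is what makes Tchebysheff-style scalarization usable inside gradient-based preference optimization.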
A new tree-conditioned edit-flow model for ancestral sequence reconstruction achieves reasonable accuracy on substitution-only evolved sequences and superior localization of changes on natural indel-rich sequences.
Galactica, a science-specialized LLM, reports higher scores than GPT-3, Chinchilla, and PaLM on LaTeX knowledge, mathematical reasoning, and medical QA benchmarks while outperforming general models on BIG-bench.
citing papers explorer
- ProtSent: Protein Sentence Transformers
- Pareto-Optimal Offline Reinforcement Learning via Smooth Tchebysheff Scalarization