pith. machine review for the scientific record.

arxiv: 2603.11703 · v2 · submitted 2026-03-12 · 💻 cs.LG

Recognition: unknown

EvoFlows: Evolutionary Edit-Based Flow-Matching for Protein Engineering

Constance Ferragu, Eli Bixby, Jonathan D. Ziegler, Nicolas Deutschmann, Shayan Aziznejad

keywords protein · evoflows · models · sequence · template · deletions · engineering · existing

We introduce EvoFlows, a variable-length protein sequence-to-sequence modeling approach designed for protein engineering. Existing protein language models are poorly suited for optimization tasks: autoregressive models require full sequence generation, masked language and discrete diffusion models rely on pre-specified mutation locations, and no existing methods naturally support insertions and deletions relative to a template sequence. EvoFlows learns mutational trajectories between evolutionarily related protein sequences via edit flows, allowing it to perform a controllable number of mutations (insertions, deletions, and substitutions) on a template sequence, predicting not only _which_ mutation to perform, but also _where_ it should occur. Through extensive _in silico_ evaluation on diverse protein families from UniRef and OAS, we show that EvoFlows generates variants that remain consistent with natural protein families while exploring farther from template sequences than leading baselines.
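To make the three edit types in the abstract concrete, the following is a minimal sketch of applying a fixed set of insertions, deletions, and substitutions to a template sequence. It illustrates only the edit operations, not the EvoFlows model or its training. The names `Edit` and `apply_edits`, the 0-based indexing convention, and the example sequence are assumptions made for illustration and do not come from the paper.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Edit:
    """Hypothetical edit record (not from the paper)."""
    op: str            # "sub", "ins", or "del"
    pos: int           # 0-based index into the ORIGINAL template
    residue: str = ""  # amino acid for "sub"/"ins"; unused for "del"


def apply_edits(template: str, edits: List[Edit]) -> str:
    """Apply edits whose positions all refer to the original template.

    Assumes at most one edit per position. Edits are applied from right
    to left so that an insertion or deletion never shifts the index of
    an edit that has not been applied yet.
    """
    seq = list(template)
    for e in sorted(edits, key=lambda e: e.pos, reverse=True):
        if e.op == "sub":
            seq[e.pos] = e.residue          # replace residue in place
        elif e.op == "ins":
            seq.insert(e.pos, e.residue)    # insert before position pos
        elif e.op == "del":
            del seq[e.pos]                  # drop the residue at pos
        else:
            raise ValueError(f"unknown edit op: {e.op!r}")
    return "".join(seq)


# Illustrative template and edits (made up for this example).
template = "MKTAYIAK"
edits = [Edit("sub", 1, "R"), Edit("ins", 4, "G"), Edit("del", 7)]
variant = apply_edits(template, edits)
print(variant)  # MRTAGYIA
```

In this sketch the number of mutations is simply `len(edits)`; in the approach described above, the model itself would propose which edit to make and where, which this example does not attempt to reproduce.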

This paper has not been read by Pith yet.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Tree-Conditioned Edit Flows for Ancestral Sequence Reconstruction

    q-bio.QM · 2026-05 · unverdicted · novelty 6.0

    A new tree-conditioned edit-flow model for ancestral sequence reconstruction achieves reasonable accuracy on substitution-only evolved sequences and superior localization of changes on natural indel-rich sequences.