pith. sign in

arxiv: 2603.03573 · v2 · pith:OFKEG6FJnew · submitted 2026-03-03 · 💻 cs.CE

STRIDE: Post-Training LLMs to Reason and Refine Bio-Sequences via Edit Trajectories

classification 💻 cs.CE
keywords strideeditingrefinementtrajectoriesdiscreteeditsiterativellms
0
0 comments X
read the original abstract

Discrete biological sequence optimization often requires goal-directed, parser-valid edits to an existing protein or molecule. Diffusion models support iterative refinement but do not expose a controllable discrete-edit interface, while autoregressive LLMs can be myopic when planning constrained edits over multiple steps. We introduce STRIDE (Sequence Trajectory Refinement via Iterative Discrete Editing), a post-training framework that trains an LLM to emit executable INSERT/DELETE/REPLACE trajectories for variable-length refinement. STRIDE first learns Levenshtein-aligned shortest-edit demonstrations, then uses supervised fine-tuning and group-based policy optimization to align trajectories with task rewards while preserving coherent editing. On an oracle-based full-action protein stress test, STRIDE raises success over Vanilla SFT from 42% to 89% and novelty among unique improvements from 47% to 97%. On instruction-conditioned molecular editing, the GSPO-aligned variant improves strict success, controllability, and SMILES validity over the SFT-only STRIDE model (code: https://github.com/daiheng-zhang/STRIDE).

This paper has not been read by Pith yet.

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.