Screening of 52,000 bioRxiv preprints finds dual-use-adjacent content routinely present in open titles and abstracts, often exceeding risk thresholds.
6 Pith papers cite this work. Polarity classification is still indexing.
Citation-role summary: background 1
Citation-polarity summary: background 1
Verdicts: UNVERDICTED 6
citing papers explorer
- The Biosecurity Blind Spot: Systematic Dual-use Detection in Open Science Infrastructure
  Screening of 52,000 bioRxiv preprints finds dual-use-adjacent content routinely present in open titles and abstracts, often exceeding risk thresholds.
- GoForth: Language Models for RNA Design under Structure, Sequence, and Coding Constraints
  GoForth is a forward-trained encoder-decoder RNA language model that generates sequences under mixed constraints on fold, sequence, and coding by separating sequence prior, forward folding sampler, and reward oracle.
- Be Your Own Teacher: Steering Protein Language Models via Unsupervised Reward Optimization
  Unsupervised rewards combining model uncertainty and semantic consistency allow protein language models to self-steer via SRO and BRO algorithms, outperforming DPO and KTO on out-of-distribution prompts while approaching oracle performance.
- How Post-Training Shapes Biological Reasoning Models
  Post-training stages reshape generalization in biological reasoning models distinctly: CPT aligns with biological language, SFT boosts ID performance but causes OOD to peak early and decline, while RL on strong SFT checkpoints can recover OOD generalization.
- TadA-Bench: A Million-Variant Benchmark for Future-Round Discovery Toward Agentic Protein Engineering
  TadA-Bench supplies a chronological million-variant wet-lab replay benchmark from 31 TadA directed-evolution rounds that evaluates models on future-round variant ranking given only earlier data.
- Machine Learning for RNA Secondary Structure Prediction: a review of current methods and challenges
  This review surveys current machine learning methods for RNA secondary structure prediction, identifies a generalization crisis prompting homology-aware benchmarking, and outlines future challenges including pseudoknots, long transcripts, modified nucleotides, and dynamic ensembles.