arxiv: 2602.16643 · v1 · pith:NRVBTI6Anew · submitted 2026-02-18 · 💻 cs.LG · cond-mat.stat-mech

Factorization Machine with Quadratic-Optimization Annealing for RNA Inverse Folding and Evaluation of Binary-Integer Encoding and Nucleotide Assignment

keywords encoding, fmqa, folding, inverse, problem, assignments, binary-integer, integers

The RNA inverse folding problem aims to identify nucleotide sequences that preferentially adopt a given target secondary structure. While various heuristic and machine learning-based approaches have been proposed, many require a large number of sequence evaluations, which limits their applicability when experimental validation is costly. We propose a method to solve the problem using a factorization machine with quadratic-optimization annealing (FMQA). FMQA is a discrete black-box optimization method reported to obtain high-quality solutions with a limited number of evaluations. Applying FMQA to the problem requires converting nucleotides into binary variables. However, the influence of integer-to-nucleotide assignments and binary-integer encoding on the performance of FMQA has not been thoroughly investigated, even though such choices determine the structure of the surrogate model and the search landscape, and thus can directly affect solution quality. Therefore, this study aims both to establish a novel FMQA framework for RNA inverse folding and to analyze the effects of these assignments and encoding methods. We evaluated all 24 possible assignments of the four nucleotides to the ordered integers (0-3), in combination with four binary-integer encoding methods. Our results demonstrated that one-hot and domain-wall encodings outperform binary and unary encodings in terms of the normalized ensemble defect value. In domain-wall encoding, nucleotides assigned to the boundary integers (0 and 3) appeared with higher frequency. In the RNA inverse folding problem, assigning guanine and cytosine to these boundary integers promoted their enrichment in stem regions, which led to more thermodynamically stable secondary structures than those obtained with one-hot encoding.
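The encoding and assignment choices described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the nucleotide alphabet string `ACGU`, the most-significant-bit-first order in the binary encoding, and the "ones first" orientation of the unary and domain-wall bit strings are all assumptions made for illustration. The sketch shows how an integer in 0-3 maps to bits under each of the four encodings, how a nucleotide sequence becomes a binary vector under a given assignment, and that there are 24 possible assignments.

```python
from itertools import permutations

NUCLEOTIDES = "ACGU"   # assumed alphabet order, for illustration only
N_LEVELS = 4           # integers 0-3, one per nucleotide


def encode_binary(k):
    """2 bits, most significant bit first (assumed convention)."""
    return [(k >> 1) & 1, k & 1]


def encode_unary(k):
    """3 bits whose sum equals k; here the canonical 'ones first' form."""
    return [1 if i < k else 0 for i in range(N_LEVELS - 1)]


def encode_one_hot(k):
    """4 bits with exactly one bit set, at position k."""
    return [1 if i == k else 0 for i in range(N_LEVELS)]


def encode_domain_wall(k):
    """3 bits: k ones followed by zeros; the wall position encodes k."""
    return [1 if i < k else 0 for i in range(N_LEVELS - 1)]


def decode_unary(bits):
    """Any bit string is valid in unary; the value is the number of ones."""
    return sum(bits)


def decode_domain_wall(bits):
    """Valid only if no 0 is followed by a 1; otherwise return None."""
    if any(a < b for a, b in zip(bits, bits[1:])):
        return None
    return sum(bits)


def all_assignments():
    """All mappings of the four nucleotides to the integers 0-3."""
    return [dict(zip(p, range(N_LEVELS))) for p in permutations(NUCLEOTIDES)]


def encode_sequence(seq, assignment, encoder):
    """Concatenate the per-nucleotide bit strings for a whole sequence."""
    bits = []
    for nt in seq:
        bits.extend(encoder(assignment[nt]))
    return bits


if __name__ == "__main__":
    # Hypothetical assignment placing G and C on the boundary integers 0 and 3.
    gc_boundary = {"G": 0, "A": 1, "U": 2, "C": 3}
    print(len(all_assignments()))
    print(encode_sequence("GCAU", gc_boundary, encode_domain_wall))
    print(encode_sequence("GCAU", gc_boundary, encode_one_hot))
```

In this sketch the canonical unary and domain-wall bit strings coincide; the two differ in which bit strings count as valid. Unary accepts any string and reads off the number of ones, whereas domain-wall accepts only strings with a single wall, so a string such as `[0, 1, 0]` decodes to 1 under unary and is invalid under domain-wall.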

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Stage-dependent integer-binary encoding in factorization-machine black-box optimization

    cs.LG 2026-06 unverdicted novelty 6.0

    Stage-dependent encoding in FMQA black-box optimization, using one-hot for learning and domain-wall for search, improves residual error on discretized Rastrigin functions under finer discretization and higher dimensio...

  2. Improving FMQA via Initial Training Data Design Considering Marginal Bit Coverage in One-Hot Encoding

    cs.LG 2026-05 unverdicted novelty 6.0

    Ensuring complete marginal bit coverage in initial data for one-hot encoded FMQA improves mean optimization performance on wing-shape benchmarks with 17 and 32 variables.