m6A-FORM: A Foundation Model for Decoding N6-methyladenosine Biology

Shou-Jiang Gao; Sumin Jo; Tinghe Zhang; Yufei Huang

arxiv: 2606.12219 · v1 · pith:VWE4QSDK · submitted 2026-06-10 · 🧬 q-bio.GN · q-bio.MN

Tinghe Zhang, Sumin Jo, Shou-Jiang Gao, Yufei Huang

Pith reviewed 2026-06-27 07:33 UTC · model grok-4.3

classification 🧬 q-bio.GN q-bio.MN

keywords: m6A, N6-methyladenosine, RNA methylation, foundation model, transformer, site prediction, MeRIP-seq, epitranscriptomics

The pith

m6A-FORM predicts m6A sites with a PR-AUC of 0.635 after pretraining on MeRIP-seq peaks and fine-tuning on single-nucleotide annotations, improving PR-AUC over existing methods by at least 0.14.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces m6A-FORM, a transformer-based foundation model for decoding N6-methyladenosine biology in RNA. It pretrains on approximately 22 million sequences derived from MeRIP-seq peaks across 143 human studies to learn methylation-enriched patterns. Fine-tuning on high-confidence single-nucleotide annotations then yields state-of-the-art prediction of m6A sites with PR-AUC 0.635 and ROC-AUC 0.988. The model also adapts to predict binding sites of m6A regulators and identifies thousands of tissue-conserved sites with distinct biological signatures. This approach addresses inefficiencies in prior adenosine-centered predictors by using peak priors for better accuracy and speed.

Core claim

m6A-FORM is a transformer-based foundation model that uses MeRIP-seq peaks as methylation-enriched priors and is pretrained on approximately 22 million peak-derived sequences from 143 human MeRIP-seq studies. After fine-tuning with high-confidence single-nucleotide m6A annotations from m6A-Atlas v2.0 and GLORI, it achieves a PR-AUC of 0.635 and ROC-AUC of 0.988 for m6A site prediction, improving PR-AUC by at least 0.14 over existing methods while enabling substantially faster inference. Task-specific adaptation supports prediction of binding sites for 19 m6A-associated regulators and identification of YTHDF2-bound m6A sites associated with mRNA degradation. Applying the model across 67 datasets from 24 human tissues identifies 19,631 tissue-conserved sites with distinct localization, clustering, methylation, expression, RBP-interaction, and decay-associated signatures.

What carries the argument

A transformer-based foundation model that uses MeRIP-seq peaks as methylation-enriched priors and is pretrained on peak-derived sequences.

If this is right

The model achieves substantially faster inference for m6A site prediction compared to prior methods.
Task-specific adaptation enables prediction of binding sites for 19 m6A-associated regulators.
The model identifies YTHDF2-bound m6A sites associated with mRNA degradation.
Application across 67 datasets from 24 human tissues yields 19,631 tissue-conserved m6A sites with distinct signatures.
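The last point turns on what "tissue-conserved" means operationally, which the abstract does not spell out. The sketch below shows one plausible reading, assuming a site counts as conserved when it is called in at least `min_tissues` distinct tissues. The function name, the `(tissue, dataset_id, site)` record layout, and the threshold are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def conserved_sites(calls, min_tissues):
    """Return the sites called in at least `min_tissues` distinct tissues.

    `calls` is an iterable of (tissue, dataset_id, site) records, where `site`
    is any hashable key such as (chrom, position, strand). This layout is an
    assumption made for illustration only.
    """
    tissues_per_site = defaultdict(set)
    for tissue, _dataset_id, site in calls:
        # Count tissues, not datasets, so that several datasets from the
        # same tissue cannot make a site look more widely conserved.
        tissues_per_site[site].add(tissue)
    return {site for site, tissues in tissues_per_site.items()
            if len(tissues) >= min_tissues}


# Toy records, invented for the example.
toy_calls = [
    ("brain", "d1", ("chr1", 1000, "+")),
    ("brain", "d2", ("chr1", 1000, "+")),
    ("liver", "d3", ("chr1", 1000, "+")),
    ("heart", "d4", ("chr1", 1000, "+")),
    ("brain", "d1", ("chr2", 500, "-")),
    ("brain", "d2", ("chr2", 500, "-")),
]
print(conserved_sites(toy_calls, min_tissues=3))
```

Counting distinct tissues rather than datasets matters when tissues are sampled unevenly, as is likely with 67 datasets spread over 24 tissues.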

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The peak-based pretraining approach might extend to prediction tasks for other RNA modifications where peak data exists.
The tissue-conserved sites could provide candidates for experiments testing effects on mRNA decay rates in specific cell types.
Faster inference could allow scanning of larger transcriptomes or integration with other sequencing datasets for combined analyses.

Load-bearing premise

MeRIP-seq peaks serve as reliable methylation-enriched priors for pretraining and the single-nucleotide annotations from m6A-Atlas v2.0 and GLORI constitute accurate ground truth without substantial false positives or selection biases.

What would settle it

An independent validation experiment using an orthogonal technique such as mass spectrometry on held-out tissue samples to check whether the predicted m6A sites match at the reported accuracy levels.

Original abstract

N6-methyladenosine (m6A) is the most abundant internal modification in eukaryotic mRNA. However, most existing predictors use adenosine-centered formulations that are computationally inefficient and prone to false positives. Here we present m6A-FORM, a transformer-based foundation model for RNA methylation that uses MeRIP-seq peaks as methylation-enriched priors and is pretrained on approximately 22 million peak-derived sequences from 143 human MeRIP-seq studies. After fine-tuning with high-confidence single-nucleotide m6A annotations from m6A-Atlas v2.0 and GLORI, m6A-FORM-sites achieves state-of-the-art m6A site prediction performance, with a PR-AUC of 0.635 and ROC-AUC of 0.988, improving PR-AUC by at least 0.14 over existing methods while enabling substantially faster inference. Task-specific adaptation further supports prediction of binding sites for 19 m6A-associated regulators and identification of YTHDF2-bound m6A sites associated with mRNA degradation. Applying m6A-FORM across 67 datasets from 24 human tissues identifies 19,631 tissue-conserved sites with distinct localization, clustering, methylation, expression, RBP-interaction, and decay-associated signatures.
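The gap between the two headline metrics (ROC-AUC 0.988 against PR-AUC 0.635) is typical of rare-positive problems. The minimal sketch below, in plain Python on synthetic Gaussian scores, shows how class imbalance can leave ROC-AUC high while pulling the precision-recall summary down. Average precision is used here as a stand-in for PR-AUC; the abstract does not say which estimator the authors used, so that choice is an assumption, as are all function names and toy parameters.

```python
import random


def roc_auc(labels, scores):
    """Probability that a random positive outscores a random negative (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Mean of the precision values measured at the rank of each true positive.

    Requires at least one positive label; tied scores are ranked arbitrarily.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    total = 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            hits += 1
            total += hits / rank
    return total / hits


def simulate(n_pos, n_neg, shift, seed=0):
    """Toy scores: positives ~ N(shift, 1), negatives ~ N(0, 1)."""
    rng = random.Random(seed)
    labels = [1] * n_pos + [0] * n_neg
    scores = [rng.gauss(shift, 1.0) for _ in range(n_pos)]
    scores += [rng.gauss(0.0, 1.0) for _ in range(n_neg)]
    return labels, scores


# Same score separation, balanced versus 1:100 imbalanced classes.
for n_neg in (30, 3000):
    labels, scores = simulate(n_pos=30, n_neg=n_neg, shift=2.5)
    print(n_neg, round(roc_auc(labels, scores), 3),
          round(average_precision(labels, scores), 3))
```

Because a random classifier's average precision is roughly the positive-class prevalence, a PR-AUC value is only interpretable alongside the positive-to-negative ratio of the test set, which the abstract does not report.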

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

m6A-FORM scales up transformer pretraining on MeRIP-seq peaks for m6A tasks, but the state-of-the-art numbers rest on train-test splits and label quality that the abstract does not show.

The main takeaway is that this paper introduces a transformer pretrained on roughly 22 million peak-derived sequences from 143 human MeRIP-seq studies, then fine-tuned on single-nucleotide labels from m6A-Atlas v2.0 and GLORI. It reports PR-AUC 0.635 and ROC-AUC 0.988 for site prediction, plus extensions to 19 regulator binding tasks and tissue-conserved sites across 67 datasets.

What is actually new is the volume of MeRIP-seq pretraining data treated as methylation-enriched priors, combined with the multi-task fine-tuning. Earlier genomic transformers exist, but this specific scale and RNA-modification focus go a step beyond most prior m6A predictors. The speed claim for inference is also concrete and potentially useful for people scanning large transcriptomes.

The work is straightforward in its framing: older adenosine-centered methods are slow and noisy, so a foundation-model route makes sense on paper. The tissue and regulator applications show they tried to move past single-task prediction.
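The efficiency argument can be made concrete with a back-of-the-envelope count of model evaluations. The sketch below assumes that an adenosine-centered predictor needs one forward pass per candidate adenosine, and that a peak-based model scores every position of a fixed-size window in a single pass. The second assumption is an illustrative reading of "peak priors", not a description of m6A-FORM's actual architecture, and the function names and window size are hypothetical.

```python
def adenosine_centered_passes(sequence):
    """One forward pass per adenosine in the transcript (assumed cost model)."""
    return sum(1 for base in sequence.upper() if base == "A")


def peak_window_passes(peaks, window):
    """One forward pass per window needed to tile each half-open peak [start, end)."""
    total = 0
    for start, end in peaks:
        length = end - start
        total += -(-length // window)  # ceiling division
    return total


# Toy transcript of 4,000 nt with two called peaks (all values invented).
transcript = "ACGU" * 1000
toy_peaks = [(100, 200), (2500, 2770)]
print(adenosine_centered_passes(transcript))      # 1000 adenosines to score
print(peak_window_passes(toy_peaks, window=128))  # 4 windows to score
```

Under these assumptions the cost scales with the number of peak windows rather than the number of adenosines. Restricting candidates to peaks would also plausibly reduce false positives outside enriched regions, at the price of missing any site that falls outside a called peak.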

The soft spots sit in the validation details that are missing from the abstract. No description appears of train-test splits, baseline re-implementations, error bars, or explicit checks for sequence overlap between pretraining peaks and fine-tuning labels. If leakage exists, the 0.14 PR-AUC lift becomes hard to interpret. The stress-test concern about systematic biases or false positives in the m6A-Atlas and GLORI annotations is also unresolved here; those databases are treated as high-confidence ground truth without visible controls for tissue or sequence-context artifacts.
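One simple form of the missing leakage check is a coordinate-overlap test between pretraining peaks and held-out evaluation windows. The sketch below is a minimal version under stated assumptions: intervals are half-open `[start, end)` in a shared coordinate system, and every name (`build_peak_index`, `overlaps_any`, `leak_fraction`) is hypothetical. It says nothing about how the authors actually split their data, and it does not catch leakage through near-identical sequences at different loci, which would need a sequence-identity filter.

```python
from bisect import bisect_right
from collections import defaultdict


def build_peak_index(peaks):
    """Merge overlapping peaks per (chrom, strand) into sorted, disjoint intervals."""
    grouped = defaultdict(list)
    for chrom, strand, start, end in peaks:
        grouped[(chrom, strand)].append((start, end))
    index = {}
    for key, intervals in grouped.items():
        intervals.sort()
        merged = [list(intervals[0])]
        for start, end in intervals[1:]:
            if start <= merged[-1][1]:
                merged[-1][1] = max(merged[-1][1], end)
            else:
                merged.append([start, end])
        index[key] = ([s for s, _ in merged], [e for _, e in merged])
    return index


def overlaps_any(index, chrom, strand, start, end):
    """True if the half-open window [start, end) overlaps any pretraining peak."""
    if (chrom, strand) not in index:
        return False
    starts, ends = index[(chrom, strand)]
    # Last merged interval that begins before the window ends.
    i = bisect_right(starts, end - 1) - 1
    return i >= 0 and ends[i] > start


def leak_fraction(index, test_windows):
    """Share of held-out windows that overlap the pretraining corpus."""
    if not test_windows:
        return 0.0
    hits = sum(overlaps_any(index, *window) for window in test_windows)
    return hits / len(test_windows)


# Toy coordinates, invented for the example.
toy_index = build_peak_index([
    ("chr1", "+", 100, 200),
    ("chr1", "+", 150, 300),
    ("chr2", "-", 50, 60),
])
toy_test = [("chr1", "+", 299, 310), ("chr1", "+", 300, 310),
            ("chr1", "-", 100, 200), ("chr2", "-", 59, 70)]
print(leak_fraction(toy_index, toy_test))
```

Reporting this fraction, or the headline metrics recomputed on the non-overlapping subset only, would let a reader judge how much of the 0.14 PR-AUC lift survives a strict split.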

This paper is for computational RNA biologists who need updated m6A site or regulator tools. A reader already working in the area could extract practical value from the pretrained weights if the methods section holds up. It deserves peer review so the data-handling and label-quality questions can be examined directly rather than left to the abstract.

Referee Report

3 major / 2 minor

Summary. The paper presents m6A-FORM, a transformer-based foundation model pretrained on ~22 million sequences derived from MeRIP-seq peaks across 143 human studies. After fine-tuning on single-nucleotide m6A annotations from m6A-Atlas v2.0 and GLORI, the m6A-FORM-sites variant reports state-of-the-art performance (PR-AUC 0.635, ROC-AUC 0.988) for m6A site prediction, with claimed improvement of at least 0.14 in PR-AUC over prior methods, faster inference, and downstream applications to 19 regulator binding sites and 19,631 tissue-conserved sites identified across 67 datasets from 24 tissues.

Significance. If the performance metrics are shown to be robust, the work offers a potentially useful large-scale pretrained model for m6A biology that could improve site prediction efficiency and enable tissue-level analyses. The scale of the pretraining corpus (~22M sequences) is a clear strength relative to prior adenosine-centered predictors.

major comments (3)

[Results] Results (m6A site prediction experiments): The headline PR-AUC of 0.635 and 0.14 improvement over baselines are reported without any description of train-test split methodology, baseline re-implementations, statistical error bars, or explicit controls for data leakage between the MeRIP-seq peak pretraining corpus and the fine-tuning labels from m6A-Atlas v2.0/GLORI; this directly undermines verification of the central SOTA claim.
[Methods] Methods (fine-tuning data curation): The model treats single-nucleotide annotations from m6A-Atlas v2.0 and GLORI as high-confidence ground truth, yet no analysis or external validation is provided for potential false-positive rates, tissue-selection biases, or sequence-context artifacts common in aggregated MeRIP/GLORI compilations; if present, these would systematically inflate both absolute metrics and the reported improvement.
[Results] Results (tissue-conserved sites analysis): The identification of 19,631 tissue-conserved sites and their downstream signatures (localization, RBP interaction, decay) inherits the same label-quality dependency as the site-prediction task; without independent orthogonal validation (e.g., mass-spec or orthogonal sequencing), the biological conclusions rest on the same unverified ground-truth assumption.

minor comments (2)

The abstract states performance numbers but the main text should include a dedicated table comparing all baselines with exact PR-AUC/ROC-AUC values, inference times, and parameter counts for reproducibility.
Notation for the foundation model variants (m6A-FORM vs. m6A-FORM-sites) is introduced without an explicit definition table or diagram showing the pretraining vs. fine-tuning stages.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for their detailed and constructive comments. We address each major comment point-by-point below, committing to revisions that add missing methodological details and explicit discussions of limitations.

Referee: [Results] Results (m6A site prediction experiments): The headline PR-AUC of 0.635 and 0.14 improvement over baselines are reported without any description of train-test split methodology, baseline re-implementations, statistical error bars, or explicit controls for data leakage between the MeRIP-seq peak pretraining corpus and the fine-tuning labels from m6A-Atlas v2.0/GLORI; this directly undermines verification of the central SOTA claim.

Authors: We agree that these details are essential for verifying the central claims. In the revised manuscript we will add a dedicated subsection describing the train-test split protocol (including sequence-identity filtering to prevent leakage between the ~22M pretraining sequences and the m6A-Atlas/GLORI fine-tuning labels), the exact re-implementation steps for each baseline, and statistical error bars obtained from multiple random seeds or cross-validation folds. revision: yes
Referee: [Methods] Methods (fine-tuning data curation): The model treats single-nucleotide annotations from m6A-Atlas v2.0 and GLORI as high-confidence ground truth, yet no analysis or external validation is provided for potential false-positive rates, tissue-selection biases, or sequence-context artifacts common in aggregated MeRIP/GLORI compilations; if present, these would systematically inflate both absolute metrics and the reported improvement.

Authors: We acknowledge that the original submission did not include an explicit analysis of label quality. In revision we will insert a new paragraph in Methods that discusses known limitations of aggregated MeRIP-seq and GLORI compilations, cites supporting literature on their false-positive characteristics, and notes potential tissue biases. A full orthogonal experimental validation lies outside the scope of this computational study. revision: partial
Referee: [Results] Results (tissue-conserved sites analysis): The identification of 19,631 tissue-conserved sites and their downstream signatures (localization, RBP interaction, decay) inherits the same label-quality dependency as the site-prediction task; without independent orthogonal validation (e.g., mass-spec or orthogonal sequencing), the biological conclusions rest on the same unverified ground-truth assumption.

Authors: We agree that the conserved-site conclusions rest on the same label assumptions. The revised manuscript will explicitly state this dependency, add a limitations paragraph, and frame the reported signatures as computational observations that motivate future orthogonal experiments. The internal consistency of the signatures (e.g., expected RBP and decay associations) provides supporting context but does not replace independent validation. revision: partial
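The first response commits to error bars from multiple random seeds or cross-validation folds. A minimal sketch of how such a summary could be reported follows: the mean and sample standard deviation of the per-seed PR-AUC gain over a baseline, plus a percentile-bootstrap interval for the mean. The numbers are toy placeholders and are not results from the paper; the function name and the choice of a percentile bootstrap are assumptions.

```python
import random
import statistics


def summarize(values, n_boot=2000, seed=0):
    """Mean, sample standard deviation, and a 95% percentile-bootstrap interval for the mean."""
    rng = random.Random(seed)
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    # Resample the per-seed values with replacement and record each resample's mean.
    boot_means = sorted(
        statistics.fmean(rng.choices(values, k=len(values)))
        for _ in range(n_boot)
    )
    lower = boot_means[int(0.025 * n_boot)]
    upper = boot_means[int(0.975 * n_boot) - 1]
    return mean, sd, (lower, upper)


# Toy placeholder PR-AUC values for five seeds; NOT taken from the paper.
model_runs = [0.52, 0.55, 0.50, 0.54, 0.53]
baseline_runs = [0.41, 0.45, 0.40, 0.42, 0.44]

# Pair runs by seed so that seed-to-seed variation cancels in the difference.
paired_gain = [m - b for m, b in zip(model_runs, baseline_runs)]

mean, sd, (lower, upper) = summarize(paired_gain)
print(f"PR-AUC gain: {mean:.3f} +/- {sd:.3f} "
      f"(95% bootstrap interval {lower:.3f} to {upper:.3f})")
```

With only a handful of seeds the bootstrap interval is crude. Its main use is to show whether the reported gain is clearly separated from zero, not to pin down its exact size.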

Circularity Check

0 steps flagged

No circularity: empirical ML pipeline with external labels and no derivation steps

The paper presents a transformer foundation model pretrained on MeRIP-seq peak sequences and fine-tuned on single-nucleotide annotations from the external m6A-Atlas v2.0 and GLORI resources. Reported metrics (PR-AUC 0.635, ROC-AUC 0.988) are standard supervised evaluation outcomes on held-out data rather than any claimed first-principles derivation. No equations, self-definitional loops, fitted-input-as-prediction steps, or load-bearing self-citations appear in the described pipeline. The central claims rest on empirical performance against independent annotations and do not reduce to the model's own inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

Abstract-only; the central performance claim rests on the unstated assumption that the cited databases and MeRIP-seq peaks are high-quality ground truth. No explicit free parameters or invented entities are named.

axioms (2)

domain assumption: MeRIP-seq peaks provide reliable methylation-enriched priors for pretraining.
Stated in the abstract as the basis for pretraining on approximately 22 million sequences.
domain assumption: m6A-Atlas v2.0 and GLORI supply accurate single-nucleotide ground truth.
Used for fine-tuning and claimed to enable state-of-the-art performance.

pith-pipeline@v0.9.1-grok · 5769 in / 1698 out tokens · 29068 ms · 2026-06-27T07:33:38.600531+00:00 · methodology


[1] [1]

Clustered

rely on highly similar experimental principles, we treated them as a single technology when counting supporting evi dence. Using these criteria, we constructed a high -confidence dataset containing 131,320 base-resolution m6A sites. Dataset preparation for m6A sites identification We collected 528,452 MeRIP -seq peaks from five human cell lines with the l...

[2] [2]

Nature, 2014

Wang, X., et al., N6-methyladenosine-dependent regulation of messenger RNA stability. Nature, 2014. 505(7481): p. 117-120

2014

[3] [3]

Nature, 2012

Dominissini, D., et al., Topology of the human and mouse m6A RNA methylomes revealed by m6A-seq. Nature, 2012. 485(7397): p. 201-206

2012

[4] [4]

Cell Genom, 2024

Fan, R., et al., A combined deep learning framework for mammalian m6A site prediction. Cell Genom, 2024. 4(12): p. 100697

2024

[5] [5]

Briefings in Functional Genomics, 2025

Huang, X., et al., m6A RNA modification pathway: orchestrating fibrotic mechanisms across multiple organs. Briefings in Functional Genomics, 2025. 24

2025

[6] [6]

Nature, 2017

Barbieri, I., et al., Promoter-bound METTL3 maintains myeloid leukaemia by m6A- dependent translation control. Nature, 2017. 552(7683): p. 126-131

2017

[7] [7]

Trends in Molecular Medicine, 2023

Liu, Y ., et al., N6-methyladenosine-mediated gene regulation and therapeutic implications. Trends in Molecular Medicine, 2023. 29(6): p. 454-467

2023

[8] [8]

Bioinformatics, 2023

Zhang, Y ., et al., Interpretable prediction models for widespread m6A RNA modification across cell lines and tissues. Bioinformatics, 2023. 39(12)

2023

[9] [9]

Bioinformatics, 2024

Ni, P ., et al., RNA m6A detection using raw current sig nals and basecalling errors from Nanopore direct RNA sequencing reads. Bioinformatics, 2024. 40(6)

2024

[10] [10]

Nature Methods, 2015

Linder, B., et al., Single-nucleotide-resolution mapping of m6A and m6Am throughout the transcriptome. Nature Methods, 2015. 12(8): p. 767-772

2015

[11] [11]

Nature Biotechnology, 2023

Liu, C., et al., Absolute quantification of single -base m6A methylation in the mammalian transcriptome using GLORI. Nature Biotechnology, 2023. 41(3): p. 355-366

2023

[12] [12]

Nucleic Acids Research, 2016

Zhou, Y ., et al., SRAMP: prediction of mammalian N6 -methyladenosine (m6A) sites based on sequence-derived features. Nucleic Acids Research, 2016. 44(10): p. e91-e91

2016

[13] [13]

Nucleic Acids Research,

Chen, K., et al., WHISTLE: a high-accuracy map of the human N6 -methyladenosine (m6A) epitranscriptome predicted using a machine learning approach. Nucleic Acids Research,

[14] [14]

RNA Biol, 2021

Li, J., et al., HSm6AP: a high-precision predictor for the Homo sapiens N6-methyladenosine (m;6 A) based on multiple weights and feature stitching. RNA Biol, 2021. 18(11): p. 1882- 1892

2021

[15] [15]

Cell Genomics, 2024

Fan, R., et al., A combined deep learning framework for mammalian m6A site prediction. Cell Genomics, 2024. 4(12)

2024

[16] [16]

Nature Communications,

Song, Z., et al., Attention-based multi-label neural networks for integrated prediction and interpretation of twelve widely occurring RNA modifications. Nature Communications,

[17] [17]

BMC Bioinformatics, 2024

Tu, G., et al., m6A-TCPred: a web server to predict tissue-conserved human m6A sites using machine learning approach. BMC Bioinformatics, 2024. 25(1): p. 127

2024

[18] [18]

Nucleic Acids Research, 2021

Xiong, Y ., et al., Modeling multi-species RNA modification through multi -task curriculum learning. Nucleic Acids Research, 2021. 49(7): p. 3719-3734

2021

[19] [19]

Signal Transduct Target Ther, 2021

Jiang, X., et al., The role of m6A modification in the biological functions and diseases. Signal Transduct Target Ther, 2021. 6(1): p. 74

2021

[20] [20]

BioRxiv, 2021: p

Chen, Y ., et al., A systematic benchmark of Nanopore long read RNA sequencing for transcript level analysis in human cell lines. BioRxiv, 2021: p. 2021.04. 21.440736

2021

[21] [21]

Bioinformatics, 2021

Ji, Y ., et al., DNABERT: pre -trained Bidirectional Encoder Representations from Transformers model for DNA-language in genome. Bioinformatics, 2021. 37(15): p. 2112- 2120

2021

[22] [22]

bioRxiv, 2026

Jo, S., et al., Systematic identification of tissue -conserved m(6)A sites reveals a stable epitranscriptomic regulatory layer controlling essential genes. bioRxiv, 2026

2026

[23] [23]

Nucleic Acids Research, 2021

Körtel, N., et al., Deep and accurate detection of m6A RNA modifications using miCLIP2 and m6Aboost machine learning. Nucleic Acids Research, 2021. 49(16): p. e92-e92

2021

[24] [24]

Nucleic Acids Research, 2023

Liang, Z., et al., m6A-Atlas v2.0: updated resources for unraveling the N6-methyladenosine (m6A) epitranscriptome among multiple species. Nucleic Acids Research, 2023. 52(D1): p. D194-D202

2023

[25] [25]

Tegowski, and K.D

Flamand, M.N., M. Tegowski, and K.D. Meyer, The Proteins of mRNA Modification: Writers, Readers, and Erasers. Annu Rev Biochem, 2023. 92: p. 145-173

2023

[26] [26]

Nucleic Acids Research, 2021

Zhao, W., et al., POSTAR3: an updated platform for exploring post -transcriptional regulation coordinated by RNA-binding proteins. Nucleic Acids Research, 2021. 50(D1): p. D287-D294

2021

[27] [27]

Wang, X. and Y . Wang. Sentence-level resampling for named entity recognition . in Proceedings of the 2022 Conference of the North American Chapter of the Association for computational linguistics: human language technologies. 2022

2022

[28] [28]

BMC Genomics, 2018

Pan, X., et al., Prediction of RNA-protein sequence and structure binding preferences using deep convolutional and recurrent neural networks. BMC Genomics, 2018. 19(1): p. 511

2018

[29] [29]

GigaScience, 2021

Uhl, M., et al., RNAProt: an efficient and feature -rich RNA binding protein binding site predictor. GigaScience, 2021. 10(8)

2021

[30] [30]

Int J Gen Med, 2025

Long, X., et al., RNA Binding Motif Protein 15 (RBM15): Structure, Function and Its Research Progress in Tumors. Int J Gen Med, 2025. 18: p. 3635-3649

2025

[31] Xiao, W., et al., Nuclear m6A Reader YTHDC1 Regulates mRNA Splicing. Molecular Cell, 2016. 61(4): p. 507-519

[32] Zaccara, S. and S.R. Jaffrey, A Unified Model for the Function of YTHDF Proteins in Regulating m6A-Modified mRNA. Cell, 2020. 181(7): p. 1582-1595.e18

[33] Boo, S.H., et al., UPF1 promotes rapid degradation of m6A-containing RNAs. Cell Reports,

[34] Huang, H., et al., Recognition of RNA N6-methyladenosine by IGF2BP proteins enhances mRNA stability and translation. Nature Cell Biology, 2018. 20(3): p. 285-295

[35] Ying, Y., et al., Co-transcriptional R-loops-mediated epigenetic regulation drives growth retardation and docetaxel chemosensitivity enhancement in advanced prostate cancer. Molecular Cancer, 2024. 23(1): p. 79

[36] Huang, H., et al., Recognition of RNA N(6)-methyladenosine by IGF2BP proteins enhances mRNA stability and translation. Nat Cell Biol, 2018. 20(3): p. 285-295

[37] Yan, H., et al., Roles and mechanisms of the m6A reader YTHDC1 in biological processes and diseases. Cell Death Discovery, 2022. 8(1): p. 237

[38] Wang, X., et al., SRSF9 promotes colorectal cancer progression via stabilizing DSN1 mRNA in an m6A-related manner. Journal of Translational Medicine, 2022. 20(1): p. 198

[39] Wang, J., et al., A positive feedback loop of SRSF9/USP22/ZEB1 promotes the progression of ovarian cancer. Cancer Biology & Therapy, 2024. 25(1): p. 2427415

[40] Ge, Z., et al., Polypyrimidine tract binding protein 1 protects mRNAs from recognition by the nonsense-mediated mRNA decay pathway. eLife, 2016. 5: p. e11155

[41] Zhang, K., et al., AGO2 Mediates MYC mRNA Stability in Hepatocellular Carcinoma. Mol Cancer Res, 2020. 18(4): p. 612-622

[42] Zhang, H., et al., Dynamic landscape and evolution of m6A methylation in human. Nucleic Acids Research, 2020. 48(11): p. 6251-6264

[43] Liu, J.e., et al., Landscape and Regulation of m6A and m6Am Methylome across Human and Mouse Tissues. Molecular Cell, 2020. 77(2): p. 426-440.e6

[44] Zhang, F., et al., Fragile X mental retardation protein modulates the stability of its m6A-marked messenger RNA targets. Human Molecular Genetics, 2018. 27(22): p. 3936-3950

[45] Chen, S., et al., fastp: an ultra-fast all-in-one FASTQ preprocessor. Bioinformatics, 2018. 34(17): p. i884-i890

[46] Martin, M., Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet.journal, 2011. 17(1): p. 10-12

[47] Kim, D., et al., Graph-based genome alignment and genotyping with HISAT2 and HISAT-genotype. Nature Biotechnology, 2019. 37(8): p. 907-915

[48] Zhou, J., et al., Comprehensive Epitranscriptome Analysis from MeRIP-seq Data with exomePeak2. Genomics, Proteomics & Bioinformatics, 2026

[49] Zhang, T.-H., et al., Understanding YTHDF2-mediated mRNA degradation by m6A-BERT-Deg. Briefings in Bioinformatics, 2024. 25(3): p. bbae170

[50] Fan, R., et al., A combined deep learning framework for mammalian m6A site prediction. Cell Genomics, 2024. 4(12): p. 100697

[51] Genovese, G., et al., BCFtools/liftover: an accurate and comprehensive tool to convert genetic variants across genome assemblies. Bioinformatics, 2024. 40(2)

[52] Zhou, J., et al., Dynamic m6A mRNA methylation directs translational control of heat shock response. Nature, 2015. 526(7574): p. 591-594

[53] Jie, Z., et al. Better Modeling of Incomplete Annotations for Named Entity Recognition

[54] Minneapolis, Minnesota: Association for Computational Linguistics

[55] Mudrakarta, P.K., et al. Did the Model Understand the Question? 2018. Melbourne, Australia: Association for Computational Linguistics

[56] Nakken, S., et al., Comprehensive interrogation of gene lists from genome-scale cancer screens with oncoEnrichR. International Journal of Cancer, 2023. 153(10): p. 1819-1828

[57] Lawrence, M., et al., Software for Computing and Annotating Genomic Ranges. PLOS Computational Biology, 2013. 9(8): p. e1003118

Fig. 1 | Overview of the m6A-FORM framework. a, Pipeline for constructing the high-confidence single-base m6A dataset. A total of 224 human MeRIP-seq datasets were processed through data preparation and peak calling, yielding 2...