KinMethyl: robust methylation detection in prokaryotic SMRT sequencing via kinetic signal modeling and deep feature integration.
Jichen Zhang, Yutaka Saito
Abstract
Motivation: Accurate detection of 5-methylcytosine (5mC) from PacBio single-molecule real-time (SMRT) sequencing remains challenging in prokaryotes because kinetic signals are weak and methylation motifs are diverse.
Results: Here, we present KinMethyl, a generalizable deep learning framework that integrates sequence and kinetic signals to improve methylation detection across diverse bacterial genomes. Central to our approach is a regression model trained on whole-genome amplified samples to estimate the expected kinetic signals of unmethylated sequences. These predicted signals are incorporated into a downstream classifier to improve performance under low signal-to-noise conditions. KinMethyl outperforms existing tools such as kineticstools and ccsmeth across multiple bacterial species, methylation motifs, and modification types, including not only 5mC but also N6-methyladenine (6mA) and N4-methylcytosine (4mC). In 5mC classification, KinMethyl improved the AUC by up to 0.20 over the existing method (from 0.6165 to 0.8190), a statistically significant difference (DeLong's test, P < 1e-10). The improvements were observed consistently in cross-species evaluations and across different sequencing platforms, including RSII and Sequel. This work highlights the utility of kinetic signal modeling and feature integration for robust and motif-independent methylation analysis in prokaryotic epigenomics.
Availability and implementation: The source code is available at https://github.com/ZhangBio/KinMethyl.
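The two-stage idea described in the Results (estimate the expected unmethylated kinetics from whole-genome amplified data, then pass that expectation to a classifier) can be illustrated with a minimal Python sketch. It is not the KinMethyl implementation: a per-context mean table stands in for the regression model, the interpulse duration (IPD) is assumed to be the kinetic signal, and all function names, k-mers, and values are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def fit_baseline(wga_sites):
    """Estimate expected unmethylated IPD per sequence context.

    wga_sites: list of (kmer, ipd) pairs from whole-genome amplified reads.
    A per-k-mer mean is an assumed stand-in for the regression model.
    """
    by_context = defaultdict(list)
    for kmer, ipd in wga_sites:
        by_context[kmer].append(ipd)
    table = {kmer: mean(vals) for kmer, vals in by_context.items()}
    fallback = mean(ipd for _, ipd in wga_sites)  # used for unseen contexts
    return table, fallback


def expected_ipd(baseline, kmer):
    """Look up the expected unmethylated IPD for a context."""
    table, fallback = baseline
    return table.get(kmer, fallback)


def classifier_features(baseline, kmer, observed_ipd):
    """Build features for a downstream classifier (hypothetical layout):
    observed signal, expected unmethylated signal, and their difference."""
    expected = expected_ipd(baseline, kmer)
    return [observed_ipd, expected, observed_ipd - expected]


# Toy data: made-up k-mers and IPD values, for illustration only.
wga = [("ACCGG", 1.4), ("ACCGG", 1.6), ("TTAGC", 0.9), ("TTAGC", 1.1)]
baseline = fit_baseline(wga)
features = classifier_features(baseline, "ACCGG", observed_ipd=2.3)
```

In this sketch, the difference between the observed and expected signal is what carries information about a possible modification; a classifier trained on such features sees the deviation from the unmethylated baseline rather than the raw signal alone.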