GFSeeker: a splicing-graph-based approach for accurate gene fusion detection from long-read RNA sequencing data.
Bingyan Wang, Heng Hu, Runtian Gao, Guohua Wang, Tao Jiang
Abstract
Gene fusions are critical oncogenic drivers and therapeutic targets in diverse cancers. Long-read ribonucleic acid sequencing (RNA-seq) offers an unprecedented opportunity to resolve the full-length structure of fusion isoforms, but its high intrinsic error rate makes it difficult to identify true fusion events precisely. Here, we developed GFSeeker, a splicing-graph-based computational framework for accurate gene fusion detection from long-read RNA-seq data. GFSeeker combines a splicing graph reference with dual re-alignment validation to overcome the noise introduced by high sequencing error rates. Benchmarking across simulated, non-tumor, and cancer cell line datasets demonstrated GFSeeker's state-of-the-art performance, with F1 scores 6%-15% higher than those of existing methods. Notably, GFSeeker identified the known fusion event MATN2-POP1 in the MCF-7 cancer cell line, which other tools missed, highlighting its sensitivity in resolving complex fusion events. These results validate GFSeeker as a powerful and reliable tool for gene fusion discovery with significant potential to advance cancer research and precision diagnostics.