Parsing GTF and FASTA files using the eccLib Library.
Tomasz Chady, Zuzanna Karolina Filutowska
Abstract
SUMMARY: Leveraging the Python/C API, eccLib was developed as a high-performance library designed for parsing genomic files and analysing genomic contexts. To the best of the authors' knowledge, it is the fastest Python-based solution available. With eccLib, users can efficiently parse GTF/GFFv3 and FASTA files and utilize the provided methods for additional analysis.

AVAILABILITY AND IMPLEMENTATION: This library is implemented in C and distributed under the GPL-3.0 licence. It is compatible with any system that has the Python interpreter (CPython) installed. The use of C enables numerous optimizations at both the implementation and algorithmic levels, which are either unachievable or impractical in Python.
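
To make concrete what parsing these formats involves, the following is a minimal pure-Python sketch of the kind of work a GTF and FASTA parser has to do. It is not eccLib's API and does not reflect its C implementation; the names `parse_gtf_line` and `parse_fasta` are hypothetical and introduced only for illustration. It assumes nine tab-separated GTF columns with `key "value";` attributes and makes two simplifications: attribute values containing a semicolon are not handled, and a repeated attribute key keeps only its last value.

```python
import io

# Names of the first eight GTF columns; the ninth holds the attributes.
GTF_COLUMNS = ("seqname", "source", "feature", "start", "end",
               "score", "strand", "frame")


def parse_gtf_line(line):
    """Parse one GTF data line into a dict; return None for comments or blanks."""
    line = line.rstrip("\n")
    if not line or line.startswith("#"):
        return None
    fields = line.split("\t")
    if len(fields) != 9:
        raise ValueError(f"expected 9 tab-separated fields, got {len(fields)}")
    record = dict(zip(GTF_COLUMNS, fields[:8]))
    record["start"] = int(record["start"])
    record["end"] = int(record["end"])
    # Attributes look like: gene_id "X"; gene_name "Y";
    attributes = {}
    for item in fields[8].split(";"):
        item = item.strip()
        if not item:
            continue
        key, _, value = item.partition(" ")
        attributes[key] = value.strip().strip('"')
    record["attributes"] = attributes
    return record


def parse_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA text stream."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)
```

Both functions work on any iterable of text lines, so an open file or an `io.StringIO` wrapper can be passed to `parse_fasta`. Per-line string splitting and dictionary construction of this kind is the sort of interpreter-level overhead that an implementation in C can avoid.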