Section: New Software and Platforms
GATB-Core
Genome Assembly and Analysis Tool Box
Keywords: Bioinformatics - NGS - Genomics - Genome assembling
Functional Description: The GATB-Core library aims to simplify the design of NGS algorithms. It offers a set of optimized, high-level building blocks that speed up the development of NGS tools for genome assembly and/or genome analysis. The underlying data structure is the de Bruijn graph, and the general parallelism model is multithreading. The GATB library targets standard computing resources such as current multicore processors (laptop computer, small server) with a few GB of memory. Using the high-level API, developers of NGS tools can rapidly build their own software on top of state-of-the-art algorithms and data structures from the domain. The GATB-Core library is written in C++.
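To make the role of the de Bruijn graph concrete, the following is a minimal, self-contained C++ sketch. It does not use the GATB-Core API: the names KmerSet, collect_kmers and successors are hypothetical and chosen for illustration only. The sketch assumes that nodes are the distinct k-mers found in the reads and that an edge links two k-mers overlapping by k-1 nucleotides. It ignores reverse complements, sequencing errors, and the compact, multithreaded representation that a real library would use.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical illustration type, not part of GATB-Core:
// the node set of the graph is simply the set of distinct k-mers.
using KmerSet = std::unordered_set<std::string>;

// Collect every distinct k-mer (substring of length k) found in the reads.
KmerSet collect_kmers(const std::vector<std::string>& reads, std::size_t k) {
    assert(k > 0);
    KmerSet kmers;
    for (const std::string& read : reads) {
        // Reads shorter than k contain no k-mer and are skipped by the loop condition.
        for (std::size_t i = 0; i + k <= read.size(); ++i) {
            kmers.insert(read.substr(i, k));
        }
    }
    return kmers;
}

// Edges are implicit: a successor of `node` is obtained by dropping its first
// nucleotide and appending one of A, C, G, T, provided the result is a node.
std::vector<std::string> successors(const KmerSet& kmers, const std::string& node) {
    std::vector<std::string> out;
    for (char c : std::string("ACGT")) {
        std::string next = node.substr(1) + c;
        if (kmers.count(next) > 0) {
            out.push_back(next);
        }
    }
    return out;
}
```

With the single read "ACGTAC" and k = 3, collect_kmers yields the four nodes ACG, CGT, GTA and TAC, and successors reports CGT as the only successor of ACG. Storing only the node set and recomputing edges on demand is one way to keep memory use low, which matters on machines with a few GB of memory.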
Release Functional Description: This release brings a 2x to 4x speed-up of the k-mer counting and graph construction phases (optimizations based on minimizers and improved Bloom filters). GATB's k-mer counter has been improved using techniques from KMC2 and now achieves running times competitive with KMC2. Arbitrary information can now be stored for each k-mer of the graph, enabled by a minimal perfect hash function that costs only 2.61 bits of memory per k-mer. The API has been improved with new possibilities for bank and k-mer management. Many new code snippets show how to use the library.
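The minimizers mentioned for this release can be illustrated with a short C++ sketch. It is not the GATB-Core or KMC2 implementation: the functions minimizer and partition_by_minimizer are hypothetical names, and the sketch assumes the simplest common definition, in which the minimizer of a k-mer is its lexicographically smallest substring of length m. Production k-mer counters typically use refined orderings and compact integer encodings instead of strings.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Return the lexicographically smallest m-mer contained in `kmer`.
// Assumed definition for illustration; real tools may use another ordering.
std::string minimizer(const std::string& kmer, std::size_t m) {
    assert(m > 0 && m <= kmer.size());
    std::string best = kmer.substr(0, m);
    for (std::size_t i = 1; i + m <= kmer.size(); ++i) {
        // Compare the m-mer starting at position i with the current best.
        if (kmer.compare(i, m, best) < 0) {
            best = kmer.substr(i, m);
        }
    }
    return best;
}

// Group the k-mers of a read into buckets keyed by their minimizer.
// Consecutive k-mers often share a minimizer, so they tend to land in the
// same bucket, and each bucket can then be counted independently.
std::map<std::string, std::vector<std::string>>
partition_by_minimizer(const std::string& read, std::size_t k, std::size_t m) {
    assert(m > 0 && m <= k);
    std::map<std::string, std::vector<std::string>> buckets;
    for (std::size_t i = 0; i + k <= read.size(); ++i) {
        std::string kmer = read.substr(i, k);
        buckets[minimizer(kmer, m)].push_back(kmer);
    }
    return buckets;
}
```

For the read "GATTACA" with k = 5 and m = 3, the k-mers GATTA and ATTAC share the minimizer ATT, while TTACA has the minimizer ACA. Partitioning of this kind lets each bucket be processed separately, for instance by a different thread or in a bounded amount of memory.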