Preprint

GTDB-Tk v2: memory friendly classification with the Genome Taxonomy Database

Abstract

The Genome Taxonomy Database (GTDB) and associated taxonomic classification toolkit (GTDB-Tk) have been widely adopted by the microbiology community. However, the growing size of the GTDB bacterial reference tree has resulted in GTDB-Tk requiring substantial amounts of memory (~320 GB), which limits its adoption and ease of use. Here we present an update to GTDB-Tk that uses a divide-and-conquer approach: user genomes are first placed into a bacterial reference tree with family-level representatives, then placed into an appropriate class-level subtree comprising species representatives. This substantially reduces the memory requirements of GTDB-Tk while having minimal impact on classification.

Availability: GTDB-Tk is implemented in Python and licensed under the GNU General Public Licence v3.0. Source code and documentation are available at https://github.com/ecogenomics/gtdbtk.

Contact: p.chaumeil@uq.edu.au or donovan.parks@gmail.com
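
The divide-and-conquer placement described in the abstract can be illustrated with a minimal Python sketch. It is not GTDB-Tk code: the `Reference` type, the `classify` function, the taxon names, and the integer "marker" sets are all hypothetical, and nearest-neighbour lookup by Jaccard similarity is only a toy stand-in for phylogenetic placement. The sketch shows the control flow only: a query is first matched against a small backbone of family-level representatives, and only the species-level subtree for the resulting class is then consulted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reference:
    """Hypothetical reference genome (not a GTDB-Tk type)."""
    name: str
    taxon_class: str      # class-level label used to pick a subtree
    markers: frozenset    # toy stand-in for marker-gene content


def similarity(a, b):
    """Jaccard similarity between two toy marker sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def nearest(query, references):
    """Toy 'placement': return the most similar reference."""
    return max(references, key=lambda ref: similarity(query, ref.markers))


def classify(query, backbone, class_subtrees):
    """Two-stage lookup: backbone first, then a single class-level subtree."""
    # Stage 1: place the query among family-level representatives.
    backbone_hit = nearest(query, backbone)
    # Stage 2: consult only the species-level subtree for that class,
    # so the other subtrees never need to be held in memory.
    subtree = class_subtrees[backbone_hit.taxon_class]
    species_hit = nearest(query, subtree)
    return backbone_hit.taxon_class, species_hit.name


# Assumed toy data: two family representatives and their class subtrees.
backbone = [
    Reference("family_A_rep", "class_X", frozenset({1, 2, 3, 4})),
    Reference("family_B_rep", "class_Y", frozenset({7, 8, 9, 10})),
]
class_subtrees = {
    "class_X": [
        Reference("species_X1", "class_X", frozenset({1, 2, 3})),
        Reference("species_X2", "class_X", frozenset({2, 3, 4, 5})),
    ],
    "class_Y": [
        Reference("species_Y1", "class_Y", frozenset({7, 8, 9})),
    ],
}

query = frozenset({2, 3, 4, 5, 6})
result = classify(query, backbone, class_subtrees)
print(result)  # ('class_X', 'species_X2')
```

In this toy setting the memory saving comes from stage 2 touching one entry of `class_subtrees` rather than every species representative at once; how GTDB-Tk actually stores and loads its subtrees is not covered by the abstract.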
