Cocoa Genome Hub

Access the cocoa Criollo Genome Version 2 with 99% of the genes anchored to the 10 chromosomes

The Theobroma cacao Criollo genome version 2 was released in January 2017. It results from efforts to significantly improve the Belizean Criollo B97-61/B2 genome.

In this work, we used an NGS-based approach to significantly improve the assembly of the Belizean Criollo B97-61/B2 genome. We combined four Illumina large-insert mate-pair libraries with 52x coverage of Pacific Biosciences long reads to correct misassembled regions and reduce the number of scaffolds to 554 (from 4,792 in assembly V1), while increasing the N50 from 0.47 Mb to 6.5 Mb. 96.7% of the assembly was anchored to the 10 chromosomes, compared with 66.8% previously. Unknown sites (Ns) were reduced from 10.8% to 5.7%. Moreover, the NCBI Eukaryotic Genome Annotation Pipeline produced a new RefSeq structural annotation based on RNA-seq evidence.
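For readers unfamiliar with the N50 statistic quoted above, the following minimal Python sketch shows how it is commonly computed: scaffolds are sorted from longest to shortest, and the N50 is the length of the scaffold at which the running total first reaches half of the assembly size. The function name `n50` and the toy lengths are illustrative assumptions, not the actual Criollo scaffold lengths or the tooling used for this assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold lengths.

    N50 is the length L such that scaffolds of length >= L
    together cover at least half of the total assembly length.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    # Walk scaffolds from longest to shortest, accumulating length.
    for length in sorted(lengths, reverse=True):
        running += length
        # Stop at the first scaffold that brings us to half the total.
        if 2 * running >= total:
            return length


# Toy, made-up scaffold lengths (hypothetical, for illustration only).
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_lengths))  # 70: 80 + 70 = 150, which is half of 300
```

A higher N50 with fewer scaffolds, as reported for version 2, indicates that more of the assembly is contained in long contiguous sequences.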

The release of the Theobroma cacao Criollo genome version 2 provides a valuable resource for investigating complex traits at the genomic level and is an important step toward future comparative genomics and genetic studies of cocoa.