TY - JOUR
T1 - Assembly of the 373k gene space of the polyploid sugarcane genome reveals reservoirs of functional diversity in the world's leading biomass crop
AU - Souza, Glaucia Mendes
AU - Van Sluys, Marie Anne
AU - Lembke, Carolina Gimiliani
AU - Lee, Hayan
AU - Margarido, Gabriel Rodrigues Alves
AU - Hotta, Carlos Takeshi
AU - Gaiarsa, Jonas Weissmann
AU - Diniz, Augusto Lima
AU - Oliveira, Mauro De Medeiros
AU - Ferreira, Sávio De Siqueira
AU - Nishiyama, Milton Yutaka
AU - Ten-Caten, Felipe
AU - Ragagnin, Geovani Tolfo
AU - Andrade, Pablo De Morais
AU - De Souza, Robson Francisco
AU - Nicastro, Gianlucca Gonçalves
AU - Pandya, Ravi
AU - Kim, Changsoo
AU - Guo, Hui
AU - Durham, Alan Mitchell
AU - Carneiro, Monalisa Sampaio
AU - Zhang, Jisen
AU - Zhang, Xingtan
AU - Zhang, Qing
AU - Ming, Ray
AU - Schatz, Michael C.
AU - Davidson, Bob
AU - Paterson, Andrew H.
AU - Heckerman, David
N1 - Publisher Copyright: © The Author(s) 2019. Published by Oxford University Press.
PY - 2019/12/6
Y1 - 2019/12/6
N2 - Background: Sugarcane cultivars are polyploid interspecific hybrids of giant genomes, typically with 10-13 sets of chromosomes from 2 Saccharum species. The ploidy, hybridity, and size of the genome, estimated to have >10 Gb, pose a challenge for sequencing. Results: Here we present a gene space assembly of SP80-3280, including 373,869 putative genes and their potential regulatory regions. The alignment of single-copy genes in diploid grasses to the putative genes indicates that we could resolve 2-6 (up to 15) putative homo(eo)logs that are 99.1% identical within their coding sequences. Dissimilarities increase in their regulatory regions, and gene promoter analysis shows differences in regulatory elements within gene families that are expressed in a species-specific manner. We exemplify these differences for sucrose synthase (SuSy) and phenylalanine ammonia-lyase (PAL), 2 gene families central to carbon partitioning. SP80-3280 has particular regulatory elements involved in sucrose synthesis not found in the ancestor Saccharum spontaneum. PAL regulatory elements are found in co-expressed genes related to fiber synthesis within gene networks defined during plant growth and maturation. Comparison with sorghum reveals predominantly bi-allelic variations in sugarcane, consistent with the formation of 2 "subgenomes" after their divergence ∼3.8-4.6 million years ago and reveals single-nucleotide variants that may underlie their differences. Conclusions: This assembly represents a large step towards a whole-genome assembly of a commercial sugarcane cultivar. It includes a rich diversity of genes and homo(eo)logous resolution for a representative fraction of the gene space, relevant to improve biomass and food production.
KW - allele
KW - bioenergy
KW - biomass
KW - genome
KW - polyploid
UR - http://www.scopus.com/inward/record.url?scp=85075576111&partnerID=8YFLogxK
U2 - 10.1093/gigascience/giz129
DO - 10.1093/gigascience/giz129
M3 - Article
SN - 2047-217X
VL - 8
JO - GigaScience
JF - GigaScience
IS - 12
M1 - giz129
ER -