Introduction
*Miscanthus*, a rhizomatous perennial grass, is a promising bioenergy crop because of its high biomass yield and stress tolerance. It is one of the few C4 plants adapted to cold conditions, which makes it a valuable genetic resource for sugarcane breeding, and its heavy metal tolerance makes it suitable for phytoremediation. *Miscanthus lutarioriparius*, endemic to the Yangtze River region of China, has the highest biomass production among the major *Miscanthus* species in China. Its high biomass, photosynthetic rate, water use efficiency, and drought and salt tolerance give it potential for paper-making and as a second-generation energy crop. The species is self-incompatible, which contributes to its genetic diversity and environmental adaptation. However, the resulting high heterozygosity of *Miscanthus* genomes has hindered sequencing and assembly, and the large genome size and abundant repetitive sequences of *M. lutarioriparius* made a high-quality assembly from short reads alone difficult. Recent advances in long-read sequencing and Hi-C technology offer a way around these problems. This study aimed to generate a chromosome-level reference genome of *M. lutarioriparius* by combining Oxford Nanopore sequencing with Hi-C, facilitating better use of *Miscanthus* genetic resources.
Literature Review
Previous research highlighted the potential of *Miscanthus* as a bioenergy crop, emphasizing its high biomass yield and stress tolerance (Heaton et al., 2010). The cold tolerance of its C4 photosynthesis has been investigated as a source of traits for improving sugarcane (Głowacka et al., 2016), and its capacity for phytoremediation of heavy metal-contaminated soil is well documented (Barbosa et al., 2015). Within the genus, *M. lutarioriparius* stands out for its exceptional biomass production (Liu et al., 2013; Yan et al., 2015; Wang et al., 2019), and studies of its photosynthetic rate and water use efficiency (Yan et al., 2015) further support its potential as a bioenergy crop. The self-incompatibility of *Miscanthus* has been noted as a factor in its genetic diversity (Heaton et al., 2010), but the lack of genomic resources has limited understanding of its genomic basis and evolutionary history. Earlier attempts to sequence *Miscanthus* genomes were hampered by high heterozygosity and large genome size. This study builds on previous work on *Miscanthus* genome sequencing and assembly (Mitros et al., 2020), aiming for a higher-quality, chromosome-scale assembly using long-read and Hi-C technologies.
Methodology
The study used a *M. lutarioriparius* plant sampled from Honghu Lake, China. High molecular weight (HMW) genomic DNA (gDNA) was extracted from young leaves using a CTAB method and purified. Nanopore sequencing libraries were prepared with the Oxford Nanopore LSK-109 kit and sequenced on the PromethION platform, generating 307.71 Gb of raw Nanopore data. Three Illumina paired-end libraries with different insert sizes were constructed and sequenced (205.74 Gb of raw data) to improve the assembly. Genome size and heterozygosity were estimated from the k-mer frequency distribution of the Illumina short reads.
Raw Oxford Nanopore reads were self-corrected using Canu. The corrected reads were assembled with SMARTdenovo, and the assembly was polished with Racon using the raw Nanopore reads and with Pilon using the Illumina data. Hi-C libraries were prepared from young leaves using a modified method, sequenced on an Illumina HiSeq4000 platform (347.76 Gb of clean data), and processed using HiC-Pro. Two Hi-C scaffolding programs, LACHESIS and 3D-DNA, were used to anchor the contigs into chromosome-length scaffolds. The quality of the final assembly was evaluated using several metrics, including BUSCO completeness, Illumina read mapping rates, and RNA-seq read mapping rates.
Repeat analysis was performed using RepeatModeler and RepeatMasker. Gene prediction combined ab initio prediction (Fgenesh, Augustus), homologous protein-based prediction (Exonerate), and RNA sequencing-aided prediction (Trinity, StringTie, PASA). EVidenceModeler (EVM) integrated these predictions, followed by quality control using MAKER. Functional annotation was performed using InterProScan, eggNOG-mapper, KEGG KOALA, PlantTFDB, and tools for non-coding RNA identification.
Genomic comparisons with sorghum were performed using MCScanX, and visualizations were created using custom scripts. The timing of the recent whole-genome duplication was estimated from Ks values calculated with KaKs_Calculator.
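The Ks-based dating mentioned above is commonly done by locating the peak of the Ks distribution for duplicated gene pairs and converting it to time with T = Ks / (2r), where r is the synonymous substitution rate per site per year. The following is a minimal sketch of that conversion, not the authors' pipeline: the function names, the simple histogram-mode peak finder, and the default rate of 6.5e-9 (a value often used for grasses) are all assumptions, since the source does not state which rate or peak-calling method was used.

```python
from collections import Counter


def ks_peak(ks_values, bin_width=0.01):
    """Return the centre of the most populated Ks bin (a crude peak estimate).

    Hypothetical helper: real analyses usually fit a density or mixture
    model instead of taking the modal histogram bin.
    """
    bins = Counter(int(ks / bin_width) for ks in ks_values if ks >= 0)
    if not bins:
        raise ValueError("no non-negative Ks values supplied")
    modal_bin = max(bins, key=bins.get)
    return (modal_bin + 0.5) * bin_width


def ks_to_mya(ks, rate=6.5e-9):
    """Convert a Ks value to divergence time in million years.

    Uses T = Ks / (2 * rate). The default rate is an assumed value,
    not one taken from the study.
    """
    if ks < 0 or rate <= 0:
        raise ValueError("ks must be >= 0 and rate must be > 0")
    return ks / (2.0 * rate) / 1e6


if __name__ == "__main__":
    # Hypothetical Ks values for duplicated gene pairs (illustration only).
    toy_ks = [0.092, 0.095, 0.097, 0.31, 0.55]
    peak = ks_peak(toy_ks)
    print(f"Ks peak ~ {peak:.3f}, estimated age ~ {ks_to_mya(peak):.2f} Ma")
```

Because the estimate scales inversely with the assumed rate, any date obtained this way is only as reliable as the rate chosen for the lineage.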
Gene duplication origins were classified using MCScanX. Phylogenetic analysis and divergence time estimation were performed using OrthoFinder, MAFFT, ModelGenerator, RAxML, and BEAST. Genetic diversity analysis was conducted using transcriptome data from 79 individuals. Gene family expansion and contraction analysis was performed using CAFE and KinFin. Chloroplast genome assembly and annotation were performed using MITObim and GeSeq, with phylogenetic analysis using MAFFT, Gblocks, IQ-TREE, MrBayes, and MEGA X. Plant disease resistance genes were identified using DRAGO 2 and HMMER. Cell-wall-biosynthesis-related gene families were identified using BLAST and HMMER. CAZymes were annotated using the dbCAN2 meta server.
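The methodology estimates genome size from the k-mer frequency distribution of the Illumina reads. The basic idea can be sketched as dividing the total number of k-mer occurrences by the depth of the main coverage peak. The example below is an illustration under simplifying assumptions, not the procedure used in the study: the function name, the `min_depth` cutoff for discarding likely error k-mers, and the toy histogram are all hypothetical, and the sketch ignores the separate half-depth peak that a highly heterozygous genome such as this one would produce.

```python
def estimate_genome_size(histogram, min_depth=5):
    """Estimate genome size from a k-mer depth histogram.

    histogram maps depth -> number of distinct k-mers seen at that depth.
    Depths below min_depth are treated as sequencing errors and ignored
    (an assumed cutoff; in practice it is chosen from the histogram shape).
    Returns (peak_depth, estimated_size_in_kmers).
    """
    usable = {d: n for d, n in histogram.items() if d >= min_depth}
    if not usable:
        raise ValueError("no histogram entries at or above min_depth")
    # Depth with the most distinct k-mers is taken as the coverage peak.
    peak_depth = max(usable, key=usable.get)
    # Total k-mer occurrences = sum over depths of depth * distinct k-mers.
    total_kmers = sum(d * n for d, n in usable.items())
    return peak_depth, total_kmers / peak_depth


if __name__ == "__main__":
    # Toy histogram: a low-depth error tail plus one coverage peak near 20x.
    toy = {1: 1000, 2: 200, 3: 50, 18: 30, 19: 60, 20: 100, 21: 60, 22: 30}
    peak, size = estimate_genome_size(toy)
    print(f"peak depth = {peak}, estimated size = {size:.0f} k-mers")
```

For a heterozygous genome, the homozygous peak rather than the tallest peak should be used, which is why dedicated model-fitting tools are generally preferred over this kind of direct calculation.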
Key Findings
This study produced a high-quality chromosome-scale assembly of the *M. lutarioriparius* genome (2.07 Gb, 96.64% complete, contig N50 of 1.71 Mb). Centromere and telomere sequences were assembled for all 19 chromosomes. The allotetraploid origin of *M. lutarioriparius* was confirmed using centromeric satellite repeats. The genome shows extensive synteny with sorghum but also several chromosomal rearrangements. Tandemly duplicated genes are enriched in functions related to stress response and cell wall biosynthesis, and gene families associated with disease resistance, cell wall biosynthesis, and metal ion transport are expanded. The recent whole-genome duplication (WGD) event in *M. lutarioriparius* is estimated to have occurred around 6.15 million years ago (Ma). Phylogenetic analysis places *M. lutarioriparius* closest to *Saccharum spontaneum*, with divergence estimated at ~7.97 Ma. *M. lutarioriparius* has more NBS-LRR genes than the related species compared (547, versus ~500 in rice, 211-346 in sorghum, 137 in maize, and 361 in *S. spontaneum*), and 42.8% of them are concentrated on chromosomes 9 and 10. It also has more CAZymes (2919) than the other species investigated. The expansion of cellulose-synthase-related gene families may contribute to its higher cellulose content, and it has more genes involved in lignin biosynthesis than sorghum and rice. The number of genes involved in C4 photosynthesis is almost double that of sorghum, primarily because of the WGD event, and both duplicates are expressed at similarly high levels in the leaf sample. Phylogenetic analysis based on chloroplast genomes revealed three distinct groups within the *Miscanthus* genus.
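Contig N50, reported above as 1.71 Mb, is the length L such that contigs of length L or longer together contain at least half of the assembled bases. A minimal sketch of the calculation follows; the function name and the toy contig lengths are illustrative and are not data from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig (or scaffold) lengths.

    Contigs are sorted from longest to shortest and accumulated until the
    running total reaches half of the total assembly length; the length of
    the contig that crosses that threshold is the N50.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("no contig lengths supplied")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


if __name__ == "__main__":
    # Toy contig lengths (illustration only).
    contigs = [80, 70, 50, 40, 30, 20]
    print("N50 =", n50(contigs))
```

A higher N50 indicates that more of the assembly sits in long contigs, but it says nothing about correctness, which is why the study also reports BUSCO and read-mapping metrics.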
Discussion
The chromosome-scale assembly of the *M. lutarioriparius* genome provides a valuable resource for understanding the genomic basis of its distinctive traits, such as high biomass, stress tolerance, and self-incompatibility. The expanded gene families related to disease resistance, cell wall biosynthesis, and metal ion transport likely contribute to its adaptations. The high number of NBS-LRR genes suggests a robust defense system against pathogens, and the CAZyme analysis provides insight into the complex carbohydrate metabolism of the species. The duplication of C4 photosynthesis genes may reflect adaptation to low-temperature environments. The findings confirm the allotetraploid origin of *M. lutarioriparius* and clarify its evolutionary relationships within the *Miscanthus* genus. This high-quality genome assembly will support further functional genomics studies, gene discovery, and marker-assisted breeding for improving bioenergy crops.
Conclusion
This study successfully generated a high-quality chromosome-scale genome assembly of *Miscanthus lutarioriparius*, revealing insights into its evolutionary history, genetic diversity, and the genomic basis of its valuable traits. The findings will greatly benefit future research on functional genomics and molecular breeding in this important bioenergy crop. Future research could focus on characterizing the functions of the expanded gene families, investigating the mechanisms underlying self-incompatibility, and exploring the potential for genetic improvement through marker-assisted selection.
Limitations
The study focused on a single *M. lutarioriparius* accession. Further analysis of multiple accessions is needed to capture the full genetic diversity of this species. Although the genome assembly is of high quality, some genomic regions might remain challenging to assemble due to their high complexity. Further investigations are required to fully elucidate the functions of genes within the expanded gene families.