Article Abstract

The field of genome assembly exists only because sequencers cannot yet yield chromosome-level, error-free sequencing reads for all species. It consists of reconstructing the original genome sequence from sequencing reads, with a final number of fragments matching the expected number of chromosomes. This process has been facilitated by the availability of longer and more accurate reads. In the early days of genome assembly, Sanger sequencing reads Sanger et al (Proc Natl Acad Sci 74(12):5463-5467, 1977) were already used to yield initial assemblies of different species, including the first human genome assembly International Human Genome Sequencing Consortium (Nature 409(6822):860-921, 2001). The higher throughput of second-generation sequencing, often called next-generation sequencing, democratized assemblies for a wider variety of species but brought assembly difficulties: the large datasets of short reads required long computation times and large memory resources, and yielded highly fragmented assemblies, with fragment numbers far above the expected number of chromosomes Metzker (Nat Rev Genet 11(1):31-46, 2010). Third-generation sequencing introduced long reads through the technologies of Oxford Nanopore Deamer et al (Nat Biotechnol 34(5):518-524, 2016) and Pacific Biosciences (PacBio) Eid et al (Science 323(5910):133-138, 2009). Long reads can reach several tens of kilobases, and up to hundreds of thousands of base pairs; although these reads initially had low accuracy, recent developments that decreased the error rate below 1% Sereika et al (Oxford Nanopore R10.4 long-read sequencing enables near-perfect bacterial genomes from pure cultures and metagenomes without short-read or reference polishing. bioRxiv, 2021); Wenger et al (Nat Biotechnol 37(October):1155-1162, 2019) have further reduced the complexity of genome assembly.

Chromosome-level assemblies have become a standard in genome assembly publications: they can be used for synteny analysis and for finding chromosomal rearrangements, and they have more complete gene sets, a better resolution of repetitive content, and fewer contaminants. The availability of long reads and Hi-C and the decreasing cost of sequencing have brought about many high-quality insect genome assemblies, with currently 6,194 assemblies published on GenBank (accessed on 20.11.2024). Some remarkable efforts have been put toward Lepidoptera genomes, for which 188 chromosome-level assemblies were generated by the Darwin Tree of Life Wright et al (Nat Ecol Evol 8(4):777-790, 2024). At a smaller scale, a highly contiguous assembly was obtained for the ant Camponotus pennsylvanicus using Nanopore reads with a budget of only $1000.

Traditionally, genome assemblies are collapsed, meaning that sets of homologous chromosomes are represented by a single consensus sequence Guiglielmoni et al (BMC Bioinf 22(1):303, 2021). This approach is most adequate for low-heterozygosity genomes, and variants are documented a posteriori. With the advent of high-accuracy long reads, phased assemblies, in which all haplotypes are included, are becoming more common.

The assembly process typically involves multiple steps to tackle the challenges posed by the characteristics of the genome, such as ploidy, heterozygosity, and repetitiveness. Reads may need to be preprocessed to remove adapters, select the longest and/or most accurate reads, or correct them to further improve accuracy. The most essential part lies, of course, in the assembly step. Reads are overlapped to build an assembly graph, and ad hoc algorithms are applied to find a path giving the most faithful representation of the genome. Assemblers yield a set of contiguous sequences, or contigs. The output should then be evaluated to decide whether the assembly has reached the highest contiguity, completeness, and correctness, and if not, which steps should be performed subsequently to increase quality.
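
As an illustration of the contiguity evaluation mentioned above, the sketch below computes N50, a widely used contiguity statistic: the length of the contig at which the cumulative length, counted from the longest contig down, reaches half of the total assembly size. The abstract does not name a specific metric, so the choice of N50, the function name n50, and the contig lengths are assumptions made for illustration only.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    N50 is the length L such that contigs of length >= L together
    cover at least half of the total assembly length.
    """
    if not lengths:
        raise ValueError("no contig lengths given")
    total = sum(lengths)
    running = 0
    # Walk from the longest contig to the shortest, accumulating length.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical contig lengths (arbitrary units), not taken from the article.
contig_lengths = [80, 70, 50, 40, 30, 20, 10]
print("contigs:", len(contig_lengths), "N50:", n50(contig_lengths))
```

A higher N50 together with a contig count closer to the expected number of chromosomes suggests a more contiguous assembly; completeness and correctness still need to be assessed separately.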

Source
http://dx.doi.org/10.1007/978-1-0716-4583-3_1