The current human reference genome is predominantly derived from a single individual and does not adequately reflect human genetic diversity. Here, we analyze 338 high-quality human assemblies from genetically divergent human populations to identify sequences missing from the human reference genome with breakpoint resolution. We identify 127,727 recurrent non-reference unique insertions spanning 18,048,877 bp, some of which disrupt exons and known regulatory elements. To improve genome annotations, we linearly integrate these sequences into the chromosomal assemblies and construct a Human Diversity Reference. Leveraging this reference, an average of 402,573 previously unmapped reads can be recovered for a given genome sequenced to ~40X coverage. Transcriptomic diversity among these non-reference sequences can also be directly assessed. We successfully map tens of thousands of previously discarded RNA-Seq reads to this reference and identify transcription evidence in 4,781 gene loci, underlining the importance of these non-reference sequences in functional genomics. Our extensive datasets are important advances toward a comprehensive reference representation of global human genetic diversity.
Download full-text PDF |
Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7599213 | PMC |
http://dx.doi.org/10.1038/s41467-020-19311-w | DOI Listing |
Mult Scler
September 2025
Department of Neurology with Friedrich Baur Institute, LMU University Hospital, LMU Munich, Munich, Germany.
Description of a patient with multiple sclerosis (MS) who underwent immunotherapy with ocrelizumab and suffered a severe course of tick-borne encephalitis (TBE): A 33-year-old man presented with acute cerebellitis with tonsillar herniation. The initial suspected diagnosis of TBE was confirmed after a significant diagnostic delay, likely caused by negative serological testing due to B-cell depletion from ocrelizumab treatment for underlying MS. TBE diagnosis was made using polymerase chain reaction (PCR) and oligo-hybrid capture metagenomic next-generation sequencing (mNGS) of cerebral spinal fluid and brain biopsy samples which yielded a near-full length TBE Virus (TBEV) genome.
Exp Appl Acarol
September 2025
Institute of Pathogens and Vectors, Yunnan Provincial Key Laboratory for Zoonosis Control and Prevention, Dali University, 22 Wanhua St, Dali, 671000, China.
The family Spinturnicidae belongs to the suborder Monogynapsida, superfamily Dermanyssoidea, and exclusively parasitizes the body surface of bats. In the present study, we determined the complete mitochondrial genome of Spinturnix psi, a species of bat mite, and subsequently conducted a comprehensive analysis of its genomic information. The mitochondrial genome of S.
Mamm Genome
September 2025
Department of Animal Health and Anatomy, Center for Animal Biotechnology and Gene Therapy, Universitat Autònoma de Barcelona, Travessera Dels Turons, 08193, Cerdanyola del Vallès, Barcelona, Spain.
The mouse remains the principal animal model for investigating human diseases due, among other reasons, to its anatomical similarities to humans. Despite its widespread use, the assumption that mouse anatomy is a fully established field with standardized and universally accepted terminology is misleading. Many phenotypic anatomical annotations do not refer to the authority or origin of the terminology used, while others inappropriately adopt outdated or human-centric nomenclature.
Sci Justice
September 2025
Department of Chemistry and Forensic Science, Eastern Kentucky University, 521 Lancaster Avenue, Richmond, KY 40475, United States.
Traditionally, when processing DNA samples, a multiple-step procedure is followed; after a sample has been collected, DNA is then extracted and quantified before a profile is generated. During the process, valuable DNA can be lost and/or consumed. When processing reference samples, where DNA is usually in abundance, DNA loss may not be a concern for the analysts.
Genomics
September 2025
Laboratory of Single Cell Analyses, Institute of Bioorganic Chemistry Polish Academy of Sciences, Zygmunta Noskowskiego str. 12/14, 61-704 Poznań, Poland.
Despite advancements in genome annotation tools, challenges persist for non-classical model organisms with limited genomic resources, such as Schmidtea mediterranea. To address these challenges, we developed a flexible and scalable genome annotation pipeline that integrates short-read (Illumina) and long-read (PacBio) sequencing technologies. The pipeline combines reference-based and de novo assembly methods, effectively handling genomic variability and alternative splicing events.