Sequencing technologies have developed rapidly in recent years, leading to breakthroughs in sequencing-based clinical diagnosis, but an accurate and complete genome variation benchmark is required for further assessment of precision medicine applications. Although the human cell line NA12878 has been successfully developed into a variation benchmark, a population-specific variation benchmark is still lacking. Here, we established an Asian human variation benchmark by constructing and sequencing a stabilized cell line from a Chinese Han volunteer. Using seven different sequencing strategies, we obtained ~3.88 Tb of clean data from different laboratories, aiming for high sequencing depth and accurate variation detection. By combining the variations identified with different sequencing strategies and different analysis pipelines, we identified 3.35 million SNVs and 348.65 thousand indels that were well supported by our sequencing data and passed our strict quality control, and which should therefore serve as a high-confidence variation benchmark. In addition, we detected 5,913 high-quality SNVs, 969 of which were novel, located in highly homologous regions and supported by long-range information in both the co-barcoding single tube Long Fragment Read (stLFR) data and the PacBio HiFi CCS data. Furthermore, using the long-read data (stLFR and HiFi CCS), we were able to phase more than 99% of heterozygous SNVs, which brings the benchmark to the haplotype level. Our study provides comprehensive sequencing data as well as an integrated variation benchmark for an Asian-derived cell line, which should be valuable for future sequencing-based clinical development.
| Download full-text PDF | Source |
|---|---|
| http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7300012 | PMC |
| http://dx.doi.org/10.1038/s41598-020-66605-6 | DOI Listing |
Cell Rep Methods
August 2025
Department of Biomedical Engineering and Computational Biology Program, OHSU, Portland, OR, USA; Knight Cancer Institute, OHSU, Portland, OR, USA.
We present UniFORM, a non-parametric, Python-based pipeline for normalizing multiplex tissue imaging (MTI) data at both the feature and pixel levels. UniFORM employs an automated rigid landmark registration method tailored to the distributional characteristics of MTI; it operates without prior distributional assumptions and handles both unimodal and bimodal patterns. By aligning the biologically invariant negative populations, UniFORM removes technical variation while preserving tissue-specific expression patterns in positive populations.
Mol Ecol Resour
September 2025
Centre for Evolutionary Hologenomics (CEH), Globe Institute, University of Copenhagen, Copenhagen, Denmark.
Global efforts to standardise methodologies benefit greatly from open-source procedures that enable the generation of comparable data. Here, we present a modular, high-throughput nucleic acid extraction protocol standardised within the Earth Hologenome Initiative to generate both genomic and microbial metagenomic data from faecal samples of vertebrates. The procedure enables the purification of either RNA and DNA in separate fractions (DREX1) or as total nucleic acids (DREX2).
Palliat Med Rep
May 2025
Palliative Care Outcomes Collaboration, University of Wollongong, Wollongong, Australia.
Background: The Palliative Care Outcomes Collaboration (PCOC), established in 2005 and funded by the Australian Government, is a national quality improvement initiative that integrates patient outcome measures into routine clinical practice. While PCOC supports services to improve patient care, implementation across diverse clinical settings presents challenges, with variation observed between similarly resourced services. Engaging services in continuous quality improvement proves difficult as the program grows.
Bioinform Biol Insights
September 2025
School of Computer Science and Mathematics, Kingston University, London, UK.
Interpreting the effects of variants within the human genome and proteome is essential for analysing disease risk, predicting medication response, and developing personalised health interventions. Due to the intrinsic similarities between the structure of natural languages and genetic sequences, natural language processing techniques have demonstrated great applicability in computational variant effect prediction. In particular, the advent of the Transformer has led to significant advancements in the field.
J Dent
September 2025
Department of Endodontics, Recep Tayyip Erdogan University, Turkey.
Objectives: To assess patterns across 21 countries in dentists' thresholds for initiating operative treatment of active non-cavitated carious lesions and to evaluate the influence of caries risk, clinician characteristics, and geographic variation on decision-making in accordance with current guidelines.
Methods: A cross-sectional, vignette-style web-based survey was conducted between June and October 2023 across 21 countries. A standardized questionnaire, comprising theoretical radiographic scenarios of occlusal and approximal active non-cavitated carious lesions at four progressive stages (E1, E2, EDJ, D1), was distributed to general dentists and specialists.