Genotype error due to low-coverage sequencing induces uncertainty in polygenic scoring.

Am J Hum Genet

Bioinformatics Interdepartmental Program, University of California, Los Angeles, Los Angeles, CA 90095, USA; Department of Pathology and Laboratory Medicine, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA 90095, USA; Department of Human Genetics, David Geffen

Published: August 2023


Article Abstract

Polygenic scores (PGSs) have emerged as a standard approach to predict phenotypes from genotype data in a wide array of applications, from socio-genomics to personalized medicine. Traditional PGSs assume genotype data to be error-free, ignoring possible errors and uncertainties introduced by genotyping, sequencing, and/or imputation. In this work, we investigate the effects of genotyping error due to low-coverage sequencing on PGS estimation. We leverage SNP array and low-coverage whole-genome sequencing data (lcWGS, median coverage 0.04×) of 802 individuals from the Dana-Farber PROFILE cohort to show that PGS error correlates with sequencing depth (p = 1.2 × 10). We develop a probabilistic approach that incorporates genotype error in PGS estimation to produce well-calibrated PGS credible intervals, and we show that this approach increases classification accuracy by up to 6% compared to traditional PGSs that ignore genotyping error. Finally, we use simulations to explore the combined effect of genotyping and effect-size errors and their implications for PGS-based risk stratification. Our results illustrate the importance of considering genotyping error as a source of PGS error, especially for cohorts with varying genotyping technologies and/or low-coverage sequencing.
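The idea of propagating genotype uncertainty into a PGS can be illustrated with a minimal Python sketch. This is not the authors' implementation: it assumes that per-variant genotype posterior probabilities P(G = 0, 1, 2) are available (for example, from an imputation tool), that variants are independent, and that effect sizes are known without error. The function name `pgs_credible_interval` and all parameter names are hypothetical. Genotypes are sampled from their posteriors, a score is computed for each draw, and the empirical quantiles of those scores give an interval.

```python
import numpy as np


def pgs_credible_interval(genotype_probs, effect_sizes,
                          n_draws=2000, level=0.95, seed=0):
    """Monte Carlo interval for one individual's PGS (illustrative sketch).

    genotype_probs: array of shape (n_variants, 3) holding P(G=0), P(G=1),
        P(G=2) per variant. Assumed input, not taken from the paper.
    effect_sizes: array of shape (n_variants,), treated as error-free here.
    Returns (expected_score, lower, upper).
    """
    probs = np.asarray(genotype_probs, dtype=float)
    betas = np.asarray(effect_sizes, dtype=float)
    if probs.ndim != 2 or probs.shape[1] != 3:
        raise ValueError("genotype_probs must have shape (n_variants, 3)")
    if probs.shape[0] != betas.shape[0]:
        raise ValueError("one effect size is required per variant")

    # Make each row a proper probability distribution.
    probs = probs / probs.sum(axis=1, keepdims=True)

    # Inverse-CDF sampling: count how many of the two cumulative
    # thresholds each uniform draw exceeds, giving a genotype in {0, 1, 2}.
    rng = np.random.default_rng(seed)
    cdf = np.cumsum(probs, axis=1)[:, :2]          # (n_variants, 2)
    u = rng.random((n_draws, probs.shape[0]))      # (n_draws, n_variants)
    draws = (u[:, :, None] > cdf[None, :, :]).sum(axis=2)

    # One PGS per sampled genotype vector.
    scores = draws @ betas

    # Equal-tailed interval from the empirical quantiles of the draws.
    alpha = (1.0 - level) / 2.0
    lower, upper = np.quantile(scores, [alpha, 1.0 - alpha])

    # Expected PGS: weighted sum of expected allele counts (dosages).
    dosage = probs @ np.array([0.0, 1.0, 2.0])
    expected = float(dosage @ betas)
    return expected, float(lower), float(upper)
```

When every genotype is known with certainty, the interval collapses to a single point, which corresponds to the traditional error-free PGS; flatter genotype posteriors, as expected at very low coverage, widen it. Whether such an interval is well calibrated depends on the quality of the genotype posteriors, which this sketch takes as given.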

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10432141
DOI: http://dx.doi.org/10.1016/j.ajhg.2023.06.015
