Publications by authors named "Iain S Forrest"

Accurate variant penetrance estimation is crucial for precision medicine. We constructed machine learning (ML) models for 10 diseases using 1,347,298 participants with electronic health records, then applied them to an independent cohort with linked exome data. Resulting probabilities were used to evaluate ML penetrance of 1648 rare variants in 31 autosomal dominant disease-predisposition genes.

We evaluated whether predicted continuous disease representations could enhance genetic discovery beyond case-control genome-wide association study (GWAS) phenotypes across eight complex diseases in up to 485,448 UK Biobank participants. Predicted phenotypes had high genetic correlations with case-control phenotypes (median r = 0.66) but identified more independent associations (median 306 versus 125).

Understanding the disease risk of genetic variants is fundamental to precision medicine. Estimates of penetrance (the probability of disease for individuals with a variant allele) rely on disease-specific cohorts, clinical testing and emerging electronic health record (EHR)-linked biobanks. These data sources, while valuable, each have limitations in quality, representativeness and analyzability.
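
The definition of penetrance given above can be illustrated with a small calculation. The following is a minimal sketch, not the method used in the article: it assumes penetrance is estimated as the simple proportion of variant carriers who have the disease, and it attaches a Wilson score interval as one common way to express uncertainty for small carrier counts. The function name and the example counts are hypothetical.

```python
from math import sqrt


def wilson_penetrance(carriers_with_disease, carriers_total, z=1.96):
    """Estimate penetrance as a proportion with a Wilson score interval.

    Assumption (not from the source): penetrance is approximated by the
    fraction of carriers who are affected. z=1.96 gives a ~95% interval.
    Returns (estimate, lower, upper).
    """
    if carriers_total <= 0:
        raise ValueError("carriers_total must be positive")
    if not 0 <= carriers_with_disease <= carriers_total:
        raise ValueError("carriers_with_disease must be between 0 and carriers_total")

    n = carriers_total
    p = carriers_with_disease / n

    # Wilson score interval: better behaved than the normal approximation
    # when the number of carriers is small or p is near 0 or 1.
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half_width = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom

    return p, max(0.0, centre - half_width), min(1.0, centre + half_width)


# Hypothetical example: 12 affected individuals among 40 carriers.
estimate, lower, upper = wilson_penetrance(12, 40)
print(f"penetrance ~ {estimate:.2f} (interval {lower:.2f} to {upper:.2f})")
```

For rare variants the carrier count is often small, so the interval is usually more informative than the point estimate alone.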

Background And Aims: An in silico quantitative score of coronary artery disease (ISCAD), built using machine learning and clinical data from electronic health records, has been shown to result in gradations of risk of subclinical atherosclerosis, coronary artery disease (CAD) sequelae, and mortality. Large-scale metabolite biomarker profiling provides increased portability and objectivity in machine learning for disease prediction and gradation. However, these models have not been fully leveraged.

Article Synopsis
  • Mode of inheritance (MOI) is crucial for understanding pathogenic variants, yet most variants lack this information, particularly impacting recessive diseases.
  • MOI-Pred and ConMOI are new tools developed to predict variant pathogenicity by incorporating MOI, with MOI-Pred focusing on both dominant and recessive variants through evolutionary and functional data.
  • Both tools have shown high accuracy in benchmarks and real-world evaluations, with ConMOI outperforming individual methods, underscoring the benefits of using a consensus approach for variant predictions.
Article Synopsis
  • Coronary artery disease (CAD) involves a mix of risk factors and processes, and a new machine learning-based score can help track its progression and severity.
  • Researchers tested this score against rare gene variants in different biobanks and found significant associations in 17 genes, 14 of which had prior evidence linking them to CAD.
  • The study suggests that there are likely more ultrarare gene variants associated with CAD, highlighting how digital tools can improve genetic research in complex diseases.

Background: Diet is a key modifiable risk factor of coronary artery disease (CAD). However, the causal effects of specific dietary traits on CAD risk remain unclear. With the expansion of dietary data in population biobanks, Mendelian randomization (MR) could help enable the efficient estimation of causality in diet-disease associations.

Article Synopsis
  • Population-based genomic screening helps identify individuals at risk for diseases by analyzing their genetic variants alongside their health records.
  • In a study of over 29,000 participants, researchers found 614 individuals with significant genetic variants, but 76% of these cases had no prior clinical diagnosis.
  • The findings suggest that genomic screening may uncover previously undiagnosed conditions, showing a higher prevalence of harmful genetic variants than clinical diagnoses and illustrating the importance of genetic testing in identifying untreated diseases.

Studies have shown that drug targets with human genetic support are more likely to succeed in clinical trials. Hence, a tool integrating genetic evidence to prioritize drug target genes is beneficial for drug discovery. We built a genetic priority score (GPS) by integrating eight genetic features with drug indications from the Open Targets and SIDER databases.

Background: Lyme disease is the most prevalent vector-borne disease in the US, yet its host factors are poorly understood and diagnostic tests are limited. We evaluated patients in a large health system to uncover cholesterol's role in the susceptibility, severity, and machine learning-based diagnosis of Lyme disease.

Methods: A longitudinal health system cohort comprised 1 019 175 individuals with electronic health record data and 50 329 with linked genetic data.

Systemic autoimmune rheumatic diseases (SARDs) can lead to irreversible damage if left untreated, yet these patients often endure long diagnostic journeys before being diagnosed and treated. Machine learning may help overcome the challenges of diagnosing SARDs and inform clinical decision-making. Here, we developed and tested a machine learning model to identify patients who should receive rheumatological evaluation for SARDs using longitudinal electronic health records of 161,584 individuals from two institutions.

Background: Causality between plasma triglyceride (TG) levels and atherosclerotic cardiovascular disease (ASCVD) risk remains controversial despite more than four decades of study and two recent landmark trials, STRENGTH and REDUCE-IT. Also unclear is the association between TG levels and non-atherosclerotic diseases across organ systems.

Methods: Here, we conducted a phenome-wide, two-sample Mendelian randomization (MR) analysis using inverse-variance weighted (IVW) regression to systematically infer the causal effects of plasma TG levels on 2600 disease traits in the European ancestry population of UK Biobank.
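
The inverse-variance weighted (IVW) estimator named in the Methods can be sketched briefly. This is a minimal illustration of the standard fixed-effect IVW formula for two-sample MR, not the authors' pipeline: it assumes per-variant effect estimates on the exposure and on the outcome, with standard errors for the outcome effects, and it ignores heterogeneity and pleiotropy checks. The function name and the numbers in the example are hypothetical.

```python
from math import sqrt


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW causal estimate for two-sample MR.

    beta_exposure: per-variant effects on the exposure (e.g. TG levels)
    beta_outcome:  per-variant effects on the outcome (e.g. a disease trait)
    se_outcome:    standard errors of the outcome effects
    Returns (causal_estimate, standard_error).
    """
    if not beta_exposure or not (
        len(beta_exposure) == len(beta_outcome) == len(se_outcome)
    ):
        raise ValueError("inputs must be non-empty and of equal length")

    # Each variant is weighted by the inverse variance of its outcome effect.
    weights = [1.0 / se**2 for se in se_outcome]

    numerator = sum(w * bx * by for w, bx, by in zip(weights, beta_exposure, beta_outcome))
    denominator = sum(w * bx * bx for w, bx in zip(weights, beta_exposure))
    if denominator == 0:
        raise ValueError("exposure effects are all zero; estimate is undefined")

    return numerator / denominator, sqrt(1.0 / denominator)


# Hypothetical summary statistics for three independent variants.
estimate, se = ivw_estimate(
    beta_exposure=[0.10, 0.20, 0.30],
    beta_outcome=[0.05, 0.10, 0.15],
    se_outcome=[0.01, 0.02, 0.01],
)
print(f"IVW estimate = {estimate:.3f}, SE = {se:.4f}")
```

In a phenome-wide setting the same calculation would be repeated once per disease trait, with multiple-testing correction applied across traits.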

Background: Binary diagnosis of coronary artery disease does not preserve the complexity of the disease or quantify its severity or the associated risk of death; hence, a quantitative marker of coronary artery disease is warranted. We evaluated a quantitative marker of coronary artery disease derived from probabilities of a machine learning model.

Methods: In this cohort study, we developed and validated a coronary artery disease-predictive machine learning model using 95 935 electronic health records and assessed its probabilities as in-silico scores for coronary artery disease (ISCAD; range 0 [lowest probability] to 1 [highest probability]) in participants in two longitudinal biobank cohorts.

Phenome-wide association studies identified numerous loci associated with traits and diseases. To help interpret these associations, we constructed a phenome-wide network map of colocalized genes and phenotypes. We generated colocalized signals using the Genotype-Tissue Expression data and genome-wide association results in UK Biobank.

Genetic risk for coronary artery disease (CAD) is commonly measured with polygenic risk scores (PRS); yet, the relationship of atherosclerotic burden with PRS in healthy individuals not at high clinical risk for CAD (ie, without a high pooled cohort equations [PCE] score) is unknown. Here, we implemented a novel recall-by-PRS strategy to measure coronary artery calcium (CAC) scores prospectively in 53 healthy individuals with extreme high PRS (median [IQR] PRS = 94% [83-98]) and low PRS (median [IQR] PRS = 3.6% [1.

Background: Clinical features from electronic health records (EHRs) can be used to build a complementary tool to predict coronary artery disease (CAD) susceptibility.

Objectives: The purpose of this study was to determine whether an EHR score can improve CAD prediction and reclassification 1 year before diagnosis, beyond conventional clinical guidelines as determined by the pooled cohort equations (PCE) and a polygenic risk score for CAD.

Methods: We applied a machine learning framework using clinical features from the EHR in a multiethnic, clinical care cohort (BioMe) comprising 555 CAD cases and 6,349 control subjects and in a population-based cohort (UK Biobank) comprising 3,130 CAD cases and 378,344 control subjects for external validation.

Aims: Individuals with supranormal left ventricular ejection fraction (snLVEF; LVEF >70%) have increased mortality. However, the genetic and phenotypic profile of snLVEF remains unknown. This study aimed to determine the relationship of both snLVEF genetic risk and phenotype with survival and underdiagnosed heart failure (HF).

Importance: Population-based assessment of disease risk associated with gene variants informs clinical decisions and risk stratification approaches.

Objective: To evaluate the population-based disease risk of clinical variants in known disease predisposition genes.

Design, Setting, And Participants: This cohort study included 72 434 individuals with 37 780 clinical variants who were enrolled in the BioMe Biobank from 2007 onwards with follow-up until December 2020 and the UK Biobank from 2006 to 2010 with follow-up until June 2020.

Background: Despite advances in cardiovascular disease and risk factor management, mortality from ischemic heart failure (HF) in patients with coronary artery disease (CAD) remains high. Given the partial role of genetics in HF and lack of reliable risk stratification tools, we developed and validated a polygenic risk score for HF in patients with CAD, which we term HF-PRS.

Methods and Results: Using summary statistics from a recent genome-wide association study for HF, we developed candidate PRSs in the Mount Sinai BioMe CAD patient cohort (N=6274) by using the pruning and thresholding method and LDPred.
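
The thresholding half of the pruning and thresholding method mentioned above can be sketched as follows. This is a simplified illustration under stated assumptions, not the HF-PRS implementation: it assumes the variant list has already been pruned to approximately independent variants (the linkage disequilibrium pruning step is not shown), and it computes the score as a weighted sum of allele dosages for variants passing a p-value threshold. The function name, variant identifiers, weights, and thresholds are all hypothetical.

```python
def thresholded_prs(dosages, weights, p_threshold=5e-8):
    """Polygenic risk score from pre-pruned GWAS summary statistics.

    dosages: mapping of variant id -> effect-allele dosage (0 to 2) for one person
    weights: mapping of variant id -> (effect size, GWAS p-value)
    Only variants with p-value <= p_threshold contribute to the score.
    Assumption: variants in `weights` were already LD-pruned upstream.
    """
    score = 0.0
    for variant, (beta, p_value) in weights.items():
        # Skip variants that fail the threshold or are missing for this person.
        if p_value <= p_threshold and variant in dosages:
            score += beta * dosages[variant]
    return score


# Hypothetical summary statistics and one individual's dosages.
weights = {"rsA": (0.20, 1e-9), "rsB": (-0.10, 0.03), "rsC": (0.05, 1e-4)}
dosages = {"rsA": 2, "rsB": 1, "rsC": 0}

# Candidate scores at several thresholds; in practice the best-performing
# threshold would be chosen in a separate validation set.
for threshold in (5e-8, 1e-3, 0.05):
    print(threshold, round(thresholded_prs(dosages, weights, threshold), 3))
```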

Purpose: Limited availability of mechanical ventilators (MV) during the coronavirus disease 2019 (COVID-19) pandemic has led to the use of non-invasive ventilation (NIV) in hypoxemic patients, a practice that has not been well studied. We aimed to assess the association of NIV versus MV with mortality and morbidity during respiratory intervention among hypoxemic patients admitted with COVID-19.

Methods: We performed a retrospective multi-center cohort study across 5 hospitals during March-April 2020.

Biobanks with exomes linked to electronic health records (EHRs) enable the study of genetic pleiotropy between rare variants and seemingly disparate diseases. We performed robust clinical phenotyping of rare, putatively deleterious variants (loss-of-function [LoF] and deleterious missense variants) in ERCC6, a gene implicated in inherited retinal disease. We analyzed 213,084 exomes, along with a targeted set of retinal, cardiac, and immune phenotypes from two large-scale EHR-linked biobanks.

Diabetic retinopathy (DR) is a common consequence of type 2 diabetes (T2D) and a leading cause of blindness in working-age adults. Yet, its genetic predisposition is largely unknown. Here, we examined the polygenic architecture underlying DR by deriving and assessing a genome-wide polygenic risk score (PRS) for DR.

Adverse side effects often account for the failure of drug clinical trials. We evaluated whether a phenome-wide association study (PheWAS) of 1167 phenotypes in >360,000 U.K.
