Publications by authors named "Geyu Zhou"

The broader application of polygenic risk score (PRS) is hindered by the limited transferability of PRS developed in Europeans to non-European populations. While many statistical methods have been developed to improve the performance of PRS in non-European populations, most of them focused on discrete genetic ancestry clusters and did not consider admixed individuals. Admixed individuals pose a unique challenge for PRS calculation due to the complexity of local ancestry and cross-ancestry effect sizes.

Methylome-wide association studies (MWASs) have identified many 5'-cytosine-phosphate-guanine-3' (CpG) sites associated with complex traits. Several methods have been developed to predict CpG methylation levels from genotypes when direct measurements of methylation are unavailable. To date, the published methods have mostly used datasets from populations of European ancestry to train methylation prediction models, which limits the generalizability of MWASs to non-European populations.

Many multi-population polygenic risk score (PRS) methods have been proposed to improve prediction accuracy in underrepresented populations; however, no single method outperforms other methods across all data scenarios. Although integrating PRS results across multiple methods and populations may lead to more accurate predictions, this approach may be limited by the availability of individual-level tuning data to calculate combination weights. In this manuscript, we introduce MIXPRS, a robust PRS integration framework based on data fission principles, to effectively combine multiple multi-population PRS methods using only genome-wide association study (GWAS) summary statistics from multiple populations.

Background: Polygenic scores (PGSs) have shown promise in predicting disease risk, but their predictive accuracy remains limited for many complex diseases. Leveraging the shared genetic architecture among correlated traits may improve prediction performance.

Methods: We developed a flexible framework for constructing multi-trait PGSs by integrating candidate PGSs (N=2,651) derived from publicly available GWAS summary statistics (N=51) using single-trait, MTAG-all, and MTAG-pairwise approaches.

Genetic risk prediction for non-European populations is hindered by limited Genome-Wide Association Study (GWAS) sample sizes and small tuning datasets. We propose JointPRS, a data-adaptive framework that leverages genetic correlations across multiple populations using GWAS summary statistics. It achieves accurate predictions without individual-level tuning data and remains effective in the presence of a small tuning set thanks to its data-adaptive approach.

Polygenic risk score has become increasingly popular for predicting the value of complex traits. In many settings, polygenic risk score is used as a covariate in regression analysis to study the association between different phenotypes. However, measurement error in polygenic risk score causes attenuation bias in the estimation of regression coefficients.
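
The attenuation effect described above can be illustrated with a small simulation. This is only a sketch under textbook assumptions (a standardized true score, independent additive Gaussian measurement error, a single covariate); it is not the correction method proposed in the article, and the function names, sample size, and effect size are hypothetical choices made for illustration.

```python
import random


def ols_slope(x, y):
    """Slope of the simple least-squares regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    return cov_xy / var_x


def simulate_attenuation(n=20000, beta=0.5, error_sd=1.0, seed=0):
    """Return the slopes estimated from the true and the noisy score."""
    rng = random.Random(seed)
    # Assumed setup: the true genetic score has variance 1.
    true_score = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # The observed score is the true score plus independent error.
    noisy_score = [g + rng.gauss(0.0, error_sd) for g in true_score]
    # The phenotype depends on the true score, not on the noisy one.
    phenotype = [beta * g + rng.gauss(0.0, 1.0) for g in true_score]
    return ols_slope(true_score, phenotype), ols_slope(noisy_score, phenotype)


slope_true, slope_noisy = simulate_attenuation()
# Classical reliability ratio: var(true) / (var(true) + var(error)).
reliability = 1.0 / (1.0 + 1.0 ** 2)
print(f"slope using true score:  {slope_true:.3f}")
print(f"slope using noisy score: {slope_noisy:.3f}")
print(f"expected shrinkage factor: {reliability:.2f}")
```

Under these assumptions the slope estimated from the noisy score shrinks toward zero by roughly the reliability ratio, so larger measurement error in the score implies a more strongly biased regression coefficient.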

Brain anatomy plays a key role in complex behaviors and mental disorders that are sexually divergent. However, our understanding of sex differences in brain anatomy remains relatively limited, particularly with regard to the underlying genetic and molecular mechanisms that contribute to these differences. We performed the largest study of sex differences in brain volumes (N = 33,208) by examining sex differences both in raw brain volumes and after controlling for whole-brain volume.

With the development of next-generation sequencing technology, de novo variants (DNVs) with deleterious effects can be identified and investigated for their effects on birth defects such as congenital heart disease (CHD). However, statistical power is still limited for such studies because of the small sample size due to the high cost of recruiting and sequencing samples and the low occurrence of DNVs. DNV analysis is further complicated by genetic heterogeneity across diseased individuals.

Polygenic scores (PGSs) are quantitative metrics for predicting phenotypic values, such as human height or disease status. Some PGS methods require only summary statistics of a relevant genome-wide association study (GWAS) for their score. One such method is Lassosum, which inherits the model selection advantages of Lasso to select a meaningful subset of the GWAS single-nucleotide polymorphisms as predictors from their association statistics.
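
The way a Lasso penalty selects a subset of predictors can be sketched with the soft-thresholding operator. The example below is a simplification and not the Lassosum algorithm: it assumes the SNPs are mutually independent (no linkage disequilibrium), in which case the Lasso solution reduces to soft-thresholding each marginal effect. The SNP labels, effect sizes, and penalty value are hypothetical.

```python
def soft_threshold(value, penalty):
    """Shrink value toward zero by penalty; return 0.0 if it falls inside the band."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def select_snps(marginal_effects, penalty):
    """Keep only SNPs whose shrunken effect is non-zero.

    Assumes independent SNPs, so each effect can be thresholded on its own.
    """
    shrunk = {snp: soft_threshold(effect, penalty)
              for snp, effect in marginal_effects.items()}
    return {snp: effect for snp, effect in shrunk.items() if effect != 0.0}


# Hypothetical marginal association estimates for three SNPs.
effects = {"snp_a": 0.30, "snp_b": -0.02, "snp_c": -0.15}
selected = select_snps(effects, penalty=0.05)
print(sorted(selected))
```

SNPs whose marginal effect is smaller in magnitude than the penalty are dropped entirely, while the remaining effects are shrunk toward zero. That combination of selection and shrinkage is the property of Lasso that the paragraph above refers to.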

Background: Models with polygenic risk scores and clinical factors to predict risk of different cancers have been developed, but these models have been limited by the polygenic risk score-derivation methods and the incomplete selection of clinical variables.

Methods: We used UK Biobank to train the best polygenic risk scores for 8 cancers (bladder, breast, colorectal, kidney, lung, ovarian, pancreatic, and prostate cancers) and select relevant clinical variables from 733 baseline traits through extreme gradient boosting (XGBoost). Combining polygenic risk scores and clinical variables, we developed Cox proportional hazards models for risk prediction in these cancers.

The disparity in genetic risk prediction accuracy between European and non-European individuals highlights a critical challenge in health inequality. To bridge this gap, we introduce JointPRS, a novel method that models multiple populations jointly to improve genetic risk predictions for non-European individuals. JointPRS has three key features.

Polygenic risk score (PRS) has become increasingly popular for predicting the value of complex traits. In many settings, PRS is used as a covariate in regression analysis to study the association between different phenotypes. However, measurement error in PRS causes attenuation bias in the estimation of regression coefficients.

Genetic prediction accuracy for non-European populations is hindered by the limited sample size of genome-wide association study (GWAS) data in these populations. Additionally, for methods that require tuning data, it is challenging to tune model parameters with a small tuning dataset, which is often all that is available for non-European samples. To address these challenges, we propose JointPRS, a novel, data-adaptive framework that simultaneously models multiple populations using GWAS summary statistics.

Background: A large proportion of pulmonary embolism (PE) heritability remains unexplained, particularly among the East Asian (EAS) population. Our study aims to expand the genetic architecture of PE and reveal more genetic determinants in Han Chinese.

Methods: We conducted the first genome-wide association study (GWAS) of PE in Han Chinese, then performed the GWAS meta-analysis based on the discovery and replication stages.

Most existing TWAS tools require individual-level eQTL reference data and thus are not applicable to summary-level reference eQTL datasets. The development of TWAS methods that can harness summary-level reference data is valuable to enable TWAS in broader settings and enhance power due to increased reference sample size. Thus, we develop a TWAS framework called OTTERS (Omnibus Transcriptome Test using Expression Reference Summary data) that adapts multiple polygenic risk score (PRS) methods to estimate eQTL weights from summary-level eQTL reference data and conducts an omnibus TWAS.

Polygenic risk score (PRS) has demonstrated great utility in biomedical research by identifying individuals at high risk of different diseases from their genotypes. However, the broader application of PRS to the general population is hindered by the limited transferability of PRS developed in Europeans to non-European populations. To improve PRS prediction accuracy in non-European populations, we develop a statistical method called SDPRX that can effectively integrate genome-wide association study summary statistics from different populations.

Genome-wide association studies (GWAS) can play an essential role in understanding the genetic basis of complex traits in plants and animals. Conventional SNP-based linear mixed models (LMM) that marginally test single nucleotide polymorphisms (SNPs) have successfully identified many loci with major and minor effects in many GWAS. In plants, the relatively small population size in GWAS and the high genetic diversity found in many plant species can impede mapping efforts on complex traits.

Although there are pronounced sex differences for psychiatric disorders, relatively little has been published on the heterogeneity of sex-specific genetic effects for these traits until very recently for adults. Much less is known about children because most psychiatric disorders will not manifest until later in life and existing studies for children on psychiatric traits such as cognitive functions are underpowered. We used results from publicly available genome-wide association studies for six psychiatric disorders and individual-level data from the Adolescent Brain Cognitive Development (ABCD) study and the UK Biobank (UKB) study to evaluate the associations between the predicted polygenic risk scores (PRS) of these six disorders and observed cognitive functions, behavioral and brain imaging traits.

Genetic prediction of complex traits has great promise for disease prevention, monitoring, and treatment. The development of accurate risk prediction models is hindered by the wide diversity of genetic architecture across different traits, limited access to individual-level data for training and parameter tuning, and the demand for computational resources. To overcome the limitations of most existing methods, which make explicit assumptions about the underlying genetic architecture and need a separate validation dataset for parameter tuning, we develop a summary statistics-based nonparametric method that does not rely on validation datasets to tune parameters.

Ring chromosomes occur when the ends of normally rod-shaped chromosomes fuse. In ring chromosome 20 (ring 20), intellectual disability and epilepsy are usually present, even if there is no deleted coding material; the mechanism by which individuals with complete ring chromosomes develop seizures and other phenotypic abnormalities is not understood. We investigated altered gene transcription as a contributing factor by performing RNA-sequencing (RNA-seq) analysis on blood from seven patients with ring 20, and 11 first-degree relatives (all parents).

To increase statistical power to identify genes associated with complex traits, a number of transcriptome-wide association study (TWAS) methods have been proposed using gene expression as a mediating trait linking genetic variations and diseases. These methods first predict expression levels based on inferred expression quantitative trait loci (eQTLs) and then identify expression-mediated genetic effects on diseases by associating phenotypes with predicted expression levels. The success of these methods critically depends on the identification of eQTLs, which may not be functional in the corresponding tissue, due to linkage disequilibrium (LD) and the correlation of gene expression between tissues.

Article Synopsis
  • A study analyzed the genomes of 290 TN patients and found an increase in harmful genetic variants linked to GABA receptor-binding genes, suggesting a disruption in pain signaling.
  • Additional findings pointed to rare mutations in sodium and calcium channels, indicating that issues with ion transport might contribute to TN symptoms.

Background: We did a phase 2 trial of pembrolizumab in patients with non-small-cell lung cancer (NSCLC) or melanoma with untreated brain metastases to determine the activity of PD-1 blockade in the CNS. Interim results were previously published, and we now report an updated analysis of the full NSCLC cohort.

Methods: This was an open-label, phase 2 study of patients from the Yale Cancer Center (CT, USA).

Purpose: Limb-girdle muscular dystrophies (LGMD) are a genetically heterogeneous category of autosomal inherited muscle diseases. Many genes causing LGMD have been identified, and clinical trials are beginning for treatment of some genetic subtypes. However, even with the gene-level mechanisms known, it is still difficult to get a robust and generalizable prevalence estimation for each subtype due to the limited amount of epidemiology data and the low incidence of LGMDs.

Colorectal cancer (CRC) is among the most frequently occurring cancers worldwide. Baicalin is isolated from the roots of Scutellaria baicalensis and is its dominant flavonoid. Anticancer activity of baicalin has been evaluated in different types of cancers, especially in CRC.
