Publications by authors named "Hufeng Zhou"

Lung cancer is the leading cause of cancer mortality. To investigate genetic determinants of prognosis among patients diagnosed with early-stage non-small cell lung cancer (NSCLC), we conducted the first large-scale genome-wide association prognostic study using data from the International Lung Cancer Consortium (ILCCO) through a two-phase analysis. Phase 1 comprises a discovery genome-wide association analysis, using a multivariable Cox proportional hazards (PH) model on 3428 NSCLC patients of European ancestry from 10 ILCCO participating studies to identify genetic variants associated with overall survival, and a validation analysis of genome-wide significant variants (P-value ≤ 5 × 10⁻⁸) using the Cancer Genome Atlas (TCGA).


Study Objectives: Excessive daytime sleepiness (EDS), influenced by environmental and social-behavioral factors, is reported by a subset of patients with sleep apnea - a group that may be at elevated cardiovascular risk. However, it is unclear whether sleep apnea with and without EDS have distinct genetic underpinnings. In this study, we perform gene-by-EDS interaction analyses for apnea hypopnea index (AHI), a diagnostic marker of sleep apnea severity, to understand EDS's influence on its underlying genetic risk.


While international efforts have characterized genetic variation in millions of individuals, the interplay of environmental, social, cultural, and genetic factors is poorly understood for most worldwide populations. The province of Quebec in Canada has been the site of numerous genetic studies, often focusing on individual Mendelian diseases in founder sub-populations. Here, we profiled and analyzed genome-wide genotyped variation in 29,337 Quebec residents from the large population-based cohort CARTaGENE (CaG), including rich phenotype and environmental data.


Large-scale whole-genome sequencing (WGS) studies have improved our understanding of the contributions of coding and noncoding rare variants to complex human traits. Leveraging association effect sizes across multiple traits in WGS rare variant association analysis can improve statistical power over single-trait analysis, and also detect pleiotropic genes and regions. Existing multi-trait methods have limited ability to perform rare variant analysis of large-scale WGS data.

Article Synopsis
  • This study is the first large-scale analysis examining the relationship between EDS and genetic variations related to OSA severity, using data from over 11,500 samples across diverse populations.
  • Researchers identified 16 genetic targets linked to EDS and OSA, with eight being new discoveries, and discussed potential therapeutic implications involving insulin resistance and nutritional factors for patients with OSA and EDS.

Motivation: Functional Annotation of genomic Variants Online Resources (FAVOR) offers multi-faceted, whole genome variant functional annotations, which are essential for Whole Genome and Exome Sequencing (WGS/WES) analysis and the functional prioritization of disease-associated variants. A versatile chatbot is needed to facilitate informative interpretation and interactive, user-centric summaries of the whole genome variant functional annotation data in the FAVOR database.

Results: We have developed FAVOR-GPT, a generative natural language interface powered by integrating large language models (LLMs) and FAVOR.

Article Synopsis
  • The study investigates how rare non-coding genetic variations affect complex traits, specifically focusing on human height by analyzing data from over 333,100 individuals across three large datasets.
  • Researchers found 29 significant rare variants linked to height, with impacts ranging from a decrease of 7 cm to an increase of 4.7 cm, after considering previously known variants.
  • The team also identified specific non-coding variants near key genes associated with height, demonstrating a new method for understanding the effects of rare variants in regulatory regions using whole-genome sequencing.

Large-scale, multi-ethnic whole-genome sequencing (WGS) studies, such as the National Human Genome Research Institute Genome Sequencing Program's Centers for Common Disease Genomics (CCDG), play an important role in increasing diversity for genetic research. Before performing association analyses, assessing Hardy-Weinberg equilibrium (HWE) is a crucial step in quality control procedures to remove low quality variants and ensure valid downstream analyses. Diverse WGS studies contain ancestrally heterogeneous samples; however, commonly used HWE methods assume that the samples are homogeneous.
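
To illustrate why ancestral heterogeneity matters for HWE-based quality control, here is a minimal sketch of the classical one-degree-of-freedom chi-square HWE test, which assumes a homogeneous sample. It is not the method proposed in the article; the function names and genotype counts below are hypothetical and chosen only for illustration.

```python
import math


def hwe_chi_square(n_aa: int, n_ab: int, n_bb: int) -> float:
    """Chi-square statistic for the classical (homogeneous-sample) HWE test."""
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("at least one genotype is required")
    # Allele frequency of A estimated from the genotype counts.
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1.0 - p
    # Expected genotype counts under Hardy-Weinberg proportions.
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


def hwe_p_value(chi2: float) -> float:
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


# Hypothetical counts: two subpopulations, each exactly in HWE,
# with allele frequencies 0.9 and 0.1 respectively.
pop1 = (810, 180, 10)
pop2 = (10, 180, 810)
# Naively pooling them, as a homogeneous-sample test would.
pooled = tuple(a + b for a, b in zip(pop1, pop2))

chi2_pop1 = hwe_chi_square(*pop1)
chi2_pop2 = hwe_chi_square(*pop2)
chi2_pooled = hwe_chi_square(*pooled)
```

In this toy setup each subpopulation on its own shows essentially no departure from HWE, while the pooled sample shows a large excess of homozygotes and would be flagged, even though the variant is not of low quality. This is the kind of spurious signal that the homogeneity assumption can produce in ancestrally diverse samples.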


R-loop-triggered collateral single-stranded DNA (ssDNA) nuclease activity within Class 1 Type I CRISPR-Cas systems holds immense potential for nucleic acid detection. However, the hyperactive ssDNase activity of Cas3 introduces unwanted noise and false-positive results. In this study, we identified a novel Type I-A Cas3 variant derived from Thermococcus siculi, which remains in an auto-inhibited state until it is triggered by Cascade complex and R-loop formation.

Article Synopsis
  • Large-scale whole-genome sequencing (WGS) studies have enhanced our understanding of how rare genetic variants affect complex human traits through better analysis techniques.
  • Current methods for analyzing multiple traits are limited in their ability to handle rare variants in large WGS datasets, prompting the development of MultiSTAAR.
  • MultiSTAAR enables more powerful analysis by considering relatedness, population structure, and the correlation between traits, leading to the discovery of new genetic associations in lipid traits that single-trait analyses missed.
Article Synopsis
  • MetaSTAAR is a new framework designed for analyzing rare genetic variants in large studies, specifically whole genome and whole exome sequencing (WGS/WES).
  • It effectively manages relatedness and population differences while analyzing various traits, enhancing the ability to detect significant rare variant associations by utilizing functional annotations.
  • In tests with over 30,000 diverse samples, MetaSTAAR yielded results similar to pooled data analysis and successfully identified significant rare variant associations related to lipid traits.

Large biobank-scale whole genome sequencing (WGS) studies are rapidly identifying a multitude of coding and non-coding variants. They provide an unprecedented resource for illuminating the genetic basis of human diseases. Variant functional annotations play a critical role in WGS analysis, result interpretation, and prioritization of disease- or trait-associated causal variants.

Article Synopsis
  • Large-scale whole-genome sequencing studies allow researchers to examine associations between rare noncoding variants and complex diseases, although current methods struggle with analysis of the noncoding genome.
  • The STAARpipeline framework offers a comprehensive solution for detecting noncoding rare variant associations through various analytical approaches, including gene-centric and non-gene-centric analyses that utilize functional annotations.
  • The effectiveness of STAARpipeline is demonstrated through its application in identifying significant noncoding RV sets linked to lipid traits in over 21,000 samples, with successful replication in an additional group, and further analysis of other traits.

To identify new susceptibility loci to lung cancer among diverse populations, we performed cross-ancestry genome-wide association studies in European, East Asian and African populations and discovered five loci that have not been previously reported. We replicated 26 signals and identified 10 new lead associations from previously reported loci. Rare-variant associations tended to be specific to populations, but even common-variant associations influencing smoking behavior, such as those with CHRNA5 and CYP2A6, showed population specificity.


Allele frequency estimates in admixed populations, such as Hispanics and Latinos, rely on the sample's specific admixture composition and thus may differ between two seemingly similar populations. However, ancestry-specific allele frequencies, i.e.


Attempts to identify and prioritize functional DNA elements in coding and non-coding regions, particularly through use of in silico functional annotation data, continue to increase in popularity. However, specific functional roles can vary widely from one variant to another, making it challenging to summarize different aspects of variant function with a one-dimensional rating. Here we propose multi-dimensional annotation-class integrative estimation (MACIE), an unsupervised multivariate mixed-model framework capable of integrating annotations of diverse origin to assess multi-dimensional functional roles for both coding and non-coding variants.


Background/objectives: Neck circumference, an index of upper airway fat, has been suggested to be an important measure of body-fat distribution with unique associations with health outcomes such as obstructive sleep apnea and metabolic disease. This study aims to investigate the genetic basis of neck circumference.

Methods: We conducted a multi-ethnic genome-wide association study of neck circumference, adjusted and unadjusted for BMI, in up to 15,090 European Ancestry (EA) and African American (AA) individuals.


The Epstein-Barr virus (EBV) episome is known to interact with the three-dimensional structure of the human genome in infected cells. However, the exact locations of these interactions and their potential functional consequences remain unclear. Recently, high-resolution chromatin conformation capture (Hi-C) assays in lymphoblastoid cells have become available, enabling us to precisely map the contacts between the EBV episome(s) and the human host genome.


Clinical trial results have recently demonstrated that inhibiting inflammation by targeting the interleukin-1β pathway can offer a significant reduction in lung cancer incidence and mortality, highlighting a pressing and unmet need to understand the benefits of inflammation-focused lung cancer therapies at the genetic level. While numerous genome-wide association studies (GWAS) have explored the genetic etiology of lung cancer, there remains a large gap between the type of information that may be gleaned from an association study and the depth of understanding necessary to explain and drive translational findings. Thus, in this study we jointly model and integrate extensive multiomics data sources, utilizing a total of 40 genome-wide functional annotations that augment previously published results from the International Lung Cancer Consortium (ILCCO) GWAS, to prioritize and characterize single nucleotide polymorphisms (SNPs) that increase risk of squamous cell lung cancer through the inflammatory and immune responses.


Large-scale whole-genome sequencing studies have enabled the analysis of rare variants (RVs) associated with complex phenotypes. Commonly used RV association tests have limited scope to leverage variant functions. We propose STAAR (variant-set test for association using annotation information), a scalable and powerful RV association test method that effectively incorporates both variant categories and multiple complementary annotations using a dynamic weighting scheme.


Whole-genome sequencing (WGS) studies are being widely conducted in order to identify rare variants associated with human diseases and disease-related traits. Classical single-marker association analyses for rare variants have limited power, and variant-set-based analyses are commonly used by researchers for analyzing rare variants. However, existing variant-set-based approaches need to pre-specify genetic regions for analysis; hence, they are not directly applicable to WGS data because of the large number of intergenic and intron regions that consist of a massive number of non-coding variants.


In this study, animal experimentation verified that the canonical Wnt/β-catenin signaling pathway was activated under a reduced activity of p-β-catenin (Ser33/37/Thr41) and an increased accumulation of β-catenin in the lungs and kidneys of pigs infected with a highly virulent strain of . In PK-15 and NPTr cells, it was also confirmed that infection with a high-virulence strain of induced cytoplasmic accumulation and nuclear translocation of β-catenin. infection caused a sharp degradation of E-cadherin and an increase of the epithelial cell monolayer permeability, as well as a broken interaction between β-catenin and E-cadherin dependent on Wnt/β-catenin signaling pathway.


Epstein-Barr virus nuclear antigen (EBNA) leader protein (EBNALP) is one of the first viral genes expressed upon B-cell infection. EBNALP is essential for EBV-mediated B-cell immortalization. EBNALP is thought to function primarily by coactivating EBNA2-mediated transcription.


Haemophilus parasuis, an important swine pathogen, was recently shown to invade endothelial and epithelial cells in vitro. NOD1/2 are specialized NLRs that participate in the recognition of pathogens able to invade intracellularly; therefore, we assessed the contribution of NOD1/2 to inflammatory responses during H. parasuis infection.


Epstein-Barr virus (EBV) transforms B cells to continuously proliferating lymphoblastoid cell lines (LCLs), which represent an experimental model for EBV-associated cancers. EBV nuclear antigens (EBNAs) and LMP1 are EBV transcriptional regulators that are essential for LCL establishment, proliferation, and survival. Starting with the 3D genome organization map of LCL, we constructed a comprehensive EBV regulome encompassing 1,992 viral/cellular genes and enhancers.
