Publications by authors named "Ruibang Luo"

The prevention of chronic disease is a long-term effort that requires continual fine-tuning to adapt to the course of disease. Without comprehensive insights, prescriptions may prioritize short-term gains but deviate from trajectories toward long-term survival. Here we introduce Duramax, an evidence-based framework empowered by reinforcement learning to optimize long-term preventive strategies.


Background: Cardiovascular disease (CVD) is the leading cause of mortality and morbidity in China and worldwide, yet a validated primary prevention model specific to Chinese populations is lacking. To identify CVD high-risk individuals for early intervention, we created and validated a primary prevention risk prediction model, Personalized CARdiovascular DIsease risk Assessment for Chinese (1°P-CARDIAC), in contemporary Chinese cohorts in Hong Kong.

Methods: Patients without any history of CVD were categorized into derivation and validation cohorts based on the geographical location of their residence in Hong Kong.


Motivation: Rare diseases affect over 300 million people worldwide and are often caused by genetic variants. While variant detection has become cost-effective, interpreting these variants, particularly collecting literature-based evidence such as ACMG/AMP PM3, remains complex and time-consuming.

Results: We present AutoPM3, a method that automates PM3 evidence extraction from the literature using open-source large language models (LLMs).


Long-read sequencing technologies have great potential for the comprehensive discovery of structural variations (SVs). However, accurate genotype assignment for SVs remains challenging due to unavoidable sequencing errors, limited coverage, and the complexity of SVs. Herein, we propose cuteFC, which employs self-adaptive clustering along with a multiallele-aware clustering to achieve accurate SV regenotyping through a force-calling approach.


Objective: Cholelithiasis and gastroesophageal reflux disease (GERD) contribute to significant health concerns. We aimed to investigate the potential observational, causal, and genetic relationships between cholelithiasis and GERD.

Design: The observational correlations were assessed based on the prospective cohort study from UK Biobank.


Differential high-order chromatin interactions between homologous chromosomes affect many biological processes. Traditional chromatin conformation capture genome analysis methods mainly identify two-way interactions and cannot provide comprehensive haplotype information, especially for low-heterozygosity organisms such as humans. Here, we present a pipeline of methods to delineate diploid high-order chromatin interactions from noisy Pore-C outputs.


Variant calling using long-read RNA sequencing (lrRNA-seq) can be applied to diverse tasks, such as capturing full-length isoforms and gene expression profiling. It poses challenges, however, due to higher error rates than DNA data, the complexities of transcript diversity, RNA editing events, etc. In this paper, we propose Clair3-RNA, the first deep learning-based variant caller tailored for lrRNA-seq data.

Article Synopsis
  • Ensuring a unified representation of genetic variants is crucial for accurate downstream analysis, but current methods often treat this unification as a later step, which can lead to inconsistencies.
  • Repun is a new algorithm designed to align variant representations before variant calling, improving the reliability of training models for deep learning while also assessing alignment quality more effectively.
  • This approach uses haplotype information to streamline the unification process, achieving over 99.99% precision and more than 99.5% recall in tests across multiple sequencing platforms, and is available as an open-source tool.
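To illustrate why a unified representation matters, the sketch below shows two differently written variant sets producing the same haplotype sequence. It is not Repun's algorithm; the function names, the 0-based `(position, ref_allele, alt_allele)` tuple format, and the toy reference string are all assumptions made for this example.

```python
def apply_variants(ref, variants):
    """Apply non-overlapping (pos, ref_allele, alt_allele) edits to ref.

    Positions are 0-based (an assumption of this sketch, unlike VCF).
    """
    pieces = []
    cursor = 0
    for pos, ref_allele, alt_allele in sorted(variants):
        if pos < cursor:
            raise ValueError("overlapping variants are not supported")
        if ref[pos:pos + len(ref_allele)] != ref_allele:
            raise ValueError(f"reference allele mismatch at position {pos}")
        pieces.append(ref[cursor:pos])   # unchanged sequence before the variant
        pieces.append(alt_allele)        # substituted allele
        cursor = pos + len(ref_allele)   # skip the replaced reference bases
    pieces.append(ref[cursor:])          # unchanged tail
    return "".join(pieces)


def same_haplotype(ref, variants_a, variants_b):
    """Two representations are equivalent if they yield the same sequence."""
    return apply_variants(ref, variants_a) == apply_variants(ref, variants_b)


# Hypothetical toy reference containing a "CA" repeat.
reference = "GCACACAT"

# The same 2-bp deletion written at two different positions in the repeat.
left_aligned = [(0, "GCA", "G")]
right_shifted = [(4, "ACA", "A")]
print(same_haplotype(reference, left_aligned, right_shifted))

# One multi-nucleotide substitution versus two adjacent single-base changes.
mnp = [(1, "CG", "TA")]
two_snps = [(1, "C", "T"), (2, "G", "A")]
print(same_haplotype("ACGT", mnp, two_snps))
```

A naive position-by-position comparison would count each of these pairs as a mismatch, even though the underlying haplotypes are identical; this is the kind of inconsistency that representation unification is meant to remove.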

A vast amount of single-cell RNA sequencing (SC) data has been accumulated by various studies and consortia, but the lack of spatial information limits its use in analyzing complex biological activities. To bridge this gap, we introduce CellContrast, a computational method for reconstructing spatial relationships among SC cells from a spatial transcriptomics (ST) reference. By adopting a contrastive learning framework and training with ST data, CellContrast projects gene expressions into a hidden space where proximate cells share similar representation values.


Background: Irritable bowel syndrome (IBS) significantly impacts individuals due to its prevalence and negative effect on quality of life. Current genome-wide association studies (GWAS) have only identified a small number of crucial single nucleotide polymorphisms (SNPs), not fully elucidating IBS's pathogenesis.

Objective: To identify genomic loci at which common genetic variation influences IBS susceptibility.


Transcriptional regulation, critical for cellular differentiation and adaptation to environmental changes, involves coordinated interactions among DNA sequences, regulatory proteins, and chromatin architecture. Despite extensive data from consortia like ENCODE, understanding the dynamics of cis-regulatory elements (CREs) in gene expression remains challenging. Deep learning is a powerful tool for learning gene expression and epigenomic signals from DNA sequences, exhibiting superior performance compared to conventional machine learning approaches.

Article Synopsis
  • COVID-19 can lead to heart issues, and different SARS-CoV-2 variants vary in how they impact heart cells (cardiomyocytes).
  • The study examined the effects of these variants using human heart cells grown in labs and tested them in Golden Syrian hamsters, revealing that the Omicron BA.2 variant had the most significant harmful effects on heart cells.
  • Findings indicate that Omicron BA.2 infects heart cells through a unique process and causes changes that could lead to heart dysfunction, suggesting that even variants seen as mild can pose serious risks for cardiac health and warrant further research.

Autism spectrum disorder (ASD) is a complex neurodevelopmental disorder for which current treatments are limited and drug development costs are prohibitive. Identifying drug targets for ASD is crucial for the development of targeted therapies. Summary-level data of expression quantitative trait loci obtained from GTEx, protein quantitative trait loci data from the ROSMAP project, and two ASD genome-wide association studies datasets were utilized for discovery and replication.


Objectives: The clinical decision between surgery alone (SA) and surgery followed by postoperative adjuvant chemotherapy (SPOCT) in esophageal squamous cell carcinoma (ESCC) remains controversial. We aim to propose a pre-therapy PET/CT image-based deep learning approach to improve the survival benefit and clinical management of ESCC patients.

Methods: This retrospective multicenter study included 837 ESCC patients from three institutions.


Background: Sleep problems are prevalent. However, the impact of sleep patterns on digestive diseases remains uncertain. Moreover, the interaction between sleep patterns and genetic predisposition with digestive diseases has not been comprehensively explored.


Aims: Cardiovascular disease (CVD) is a leading cause of mortality, especially in developing countries. This study aimed to develop and validate a CVD risk prediction model, Personalized CARdiovascular DIsease risk Assessment for Chinese (P-CARDIAC), for recurrent cardiovascular events using a machine learning technique.

Methods and Results: Three cohorts of Chinese patients with established CVD were included if they had used any of the public healthcare services provided by the Hong Kong Hospital Authority (HA) since 2004, and were categorized by their geographical locations.

Article Synopsis
  • Antibiotic resistance in bacteria poses a major global health threat, and identifying antibiotic resistance genes (ARGs) through high-throughput sequencing is crucial for monitoring their spread and evolution.
  • The study introduces ARGNet, a deep learning tool that enhances ARG identification and classification without relying on traditional sequence alignment methods, allowing it to effectively discover both known and novel ARGs from varying sequence lengths.
  • Performance tests reveal that ARGNet significantly outperforms existing models in speed and accuracy, making it a valuable resource for researchers, with its code and online service freely available to the public.

Background & Aims: Inflammatory bowel disease (IBD) is commonly associated with extraintestinal complications, including autoimmune liver disease. The co-occurrence of IBD and primary biliary cholangitis (PBC) has been increasingly observed, but the underlying relationship between these conditions remains unclear.

Methods: Using summary statistics from genome-wide association studies (GWAS), we investigated the causal effects between PBC and IBD, including Crohn's disease (CD) and ulcerative colitis (UC).


Background: Emerging evidence suggests that Rho GTPases play a crucial role in tumorigenesis and metastasis, but their involvement in the tumor microenvironment (TME) and prognosis of hepatocellular carcinoma (HCC) is not well understood.

Methods: We aim to develop a tumor prognosis prediction system called the Rho GTPases-related gene score (RGPRG score) using Rho GTPase signaling genes and further bioinformatic analyses.

Results: Our work found that HCC patients with a high RGPRG score had significantly worse survival and increased immunosuppressive cell fractions compared to those with a low RGPRG score.


Summary: Third-generation long-read sequencing is an increasingly utilized technique for profiling human immunodeficiency virus (HIV) quasispecies and detecting drug resistance mutations due to its ability to cover the entire viral genome in individual reads. Recently, the ClusterV tool has demonstrated accurate detection of HIV quasispecies from Nanopore long-read sequencing data. However, the need for scripting skills and a computational environment may act as a barrier for many potential users.


Aims: Dissecting complex interactions among transcription factors (TFs), microRNAs (miRNAs) and long noncoding RNAs (lncRNAs) is central to understanding heart development and function. Although computational approaches and platforms have been described to infer relationships among regulatory factors and genes, current approaches do not adequately account for how highly diverse, interacting regulators that include noncoding RNAs (ncRNAs) control cardiac gene expression dynamics over time.

Methods: To overcome this limitation, we devised an integrated framework, cardiac gene regulatory modeling (CGRM), that integrates the LogicTRN and regulatory component analysis bioinformatics modeling platforms to infer complex regulatory mechanisms.


Background: HIV infections often develop drug resistance mutations (DRMs), which can increase the risk of virological failure. However, it has been difficult to determine if minor mutations occur in the same genome or in different virions using Sanger sequencing and short-read sequencing methods. Oxford Nanopore Technologies (ONT) sequencing may improve antiretroviral resistance profiling by allowing for long-read clustering.


Background: With the continuous advances in third-generation sequencing technology and the increasing affordability of next-generation sequencing technology, sequencing data from different sequencing technology platforms is becoming more common. While numerous benchmarking studies have been conducted to compare variant-calling performance across different platforms and approaches, little attention has been paid to the potential of leveraging the strengths of different platforms to optimize overall performance, especially integrating Oxford Nanopore and Illumina sequencing data.

Results: We investigated the impact of multi-platform data on the performance of variant calling through carefully designed experiments with a deep learning-based variant caller named Clair3-MP (Multi-Platform).
