Publications by Jingyi Jessica Li | LitMetric

Publications by authors named "Jingyi Jessica Li"

Genetic variants affecting RNA stability influence complex traits and disease risk.

Elaine Huang, Ting Fu, Ling Zhang, Guanao Yan, Ryo Yamamoto, Thuy Linh Nguyen, Jingyi Jessica Li

Nat Genet

September 2025

Gene expression is modulated jointly by transcriptional regulation and messenger RNA stability, yet the latter is often overlooked in studies on genetic variants. Here, leveraging metabolic labeling data (Bru/BruChase-seq) and a new computational pipeline, RNAtracker, we categorize genes as allele-specific RNA stability (asRS) or allele-specific RNA transcription events. We identify more than 5,000 asRS variants among 665 genes across a panel of 11 human cell lines.

Beware of counter-intuitive levels of false discoveries in datasets with strong intra-correlations.

Chakravarthi Kanduri, Maria Mamica, Emilie Willoch Olstad, Manuela Zucknick, Jingyi Jessica Li

Genome Biol

August 2025

The false discovery rate (FDR) controlling method by Benjamini and Hochberg (BH) is a popular choice in the omics fields. Here, we demonstrate that in datasets with a large degree of dependencies between features, FDR correction methods like BH can sometimes counter-intuitively report very high numbers of false positives, potentially misleading researchers. We urge researchers to use suitable multiple testing strategies and approaches such as synthetic null data (negative controls) to identify and minimize caveats related to false discoveries because, when false findings do occur, they may be numerous.
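For readers unfamiliar with the BH procedure named in this abstract, the following is a minimal sketch of the standard step-up rule, not code from the paper. The function name `benjamini_hochberg` and the `alpha` parameter are illustrative assumptions. The sketch sorts the p-values, finds the largest rank k whose p-value is at most k * alpha / m, and rejects the k smallest p-values.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Standard BH step-up rule (illustrative sketch; names are assumptions).

    Returns a list of booleans, True where the hypothesis is rejected.
    """
    m = len(p_values)
    if m == 0:
        return []
    # Indices of the p-values in ascending order.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Largest 1-based rank k with p_(k) <= k * alpha / m.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / m:
            k_max = rank
    # Reject the k_max smallest p-values (none if k_max is 0).
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected
```

Because the rule is step-up, a single large rank that passes its threshold causes every smaller p-value to be rejected as well. This may help explain how false discoveries, when they occur under strong dependence, can arrive in large batches.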

ClipperQTL: ultrafast and powerful eGene identification method.

Heather J Zhou, Xinzhou Ge, Jingyi Jessica Li

Genome Biol

July 2025

A central task in expression quantitative trait locus analysis is to identify cis-eGenes, i.e., genes whose expression levels are regulated by at least one local genetic variant.

Comment on "Data Fission: Splitting a Single Data Point", Data Fission for Unsupervised Learning: A Discussion on Post-Clustering Inference and the Challenges of Debiasing.

Changhu Wang, Xinzhou Ge, Dongyuan Song, Jingyi Jessica Li

J Am Stat Assoc

April 2025

Worm Perturb-Seq: massively parallel whole-animal RNAi and RNA-seq.

Hefei Zhang, Xuhang Li, Dongyuan Song, Onur Yukselen, Shivani Nanda, Jingyi Jessica Li

Nat Commun

May 2025

Transcriptomes provide highly informative molecular phenotypes that, combined with gene perturbation, can connect genotype to phenotype. An ultimate goal is to perturb every gene and measure transcriptome changes, however, this is challenging, especially in whole animals. Here, we present 'Worm Perturb-Seq (WPS)', a method that provides high-resolution RNA-sequencing profiles for hundreds of replicate perturbations at a time in living animals.

The Farm Animal Genotype-Tissue Expression (FarmGTEx) Project.

Lingzhao Fang, Jinyan Teng, Qing Lin, Zhonghao Bai, Shuli Liu, Bingjie Li, Yali Hou, Jacqueline Smith, Konrad Rawlik, Mathew Littlejohn, Yuwen Liu, Zhonglin Tang, Liangliang Fu, Lei Liu, Li Ma, Cong-Jun Li, Wansheng Liu, Eveline M Ibeagha-Awemu, Lin Lin, Elisabetta Giuffra, Jing Li, Lin Jiang, Hehe Liu, Mingzhou Li, Delin Mo, Xiaohong Liu, Jiaqi Li, Cong Li, Kui Li, Xinfeng Liu, Wei Li, Jingyi Jessica Li, George E Liu

Nat Genet

April 2025

Genetic mutation and drift, coupled with natural and human-mediated selection and migration, have produced a wide variety of genotypes and phenotypes in farmed animals. We here introduce the Farm Animal Genotype-Tissue Expression (FarmGTEx) Project, which aims to elucidate the genetic determinants of gene expression across 16 terrestrial and aquatic domestic species under diverse biological and environmental contexts. For each species, we aim to collect multiomics data, particularly genomics and transcriptomics, from 50 tissues of 1,000 healthy adults and 200 additional animals representing a specific context.

Decoding heterogeneous single-cell perturbation responses.

Bicna Song, Dingyu Liu, Weiwei Dai, Natalie F McMyn, Qingyang Wang, Breanna Williams, Janet D Siliciano, Jingyi Jessica Li, Robert F Siliciano, Wei Li

Nat Cell Biol

March 2025

Understanding how cells respond differently to perturbation is crucial in cell biology, but existing methods often fail to accurately quantify and interpret heterogeneous single-cell responses. Here we introduce the perturbation-response score (PS), a method to quantify diverse perturbation responses at a single-cell level. Applied to single-cell perturbation datasets such as Perturb-seq, PS outperforms existing methods in quantifying partial gene perturbations.

Worm Perturb-Seq: massively parallel whole-animal RNAi and RNA-seq.

Hefei Zhang, Xuhang Li, Dongyuan Song, Onur Yukselen, Shivani Nanda, Jingyi Jessica Li

bioRxiv

February 2025

The transcriptome provides a highly informative molecular phenotype to connect genotype to phenotype and is most frequently measured by RNA-sequencing (RNA-seq). Therefore, an ultimate goal is to perturb every gene and measure changes in the transcriptome. However, this remains challenging, especially in intact organisms due to different experimental and computational challenges.

Semisynthetic simulation for microbiome data analysis.

Kris Sankaran, Saritha Kodikara, Jingyi Jessica Li, Kim-Anh Lê Cao

Brief Bioinform

November 2024

High-throughput sequencing data lie at the heart of modern microbiome research. Effective analysis of these data requires careful preprocessing, modeling, and interpretation to detect subtle signals and avoid spurious associations. In this review, we discuss how simulation can serve as a sandbox to test candidate approaches, creating a setting that mimics real data while providing ground truth.

Categorization of 34 computational methods to detect spatially variable genes from spatially resolved transcriptomics data.

Guanao Yan, Shuo Harper Hua, Jingyi Jessica Li

Nat Commun

January 2025

In the analysis of spatially resolved transcriptomics data, detecting spatially variable genes (SVGs) is crucial. Numerous computational methods exist, but varying SVG definitions and methodologies lead to incomparable results. We review 34 state-of-the-art methods, classifying SVGs into three categories: overall, cell-type-specific, and spatial-domain-marker SVGs.

Systematic evaluation of methylation-based cell type deconvolution methods for plasma cell-free DNA.

Tongyue Sun, Jinqi Yuan, Yacheng Zhu, Jingqi Li, Shen Yang, Wei Li, Jingyi Jessica Li, Yumei Li

Genome Biol

December 2024

Background: Plasma cell-free DNA (cfDNA) is derived from cellular death in various tissues. Investigating the tissue origin of cfDNA through cell type deconvolution, we can detect changes in tissue homeostasis that occur during disease progression or in response to treatment. Consequently, cfDNA has emerged as a valuable noninvasive biomarker for disease detection and treatment monitoring.

Dissecting gene expression heterogeneity: generalized Pearson correlation squares and the K-lines clustering algorithm.

Jingyi Jessica Li, Heather J Zhou, Peter J Bickel, Xin Tong

J Am Stat Assoc

May 2024

Motivated by the pressing needs for dissecting heterogeneous relationships in gene expression data, here we generalize the squared Pearson correlation to capture a mixture of linear dependences between two real-valued variables, with or without an index variable that specifies the line memberships. We construct the generalized Pearson correlation squares by focusing on three aspects: variable exchangeability, no parametric model assumptions, and inference of population-level parameters. To compute the generalized Pearson correlation square from a sample without a line-membership specification, we develop a K-lines clustering algorithm to find K clusters that exhibit distinct linear dependences, where K can be chosen in a data-adaptive way.

Integrated molecular and functional characterization of the intrinsic apoptotic machinery identifies therapeutic vulnerabilities in glioma.

Elizabeth G Fernandez, Wilson X Mai, Kai Song, Nicholas A Bayley, Jiyoon Kim, Pauline Young, Linda M Liau, Gang Li, William H Yong, Jingyi Jessica Li

Nat Commun

November 2024

Genomic profiling often fails to predict therapeutic outcomes in cancer. This failure is, in part, due to a myriad of genetic alterations and the plasticity of cancer signaling networks. Functional profiling, which ascertains signaling dynamics, is an alternative method to anticipate drug responses.

Response to "Neglecting normalization impact in semi-synthetic RNA-seq data simulation generates artificial false positives" and "Winsorization greatly reduces false positives by popular differential expression methods when analyzing human population samples".

Xinzhou Ge, Yumei Li, Wei Li, Jingyi Jessica Li

Genome Biol

October 2024

Two correspondences raised concerns or comments about our analyses regarding exaggerated false positives found by differential expression (DE) methods. Here, we discuss the points they raise and explain why we agree or disagree with these points. We add new analysis to confirm that the Wilcoxon rank-sum test remains the most robust method compared to the other five DE methods (DESeq2, edgeR, limma-voom, dearseq, and NOISeq) in two-condition DE analyses after considering normalization and winsorization, the data preprocessing steps discussed in the two correspondences.

Spatial Single-Cell Mapping of Transcriptional Differences Across Genetic Backgrounds in Mouse Brains.

Zachary Hemminger, Gabriela Sanchez-Tam, Haley De Ocampo, Aihui Wang, Thomas Underwood, Jingyi Jessica Li

bioRxiv

October 2024

Genetic variation can alter brain structure and, consequently, function. Comparative statistical analysis of mouse brains across genetic backgrounds requires spatial, single-cell, atlas-scale data in replicates, which is a challenge for current technologies. We introduce Atlas-scale Transcriptome Localization using Aggregate Signatures (ATLAS), a scalable tissue mapping method.

A genome-wide spectrum of tandem repeat expansions in 338,963 humans.

Ya Cui, Wenbin Ye, Jason Sheng Li, Jingyi Jessica Li, Eric Vilain, Wei Li

Cell

October 2024

APIR: Aggregating Universal Proteomics Database Search Algorithms for Peptide Identification with FDR Control.

Yiling Elaine Chen, Xinzhou Ge, Kyla Woyshner, MeiLu McDermott, Antigoni Manousopoulou, Kexin Li, Jingyi Jessica Li

Genomics Proteomics Bioinformatics

July 2024

Advances in mass spectrometry (MS) have enabled high-throughput analysis of proteomes in biological systems. The state-of-the-art MS data analysis relies on database search algorithms to quantify proteins by identifying peptide-spectrum matches (PSMs), which convert mass spectra to peptide sequences. Different database search algorithms use distinct search strategies and thus may identify unique PSMs.

Categorization of 33 computational methods to detect spatially variable genes from spatially resolved transcriptomics data.

Guanao Yan, Shuo Harper Hua, Jingyi Jessica Li

ArXiv

October 2024

In the analysis of spatially resolved transcriptomics data, detecting spatially variable genes (SVGs) is crucial. Numerous computational methods exist, but varying SVG definitions and methodologies lead to incomparable results. We review 33 state-of-the-art methods, categorizing SVGs into three types: overall, cell-type-specific, and spatial-domain-marker SVGs.

scCDC: a computational method for gene-specific contamination detection and correction in single-cell and single-nucleus RNA-seq data.

Weijian Wang, Yihui Cen, Zezhen Lu, Yueqing Xu, Tianyi Sun, Wanlu Liu, Jingyi Jessica Li

Genome Biol

May 2024

In droplet-based single-cell and single-nucleus RNA-seq assays, systematic contamination by ambient RNA molecules biases the quantification of gene expression levels. Existing methods correct the contamination for all genes globally. However, a specific evaluation of correction efficacy for varying contamination levels is lacking.

Developmental isoform diversity in the human neocortex informs neuropsychiatric risk mechanisms.

Ashok Patowary, Pan Zhang, Connor Jops, Celine K Vuong, Xinzhou Ge, Michael Margolis, Chunyu Liu, Jingyi Jessica Li

Science

May 2024

RNA splicing is highly prevalent in the brain and has strong links to neuropsychiatric disorders; yet, the role of cell type-specific splicing and transcript-isoform diversity during human brain development has not been systematically investigated. In this work, we leveraged single-molecule long-read sequencing to deeply profile the full-length transcriptome of the germinal zone and cortical plate regions of the developing human neocortex at tissue and single-cell resolution. We identified 214,516 distinct isoforms, of which 72.

Targeting circadian transcriptional programs through a cis-regulatory mechanism in triple negative breast cancer.

Yuanzhong Pan, Tsu-Pei Chiu, Lili Zhou, Priscilla Chan, Tia Tyrsett Kuo, Francesca Battaglin, Jingyi Jessica Li

bioRxiv

October 2024

Circadian clock genes are emerging targets in many types of cancer, but their mechanistic contributions to tumor progression are still largely unknown. This makes it challenging to stratify patient populations and develop corresponding treatments. In this work, we show that in breast cancer, the disrupted expression of circadian genes has the potential to serve as biomarkers.

A genome-wide spectrum of tandem repeat expansions in 338,963 humans.

Ya Cui, Wenbin Ye, Jason Sheng Li, Jingyi Jessica Li, Eric Vilain, Wei Li

Cell

April 2024

The Genome Aggregation Database (gnomAD), widely recognized as the gold-standard reference map of human genetic variation, has largely overlooked tandem repeat (TR) expansions, despite the fact that TRs constitute ∼6% of our genome and are linked to over 50 human diseases. Here, we introduce the TR-gnomAD (https://wlcb.oit.

Statistical method scDEED for detecting dubious 2D single-cell embeddings and optimizing t-SNE and UMAP hyperparameters.

Lucy Xia, Christy Lee, Jingyi Jessica Li

Nat Commun

February 2024

Two-dimensional (2D) embedding methods are crucial for single-cell data visualization. Popular methods such as t-distributed stochastic neighbor embedding (t-SNE) and uniform manifold approximation and projection (UMAP) are commonly used for visualizing cell clusters; however, it is well known that t-SNE and UMAP's 2D embeddings might not reliably inform the similarities among cell clusters. Motivated by this challenge, we present a statistical method, scDEED, for detecting dubious cell embeddings output by a 2D-embedding method.

Benchmarking computational methods to identify spatially variable genes and peaks.

Zhijian Li, Zain M Patel, Dongyuan Song, Guanao Yan, Jingyi Jessica Li

bioRxiv

December 2023

Spatially resolved transcriptomics offers unprecedented insight by enabling the profiling of gene expression within the intact spatial context of cells, effectively adding a new and essential dimension to data interpretation. To efficiently detect spatial structure of interest, an essential step in analyzing such data involves identifying spatially variable genes. Despite researchers having developed several computational methods to accomplish this task, the lack of a comprehensive benchmark evaluating their performance remains a considerable gap in the field.

scReadSim: a single-cell RNA-seq and ATAC-seq read simulator.

Guanao Yan, Dongyuan Song, Jingyi Jessica Li

Nat Commun

November 2023

Benchmarking single-cell RNA-seq (scRNA-seq) and single-cell Assay for Transposase-Accessible Chromatin using sequencing (scATAC-seq) computational tools demands simulators to generate realistic sequencing reads. However, none of the few read simulators aim to mimic real data. To fill this gap, we introduce scReadSim, a single-cell RNA-seq and ATAC-seq read simulator that allows user-specified ground truths and generates synthetic sequencing reads (in a FASTQ or BAM file) by mimicking real data.
