PEREGRINE: A genome-wide prediction of enhancer to gene relationships supported by experimental evidence.

PLoS One

Division of Bioinformatics, Department of Preventive Medicine, Keck School of Medicine, University of Southern California, Los Angeles, CA, United States of America.

Published: January 2021


Article Abstract

Enhancers are powerful and versatile agents of cell-type specific gene regulation, which are thought to play key roles in human disease. Enhancers are short DNA elements that function primarily as clusters of transcription factor binding sites that are spatially coordinated to regulate expression of one or more specific target genes. These regulatory connections between enhancers and target genes can therefore be characterized as enhancer-gene links that can affect development, disease, and homeostatic cellular processes. Despite their implication in disease and the establishment of cell identity during development, most enhancer-gene links remain unknown. Here we introduce a new, publicly accessible database of predicted enhancer-gene links, PEREGRINE. The PEREGRINE human enhancer-gene links interactive web interface incorporates publicly available experimental data from ChIA-PET, eQTL, and Hi-C assays across 78 cell and tissue types to link 449,627 enhancers to 17,643 protein-coding genes. These enhancer-gene links are made available through the new Enhancer module of the PANTHER database and website where the user may easily access the evidence for each enhancer-gene link, as well as query by target gene and enhancer location.
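The idea of pooling evidence from several assays into enhancer-gene links, then querying them by target gene or by enhancer location, can be illustrated with a small sketch. This is not PEREGRINE's or PANTHER's actual schema, API, or linking method: the class names, function names, identifiers, coordinates, and tissue labels below are all hypothetical, and the half-open coordinate convention is an assumption made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Enhancer:
    # Hypothetical record; coordinates assumed 0-based, half-open [start, end).
    enhancer_id: str
    chrom: str
    start: int
    end: int


@dataclass
class Link:
    enhancer: Enhancer
    gene: str
    # Maps assay name -> set of cell/tissue types supporting this link.
    evidence: dict = field(default_factory=lambda: defaultdict(set))


def build_links(records):
    """Merge (enhancer, gene, assay, tissue) records into one Link per pair."""
    links = {}
    for enhancer, gene, assay, tissue in records:
        key = (enhancer.enhancer_id, gene)
        link = links.setdefault(key, Link(enhancer, gene))
        link.evidence[assay].add(tissue)
    return list(links.values())


def links_for_gene(links, gene):
    """Return every link whose target is the given gene."""
    return [link for link in links if link.gene == gene]


def links_in_region(links, chrom, start, end):
    """Return links whose enhancer overlaps the half-open interval [start, end)."""
    return [
        link
        for link in links
        if link.enhancer.chrom == chrom
        and link.enhancer.start < end
        and start < link.enhancer.end
    ]


# Toy data only: these identifiers and positions are invented for illustration.
E1 = Enhancer("enh1", "chr1", 1000, 1500)
E2 = Enhancer("enh2", "chr1", 9000, 9400)
RECORDS = [
    (E1, "GENE_A", "ChIA-PET", "tissue_1"),
    (E1, "GENE_A", "eQTL", "tissue_2"),
    (E1, "GENE_B", "Hi-C", "tissue_1"),
    (E2, "GENE_A", "eQTL", "tissue_2"),
]
LINKS = build_links(RECORDS)

if __name__ == "__main__":
    for link in links_for_gene(LINKS, "GENE_A"):
        print(link.enhancer.enhancer_id, sorted(link.evidence))
```

Keeping the supporting assays and tissues attached to each link, rather than collapsing them into a single flag, mirrors the abstract's point that a user can inspect the evidence behind each enhancer-gene link.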

Source
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7737992 (PMC)
http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0243791 (PLOS)

Similar Publications

Cellular decision-making and tissue homeostasis are governed by transcriptional networks shaped by chromatin accessibility. Using single-nucleus multi-omics, we jointly profile gene expression and chromatin accessibility in 10,335 cells from the Drosophila testis apical tip. This enables inference of 147 cell type-specific enhancer-gene regulons using SCENIC+.

Background: Enhancer elements interact with target genes at a distance to modulate their expression, but the molecular details of enhancer-promoter interaction are incompletely understood. G-quadruplex DNA secondary structures (G4s) have recently been shown to co-occur with 3D chromatin interactions; however, the functional importance of G4s within enhancers remains unclear.

Results: In this study, we identify novel G4 structures within two locus control regions at the human α- and β-globin loci.

Variant-specific priors clarify colocalisation analysis.

PLoS Genet

May 2025

MRC Biostatistics Unit, University of Cambridge, Cambridge, United Kingdom.

Linking GWAS variants to their causal gene and context remains an ongoing challenge. A widely used method for performing this analysis is the coloc package for statistical colocalisation analysis, which can be used to link GWAS and eQTL associations. Currently, coloc assumes that all variants in a region are equally likely to be causal, despite the success of fine-mapping methods that use additional information to adjust their prior probabilities.

Multi-trait QTL (xQTL) colocalization has shown great promise in identifying causal variants with shared genetic etiology across multiple molecular modalities, contexts, and complex diseases. However, the lack of scalable and efficient methods to integrate large-scale multi-omics data limits deeper insights into xQTL regulation. Here, we propose , a multi-task learning colocalization method that can scale to hundreds of traits, while accounting for multiple causal variants within a genomic region of interest.

Introduction: Non-alcoholic fatty liver disease (NAFLD) is the most widespread liver disease globally, ranging from non-alcoholic fatty liver (NAFL) and steatohepatitis (NASH) to fibrosis/cirrhosis, with potential progression to hepatocellular carcinoma (HCC). Genome-wide association studies (GWASs) have identified several single nucleotide polymorphisms (SNPs) associated with NAFLD. However, numerous GWAS signals associated with NAFLD are located in non-coding regions, which makes their functional annotation difficult to interpret.
