CacPred: a cascaded convolutional neural network for TF-DNA binding prediction.

BMC Genomics

Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun, 130012, China.

Published: March 2025



Article Abstract

Background: Transcription factors (TFs) regulate gene expression by binding to DNA sequences. Aligned TF binding sites (TFBSs) of the same TF are regarded as cis-regulatory motifs, and substantial computational effort has been invested in finding motifs. In recent years, convolutional neural networks (CNNs) have succeeded in TF-DNA binding prediction, but the accuracy of existing deep learning (DL) methods needs to be improved, and the role of convolution in TF-DNA binding prediction should be further explored.

Results: We develop a cascaded convolutional neural network model named CacPred to predict TF-DNA binding on 790 chromatin immunoprecipitation sequencing (ChIP-seq) datasets and seven ChIP-nexus (chromatin immunoprecipitation experiments with nucleotide resolution through exonuclease, unique barcode, and single ligation) datasets. We compare CacPred to six existing DL models across nine standard evaluation metrics. Our results indicate that CacPred outperforms all comparison models for TF-DNA binding prediction: the average accuracy (ACC), Matthews correlation coefficient (MCC), and area of eight metrics radar (AEMR) are improved by 3.3%, 9.2%, and 6.4%, respectively, on the 790 ChIP-seq datasets. On the seven ChIP-nexus datasets, CacPred improves the average ACC, MCC, and AEMR by 5.5%, 16.8%, and 12.9%, respectively. To interpret the proposed method, motifs are used to show the features CacPred has learned. The results show that CacPred can find some significant motifs in input sequences.
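Two of the metrics named above, ACC and MCC, have standard definitions in terms of the binary confusion matrix. The following is a minimal sketch of those standard formulas, not the authors' evaluation code; the function names and the toy labels are hypothetical, and AEMR is left out because the abstract does not define it.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels (1 = bound, 0 = unbound)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(y_true, y_pred):
    """ACC = (TP + TN) / (TP + TN + FP + FN)."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(y_true, y_pred):
    """Matthews correlation coefficient from the confusion matrix."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, MCC is reported as 0 when a row or column of the
    # confusion matrix is empty and the formula is undefined.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


# Hypothetical toy labels, for illustration only (not data from the paper).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]

print(accuracy(y_true, y_pred))  # 0.75
print(mcc(y_true, y_pred))       # 0.5
```

MCC uses all four cells of the confusion matrix, so it is less easily inflated than ACC when bound and unbound sequences are imbalanced.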

Conclusions: This paper indicates that CacPred performs better than existing models on ChIP-seq data. Seven ChIP-nexus datasets are also analyzed, and the results are consistent with those on ChIP-seq data, where our proposed method performs best. CacPred uses only convolution, with no pooling, indicating that the pooling step in existing models loses some sequence information. Some significant motifs are found, showing that CacPred can learn features from input sequences. In this study, we demonstrate that CacPred is an effective and feasible model for predicting TF-DNA binding. CacPred is freely available at https://github.com/zhangsq06/CacPred.
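The point about pooling can be illustrated with a toy example. The sketch below is not the CacPred architecture (see the repository for that); it assumes a one-hot encoding, a single hypothetical filter matching the made-up motif "TGA", and global max pooling as the most extreme form of pooling. It shows that the convolution output keeps the motif position, while the pooled value does not.

```python
BASES = "ACGT"


def one_hot(seq):
    """Encode a DNA string as rows of 4 values in the order A, C, G, T."""
    return [[1.0 if base == b else 0.0 for b in BASES] for base in seq]


def conv1d(encoded, kernel):
    """Slide one filter over a one-hot sequence (no padding, stride 1)."""
    width = len(kernel)
    scores = []
    for start in range(len(encoded) - width + 1):
        score = 0.0
        for offset in range(width):
            for channel in range(4):
                score += encoded[start + offset][channel] * kernel[offset][channel]
        scores.append(score)
    return scores


# Hypothetical filter: a perfect detector for the toy motif "TGA".
motif_filter = one_hot("TGA")

# Two toy sequences carrying the motif at opposite ends.
seq_a = "TGACCCCC"
seq_b = "CCCCCTGA"

map_a = conv1d(one_hot(seq_a), motif_filter)
map_b = conv1d(one_hot(seq_b), motif_filter)

print(map_a)  # [3.0, 0.0, 0.0, 0.0, 0.0, 0.0]
print(map_b)  # [0.0, 0.0, 0.0, 0.0, 0.0, 3.0]

# Global max pooling reduces each map to a single number, so the two
# sequences become indistinguishable even though the motif moved.
print(max(map_a), max(map_b))  # 3.0 3.0
```

Local pooling with a small window discards less than this global case, but the same effect applies within each window.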


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11916463
DOI: http://dx.doi.org/10.1186/s12864-025-11399-y

