Accurate detection of tandem repeats exposes ubiquitous reuse of biological sequences.

Nucleic Acids Res

Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA 15206, United States.

Published: September 2025


Article Abstract

Tandem repetition is one of the major processes underlying genome evolution and phenotypic diversification. While newly formed tandem repeats are often easy to identify, it is more challenging to detect repeat copies as they diverge over evolutionary timescales. Existing programs for finding tandem repeats return markedly different results, and it is unclear which predictions are more correct and how much room remains for improvement. Here, we introduce DetectRepeats, a new method that uses empirical information about structural repeats to improve the accuracy of repeat detection. We show that DetectRepeats advances the state-of-the-art by finding highly divergent repeats with relatively few false positive detections. We apply DetectRepeats to genomes across the tree of life to discover an enrichment of detectable tandem repeats within different genes, genome regions, and taxa. Furthermore, we use phylogenetic reconciliation to determine that some tandem repeats continue to evolve through intra-repeat unit replacement. In this manner, tandem repeats serve as a renewable genetic resource offering a bountiful source of alternative genetic material. Our work unlocks the confident detection of ancient tandem repeats, opening a doorway to future discoveries. DetectRepeats is part of the DECIPHER package for the R programming language and available via Bioconductor.
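To illustrate why newly formed repeats are easy to find while diverged copies are not, here is a minimal Python sketch of a naive exact-match tandem repeat scanner. It is not the DetectRepeats algorithm and does not use the DECIPHER API. The function name, parameters, and example sequences are hypothetical and chosen only for illustration.

```python
def find_exact_tandem_repeats(seq, min_unit=2, max_unit=6, min_copies=3):
    """Return a list of (start, unit, copies) for runs of exactly repeated units.

    Naive illustration only: requires perfect identity between adjacent copies,
    so it cannot see repeats that have diverged by substitutions or indels.
    """
    hits = []
    i = 0
    n = len(seq)
    while i < n:
        best = None
        for k in range(min_unit, max_unit + 1):
            unit = seq[i:i + k]
            if len(unit) < k:
                break  # not enough sequence left for a unit of this length
            copies = 1
            # Count how many adjacent, identical copies of the unit follow.
            while seq[i + copies * k:i + (copies + 1) * k] == unit:
                copies += 1
            span = copies * k
            if copies >= min_copies and (best is None or span > best[2] * len(best[1])):
                best = (i, unit, copies)
        if best is not None:
            hits.append(best)
            i += len(best[1]) * best[2]  # skip past the reported repeat
        else:
            i += 1
    return hits


# Hypothetical example sequences (assumptions, not data from the article).
fresh = "TTACGACGACGTT"     # three perfect copies of ACG
diverged = "TTACGACTACGTT"  # same region with one substitution in the middle copy

print(find_exact_tandem_repeats(fresh))
print(find_exact_tandem_repeats(diverged))
```

With these inputs, the perfect repeat is reported and the copy carrying a single substitution is missed entirely. That gap is what motivates scoring-based approaches that tolerate divergence between copies.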

Source
http://dx.doi.org/10.1093/nar/gkaf866

Similar Publications

Background/aims: Drug addiction is a neuropsychiatric disorder characterised by compulsive drug-seeking behaviour notwithstanding adverse consequences. This work seeks to address a deficiency in the literature by comparing drug-addicted and non-addicted individuals within an Iraqi population through the analysis of a 1000-base pair variable number of tandem repeats (VNTRs) polymorphism of the dopamine receptor gene DRD4. The association of this novel polymorphism with drug addiction has not yet been examined.

pv. is a pathogen of rice responsible for bacterial leaf streak, a disease that can cause up to 32% yield loss. While it was first reported a century ago in Asia, its first report in Africa was in the 1980s.

CRISPR homing gene drive is a disruptive biotechnology developed over the past decade with potential applications in public health, agriculture, and conservation biology. This technology relies on an autonomous selfish genetic element able to spread in natural populations through the release of gene drive individuals. However, it has not yet been deployed in the wild.

Huntington's disease (HD) is a progressive, autosomal dominant neurodegenerative disorder characterized by motor dysfunction, cognitive decline, and psychiatric disturbances. It is caused by CAG repeat expansions in the HTT gene, resulting in the formation of mutant huntingtin protein that aggregates and disrupts neuronal function. This review outlines the pathogenesis of HD, including genetic, molecular, and environmental factors.
