Citations

20

Article Abstract

A novel approach to the detection of genomic repeats is presented in this paper. The technique, dubbed SAGRI (Spectrum Assisted Genomic Repeat Identifier), is based on the spectrum (set of sequence k-mers, for some k) of the genomic sequence. Specifically, the genome is scanned twice. The first scan (FindHit) detects candidate pairs of repeat-segments, by effectively reconstructing portions of the Euler path of the (k-1)-mer graph of the genome only in correspondence with likely repeat sites. This process produces candidate repeat pairs, for which the location of the leftmost term is unknown. Candidate pairs are then subjected to validation in a second scan, in which the genome is labelled for hits in the (much smaller) spectrum of the repeat candidates: high hit density is taken as evidence of the location of the first segment of a repeat, and the pair of segments is then certified by pairwise alignment. The design parameters of the technique are selected on the basis of a careful probabilistic analysis (based on random sequences). SAGRI is compared with three leading repeat-finding tools on both synthetic and natural DNA sequences, and found to be uniformly superior in versatility (ability to detect repeats of different lengths) and accuracy (the central goal of repeat finding), while being quite competitive in speed. An executable program can be downloaded at http://sagri.comp.nus.edu.sg.
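The second scan described in the abstract can be illustrated with a minimal sketch. This is not the SAGRI implementation: the function names, the sliding-window form of "hit density", and the toy values of k and the window length are all assumptions made for illustration, and the sketch omits the first scan (FindHit) and the final pairwise alignment.

```python
from __future__ import annotations


def spectrum(seq: str, k: int) -> set[str]:
    """Return the spectrum of seq: the set of all k-mers it contains."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def hit_density(genome: str, candidate: str, k: int, window: int) -> list[float]:
    """Label each genome position as a hit if its k-mer is in the candidate's
    spectrum, then return the fraction of hits in every window of `window`
    consecutive positions (hypothetical definition of hit density)."""
    spec = spectrum(candidate, k)
    hits = [1 if genome[i:i + k] in spec else 0
            for i in range(len(genome) - k + 1)]

    # Prefix sums let each window count be read off in constant time.
    prefix = [0]
    for h in hits:
        prefix.append(prefix[-1] + h)

    n_windows = max(len(hits) - window + 1, 0)
    return [(prefix[s + window] - prefix[s]) / window for s in range(n_windows)]


# Toy example: the candidate segment occurs once inside a low-complexity genome.
genome = "TTTTTTACGTACGGGGGG"
candidate = "ACGTAC"
densities = hit_density(genome, candidate, k=3, window=4)
best_start = max(range(len(densities)), key=densities.__getitem__)
```

In this sketch a window whose density is close to 1 marks a likely location of the other copy of the repeat; the cutoff that counts as "high" is left open here, since the abstract says only that the design parameters come from a probabilistic analysis on random sequences.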


Source
http://dx.doi.org/10.1089/cmb.2008.0013

Publication Analysis

Top Keywords

detection genomic: 8
candidate pairs: 8
repeat: 7
spectrum-based novo: 4
novo repeat: 4
repeat detection: 4
genomic: 4
genomic sequences: 4
sequences novel: 4
novel approach: 4

Similar Publications

Background: Genetic modifiers are believed to play an important role in the onset and severity of polycystic kidney disease (PKD), but identifying these modifiers has been challenging due to the lack of effective methodologies.

Methods: We generated zebrafish mutants of IFT140, a skeletal ciliopathy gene and newly identified autosomal dominant PKD (ADPKD) gene, to examine skeletal development and kidney cyst formation in larval and juvenile mutants. Additionally, we utilized ift140 crispants, generated through efficient microhomology-mediated end joining (MMEJ)-based genome editing, to compare phenotypes with mutants and conduct a pilot genetic modifier screen.


Plastoglobuli (PG) are plant lipoprotein compartments, present in plastid organelles. They are involved in the formation and/or storage of lipophilic metabolites. FIBRILLINs (FBNs) are one of the main PG-associated proteins and are particularly abundant in carotenoid-enriched chromoplasts found in ripe fruits and flowers.


Background: Autism spectrum disorder (ASD) is a complex neurodevelopmental disorder lacking objective biomarkers for early diagnosis. DNA methylation is a promising epigenetic marker, and machine learning offers a data-driven classification approach. However, few studies have examined whole-blood, genome-wide DNA methylation profiles for ASD diagnosis in school-aged children.


Nuclear mitochondrial DNA segments (NUMTs), which are mitochondrial DNA fragments integrated into the nuclear genome, serve as markers of evolutionary history. This study aims to enhance the detection and analysis of NUMTs by developing a script named NUMTsearcher. Utilizing the latest chromosome-level genome assemblies from various species, including human, rabbit, and six fish species, the study compares NUMTsearcher's performance against traditional methods such as LAST (Local Alignment Search Tool), BLAST (Basic Local Alignment Search Tool), BLAT (BLAST-Like Alignment Tool), and the pan-mitogenome approach, which integrates mitogenomes from diverse sources to identify fixed NUMTs in the nuclear genome.


Foodborne illness is a critical food safety and public health concern, often resulting from contamination events by resident pathogens in food processing environments (FPEs). Listeria monocytogenes, the causative agent of listeriosis, can persist in FPEs over long time periods. Despite rigorous research on the phenotypic and genotypic traits of L. monocytogenes, no clear pattern has arisen to explain why some strains are able to persist.
