Publications by authors named "Max A Alekseyev"

A gene family in the purple sea urchin encodes immune response proteins. The genes are clustered, surrounded by short tandem repeats, and some are present in genomic segmental duplications. The genes share regions of sequence and include repeats in the coding exon.

Background: Anopheles coluzzii and Anopheles arabiensis belong to the Anopheles gambiae complex and are among the major malaria vectors in sub-Saharan Africa. However, chromosome-level reference genome assemblies are still lacking for these medically important mosquito species.

Findings: In this study, we produced de novo chromosome-level genome assemblies for A.

Motivation: One of the key computational problems in comparative genomics is the reconstruction of genomes of ancestral species based on genomes of extant species. Since most dramatic changes in genomic architectures are caused by genome rearrangements, this problem is often posed as minimization of the number of genome rearrangements between extant and ancestral genomes. The basic case of three given genomes is known as the genome median problem.

Background: New sequencing technologies have lowered financial barriers to whole genome sequencing, but resulting assemblies are often fragmented and far from 'finished'. Updating multi-scaffold drafts to chromosome-level status can be achieved through experimental mapping or re-sequencing efforts. Avoiding the costs associated with such approaches, comparative genomic analysis of gene order conservation (synteny) to predict scaffold neighbours (adjacencies) offers a potentially useful complementary method for improving draft assemblies.

Reconstruction of the median genome consisting of linear chromosomes from three given genomes is known to be intractable. There exist efficient methods for solving a relaxed version of this problem, where the median genome is allowed to have circular chromosomes. We propose a method for construction of an approximate solution to the original problem from a solution to the relaxed problem and prove a bound on its approximation error.

Construction of phylogenetic trees and networks for extant species from their characters represents one of the key problems in phylogenomics. While a solution to this problem is not always uniquely defined and there exist multiple methods for tree/network construction, it becomes important to measure how well the constructed networks capture the given character relationship across the species. We propose a novel method for measuring the specificity of a given phylogenetic network in terms of the total number of distributions of homoplasy-free character states at the leaves that the network may impose.

Genome rearrangements are large-scale evolutionary events that shuffle genomic architectures. The minimal number of such events between two genomes is often used in phylogenomic studies to measure the evolutionary distance between the genomes. Double-Cut-and-Join (DCJ) operations represent a convenient model of most common genome rearrangements (reversals, translocations, fissions, and fusions), while other genome rearrangements, such as transpositions, can be modeled by pairs of DCJs.
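As a rough illustration of the distance described above, the following sketch computes the DCJ distance between two genomes under simplifying assumptions that are not stated in the abstract: all chromosomes are circular, both genomes contain exactly the same genes, and each gene appears once. In that restricted setting the distance is commonly given as n − c, where n is the number of genes and c is the number of alternating cycles in the breakpoint graph. The function names and the signed-integer genome encoding are illustrative choices.

```python
def adjacencies(genome):
    """Map every gene extremity to its neighbouring extremity.

    `genome` is a list of circular chromosomes; each chromosome is a list
    of signed integers (a negative sign means reversed orientation).
    Extremities are (gene, 't') for the tail and (gene, 'h') for the head.
    """
    adj = {}
    for chrom in genome:
        for i, gene in enumerate(chrom):
            nxt = chrom[(i + 1) % len(chrom)]  # circular successor
            right = (abs(gene), 'h') if gene > 0 else (abs(gene), 't')
            left = (abs(nxt), 't') if nxt > 0 else (abs(nxt), 'h')
            adj[right] = left
            adj[left] = right
    return adj


def dcj_distance(genome_a, genome_b):
    """DCJ distance n - c for circular genomes on the same gene set (assumed)."""
    adj_a, adj_b = adjacencies(genome_a), adjacencies(genome_b)
    if adj_a.keys() != adj_b.keys():
        raise ValueError("genomes must contain the same genes")
    seen = set()
    cycles = 0
    for start in adj_a:
        if start in seen:
            continue
        cycles += 1
        x = start
        # Walk the cycle, alternating adjacencies of genome A and genome B.
        while x not in seen:
            seen.add(x)
            y = adj_a[x]
            seen.add(y)
            x = adj_b[y]
    n_genes = len(adj_a) // 2
    return n_genes - cycles


if __name__ == "__main__":
    print(dcj_distance([[1, 2, 3]], [[1, -2, 3]]))  # one reversal apart
    print(dcj_distance([[1, 2]], [[1], [2]]))       # one fission apart
```

Linear chromosomes require an additional term for paths in the breakpoint graph, which this sketch deliberately leaves out.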

Background: Despite the recent progress in genome sequencing and assembly, many of the currently available assembled genomes come in a draft form. Such draft genomes consist of a large number of genomic fragments (scaffolds), whose positions and orientations along the genome are unknown. While there exist a number of methods for reconstruction of the genome from its scaffolds, utilizing various computational and wet-lab techniques, they can often produce only partial, error-prone scaffold assemblies.

Background: The ability to estimate the evolutionary distance between extant genomes plays a crucial role in many phylogenomic studies. Often such estimation is based on the parsimony assumption, implying that the distance between two genomes can be estimated as the rearrangement distance, equal to the minimal number of genome rearrangements required to transform one genome into the other. However, in reality the parsimony assumption may not always hold, emphasizing the need for estimation that does not rely on the rearrangement distance.

Background: Genome median and genome halving are combinatorial optimization problems that aim at reconstruction of ancestral genomes by minimizing the number of evolutionary events between them and genomes of the extant species. While these problems have been widely studied in past decades, their solutions are often either not efficient or not biologically adequate. These shortcomings have been recently addressed by restricting the problems' solution space.

Genome rearrangements can be modeled as k-breaks, which break a genome at k positions and glue the resulting fragments in a new order. In particular, reversals, translocations, fusions, and fissions are modeled as 2-breaks, and transpositions are modeled as 3-breaks. Although k-break rearrangements for k > 3 have not been observed in evolution, they are used in cancer genomics to model chromothripsis, a catastrophic event of multiple breakages happening simultaneously in a genome.
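The k-break operation described above can be sketched on a genome represented as a set of adjacencies between gene extremities. This is a minimal sketch under assumptions of our own: extremities are plain strings such as "1h" (head of gene 1) and "2t" (tail of gene 2), the genome is circular, and the helper name `apply_k_break` is hypothetical. A k-break removes k adjacencies and re-pairs the same 2k extremities in a new way.

```python
def apply_k_break(adjacencies, removed, added):
    """Replace k adjacencies with k new ones on the same 2k extremities.

    Each adjacency is a pair of extremity labels. The function checks that
    every removed adjacency is present and that the new adjacencies re-pair
    exactly the extremities that were freed by the break.
    """
    current = {frozenset(pair) for pair in adjacencies}
    removed = {frozenset(pair) for pair in removed}
    added = {frozenset(pair) for pair in added}
    if not removed <= current:
        raise ValueError("cannot break an adjacency that is not in the genome")
    if len(removed) != len(added):
        raise ValueError("a k-break must glue back exactly k adjacencies")
    if set().union(*removed) != set().union(*added):
        raise ValueError("new adjacencies must reuse the freed extremities")
    return (current - removed) | added


if __name__ == "__main__":
    # Circular genome (+1 +2 +3) as adjacencies between extremities.
    genome = [("1h", "2t"), ("2h", "3t"), ("3h", "1t")]
    # A reversal of gene 2 expressed as a 2-break.
    reversed_genome = apply_k_break(
        genome,
        removed=[("1h", "2t"), ("2h", "3t")],
        added=[("1h", "2h"), ("2t", "3t")],
    )
    print(sorted(sorted(pair) for pair in reversed_genome))
```

The same function covers 3-breaks and larger k simply by passing more adjacencies; it does not check whether the result is biologically plausible.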

Since most dramatic genomic changes are caused by genome rearrangements as well as gene duplications and gain/loss events, it becomes crucial to understand their mechanisms and reconstruct ancestral genomes of the given genomes. This problem was shown to be NP-complete even in the "simplest" case of three genomes, thus calling for heuristic rather than exact algorithmic solutions. At the same time, a larger number of input genomes may actually simplify the problem in practice as it was earlier illustrated with MGRA, a state-of-the-art software tool for reconstruction of ancestral genomes of multiple genomes.

Background: Anguilla japonica (Japanese eel) is currently one of the most important research subjects in eastern Asia aquaculture. The enigmatic life cycle of the organism severely limits the study of artificial reproduction. Hence, genomic and transcriptomic resources for eels are urgently needed to help solve the problems surrounding this organism across multiple fields.

A high-throughput screen for compounds that induce TRAIL-mediated apoptosis identified ML100 as an active chemical probe, which potentiated TRAIL activity in prostate carcinoma PPC-1 and melanoma MDA-MB-435 cells. Follow-up in silico modeling and profiling in cell-based assays allowed us to identify NSC130362, a pharmacophore analog of ML100 that induced 65-95% cytotoxicity in cancer cells and did not affect the viability of human primary hepatocytes. In agreement with the activation of the apoptotic pathway, both ML100 and NSC130362 acted synergistically with TRAIL to induce caspase-3/7 activity in MDA-MB-435 cells.

Advances in DNA sequencing technology over the past decade have increased the volume of raw sequenced genomic data available for further assembly and analysis. While there exist many algorithms for assembly of sequenced genomic material, they often experience difficulties in constructing complete genomic sequences. Instead, they produce long genomic subsequences (scaffolds), which then become subject to scaffold assembly aimed at reconstruction of their order along genome chromosomes.

Variation in vectorial capacity for human malaria among Anopheles mosquito species is determined by many factors, including behavior, immunity, and life history. To investigate the genomic basis of vectorial capacity and explore new avenues for vector control, we sequenced the genomes of 16 anopheline mosquito species from diverse locations spanning ~100 million years of evolution. Comparative analyses show faster rates of gene gain and loss, elevated gene shuffling on the X chromosome, and more intron losses, relative to Drosophila.

Recent advances in single-cell genomics provide an alternative to largely gene-centric metagenomics studies, enabling whole-genome sequencing of uncultivated bacteria. However, single-cell assembly projects are challenging due to (i) the highly nonuniform read coverage and (ii) a greatly elevated number of chimeric reads and read pairs. While recently developed single-cell assemblers have addressed the former challenge, methods for assembling highly chimeric reads remain poorly explored.

Error correction of sequenced reads remains a difficult task, especially in single-cell sequencing projects with extremely non-uniform coverage. While existing error correction tools designed for standard (multi-cell) sequencing data usually come up short in single-cell sequencing projects, algorithms actually used for single-cell error correction have been so far very simplistic. We introduce several novel algorithms based on Hamming graphs and Bayesian subclustering in our new error correction tool BAYESHAMMER.
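To give a feel for the Hamming graph idea mentioned above, here is a minimal sketch that is not taken from BAYESHAMMER itself: it assumes that vertices are equal-length k-mers, that two k-mers are joined when their Hamming distance is at most a threshold `tau`, and that connected components serve as candidate clusters of k-mers likely to originate from the same genomic k-mer. The naive all-pairs comparison and the function names are illustrative assumptions; a practical tool would need a far more scalable strategy, and the Bayesian subclustering step is not shown.

```python
from itertools import combinations


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("k-mers must have equal length")
    return sum(x != y for x, y in zip(a, b))


def hamming_components(kmers, tau=1):
    """Connected components of the graph joining k-mers within distance tau."""
    kmers = sorted(set(kmers))
    parent = {k: k for k in kmers}

    def find(k):
        # Union-find lookup with path halving.
        while parent[k] != k:
            parent[k] = parent[parent[k]]
            k = parent[k]
        return k

    # Naive O(n^2) pairwise comparison, acceptable only for small inputs.
    for a, b in combinations(kmers, 2):
        if hamming(a, b) <= tau:
            parent[find(a)] = find(b)

    groups = {}
    for k in kmers:
        groups.setdefault(find(k), []).append(k)
    return sorted(groups.values())


if __name__ == "__main__":
    observed = ["ACGT", "ACGA", "TTTT", "TTTA", "GGGG"]
    for component in hamming_components(observed, tau=1):
        print(component)
```

Each component groups k-mers that differ by a few substitutions, which is the starting point for deciding which k-mer in the group is the likely error-free one.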

In comparative genomics, the rearrangement distance between two genomes (equal to the minimal number of genome rearrangements required to transform them into a single genome) is often used for measuring their evolutionary remoteness. The generalization of this measure to three genomes is known as the median score, and a resulting genome is called a median genome. In contrast to the rearrangement distance between two genomes, which can be computed in linear time, computing the median score for three genomes is NP-hard.

Article Synopsis
  • Evolutionary relationships among placental mammalian orders have been debated, but new genome sequencing and computational methods are helping to clarify these connections.
  • Using a double cut and join distance metric to analyze gene order, researchers found that Rodentia is distinct from the group that includes Primates, Carnivora, Perissodactyla, and Artiodactyla.
  • The study also examined breakpoint reuse and synteny block lengths, confirming gene order as a valid method for studying phylogenetics and discussing factors affecting different interpretations of mammalian lineage relationships.

One of the key advances in genome assembly that has led to a significant improvement in contig lengths has been improved algorithms for utilization of paired reads (mate-pairs). While in most assemblers, mate-pair information is used in a post-processing step, the recently proposed Paired de Bruijn Graph (PDBG) approach incorporates the mate-pair information directly in the assembly graph structure. However, the PDBG approach faces difficulties when the variation in the insert sizes is high.

The lion's share of bacteria in various environments cannot be cloned in the laboratory and thus cannot be sequenced using existing technologies. A major goal of single-cell genomics is to complement gene-centric metagenomic data with whole-genome assemblies of uncultivated organisms. Assembly of single-cell data is challenging because of highly non-uniform read coverage as well as elevated levels of sequencing errors and chimeric reads.

Background: An important question in genome evolution is whether there exist fragile regions (rearrangement hotspots) where chromosomal rearrangements are happening over and over again. Although nearly all recent studies supported the existence of fragile regions in mammalian genomes, the most comprehensive phylogenomic study of mammals raised some doubts about their existence.

Results: Here we demonstrate that fragile regions are subject to a birth and death process, implying that fragility has a limited evolutionary lifespan.

Recently completed whole-genome sequencing projects marked the transition from gene-based phylogenetic studies to phylogenomics analysis of entire genomes. We developed an algorithm MGRA for reconstructing ancestral genomes and used it to study the rearrangement history of seven mammalian genomes: human, chimpanzee, macaque, mouse, rat, dog, and opossum. MGRA relies on the notion of the multiple breakpoint graphs to overcome some limitations of the existing approaches to ancestral genome reconstructions.

Multi-break rearrangements break a genome into multiple fragments and further glue them together in a new order. While 2-break rearrangements represent standard reversals, fusions, fissions, and translocations, 3-break rearrangements represent a natural generalization of transpositions. Alekseyev and Pevzner (2007a, 2008a) studied multi-break rearrangements in circular genomes and further applied them to the analysis of chromosomal evolution in mammalian genomes.
