Aligning Protein-Coding Nucleotide Sequences with MACSE.

Vincent Ranwez, Nathalie Chantret, Frédéric Delsuc

Methods Mol Biol

Institut des Sciences de l'Evolution de Montpellier (ISEM), CNRS, IRD, EPHE, Université de Montpellier, Montpellier, France.

Published: April 2021

Most genomic and evolutionary comparative analyses rely on accurate multiple sequence alignments. With their underlying codon structure, protein-coding nucleotide sequences pose a specific challenge for multiple sequence alignment. Multiple Alignment of Coding Sequences (MACSE) is a multiple sequence alignment program that provided the first automatic solution for aligning protein-coding gene datasets containing both functional and nonfunctional sequences (pseudogenes). Its unique features make it possible to build reliable codon alignments in the presence of frameshifts and stop codons; these alignments are suitable for subsequent analyses of selection based on the ratio of nonsynonymous to synonymous substitutions. Here we offer a practical overview and guidelines on the use of MACSE v2. This major update of the initial algorithm now comes with a graphical interface providing user-friendly access to different subprograms for handling multiple alignments of protein-coding sequences. We also present new pipelines, based on MACSE v2 subprograms, that handle large datasets and are distributed as Singularity containers. MACSE and associated pipelines are available at: https://bioweb.supagro.inra.fr/macse/.
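
To illustrate why frameshifts and stop codons make coding sequences hard to align, the following minimal Python sketch translates a short sequence and counts its internal stop codons. It is not MACSE's algorithm and does not call MACSE; the function names (`translate`, `internal_stops`) and the toy sequences are hypothetical and were chosen only for this illustration. The sketch assumes the standard genetic code.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered by T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq: str, frame: int = 0) -> str:
    """Translate complete codons of `seq` starting at offset `frame` (0, 1, or 2).

    Codons containing characters outside A, C, G, T are translated as 'X'.
    A trailing incomplete codon is ignored.
    """
    seq = seq.upper()
    protein = []
    for i in range(frame, len(seq) - 2, 3):
        protein.append(CODON_TABLE.get(seq[i:i + 3], "X"))
    return "".join(protein)


def internal_stops(seq: str, frame: int = 0) -> int:
    """Count stop codons that occur before the last translated codon."""
    return translate(seq, frame)[:-1].count("*")


# Toy example (hypothetical sequences): an intact coding sequence and a copy
# with a single-nucleotide deletion, as might be seen in a pseudogene.
functional = "ATGGCTAAATGGTTAGACTAA"
pseudogene = "ATGGTAAATGGTTAGACTAA"  # the 'C' of the second codon is deleted

if __name__ == "__main__":
    for name, seq in [("functional", functional), ("pseudogene", pseudogene)]:
        print(name, translate(seq), internal_stops(seq))
```

In this toy case, the single deletion shifts the reading frame, so every downstream codon changes and a premature stop codon appears. An aligner that ignores codon structure, or one that translates every sequence in a single fixed frame, would treat the two sequences as far more divergent than they are; this is the kind of situation a frameshift-aware codon aligner is meant to handle.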

DOI: http://dx.doi.org/10.1007/978-1-0716-1036-7_4
