A variety of deep generative models have been adopted to perform functional protein generation. Compared to 3D protein design, sequence-based generation methods, which aim to generate amino acid sequences with desired functions, remain a major approach for functional protein generation due to the abundance and quality of protein sequence data, as well as the relatively low modeling complexity for training. Although these models are typically trained to match protein sequences from the training data, exact matching of every amino acid is not always essential. Certain amino acid changes (e.g., mismatches, insertions, and deletions) may not necessarily lead to functional changes. This suggests that maximizing the training data likelihood beyond the amino acid sequence space could yield better generative models. Pre-trained protein large language models (PLMs) like ESM2 can encode protein sequences into a latent space, potentially serving as functional validators. We propose training functional protein sequence generative models by simultaneously optimizing the likelihood of training data in both the amino acid sequence space and the latent space derived from a PLM. This training scheme can also be viewed as a knowledge distillation approach that dynamically re-weights samples during training. We applied our method to train GPT-like models (i.e., autoregressive transformers) for antimicrobial peptide (AMP) and malate dehydrogenase (MDH) generation tasks. Computational experiments confirmed that our method outperformed various deep generative models (e.g., generative adversarial net, variational autoencoder, and GPT model without the proposed training strategy) on these tasks, demonstrating the effectiveness of our multi-likelihood optimization strategy.
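The combined objective described above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the authors' implementation: `toy_embed` is a hypothetical stand-in for a PLM encoder such as ESM2 (it only computes amino acid composition), the latent-space term is modeled as a unit-variance Gaussian around the target embedding, and `weight` is a hypothetical balancing coefficient. The abstract does not specify any of these choices.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def sequence_nll(token_probs):
    """Mean negative log-likelihood in amino acid sequence space.

    token_probs[i] is the probability the generator assigned to the true
    residue at position i (a hypothetical input for this sketch).
    """
    return -sum(math.log(p) for p in token_probs) / len(token_probs)


def toy_embed(seq):
    """Hypothetical stand-in for a PLM encoder (NOT ESM2 itself).

    Returns the amino acid composition of seq as a 20-dimensional vector.
    """
    return [seq.count(a) / len(seq) for a in AMINO_ACIDS]


def latent_nll(generated, target, embed=toy_embed):
    """Latent-space term: half the squared distance between embeddings.

    This equals the negative log-likelihood of a unit-variance isotropic
    Gaussian centred on the target embedding, up to an additive constant
    (an assumed form, chosen only for illustration).
    """
    g, t = embed(generated), embed(target)
    return 0.5 * sum((a - b) ** 2 for a, b in zip(g, t))


def combined_loss(token_probs, generated, target, weight=1.0):
    """Sequence-space loss plus a weighted latent-space loss."""
    return sequence_nll(token_probs) + weight * latent_nll(generated, target)


# Illustrative strings only; not real training data.
target = "KLLKKLLK"
generated = "KLLKRLLK"                 # one substitution relative to target
probs = [0.9, 0.8, 0.9, 0.7, 0.2, 0.8, 0.9, 0.9]
loss = combined_loss(probs, generated, target, weight=0.5)
```

In this sketch a substitution that barely moves the embedding adds little to the latent term even when the token-level term penalizes it, which mirrors the abstract's point that some residue changes need not alter function. A real training loop would need a differentiable encoder and generator output in place of these plain Python functions.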
Download full-text PDF | Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11741333 | PMC |
http://dx.doi.org/10.1101/2025.01.07.631724 | DOI Listing |
Neurochem Res
September 2025
Biology and Health Laboratory, Faculty of Sciences, Ibn Tofail University, Kenitra, Morocco.
Parkinson's disease (PD) is characterized by impairments in motor control following the degeneration of dopamine-producing neurons located in the substantia nigra pars compacta. Environmental pesticides such as Paraquat (PQ) and Maneb (MB) contribute to the onset of PD by inducing oxidative stress (OS). This study evaluated the therapeutic efficacy of moderate physical activity (PA) on both motor and non-motor symptoms in a Wistar rat model of Paraquat and Maneb (PQ/MB) induced PD.
Eur J Clin Microbiol Infect Dis
September 2025
School of Bioengineering and Biosciences, Department of Biochemistry, Lovely Professional University, Punjab, 144411, India.
Purpose: This study investigates codon usage and amino acid usage bias in the genus Acinetobacter to uncover the evolutionary forces shaping these patterns and their implications for pathogenicity and biotechnology.
Methods: Codon usage patterns were examined in representative genomes of the genus Acinetobacter using standard codon bias indices, including GC content, relative synonymous codon usage (RSCU), effective number of codons (ENC), and codon adaptation index (CAI). Neutrality and parity plots were employed to evaluate the relative influence of mutational pressure and natural selection on codon preferences.
Vet Res Commun
September 2025
Department of Physiology, Faculty of Veterinary Medicine, Cairo University, PO 11221, Giza, Egypt.
This comprehensive review examines the versatile applications and effects of Moringa oleifera across multiple fish species in aquaculture systems amid growing challenges of rising feed costs and antimicrobial resistance. M. oleifera, commonly called the Miracle tree, contains an exceptional nutritional profile with high protein content (22.
Mol Biol Rep
September 2025
Cytogenetics and Molecular Genetics Lab, Pathology Unit, Medical Division (BARC Hospital), Bhabha Atomic Research Centre, Anushakti Nagar, Mumbai, India.
Background: Hearing loss (HL) is one of the most common congenital anomalies and is a complex, etiologically diverse condition. Molecular genetic characterization of HL remains challenging owing to its high genetic heterogeneity. This study aimed to screen for potential disease-causing genetic variations in a cohort of Indian patients with congenital bilateral severe-to-profound sensorineural HL.
Curr Microbiol
September 2025
Department of Integrative Biotechnology, Sungkyunkwan University, Natural Science Campus, 2066 Seobu-ro, Jangan-Gu, Suwon-Si, Gyeonggi-Do, 16419, Republic of Korea.
A novel bacterial strain, SM-13, was isolated from the rhizospheric soil of Epipremnum aureum (Jade Pothos) sampled in Suwon, Republic of Korea. The isolate was Gram-stain-negative, aerobic, motile, rod-shaped, cream-coloured, and oxidase- and catalase-positive. Strain SM-13 grew in the range of 15-37 °C (optimum, 25 °C), at pH 6.