Fine-tuning protein language models to understand the functional impact of missense variants.

Comput Struct Biotechnol J

School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne, Lausanne, Switzerland.

Published: May 2025


Article Abstract

Elucidating the functional effects of missense variants is crucial yet challenging. To investigate their impact, we fine-tuned protein language models, including ESM2 and ProtT5, to classify 20 protein features at amino acid resolution. In addition, we trained a fully connected neural network classifier on frozen embeddings and compared its performance to fine-tuning in order to quantify the added value of task-specific adaptation. We then used the fine-tuned models to: 1) identify protein features enriched in either pathogenic or benign missense variants, and 2) compare the predicted feature profiles of proteins with reference and alternate alleles to understand how missense variants affect protein functionality. We show that our models can be used to reclassify variants of uncertain significance and provide mechanistic insights into the functional consequences of missense mutations.
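The two downstream analyses named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the fine-tuned model has already produced per-residue probabilities for each feature, and the function names, the 0.5 threshold, the feature labels, and the use of an odds ratio with a Haldane-Anscombe correction are all hypothetical choices made for illustration.

```python
from math import exp, log, sqrt


def feature_enrichment(path_in, path_out, benign_in, benign_out):
    """Odds ratio and approximate 95% CI for a feature in pathogenic vs. benign variants.

    Arguments are counts of variants inside/outside residues predicted to carry
    the feature. Adding 0.5 to each cell (Haldane-Anscombe correction) avoids
    division by zero; this statistic is an assumption, not the paper's method.
    """
    a, b, c, d = (n + 0.5 for n in (path_in, path_out, benign_in, benign_out))
    odds_ratio = (a * d) / (b * c)
    se = sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # standard error of log(odds ratio)
    lower = exp(log(odds_ratio) - 1.96 * se)
    upper = exp(log(odds_ratio) + 1.96 * se)
    return odds_ratio, lower, upper


def profile_changes(ref_probs, alt_probs, threshold=0.5):
    """List features gained or lost when comparing reference and alternate profiles.

    ref_probs and alt_probs map a feature name to a list of per-residue
    probabilities of equal length. Positions are reported 1-based.
    """
    changes = []
    for feature, ref in ref_probs.items():
        alt = alt_probs[feature]
        for pos, (r, a) in enumerate(zip(ref, alt), start=1):
            if r >= threshold > a:
                changes.append((pos, feature, "lost"))
            elif a >= threshold > r:
                changes.append((pos, feature, "gained"))
    return changes


# Toy inputs (made up for illustration, not data from the paper).
ref = {"binding site": [0.1, 0.9, 0.2], "helix": [0.8, 0.7, 0.6]}
alt = {"binding site": [0.1, 0.3, 0.2], "helix": [0.8, 0.7, 0.6]}
changes = profile_changes(ref, alt)
odds_ratio, lower, upper = feature_enrichment(30, 70, 10, 90)
```

An odds ratio whose confidence interval lies entirely above 1 would suggest the feature is enriched in pathogenic variants. A "lost" entry flags a residue where the alternate allele drops below the threshold for a feature that the reference sequence was predicted to have.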

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC12166733
DOI: http://dx.doi.org/10.1016/j.csbj.2025.05.022
