Article Abstract

Motivation: The development of novel compounds targeting proteins of interest is one of the most important tasks in the pharmaceutical industry. Deep generative models have been applied to targeted molecular design and have shown promising results. Recently, target-specific molecule generation has been viewed as a translation between the protein language and the chemical language. However, such a model is limited by the availability of interacting protein-ligand pairs. On the other hand, large amounts of unlabelled protein sequences and chemical compounds are available and have been used to train language models that learn useful representations. In this study, we propose exploiting pretrained biochemical language models to initialize (i.e. warm start) targeted molecule generation models. We investigate two warm start strategies: (i) a one-stage strategy where the initialized model is trained on targeted molecule generation and (ii) a two-stage strategy containing a pre-finetuning on molecular generation followed by target-specific training. We also compare two decoding strategies to generate compounds: beam search and sampling.
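The two decoding strategies named above can be contrasted with a minimal sketch. Everything in it is an illustrative assumption and is not taken from the authors' code: the toy vocabulary, the `TOY_MODEL` table, and the helper names `next_token_probs`, `beam_search`, and `sample` are hypothetical. A real model would condition on the target protein and the full prefix, and the beam search here is a simplified variant that sets finished sequences aside instead of letting them keep competing.

```python
import math
import random

# Hypothetical toy next-token model (an assumption for illustration only):
# the distribution over the next token depends only on the previous token.
TOY_MODEL = {
    "<s>": {"C": 0.6, "N": 0.3, "O": 0.1},
    "C": {"C": 0.5, "O": 0.2, "N": 0.1, "</s>": 0.2},
    "N": {"C": 0.7, "</s>": 0.3},
    "O": {"C": 0.4, "</s>": 0.6},
}


def next_token_probs(prefix):
    """Return the next-token distribution for a token prefix."""
    return TOY_MODEL[prefix[-1]]


def beam_search(beam_width=3, max_len=6):
    """Deterministic decoding: keep the beam_width best prefixes per step."""
    beams = [(0.0, ["<s>"])]  # (cumulative log-probability, tokens)
    finished = []
    for _ in range(max_len):
        candidates = []
        for score, seq in beams:
            for tok, p in next_token_probs(seq).items():
                candidates.append((score + math.log(p), seq + [tok]))
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = []
        for score, seq in candidates[:beam_width]:
            # Sequences that emit the end token leave the beam.
            (finished if seq[-1] == "</s>" else beams).append((score, seq))
        if not beams:
            break
    # Prefixes still unfinished after max_len steps are discarded.
    finished.sort(key=lambda c: c[0], reverse=True)
    return finished


def sample(rng, max_len=6):
    """Stochastic decoding: draw each token from the model distribution."""
    seq = ["<s>"]
    for _ in range(max_len):
        probs = next_token_probs(seq)
        tok = rng.choices(list(probs), weights=list(probs.values()))[0]
        seq.append(tok)
        if tok == "</s>":
            break
    return seq


if __name__ == "__main__":
    for score, seq in beam_search():
        print("beam  ", "".join(seq[1:-1]), round(score, 3))
    rng = random.Random(0)
    for _ in range(3):
        print("sample", "".join(t for t in sample(rng) if t not in ("<s>", "</s>")))
```

Beam search returns the same ranked list on every call, whereas sampling depends on the random seed and can return low-probability or unfinished sequences. That difference is the trade-off the comparison in the abstract is probing.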

Results: The results show that the warm-started models perform better than a baseline model trained from scratch. The two proposed warm-start strategies achieve similar results to each other with respect to widely used metrics from benchmarks. However, docking evaluation of the generated compounds for a number of novel proteins suggests that the one-stage strategy generalizes better than the two-stage strategy. Additionally, we observe that beam search outperforms sampling in both docking evaluation and benchmark metrics for assessing compound quality.

Availability And Implementation: The source code is available at https://github.com/boun-tabi/biochemical-lms-for-drug-design and the materials (i.e., data, models, and outputs) are archived in Zenodo at https://doi.org/10.5281/zenodo.6832145.

Supplementary Information: Supplementary data are available at Bioinformatics online.

Source
http://dx.doi.org/10.1093/bioinformatics/btac482