Chemical Language Model Linker: blending text and molecules with modular adapters.

ArXiv

Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI 53715, United States.

Published: June 2025



Article Abstract

The development of large language models and multi-modal models has enabled the appealing idea of generating novel molecules from text descriptions. Generative modeling would shift the paradigm from relying on large-scale chemical screening to find molecules with desired properties to directly generating those molecules. However, multi-modal models combining text and molecules are often trained from scratch, without leveraging existing high-quality pretrained models. Training from scratch consumes more computational resources and prohibits model scaling. In contrast, we propose a lightweight adapter-based strategy named Chemical Language Model Linker (ChemLML). ChemLML blends the two single-domain models and obtains conditional molecular generation from text descriptions while still operating in the specialized embedding spaces of the molecular domain. ChemLML can tailor diverse pretrained text models for molecule generation by training relatively few adapter parameters. We find that the choice of molecular representation used within ChemLML, SMILES versus SELFIES, has a strong influence on conditional molecular generation performance. SMILES is often preferable despite not guaranteeing valid molecules. We raise issues in using the entire PubChem dataset of molecules and their associated descriptions for evaluating molecule generation and provide a filtered version of the dataset as a generation test set. To demonstrate how ChemLML could be used in practice, we generate candidate protein inhibitors and use docking to assess their quality and also generate candidate membrane-permeable molecules.
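
The remark that SMILES does not guarantee valid molecules can be made concrete with a small sketch. A generated SMILES string may leave a branch parenthesis, a bracket atom, or a ring-closure label unclosed, whereas SELFIES is designed so that every string decodes to a valid molecule. The checker below is an illustration only and is not part of ChemLML: the function name `smiles_syntax_problems` is hypothetical, it looks only at coarse bracket, branch, and ring-label bookkeeping, and it ignores valence and aromaticity. A real pipeline would more likely rely on a cheminformatics toolkit for validity checks.

```python
DIGITS = "0123456789"


def smiles_syntax_problems(smiles: str) -> list[str]:
    """Return coarse syntax problems found in a SMILES string.

    Hypothetical helper for illustration: an empty list means none of the
    checked problems were found, not that the molecule is chemically valid.
    """
    problems = []
    depth = 0            # current branch nesting depth
    open_rings = set()   # ring-closure labels opened but not yet closed
    i = 0
    while i < len(smiles):
        ch = smiles[i]
        if ch == "[":
            # Skip bracket atoms such as [NH4+]; digits inside are not ring labels.
            end = smiles.find("]", i)
            if end == -1:
                problems.append("unclosed bracket atom")
                break
            i = end + 1
            continue
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                problems.append("unmatched ')'")
                depth = 0
        elif ch in DIGITS or ch == "%":
            if ch == "%":
                # Two-digit ring labels are written as %nn.
                label = smiles[i + 1:i + 3]
                if len(label) != 2 or any(c not in DIGITS for c in label):
                    problems.append("malformed %nn ring label")
                    break
                i += 2
            else:
                label = ch
            ring = int(label)
            # The first occurrence opens a ring bond, the second closes it.
            if ring in open_rings:
                open_rings.remove(ring)
            else:
                open_rings.add(ring)
        i += 1
    if depth > 0:
        problems.append("unclosed '('")
    if open_rings:
        labels = ", ".join(str(r) for r in sorted(open_rings))
        problems.append("unclosed ring bond(s): " + labels)
    return problems
```

For example, `smiles_syntax_problems("c1ccccc1")` (benzene) returns an empty list, while the truncated string `"c1ccccc"` is flagged because ring label 1 is never closed.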


Source
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC12047907PMC

