The correct mapping of the proteome is an important step towards advancing our understanding of biological systems and cellular mechanisms. Methods that provide better mappings can fuel important processes such as drug discovery and disease understanding. Currently, true determination of translation initiation sites is primarily achieved by experiments. Here, we propose TIS Transformer, a deep learning model for the determination of translation start sites solely utilizing the information embedded in the transcript nucleotide sequence. The method is built upon deep learning techniques first designed for natural language processing. We prove this approach to be best suited for learning the semantics of translation, outperforming previous approaches by a large margin. We demonstrate that limitations in the model performance are primarily due to the presence of low-quality annotations against which the model is evaluated. Advantages of the method are its ability to detect key features of the translation process and multiple coding sequences on a transcript. These include micropeptides encoded by short Open Reading Frames, either alongside a canonical coding sequence or within long non-coding RNAs. To demonstrate the use of our methods, we applied TIS Transformer to remap the full human proteome.
| Download full-text PDF | Source |
|---|---|
| http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9985340 | PMC |
| http://dx.doi.org/10.1093/nargab/lqad021 | DOI Listing |
Brief Bioinform
July 2025
Bioinformatics Laboratory, College of Computing, University Mohammed VI Polytechnic, Lot 660, Hay Moulay Rachid, Ben Guerir 43150, Morocco.
Accurate bacterial gene prediction is essential for understanding microbial functions and advancing biotechnology. Traditional methods based on sequence homology and statistical models often struggle with complex genetic variations and novel sequences due to their limited ability to interpret the "language of genes." To overcome these challenges, we explore genomic language models (gLMs), inspired by large language models in natural language processing, to enhance bacterial gene prediction.
Mol Pharmacol
April 2024
Department of Pharmacology and Toxicology, School of Medicine, Virginia Commonwealth University, Richmond, Virginia (A.M.E., D.A.G.); Department of Pharmacology and Toxicology, Faculty of Pharmacy, Kafrelsheikh University, Kafrelsheikh, Egypt (A.M.E.); and Department of Pharmacology and Public Healt
Artificial intelligence (AI) platforms, such as Generative Pretrained Transformer (ChatGPT), have achieved a high degree of popularity within the scientific community due to their utility in providing evidence-based reviews of the literature. However, the accuracy and reliability of the information output and the ability to provide critical analysis of the literature, especially with respect to highly controversial issues, have generally not been evaluated. In this work, we arranged a question/answer session with ChatGPT regarding several unresolved questions in the field of cancer research relating to therapy-induced senescence (TIS), including the topics of senescence reversibility, its connection to tumor dormancy, and the pharmacology of the newly emerging drug class of senolytics.
NAR Genom Bioinform
March 2023
Department of Data Analysis and Mathematical Modelling, Ghent University, Ghent, Oost-Vlaanderen 9000, Belgium.
The correct mapping of the proteome is an important step towards advancing our understanding of biological systems and cellular mechanisms. Methods that provide better mappings can fuel important processes such as drug discovery and disease understanding. Currently, true determination of translation initiation sites is primarily achieved by experiments.
Reprod Toxicol
October 2022
DeepDoc Academy, Rotterdam, the Netherlands; Department of Child Neurology, UMC Utrecht Brain Center, University Medical Center (UMC) Utrecht, Utrecht, the Netherlands.
The Dutch Teratology Information Service Lareb counsels healthcare professionals and patients about medication use during pregnancy and lactation. To keep the evidence up to date, employees perform a standardized weekly PubMed query where relevant literature is identified manually. We aimed to develop an accurate machine-learning algorithm to predict the relevance of PubMed entries, thereby reducing the labor-intensive task of manually screening the articles.