Comput Struct Biotechnol J
July 2025
The development of modern DNA sequencing technologies has resulted in the rapid growth of genomic data. Alongside the collection of this data, there is an increasing need for the development of modern computational tools leveraging this data for tasks including but not limited to antimicrobial resistance and gene annotation. Current deep learning architectures and tokenization techniques have been explored for the extraction of meaningful underlying information contained within this sequencing data.
Motivation: Antibiotic resistance in Mycobacterium tuberculosis (MTB) poses a significant challenge to global public health. Rapid and accurate prediction of antibiotic resistance can inform treatment strategies and mitigate the spread of resistant strains. In this study, we present a novel approach leveraging large language models (LLMs) to predict antibiotic resistance in MTB (LLMTB).
This study leverages the Bacterial and Viral Bioinformatics Resource Center (BV-BRC) to analyze over 27,000 Mycobacterium tuberculosis (MTB) genomic strains, providing a comprehensive, large-scale overview of antimicrobial resistance (AMR) prevalence and resistance patterns. We used MTB++, the newest and most comprehensive AI-based MTB drug resistance profiler, to predict the resistance profile of each of the 27,000 MTB isolates and then used feature analysis to identify key genes associated with resistance. There are three main contributions to this study.
The correct interpretation of breast density is important in the assessment of breast cancer risk. AI has been shown capable of accurately predicting breast density; however, because imaging characteristics differ across mammography systems, models built using data from one system do not generalize well to other systems. Though federated learning (FL) has emerged as a way to improve the generalizability of AI without the need to share data, the best way to preserve features from all training data during FL is an active area of research.