Commun Med (Lond)
September 2025
Background: Klebsiella pneumoniae is ubiquitous in animals, humans, and the environment, facilitating the dissemination of antimicrobial resistance (AMR) and virulence traits. Most studies focus on human clinical isolates, leaving critical gaps in understanding non-human reservoirs and cross-species transmission risks.
Methods: We combined large-scale genomic analyses with in vitro and in vivo infection models to characterize the evolutionary dynamics of 2809 K.
Respiratory disease (RD) is a leading threat to the pig industry worldwide, but the pathogens associated with swine RD remain poorly understood. In this study, we conducted nationwide genomic surveillance to identify viruses, bacteria, and antimicrobial resistance genes (ARGs) in the lungs of pigs with RD in China. By combining metatranscriptomic and metagenomic sequencing, we identified 21 viral species belonging to 12 viral families.
Genomics Inform
October 2024
The extraction of biological regulation events has been a key focus in the field of biomedical natural language processing (BioNLP). However, existing methods often encounter challenges such as cascading errors in text mining pipelines and limited topic coverage in the selected corpus. The emergence of large language models (LLMs) presents a potential solution due to their robust semantic understanding and extensive knowledge base.
Despite the abundance of genotype-phenotype association studies, the resulting associations often lack robustness and interpretability. To address these challenges, we introduce PheSeq, a Bayesian deep learning model that enhances and interprets association studies through the integration and perception of phenotype descriptions. Applying the PheSeq model in three case studies on Alzheimer's disease, breast cancer, and lung cancer, we identify 1024 priority genes for Alzheimer's disease and 818 and 566 genes for breast cancer and lung cancer, respectively.
It is vital to investigate the complex mechanisms underlying tumors to better understand cancer and develop effective treatments. Metabolic abnormalities and clinical phenotypes can serve as essential biomarkers for diagnosing this challenging disease. Additionally, genetic alterations provide profound insights into the fundamental aspects of cancer.
Proliferative enteropathy caused by Lawsonia intracellularis is an economically important disease for the pig industry, but knowledge about the prevalence of L. intracellularis in pig farms in China is limited. In addition, no complete genome sequence is available for L. intracellularis isolates from China. In this study, we developed a TaqMan qPCR assay for screening L. intracellularis by targeting the bacterial 16S rDNA gene.
Motivation: Node embedding of biological entity networks has been widely investigated for downstream applications. To embed the full semantics of genes and diseases, a multi-relational heterogeneous graph is considered, in which uni-relations between gene/disease and other heterogeneous entities are abundant while multi-relations between gene and disease are relatively sparse. Given this novel graph format, it is instructive to design a dedicated data integration algorithm that fully captures the graph information and produces high-quality embeddings.
Genomics Inform
September 2021
Due to the rapid evolution of high-throughput technologies, a tremendous amount of data is being produced in the biological domain, which poses a challenging task for information extraction and natural language understanding. Biological named entity recognition (NER) and named entity normalisation (NEN) are two common tasks aiming at identifying and linking biologically important entities such as genes or gene products mentioned in the literature to biological databases. In this paper, we present an updated version of OryzaGP, a gene and protein dataset for rice species created to help natural language processing (NLP) tools in processing NER and NEN tasks.
Background: Natural language processing has long been applied in various applications for biomedical knowledge inference and discovery. Enrichment analysis based on named entity recognition is a classic application for inferring enriched associations in terms of specific biomedical entities such as genes, chemicals, and mutations.
Objective: The aim of this study was to investigate the effect of pathway enrichment evaluation with respect to biomedical text-mining results and to develop a novel metric to quantify the effect.