The Monarch Initiative in 2024: an analytic platform integrating phenotypes, genes and diseases across species.

Tim E Putman, Kevin Schaper, Nicolas Matentzoglu, Vincent P Rubinetti, Faisal S Alquaddoomi, Corey Cox, J Harry Caufield, Glass Elsarboukh, Sarah Gehrke, Harshad Hegde, Justin T Reese, Ian Braun, Richard M Bruskiewich, Luca Cappelletti, Seth Carbon, Anita R Caron, Lauren E Chan, Christopher G Chute, Katherina G Cortes, Vinícius De Souza

Nucleic Acids Res

Department of Biomedical Informatics, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA.

Published: January 2024

Bridging the gap between genetic variations, environmental determinants, and phenotypic outcomes is critical for supporting clinical diagnosis and understanding mechanisms of diseases. It requires integrating open data at a global scale. The Monarch Initiative advances these goals by developing open ontologies, semantic data models, and knowledge graphs for translational research. The Monarch App is an integrated platform combining data about genes, phenotypes, and diseases across species. Monarch's APIs enable access to carefully curated datasets and advanced analysis tools that support the understanding and diagnosis of disease for diverse applications such as variant prioritization, deep phenotyping, and patient profile-matching. We have migrated our system into a scalable, cloud-based infrastructure; simplified Monarch's data ingestion and knowledge graph integration systems; enhanced data mapping and integration standards; and developed a new user interface with novel search and graph navigation features. Furthermore, we advanced Monarch's analytic tools by developing a customized plugin for OpenAI's ChatGPT to increase the reliability of its responses about phenotypic data, allowing us to interrogate the knowledge in the Monarch graph using state-of-the-art Large Language Models. The resources of the Monarch Initiative can be found at monarchinitiative.org and its corresponding code repository at github.com/monarch-initiative/monarch-app.

PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10767791
DOI: http://dx.doi.org/10.1093/nar/gkad1082

Similar Publications

Genes associated with genetic and rare lung diseases and the risk of lung cancer.

Res Sq

August 2025

Department of Genetic Epidemiology, University Medical Center, Georg-August-University Göttingen, Göttingen, Germany.

Albert Rosenberger, Heike Bickeböller, David C Christiani, Neil E Caporaso, Geoffrey Liu

Background: We investigated whether markers, genes or terms of the associated with genetic or rare diseases (GARDs) that affect airway or lung function are associated with lung cancer.

Methods: Genes of interest were extracted from , , and Monarch Initiative. Individual SNP, gene level and gene-set analyses were performed for 52,207 SNPs, 1,677 genes or for 620 terms of the .

Leveraging generative AI to assist biocuration of medical actions for rare disease.

Bioinform Adv

June 2025

The Jackson Laboratory for Genomic Medicine, Farmington, CT 06032, United States.

Enock Niyonkuru, J Harry Caufield, Leigh C Carmody, Michael A Gargano, Sabrina Toro

Motivation: Structured representations of clinical data can support computational analysis of individuals and cohorts, and ontologies representing disease entities and phenotypic abnormalities are now commonly used for translational research. The Medical Action Ontology (MAxO) provides a computational representation of treatments and other actions taken for clinical management. Currently, manual biocuration is used to annotate MAxO terms to rare diseases.

Systematic benchmarking demonstrates large language models have not reached the diagnostic accuracy of traditional rare-disease decision support tools.

medRxiv

May 2025

Monarch Initiative.

Justin T Reese, Leonardo Chimirri, Yasemin Bridges, Daniel Danis, J Harry Caufield

Article Synopsis

- Large language models (LLMs) are being tested for their ability to help diagnose genetic diseases, but their evaluation is complicated due to how they generate unstructured responses.
- Researchers benchmarked LLMs against 5,213 case reports using established phenotypic criteria and compared their performance to a traditional diagnostic tool, Exomiser.
- The best-performing LLM correctly diagnosed cases 23.6% of the time, while Exomiser achieved 35.5%, indicating that while LLMs are improving, they still lag behind conventional bioinformatics methods and need further research for effective integration into diagnostic processes.

FastHPOCR: pragmatic, fast, and accurate concept recognition using the human phenotype ontology.

Bioinformatics

July 2024

Berlin Institute of Health at Charité - Universitätsmedizin Berlin, Charitéplatz 1, 10117 Berlin, Germany.

Tudor Groza, Dylan Gration, Gareth Baynam, Peter N Robinson

Motivation: Human Phenotype Ontology (HPO)-based phenotype concept recognition (CR) underpins a faster and more effective mechanism to create patient phenotype profiles or to document novel phenotype-centred knowledge statements. While the increasing adoption of large language models (LLMs) for natural language understanding has led to several LLM-based solutions, we argue that their intrinsic resource-intensive nature is not suitable for realistic management of the phenotype CR lifecycle. Consequently, we propose to go back to the basics and adopt a dictionary-based approach that enables both an immediate refresh of the ontological concepts as well as efficient re-analysis of past data.

Node-degree aware edge sampling mitigates inflated classification performance in biomedical random walk-based graph representation learning.

Bioinform Adv

March 2024

The Jackson Laboratory for Genomic Medicine, Farmington, CT 06032, United States.

Luca Cappelletti, Lauren Rekerle, Tommaso Fontana, Peter Hansen, Elena Casiraghi

Motivation: Graph representation learning is a family of related approaches that learn low-dimensional vector representations of nodes and other graph elements called embeddings. Embeddings approximate characteristics of the graph and can be used for a variety of machine-learning tasks such as novel edge prediction. For many biomedical applications, partial knowledge exists about positive edges that represent relationships between pairs of entities, but little to no knowledge is available about negative edges that represent the explicit lack of a relationship between two nodes.
