Publications by authors named "Julius O Jacobsen"

Article Synopsis
  • Large language models (LLMs) are being tested for their ability to help diagnose genetic diseases, but evaluating them is complicated by the unstructured responses they generate.
  • Researchers benchmarked LLMs on 5,213 case reports using established phenotypic criteria and compared their performance with that of a traditional diagnostic tool, Exomiser.
  • The best-performing LLM correctly diagnosed 23.6% of cases, whereas Exomiser achieved 35.5%, indicating that LLMs, although improving, still lag behind conventional bioinformatics methods and need further research before they can be integrated effectively into diagnostic processes.
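
The percentages in the synopsis read like a top-ranked-diagnosis accuracy. As a rough illustration only, the sketch below assumes each tool returns a ranked candidate list per case and that a case counts as solved when the correct diagnosis sits at rank 1. The function name `top_k_accuracy` and the toy ranks are hypothetical and are not taken from the study.

```python
from typing import Optional, Sequence


def top_k_accuracy(ranks: Sequence[Optional[int]], k: int = 1) -> float:
    """Fraction of cases whose correct diagnosis appears at rank <= k.

    `ranks` holds, per case, the 1-based rank of the correct diagnosis in
    the tool's candidate list, or None when it was not returned at all.
    """
    if not ranks:
        raise ValueError("ranks must not be empty")
    hits = sum(1 for r in ranks if r is not None and r <= k)
    return hits / len(ranks)


# Toy, made-up ranks for five cases (assumption: NOT data from the study).
toy_ranks = [1, 3, None, 1, 7]
top1 = top_k_accuracy(toy_ranks, k=1)  # correct diagnosis ranked first
top5 = top_k_accuracy(toy_ranks, k=5)  # correct diagnosis within the top five
```

Scoring both an LLM and a conventional tool with the same function would put figures such as those quoted above on a like-for-like footing, provided the unstructured LLM output has first been mapped to ranked disease identifiers.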

Background: Computational approaches to support rare disease diagnosis are challenging to build, requiring the integration of complex data types such as ontologies, gene-to-phenotype associations, and cross-species data into variant and gene prioritisation algorithms (VGPAs). However, the performance of VGPAs has been difficult to measure and is affected by many factors, for example, ontology structure, annotation completeness or changes to the underlying algorithm. Assertions of the capabilities of VGPAs are often not reproducible, in part because there is neither a standardised, empirical framework nor openly available patient data with which to assess their efficacy, which ultimately hinders the development of effective prioritisation tools.


To discover rare disease-gene associations, we developed a gene burden analytical framework and applied it to rare, protein-coding variants from whole genome sequencing of 35,008 cases with rare diseases and their family members recruited to the 100,000 Genomes Project (100KGP). Following triaging of the results, 88 novel associations were identified including 38 with existing experimental evidence. We have published the confirmation of one of these associations, hereditary ataxia with , and independent confirmatory evidence has recently been published for four more.
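
The abstract does not say which statistical test the gene burden framework uses. Purely as an illustration of the general idea, the sketch below assumes a one-sided Fisher's exact test on a 2x2 table that compares how many cases and how many controls carry a qualifying rare variant in a given gene. The function name `burden_p_value` and all counts are hypothetical.

```python
from math import comb


def burden_p_value(case_carriers: int, case_total: int,
                   control_carriers: int, control_total: int) -> float:
    """One-sided Fisher's exact p-value for carrier enrichment in cases.

    Returns P(X >= case_carriers), where X is hypergeometric: the number of
    carriers landing in the case group if carrier status were unrelated to
    case/control status.
    """
    if not 0 <= case_carriers <= case_total:
        raise ValueError("case_carriers must be between 0 and case_total")
    if not 0 <= control_carriers <= control_total:
        raise ValueError("control_carriers must be between 0 and control_total")
    carriers = case_carriers + control_carriers
    total = case_total + control_total
    # Sum the hypergeometric probabilities of every table at least as extreme.
    tail = sum(
        comb(carriers, x) * comb(total - carriers, case_total - x)
        for x in range(case_carriers, min(carriers, case_total) + 1)
    )
    return tail / comb(total, case_total)


# Made-up toy counts (assumption: not from the 100,000 Genomes Project).
p_enriched = burden_p_value(3, 3, 0, 3)  # all three carriers are cases
p_null = burden_p_value(0, 3, 3, 3)      # no carriers among cases
```

In a genome-wide setting a test like this would be repeated for every gene, so the resulting p-values would need correction for multiple testing before any association is triaged further.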


The Global Alliance for Genomics and Health (GA4GH) aims to accelerate biomedical advances by enabling the responsible sharing of clinical and genomic data through both harmonized data aggregation and federated approaches. The decreasing cost of genomic sequencing (along with other genome-wide molecular assays) and increasing evidence of its clinical utility will soon drive the generation of sequence data from tens of millions of humans, with increasing levels of diversity. In this perspective, we present the GA4GH strategies for addressing the major challenges of this data revolution.


Background: The U.K. 100,000 Genomes Project is in the process of investigating the role of genome sequencing in patients with undiagnosed rare diseases after usual care and the alignment of this research with health care implementation in the U.


Although next-generation sequencing has revolutionized the ability to associate variants with human diseases, diagnostic rates and development of new therapies are still limited by a lack of knowledge of the functions and pathobiological mechanisms of most genes. To address this challenge, the International Mouse Phenotyping Consortium is creating a genome- and phenome-wide catalog of gene function by characterizing new knockout-mouse strains across diverse biological systems through a broad set of standardized phenotyping tests. All mice will be readily available to the biomedical community.


The correlation of phenotypic outcomes with genetic variation and environmental factors is a core pursuit in biology and biomedicine. Numerous challenges impede our progress: patient phenotypes may not match known diseases, candidate variants may be in genes that have not been characterized, model organisms may not recapitulate human or veterinary diseases, filling evolutionary gaps is difficult, and many resources must be queried to find potentially significant genotype-phenotype associations. Non-human organisms have proven instrumental in revealing biological mechanisms.


Deep phenotyping has been defined as the precise and comprehensive analysis of phenotypic abnormalities in which the individual components of the phenotype are observed and described. The three components of the Human Phenotype Ontology (HPO; www.human-phenotype-ontology.


The principles of genetics apply across the entire tree of life. At the cellular level we share biological mechanisms with species from which we diverged millions, even billions of years ago. We can exploit this common ancestry to learn about health and disease, by analyzing DNA and protein sequences, but also through the observable outcomes of genetic differences, i.


Exomiser is an application that prioritizes genes and variants in next-generation sequencing (NGS) projects for novel disease-gene discovery or differential diagnostics of Mendelian disease. Exomiser comprises a suite of algorithms for prioritizing exome sequences using random-walk analysis of protein interaction networks, clinical relevance and cross-species phenotype comparisons, as well as a wide range of other computational filters for variant frequency, predicted pathogenicity and pedigree analysis. In this protocol, we provide a detailed explanation of how to install Exomiser and use it to prioritize exome sequences in a number of scenarios.


The International Mouse Phenotyping Consortium (IMPC) is providing the world's first functional catalogue of a mammalian genome by characterising a knockout mouse strain for every gene. A robust and highly structured informatics platform has been developed to systematically collate, analyse and disseminate the data produced by the IMPC. As the first phase of the project, in which 5000 new knockout strains are being broadly phenotyped, nears completion, the informatics platform is extending and adapting to support the increasing volume and complexity of the data produced as well as addressing a large volume of users and emerging user groups.


Understanding which are the catalytic residues in an enzyme and what function they perform is crucial to many biology studies, particularly those leading to new therapeutics and enzyme design. The original version of the Catalytic Site Atlas (CSA) (http://www.ebi.
