2,174 results match your criteria: "Big Data Institute[Affiliation]"

Understanding the cognitive trajectory of a neurological disease can provide important insight into underlying mechanisms and disease progression. Cognitive impairment is now well established as beginning many years before the diagnosis of Alzheimer's disease, but pre-diagnostic profiles are unclear for other neurological conditions that may be associated with cognitive impairment. We analysed data from the prospective UK Biobank cohort with study baseline assessment performed between 2006 and 2010 and participants followed until 2021.

Protocol for using treeLFA to infer multimorbidity patterns in the form of disease topics from diagnosis data in biobanks.

STAR Protoc

September 2025

Department of Epidemiology, University Medical Center Groningen, University of Groningen, Groningen 9700 RB, the Netherlands.

Research on multimorbidity patterns promotes our understanding of the common pathological mechanisms that underlie co-occurring diseases. Here, we present a protocol to infer multimorbidity clusters in the form of disease topics from large-scale diagnosis data using treeLFA, a topic model based on the Bayesian binary non-negative matrix factorization. We describe steps for installing software, preparing input data, and training the model.

An Electronic Health Record-Wide Association Study to identify populations at increased risk of E. coli bloodstream infections.

J Infect

September 2025

Nuffield Department of Medicine, University of Oxford, Oxford, UK; The National Institute for Health Research Health Protection Research Unit in Healthcare Associated Infections and Antimicrobial Resistance at the University of Oxford, Oxford, UK; The National Institute for Health Research Oxford Bi

Objectives: Escherichia coli bacteraemias have been under mandatory surveillance in the UK for fifteen years, but cases continue to rise. Systematic searches of all features present within electronic healthcare records (EHRs), described here as an EHR-wide association study (EHR-WAS), could potentially identify under-appreciated factors that could be targeted to reduce infections.

Methods: We used data from Oxfordshire, UK, and an EHR-WAS method developed for use with large-scale COVID-19 data to estimate associations between E.

Combining evidence from human genetic and functional screens to identify pathways altering obesity and fat distribution.

Am J Hum Genet

August 2025

Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford, UK; Centre for Human Genetics, Nuffield Department of Medicine, University of Oxford, Oxford, UK; Nuffield Department of Women's and Reproductive Health, Medical Sciences Division, University o

Overall adiposity and body fat distribution are heritable traits associated with altered risk of cardiometabolic disease and mortality. Performing rare-variant (minor allele frequency <1%) association testing using exome-sequencing data from 402,375 participants of European ancestry in the UK Biobank for nine overall and tissue-specific fat distribution traits, we identified 19 genes where putatively damaging rare variation associated with at least one trait (Bonferroni-adjusted p < 1.58 × 10) and 50 additional genes at false discovery rate (FDR) ≤1% (p ≤ 4.
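
The abstract above reports two kinds of significance cut-off: a Bonferroni-adjusted threshold and a false discovery rate (FDR) level. As a reading aid only, here is a minimal Python sketch of how such cut-offs are commonly computed. The Benjamini-Hochberg procedure is assumed as the FDR method (the abstract does not name one), and the function names, the number of tests, and the p-values are all hypothetical illustrations, not values from the study.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test p-value cut-off that controls the family-wise error rate at alpha."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def benjamini_hochberg(p_values: list[float], fdr: float) -> list[bool]:
    """Flag which p-values are rejected at the given FDR level (Benjamini-Hochberg)."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Largest 1-based rank k such that p_(k) <= (k / m) * fdr.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            max_rank = rank
    # Reject every hypothesis ranked at or below max_rank.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            rejected[idx] = True
    return rejected


if __name__ == "__main__":
    # Hypothetical numbers for illustration only (not taken from the study).
    print(bonferroni_threshold(alpha=0.05, n_tests=1000))
    print(benjamini_hochberg([0.5, 0.03, 0.001, 0.02], fdr=0.05))
```

Bonferroni divides the overall error level by the number of tests, so it is the stricter of the two; the FDR procedure tolerates a controlled fraction of false positives, which is consistent with the abstract reporting additional genes at the FDR level.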

Importance: When randomized trials are unavailable or not feasible, observational studies can be used to answer causal questions about the comparative effects of interventions by attempting to emulate a hypothetical pragmatic randomized trial (target trial). Published guidance to aid reporting of these studies is not available.

Objective: To develop consensus-based guidance for reporting observational studies performed to estimate causal effects by explicitly emulating a target trial.

The global burden of multimorbidity is increasing yet poorly understood, owing to insufficient methods for modelling complex systems of conditions. In particular, hepatosplenic multimorbidity has been inadequately investigated. From 17 January to 16 February 2023, we examined 3186 individuals aged 5-92 years from 52 villages across Uganda within the SchistoTrack Cohort.

The PRIME 2.0 checklist is an updated, domain-specific framework designed to standardize the development, evaluation, and reporting of artificial intelligence (AI) applications in cardiovascular imaging. This update specifically responds to rapid advances from traditional machine learning to deep learning, large language models, and multimodal generative AI.

Background: Each year, over 700,000 pregnancies occur in the UK, with up to 10% affected by complications such as hypertensive disorders of pregnancy and gestational diabetes mellitus. Pregnancy-related complications and reproductive factors are associated with an increased risk of cardiovascular disease (CVD) later in life. Our aim was to determine whether adding pregnancy factors to a prediction model with established CVD risk factors improves 10-year risk prediction of CVD in postpartum women, using QRISK®-3 as a benchmark model.

Background: A faecal immunochemical test (FIT) result ≥ 10 µg/g is recommended in the UK to triage patients with symptoms of colorectal cancer (CRC) in primary care for urgent cancer investigation. The COLOFIT model combining FIT results with demographics and blood tests was developed to reduce the proportion of people referred without CRC. This study aims to externally validate the COLOFIT using data from Oxford University Hospitals (OUH).

Background: Single base substitution (SBS) mutations, particularly C > T and T > C, are increased owing to unrepaired DNA replication errors in mismatch repair-deficient (MMRd) cancers. Excess CpG > TpG mutations have been reported in MMRd cancers defective in mismatch detection (dMutSα), but not in mismatch correction (dMutLα). Somatic CpG > TpG mutations conventionally result from unrepaired spontaneous deamination of 5'-methylcytosine throughout the cell cycle, causing T:G mismatches and signature SBS1.

Established in 2018 to push beyond the constraints of individual health and population cohorts, the IHCC is a community of cohorts advancing global science and health. We summarize the collective resources of 69 member cohorts, representing over 34 million people.

Introduction: Cluster analysis, a machine learning-based and data-driven technique for identifying groups in data, has demonstrated its potential in a wide range of contexts. However, critical appraisal and reproducibility are often limited by insufficient reporting, ultimately hampering the interpretation and trust of key stakeholders. The present paper describes the protocol that will guide the development of a reporting guideline and checklist for studies incorporating cluster analyses: Transparent Reporting of Cluster Analyses.

The ups and downs of brain stress: Extending the triple network hypothesis.

Biol Psychiatry Cogn Neurosci Neuroimaging

August 2025

Research Division of Mind and Brain, Department of Psychiatry and Psychotherapy CCM, Charité-Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin, Humboldt-Universität zu Berlin, and Berlin Institute of Health, Berlin, Germany; German Center for Mental Health (DZPG), Partner

Background: This pre-registered functional magnetic resonance imaging study aimed to test and possibly extend the triple network hypothesis of psychosocial stress processing, positing that responses in the salience (SN) and default mode network (DMN) dominate at the expense of the central executive network (CEN). Furthermore, we tested the hypothesis that stress-related responses in SN- and DMN-structures are associated with hormonal, cardiovascular, and affective stress responses, while CEN- and DMN-structures are associated with task performance. We also examined sex-specific associations between neural and stress-induced cortisol, heart rate, and negative affect responses as well as task performance.

Purpose: Given the limited real-world testing of algorithms for wrist-worn sensors to estimate sedentary time, we examined the performance of 21 algorithms in free-living adults.

Methods: Seventy-one adults (35-65 years) wore a GENEActiv (wrist) and an activPAL (thigh) sensor for up to 10 days. activPAL was our reference measure.

Multiple sclerosis (MS) affects 2.9 million people. Traditional classification of MS into distinct subtypes poorly reflects its pathobiology and has limited value for prognosticating disease evolution and treatment response, thereby hampering drug discovery.

Why Global Health Security Should be Managed as a Value-Based Enterprise.

Health Secur

August 2025

Frances Charlotte Butcher, BMBS, DPhil, MFPH, is an Academic Clinical Lecturer in Public Health, Ethox Centre, Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford, United Kingdom.

Efforts to improve global health security should be a key international priority. In this commentary, I argue that while global health security is increasingly perceived as the domain of various professional and academic disciplines, ranging from global health to international relations, it is crucial to recognize it also as a value-based enterprise. Drawing on ethics literature, this commentary shows how a value-based approach is useful for analyzing ethical challenges in global health security in 4 key areas: analyzing the implicit values shaping global health security's problematic meaning, considering whether solidarity might be useful for grounding compensation for those facing an increased surveillance burden, examining how labelling outbreaks by origin can disguise questions of responsibility, and addressing how reasonable demands of nationalism are balanced.

Neurodevelopmental conditions such as attention-deficit/hyperactivity disorder (ADHD) and autism co-occur with cardiometabolic conditions. However, little is known about the mechanisms underlying this co-occurrence. In this nationwide three-generation study using population-based registers in the Netherlands (n=15 million), we assessed the familial (co-)aggregation of ADHD, autism, and cardiometabolic conditions, and estimated their heritabilities and genetic correlations.

npstat: An Efficient Tool to Explore the Population Genome Variability and Divergence Using Pool Sequencing Data.

Methods Mol Biol

August 2025

Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, Nuffield Department for Medicine, University of Oxford, Oxford, UK.

Pool sequencing has emerged as a valuable approach in ecological studies, particularly when dealing with very small organisms (with a limited amount of DNA available), when distinguishing individual organisms is a challenge (e.g., in colonies or microbiomes), when there is a trade-off between the sequencing cost and the number of individuals to sequence, or when the main goal is to estimate nucleotide variability and variant frequency patterns at the population level (that is, when individual information is not required).

Background & Aims: A quarter of the world population is estimated to have metabolic dysfunction-associated steatotic liver disease. Here, we aim to understand the impact of liver trait-associated genetic variants on fat content and tissue volume across organs and body compartments and on a large set of biomarkers.

Methods: Genome-wide association analyses were performed on liver fat and liver volume estimated with magnetic resonance imaging in up to 27,243 unrelated European participants from the UK Biobank.

Brain signatures of nociplastic pain: Fibromyalgia Index and descending modulation at population level.

Brain

August 2025

Oxford University Centre for Integrative Neuroimaging, FMRIB, Nuffield Department of Clinical Neurosciences, University of Oxford, Oxford, OX3 9DU, UK.

Nociplastic pain is defined by altered nociceptive processing in the absence of clear peripheral damage or somatosensory lesions. The Fibromyalgia Index (FMI), derived from the 2016 diagnostic criteria, is increasingly used as a marker of nociplastic pain severity in clinical studies, yet its neurobiological validity remains untested at scale. Using multimodal neuroimaging data from over 40,000 participants in UK Biobank, we examined whether FMI scores were associated with altered functional and structural connectivity within the descending pain modulatory system (DPMS), a brain network involved in endogenous pain control and implicated in nociplastic pain conditions.

Background: Estimating the time since HIV infection (TSI) at population level is essential for tracking changes in the global HIV epidemic. Most methods for determining TSI give a binary classification of infections as recent or non-recent within a window of several months, and cannot assess the cumulative impact of an intervention.

Results: We developed a Random Forest Regression model, HIV-phyloTSI, which combines measures of within-host diversity and divergence to generate continuous TSI estimates directly from viral deep-sequencing data, with no need for additional variables.
