Research with structured Electronic Health Records (EHRs) is expanding as data become more accessible, analytic methods advance, and the scientific validity of such studies is increasingly accepted. However, data science methodology to enable the rapid searching/extraction, cleaning and analysis of these large, often complex, datasets is less well developed. In addition, commonly used software is inadequate, resulting in bottlenecks in research workflows and in obstacles to increased transparency and reproducibility of the research.
Background: In modern health care systems, the computerization of all aspects of clinical care has led to the development of large data repositories. For example, in the UK, large primary care databases hold millions of electronic medical records, with detailed information on diagnoses, treatments, outcomes and consultations. Careful analyses of these observational datasets of routinely collected data can complement evidence from clinical trials or even answer research questions that cannot be addressed in an experimental setting.
Background: The use of Electronic Health Records databases for medical research has become mainstream. In the UK, increasing use of Primary Care Databases is largely driven by almost complete computerisation and uniform standards within the National Health Service. Electronic Health Records research often begins with the development of a list of clinical codes with which to identify cases with a specific condition.
Objectives: The UK's Quality and Outcomes Framework permits practices to exempt patients from financially-incentivised performance targets. To better understand the determinants and consequences of being exempted from the framework, we investigated the associations between exception reporting, patient characteristics and mortality. We also quantified the proportion of exempted patients that met quality targets for a tracer condition (diabetes).
Objective: To describe the incidence and distribution of ADHD within the United Kingdom, and to examine whether there was any association between ADHD incidence and socioeconomic deprivation.
Method: The study used data from the Clinical Practice Research Datalink (CPRD). Patients diagnosed with ADHD before the age of 19 between January 1, 2004 and December 31, 2013 were stratified according to the region in which their general practice was based.
Interrupted time series analysis is a quasi-experimental design that can evaluate an intervention effect using longitudinal data. The advantages, disadvantages, and underlying assumptions of various modelling approaches are discussed using published examples.
We used a Bayesian hierarchical selection model to study publication bias in 1106 meta-analyses from the Cochrane Database of Systematic Reviews comparing treatment with either placebo or no treatment. For meta-analyses of efficacy, we estimated the ratio of the probability of including statistically significant outcomes favoring treatment to the probability of including other outcomes. For meta-analyses of safety, we estimated the ratio of the probability of including results showing no evidence of adverse effects to the probability of including results demonstrating the presence of adverse effects.
Objectives: To conduct a fully independent, external validation of a research study based on one electronic health record database using a different database sampling from the same population.
Design: Retrospective cohort analysis of β-blocker therapy and all-cause mortality in patients with cancer.
Setting: Two UK national primary care databases (PCDs): the Clinical Practice Research Datalink (CPRD) and Doctors' Independent Network (DIN).
Objectives: To quantify the relationship between a national primary care pay-for-performance programme, the UK's Quality and Outcomes Framework (QOF), and all-cause and cause-specific premature mortality linked closely with conditions included in the framework.
Design: Longitudinal spatial study, at the level of the "lower layer super output area" (LSOA).
Setting: 32482 LSOAs (neighbourhoods of 1500 people on average), covering the whole population of England (approximately 53.
Aims/hypothesis: We aimed to describe the shape of observed relationships between risk factor levels and clinically important outcomes in type 2 diabetes after adjusting for multiple confounders.
Methods: We used retrospective longitudinal data on 246,544 adults with type 2 diabetes from 600 practices in the Clinical Practice Research Datalink, 2006-2012. Proportional hazards regression models quantified the risks of mortality and microvascular or macrovascular events associated with four modifiable biological variables (HbA1c, systolic BP, diastolic BP and total cholesterol), while controlling for important patient and practice covariates.
Lists of clinical codes are the foundation for research undertaken using electronic medical records (EMRs). If clinical code lists are not available, reviewers are unable to determine the validity of research, full study replication is impossible, researchers are unable to make effective comparisons between studies, and the construction of new code lists is subject to much duplication of effort. Despite this, the publication of clinical codes is rarely if ever a requirement for obtaining grants, validating protocols, or publishing research.
Objective: To conduct a fully independent and external validation of a research study based on one electronic health record database, using a different electronic database sampling the same population.
Design: Using the Clinical Practice Research Datalink (CPRD), we replicated a published investigation into the effects of statins in patients with ischaemic heart disease (IHD) by a different research team using QResearch. We replicated the original methods and analysed all-cause mortality using: (1) a cohort analysis and (2) a case-control analysis nested within the full cohort.
Objectives: To investigate the effect of withdrawing incentives on recorded quality of care, in the context of the UK Quality and Outcomes Framework pay for performance scheme.
Design: Retrospective longitudinal study.
Setting: Data for 644 general practices, from 2004/05 to 2011/12, extracted from the Clinical Practice Research Datalink.
Significant changes in plant phenology have been observed in response to increases in mean global temperatures. There are concerns that accelerated phenologies can negatively impact plant populations. However, the fitness consequence of changes in phenology in response to elevated temperature is not well understood, particularly under field conditions.
Background: Heterogeneity has a key role in meta-analysis methods and can greatly affect conclusions. However, true levels of heterogeneity are unknown and often researchers assume homogeneity. We aim to: a) investigate the prevalence of unobserved heterogeneity and the validity of the assumption of homogeneity; b) assess the performance of various meta-analysis methods; c) apply the findings to published meta-analyses.
Many biological characteristics of evolutionary interest are not scalar variables but continuous functions. Given a dataset of function-valued traits generated by evolution, we develop a practical, statistical approach to infer ancestral function-valued traits, and estimate the generative evolutionary process. We do this by combining dimension reduction and phylogenetic Gaussian process regression, a non-parametric procedure that explicitly accounts for known phylogenetic relationships.
The origin of species diversity has challenged biologists for over two centuries. Allopatric speciation, the divergence of species resulting from geographical isolation, is well documented. However, sympatric speciation, divergence without geographical isolation, is highly controversial.
Mol Phylogenet Evol
April 2004
A phylogeny of basils and allies (Lamiaceae, tribe Ocimeae) based on sequences of the trnL intron, trnL-trnF intergene spacer and rps16 intron of the plastid genome is presented. Several methods were used to reconstruct phylogenies and to assess statistical support for clades: maximum parsimony with equally and successively weighted characters, bootstrap resampling, and Bayesian inference. The phylogeny is used to investigate the distribution of morphological, pericarp anatomy, chemical, and pollen characters as well as the geographical distribution of the clades.