Publications by authors named "Mathias Drton"

Differential expression analysis provides insights into fundamental biological processes, and with the advent of single-cell transcriptomics, gene expression can now be studied at the level of individual cells. Many analyses treat cells as samples and assume statistical independence. Because cells are pseudoreplicates, this assumption does not hold, leading to reduced robustness and reproducibility and an inflated type I error rate.

The pattern graph framework solves a wide range of missing data problems with nonignorable mechanisms. However, it faces two challenges of assessability and interpretability, particularly important in safety-critical problems such as clinical diagnosis: (i) How can one assess the validity of the framework's a priori assumption and make necessary adjustments to accommodate known information about the problem? (ii) How can one interpret the process of exponential tilting used for sensitivity analysis in the pattern graph framework and choose the tilt perturbations based on meaningful real-world quantities? In this paper, we introduce Informed Sensitivity Analysis, an extension of the pattern graph framework that enables us to incorporate substantive knowledge about the missingness mechanism into the pattern graph framework. Our extension allows us to examine the validity of assumptions underlying pattern graphs and interpret sensitivity analysis results in terms of realistic problem characteristics.

With advances in technology, gene expression measurements from single cells can be used to gain refined insights into regulatory relationships among genes. Directed graphical models are well suited to exploring such (cause-effect) relationships. However, statistical analyses of single-cell data are complicated by the fact that the data often show zero-inflated expression patterns.

We consider the problem of learning causal structures in sparse high-dimensional settings that may be subject to the presence of (potentially many) unmeasured confounders, as well as selection bias. Based on structure found in common families of large random networks, we propose a new local notion of sparsity for structure learning in the presence of latent and selection variables, and develop a new version of the Fast Causal Inference (FCI) algorithm, which we refer to as local FCI (lFCI). Under the new sparsity condition and an additional assumption that ensures that conditional dependencies can be determined locally, lFCI is consistent and offers reduced computational and sample complexity when compared to standard FCI algorithms.

Background: Differential correlation networks are increasingly used to delineate changes in interactions among biomolecules. They characterize differences between omics networks under two different conditions, and can be used to delineate mechanisms of disease initiation and progression.

Results: We present a new R package, CorDiffViz, that facilitates the estimation and visualization of differential correlation networks using multiple correlation measures and inference methods.
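
To make the idea of a differential correlation concrete, here is a minimal sketch in Python. It is not the CorDiffViz API (that package is written in R and its interface is not described here); the function names, the use of Pearson correlation, and the toy data are all assumptions chosen for illustration. The sketch computes, for every pair of variables, the difference between the correlation under one condition and the correlation under another.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def differential_correlations(cond_a, cond_b):
    """Return {(u, v): r_b - r_a} for every pair of variables.

    cond_a, cond_b: dicts mapping a variable name to its measurements
    under each condition (hypothetical input layout).
    """
    names = sorted(cond_a)
    diffs = {}
    for i, u in enumerate(names):
        for v in names[i + 1:]:
            r_a = pearson(cond_a[u], cond_a[v])
            r_b = pearson(cond_b[u], cond_b[v])
            diffs[(u, v)] = r_b - r_a
    return diffs


# Hypothetical toy data: g1 and g2 move together in one condition
# and in opposite directions in the other.
control = {"g1": [1.0, 2.0, 3.0, 4.0], "g2": [2.0, 4.0, 6.0, 8.0]}
disease = {"g1": [1.0, 2.0, 3.0, 4.0], "g2": [8.0, 6.0, 4.0, 2.0]}
diffs = differential_correlations(control, disease)
```

An edge of the differential network would then be drawn for pairs whose difference is judged significant; the inference step that makes that judgment is deliberately left out of this sketch.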

Estimation of density functions supported on general domains arises when the data are naturally restricted to a proper subset of the real space. This problem is complicated by typically intractable normalizing constants. Score matching provides a powerful tool for estimating densities with such intractable normalizing constants but as originally proposed is limited to densities on [Formula: see text] and [Formula: see text].

This paper concerns the development of an inferential framework for high-dimensional linear mixed effect models. These are suitable models, for instance, when we have repeated measurements for subjects. We consider a scenario where the number of fixed effects is large (and may be larger than ), but the number of random effects is small.

In most organisms, dietary restriction (DR) increases lifespan. However, several studies have found that genotypes within the same species vary widely in how they respond to DR. To explore the mechanisms underlying this variation, we exposed 178 inbred Drosophila melanogaster lines to a DR or ad libitum (AL) diet, and measured a panel of 105 metabolites under both diets.

Bulk gene expression experiments relied on aggregations of thousands of cells to measure the average expression in an organism. Advances in microfluidic and droplet sequencing now permit expression profiling in single cells. This study of cell-to-cell variation reveals that individual cells lack detectable expression of transcripts that appear abundant on a population level, giving rise to zero-inflated expression patterns.

A common challenge in estimating parameters of probability density functions is the intractability of the normalizing constant. While in such cases maximum likelihood estimation may be implemented using numerical integration, the approach becomes computationally intensive. The score matching method of Hyvärinen (2005) avoids direct calculation of the normalizing constant and yields closed-form estimates for exponential families of continuous distributions over .
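
As a small illustration of why score matching can yield closed-form estimates, consider the simplest case, a univariate Gaussian with mean `mu` and precision `k`. This example is an assumption chosen for clarity, not the setting of the paper. The score is `-k * (x - mu)`, so the empirical score matching objective is the sample average of `-k + 0.5 * k**2 * (x - mu)**2`. This expression is quadratic in the parameters and is minimized in closed form, with no normalizing constant involved.

```python
def score_matching_gaussian(xs):
    """Closed-form score matching estimates (mean, precision) for a
    univariate Gaussian, obtained by minimizing the empirical objective."""
    n = len(xs)
    mean = sum(xs) / n
    # Average squared deviation (divides by n, not n - 1).
    spread = sum((x - mean) ** 2 for x in xs) / n
    precision = 1.0 / spread
    return mean, precision


def empirical_objective(xs, mean, precision):
    """Sample average of d/dx score + 0.5 * score**2 for the Gaussian model."""
    n = len(xs)
    return sum(
        -precision + 0.5 * (precision * (x - mean)) ** 2 for x in xs
    ) / n


# Hypothetical toy sample.
sample = [1.0, 2.0, 3.0, 4.0, 5.0]
mu_hat, k_hat = score_matching_gaussian(sample)
```

In this special case the estimates coincide with the maximum likelihood estimates; the appeal of the method lies in models where the likelihood cannot be normalized analytically.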

Cohort studies in air pollution epidemiology aim to establish associations between health outcomes and air pollution exposures. Statistical analysis of such associations is complicated by the multivariate nature of the pollutant exposure data as well as the spatial misalignment that arises from the fact that exposure data are collected at regulatory monitoring network locations distinct from cohort locations. We present a novel clustering approach for addressing this challenge.

Graphical models are widely used to model stochastic dependences among large collections of variables. We introduce a new method of estimating undirected conditional independence graphs based on the score matching loss, introduced by Hyvärinen (2005), and subsequently extended in Hyvärinen (2007). The method we propose applies to settings with continuous observations and allows for computationally efficient treatment of possibly non-Gaussian exponential family models.

Objective: The accuracy of methods that classify the cardiac rhythm despite CPR artifact could potentially be improved by utilizing continuous ECG data. Our objective is to compare three approaches which use identical ECG features and differ only in their degree of temporal integration: (1) static classification, which analyzes 4-s ECG frames in isolation; (2) "best-of-three averaging," which takes the average of three consecutive static classifications successively; and (3) "adaptive rhythm sequencing," which uses hidden Markov models to model ECG segments as rhythm sequences.

Methods: Defibrillator recordings from 95 out-of-hospital cardiac arrests were divided into training and test sets.
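
The "best-of-three averaging" idea described in the Objective can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say what a static classification outputs, so the sketch assumes one score in [0, 1] per 4-s frame, and the function name and the 0.5 decision threshold are hypothetical.

```python
def best_of_three(scores, threshold=0.5):
    """Average each run of three consecutive static scores and threshold it.

    scores: per-frame outputs of a static classifier (assumed to lie in
    [0, 1]); returns one decision per window of three frames.
    """
    decisions = []
    for i in range(2, len(scores)):
        window_mean = (scores[i - 2] + scores[i - 1] + scores[i]) / 3
        decisions.append(window_mean >= threshold)
    return decisions


# Hypothetical static scores for five consecutive frames.
frame_scores = [0.9, 0.2, 0.8, 0.1, 0.3]
decisions = best_of_three(frame_scores)
```

Averaging trades some latency (no decision is available until three frames have been seen) for robustness to a single frame corrupted by CPR artifact.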

RNA viruses provide prominent examples of measurably evolving populations. In human immunodeficiency virus (HIV) infection, the development of drug resistance is of particular interest because precise predictions of the outcome of this evolutionary process are a prerequisite for the rational design of antiretroviral treatment protocols. We present a mutagenetic tree hidden Markov model for the analysis of longitudinal clonal sequence data.

Discussion abounds in the literature as to whether aphasia is a deficit of linguistic competence or linguistic performance and, if it is a performance deficit, what its precise mechanisms are. Considerable evidence suggests that alteration of nonlinguistic factors can affect language performance in aphasia, a finding that raises questions about the modularity of language and the purity of linguistic mechanisms underlying the putative language deficits in persons with aphasia. This study investigated whether temporal stress plus additional cognitive demands placed on non-brain-damaged adults would produce aphasic-like performance on a picture naming task.
