Addressing extrema and censoring in pollutant and exposure data using mixture of normal distributions.

Atmos Environ (1994)

Department of Biostatistics, School of Public Health, University of Michigan, Ann Arbor, MI, USA.

Published: October 2013


Article Abstract

Background: Volatile organic compounds (VOCs), which include many hazardous chemicals, have been used extensively in personal, commercial and industrial products. Due to the variation in source emissions, differences in the settings and environmental conditions where exposures occur, and measurement issues, distributions of VOC concentrations can have multiple modes, heavy tails, and significant portions of data below the method detection limit (MDL). These issues challenge the standard parametric distribution models needed to estimate exposures, even after log-transformation of the data.

Methods: This paper considers mixtures of distributions that can be applied directly to concentration and exposure data. Two types of mixture distributions are considered: the traditional finite mixture of normal distributions, and a semi-parametric Dirichlet process mixture (DPM) of normal distributions. Both methods are implemented for a sample data set obtained from the Relationship between Indoor, Outdoor and Personal Air (RIOPA) study. Performance is assessed using goodness-of-fit criteria that measure how closely the density estimates match the empirical density of the data. The goodness-of-fit of the proposed density estimation methods is also evaluated in a comprehensive simulation study.
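As an illustration of the finite mixture of normals named above, the following is a minimal sketch of an expectation-maximization (EM) fit on log-transformed concentrations. The abstract does not say how the authors fitted their finite mixture, so the EM routine, the quantile-based starting values, the function names, and the synthetic two-mode data are all assumptions made for this example.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and standard deviation."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def fit_normal_mixture(data, k, n_iter=200, min_sd=1e-3):
    """Fit a k-component normal mixture by EM; returns (weights, means, sds)."""
    xs = sorted(data)
    n = len(xs)
    # Deterministic start (an assumption): means at evenly spaced quantiles,
    # every component starting with the overall standard deviation.
    means = [xs[int((j + 0.5) * n / k)] for j in range(k)]
    overall_mean = sum(xs) / n
    overall_sd = math.sqrt(sum((x - overall_mean) ** 2 for x in xs) / n)
    sds = [max(overall_sd, min_sd)] * k
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        # E-step: responsibility of each component for each observation.
        resp = []
        for x in xs:
            dens = [w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, sds)]
            total = sum(dens) or 1e-300  # guard against numerical underflow
            resp.append([d / total for d in dens])
        # M-step: responsibility-weighted updates of the parameters.
        for j in range(k):
            nj = sum(r[j] for r in resp) or 1e-300
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj
            sds[j] = max(math.sqrt(var), min_sd)  # floor keeps components from collapsing
    return weights, means, sds


# Synthetic log-concentrations with two modes (illustrative data, not RIOPA).
random.seed(0)
log_conc = [
    random.gauss(0.0, 0.5) if random.random() < 0.7 else random.gauss(3.0, 0.4)
    for _ in range(500)
]
weights, means, sds = fit_normal_mixture(log_conc, k=2)
print("weights:", weights)
print("means:  ", means)
print("sds:    ", sds)
```

The number of components `k` has to be chosen in advance here, and the floor on the standard deviation is one simple guard against a component collapsing onto a single point, which is one kind of convergence difficulty a finite mixture fit can run into.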

Results: The finite mixture of normals and DPM of normals have superior performance when compared to the single normal distribution fitted to log-transformed exposure data. The advantages of using these mixture distributions are more pronounced when exposure data have heavy tails or a large fraction of data below the MDL. Distributions from the DPM provided slightly better fits than the finite mixture of normals. Additionally, the DPM method avoids certain convergence issues associated with the finite mixture of normals, and adaptively selects the number of components.
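One way to see how a Dirichlet process mixture can leave the number of components open is the stick-breaking construction of its mixture weights. The sketch below draws weights from a truncated stick-breaking prior and counts how many exceed a small threshold. It illustrates the prior only and is not the posterior sampler used in the paper, which the abstract does not describe; the concentration parameter `alpha`, the truncation level, and the 0.01 threshold are hypothetical choices for this example.

```python
import random


def stick_breaking_weights(alpha, truncation, rng):
    """Draw mixture weights from a truncated stick-breaking construction."""
    weights = []
    remaining = 1.0
    for _ in range(truncation - 1):
        fraction = rng.betavariate(1.0, alpha)  # share broken off the remaining stick
        weights.append(remaining * fraction)
        remaining *= 1.0 - fraction
    weights.append(remaining)  # last component absorbs the leftover mass
    return weights


def effective_components(weights, threshold=0.01):
    """Count components whose weight exceeds a (hypothetical) reporting threshold."""
    return sum(1 for w in weights if w > threshold)


rng = random.Random(1)
for alpha in (0.5, 5.0):
    counts = [
        effective_components(stick_breaking_weights(alpha, 50, rng))
        for _ in range(200)
    ]
    print("alpha =", alpha, "average effective components:", sum(counts) / len(counts))
```

Small values of `alpha` concentrate the mass on a few components, while larger values spread it over many. In a full DPM fit the data then update these weights, so the number of occupied components is inferred rather than fixed beforehand.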

Conclusions: Compared to the finite mixture of normals, the DPM of normals has the advantages of characterizing uncertainty around the number of components and of providing a formal assessment of uncertainty for all model parameters through the posterior distribution. The method adapts to a spectrum of departures from standard model assumptions and provides robust estimates of the exposure density even under censoring due to the MDL.
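A common way to account for measurements below the MDL in a likelihood is to let each censored value contribute the probability mass below the limit rather than a density value. The sketch below applies that idea to a normal mixture on the log scale. It is a standard left-censoring treatment offered for illustration, not necessarily the approach taken in the paper; the function names, parameter values, and data are hypothetical.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def normal_cdf(x, mean, sd):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def censored_mixture_loglik(observed, n_censored, log_mdl, weights, means, sds):
    """Log-likelihood of a normal mixture on the log scale with left-censoring.

    observed   : log concentrations measured above the detection limit
    n_censored : number of measurements reported only as below the limit
    log_mdl    : log of the method detection limit
    """
    loglik = 0.0
    for y in observed:
        # Detected values contribute the mixture density.
        dens = sum(w * normal_pdf(y, m, s) for w, m, s in zip(weights, means, sds))
        loglik += math.log(max(dens, 1e-300))
    if n_censored:
        # Censored values contribute the mixture mass below the limit.
        mass = sum(w * normal_cdf(log_mdl, m, s) for w, m, s in zip(weights, means, sds))
        loglik += n_censored * math.log(max(mass, 1e-300))
    return loglik


# Hypothetical data: seven detected values and five non-detects.
observed = [-0.4, 0.1, 0.3, 0.9, 1.2, 2.8, 3.1]
single = censored_mixture_loglik(observed, 5, -1.0, [1.0], [0.8], [1.5])
mixture = censored_mixture_loglik(
    observed, 5, -1.0, [0.4, 0.4, 0.2], [-1.5, 0.5, 3.0], [0.6, 0.6, 0.4]
)
print("single normal log-likelihood:", single)
print("three-component log-likelihood:", mixture)
```

Because the censored term depends only on the mass below the limit, non-detects inform the lower tail without having to be replaced by an arbitrary substitute value.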

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3857711
DOI: http://dx.doi.org/10.1016/j.atmosenv.2013.05.004

Similar Publications

When analyzing real data sets, statisticians often face the problem that the data are heterogeneous and that this heterogeneity cannot necessarily be modeled directly. One natural option in this case is to use methods based on finite mixtures. The key question in these techniques is often what the best number of mixture components is or, depending on the focus of the analysis, the best number of sub-populations when the model is otherwise fixed.

Introduction: Benchtop and animal models have traditionally been used to study the propagation of Onyx Liquid Embolic Systems (Onyx) used in the treatment of brain arteriovenous malformations (AVM). However, such models are costly, do not provide sufficient detail to elucidate how variations in Onyx viscosity alter flow dynamics, and rely on some trial-and-error, resulting in elongated timelines for product development.

Objectives: The goal of this study was to leverage Computational Fluid Dynamics (CFD) simulations to predict the behavior of different Onyx formulations.

Latent profile analysis (LPA) is in the finite mixture model analysis family and identifies subgroups by participants' responses to continuous variables (i.e., indicators); participants' probable membership in each subgroup is based on the similarity between the subgroup's prototypical responses and the person's unique responses.

Estimating statistical power is essential for designing behavioral medicine studies efficiently and conserving finite resources. Sometimes behavioral medicine researchers are interested in calculating power for 1-sided z-tests of individual parameters (e.g.

Response to biologics along a gradient of T2 involvement in patients with severe asthma: a data-driven biomarker clustering approach.

J Allergy Clin Immunol Pract

September 2025

Observational and Pragmatic Research Institute, Singapore, Singapore; Optimum Patient Care Global, Cambridge, UK; Centre of Academic Primary Care, Division of Applied Health Sciences, University of Aberdeen, Aberdeen, UK. Electronic address:

Background: Asthma with low levels of T2-biomarkers is poorly understood.

Objective: To characterize severe asthma phenotypes and compare pre- to post-biologic change in asthma outcomes along a gradient of T2-involvement.

Methods: This was a registry-based, cohort study including data from 24 countries.
