A New Technique for Evaluating Land-use Regression Models and Their Impact on Health Effect Estimates.

Epidemiology

From the Institute for Risk Assessment Sciences, Utrecht University, Utrecht, The Netherlands; Department of Environmental and Occupational Health Sciences, University of Washington, Seattle, WA; Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht, The N

Published: January 2016



Article Abstract

Background: Leave-one-out cross-validation that fails to account for variable selection does not properly reflect prediction accuracy when the number of training sites is small. The impact on health effect estimates has rarely been studied. The objective of this study was to develop an improved validation procedure for land-use regression models with variable selection and investigate health effect estimates in relation to land-use regression model performance.

Methods: We randomly generated 10 training and test sets for nitrogen dioxide and particulate matter. For each training set, we developed models and evaluated them using a cross-holdout validation approach. Cross-holdout validation develops a new model, repeating variable selection, for each evaluation, whereas standard leave-one-out cross-validation only refits the model without variable selection. We also implemented holdout validation, which evaluates model predictions using independent test sets. We evaluated how cross-holdout validation and holdout validation R² related to estimates of the association between air pollution and forced vital capacity in the Dutch birth cohort.
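The difference between the two validation schemes can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes one site is held out per evaluation, uses a simple greedy forward selection as a stand-in for the study's variable selection rule, and runs on simulated data. All function names (`select_variables`, `loocv_r2`, `cross_holdout_r2`) are hypothetical.

```python
import numpy as np

def fit_ols(X, y):
    """Least-squares coefficients for y ~ intercept + X."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta

def predict(beta, X):
    """Predictions from coefficients returned by fit_ols."""
    return np.column_stack([np.ones(X.shape[0]), X]) @ beta

def select_variables(X, y, max_vars=3):
    """Greedy forward selection minimizing training SSE (assumed stand-in rule)."""
    chosen = []
    for _ in range(max_vars):
        best_j, best_sse = None, np.inf
        for j in range(X.shape[1]):
            if j in chosen:
                continue
            cols = chosen + [j]
            resid = y - predict(fit_ols(X[:, cols], y), X[:, cols])
            sse = float(resid @ resid)
            if sse < best_sse:
                best_j, best_sse = j, sse
        chosen.append(best_j)
    return chosen

def r2(y, pred):
    """R-squared of predictions against observations."""
    return 1.0 - np.sum((y - pred) ** 2) / np.sum((y - np.mean(y)) ** 2)

def loocv_r2(X, y, max_vars=3):
    """Standard LOOCV: variables chosen once on all sites, only coefficients refit."""
    cols = select_variables(X, y, max_vars)
    preds = np.empty(len(y))
    for i in range(len(y)):
        keep = np.arange(len(y)) != i
        beta = fit_ols(X[keep][:, cols], y[keep])
        preds[i] = predict(beta, X[i:i + 1][:, cols])[0]
    return r2(y, preds)

def cross_holdout_r2(X, y, max_vars=3):
    """Cross-holdout style: variable selection is repeated inside every fold."""
    preds = np.empty(len(y))
    for i in range(len(y)):
        keep = np.arange(len(y)) != i
        cols = select_variables(X[keep], y[keep], max_vars)
        beta = fit_ols(X[keep][:, cols], y[keep])
        preds[i] = predict(beta, X[i:i + 1][:, cols])[0]
    return r2(y, preds)

# Simulated example: few sites, many candidate predictors, one true signal.
rng = np.random.default_rng(0)
n_sites, n_candidates = 20, 30
X = rng.normal(size=(n_sites, n_candidates))
y = 0.8 * X[:, 0] + rng.normal(size=n_sites)
print("LOOCV R2:", round(loocv_r2(X, y), 3))
print("Cross-holdout R2:", round(cross_holdout_r2(X, y), 3))
```

Because the held-out site never influences which predictors are chosen in `cross_holdout_r2`, its R² is not inflated by selection on the full dataset, which is the optimism the abstract attributes to standard leave-one-out cross-validation with few training sites.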

Results: Cross-holdout validation R² values were generally identical to holdout validation R² values, but were notably smaller than leave-one-out cross-validation R² values. Decreases in forced vital capacity in relation to air pollution exposure were larger for land-use regression models that had larger holdout validation and cross-holdout validation R², rather than leave-one-out cross-validation R².

Conclusion: Cross-holdout validation accurately reflects predictive ability of land-use regression models and is a useful validation approach for small datasets. Land-use regression predictive ability in terms of holdout validation and cross-holdout validation rather than leave-one-out cross-validation was associated with the magnitude of health effect estimates in a case study.


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5221608
DOI: http://dx.doi.org/10.1097/EDE.0000000000000404

Publication Analysis

Top Keywords

cross-holdout validation: 28
land-use regression: 24
leave-one-out cross-validation: 20
holdout validation: 20
regression models: 16
health estimates: 16
validation: 14
variable selection: 12
impact health: 8
test sets: 8

Similar Publications

Land use regression modelling of NO₂ in São Paulo, Brazil.

Environ Pollut

November 2021

Department of Epidemiology and Public Health, Swiss Tropical and Public Health Institute, Socinstrasse 57, P.O. Box, 4002 Basel, Switzerland; University of Basel, Petersplatz 1, P.O. Box, 4001 Basel, Switzerland.

Background: Air pollution is a major global public health problem. The situation is most severe in low- and middle-income countries, where pollution control measures and monitoring systems are largely lacking. Data to quantify the exposure to air pollution in low-income settings are scarce.


Development of land-use regression models for fine particles and black carbon in peri-urban South India.

Sci Total Environ

September 2018

Barcelona Institute for Global Health (ISGlobal), Barcelona, Spain; Universitat Pompeu Fabra (UPF), Barcelona, Spain; CIBER Epidemiología y Salud Pública (CIBERESP), Barcelona, Spain.

Land-use regression (LUR) has been used to model local spatial variability of particulate matter in cities of high-income countries. Performance of LUR models is unknown in less urbanized areas of low-/middle-income countries (LMICs) experiencing complex sources of ambient air pollution and which typically have limited land use data. To address these concerns, we developed LUR models using satellite imagery (e.

