J Comput Graph Stat
April 2025
Conducting a randomization test is a common method for testing causal null hypotheses in randomized experiments. The popularity of randomization tests is largely because their statistical validity only depends on the randomization design, and no distributional or modeling assumption on the outcome variable is needed. However, randomization tests may still suffer from other sources of bias, among which outcome misclassification is a significant one.
Data collection procedures are often time-consuming and expensive. An alternative to collecting full information from all subjects enrolled in a study is a two-phase design: Variables that are inexpensive or easy to measure are obtained for the study population, and more specific, expensive, or hard-to-measure variables are collected only for a well-selected sample of individuals. Often, only these subjects that provided full information are used for inference, while those that were partially observed are discarded from the analysis.
Int J Environ Res Public Health
April 2025
Physical function is likely bidirectionally associated with physical activity (PA), sedentary behavior (SB), and sleep. We examined trajectories of physical function as predictors of these behaviors in community-dwelling adults aged ≥65 y without dementia from the Adult Changes in Thought cohort. Exposures were trajectories of physical performance (short Performance-Based Physical Function [sPPF]) and self-reported activities of daily living (ADL) impairment.
Background: Anti-inflammatory dietary patterns are associated with slower cognitive decline in older adults; however, little is known about the effects of an anti-inflammatory dietary pattern in middle age.
Objectives: This study aims to adapt an anti-inflammatory diet to a multicultural setting and assess its impact on cognitive decline and Alzheimer's disease risk and related dementias in healthy middle-aged adults.
Methods: We performed a phase II pilot randomized clinical trial in adults (40-65 y old; n = 290) in Bronx, New York.
Multiple imputation (MI) models can be improved with auxiliary covariates (AC), but their performance in high-dimensional data remains unclear. We aimed to develop and compare high-dimensional MI (HDMI) methods using structured and natural language processing (NLP)-derived AC in studies with partially observed confounders. We conducted a plasmode simulation with acute kidney injury as outcome and simulated 100 cohorts with a null treatment effect, incorporating creatinine labs, atrial fibrillation (AFib), and other investigator-derived confounders in the outcome generation.
Observational databases provide unprecedented opportunities for secondary use in biomedical research. However, these data can be error-prone and must be validated before use. It is usually unrealistic to validate the whole database because of resource constraints.
Background: Changes in sleep, physical activity and mental health were observed in older adults during early stages of the COVID-19 pandemic. Here we describe effects of the COVID-19 pandemic on older adult mental health, wellbeing, and lifestyle behaviors and explore predictors of better mid-pandemic mental health and wellbeing.
Methods: Participants in the Adult Changes in Thought study completed measures of lifestyle behaviors (e.
Objective: Partially observed confounder data pose challenges to the statistical analysis of electronic health records (EHR) and systematic assessments of potentially underlying missingness mechanisms are lacking. We aimed to provide a principled approach to empirically characterize missing data processes and investigate performance of analytic methods.
Methods: Three empirical sub-cohorts of diabetic SGLT2 or DPP4-inhibitor initiators with complete information on HbA1c, BMI and smoking as confounders of interest (COI) formed the basis of data simulation under a plasmode framework.
J Gerontol A Biol Sci Med Sci
July 2024
Background: We examined whether trajectories of cognitive function over 10 years predict later-life physical activity (PA), sedentary time (ST), and sleep.
Methods: Participants were from the Adult Changes in Thought (ACT) cohort study. We included 611 ACT participants who wore accelerometers and had 3+ measures of cognition in the 10 years prior to accelerometer wear.
Clin Gastroenterol Hepatol
October 2024
Background & Aims: Mailed outreach for colorectal cancer (CRC) screening increases uptake but it is unclear how to offer the choice of testing. We evaluated if the active choice between colonoscopy and fecal immunochemical test (FIT), or FIT alone, increased response compared with colonoscopy alone.
Methods: This pragmatic, randomized, controlled trial at a community health center included patients between ages 50 and 74 who were not up to date with CRC screening.
Scand Stat Theory Appl
March 2024
Practical problems with missing data are common, and many methods have been developed concerning the validity and/or efficiency of statistical procedures. A central and longstanding interest has been the mechanism governing data missingness, and correctly identifying that mechanism is crucial for conducting sound practical investigations. In this paper, we present a new hypothesis testing approach for deciding between the conventional notions of missing at random and missing not at random in generalized linear models in the presence of instrumental variables.
Objectives: Partially observed confounder data pose a major challenge in statistical analyses aimed to inform causal inference using electronic health records (EHRs). While analytic approaches such as imputation are available, assumptions on underlying missingness patterns and mechanisms must be verified. We aimed to develop a toolkit to streamline missing data diagnostics to guide choice of analytic approaches based on meeting necessary assumptions.
In this paper, we present a practical approach for computing the sandwich variance estimator in 2-stage regression model settings. As a motivating example for 2-stage regression, we consider regression calibration, a popular approach for addressing covariate measurement error. The sandwich variance approach has rarely been applied in regression calibration, despite its requiring less computation time than popular resampling approaches for variance estimation, specifically the bootstrap.
Validation studies are often used to obtain more reliable information in settings with error-prone data. Validated data on a subsample of subjects can be used together with error-prone data on all subjects to improve estimation. In practice, more than one round of data validation may be required, and direct application of standard approaches for combining validation data into analyses may lead to inefficient estimators since the information available from intermediate validation steps is only partially considered or even completely ignored.
Matern Child Health J
February 2024
Introduction: Excessive maternal gestational weight gain (GWG) is strongly correlated with childhood obesity, yet how excess maternal weight gain and gestational diabetes mellitus (GDM) interact to affect early childhood obesity is poorly understood. The purpose of this study was to investigate whether overall and trimester-specific maternal GWG and GDM were associated with obesity in offspring by age 6 years.
Methods: A cohort of 10,335 maternal-child dyads was established from electronic health records.
Int J Environ Res Public Health
November 2023
Aircraft noise can disrupt sleep and impair recuperation. The last U.S.
In response to the escalating global obesity crisis and its associated health and financial burdens, this paper presents a novel methodology for analyzing longitudinal weight loss data and assessing the effectiveness of financial incentives. Drawing from the Keep It Off trial, a three-arm randomized controlled study with 189 participants, we examined the potential impact of financial incentives on weight loss maintenance. Because some participants choose not to weigh themselves after small weight changes or weight gains, a common phenomenon in weight-loss studies, traditional methods such as Generalized Estimating Equations (GEE) tend to overestimate the effect size because they assume that data are missing completely at random.
In large epidemiologic studies, it is typical for an inexpensive, non-invasive procedure to be used to record disease status during regular follow-up visits, with less frequent assessment by a gold standard test. Inexpensive outcome measures like self-reported disease status are practical to obtain, but can be error-prone. Association analysis reliant on error-prone outcomes may lead to biased results; however, restricting analyses to only data from the less frequently observed error-free outcome could be inefficient.
Measurement error is a major issue in self-reported diet that can distort diet-disease relationships. Use of blood concentration biomarkers has the potential to mitigate the subjective bias inherent in self-reporting. As part of the Hispanic Community Health Study/Study of Latinos (HCHS/SOL) baseline visit (2008-2011), self-reported information on diet was collected from all participants (n = 16,415).