It is of substantial scientific interest to detect mediators that lie in the causal pathway from an exposure to a survival outcome. However, with high-dimensional mediators, as often encountered in modern genomic data settings, there is a lack of powerful methods that can provide valid post-selection inference for the identified marginal mediation effect. To resolve this challenge, we develop a post-selection inference procedure for the maximally selected natural indirect effect using a semiparametric efficient influence function approach. To this end, we establish the asymptotic normality of a stabilized one-step estimator that takes the selection of the mediator into account. Simulation studies show that our proposed method has good empirical performance. We further apply our proposed approach to a lung cancer dataset and find multiple DNA methylation CpG sites that might mediate the effect of cigarette smoking on lung cancer survival.
Download full-text PDF | Source
---|---
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC12369553 | PMC
http://dx.doi.org/10.1111/sjos.12770 | DOI Listing
Scand Stat Theory Appl
June 2025
Department of Biostatistics, Columbia University, New York, New York, USA.
It is of substantial scientific interest to detect mediators that lie in the causal pathway from an exposure to a survival outcome. However, with high-dimensional mediators, as often encountered in modern genomic data settings, there is a lack of powerful methods that can provide valid post-selection inference for the identified marginal mediation effect. To resolve this challenge, we develop a post-selection inference procedure for the maximally selected natural indirect effect using a semiparametric efficient influence function approach.
Scand Stat Theory Appl
June 2025
Department of Statistics and Probability, Michigan State University, East Lansing, Michigan, USA.
We develop a post-selection inference method for the Cox proportional hazards model with interval-censored data, which provides asymptotically valid p-values and confidence intervals conditional on the model selected by lasso. The method is based on a pivotal quantity that is shown to converge to a uniform distribution under local parameters. Our method involves estimation of the efficient information matrix, for which several approaches are proposed with proof of their consistency.
Front Public Health
July 2025
Children's Health and Environment Program, Child Health Research Centre, The University of Queensland, Brisbane, QLD, Australia.
Introduction: The association between air pollution and adverse health outcomes has been extensively documented, with oxidative stress widely considered a contributing factor. However, the precise underlying mechanism(s) remains unclear. Recent studies suggest that environmentally persistent free radicals (EPFRs) may provide the missing connection between air pollution and its detrimental health effects.
Nat Commun
July 2025
Department of Biostatistics, University of Michigan, Ann Arbor, MI, USA.
In a standard analysis, pleiotropic variants are identified by running separate genome-wide association studies (GWAS) and combining results across traits. But such a statistical approach, based on marginal summary statistics, may lead to spurious results. We propose a new statistical approach, Debiased-regularized Factor Analysis Regression Model (DrFARM), through a joint regression model for simultaneous analysis of high-dimensional genetic variants and multilevel dependencies.
Stat Methods Med Res
June 2025
Department of Biostatistics, University of Oslo, Oslo, Norway.
Simultaneously performing variable selection and inference in high-dimensional regression models is an open challenge in statistics and machine learning. The increasing availability of vast amounts of variables requires the adoption of specific statistical procedures to accurately select the most important predictors in a high-dimensional space, while controlling the false discovery rate (FDR) associated with the variable selection procedure. In this paper, we propose the joint adoption of the Mirror Statistic approach to FDR control, coupled with outcome randomisation to maximise the statistical power of the variable selection procedure, measured through the true positive rate.