Bioconductor: Planning a third decade of comprehensive support for genomic data science.

Patterns (N Y)

Channing Division of Network Medicine, Mass General Brigham, 181 Longwood Avenue, Boston, MA 02115, USA.

Published: July 2025


Article Abstract

This opinion piece discusses the Bioconductor project for open-source bioinformatics and the engineering concepts underlying its effectiveness to date. Since the inception of Bioconductor in 2002 with 15 software packages devoted to analysis of DNA microarrays, it has grown into an ecosystem of ∼3,000 packages contributed by more than 1,000 developers. Aspects of the history and commitments are reviewed here to contribute to thinking about the design and orchestration of future open-source software projects.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC12416078
DOI: http://dx.doi.org/10.1016/j.patter.2025.101319

Similar Publications

EpipwR: efficient power analysis for EWAS with continuous outcomes.

Bioinform Adv

June 2025

Department of Microbiology, Immunology, and Genetics, University of North Texas Health Science Center, Fort Worth, TX 76107, United States.

Motivation: Epigenome-wide association studies (EWAS) have emerged as a popular way to investigate the pathophysiology of complex diseases and to assist in bridging the gap between genotypes and phenotypes. Despite the increasing popularity of EWAS, very few tools exist to aid researchers in power estimation and those are limited to case-control studies. The existence of user-friendly tools, expanding power calculation functionality to additional study designs, would be a significant aid to researchers planning EWAS.

Summary: Thousands of DNA methylation (DNAm) array samples from human blood are publicly available on the Gene Expression Omnibus (GEO), but they remain underutilized for experiment planning, replication and cross-study and cross-platform analyses. To facilitate these tasks, we augmented our recountmethylation R/Bioconductor package with 12 537 uniformly processed EPIC and HM450K blood samples on GEO as well as several new features. We subsequently used our updated package in several illustrative analyses, finding (i) study ID bias adjustment increased variation explained by biological and demographic variables, (ii) most variation in autosomal DNAm was explained by genetic ancestry and CD4+ T-cell fractions and (iii) the dependence of power to detect differential methylation on sample size was similar for each of peripheral blood mononuclear cells (PBMC), whole blood and umbilical cord blood.

Background: The National Cancer Institute Informatics Technology for Cancer Research (ITCR) program provides a series of funding mechanisms to create an ecosystem of open-source software (OSS) that serves the needs of cancer research. As the ITCR ecosystem substantially grows, it faces the challenge of the long-term sustainability of the software being developed by ITCR grantees. To address this challenge, the ITCR sustainability and industry partnership working group (SIP-WG) was convened in 2019.

Motivation: Accurately predicting the risk of cancer patients is a central challenge for clinical cancer research. For high-dimensional gene expression data, Cox proportional hazard model with the least absolute shrinkage and selection operator for variable selection (Lasso-Cox) is one of the most popular feature selection and risk prediction algorithms. However, the Lasso-Cox model treats all genes equally, ignoring the biological characteristics of the genes themselves.
