Bionimbus: a cloud for managing, analyzing and sharing large genomics datasets.

J Am Med Inform Assoc

Institute for Genomics and Systems Biology, University of Chicago, Chicago, Illinois, USA; Computation Institute, University of Chicago, Chicago, Illinois, USA; Section of Genetic Medicine, Department of Medicine, University of Chicago, Chicago, Illinois, USA.

Published: May 2015


Article Abstract

Background: As large genomics and phenotypic datasets become more common, it is increasingly difficult for most researchers to access, manage, and analyze them. One possible approach is to provide the research community with several petabyte-scale cloud-based computing platforms containing these data, along with tools and resources to analyze them.

Methods: Bionimbus is an open source cloud-computing platform based primarily upon OpenStack, which manages on-demand virtual machines that provide the required computational resources, and GlusterFS, which is a high-performance clustered file system. Bionimbus also includes Tukey, a portal and associated middleware that provide a single entry point and single sign-on for the various Bionimbus resources, and Yates, which automates the installation, configuration, and maintenance of the required software infrastructure.

Results: Bionimbus is used by a variety of projects to process genomics and phenotypic data. For example, it is used by an acute myeloid leukemia resequencing project at the University of Chicago. The project requires several computational pipelines, including pipelines for quality control, alignment, variant calling, and annotation. For each sample, the alignment step requires eight CPUs for about 12 h. BAM file sizes ranged from 5 GB to 10 GB for each sample.
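The per-sample figures above (eight CPUs for about 12 h of alignment, and BAM files of 5 GB to 10 GB) can be scaled up to a rough resource budget. The following is a minimal sketch of that arithmetic only; the function name `alignment_budget` and the cohort size of 100 are hypothetical, since the abstract does not state a sample count, and the estimate ignores the quality control, variant calling, and annotation pipelines.

```python
# Figures taken from the abstract: alignment needs 8 CPUs for about 12 h
# per sample, and each sample's BAM file is between 5 GB and 10 GB.
CPUS_PER_SAMPLE = 8
HOURS_PER_SAMPLE = 12
BAM_GB_MIN, BAM_GB_MAX = 5, 10


def alignment_budget(n_samples: int) -> dict:
    """Estimate alignment CPU-hours and the BAM storage range for n_samples."""
    if n_samples < 0:
        raise ValueError("n_samples must be non-negative")
    return {
        "cpu_hours": n_samples * CPUS_PER_SAMPLE * HOURS_PER_SAMPLE,
        "bam_gb_min": n_samples * BAM_GB_MIN,
        "bam_gb_max": n_samples * BAM_GB_MAX,
    }


# Hypothetical cohort size (an assumption, not a figure from the abstract).
budget = alignment_budget(100)
print(budget)
```

A single sample works out to 96 CPU-hours of alignment, so even a modest cohort quickly exceeds what a typical workstation can supply, which is the access problem the abstract describes.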

Conclusions: Most members of the research community have difficulty downloading large genomics datasets and obtaining sufficient storage and computer resources to manage and analyze the data. Cloud computing platforms, such as Bionimbus, with data commons that contain large genomics datasets, are one choice for broadening access to research data in genomics.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4215034
DOI: http://dx.doi.org/10.1136/amiajnl-2013-002155
