Background: Modern data-driven medical research provides new insights into the development and course of diseases and enables novel methods of clinical decision support. Clinical and translational data warehouses, such as Informatics for Integrating Biology and the Bedside (i2b2) and tranSMART, are important infrastructure components: they give users unified access to the large, heterogeneous data sets that such research requires and support use cases such as cohort selection, hypothesis generation, and ad hoc data analysis.
Objective: Often, different warehousing platforms are needed to support different use cases and different types of data. Moreover, to achieve an optimal data representation within the target systems, specific domain knowledge is needed when designing data-loading processes. Consequently, informaticians need to work closely with clinicians and researchers in short iterations. This is a challenging task as installing and maintaining warehousing platforms can be complex and time consuming. Furthermore, data loading typically requires significant effort in terms of data preprocessing, cleansing, and restructuring. The platform described in this study aims to address these challenges.
Methods: We formulated system requirements to achieve agility in terms of platform management and data loading. The derived system architecture includes a cloud infrastructure with unified management interfaces for multiple warehouse platforms and a data-loading pipeline with a declarative configuration paradigm and meta-loading approach. The latter compiles data and configuration files into forms required by existing loading tools, thereby automating a wide range of data restructuring and cleansing tasks. We demonstrated the fulfillment of the requirements and the originality of our approach by an experimental evaluation and a comparison with previous work.
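The meta-loading idea in the Methods can be illustrated with a minimal sketch. It is not the authors' implementation: the configuration keys (`subject_id`, `concept_root`, `exclude`), the function `compile_mapping`, and the output row format are all hypothetical and do not correspond to the actual file formats of the i2b2 or tranSMART loading tools. The sketch only shows how a compact declarative configuration might be expanded into the explicit, per-column mapping that a downstream loader would otherwise require by hand.

```python
from typing import Dict, List

# Hypothetical declarative configuration: a few rules cover every column.
CONFIG = {
    "subject_id": "patient_id",            # column that identifies the patient
    "concept_root": "Study\\Laboratory",   # ontology path for all other columns
    "exclude": ["comment"],                # columns that should not be loaded
}


def compile_mapping(header: List[str], config: Dict) -> List[Dict[str, str]]:
    """Expand a compact config into one explicit mapping row per loaded column."""
    rows = []
    for position, column in enumerate(header, start=1):
        if column in config["exclude"]:
            continue  # cleansing step: drop unwanted columns
        if column == config["subject_id"]:
            rows.append({"column": str(position), "path": "", "label": "SUBJECT"})
        else:
            rows.append(
                {"column": str(position), "path": config["concept_root"], "label": column}
            )
    return rows


header = ["patient_id", "hemoglobin", "creatinine", "comment"]
mapping = compile_mapping(header, CONFIG)
for row in mapping:
    print(row)
```

In this toy setting the configuration stays the same size regardless of how many data columns the input has, while the generated mapping grows with the header. That illustrates the kind of size reduction the abstract reports, though it does not reproduce the reported factors.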
Results: The platform supports both i2b2 and tranSMART with built-in security. Our experiments showed that the loading pipeline accepts input data that cannot be loaded with existing tools without preprocessing. Moreover, it significantly reduced the effort involved, shrinking the required configuration files by factors of up to 22 for tranSMART and 1135 for i2b2. The time required to perform the compilation process was roughly equivalent to the time required for actual data loading. Comparison with other tools showed that our solution was the only tool fulfilling all requirements.
Conclusions: Our platform significantly reduces the effort required for managing clinical and translational warehouses and for loading data in various formats and structures, such as the complex entity-attribute-value structures often found in laboratory data. Moreover, because the required configuration files are very compact, it facilitates the iterative refinement of data representations in the target platforms. The quantitative measurements presented are consistent with our experience of significantly reduced effort when building warehousing platforms in close cooperation with medical researchers. Both the cloud-based hosting infrastructure and the data-loading pipeline are available to the community as open source software with comprehensive documentation.
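As an illustration of the entity-attribute-value (EAV) restructuring mentioned above, the following sketch pivots long-format laboratory rows into one record per patient. The function `pivot_eav`, the sample values, and the rule that a later value overwrites an earlier one for the same attribute are assumptions made for this example, not details taken from the described pipeline.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def pivot_eav(rows: Iterable[Tuple[str, str, str]]) -> List[Dict[str, str]]:
    """Turn (entity, attribute, value) rows into one wide record per entity."""
    wide: Dict[str, Dict[str, str]] = defaultdict(dict)
    for entity, attribute, value in rows:
        # Assumption: a later value for the same attribute overwrites the earlier one.
        wide[entity][attribute] = value
    # Collect every attribute seen so that all records share the same columns.
    attributes = sorted({a for record in wide.values() for a in record})
    return [
        {"entity": entity, **{a: record.get(a, "") for a in attributes}}
        for entity, record in sorted(wide.items())
    ]


# Made-up sample data in EAV form.
eav = [
    ("p1", "hemoglobin", "13.5"),
    ("p1", "creatinine", "0.9"),
    ("p2", "hemoglobin", "14.1"),
]
table = pivot_eav(eav)
for record in table:
    print(record)
```

Missing measurements are filled with empty strings here. A real loader would need an explicit policy for missing and repeated values, which this sketch does not attempt to model.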
Download full-text PDF |
Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7404007 | PMC |
http://dx.doi.org/10.2196/15918 | DOI Listing |