Category Ranking

98%

Total Visits

921

Avg Visit Duration

2 minutes

Citations

20

Article Abstract

Objectives: The success of artificial intelligence (AI) and machine learning (ML) approaches in biomedical research depends on the quality of the underlying data. The National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK) Data Centric Challenge was designed to address the challenge of making raw clinical research data AI ready, with a focus on type 1 diabetes studies available in the NIDDK Central Repository (NIDDK-CR). This paper aims to present a structured methodology for enhancing the AI readiness of clinical datasets.

Materials And Methods: We detail a systematic approach for data aggregation and preprocessing, including binning continuous data, processing text features, managing missing values, and encoding for categorical variables while maintaining the data integrity and compatibility with ML algorithms.

Results: We applied the proposed methodology to transform raw clinical data from type 1 diabetes studies in the NIDDK-CR into a structured, AI-ready dataset. The evaluation process validated the effectiveness of our AI-readiness enhancement steps and explored the potential use cases in type 1 diabetes research.

Discussion: The methodology discussed in this paper will serve as guidance for preparing data for AI-driven clinical research, with the resulting AI-ready data to serve as a training tool for building and improving AI/ML model performance.

Conclusion: We present a generalizable framework for preparing clinical research data for AI applications. The resulting datasets lay a strong foundation for downstream AI/ML applications, setting the stage for a new era of data-driven discoveries.

Download full-text PDF

Source
http://dx.doi.org/10.1093/jamia/ocaf114DOI Listing

Publication Analysis

Top Keywords

clinical data
16
data
12
type diabetes
12
preparing clinical
8
artificial intelligence
8
national institute
8
institute diabetes
8
diabetes digestive
8
digestive kidney
8
kidney diseases
8

Similar Publications

Analyzing the toxicological effects of PET-MPs on male infertility: Insights from network toxicology, mendelian randomization, and transcriptomics.

Reprod Biol

September 2025

Department of Obstetrics and Gynecology, The First Affiliated Hospital of Anhui Medical University, Hefei 230022, China; Engineering Research Center of Biopreservation and Artificial Organs, Ministry of Education, No 218 Jixi Road, Hefei Anhui230022, China; Key Laboratory of Population Health Across

Current research indicates that polyethylene terephthalate microplastics (PET-MPs) may significantly impair male reproductive function. This study aimed to investigate the potential molecular mechanisms underlying this impairment. Potential gene targets of PET-MPs were predicted via the SwissTargetPrediction database.

View Article and Find Full Text PDF

Purpose: The present study aimed to develop a noninvasive predictive framework that integrates clinical data, conventional radiomics, habitat imaging, and deep learning for the preoperative stratification of MGMT gene promoter methylation in glioma.

Materials And Methods: This retrospective study included 410 patients from the University of California, San Francisco, USA, and 102 patients from our hospital. Seven models were constructed using preoperative contrast-enhanced T1-weighted MRI with gadobenate dimeglumine as the contrast agent.

View Article and Find Full Text PDF

Clinicopathological features of dermal clear cell sarcoma: A series of 13 cases.

Pathol Res Pract

September 2025

Department of Pathology, Xijing Hospital and School of Basic Medicine, Fourth Military Medical University, Xi'an, China. Electronic address:

Background: Dermal clear cell sarcoma (DCCS) is a rare malignant mesenchymal neoplasm. Owing to the overlaps in its morphological and immunophenotypic profiles with a broad spectrum of tumors exhibiting melanocytic differentiation, it is frequently misdiagnosed as other tumor entities in clinical practice. By systematically analyzing the clinicopathological characteristics, immunophenotypic features, and molecular biological properties of DCCS, this study intends to further enhance pathologists' understanding of this disease and provide a valuable reference for its accurate diagnosis.

View Article and Find Full Text PDF

Background: Existing longitudinal cohort study data and associated biospecimen libraries provide abundant opportunities to efficiently examine new hypotheses through retrospective specimen testing. Outcome-dependent sampling (ODS) methods offer a powerful alternative to random sampling when testing all available specimens is not feasible or biospecimen preservation is desired. For repeated binary outcomes, a common ODS approach is to extend the case-control framework to the longitudinal setting.

View Article and Find Full Text PDF