Application of machine learning in early childhood development research: a scoping review.

Faith Neema Benson, Daisy Chelangat, Willie Brink, Patrick N Mwangala, Akbar K Waljee, Cheryl A Moyer, Amina Abubakar

BMJ Open

Institute for Human Development, The Aga Khan University, Nairobi, Kenya.

Published: August 2025

Background: Early childhood development (ECD) lays the foundation for lifelong health, academic success and social well-being, yet over 250 million children in low- and middle-income countries are at risk of not reaching their developmental potential. Traditional measures fail to fully capture the risks associated with a child's developmental outcomes. Artificial intelligence techniques, particularly machine learning (ML), offer an innovative approach by analysing complex datasets to detect subtle developmental patterns.

Objective: To map the existing literature on the use of ML in ECD research, including its geographical distribution, to identify research gaps and inform future directions. The review focuses on applied ML techniques, data types, feature sets, outcomes, data splitting and validation strategies, model performance, model explainability, key themes, clinical relevance and reported limitations.

Design: Scoping review using the Arksey and O'Malley framework with enhancements by Levac et al.

Data Sources: A systematic search was conducted on 16 June 2024 across PubMed, Web of Science, IEEE Xplore and PsycINFO, supplemented by grey literature (OpenGrey) and reference hand-searching. No publication date limits were applied.

Eligibility Criteria: Included studies applied ML or its variants (eg, deep learning (DL), natural language processing) to developmental outcomes in children aged 0-8 years. Studies were in English and addressed cognitive, language, motor or social-emotional development. Excluded were studies focusing on robotics; neurodevelopmental disorders such as autism spectrum disorder, attention-deficit/hyperactivity disorder and communication disorders; disease or medical conditions; and review articles.

Data Extraction And Charting: Three reviewers independently extracted data using a structured MS Excel template, covering study ML techniques, data types, feature sets, outcomes, outcome measures, data splitting and validation strategies, model performance, model explainability, key themes, clinical relevance and limitations. A narrative synthesis was conducted, supported by descriptive statistics and visualisations.

Results: Of the 759 articles retrieved, 27 met the inclusion criteria. Most studies (78%) originated from high-income countries, with none from sub-Saharan Africa. Supervised ML classifiers (40.7%) and DL techniques (22.2%) were the most commonly used approaches. Cognitive development was the most frequently targeted outcome (33.3%), often measured using the Bayley Scales of Infant and Toddler Development-III (33.3%). Data types varied, with image, video and sensor-based data being most prevalent. Key predictive features were grouped into six categories: brain features; anthropometric and clinical/biological markers; socio-demographic and environmental factors; medical history and nutritional indicators; linguistic and expressive features; and motor indicators. Most studies (74.1%) focused solely on prediction, with the majority making predictions at age 2 years and above. Only 41% of studies employed explainability methods, and validation strategies varied widely. Few studies (7.4%) conducted external validation, and only one had progressed to a clinical trial. Common limitations included small sample sizes, lack of external validation and imbalanced datasets.
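
The percentages in the Results can be translated back into approximate study counts, which makes the small size of some subgroups easier to see. The sketch below is a back-of-envelope check, not part of the review: it assumes every percentage is taken out of the 27 included studies, and the dictionary labels and function names are illustrative shorthand rather than the authors' terms. The resulting counts are inferred, not reported in the abstract.

```python
# Back-of-envelope check: which whole-number study counts (out of 27)
# are consistent with the percentages reported in the Results?
# Assumption: every percentage uses the 27 included studies as denominator.
TOTAL_STUDIES = 27

# Percentages as reported in the abstract; labels are shorthand only.
reported = {
    "high-income countries": 78.0,
    "supervised ML classifiers": 40.7,
    "deep learning": 22.2,
    "cognitive outcome": 33.3,
    "prediction only": 74.1,
    "explainability methods": 41.0,
    "external validation": 7.4,
}


def implied_count(percent: float, total: int = TOTAL_STUDIES) -> int:
    """Return the whole-number count closest to `percent` % of `total`."""
    return round(percent * total / 100)


def implied_counts(percentages: dict[str, float],
                   total: int = TOTAL_STUDIES) -> dict[str, int]:
    """Map each label to its implied (inferred, not reported) study count."""
    return {label: implied_count(p, total) for label, p in percentages.items()}


if __name__ == "__main__":
    for label, n in implied_counts(reported).items():
        # Print the count alongside the percentage it reproduces.
        print(f"{label}: ~{n}/{TOTAL_STUDIES} ({100 * n / TOTAL_STUDIES:.1f}%)")
```

Under that assumption, the 7.4% of studies with external validation corresponds to about 2 of the 27 studies, and the 41% using explainability methods to about 11.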

Conclusion: There is growing interest in using ML for ECD research, but current research lacks geographical diversity, external validation, explainability and practical implementation. Future work should focus on developing inclusive, interpretable and externally validated models that are integrated into real-world implementation.

Full text (PMC): http://www.ncbi.nlm.nih.gov/pmc/articles/PMC12366570
DOI: http://dx.doi.org/10.1136/bmjopen-2025-100358

Similar Publications

Predicting left-without-being-seen in an emergency department as a dynamic risk.

Am J Emerg Med

September 2025

University of Toronto, Rotman School of Management, Canada.

Yaniv Ravid, Rouba Ibrahim, Junqi Hu, Kal Pasupathy, David M Nestler

Study Objective: Accurately predicting which Emergency Department (ED) patients are at high risk of leaving without being seen (LWBS) could enable targeted interventions aimed at reducing LWBS rates. Machine Learning (ML) models that dynamically update these risk predictions as patients experience more time waiting were developed and validated, in order to improve the prediction accuracy and correctly identify more patients who LWBS.

Methods: The study was deemed quality improvement by the institutional review board, and collected all patient visits to the ED of a large academic medical campus over 24 months.

Exploring Online Health Information-Seeking Behavior Among Young Adults: Scoping Review.

J Med Internet Res

September 2025

Department of Community Medicine, Faculty of Health, UiT The Arctic University of Norway, Tromsø, Norway.

Kristine Stifjell, Torkjel M Sandanger, Charlotte Wien

Background: The ability to access and evaluate online health information is essential for young adults to manage their physical and mental well-being. With the growing integration of the internet, mobile technology, and social media, young adults (aged 18-30 years) are increasingly turning to digital platforms for health-related content. Despite this trend, there remains a lack of systematic insights into their specific behaviors, preferences, and needs when seeking health information online.

Structural Basis for HIV-1 Maturation Inhibition by PF-46396 Determined by MAS NMR.

J Am Chem Soc

September 2025

Department of Chemistry and Biochemistry, University of Delaware, Newark, Delaware 19716, United States.

Roman Zadorozhnyi, Caitlin M Quinn, Kaneil K Zadrozny, Sherimay D Ablan, Brandon J Kennedy

Among the different types of HIV-1 maturation inhibitors, those that stabilize the junction between the capsid protein C-terminal domain (CA) and the spacer peptide 1 (SP1) within the immature Gag lattice are promising candidates for antiretroviral therapies. Here, we report the atomic-resolution structure of CA-SP1 assemblies with the small-molecule maturation inhibitor PF-46396 and the assembly cofactor inositol hexakisphosphate (IP6), determined by magic angle spinning (MAS) NMR spectroscopy. Our results reveal that although the two PF-46396 enantiomers exhibit distinct binding modes, they both possess similar anti-HIV potency.

Identification and Exploration of Novel B Cell Infiltration-Related Biomarkers in Endometriosis.

Am J Reprod Immunol

September 2025

Department of Laboratory Animal Science, Kunming Medical University, Kunming, China.

Chunyang Zhao, Shuwei Zhang, Baosu Zhang, Hang Tian, Guojun Yan

Objective: To explore B cell infiltration-related genes in endometriosis (EM) and investigate their potential as diagnostic biomarkers.

Methods: Gene expression data from the GSE51981 dataset, containing 77 endometriosis and 34 control samples, were analyzed to detect differentially expressed genes (DEGs). The xCell algorithm was applied to estimate the infiltration levels of 64 immune and stromal cell types, focusing on B cells and naive B cells.

Kinship verification via correlation calculation-based multi-task learning.

PLoS One

September 2025

School of Computer Science and Technology, Huaiyin Normal University, Huai'an, Jiangsu, China.

Xiaoqian Qin, Dakun Liu, Bin Gui

Previous studies have demonstrated that metric learning approaches yield remarkable performance in the field of kinship verification. Nevertheless, a prevalent limitation of most existing methods lies in their over-reliance on learning exclusively from specified types of given kin data, which frequently results in information isolation. Although generative-based metric learning methods present potential solutions to this problem, they are hindered by substantial computational costs.
