Advancing deep learning for expressive music composition and performance modeling.

Man Zhang

Sci Rep

School of Mechanical Engineering, Yellow River Conservancy Technical University, Kaifeng, 475004, Henan, China.

Published: July 2025

The pursuit of expressive and human-like music generation remains a significant challenge in the field of artificial intelligence (AI). While deep learning has advanced AI music composition and transcription, current models often struggle with long-term structural coherence and emotional nuance. This study presents a comparative analysis of three leading deep learning architectures: Long Short-Term Memory (LSTM) networks, Transformer models, and Generative Adversarial Networks (GANs), for AI-generated music composition and transcription using the MAESTRO dataset. Our key innovation lies in the integration of a dual evaluation framework that combines objective metrics (perplexity, harmonic consistency, and rhythmic entropy) with subjective human evaluations via a Mean Opinion Score (MOS) study involving 50 listeners. The Transformer model achieved the best overall performance (perplexity: 2.87, harmonic consistency: 79.4%, MOS: 4.3), indicating its superior ability to produce musically rich and expressive outputs. However, human compositions remained highest in perceptual quality (MOS: 4.8). Our findings provide a benchmarking foundation for future AI music systems and emphasize the need for emotion-aware modeling, real-time human-AI collaboration, and reinforcement learning to bridge the gap between machine-generated and human-performed music.
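
For readers unfamiliar with the metrics named in the abstract, the sketch below shows one common way such quantities can be computed. It is illustrative only: the abstract does not give the paper's formulas, so the function names, the use of note durations as the rhythmic symbol set, and the entropy base are all assumptions. Harmonic consistency is omitted because its definition cannot be inferred from the abstract.

```python
import math
from collections import Counter


def perplexity(token_probs):
    """Perplexity as exp of the mean negative log-likelihood.

    `token_probs` holds the probability a model assigned to each
    observed token (hypothetical input format).
    """
    mean_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(mean_nll)


def rhythmic_entropy(durations):
    """Shannon entropy (in bits) of the empirical note-duration distribution.

    Treating durations as the rhythmic symbols is an assumption made
    for this sketch, not the paper's stated definition.
    """
    counts = Counter(durations)
    total = len(durations)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def mean_opinion_score(ratings):
    """Arithmetic mean of listener ratings (e.g. on a 1-5 scale)."""
    return sum(ratings) / len(ratings)
```

Under these definitions, lower perplexity means the model assigned higher probability to the observed notes, and a rhythmic entropy of zero means every note has the same duration.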

Full text:
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC12314053
DOI: http://dx.doi.org/10.1038/s41598-025-13064-6

Similar Publications

Neuroimaging Data Informed Mood and Psychosis Diagnosis Using an Ensemble Deep Multimodal Framework.

Hum Brain Mapp

September 2025

Tri-Institutional Center for Translational Research in Neuroimaging and Data Science (TReNDS), Georgia State University, Georgia Institute of Technology, and Emory University, Atlanta, Georgia, USA.

Hooman Rokham, Haleh Falakshahi, Godfrey D Pearlson, Vince D Calhoun

Investigating neuroimaging data to identify brain-based markers of mental illnesses has gained significant attention. Nevertheless, these endeavors encounter challenges arising from a reliance on symptoms and self-report assessments in making an initial diagnosis. The absence of biological data to delineate nosological categories hinders the provision of additional neurobiological insights into these disorders.

A robust deep learning-driven framework for detecting Parkinson's disease using EEG.

Comput Methods Biomech Biomed Engin

September 2025

Institute of Radio Physics and Electronics, University of Calcutta, Kolkata, India.

Prithwijit Mukherjee, Anisha Halder Roy

Parkinson's disease (PD) is a neurodegenerative condition that impairs motor functions. Accurate and early diagnosis is essential for enhancing well-being and ensuring effective treatment. This study proposes a deep learning-based approach for PD detection using EEG signals.

Catheter-Based Thrombectomy for Clot-In-Transit and Massive Pulmonary Embolism In A Young Patient with Rheumatoid Arthritis: Inflammation as A Hidden Catalyst for Catastrophic Thromboembolism.

Eur J Case Rep Intern Med

August 2025

Internal Medicine, University of California, Riverside School of Medicine, Riverside, USA.

Han Shin Lee, Christ Ordookhanian, Ryan Amidon, Benjamin Tabibian

Introduction: Pulmonary embolism (PE) is a life-threatening condition with well-defined management strategies; however, the presence of a clot-in-transit (CIT), a mobile thrombus within the right heart, introduces a uniquely high-risk scenario associated with a significantly elevated mortality rate. While several therapeutic approaches are available, including anticoagulation, systemic thrombolysis, surgical embolectomy, and catheter-directed therapies, there is no established consensus on a superior treatment modality. Catheter-based mechanical thrombectomy has emerged as a promising, minimally invasive alternative that mitigates the bleeding risks of systemic thrombolysis and the invasiveness of surgery.

Artificial Intelligence in Liver Pathology: Precision Histology for Accurate Diagnoses.

J Clin Exp Hepatol

August 2025

Dept of Histopathology, PGIMER, Chandigarh, 160012, India.

Parikshit Sanyal, Dipanwita Biswas, Suvradeep Mitra

Artificial intelligence (AI) is a technique or tool to simulate or emulate human "intelligence." Precision medicine or precision histology refers to the subpopulation-tailored diagnosis, therapeutics, and management of diseases with its sociocultural, behavioral, genomic, transcriptomic, and pharmaco-omic implications. The modern decade experiences a quantum leap in AI-based models in various aspects of daily routines including practice of precision medicine and histology.

Reducing motion artifacts in craniocervical background subtraction angiography with deformable registration and unsupervised deep learning.

Radiol Adv

September 2024

Department of Radiology, Northwestern University and Northwestern Medicine, Chicago, IL, 60611, United States.

Chaochao Zhou, Ramez N Abdalla, Dayeong An, Syed H A Faruqui, Teymour Sadrieh

Background: In clinical practice, digital subtraction angiography (DSA) often suffers from misregistration artifact resulting from voluntary, respiratory, and cardiac motion during acquisition. Most prior efforts to register the background DSA mask to subsequent postcontrast images rely on key point registration using iterative optimization, which has limited real-time application.

Purpose: Leveraging state-of-the-art, unsupervised deep learning, we aim to develop a fast, deformable registration model to substantially reduce DSA misregistration in craniocervical angiography without compromising spatial resolution or introducing new artifacts.
