Deep learning-based super-resolution method for projection image compression in radiotherapy.

Quant Imaging Med Surg

Department of Radiation Oncology, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China.

Published: September 2025


Category Ranking: 98%
Total Visits: 921
Avg Visit Duration: 2 minutes
Citations: 20

Article Abstract

Background: Cone-beam computed tomography (CBCT) is a three-dimensional (3D) imaging method used for routine target verification of cancer patients during radiotherapy. The images are reconstructed from a sequence of projection images acquired by the on-board imager attached to the radiotherapy machine. CBCT images are usually stored in a health information system, but the projection images are mostly discarded due to their massive volume. To store them economically, this study investigated a deep learning (DL)-based super-resolution (SR) method for compressing the projection images.

Methods: For image compression, low-resolution (LR) images were generated by down-sampling the high-resolution (HR) projection images by a given factor and then encoded into a video file. For image restoration, the LR images were decoded from the video file and up-sampled back to HR projection images via a DL network. Three SR DL networks, a convolutional neural network (CNN), a residual network (ResNet), and a generative adversarial network (GAN), were tested along with three video coding-decoding (codec) algorithms: Advanced Video Coding (AVC), High Efficiency Video Coding (HEVC), and AOMedia Video 1 (AV1). On two databases of natural and projection images, the performance of the SR networks and video codecs was evaluated with the compression ratio (CR), peak signal-to-noise ratio (PSNR), video quality metric (VQM), and structural similarity index measure (SSIM).
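As a rough illustration of this two-stage pipeline (down-sample and video-encode for storage, then decode and super-resolve for restoration), the Python sketch below uses OpenCV for resampling and an ffmpeg subprocess for AV1 coding. The sr_model call is a placeholder for a trained SR network such as the ResNet; the file names, frame rate, and codec settings are illustrative assumptions, not the authors' configuration.

```python
import subprocess
import cv2  # OpenCV, used here for down-sampling and frame I/O

def compress(hr_frames, dsf, out_path="proj_lr.mkv", tmp="lr_%04d.png"):
    """Down-sample HR projection frames by `dsf` (>= 2) and encode with AV1."""
    for i, hr in enumerate(hr_frames):
        h, w = hr.shape[:2]
        lr = cv2.resize(hr, (w // dsf, h // dsf), interpolation=cv2.INTER_AREA)
        cv2.imwrite(tmp % i, lr)
    # Encode the LR frame sequence into a video file (libaom-av1 = AV1).
    subprocess.run(["ffmpeg", "-y", "-framerate", "10", "-i", tmp,
                    "-c:v", "libaom-av1", "-crf", "30", out_path], check=True)

def restore(video_path, sr_model, tmp="dec_%04d.png"):
    """Decode LR frames from the video and up-sample them with an SR network."""
    subprocess.run(["ffmpeg", "-y", "-i", video_path, tmp], check=True)
    hr_frames = []
    i = 1  # ffmpeg numbers decoded frames from 1 by default
    while True:
        lr = cv2.imread(tmp % i, cv2.IMREAD_UNCHANGED)
        if lr is None:
            break
        hr_frames.append(sr_model(lr))  # placeholder: trained CNN/ResNet/GAN
        i += 1
    return hr_frames
```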

Results: AV1 achieved the highest CR among the three codecs: 13.91, 42.08, 144.32, and 289.80 for down-sampling factors (DSFs) of 0 (non-SR), 2, 4, and 6, respectively. ResNet achieved the best restoration accuracy among the three SR networks. Its PSNRs were 69.08, 41.60, 37.08, and 32.44 dB for the four DSFs, respectively; its VQMs were 0.06%, 3.65%, 6.95%, and 13.03%; and its SSIMs were 0.9984, 0.9878, 0.9798, and 0.9518. As the DSF increased, the CR increased proportionally, with only modest degradation of the restored images.
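For reference, the CR, PSNR, and SSIM figures quoted above can be computed as in the generic sketch below using scikit-image. The 16-bit dynamic range (data_range=65535) is an assumption about the projection data, not a detail stated in the abstract, and VQM is omitted because it requires a dedicated video-quality tool.

```python
import os
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def compression_ratio(raw_paths, video_path):
    """CR = total size of the raw HR projections / size of the encoded video."""
    raw_bytes = sum(os.path.getsize(p) for p in raw_paths)
    return raw_bytes / os.path.getsize(video_path)

def fidelity(hr, restored, data_range=65535):
    """PSNR (dB) and SSIM between an original and a restored projection."""
    psnr = peak_signal_noise_ratio(hr, restored, data_range=data_range)
    ssim = structural_similarity(hr, restored, data_range=data_range)
    return psnr, ssim
```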

Conclusions: Applying the SR model further improves the CR beyond what the video encoders alone achieve. This compression method is not only effective for two-dimensional (2D) projection images but is also applicable to the 3D images used in radiotherapy.

Download full-text PDF

Source
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC12397698 (PMC)
http://dx.doi.org/10.21037/qims-2024-2962 (DOI Listing)

Publication Analysis

Top Keywords

projection images (28), images (12), video (9), super-resolution method (8), projection (8), image compression (8), video file (8), three networks (8), network resnet (8), video coding (8)

Similar Publications

Toward universal immunofluorescence normalization for multiplex tissue imaging with UniFORM.

Cell Rep Methods

August 2025

Department of Biomedical Engineering and Computational Biology Program, OHSU, Portland, OR, USA; Knight Cancer Institute, OHSU, Portland, OR, USA. Electronic address:

We present UniFORM, a non-parametric, Python-based pipeline for normalizing multiplex tissue imaging (MTI) data at both the feature and pixel levels. UniFORM employs an automated rigid landmark-registration method tailored to the distributional characteristics of MTI; it operates without prior distributional assumptions and handles both unimodal and bimodal patterns. By aligning the biologically invariant negative populations, UniFORM removes technical variation while preserving tissue-specific expression patterns in positive populations.
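As a loose illustration of landmark alignment (a sketch of the general idea, not the UniFORM code itself), the snippet below shifts each sample's log-intensity distribution so that the mode of its negative population matches a reference sample; the KDE-based mode finding and the shift-only model are simplifying assumptions.

```python
import numpy as np
from scipy.stats import gaussian_kde

def negative_mode(log_intensities):
    """Locate the dominant (negative-population) mode of a log-intensity distribution."""
    kde = gaussian_kde(log_intensities)
    grid = np.linspace(log_intensities.min(), log_intensities.max(), 512)
    return grid[np.argmax(kde(grid))]

def align_to_reference(sample, reference):
    """Shift a sample's log-intensities so its negative mode matches the reference's."""
    return sample + (negative_mode(reference) - negative_mode(sample))
```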


Temporal modeling plays an important role in the effective adaptation of powerful pretrained text-image foundation models to text-video retrieval. However, existing methods often rely on additional heavy trainable modules, such as transformers or BiLSTMs, which are inefficient. In contrast, we avoid introducing such heavy components by leveraging frozen foundation models.
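A minimal sketch of the training-free alternative hinted at here, aggregating frozen per-frame embeddings instead of learning a temporal module, might look like the following; the per-frame and text embeddings are assumed to be precomputed by a frozen encoder, and mean pooling is just one simple aggregation choice.

```python
import numpy as np

def video_embedding(frame_embeddings):
    """Aggregate frozen per-frame features by mean pooling (no trainable temporal module)."""
    v = frame_embeddings.mean(axis=0)
    return v / np.linalg.norm(v)

def retrieval_score(text_embedding, frame_embeddings):
    """Cosine similarity between a text query and the pooled video representation."""
    t = text_embedding / np.linalg.norm(text_embedding)
    return float(np.dot(t, video_embedding(frame_embeddings)))
```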


Facial feminization surgery (FFS) reshapes masculine facial attributes to align with feminine norms, yet normative anthropometric data for Asian populations remain sparse. We therefore quantified sex-related 3-dimensional (3D) facial metrics in healthy Asian adults to delineate dimorphic benchmarks for surgical planning. We prospectively recruited 40 healthy Asian adults (20 males, 20 females; age 18 to 45 years, mean 28.


Background: Emotion recognition from electroencephalography (EEG) can play a pivotal role in the advancement of brain-computer interfaces (BCIs). Recent developments in deep learning, particularly convolutional neural networks (CNNs) and hybrid models, have significantly enhanced interest in this field. However, standard convolutional layers often conflate characteristics across various brain rhythms, complicating the identification of distinctive features vital for emotion recognition.
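One common way to keep brain rhythms separated before convolution is a filter-bank front end that band-passes each channel into the canonical EEG bands; the sketch below is a generic illustration of that idea, not the architecture this paper proposes.

```python
import numpy as np
from scipy.signal import butter, filtfilt

BANDS = {"delta": (1, 4), "theta": (4, 8), "alpha": (8, 13),
         "beta": (13, 30), "gamma": (30, 45)}  # Hz, canonical EEG rhythms

def filter_bank(eeg, fs=250):
    """Split (channels, samples) EEG into one band-passed copy per rhythm."""
    out = {}
    for name, (lo, hi) in BANDS.items():
        b, a = butter(4, [lo / (fs / 2), hi / (fs / 2)], btype="band")
        out[name] = filtfilt(b, a, eeg, axis=-1)  # zero-phase filtering
    return out
```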


Purpose: The combination of multi-layer flat panel detector (FPDT) X-ray imaging and physics-based material decomposition algorithms allows for the removal of anatomical structures. However, the reliability of these algorithms may be compromised by unaccounted materials or scattered radiation.

Approach: We investigated the two-material decomposition performance of a multi-layer FPDT in the context of 2D chest radiography without and with a 13:1 anti-scatter grid employed.
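In the linearized two-material model this alludes to, the log-attenuations measured by the two detector layers form a 2x2 linear map of the basis-material thicknesses. A minimal per-pixel inversion might look like the sketch below; the effective attenuation coefficients are illustrative placeholders, not calibrated detector values.

```python
import numpy as np

# Effective attenuation matrix [layer, material]; placeholder numbers only.
A = np.array([[0.35, 0.20],   # low-energy layer: (soft tissue, bone)
              [0.25, 0.10]])  # high-energy layer

def decompose(I_low, I_high, I0_low, I0_high):
    """Solve A @ [t_soft, t_bone] = -ln(I/I0) for each pixel."""
    logs = np.stack([-np.log(I_low / I0_low),
                     -np.log(I_high / I0_high)], axis=0)  # (2, H, W)
    t = np.tensordot(np.linalg.inv(A), logs, axes=1)      # (2, H, W)
    return t[0], t[1]  # soft-tissue and bone thickness maps
```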
