FSDM: An efficient video super-resolution method based on Frames-Shift Diffusion Model.

Neural Networks

State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, 210023, Jiangsu, China; Department of Computer Science and Technology, Nanjing University, Nanjing, 210023, Jiangsu, China.

Published: August 2025


Category Ranking: 98%
Total Visits: 921
Avg Visit Duration: 2 minutes
Citations: 20

Article Abstract

Video super-resolution (VSR) is a fundamental task aimed at enhancing video quality through detailed spatial and temporal modeling. Recent advances in diffusion models have significantly improved image super-resolution, but their integration into video super-resolution workflows remains constrained by the computational complexity of temporal fusion modules, which demand far more resources than their image counterparts. To address this challenge, we propose a novel approach: a Frames-Shift Diffusion Model (FSDM) built on image diffusion models. Compared to directly training a diffusion-based video super-resolution model, redesigning the diffusion process of an image model without introducing complex temporal modules requires minimal training cost. We incorporate temporal information into the image super-resolution diffusion model using optical flow and perform multi-frame fusion. The model adapts the diffusion process to transition smoothly from image super-resolution to video super-resolution without additional weight parameters. As a result, the Frames-Shift Diffusion Model efficiently processes videos frame by frame while maintaining computational efficiency: it enhances perceptual quality and achieves performance comparable to other state-of-the-art diffusion-based VSR methods in PSNR and SSIM. By simplifying the integration of temporal information, this approach addresses a key challenge in video super-resolution.
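The abstract outlines the mechanism (optical-flow-based temporal conditioning grafted onto a pretrained image-SR diffusion model) without implementation detail. The sketch below is a hypothetical PyTorch rendering of that general idea, not the paper's actual FSDM procedure: a deterministic DDIM-style sampler that, at every denoising step, warps the previous frame's super-resolved output into the current frame's coordinates and blends it into the clean-image estimate. The noise predictor eps_model, the precomputed flows, the fuse weight, and the assumption that low-resolution frames are pre-upsampled to the target size (as in SR3-style conditioning) are all illustrative choices.

    import torch
    import torch.nn.functional as F

    def warp(x, flow):
        # Backward-warp x (B, C, H, W) by a dense flow field (B, 2, H, W)
        # given in pixel units: flow[:, 0] horizontal, flow[:, 1] vertical.
        B, _, H, W = x.shape
        ys, xs = torch.meshgrid(
            torch.arange(H, device=x.device, dtype=x.dtype),
            torch.arange(W, device=x.device, dtype=x.dtype),
            indexing="ij",
        )
        grid_x = 2.0 * (xs + flow[:, 0]) / (W - 1) - 1.0
        grid_y = 2.0 * (ys + flow[:, 1]) / (H - 1) - 1.0
        grid = torch.stack((grid_x, grid_y), dim=-1)  # (B, H, W, 2), in [-1, 1]
        return F.grid_sample(x, grid, align_corners=True, padding_mode="border")

    @torch.no_grad()
    def frames_shift_sample(eps_model, lr_frames, flows, alphas_cumprod, fuse=0.5):
        # eps_model(x_t, t, lr) -> predicted noise; lr_frames are assumed to be
        # pre-upsampled to the output resolution; flows[i] maps frame i-1 onto
        # frame i. This interface is an assumption for illustration.
        T = alphas_cumprod.shape[0]
        prev_sr, outputs = None, []
        for i, lr in enumerate(lr_frames):
            x = torch.randn_like(lr)  # each frame starts from pure noise
            for t in reversed(range(T)):
                a_t = alphas_cumprod[t]
                eps = eps_model(x, t, lr)
                # Standard DDPM estimate of the clean image x0.
                x0 = (x - (1 - a_t).sqrt() * eps) / a_t.sqrt()
                if prev_sr is not None:
                    # "Frames shift": pull the previous frame's result into
                    # this frame's coordinates and fuse it into x0.
                    x0 = (1 - fuse) * x0 + fuse * warp(prev_sr, flows[i])
                if t > 0:  # deterministic DDIM step toward t-1
                    a_prev = alphas_cumprod[t - 1]
                    x = a_prev.sqrt() * x0 + (1 - a_prev).sqrt() * eps
                else:
                    x = x0
            prev_sr = x
            outputs.append(x)
        return outputs

Because the temporal fusion happens inside the existing sampling loop, this kind of scheme adds no new weight parameters, which matches the efficiency argument the abstract makes.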


Source
http://dx.doi.org/10.1016/j.neunet.2025.107435

Publication Analysis

Top Keywords

video super-resolution: 24
diffusion model: 16
frames-shift diffusion: 12
image super-resolution: 12
super-resolution: 9
diffusion: 9
diffusion models: 8
diffusion process: 8
super-resolution diffusion: 8
video: 7

Similar Publications

Deep learning-based super-resolution method for projection image compression in radiotherapy.

Quantitative Imaging in Medicine and Surgery

September 2025

Department of Radiation Oncology, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China.

Background: Cone-beam computed tomography (CBCT) is a three-dimensional (3D) imaging method designed for routine target verification of cancer patients during radiotherapy. The images are reconstructed from a sequence of projection images obtained by the on-board imager attached to a radiotherapy machine. CBCT images are usually stored in a health information system, but the projection images are mostly discarded due to their massive volume.


Efficiently compressing HD/UHD content has long been challenging due to high bitrate costs. Instance-adaptive enhancement methods address this issue by compressing a video at reduced resolution and enhancing it with a neural model overfitted specifically to that video. However, existing methods focus solely on spatial super-resolution (SR) and underutilize the videos' temporal redundancy.
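The snippet describes instance-adaptive enhancement only at a high level. As a purely illustrative sketch of the core idea, the PyTorch code below overfits a deliberately tiny SR network to a single clip, so that its few weights can be shipped alongside the low-resolution bitstream; TinySR, overfit_to_video, and every hyperparameter here are assumptions, not the cited paper's design.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class TinySR(nn.Module):
        # Deliberately small SR net intended to be overfitted to one video.
        def __init__(self, scale=2, ch=32):
            super().__init__()
            self.scale = scale
            self.body = nn.Sequential(
                nn.Conv2d(3, ch, 3, padding=1), nn.ReLU(inplace=True),
                nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(inplace=True),
                nn.Conv2d(ch, 3 * scale * scale, 3, padding=1),
            )
            self.shuffle = nn.PixelShuffle(scale)

        def forward(self, lr):
            # Predict a residual on top of bicubic upsampling; the residual
            # formulation keeps the learnable part small.
            up = F.interpolate(lr, scale_factor=self.scale, mode="bicubic",
                               align_corners=False)
            return up + self.shuffle(self.body(lr))

    def overfit_to_video(lr_frames, hr_frames, steps=2000):
        # Overfit to one clip; the resulting weights act as video-specific
        # side information transmitted with the compressed stream.
        model = TinySR()
        opt = torch.optim.Adam(model.parameters(), lr=1e-3)
        for step in range(steps):
            i = step % len(lr_frames)
            loss = F.l1_loss(model(lr_frames[i]), hr_frames[i])
            opt.zero_grad()
            loss.backward()
            opt.step()
        return model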


Enhancing the resolution of Magnetic Resonance Imaging (MRI) through super-resolution (SR) reconstruction is crucial for boosting diagnostic precision. However, current SR methods rely primarily on single low-resolution (LR) images or multi-contrast features, limiting detail restoration. Inspired by video frame interpolation, this work exploits the spatiotemporal correlations between adjacent slices to reformulate SR of anisotropic 3D MRI as the generation of new high-resolution (HR) slices between adjacent 2D slices.


There is an ongoing effort in the machine learning community to enable machines to understand the world symbolically, facilitating human interaction with learned representations of complex scenes. A prerequisite to achieving this is the ability to identify the dynamics of interacting objects from time traces of relevant features. In this paper, we introduce GrODID (GRaph-based Object-Centric Dynamic Mode Decomposition), a framework based on graph neural networks that enables Dynamic Mode Decomposition (DMD) for systems involving interacting objects.
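GrODID itself is graph-based and object-centric, but the Dynamic Mode Decomposition it builds on is a standard algorithm. For context only, here is a minimal NumPy implementation of exact DMD on a plain snapshot matrix, with no graph structure; it is background, not the paper's method.

    import numpy as np

    def dmd(X, rank=None):
        # Exact DMD of a snapshot matrix X (n_features, n_timesteps): find
        # eigenvalues/modes of the best-fit linear map A with X2 ~= A @ X1.
        X1, X2 = X[:, :-1], X[:, 1:]
        U, s, Vh = np.linalg.svd(X1, full_matrices=False)
        if rank is not None:  # optional truncation for noisy data
            U, s, Vh = U[:, :rank], s[:rank], Vh[:rank]
        s_inv = np.diag(1.0 / s)
        # Project A onto the leading POD modes: A_tilde = U^H A U.
        A_tilde = U.conj().T @ X2 @ Vh.conj().T @ s_inv
        eigvals, W = np.linalg.eig(A_tilde)
        modes = X2 @ Vh.conj().T @ s_inv @ W  # exact DMD modes
        return eigvals, modes

    # Usage: eigenvalues near the unit circle indicate persistent or
    # oscillatory dynamics in the feature traces.
    # eigvals, modes = dmd(feature_traces, rank=10)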


Single-photon avalanche diodes (SPADs) are advanced sensors capable of detecting individual photons and recording their arrival times with picosecond resolution using time-correlated single-photon counting (TCSPC) detection techniques. They are used in various applications, such as LiDAR and low-light imaging. These single-photon cameras can capture high-speed sequences of binary single-photon images, offering great potential for reconstructing 3D environments with high motion dynamics.
