Video super-resolution is a fundamental task aimed at enhancing video quality through intricate modeling techniques. Recent advances in diffusion models have significantly improved image super-resolution, but their integration into video super-resolution workflows remains constrained by the computational complexity of temporal fusion modules, which demand far more resources than their image counterparts. To address this challenge, we propose a novel approach: a Frames-Shift Diffusion Model built on image diffusion models. Compared to directly training a diffusion-based video super-resolution model, redesigning the diffusion process of an image model without introducing complex temporal modules requires minimal training cost. We incorporate temporal information into the image super-resolution diffusion model using optical flow and perform multi-frame fusion. The model adapts the diffusion process to transition smoothly from image super-resolution to video super-resolution without additional weight parameters. As a result, the Frames-Shift Diffusion Model efficiently processes videos frame by frame while maintaining computational efficiency, enhancing perceptual quality and achieving PSNR and SSIM comparable to other state-of-the-art diffusion-based VSR methods. This approach optimizes video super-resolution by simplifying the integration of temporal information, addressing a key challenge in the field.
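The abstract does not give implementation details, but the core idea of aligning neighboring frames with optical flow and then fusing them can be illustrated with a minimal numpy sketch. All names and the blending weight here are hypothetical; real systems would use sub-pixel (bilinear) warping and a learned fusion inside the diffusion process, not a fixed linear blend.

```python
import numpy as np

def warp_with_flow(frame, flow):
    """Backward-warp a frame by a dense optical-flow field (nearest-neighbor sampling)."""
    h, w = frame.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    # Sample source coordinates displaced by the flow (u = flow[...,0], v = flow[...,1]).
    src_x = np.clip(np.round(xs + flow[..., 0]).astype(int), 0, w - 1)
    src_y = np.clip(np.round(ys + flow[..., 1]).astype(int), 0, h - 1)
    return frame[src_y, src_x]

def fuse_frames(curr, prev, flow, alpha=0.7):
    """Blend the current frame with the flow-aligned previous frame."""
    aligned_prev = warp_with_flow(prev, flow)
    return alpha * curr + (1 - alpha) * aligned_prev

# Toy example: 4x4 single-channel frames, uniform 1-pixel horizontal flow.
prev = np.arange(16, dtype=float).reshape(4, 4)
curr = prev + 1.0
flow = np.zeros((4, 4, 2))
flow[..., 0] = 1.0  # sample one pixel to the right
fused = fuse_frames(curr, prev, flow)
print(fused.shape)
```

In a real pipeline the flow field would come from an off-the-shelf estimator (e.g. a Farneback or RAFT-style method) rather than being set by hand.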
DOI: http://dx.doi.org/10.1016/j.neunet.2025.107435
Quant Imaging Med Surg
September 2025
Department of Radiation Oncology, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China.
Background: Cone-beam computed tomography (CBCT) is a three-dimensional (3D) imaging method designed for routine target verification of cancer patients during radiotherapy. The images are reconstructed from a sequence of projection images obtained by the on-board imager attached to a radiotherapy machine. CBCT images are usually stored in a health information system, but the projection images are mostly abandoned due to their massive volume.
IEEE Trans Image Process
September 2025
Efficiently compressing HD/UHD content has long been challenging due to high bitrate costs. Instance-adaptive enhancement methods try to tackle this issue by compressing a video at reduced resolution and enhancing it using a neural model specifically overfitted for this video. However, existing methods focus solely on spatial super-resolution (SR) and under-utilize the videos' temporal redundancy.
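The instance-adaptive idea, overfitting a small enhancement model to one specific video, can be sketched with a toy least-squares version: fit a single linear filter that maps naively upscaled frames back toward the originals. The function name and the use of a linear filter are illustrative assumptions; the paper's method overfits a neural model and also exploits temporal redundancy, which this sketch omits.

```python
import numpy as np

def fit_instance_filter(upscaled, originals, k=3):
    """Overfit one k-by-k linear filter to a single video: a least-squares
    map from naively upscaled frames to the originals (toy stand-in for a
    per-video overfitted neural enhancer)."""
    pad = k // 2
    rows, targets = [], []
    for a, b in zip(upscaled, originals):
        ap = np.pad(a, pad, mode='edge')
        # Every k-by-k neighborhood becomes one regression row.
        for i in range(a.shape[0]):
            for j in range(a.shape[1]):
                rows.append(ap[i:i + k, j:j + k].ravel())
                targets.append(b[i, j])
    w, *_ = np.linalg.lstsq(np.asarray(rows), np.asarray(targets), rcond=None)
    return w.reshape(k, k)

# Sanity check: if the target equals the input, the fitted filter is a delta.
rng = np.random.default_rng(1)
frames = [rng.random((8, 8)) for _ in range(2)]
w = fit_instance_filter(frames, frames)
print(np.round(w, 3))
```

The per-video filter (or, in practice, the small overfitted network) is transmitted alongside the low-resolution bitstream and applied at the decoder.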
IEEE J Biomed Health Inform
August 2025
Enhancing the resolution of Magnetic Resonance Imaging (MRI) through super-resolution (SR) reconstruction is crucial for boosting diagnostic precision. However, current SR methods rely primarily on single LR images or multi-contrast features, limiting detail restoration. Inspired by video frame interpolation, this work exploits the spatial correlations between adjacent slices to reformulate the SR task for anisotropic 3D-MRI images as the generation of new high-resolution (HR) slices between adjacent 2D slices.
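The reformulation above, generating new slices between adjacent 2D slices along the low-resolution axis, can be illustrated with a trivial linear-interpolation baseline. This is only a naive stand-in under assumed array conventions (slice axis first); the paper's method generates the in-between slices with a learned model rather than a fixed blend.

```python
import numpy as np

def interpolate_slices(volume, factor=2):
    """Insert (factor - 1) linearly interpolated slices between each pair of
    adjacent 2D slices along axis 0 of an anisotropic volume."""
    n = volume.shape[0]
    out = []
    for i in range(n - 1):
        for k in range(factor):
            t = k / factor
            out.append((1 - t) * volume[i] + t * volume[i + 1])
    out.append(volume[-1])  # keep the final original slice
    return np.stack(out)

# Toy anisotropic volume: 3 slices of size 2x2 -> 5 slices after 2x upsampling.
vol = np.arange(3 * 2 * 2, dtype=float).reshape(3, 2, 2)
up = interpolate_slices(vol, factor=2)
print(up.shape)
```

A learned generator replaces the linear blend but keeps the same interface: adjacent slices in, intermediate HR slices out.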
IFAC Pap OnLine
September 2024
ECE Dept., Northeastern University, Boston, MA 02115 USA.
There is an ongoing effort in the machine learning community to enable machines to understand the world symbolically, facilitating human interaction with learned representations of complex scenes. A prerequisite to achieving this is the ability to identify the dynamics of interacting objects from time traces of relevant features. In this paper, we introduce GrODID (GRaph-based Object-Centric Dynamic Mode Decomposition), a framework based on graph neural networks that enables Dynamic Mode Decomposition for systems involving interacting objects.
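GrODID builds on Dynamic Mode Decomposition, which fits a best-fit linear operator to snapshot data and reads off its spectrum. A minimal sketch of plain (exact) DMD on a known linear system, without the graph-neural-network machinery the paper adds, looks like this:

```python
import numpy as np

# Snapshots of a known linear system x_{k+1} = A x_k with A = diag(0.9, 0.5).
A = np.diag([0.9, 0.5])
x = np.array([1.0, 1.0])
snapshots = [x]
for _ in range(5):
    x = A @ x
    snapshots.append(x)
X = np.stack(snapshots, axis=1)  # columns are time snapshots

def dmd_eigs(X, r):
    """Exact DMD: estimate the spectrum of the operator with X2 ~ A X1."""
    X1, X2 = X[:, :-1], X[:, 1:]
    U, s, Vh = np.linalg.svd(X1, full_matrices=False)
    U, s, Vh = U[:, :r], s[:r], Vh[:r]
    # Project the operator onto the leading POD modes, then diagonalize.
    A_tilde = U.conj().T @ X2 @ Vh.conj().T / s
    eigvals, W = np.linalg.eig(A_tilde)
    # Spatial DMD modes (columns) associated with each eigenvalue.
    modes = X2 @ Vh.conj().T / s @ W
    return eigvals, modes

eigvals, modes = dmd_eigs(X, r=2)
print(sorted(np.real(eigvals)))
```

On this synthetic data DMD recovers the true eigenvalues 0.9 and 0.5; GrODID's contribution is making such spectral identification work when the state is a set of interacting objects rather than a single flat vector.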
Single-photon avalanche diodes (SPADs) are advanced sensors capable of detecting individual photons and recording their arrival times with picosecond resolution using time-correlated single-photon counting (TCSPC) detection techniques. They are used in various applications, such as LiDAR and low-light imaging. These single-photon cameras can capture high-speed sequences of binary single-photon images, offering great potential for reconstructing 3D environments with high motion dynamics.
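A standard way to turn the binary frames mentioned above into intensity estimates is to average many frames and invert the Bernoulli detection model: a SPAD pixel fires in a frame with probability 1 - exp(-lambda), so the flux estimate is lambda = -ln(1 - p). The simulation below is a sketch under that standard model; the specific flux values and frame count are illustrative, not from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# True per-exposure photon flux for two pixels (mean photons per binary frame).
lam_true = np.array([0.2, 1.0])
T = 200_000  # number of simulated binary frames

# A SPAD pixel registers at most one detection per frame: P(detect) = 1 - exp(-lambda).
p_detect = 1.0 - np.exp(-lam_true)
frames = rng.random((T, 2)) < p_detect  # T binary single-photon frames

# Maximum-likelihood flux estimate: invert the Bernoulli model at the observed rate.
p_hat = frames.mean(axis=0)
lam_hat = -np.log1p(-p_hat)
print(lam_hat)
```

With enough frames the estimate converges to the true flux; reconstruction under high motion dynamics is harder precisely because frames must also be aligned before they can be aggregated this way.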