Article Abstract

Despite the rapid progress in generative radiance fields, most existing methods focus on object-centric applications and cannot generate complex urban scenes. In this paper, we propose UrbanGen, a solution to the challenging task of generating urban radiance fields with photorealistic rendering, accurate geometry, high controllability, and diverse city styles. Our key idea is to leverage a coarse 3D panoptic prior, represented by a semantic voxel grid for stuff and bounding boxes for countable objects, to condition a compositional generative radiance field. This panoptic prior simplifies the task of learning complex urban geometry, enables disentanglement of stuff and objects, and provides versatile control over both. Moreover, by combining semantic and geometry losses with adversarial training, our method faithfully adheres to the input conditions, allowing for joint rendering of semantic and depth maps alongside RGB images. In addition, we collect a unified dataset with images and their panoptic priors in the same format from three diverse real-world datasets: KITTI-360, nuScenes, and Waymo, and train a city style-aware model on this data. Our systematic study shows that UrbanGen outperforms state-of-the-art generative radiance field baselines in image fidelity and geometry accuracy for urban scene generation. Furthermore, UrbanGen brings a new set of controllability features, including large camera movements, stuff editing, and city style control.
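
To make the joint rendering step concrete, here is a minimal sketch (not the authors' released code) of how RGB, semantic, and depth maps can be produced in one pass by alpha-compositing per-sample predictions along each ray. All function and tensor names (composite_rays, sem_logits, etc.) are illustrative assumptions; UrbanGen would additionally condition the per-sample predictions on the panoptic prior (semantic voxel grid and object boxes), which this sketch omits.

```python
# Hedged sketch: jointly volume-render RGB, semantics, and depth from
# per-sample fields. Shapes: R rays, S samples per ray, C semantic classes.
import torch

def composite_rays(sigma, rgb, sem_logits, z_vals):
    """sigma: (R, S) density; rgb: (R, S, 3) color; sem_logits: (R, S, C);
    z_vals: (R, S) sample depths along each ray."""
    deltas = torch.diff(z_vals, dim=-1)
    deltas = torch.cat([deltas, torch.full_like(deltas[:, :1], 1e10)], dim=-1)
    alpha = 1.0 - torch.exp(-sigma * deltas)              # per-sample opacity
    trans = torch.cumprod(
        torch.cat([torch.ones_like(alpha[:, :1]), 1.0 - alpha + 1e-10], dim=-1),
        dim=-1)[:, :-1]                                   # transmittance to each sample
    w = alpha * trans                                     # rendering weights
    rgb_map = (w[..., None] * rgb).sum(dim=1)             # (R, 3) RGB
    sem_map = (w[..., None] * sem_logits.softmax(-1)).sum(dim=1)  # (R, C) semantics
    depth_map = (w * z_vals).sum(dim=1)                   # (R,) expected depth
    return rgb_map, sem_map, depth_map

# Toy usage: 4 rays, 64 samples, 10 semantic classes.
R, S, C = 4, 64, 10
z = torch.linspace(0.5, 8.0, S).expand(R, S)
rgb_map, sem_map, depth_map = composite_rays(
    torch.rand(R, S), torch.rand(R, S, 3), torch.randn(R, S, C), z)
```

Because the same weights are reused for all three outputs, the semantic and geometry losses the abstract mentions can supervise the rendered semantic and depth maps directly.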

Source
http://dx.doi.org/10.1109/TPAMI.2025.3600440

Publication Analysis

Top Keywords

generative radiance: 12
radiance fields: 8
complex urban: 8
panoptic prior: 8
radiance field: 8
urbangen urban: 4
urban generation: 4
generation compositional: 4
compositional controllable: 4
controllable neural: 4

Similar Publications

Simulating at-sensor hyperspectral satellite data for inland water algal blooms.

Sci Total Environ

September 2025

Department of Geological Sciences and Geological Engineering, Queen's University, 99 University Ave, K7L 3N6 Kingston, Ontario, Canada.

Hyperspectral data have been overshadowed by multispectral data for studying algal blooms for decades. However, newer hyperspectral missions, including the recent Plankton, Aerosol, Cloud, ocean Ecosystem (PACE) Ocean Color Instrument (OCI), are opening the door to accessible hyperspectral data at spatial and temporal resolutions comparable to ocean color and multispectral missions. Simulation studies can help assess the potential of these hyperspectral sensors prior to launch and without extensive field data collection.

Objectives: Computed tomography (CT) provides high-spatial-resolution visualization of 3D structures for various applications. Traditional analytical/iterative CT reconstruction algorithms require hundreds of angular samplings, a condition that often cannot be met in practice because of physical and mechanical limitations. Sparse-view CT reconstruction has been attempted with constrained optimization and machine learning methods with varying success, though less so for ultra-sparse-view reconstruction.
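
The sparse-view failure mode described here is easy to reproduce with classical filtered back-projection. The sketch below uses scikit-image (a standard library, not this paper's learning-based method; the filter_name keyword assumes scikit-image >= 0.19) to reconstruct a phantom from dense and ultra-sparse angular samplings:

```python
# Classical filtered back-projection at dense vs. ultra-sparse angular
# sampling; fewer views produce streak artifacts and higher error.
import numpy as np
from skimage.data import shepp_logan_phantom
from skimage.transform import radon, iradon, resize

image = resize(shepp_logan_phantom(), (128, 128))

for n_views in (180, 18):                        # dense vs. ultra-sparse
    theta = np.linspace(0.0, 180.0, n_views, endpoint=False)
    sinogram = radon(image, theta=theta)         # simulated projections
    recon = iradon(sinogram, theta=theta, filter_name="ramp")
    rmse = np.sqrt(np.mean((recon - image) ** 2))
    print(f"{n_views:3d} views: RMSE = {rmse:.4f}")
```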

Background: Residual disease after endoscopic sinus surgery (ESS) contributes to poor outcomes and revision surgery. Image-guided surgery systems cannot dynamically reflect intraoperative changes. We propose a sensorless, video-based method for intraoperative CT updating using neural radiance fields (NeRF), a deep learning technique for reconstructing the 3D surgical field.
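
For readers unfamiliar with NeRF, the sketch below shows the generic representation being referred to, not this paper's surgical pipeline: an MLP maps positionally encoded 3D points to color and density, which volume rendering (like the compositing sketch under the abstract above) then integrates along camera rays. TinyNeRF and positional_encoding are hypothetical names for illustration.

```python
# Generic NeRF scene function: encoded 3D point -> (color, density).
import torch
import torch.nn as nn

def positional_encoding(x, n_freqs=6):
    """Sin/cos features let the MLP represent high-frequency detail."""
    feats = [x]
    for k in range(n_freqs):
        feats += [torch.sin((2.0 ** k) * x), torch.cos((2.0 ** k) * x)]
    return torch.cat(feats, dim=-1)

class TinyNeRF(nn.Module):
    def __init__(self, n_freqs=6, hidden=128):
        super().__init__()
        in_dim = 3 * (1 + 2 * n_freqs)
        self.mlp = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 4))                # outputs (r, g, b, sigma)

    def forward(self, pts):                      # pts: (N, 3) world coords
        out = self.mlp(positional_encoding(pts))
        rgb = torch.sigmoid(out[..., :3])        # colors in [0, 1]
        sigma = torch.relu(out[..., 3])          # non-negative density
        return rgb, sigma

rgb, sigma = TinyNeRF()(torch.rand(8, 3))        # query 8 sample points
```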

Neural rendering algorithms have revolutionized computer graphics, yet their impact on real-time rendering under arbitrary lighting conditions remains limited due to strict latency constraints in practical applications. The key challenge lies in formulating a compact yet expressive material representation. To address this, we propose TransGI, a novel neural rendering method for real-time, high-fidelity global illumination.

With the rapid advancement of unmanned aerial vehicle (UAV) applications, vision-based 3D scene reconstruction has demonstrated significant value in fields such as remote sensing and target detection. However, scenes captured by UAVs are often large-scale, sparsely viewed, and complex. These characteristics pose significant challenges for neural radiance field (NeRF)-based reconstruction.
