Publications by authors named "Xuemiao Xu"

We introduce GenPoly, a novel generalized 3D prior model designed for multiple 3D generation tasks, with a focus on preserving fine details. Previous works learn generalizable representations by decomposing objects into coarse-grained components and reassembling them into a coherent global structure, but this approach sacrifices small-scale details. In this paper, we take a different perspective, formulating 3D prior modeling as a bottom-up polymorphic evolving process.

Interactive 3D segmentation in radiance fields is crucial for advanced 3D scene understanding and manipulation. However, existing methods often struggle to achieve both volumetric completeness and segmentation accuracy, primarily because they fail to consider the critical links between 2D prompt-based segmentations across multiple views. Motivated by this gap, we introduce Gaussian Prompter, a novel approach specifically designed for 3D Gaussian Splatting.

Traditional human neural radiance fields often overlook crucial body semantics, resulting in ambiguous reconstructions, particularly in occluded regions. To address this problem, we propose the Super-Semantic Disentangled Neural Renderer (SSD-NeRF), which employs rich regional semantic priors to enhance human rendering accuracy. This approach initiates with a Visible-Invisible Semantic Propagation module, ensuring coherent semantic assignment to occluded parts based on visible body segments.

Model-based class incremental learning (CIL) methods aim to address the challenge of catastrophic forgetting by retaining certain parameters and expanding the model architecture. However, retaining too many parameters can lead to an overly complex model, increasing inference overhead. Additionally, compressing these parameters to reduce the model size can result in performance degradation.

Previous asymmetric image retrieval methods based on knowledge distillation have primarily focused on aligning the global features of two networks to transfer global semantic information from the gallery network to the query network. However, these methods often fail to effectively transfer local semantic information, limiting the fine-grained alignment of feature representation spaces between the two networks. To overcome this limitation, we propose a novel approach called Layered-Granularity Localized Distillation (GranDist).

Diffusion models have garnered significant attention for MRI Super-Resolution (SR) and have achieved promising results. However, existing diffusion-based SR models face two formidable challenges: 1) insufficient exploitation of complementary information from multi-contrast images, which hinders the faithful reconstruction of texture details and anatomical structures; and 2) reliance on fixed magnification factors, such as 2× or 4×, which is impractical for clinical scenarios that require arbitrary scale magnification. To circumvent these issues, this paper introduces IM-Diff, an implicit multi-contrast diffusion model for arbitrary-scale MRI SR, leveraging the merits of both multi-contrast information and the continuous nature of implicit neural representation (INR).
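The core idea behind arbitrary-scale SR with an implicit neural representation is that the image is modeled as a continuous function of pixel coordinates, which can then be sampled on a grid of any size. The sketch below illustrates only that coordinate-query mechanism with a toy randomly initialized MLP; it is not IM-Diff's architecture, and in practice the network would be conditioned on low-resolution multi-contrast features.

```python
import numpy as np

def make_coord_grid(h, w):
    # Continuous pixel-centre coordinates in [-1, 1], independent of resolution.
    ys = (np.arange(h) + 0.5) / h * 2 - 1
    xs = (np.arange(w) + 0.5) / w * 2 - 1
    gy, gx = np.meshgrid(ys, xs, indexing="ij")
    return np.stack([gy, gx], axis=-1).reshape(-1, 2)

class TinyINR:
    """Toy coordinate MLP: intensity = f(y, x). Weights are random here;
    a real model would predict them from image features."""
    def __init__(self, hidden=32, seed=0):
        rng = np.random.default_rng(seed)
        self.w1 = rng.normal(size=(2, hidden))
        self.b1 = np.zeros(hidden)
        self.w2 = rng.normal(size=(hidden, 1))

    def __call__(self, coords):
        h = np.tanh(coords @ self.w1 + self.b1)
        return (h @ self.w2).reshape(-1)

inr = TinyINR()
# The same continuous function can be sampled at any magnification.
for h, w in [(16, 16), (24, 24), (37, 53)]:
    img = inr(make_coord_grid(h, w)).reshape(h, w)
    print(img.shape)
```

Because the coordinates, not the network, carry the resolution, non-integer magnifications such as 2.3x fall out for free.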

Single-image 3D shape reconstruction has attracted significant attention with the advance of generative models. Recent studies have utilized diffusion models to achieve unprecedented shape reconstruction quality. However, these methods perform denoising in a single forward pass at each sampling step, leading to cumulative errors that severely impair the geometric consistency of the generated shapes with the input targets and cause difficulties in reconstructing the rich details of complex 3D shapes.

The vulnerability of 3D point cloud analysis to unpredictable rotations poses an open yet challenging problem: orientation-aware 3D domain generalization. Cross-domain robustness and adaptability of 3D representations are crucial but not easily achieved through rotation augmentation. Motivated by the inherent advantages of intricate orientations in enhancing generalizability, we propose an innovative rotation-adaptive domain generalization framework for 3D point cloud analysis.

Text-to-image generation models have significantly broadened the horizons of creative expression through the power of natural language. However, navigating these models to generate unique concepts, alter their appearance, or reimagine them in unfamiliar roles presents an intricate challenge. For instance, how can we exploit language-guided models to transpose an anime character into a different art style, or envision a beloved character in a radically different setting or role? This paper unveils a novel approach named DreamAnime, designed to provide this level of creative freedom.

Throughout history, static paintings have captivated viewers within display frames, yet the possibility of making these masterpieces vividly interactive remains intriguing. This research paper introduces 3DArtmator, a novel approach that aims to represent artforms in a highly interpretable stylized space, enabling 3D-aware animatable reconstruction and editing. Our rationale is to transfer the interpretability and 3D controllability of the latent space in a 3D-aware GAN to a stylized sub-space of a customized GAN, revitalizing the original artforms.

Using a sequence of discrete still images to tell a story or introduce a process has become a tradition in the field of digital visual media. With the surge in such media and the demands of downstream tasks, quickly identifying their main topics or genres has become urgently needed. As a representative form of this media, comics have enjoyed a huge boom since going digital.

Converting a human portrait to anime style is a desirable but challenging problem. Existing methods fail to resolve it due to the large inherent gap between the two domains, which cannot be overcome by a simple direct mapping. For this reason, these methods struggle to preserve the appearance features in the original photo.

Photorealistic multiview face synthesis from a single image is a challenging problem. Existing works mainly learn a texture mapping model from the source to the target faces. However, they rarely consider the geometric constraints on the internal deformation arising from pose variations, which causes a high level of uncertainty in face pose modeling, and hence, produces inferior results for large pose variations.

Occluding effects have frequently been used to present weather conditions and environments in cartoon animations, such as rain, snow, moving leaves, and moving petals. While these effects greatly enrich the visual appeal of cartoon animations, they may also cause undesired occlusions on the content area, which significantly complicate the analysis and processing of the animations. In this article, we make the first attempt to separate the occluding effects and content for cartoon animations.

In the above article [1], unfortunately, Fig. 5 was not displayed correctly with many empty images. The correct version is supplemented here.

Existing GAN-based multi-view face synthesis methods rely heavily on "creating" faces, and thus they struggle in reproducing the faithful facial texture and fail to preserve identity when undergoing a large angle rotation. In this paper, we combat this problem by dividing the challenging large-angle face synthesis into a series of easy small-angle rotations, and each of them is guided by a face flow to maintain faithful facial details. In particular, we propose a Face Flow-guided Generative Adversarial Network (FFlowGAN) that is specifically trained for small-angle synthesis.
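The decomposition step itself is simple to state: a large target rotation is split into a chain of small sub-rotations, each handled by the small-angle generator. A minimal sketch of that scheduling logic (the step size and function names are illustrative, not taken from the paper):

```python
import numpy as np

def decompose_rotation(start_yaw, target_yaw, step_deg=15.0):
    """Split a large yaw rotation into a sequence of small-angle targets.
    Each intermediate pose would then be synthesized by one pass of a
    flow-guided small-angle generator."""
    total = target_yaw - start_yaw
    n_steps = max(1, int(np.ceil(abs(total) / step_deg)))
    return [start_yaw + total * (i + 1) / n_steps for i in range(n_steps)]

# Rotating a face from 0 to 75 degrees becomes five 15-degree sub-problems.
print(decompose_rotation(0.0, 75.0))  # [15.0, 30.0, 45.0, 60.0, 75.0]
```

Each sub-problem stays inside the regime where the face flow can reliably warp fine texture, which is what makes the overall large-angle synthesis tractable.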

With the explosive growth of action categories, zero-shot action recognition aims to extend a well-trained model to novel/unseen classes. To bridge the large knowledge gap between seen and unseen classes, in this brief, we visually associate unseen actions with seen categories in a visually connected graph, and the knowledge is then transferred from the visual feature space to the semantic space via grouped attention graph convolutional networks (GAGCNs). In particular, we extract visual features for all the actions, and a visually connected graph is built to attach seen actions to visually similar unseen categories.
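A simplified stand-in for the visually connected graph can be built by linking each unseen action to its most visually similar seen categories under cosine similarity; the sketch below shows only that graph-construction step (the grouped attention and graph convolution are omitted, and the interface is hypothetical):

```python
import numpy as np

def build_visual_graph(seen_feats, unseen_feats, k=2):
    """Attach each unseen action to its k most visually similar seen
    categories by cosine similarity, returning {unseen_idx: [(seen_idx, sim)]}."""
    def normalize(x):
        return x / np.linalg.norm(x, axis=1, keepdims=True)
    sim = normalize(unseen_feats) @ normalize(seen_feats).T
    edges = {}
    for u in range(len(unseen_feats)):
        nbrs = np.argsort(-sim[u])[:k]  # top-k most similar seen classes
        edges[u] = [(int(s), float(sim[u, s])) for s in nbrs]
    return edges

seen = np.eye(3)                        # three seen-class visual prototypes
unseen = np.array([[1.0, 0.9, 0.0]])    # one unseen-class feature
print(build_visual_graph(seen, unseen))
```

Knowledge then flows along these edges, so an unseen action inherits semantics from the seen categories it most resembles visually.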

Deep learning has been recently demonstrated as an effective tool for raster-based sketch simplification. Nevertheless, it remains challenging to simplify extremely rough sketches. We found that a simplification network trained with a simple loss, such as pixel loss or discriminator loss, may fail to retain the semantically meaningful details when simplifying a very sketchy and complicated drawing.

Superfluidity is a special state of matter exhibiting macroscopic quantum phenomena and acting like a fluid with zero viscosity. In such a state, superfluid vortices exist as phase singularities of the model equation with unique distributions. This paper presents novel techniques to aid the visual understanding of superfluid vortices based on the state-of-the-art non-linear Klein-Gordon equation, which evolves a complex scalar field, giving rise to special vortex lattice/ring structures with dynamic vortex formation, reconnection, and Kelvin waves, etc.
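For reference, a common dimensionless form of the non-linear Klein-Gordon equation for a complex scalar field (the exact normalization used in the paper may differ) is

```latex
\frac{\partial^2 \psi}{\partial t^2} = \nabla^2 \psi + \left(1 - |\psi|^2\right)\psi
```

where the superfluid vortices are the nodal lines $\psi = 0$ around which the phase of $\psi$ winds by $2\pi$, which is why they appear as phase singularities of the field.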

Most graphics hardware features memory to store textures and vertex data for rendering. However, given the irreversible trend toward ever more complex scenes, rendering a scene can easily exhaust these memory resources. Thus, vertex data are preferably compressed, with the requirement that they can be decompressed during rendering.
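A common way to make vertex data compact yet cheaply decodable at render time is fixed-point quantization inside the mesh bounding box; the sketch below shows that generic scheme under stated assumptions (it is illustrative, not the specific codec proposed in the paper):

```python
import numpy as np

def quantize_vertices(verts, bits=12):
    """Uniformly quantize float vertex positions into fixed-point integers
    within the mesh bounding box."""
    lo, hi = verts.min(axis=0), verts.max(axis=0)
    # Avoid division by zero on degenerate (flat) axes.
    scale = (2**bits - 1) / np.where(hi > lo, hi - lo, 1.0)
    q = np.round((verts - lo) * scale).astype(np.uint16)
    return q, lo, scale

def dequantize_vertices(q, lo, scale):
    # Cheap enough to evaluate per vertex in a shader during rendering.
    return q.astype(np.float64) / scale + lo

verts = np.array([[0.0, 0.0, 0.0], [1.0, 2.0, 3.0], [0.5, 1.0, 1.5]])
q, lo, scale = quantize_vertices(verts)
recon = dequantize_vertices(q, lo, scale)
print(np.abs(recon - verts).max())  # bounded by 0.5 / scale per axis
```

At 12 bits per component, positions shrink from 96 to 36 bits per vertex, with reconstruction error bounded by half a quantization step.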

While ASCII art is a popular art form worldwide, automatically generating structure-based ASCII art from natural photographs remains challenging. The major challenge lies in extracting the perception-sensitive structure from natural photographs so that a more concise ASCII art reproduction can be produced from that structure. However, due to the excessive amount of texture in natural photos, extracting perception-sensitive structure is not easy, especially when the structure may be weak and lie within a texture region.

Exterior orientation parameters' (EOP) estimation using space resection plays an important role in topographic reconstruction for push broom scanners. However, existing models of space resection are highly sensitive to errors in data. Unfortunately, for lunar imagery, the altitude data at the ground control points (GCPs) for space resection are error-prone.

Traditional methods in graphics for simulating liquid-air dynamics under different scenarios usually employ separate approaches with sophisticated interface tracking/reconstruction techniques. In this paper, we propose a novel unified approach that easily and effectively produces a variety of liquid-air interface phenomena. These phenomena, such as complex surface splashes, bubble interactions, and surface tension effects, can co-exist in one single simulation, and are created within the same computational framework.

Article Synopsis
  • We introduce a new biometric recognition method that uses inner knuckle prints (IKPs), which is effective in varying lighting, different hand positions, and low-quality images.
  • Our approach includes a unique feature extraction technique that emphasizes important details while minimizing errors, and a structure-context descriptor that deals with changes in hand orientation.
  • Compared to existing methods, our technique is more flexible and accurate, especially in uncontrolled environments, marking a significant advancement in low-resolution hand biometrics.