The unprecedented advancements in Large Language Models (LLMs) have shown a profound impact on natural language processing but are yet to fully embrace the realm of 3D understanding. This paper introduces PointLLM, a preliminary effort to fill this gap, empowering LLMs to understand point clouds and offering a new avenue beyond 2D data. PointLLM understands colored object point clouds with human instructions, including coordinate-based part specifications, and generates contextually appropriate responses, illustrating its grasp of point clouds and common sense. Specifically, it leverages a point cloud encoder with a powerful LLM to effectively fuse geometric, appearance, and linguistic information. To overcome the scarcity of point-text instruction following data, we developed an automated data generation pipeline, collecting a large-scale dataset of about 1.8M samples with 1M different 3D objects, which facilitates the adoption of the two-stage training strategy prevalent in MLLM development. Additionally, we address the absence of appropriate benchmarks and the limitations of current evaluation metrics by proposing two novel benchmarks: Generative 3D Object Classification and 3D Object Captioning, which are supported by new, comprehensive evaluation metrics derived from human and GPT analyses. Through exploring various training strategies, we develop PointLLM, significantly outperforming 2D and 3D baselines and achieving SOTA performance, with a notable achievement in object captioning tasks where it surpasses human annotators in over 50% of the samples. Codes, datasets, and benchmarks will be available at https://github.com/OpenRobotLab/PointLLM.
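The abstract does not spell out how the point cloud encoder and the LLM are connected. A common pattern in multimodal LLMs, assumed here rather than taken from the paper, is to project the encoder's output features into the LLM's token-embedding space and prepend them to the embedded text instruction. The sketch below illustrates only that pattern with random numbers; all names, dimensions, and the linear projection are hypothetical.

```python
import random


def linear_project(features, weight):
    """Map each feature vector (length d_in) to length d_out using weight (d_in x d_out)."""
    d_in, d_out = len(weight), len(weight[0])
    return [
        [sum(f[i] * weight[i][j] for i in range(d_in)) for j in range(d_out)]
        for f in features
    ]


def build_llm_input(point_features, text_embeddings, weight):
    """Project point tokens into the text embedding space, then prepend them to the text tokens."""
    point_tokens = linear_project(point_features, weight)
    return point_tokens + text_embeddings


rng = random.Random(0)
D_POINT, D_LLM = 4, 6                    # hypothetical feature sizes
N_POINT_TOKENS, N_TEXT_TOKENS = 3, 5     # hypothetical sequence lengths

# Stand-ins for encoder outputs, embedded instruction tokens, and a learned projection.
point_features = [[rng.gauss(0, 1) for _ in range(D_POINT)] for _ in range(N_POINT_TOKENS)]
text_embeddings = [[rng.gauss(0, 1) for _ in range(D_LLM)] for _ in range(N_TEXT_TOKENS)]
weight = [[rng.gauss(0, 1) for _ in range(D_LLM)] for _ in range(D_POINT)]

sequence = build_llm_input(point_features, text_embeddings, weight)
print(len(sequence), len(sequence[0]))  # 8 tokens, each of width 6
```

In a real system the projection would be trained and the encoder and LLM would be actual networks; the point here is only that both modalities end up as one token sequence of a shared width.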
DOI: http://dx.doi.org/10.1109/TPAMI.2025.3590784
Am J Ophthalmol
September 2025
Singapore Eye Research Institute, Singapore National Eye Centre, Singapore; Duke-NUS Graduate Medical School, Singapore; Department of Ophthalmology, Emory University School of Medicine, Emory University; Department of Biomedical Engineering, Georgia Institute of Technology/Emory University, Atlanta
Purpose: To characterize the 3D structural phenotypes of the optic nerve head (ONH) in patients with glaucoma, high myopia, and concurrent high myopia and glaucoma, and to evaluate their variations across these conditions.
Design: Retrospective cross-sectional study.
Participants: A total of 685 optical coherence tomography (OCT) scans from 754 subjects of Singapore-Chinese ethnicity, including 256 healthy (H), 94 highly myopic (HM), 227 glaucomatous (G), and 108 highly myopic with glaucoma (HMG) cases.
Methods: We segmented the retinal and connective tissue layers from OCT volumes, and their boundary edges were converted into 3D point clouds.
Neural Netw
September 2025
School of Automation and Intelligent Sensing, Shanghai Jiao Tong University, Shanghai, 200240, China; Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, Shanghai, 200240, China; Institute of Medical Robotics, Shanghai Jiao Tong University, Shanghai, 200240, China.
3D shape defect detection plays an important role in autonomous industrial inspection. However, accurate detection of anomalies remains challenging due to the complexity of multimodal sensor data, especially when both color and structural information are required. In this work, we propose a lightweight inter-modality feature prediction framework that effectively utilizes multimodal fused features from the inputs of RGB, depth and point clouds for efficient 3D shape defect detection.
Phys Chem Chem Phys
September 2025
Department of Chemistry, Veer Narmad South Gujarat University (VNSGU), Udhna - Magdalla Road, Surat-395007, Gujarat, India.
This work reports the nanoscale micellar formation in single and mixed surfactant systems by combining an amphiphilic graft copolymer, Soluplus® (primary surfactant), blended with other polyoxyethylene (POE)-based nonionic surfactants such as Kolliphor® HS15, Kolliphor® EL, Tween-80, TPGS®, and Pluronics® P123 in an aqueous solution environment. The solution behaviour of each surfactant as a single system was analyzed over a wide range of surfactant concentrations and temperatures. Rheological measurements revealed distinct solution behaviour in the case of Soluplus®, ranging from low-viscosity, fluid-like behaviour at ≤20% w/v to a highly viscous state at ≥90% w/v, where the loss modulus (G'') exceeded the storage modulus (G').
Neurology
September 2025
Florey Department of Neuroscience and Mental Health, University of Melbourne, Australia.
Background And Objectives: Stroke is a leading cause of long-term disability. Etanercept, a competitive tumor necrosis factor-α inhibitor, has been proposed as a potential treatment for post-stroke impairments when given through a perispinal subcutaneous injection. We aimed to evaluate the safety and efficacy of perispinal etanercept in patients with chronic stroke.
PLoS One
September 2025
School of Mechanical and Electrical Engineering, China University of Mining and Technology (Beijing), Beijing, China.
Multi-modal data fusion plays a critical role in enhancing the accuracy and robustness of perception systems for autonomous driving, especially for the detection of small objects. However, small object detection remains particularly challenging due to sparse LiDAR points and low-resolution image features, which often lead to missed or imprecise detections. Currently, many methods process LiDAR point clouds and visible-light camera images separately, and then fuse them in the detection head.