3D Computer Vision

Proof pending

17 papers

6.8 viability

-67% (30d)


Proof pending. Core topic summary fields are still materializing.

State of the Field

3D computer vision is advancing rapidly, with recent work concentrated on object detection, scene understanding, and interaction modeling. Recent innovations include techniques for identifying repeated objects, robust point cloud registration, and monocular 3D detection from sparse annotations. These developments matter for augmented reality, autonomous driving, and robotics, where accurate 3D perception is essential. With new architectures and datasets, researchers are addressing occlusion, noise, and the need for real-time processing. This progress improves the quality of 3D models and supports more intuitive human-computer interaction, which makes it relevant to builders integrating 3D capabilities into their products.

Last updated May 27, 2026

Topic-linked question coverage is still building for this proof surface.

Topic trend

Topic-specific paper and score movement from the daily diff ledger.

Papers

1-10 of 17

Research Paper·Mar 25, 2026

Lookalike3D: Seeing Double in 3D

3D object understanding and generation methods produce impressive results, yet they often overlook a pervasive source of information in real-world scenes: repeated objects. We introduce the task of lo...

8.0 viability

Research Paper·Mar 13, 2026

CMHANet: A Cross-Modal Hybrid Attention Network for Point Cloud Registration

Robust point cloud registration is a fundamental task in 3D computer vision and geometric deep learning, essential for applications such as large-scale 3D reconstruction, augmented reality, and scene ...

8.0 viability

Research Paper·Apr 2, 2026

MonoSAOD: Monocular 3D Object Detection with Sparsely Annotated Label

Monocular 3D object detection has achieved impressive performance on densely annotated datasets. However, it struggles when only a fraction of objects are labeled due to the high cost of 3D annotation...

7.0 viability·Has code

Research Paper·Apr 2, 2026

Semantic Segmentation of Textured Non-manifold 3D Meshes using Transformers

Textured 3D meshes jointly represent geometry, topology, and appearance, yet their irregular structure poses significant challenges for deep-learning-based semantic segmentation. While a few recent me...

7.0 viability

Research Paper·Apr 1, 2026

IGLOSS: Image Generation for Lidar Open-vocabulary Semantic Segmentation

This paper presents a new method for the zero-shot open-vocabulary semantic segmentation (OVSS) of 3D automotive lidar data. To circumvent the recognized image-text modality gap that is intrinsic to a...

7.0 viability·Has code

Research Paper·Mar 19, 2026

Generalized Hand-Object Pose Estimation with Occlusion Awareness

Generalized 3D hand-object pose estimation from a single RGB image remains challenging due to the large variations in object appearances and interaction patterns, especially under heavy occlusion. We ...

7.0 viability

Research Paper·Mar 30, 2026

Hg-I2P: Bridging Modalities for Generalizable Image-to-Point-Cloud Registration via Heterogeneous Graphs

Image-to-point-cloud (I2P) registration aims to align 2D images with 3D point clouds by establishing reliable 2D-3D correspondences. The drastic modality gap between images and point clouds makes it c...

7.0 viability·Has code

Research Paper·May 15, 2026·Media & Entertainment·B2B

Robust Prior-Guided Segmentation for Editable 3D Gaussian Splatting

3D Gaussian Splatting (3D-GS) enables real-time 3D scene reconstruction but lacks robust segmentation for editing tasks such as object removal, extraction, and recoloring. Existing approaches that lif...

7.0 viability

Research Paper·Mar 30, 2026

SHOW3D: Capturing Scenes of 3D Hands and Objects in the Wild

Accurate 3D understanding of human hands and objects during manipulation remains a significant challenge for egocentric computer vision. Existing hand-object interaction datasets are predominantly cap...

7.0 viability

Research Paper·Mar 19, 2026

DriveTok: 3D Driving Scene Tokenization for Unified Multi-View Reconstruction and Understanding

With the growing adoption of vision-language-action models and world models in autonomous driving systems, scalable image tokenization becomes crucial as the interface for the visual modality. However...

7.0 viability·Has code

