ByteDance Unveils Depth Anything 3 AI Model That Creates 3D Geometry from Any Visual Input
Summary
ByteDance has launched Depth Anything 3, an AI model that turns any visual input into accurate 3D geometry using a single transformer backbone. The company reports that it substantially outperforms previous models across depth estimation and 3D generation tasks, and the release is open source, with multiple model variants and interactive tools.
Key Points
- ByteDance releases Depth Anything 3 (DA3), a new AI model that predicts spatially consistent 3D geometry from any visual input using a single transformer backbone and a unified depth-ray representation
- DA3 significantly outperforms previous models such as DA2 and VGGT across monocular depth estimation, multi-view depth estimation, camera pose estimation, and 3D Gaussian generation
- The open-source release includes model variants ranging from 0.08B to 1.40B parameters, along with an interactive web UI, a command-line interface, and support for export formats including GLB and PLY
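To make the depth-ray idea above concrete, here is a minimal, hypothetical sketch (not DA3's actual code) of the standard geometry it builds on: a per-pixel depth map combined with pinhole camera rays yields a 3D point cloud. The function name and intrinsics values are illustrative assumptions.

```python
# Illustrative sketch only: lifting a depth map to 3D points via camera rays.
# This is generic pinhole-camera geometry, not DA3's implementation.
import numpy as np

def unproject_depth(depth: np.ndarray, fx: float, fy: float,
                    cx: float, cy: float) -> np.ndarray:
    """Lift an H x W depth map to an (H*W, 3) point cloud in camera space."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    # Ray direction through each pixel under a pinhole camera model.
    x = (u - cx) / fx
    y = (v - cy) / fy
    rays = np.stack([x, y, np.ones_like(x, dtype=float)], axis=-1)
    # Scale each ray by its predicted depth to obtain a 3D point.
    points = rays * depth[..., None]
    return points.reshape(-1, 3)

# Toy example: a flat surface 2 m from the camera, with made-up intrinsics.
depth = np.full((4, 4), 2.0)
pts = unproject_depth(depth, fx=4.0, fy=4.0, cx=2.0, cy=2.0)
print(pts.shape)  # (16, 3)
```

A model predicting both a depth map and per-pixel rays in this shared form can keep its outputs spatially consistent across views, which is the role the article ascribes to DA3's unified representation.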