NVIDIA Launches Cosmos 3 AI Model to Power Robots and Autonomous Vehicles With Real-World Understanding

Jun 02, 2026
NVIDIA Blog

Summary

NVIDIA unveils Cosmos 3, an open world foundation model that combines vision reasoning, multimodal generation, and native action output to help robots and autonomous vehicles understand real-world scenarios. The model tops multiple open-weight benchmarks and is available now to developers worldwide.

Key Points

  • NVIDIA announced Cosmos 3, a new open world foundation model, at GTC Taipei at COMPUTEX. It combines vision reasoning and multimodal generation across text, video, images, sound, and action to help robots, autonomous vehicles, and smart space systems understand and predict real-world scenarios.
  • Cosmos 3 features a mixture-of-transformers architecture that enables native action generation, producing numerical data such as joint angles and trajectory points. Developers can fine-tune the model for specific robotic tasks, embodiments, and environments, and companies such as Agile Robots are already using it to generate diverse action-conditioned training data at scale.
  • Cosmos 3 tops multiple open-weight benchmarks, including Physics-IQ, R-Bench, PAI-Bench, VANTAGE-Bench, and TAR. It is available now to developers via build.nvidia.com, Hugging Face, GitHub, and NVIDIA NIM microservices under the open OpenMDW 1.1 license from the Linux Foundation.
