Computer Vision - News Articles

Tsinghua University and Tencent Hunyuan Release Spatial-TTT, a Streaming Spatial Intelligence Framework Achieving State-of-the-Art Video Benchmark Results

Mar 16, 2026

GitHub

Tsinghua University and Tencent Hunyuan have released Spatial-TTT, a streaming spatial intelligence framework that uses Test-Time Training to continuously update spatial memory from live video streams. It achieves state-of-the-art results on video spatial benchmarks such as VSI-Bench, and the code, a 97k-sample dataset, and a lightweight model are now publicly available.

Computer Vision, Deep Learning, Research, Generative AI, Machine Learning

Claude Now Generates Interactive Charts and Visualizations Directly Inside Chat Conversations

Mar 13, 2026

Claude

Claude now generates interactive charts and visualizations directly inside chat conversations in real time, and users can request adjustments in natural language. The feature is part of a broader push to enhance response formats across all plan types.

Generative AI, Human-Computer Interaction, Natural Language, Data Analytics, Computer Vision

Google Maps Launches AI-Powered 'Ask Maps' and Immersive 3D Navigation in Major Overhaul

Mar 13, 2026

Google

Google Maps has launched 'Ask Maps,' a Gemini-powered AI feature that delivers personalized location recommendations drawn from 300 million places. It ships alongside the app's biggest driving overhaul in a decade, featuring immersive 3D navigation, smart lane guidance, and real-time route disruption alerts, now rolling out in the U.S. and India.

Computer Vision, Natural Language, Generative AI, Human-Computer Interaction, Transportation

Anthropic Adds Real-Time Visual Generation to Claude, Enabling Interactive Charts and Diagrams in Live Conversations

Mar 13, 2026

The Deep View

Anthropic's Claude now generates real-time interactive charts and diagrams directly within conversations. The feature is free for all users, and the visuals evolve as the discussion progresses, signaling a shift in how AI tools support visual learning and education.

Generative AI, Education, Human-Computer Interaction, Computer Vision, Natural Language

OpenAI Integrates Sora Video Generator Into ChatGPT Amid Deepfake Fears and Rising Competition

Mar 11, 2026

The Verge

OpenAI is integrating its Sora video generator directly into ChatGPT. The move has raised deepfake fears, since the tool has already been used to create realistic fake videos of historical figures, and it signals a competitive push against Anthropic's rapidly growing Claude platform.

Generative AI, Computer Vision, Ethics, Media, Security

Niantic Spatial Uses 30 Billion Pokémon Go Images to Build Centimeter-Precise AI Navigation System for Real-World Robots

Mar 11, 2026

MIT Technology Review

Niantic Spatial is turning 30 billion Pokémon Go player images into a centimeter-precise AI navigation system. The system now powers Coco Robotics' 1,000 delivery robots across U.S. and European cities where GPS falls short.

Robotics, Computer Vision, Autonomous Systems, Augmented Reality, Smart Cities

Google Launches First Natively Multimodal Embedding Model, Gemini Embedding 2, Supporting Text, Images, Video, and Audio in a Unified Space

Mar 11, 2026

Google

Google has launched Gemini Embedding 2, its first natively multimodal embedding model. It processes text, images, video, audio, and documents together in a single unified space, supports 100+ languages, and is now available in public preview via the Gemini API and Vertex AI.

Generative AI, Deep Learning, Natural Language, Computer Vision, Audio

Tencent AI Lab Launches Penguin-VL: A Compact Vision-Language Model That Ditches Traditional Visual Encoders for LLM-Based Architecture

Mar 10, 2026

GitHub

Tencent AI Lab has launched Penguin-VL, a compact vision-language model that replaces traditional visual encoders with a text-LLM-initialized architecture. It delivers stronger fine-grained visual understanding and efficient long-video processing, and two model variants are now live on Hugging Face.

Computer Vision, Deep Learning, Generative AI, Natural Language, Machine Learning

MIT Researchers Unveil AI Technique That Makes Computer Vision Models More Accurate and Explainable

Mar 10, 2026

MIT News | Massachusetts Institute of Technology

MIT researchers have unveiled an AI technique that makes computer vision models both more accurate and more explainable by extracting learned concepts and translating them into plain-language descriptions. It outperforms existing interpretable models on tasks such as bird species identification and skin lesion detection.

Computer Vision, Deep Learning, Healthcare, Natural Language, Research

DeepSeek, Tencent, and University of Hong Kong Launch Pointer-CAD to Revolutionize AI-Powered 3D Design

Mar 08, 2026

South China Morning Post

DeepSeek, Tencent, and the University of Hong Kong have unveiled Pointer-CAD, an open-source AI framework built on Alibaba's Qwen 2.5 model. It improves 3D design precision by reducing segmentation errors and streamlining complex entity selection in computer-aided design.

Computer Vision, Generative AI, Deep Learning, Manufacturing, Research