Computer Vision

Tsinghua University and Tencent Hunyuan Release Spatial-TTT, a Streaming Spatial Intelligence Framework Achieving State-of-the-Art Video Benchmark Results

Mar 16, 2026
GitHub

Tsinghua University and Tencent Hunyuan have released Spatial-TTT, a streaming spatial intelligence framework that uses Test-Time Training to continuously update spatial memory from live video streams. It reports state-of-the-art results on video spatial benchmarks such as VSI-Bench, and the code, a 97k-sample dataset, and a lightweight model are now publicly available.

Google Maps Launches AI-Powered 'Ask Maps' and Immersive 3D Navigation in Major Overhaul

Mar 13, 2026
Google

Google Maps has launched 'Ask Maps,' a Gemini-powered AI feature that delivers personalized location recommendations from 300 million places. It arrives alongside the app's biggest driving overhaul in a decade, which features immersive 3D navigation, smart lane guidance, and real-time route disruption alerts, now rolling out in the U.S. and India.

Anthropic Adds Real-Time Visual Generation to Claude, Enabling Interactive Charts and Diagrams in Live Conversations

Mar 13, 2026
The Deep View

Anthropic's Claude now generates interactive charts and diagrams in real time, directly within conversations. The feature is free for all users, and the visuals update as the discussion progresses, a change presented as a shift in how AI tools support visual learning and education.

Tencent AI Lab Launches Penguin-VL: A Compact Vision-Language Model That Ditches Traditional Visual Encoders for LLM-Based Architecture

Mar 10, 2026
GitHub

Tencent AI Lab launches Penguin-VL, a compact vision-language model that ditches traditional visual encoders in favor of a text-LLM-initialized architecture, delivering stronger fine-grained visual understanding and efficient long-video processing, with two model variants now live on Hugging Face.

MIT Researchers Unveil AI Technique That Makes Computer Vision Models More Accurate and Explainable

Mar 10, 2026
MIT News | Massachusetts Institute of Technology

MIT researchers have unveiled a technique that makes computer vision models both more accurate and more explainable by extracting the concepts a model has learned and translating them into plain-language descriptions. It outperforms existing interpretable models on tasks such as bird species identification and skin lesion detection.

DeepSeek, Tencent, and University of Hong Kong Launch Pointer-CAD to Revolutionize AI-Powered 3D Design

Mar 08, 2026
South China Morning Post

DeepSeek, Tencent, and the University of Hong Kong have unveiled Pointer-CAD, an open-source AI framework built on Alibaba's Qwen 2.5 model. It is reported to improve 3D design precision in computer-aided design by reducing segmentation errors and streamlining complex entity selection.
