Computer Vision - News Articles

Google Launches Computer Use in Gemini 3.5 Flash, Enabling AI Agents to Control Browsers, Mobile, and Desktop

Jun 25, 2026

Google

Google launches computer use in Gemini 3.5 Flash, empowering AI agents to see, reason, and take control of browsers, mobile, and desktop environments, while deploying new security safeguards to protect against prompt injection attacks.

Autonomous Systems, Computer Vision, Security, Human-Computer Interaction, Generative AI

Baidu Launches Unlimited-OCR: Open-Source Document Parsing Model Gains 5.6K GitHub Stars

Jun 24, 2026

GitHub

Baidu's newly released Unlimited-OCR, an open-source long-horizon document parsing model built on DeepSeek-OCR, has reached 5.6K GitHub stars. It parses single images and multi-page PDFs, and a live demo is available on Hugging Face.

Computer Vision, Deep Learning, Natural Language, Generative AI, Research

Mistral Launches OCR 4 With 85.20 Benchmark Score, 8x Cost Reduction, and 170-Language Support

Jun 24, 2026

Mistral AI

Mistral launches OCR 4, a document recognition model that scores 85.20 on OlmOCRBench and supports 170 languages. It delivers up to 8x cost reduction and 17x lower latency than competitors, and is available now via API at $4 per 1,000 pages.

Natural Language, Deep Learning, Computer Vision, Big Data, Cloud Computing

Alibaba Cloud Launches HappyHorse 1.1 AI Video Model, Ranks No. 2 Globally as OpenAI and ByteDance Retreat from Market

Jun 23, 2026

VentureBeat

Alibaba Cloud rises to No. 2 globally in AI video generation with its newly launched HappyHorse 1.1 model, capitalizing on a rapidly contracting market as OpenAI shuts down Sora and ByteDance freezes its rival product. Geopolitical tensions and a Pentagon military designation, however, threaten to limit Western enterprise adoption.

Generative AI, Computer Vision, Geopolitics, Cloud Computing, Media

AI Fuels AR Glasses Surge as Smart Eyewear Sales Jump 167% and Industry Nears Mainstream Breakthrough

Jun 22, 2026

The Deep View

Smart glasses sales rose 167% year-over-year in Q1 2026 as AI boosts AR technology, with broader AR/VR categories growing 86%. The growth is driven by consumer demand for real-world AI assistants and by advances in chip miniaturization and display technology that are moving the industry closer to mainstream adoption.

Augmented Reality, Hardware, Computer Vision, Human-Computer Interaction, Generative AI

AI Filmmaking Goes Mainstream: Runway's Festival Highlights Hollywood's Generative AI Revolution

Jun 22, 2026

The Deep View

Runway's AI Film Festival in Santa Monica puts generative AI filmmaking in the spotlight, showcasing ten short films that blend 4K AI-generated shots with traditional footage, as Hollywood accelerates its embrace of the technology through major studio deals and high-profile advisers like Martin Scorsese, even amid growing pushback from creatives …

Generative AI, Art, Entertainment, Media, Computer Vision

Microsoft's Free 12-Week AI Curriculum Hits 48,000 GitHub Stars, Making AI Education Accessible Worldwide

Jun 19, 2026

GitHub

Microsoft's free, open-source 'AI For Beginners' curriculum has reached 48,000 GitHub stars. It delivers 12 weeks of hands-on AI education, covering neural networks, computer vision, and ethics, in over 50 languages to learners worldwide.

Education, Deep Learning, Computer Vision, Natural Language, Ethics

New Open-Source Tool 'Lift' Extracts Structured JSON from PDFs with 90% Accuracy, Outperforming Azure and NuExtract3

Jun 19, 2026

GitHub

A new open-source tool called 'Lift' launches on GitHub, achieving 90.2% accuracy in extracting structured JSON data from PDFs and images and outperforming Azure Content Understanding and NuExtract3 across 225 benchmark documents. A managed API version raises accuracy to 95.9%.

Natural Language, Computer Vision, Deep Learning, Big Data, Data Analytics

AI Startup General Intuition Seeks $300M at $2B Valuation, Backed by Bezos and Schmidt

Jun 19, 2026

TechCrunch

AI startup General Intuition is raising $300M at a $2B valuation, backed by Jeff Bezos and Eric Schmidt, leveraging 2 billion gaming videos annually to train embodied AI agents with unique spatial-temporal reasoning capabilities.

Gaming, Autonomous Systems, Deep Learning, Generative AI, Computer Vision

NVIDIA Launches Open Source XR AI Beta, Bringing Real-Time Visual and Voice Intelligence to AR Glasses and XR Headsets

Jun 18, 2026

NVIDIA Technical Blog

NVIDIA launches an open source XR AI beta platform enabling developers to build real-time visual and voice AI agents for AR glasses and XR headsets, powered by GPU-accelerated services including Cosmos, Nemotron, and NeMo Agent Toolkit.

Augmented Reality, Generative AI, Computer Vision, Audio, Hardware