Thinking Machines Lab Launches 'Interaction Models' Capable of Real-Time Multimodal AI With No External Scaffolding
Summary
Thinking Machines Lab unveils 'interaction models,' a new class of AI that natively handles real-time audio, video, and text simultaneously, with no external scaffolding. Built on a 200ms micro-turn design, the models introduce capabilities such as proactive visual reaction and time-triggered speech that, according to the lab, no existing commercial model currently offers.
Key Points
- Thinking Machines Lab is unveiling a research preview of 'interaction models,' a new class of AI systems that natively handle real-time, multimodal interaction across audio, video, and text without relying on external scaffolding or harnesses.
- Unlike traditional turn-based AI models, which wait for the user to finish before responding, interaction models use a multi-stream, 200ms micro-turn design. This enables simultaneous speech, visual proactivity, time awareness, and seamless dialog management, keeping humans continuously in the loop.
- Benchmarks released by the lab show TML-Interaction-Small leading competing models in both interaction quality and responsiveness, and demonstrating capabilities, such as proactive visual reaction and time-triggered speech, that no existing commercial model currently performs.
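
The micro-turn design described above can be pictured as a loop that, every 200ms, gathers whatever has arrived on each input stream and lets the model decide whether to speak, react, or stay silent. The sketch below is a minimal illustration of that idea under assumptions from the announcement (the 200ms cadence, parallel audio/video/text streams); every name here (`MicroTurnEngine`, `decide`, the toy policy) is hypothetical and not TML's actual API.

```python
from collections import deque
from dataclasses import dataclass, field

MICRO_TURN_MS = 200  # cadence stated in the announcement

@dataclass
class MicroTurnEngine:
    """Hypothetical sketch of a multi-stream micro-turn loop (not TML's API)."""
    streams: dict = field(default_factory=lambda: {
        "audio": deque(), "video": deque(), "text": deque()})
    clock_ms: int = 0  # a real system would be driven by a wall clock

    def feed(self, stream: str, event: str) -> None:
        """Events arrive asynchronously on any stream, mid-turn included."""
        self.streams[stream].append(event)

    def tick(self):
        """One micro-turn: advance 200ms, drain all streams, pick an action."""
        self.clock_ms += MICRO_TURN_MS
        observed = {name: list(q) for name, q in self.streams.items() if q}
        for q in self.streams.values():
            q.clear()
        return self.decide(observed)

    def decide(self, observed: dict):
        """Stand-in policy: react to visual events even when not addressed
        (proactive visual reaction), reply to text, otherwise stay silent."""
        if "video" in observed:
            return ("react", observed["video"][-1])
        if "text" in observed:
            return ("speak", f"reply to {observed['text'][-1]!r}")
        return ("silent", None)

engine = MicroTurnEngine()
engine.feed("text", "hello")
print(engine.tick())  # the model answers within one 200ms micro-turn
print(engine.tick())  # nothing arrived: staying silent is a valid action
engine.feed("video", "user holds up a diagram")
print(engine.tick())  # proactive reaction to a purely visual event
```

The key contrast with turn-based models is that `tick()` fires on a timer rather than on end-of-user-input, so silence, interruption, and unprompted reaction all fall out of the same loop.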