Computer Vision

397-Billion Parameter AI Model Runs on MacBook Pro With 48GB RAM at 4.4 Tokens Per Second Using Custom C/Metal Engine

Mar 23, 2026
GitHub

Flash-MoE, a custom C/Metal inference engine, runs a 397-billion-parameter AI model on a MacBook Pro with 48GB of RAM, streaming 209GB directly from SSD at 4.4 tokens per second. The project's 58 documented experiments indicate that Apple Silicon's unified memory architecture defies conventional optimization wisdom.
