Apple's MLX Framework Unlocks 4x AI Speedup on New M5 Chip with Enhanced Neural Accelerator Support
Summary
Apple's MLX framework now supports the M5 chip's Neural Accelerators, cutting time-to-first-token by up to 4x compared to the M4. Combined with the M5's 153GB/s memory bandwidth, which speeds token generation by 19-27%, this lets MacBook Pro users run large language models locally with ease.
Key Points
- Apple's MLX framework now supports the Neural Accelerators in the new M5 chip, delivering up to a 4x speedup in time-to-first-token for large language model inference compared to the M4 chip.
- The M5's memory bandwidth of 153GB/s, up 28% from the M4's 120GB/s, drives a 19-27% boost in token generation speed across the tested LLM architectures. A 24GB MacBook Pro can comfortably run models such as an 8B-parameter model in BF16 or a 30B mixture-of-experts model quantized to 4 bits, staying within 18GB of memory.
- MLX supports a wide range of ML tasks, including text generation, image generation, and fine-tuning; it installs via pip and offers Python, Swift, C, and C++ APIs that run on the CPU and GPU of any Apple silicon system.
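As a rough sanity check on the memory figures above, the weight footprint of the two cited models can be estimated from bytes per parameter (assuming 2 bytes per BF16 weight and 0.5 bytes per 4-bit weight; KV cache and activation overhead are excluded, which is why the article's 18GB figure sits above these numbers):

```python
def weight_gb(params_billion: float, bytes_per_param: float) -> float:
    """Approximate model weight memory in GB (decimal), ignoring runtime overhead."""
    return params_billion * 1e9 * bytes_per_param / 1e9

print(weight_gb(8, 2.0))   # 8B model in BF16  -> 16.0 GB
print(weight_gb(30, 0.5))  # 30B MoE at 4-bit  -> 15.0 GB
```

Both estimates fall under the 18GB working budget the article cites for a 24GB machine, leaving headroom for the KV cache and the OS.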
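The pip installation mentioned above can be sketched as follows, using the `mlx-lm` companion package for text generation; the model name is illustrative (any MLX-format model from the mlx-community Hugging Face organization should work):

```shell
# Install MLX and its LLM utilities (requires an Apple silicon Mac)
pip install mlx mlx-lm

# Generate text locally; the model identifier here is an example, not
# one named in the article
mlx_lm.generate --model mlx-community/Llama-3.2-1B-Instruct-4bit \
    --prompt "Explain time-to-first-token in one sentence."
```

The `mlx_lm.generate` command downloads the quantized weights on first use and runs inference entirely on-device.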