Deep Learning

Modal Slashes GPU Cold Start Times From 2,000 to 50 Seconds With Serverless Inference Breakthrough

May 13, 2026
Modal

Modal has cut GPU cold start times from more than 2,000 seconds to just 50 seconds by combining cloud-buffered idle GPUs, lazy-loading filesystems, CPU memory snapshotting, and CUDA checkpoint/restore. The result is 4-10x faster serverless inference for LLM workloads across hundreds of organizations.
