Gartner Predicts 90% Drop in AI Inference Costs by 2030, But Total Spending Set to Surge
Summary
AI inference costs are projected to fall by more than 90% by 2030, yet total spending will surge as agentic AI models consume dramatically more tokens per task, forcing companies to route workloads strategically or risk falling into a costly commoditization trap.
Key Points
- Gartner predicts that by 2030, running inference on a 1-trillion-parameter LLM will cost GenAI providers over 90% less than in 2025, driven by semiconductor improvements, model-design innovations, and wider use of inference-specialized silicon.
- Despite falling token costs, overall inference spending is expected to rise as advanced agentic models require 5 to 30 times more tokens per task than standard chatbots, meaning cheaper tokens will not democratize frontier AI intelligence.
- Gartner analysts warn that Chief Product Officers must route routine tasks to smaller, cost-efficient models while strictly reserving expensive frontier-level inference for high-margin, complex reasoning tasks to avoid being caught in a commoditization trap.
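The interplay between falling prices and rising token consumption can be made concrete with back-of-the-envelope arithmetic. The figures below are illustrative assumptions, not Gartner data: a hypothetical 2025 price per million tokens, a hypothetical routine chat task, and an agentic task at the 30x end of the reported 5-to-30x range.

```python
# Illustrative arithmetic (all dollar figures and token counts are assumed,
# not Gartner's): even if the per-token price falls 90%, a task that needs
# 30x more tokens ends up costing more per task than before.
price_2025 = 10.0                    # hypothetical $ per 1M tokens in 2025
price_2030 = price_2025 * 0.10       # over 90% cheaper, per the prediction

tokens_chatbot = 2_000               # hypothetical tokens for a routine chat task
tokens_agentic = tokens_chatbot * 30 # agentic task at the 30x end of the range

cost_chat_2025 = price_2025 * tokens_chatbot / 1_000_000
cost_agent_2030 = price_2030 * tokens_agentic / 1_000_000

print(f"Routine chat task at 2025 prices: ${cost_chat_2025:.4f}")
print(f"Agentic task at 2030 prices:      ${cost_agent_2030:.4f}")
# 0.10 * 30 = 3, so the agentic task costs 3x more per task
# even though each token is 90% cheaper.
```

This is why cheaper tokens alone do not lower the bill: the 90% price drop is outpaced whenever per-task token consumption grows by more than 10x.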
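The routing strategy Gartner describes can be sketched as a simple dispatch rule. This is a minimal illustration, not a production router; the model names, prices, and the `route` function are hypothetical.

```python
# Minimal sketch of cost-aware model routing (model names and per-token
# prices are hypothetical): routine tasks go to a small, cheap model, and
# expensive frontier-level inference is reserved for complex reasoning.
from dataclasses import dataclass

@dataclass
class Model:
    name: str
    price_per_1m_tokens: float  # hypothetical USD per million tokens

SMALL = Model("small-efficient", 0.50)
FRONTIER = Model("frontier-reasoner", 15.00)

def route(task_complexity: str) -> Model:
    """Send only complex reasoning tasks to the frontier model."""
    if task_complexity == "complex_reasoning":
        return FRONTIER
    return SMALL  # default: routine work stays on the cost-efficient model

print(route("routine").name)            # small-efficient
print(route("complex_reasoning").name)  # frontier-reasoner
```

In practice the routing decision would come from a classifier or heuristics over the request, but the economic point is the same: the default path must be the cheap model, with escalation as the exception.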