Gartner Predicts 90% Drop in AI Inference Costs by 2030, But Total Spending Set to Surge

Mar 27, 2026
Gartner

Summary

AI inference costs are set to fall by more than 90% by 2030, but total spending will still rise because agentic AI models consume far more tokens per task. Companies that do not route workloads strategically risk falling into a costly commoditization trap.

Key Points

  • Gartner predicts that by 2030, performing inference on a 1-trillion-parameter LLM will cost GenAI providers over 90% less than in 2025, driven by semiconductor improvements, model design innovations, and increased use of inference-specialized silicon.
  • Despite falling token costs, overall inference spending is expected to rise because advanced agentic models require 5 to 30 times more tokens per task than standard chatbots. Cheaper tokens therefore will not democratize access to frontier-level AI.
  • Gartner analysts warn that Chief Product Officers must route routine tasks to smaller, cost-efficient models while strictly reserving expensive frontier-level inference for high-margin, complex reasoning tasks to avoid being caught in a commoditization trap.
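
The arithmetic behind these points can be made concrete with a rough sketch. It is not Gartner's model. It assumes the "over 90%" provider cost reduction applies uniformly per token and treats it as exactly 90%, and it takes the 5x and 30x token multipliers directly from the figures above. The function names and the two tier labels ("small", "frontier") are hypothetical, chosen only for illustration.

```python
def relative_task_cost(cost_drop: float, token_multiplier: float) -> float:
    """Cost of one agentic task relative to one 2025 chatbot task.

    cost_drop: fractional fall in per-token cost (0.90 means 90% cheaper).
    token_multiplier: how many times more tokens the agentic task uses.
    A result of 1.0 means the task costs the same as in 2025.
    """
    return (1.0 - cost_drop) * token_multiplier


def pick_model_tier(needs_complex_reasoning: bool, high_margin: bool) -> str:
    """Hypothetical routing rule: frontier inference only for complex,
    high-margin work; everything else goes to a smaller model."""
    if needs_complex_reasoning and high_margin:
        return "frontier"
    return "small"


# Assumption: "over 90%" is treated as exactly 90% for illustration.
COST_DROP = 0.90

for multiplier in (5, 30):
    ratio = relative_task_cost(COST_DROP, multiplier)
    print(f"{multiplier}x tokens -> {ratio:.1f}x the 2025 cost per task")

print(pick_model_tier(needs_complex_reasoning=True, high_margin=True))
print(pick_model_tier(needs_complex_reasoning=False, high_margin=True))
```

Under these assumptions, a task that uses 5 times more tokens costs about half as much as a 2025 chatbot task, while one that uses 30 times more tokens costs about three times as much. That is why a 90% drop in token cost does not by itself lower spending, and why the routing rule sends only complex, high-margin work to the frontier tier.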
