Google Launches Two New Gemini API Tiers Offering Developers Cost Savings and Priority Reliability

Apr 03, 2026
Google

Summary

Google is launching two new Gemini API tiers, Flex and Priority. Flex offers 50% cost savings for latency-tolerant background tasks, while Priority offers the highest reliability for critical real-time applications. Both are available through a single unified interface.

Key Points

  • Google is launching two new Gemini API service tiers, Flex and Priority, giving developers granular control over cost and reliability through a single unified interface.
  • Flex Inference offers 50% cost savings for latency-tolerant background tasks like data enrichment and agentic workflows, using synchronous endpoints without the complexity of async job management.
  • Priority Inference provides the highest reliability for critical, user-facing applications like real-time support bots. Overflow requests are automatically downgraded to the Standard tier instead of failing. It is available to Tier 2 and Tier 3 paid projects.
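
As a rough illustration of the trade-off described in the key points, the sketch below routes a workload to a tier and estimates the Flex price from a Standard price. It is not the Gemini SDK: the `Workload` class, the `choose_tier` and `flex_cost` helpers, and the tier strings are hypothetical names made up for this example. The only figure taken from the article is the 50% Flex saving; Priority pricing is not stated, so it is not modeled.

```python
from dataclasses import dataclass

# 50% savings for Flex, as stated in the article.
FLEX_DISCOUNT = 0.50


@dataclass
class Workload:
    """Hypothetical description of a workload (not part of any Google API)."""
    name: str
    user_facing: bool
    latency_tolerant: bool


def choose_tier(workload: Workload) -> str:
    """Pick a tier label using the article's guidance (illustrative rule only)."""
    # Critical, real-time, user-facing traffic goes to Priority.
    if workload.user_facing and not workload.latency_tolerant:
        return "priority"
    # Background work that can wait goes to Flex for the cost savings.
    if workload.latency_tolerant:
        return "flex"
    # Everything else stays on the default tier.
    return "standard"


def flex_cost(standard_cost: float) -> float:
    """Estimate the Flex cost for a job that would cost `standard_cost` on Standard."""
    return standard_cost * (1 - FLEX_DISCOUNT)


if __name__ == "__main__":
    jobs = [
        Workload("real-time support bot", user_facing=True, latency_tolerant=False),
        Workload("nightly data enrichment", user_facing=False, latency_tolerant=True),
    ]
    for job in jobs:
        print(job.name, "->", choose_tier(job))
    print("Flex cost for a 10.00 Standard job:", flex_cost(10.0))
```

The routing rule is deliberately simple. A real application would also need to decide how to treat Priority requests that are downgraded to Standard under overflow, which the article mentions but does not detail.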
