Anthropic's Batch API Cuts Token Costs 50% But Only Pays Off at Fleet Scale, New Analysis Finds
Anthropic's Batch API halves token costs, but a new analysis finds the savings only pay off at fleet scale: single-agent use suffers turnaround delays of up to 24 hours, and, counterintuitively, batches of the cheaper Haiku models take longer to complete than batches of the pricier Sonnet or Opus.