Google Expands Gemini API With Multimodal Search, Custom Filters, and Page-Level Citations for RAG Systems
Summary
Google has added three features to the Gemini API File Search tool for retrieval-augmented generation (RAG) systems: multimodal search across images and text, custom metadata filtering for faster and more accurate queries, and page-level citations that link AI responses directly to the source pages they draw on.
Key Points
- Google is expanding the Gemini API File Search tool with multimodal support, enabling RAG systems to natively process and search both images and text using the Gemini Embedding 2 model.
- Custom metadata filtering is now available, allowing developers to attach key-value labels to unstructured data and scope queries to specific data slices, improving speed and accuracy in RAG workflows.
- Page-level citations are being introduced, tying each model response to the page numbers of the source documents it draws on, which boosts transparency and lets end users fact-check answers against the original material.
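
To make the filtering and citation ideas concrete, here is a minimal sketch in plain Python. It does not call the Gemini API: the `Chunk` class, the `filter_chunks`, `score`, and `search` helpers, the toy word-overlap scoring, and the sample corpus are all hypothetical stand-ins chosen for illustration. The sketch only shows the general pattern the key points describe, which is to narrow the corpus by key-value metadata first, rank what remains, and return each passage together with its source page.

```python
from dataclasses import dataclass, field


@dataclass
class Chunk:
    """One retrievable passage. Every field name here is hypothetical."""
    text: str
    document: str
    page: int
    metadata: dict[str, str] = field(default_factory=dict)


def filter_chunks(chunks, required):
    """Keep only chunks whose metadata matches every required key-value pair."""
    return [
        c for c in chunks
        if all(c.metadata.get(k) == v for k, v in required.items())
    ]


def score(query, chunk):
    """Toy relevance score: how many distinct query words appear in the chunk."""
    chunk_words = set(chunk.text.lower().split())
    return sum(1 for w in set(query.lower().split()) if w in chunk_words)


def search(chunks, query, required, top_k=2):
    """Scope by metadata first, then rank the survivors and attach citations."""
    candidates = filter_chunks(chunks, required)
    ranked = sorted(candidates, key=lambda c: score(query, c), reverse=True)
    return [
        {"text": c.text, "citation": f"{c.document}, p. {c.page}"}
        for c in ranked[:top_k]
        if score(query, c) > 0  # drop passages that share no words with the query
    ]


# Illustrative data only; the documents and labels are made up.
corpus = [
    Chunk("Refunds are issued within 30 days", "policy.pdf", 4,
          {"department": "support", "year": "2024"}),
    Chunk("Refunds require manager approval", "handbook.pdf", 12,
          {"department": "finance", "year": "2024"}),
    Chunk("Shipping takes five business days", "policy.pdf", 7,
          {"department": "support", "year": "2024"}),
]

results = search(corpus, "when are refunds issued", {"department": "support"})
for r in results:
    print(f'{r["text"]} [{r["citation"]}]')
```

Filtering before ranking is the design choice worth noting: the finance passage also mentions refunds, but it never reaches the ranking step because its metadata falls outside the requested slice. A production system would replace the word-overlap score with embedding similarity; that part is deliberately simplified here.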