IBM Releases Granite-Docling-258M Open-Source AI Model for Advanced Document-to-Text Conversion

Oct 12, 2025
InfoQ

Summary

IBM Research has released Granite-Docling-258M, a 258-million-parameter open-source vision-language model that converts complex documents to text while preserving layouts, tables, and equations. It improves on its predecessor's stability and adds experimental multilingual support for Arabic, Chinese, and Japanese.

Key Points

  • IBM Research introduces Granite-Docling-258M, a compact 258-million-parameter open-source vision-language model built for high-fidelity document-to-text conversion that preserves complex layouts, tables, equations, and lists
  • The model builds on SmolDocling-256M-preview, moving to an upgraded Granite 3-based architecture and a SigLIP2 visual encoder, and addresses earlier stability issues such as token repetition and incomplete parses through improved dataset filtering
  • Granite-Docling uses the DocTags structured markup format to describe page elements and their relationships, so output can be rendered as Markdown, JSON, or HTML; it also offers experimental multilingual support for Arabic, Chinese, and Japanese
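
To illustrate the idea of one structured markup feeding several output formats, here is a minimal Python sketch. It does not use the real DocTags vocabulary or any Docling API: the tag names (`title`, `text`, `list_item`) and the helper functions are hypothetical simplifications chosen only to show how tagged page elements could be parsed once and then rendered as Markdown or JSON.

```python
import json
import re

# Hypothetical, simplified tag vocabulary for illustration only.
# The real DocTags format is richer and is not reproduced here.
ELEMENT_RE = re.compile(r"<(title|text|list_item)>(.*?)</\1>", re.DOTALL)


def parse_elements(markup):
    """Return a list of {"type", "content"} dicts in document order."""
    return [
        {"type": m.group(1), "content": m.group(2).strip()}
        for m in ELEMENT_RE.finditer(markup)
    ]


def to_markdown(elements):
    """Render parsed elements as Markdown, one element per line."""
    prefixes = {"title": "# ", "text": "", "list_item": "- "}
    return "\n".join(prefixes[e["type"]] + e["content"] for e in elements)


def to_json(elements):
    """Serialize the same parsed elements as a JSON array."""
    return json.dumps(elements)


# Made-up sample input in the simplified markup.
sample = (
    "<title>Quarterly Report</title>"
    "<text>Revenue grew.</text>"
    "<list_item>First item</list_item>"
)

elements = parse_elements(sample)
markdown = to_markdown(elements)
as_json = to_json(elements)
```

Because both renderers consume the same intermediate element list, adding another target such as HTML would only require one more rendering function under these assumptions.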
