Baidu Launches Unlimited-OCR: Open-Source Document Parsing Model Gains 5.6K GitHub Stars
Summary
Baidu has released Unlimited-OCR, an open-source long-horizon document parsing model built on DeepSeek-OCR. The GitHub repository has reached 5.6K stars, the model parses both single images and multi-page PDFs, and a live demo is available on Hugging Face.
Key Points
- Baidu has released Unlimited-OCR, an open-source, one-shot, long-horizon document parsing model that builds on DeepSeek-OCR. Its GitHub repository has 5.6K stars.
- The model supports single-image inference in two configurations: 'gundam' for cropped, high-detail parsing and 'base' for full-image parsing. It also handles multi-page and PDF document parsing, with Hugging Face Transformers and SGLang as backends.
- A demo is now live on Hugging Face Spaces, the model is available on ModelScope, and a research paper has been published on arXiv under the identifier 2606.23050.