DeepSeek-AI Unveils New OCR Model with Dynamic Resolution and Document-to-Markdown Conversion

Oct 21, 2025
GitHub

Summary

DeepSeek-AI has released DeepSeek-OCR, an optical character recognition model that supports input resolutions from 512×512 to 1280×1280 pixels, along with a dynamic-resolution mode. It converts documents directly to markdown, parses figures, and locates specific text within images.

Key Points

  • DeepSeek-AI releases DeepSeek-OCR, a new model that investigates vision encoders from an LLM-centric viewpoint for optical character recognition tasks
  • The model supports multiple resolution modes from 512×512 (64 vision tokens) to 1280×1280 (400 vision tokens) plus dynamic resolution capabilities
  • DeepSeek-OCR can convert documents to markdown, perform OCR on images, parse figures, and locate specific text within images using specialized prompt formats
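
The two token counts quoted above are consistent with a simple rule of thumb: one vision token per 64×64 pixel area of a square input. The sketch below checks that arithmetic. The 64-pixel factor and the function name vision_tokens are assumptions inferred from the two figures in this summary, not taken from the model's documentation, and the rule may not hold for the dynamic-resolution mode.

```python
def vision_tokens(side_px: int, px_per_token_side: int = 64) -> int:
    """Estimate the vision-token count for a square input of side_px pixels.

    Assumption: each vision token covers a 64x64 pixel area. This factor is
    inferred from the two data points in the summary (512 -> 64, 1280 -> 400),
    not from the model's own documentation.
    """
    if side_px <= 0 or side_px % px_per_token_side != 0:
        raise ValueError("side_px must be a positive multiple of px_per_token_side")
    tokens_per_side = side_px // px_per_token_side
    return tokens_per_side * tokens_per_side


if __name__ == "__main__":
    # Check the two resolution modes mentioned in the key points.
    for side in (512, 1280):
        print(f"{side}x{side}: {vision_tokens(side)} vision tokens")
```

Under this assumption, a 6.25× increase in pixel count (512×512 to 1280×1280) produces the same 6.25× increase in vision tokens (64 to 400), so token cost grows with image area rather than with side length.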
