DeepSeek-AI Unveils Open-Source OCR Model with Human-Like Visual Processing Technology

Jan 28, 2026
GitHub

Summary

DeepSeek-AI has released DeepSeek-OCR-2, an open-source visual OCR model built on Visual Causal Flow, a technique intended to mimic human visual processing. The model supports dynamic resolution, accepting up to six 768×768 image patches plus one 1024×1024 patch. It handles document-to-markdown conversion, PDF processing, and streaming output, with inference available through the vLLM and Transformers frameworks.

Key Points

  • DeepSeek-AI releases DeepSeek-OCR-2, an open-source visual OCR model that uses Visual Causal Flow technology for human-like visual encoding
  • The model supports dynamic resolution processing with up to six 768×768 image patches plus one 1024×1024 patch, and offers both document-to-markdown conversion and free OCR capabilities
  • DeepSeek-OCR-2 provides inference options through both vLLM and Transformers frameworks, supporting streaming output, PDF processing, and batch evaluation for benchmarks
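
To put the dynamic-resolution figure in concrete terms, the arithmetic behind "up to six 768×768 patches plus one 1024×1024 patch" can be sketched as below. This is only an illustration of the pixel count implied by those numbers. The function and constant names are hypothetical and are not part of DeepSeek-OCR-2, and allowing zero small patches as the lower bound is an assumption rather than something the summary states.

```python
# Illustrative arithmetic only. These names are hypothetical and do not
# come from the DeepSeek-OCR-2 codebase.
SMALL_SIDE = 768        # side length of each small patch, in pixels
LARGE_SIDE = 1024       # side length of the single large patch, in pixels
MAX_SMALL_PATCHES = 6   # upper bound quoted in the summary


def pixel_budget(num_small_patches: int) -> int:
    """Total input pixels for n 768x768 patches plus one 1024x1024 patch.

    Assumption: anywhere from 0 to 6 small patches may be used.
    """
    if not 0 <= num_small_patches <= MAX_SMALL_PATCHES:
        raise ValueError(
            f"num_small_patches must be in 0..{MAX_SMALL_PATCHES}"
        )
    return num_small_patches * SMALL_SIDE ** 2 + LARGE_SIDE ** 2


if __name__ == "__main__":
    # Print the pixel count for every allowed number of small patches.
    for n in range(MAX_SMALL_PATCHES + 1):
        print(n, pixel_budget(n))
```

At the quoted maximum, the input amounts to 6 × 589,824 + 1,048,576 = 4,587,520 pixels, roughly 4.6 megapixels.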
