AllenAI's olmOCR Converts PDFs and Images to Markdown for Under $200 Per Million Pages
AllenAI's open-source olmOCR toolkit converts PDFs and images into clean Markdown text, including equations, tables, and handwriting, for under $200 per million pages. Its latest release, v0.4.0, scores 82.4 on a benchmark of more than 7,000 tests, rivaling top OCR tools, and it supports GPU inference, Docker, and multi-node cloud processing.
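To put the headline price in per-page terms, the short sketch below scales the stated ceiling of $200 per million pages to smaller jobs. The function name `max_cost_usd` is hypothetical and used only for illustration; the figure is an upper bound from the article, not a measured rate, and real costs would depend on hardware and document complexity.

```python
# Upper bound taken from the article: under $200 per million pages.
COST_PER_MILLION_PAGES_USD = 200.0


def max_cost_usd(pages: int) -> float:
    """Return the upper-bound cost in USD for converting `pages` pages.

    Hypothetical helper: assumes cost scales linearly with page count.
    """
    if pages < 0:
        raise ValueError("pages must be non-negative")
    return pages * COST_PER_MILLION_PAGES_USD / 1_000_000


if __name__ == "__main__":
    # Print the ceiling for a single page, a mid-sized job, and a million pages.
    for pages in (1, 10_000, 1_000_000):
        print(f"{pages:>9,} pages: under ${max_cost_usd(pages):,.4f}")
```

Under the linear-scaling assumption, the ceiling works out to $0.0002 per page, so a 10,000-page archive would cost under $2.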