Mistral OCR 3 is a state-of-the-art optical character recognition (OCR) model that extracts text and embedded images from a wide range of document types with high fidelity. It is a significant upgrade over Mistral OCR 2, achieving a 74% overall win rate against it across forms, scanned documents, complex tables, and handwriting.
Key Features:
- Breakthrough Performance: Outperforms both enterprise document processing solutions and AI-native OCR solutions
- Advanced Document Understanding: Supports markdown output enriched with HTML-based table reconstruction for preserving document structure
- Multi-format Support: Excels at processing forms, invoices, receipts, compliance forms, government documents, and historical archives
- Handwriting Recognition: Accurately interprets cursive, mixed-content annotations, and handwritten text layered over printed forms
- Complex Table Processing: Reconstructs table structures with headers, merged cells, multi-row blocks, and column hierarchies
- Robust Scanning: Significantly more robust than Mistral OCR 2 to compression artifacts, skew, distortion, low DPI, and background noise
- Cost-Effective: Priced at $2 per 1,000 pages; a 50% Batch API discount reduces this to $1 per 1,000 pages
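To make the pricing above concrete, here is a minimal cost-estimate sketch. It assumes billing scales linearly per page with no minimums or rounding to whole thousands, which the text does not confirm; the function and constant names are hypothetical.

```python
PRICE_PER_1000_PAGES_USD = 2.00  # standard price quoted above
BATCH_DISCOUNT = 0.50            # Batch API discount quoted above


def estimate_cost_usd(pages: int, batch: bool = False) -> float:
    """Estimate OCR cost in USD, assuming simple linear per-page billing."""
    if pages < 0:
        raise ValueError("pages must be non-negative")
    rate = PRICE_PER_1000_PAGES_USD
    if batch:
        # Apply the 50% discount to the per-1,000-page rate.
        rate *= 1 - BATCH_DISCOUNT
    return round(pages / 1000 * rate, 2)
```

Under these assumptions, 250,000 pages would cost about $500 at the standard rate and about $250 through the Batch API.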
Use Cases:
- Extracting text and images into markdown for downstream agents and knowledge systems
- Automated parsing of forms, invoices, and operational documents
- End-to-end document understanding pipelines
- Digitization of handwritten or historical documents
- Enterprise search enhancement through clean text extraction from technical and scientific reports
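For downstream pipelines, the HTML tables embedded in the markdown output can be turned into rows of cell text. The sketch below uses only Python's standard `html.parser`. It assumes tables arrive as plain `<table>`/`<tr>`/`<th>`/`<td>` markup, which is an assumption about the output shape rather than a documented guarantee. It does not expand merged cells (`colspan`/`rowspan`) or handle nested tables, and `TableExtractor` and `extract_tables` are hypothetical names.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect each <table> as a list of rows, each row a list of cell strings."""

    def __init__(self):
        super().__init__()
        self.tables = []   # list of tables -> rows -> cell text
        self._row = None   # row being built, or None when outside a <tr>
        self._cell = None  # text fragments of the current cell, or None

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Text outside a cell (headings, paragraphs) is ignored.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.tables[-1].append(self._row)
            self._row = None


def extract_tables(markdown: str):
    """Return all tables found in a markdown string with embedded HTML."""
    parser = TableExtractor()
    parser.feed(markdown)
    parser.close()
    return parser.tables
```

Each extracted table can then be written to CSV or loaded into a data frame, depending on what the downstream system expects.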
Availability: The model (mistral-ocr-2512) is accessible via the API and through the Document AI Playground in Mistral AI Studio, a drag-and-drop interface for parsing PDFs and images into clean text or structured JSON.
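As a rough illustration of API access, the sketch below builds an OCR request with Python's standard library. Only the model name `mistral-ocr-2512` comes from the text above. The endpoint URL, the payload shape, the `MISTRAL_API_KEY` environment variable, and the `pages[].markdown` response layout are assumptions that should be checked against the official API reference, and the function names are hypothetical.

```python
import json
import os
import urllib.request

# Assumed endpoint; confirm against the official API reference.
OCR_URL = "https://api.mistral.ai/v1/ocr"
MODEL = "mistral-ocr-2512"  # model name given in the text above


def build_ocr_request(document_url: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for one document URL."""
    payload = {
        "model": MODEL,
        # Assumed payload shape for a document referenced by URL.
        "document": {"type": "document_url", "document_url": document_url},
    }
    return urllib.request.Request(
        OCR_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def pages_to_markdown(response: dict) -> str:
    """Join per-page markdown, assuming response['pages'][i]['markdown']."""
    pages = response.get("pages", [])
    return "\n\n".join(page.get("markdown", "") for page in pages)


def run_ocr(document_url: str) -> str:
    """Send the request and return the combined markdown (needs network access)."""
    request = build_ocr_request(document_url, os.environ["MISTRAL_API_KEY"])
    with urllib.request.urlopen(request) as resp:
        return pages_to_markdown(json.load(resp))
```

Keeping request construction and response handling in separate functions makes both testable without network access; only `run_ocr` actually calls the service.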