Name: Mistral OCR 4: 85.20 OlmOCRBench, 170 Languages (2026)
Brand: Mistral AI

Question 1

What is Mistral OCR 4 and who built it?

Accepted Answer

Mistral OCR 4 is a proprietary document intelligence model developed by Mistral AI and released on June 23, 2026. It is the fourth generation of Mistral's optical character recognition lineup, following OCR 3 (December 2025), OCR 2, and the original Mistral OCR (March 2025). Unlike Mistral's general-purpose language models, OCR 4 is purpose-built for extracting structured content from PDFs, scanned images, and multi-page documents, with three structural primitives: bounding boxes that localize every text block, typed block classification that labels each region by structural role (title, table, equation, signature, and ten more types), and per-word confidence scores. The model supports 170 languages across 10 language groups and ships as a single container for on-premise deployment. On the OlmOCRBench public leaderboard it scores 85.20, the highest reported result as of June 2026. It achieves a 72% human-preference win rate over competing OCR and document-AI systems across 600 real-world documents in 12 or more languages. The model is accessible via Mistral's la Plateforme API, Amazon SageMaker, and Microsoft Foundry.

Question 2

How much does Mistral OCR 4 cost per 1,000 pages?

Accepted Answer

Mistral OCR 4 uses per-page billing rather than per-token pricing. The standard API costs $4 per 1,000 pages, providing real-time extraction with bounding boxes, block types, and confidence scores. The Batch API costs $2 per 1,000 pages, a 50% discount for asynchronous workloads where results are returned within hours rather than seconds. The Document AI add-on, which adds schema-conformant JSON extraction via mistral-small-2603, costs $5 per 1,000 pages through Mistral Studio. For a team processing 100,000 pages per month, the standard API cost is $400 and the Batch API cost is $200. A team processing 1 million pages monthly on the Batch API spends $2,000, saving $2,000 compared to the standard tier. Self-hosted enterprise pricing requires a commercial agreement with Mistral sales and is not publicly listed. Google Document AI charges approximately $5 per 1,000 pages for similar multilingual extraction, making OCR 4 cheaper on the standard tier and matching Google's price on the Document AI tier.
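The arithmetic above can be checked with a short calculation. This is only a sketch built from the per-1,000-page rates quoted in this answer; the tier names and the monthly_cost helper are illustrative and are not part of any Mistral SDK.

```python
# Per-1,000-page rates in USD, as quoted in the answer above.
# The tier keys are illustrative labels, not official API identifiers.
RATES_PER_1000_PAGES = {
    "standard": 4.0,
    "batch": 2.0,
    "document_ai": 5.0,
}


def monthly_cost(pages: int, tier: str = "standard") -> float:
    """Return the cost in USD of processing `pages` pages on the given tier."""
    if pages < 0:
        raise ValueError("pages must be non-negative")
    if tier not in RATES_PER_1000_PAGES:
        raise ValueError(f"unknown tier: {tier!r}")
    return RATES_PER_1000_PAGES[tier] * pages / 1000


if __name__ == "__main__":
    for pages in (100_000, 1_000_000):
        standard = monthly_cost(pages, "standard")
        batch = monthly_cost(pages, "batch")
        print(f"{pages:,} pages: standard ${standard:,.0f}, "
              f"batch ${batch:,.0f}, saving ${standard - batch:,.0f}")
```

Because billing is linear in page count, the Batch API saving is always half of the standard-tier cost at any volume.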

Question 3

What is Mistral OCR 4's context window and how does it handle large documents?

Accepted Answer

Mistral OCR 4 does not use a traditional token context window. Processing is page-based: each page is analyzed independently and returns a structured object with extracted text, bounding boxes, block types, and confidence scores. Multi-page documents are fully supported, and the pages parameter accepts both integer arrays and range strings (for example, '0,2-4' selects pages 0, 2, 3, and 4). There is no published upper limit on document length, but processing cost scales linearly with page count at $4 per 1,000 pages on the standard API. For very large document sets, the Batch API handles asynchronous processing at $2 per 1,000 pages. The Document AI add-on applies a user-supplied JSON schema across the full document output after all pages are extracted, producing a single schema-conformant JSON object regardless of document length. Because each page is processed independently, OCR 4 avoids the recall degradation that long-context LLMs show on long inputs, which makes it well suited to 200-page contracts or 500-page financial filings.
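To make the range-string semantics concrete, here is a small client-side sketch that expands a string such as '0,2-4' into explicit page indices. It assumes ranges are inclusive and zero-based, as the example in this answer implies; parse_pages is a hypothetical helper and does not describe how the API itself parses the parameter.

```python
def parse_pages(spec: str) -> list[int]:
    """Expand a page range string such as '0,2-4' into a sorted list of indices.

    Assumption: ranges are inclusive on both ends, matching the example
    in the answer above ('0,2-4' selects pages 0, 2, 3, and 4).
    """
    pages: set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas such as '0,,2'
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
            if start > end:
                raise ValueError(f"invalid range: {part!r}")
            pages.update(range(start, end + 1))
        else:
            pages.add(int(part))
    return sorted(pages)


if __name__ == "__main__":
    print(parse_pages("0,2-4"))  # [0, 2, 3, 4]
```

Expanding the range locally is also a convenient way to count the selected pages before sending a request, since cost is billed per page.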

Question 4

How does Mistral OCR 4 compare on benchmarks versus Google Document AI and AWS Textract?

Accepted Answer

On the OlmOCRBench public leaderboard, Mistral OCR 4 scores 85.20, the top reported score as of June 2026. Google Document AI and AWS Textract do not publish OlmOCRBench or OmniDocBench scores, making direct automated benchmark comparison difficult. In human preference evaluations, OCR 4 wins 72% of pairwise comparisons against competing OCR and document-AI systems across 600 real-world documents in 12 or more languages; Mistral does not name every system tested, but the description covers cloud OCR leaders. On pricing, OCR 4 at $4 per 1,000 pages is cheaper than Google Document AI at approximately $5 per 1,000 pages and comparable to AWS Textract's per-page rate for multi-language extraction. The key differentiator versus both Google and AWS is deployment: OCR 4 ships as a single on-premise container, while Google Document AI and AWS Textract are cloud-only services. For regulated industries where documents cannot leave internal infrastructure, OCR 4 is currently the only API-first OCR model with a self-hosted option at this benchmark level. Rogo reports 17 times lower latency and 8 times lower cost versus leading agentic document parsers on financial QA datasets using OCR 4.

Question 5

Is Mistral OCR 4 open source or proprietary?

Accepted Answer

Mistral OCR 4 is proprietary with closed model weights. It is not downloadable from Hugging Face and cannot be run locally without a commercial agreement with Mistral. This contrasts with several of Mistral's language models (Mistral Small 4, Devstral, Mistral Large 3) that are released with open weights under Apache 2.0. OCR 4 is accessible via three managed platforms: Mistral's la Plateforme API (with an API key), Amazon SageMaker (with AWS IAM), and Microsoft Foundry in Azure (with an Azure API key). For organizations needing on-premise deployment, Mistral offers a self-hosted enterprise option as a single Docker container, but this requires a commercial agreement with Mistral sales and is not publicly available from Docker Hub. The self-hosted option provides data sovereignty (no documents leave your infrastructure) but does not grant access to the model weights themselves. Pricing for the self-hosted enterprise tier is not publicly listed and requires a sales engagement.

Question 6

What structured outputs does Mistral OCR 4 return?

Accepted Answer

Mistral OCR 4 returns several levels of structured output depending on which API parameters are set. By default, when include_blocks is not set to true, the API returns extracted text and markdown-formatted content. With include_blocks: true, each page returns a blocks array with paragraph-level bounding boxes (pixel coordinates marking where each region sits on the page) and a structural label drawn from 14 block types, including text, title, list, table, image, equation, caption, code, references, aside_text, header, footer, and signature. Every block also carries an inline confidence score indicating extraction reliability. The Document AI add-on, activated at the same endpoint with a JSON schema parameter, routes OCR 4 output through mistral-small-2603 to produce schema-conformant JSON, allowing users to extract named fields (like invoice total, contract parties, or form answers) without writing custom parsing logic. This three-level output (extracted text, bounding boxes plus block types, schema-conformant JSON) covers most enterprise document extraction use cases from raw digitization to structured data pipelines.
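As an illustration of how the blocks output might be consumed downstream, the sketch below filters blocks by type and confidence. The response shape is an assumption made for this example: the field names pages, index, blocks, type, bbox, confidence, and text are hypothetical, and SAMPLE_RESPONSE is hand-written data rather than real API output. Check the official response schema before relying on any of these names.

```python
from typing import Any, Iterator, Optional

# Hand-written sample in an ASSUMED response shape; every field name here
# is hypothetical and may differ from the real API response.
SAMPLE_RESPONSE: dict[str, Any] = {
    "pages": [
        {
            "index": 0,
            "blocks": [
                {"type": "title", "bbox": [40, 32, 560, 80],
                 "confidence": 0.99, "text": "Master Services Agreement"},
                {"type": "table", "bbox": [40, 120, 560, 400],
                 "confidence": 0.91, "text": "| Item | Fee |"},
                {"type": "signature", "bbox": [300, 700, 560, 760],
                 "confidence": 0.62, "text": ""},
            ],
        }
    ]
}


def iter_blocks(
    response: dict[str, Any],
    block_type: Optional[str] = None,
    min_confidence: float = 0.0,
) -> Iterator[tuple[int, dict[str, Any]]]:
    """Yield (page_index, block) pairs matching the type and confidence filters."""
    for page in response.get("pages", []):
        for block in page.get("blocks", []):
            if block_type is not None and block.get("type") != block_type:
                continue
            if block.get("confidence", 0.0) < min_confidence:
                continue
            yield page.get("index", -1), block


if __name__ == "__main__":
    for page_index, block in iter_blocks(SAMPLE_RESPONSE, min_confidence=0.9):
        print(page_index, block["type"], block["bbox"])
```

A pipeline could use a filter like this to send low-confidence blocks to human review while passing high-confidence tables and titles straight through.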

Question 7

Does Mistral OCR 4 train on user data?

Accepted Answer

Mistral does not train on OCR 4 API inputs by default, so API usage does not feed back into model training. Mistral's Data Processing Addendum, available at legal.mistral.ai/terms/data-processing-addendum, governs how input data is handled and covers compliance with GDPR (Regulation EU 2016/679). Mistral holds SOC 2 Type II and ISO 27001 certifications. GDPR compliance is supported via EU-based hosting in Paris, and Mistral offers data residency options in both the EU and the US. For organizations with stricter requirements, the self-hosted enterprise container provides full data sovereignty: no document is transmitted outside the organization's infrastructure, which removes vendor-side data handling from the picture entirely. HIPAA eligibility is not confirmed for OCR 4 as of June 2026. For enterprise procurement, Mistral provides standard DPA documentation suitable for regulated industry IT security review. The EU AI Act's enforcement provisions take effect August 2, 2026, and Mistral has signed the GPAI Code of Practice as a proactive compliance measure.

Question 8

Who is Mistral OCR 4 best for and who should avoid it?

Accepted Answer

Mistral OCR 4 is best suited to enterprise data engineering teams building document ingestion pipelines for RAG, search, or compliance workflows that need source-grounded citation (bounding boxes are required for this). It is the strongest choice for regulated industries (healthcare, finance, legal, government) that cannot route sensitive documents to cloud-only OCR APIs and need an on-premise container option. Multilingual organizations processing documents in low-resource or non-Latin scripts benefit from the 170-language coverage, which outpaces Google Document AI and AWS Textract on script breadth. Teams should avoid OCR 4 for real-time sub-second document processing: the Batch API is asynchronous and the standard API is not optimized for sub-100ms turnaround. AWS Textract or Google Document AI are better choices if latency is the primary constraint. OCR 4 is also the wrong tool for general-purpose vision tasks: it cannot describe photographs, caption images, or answer visual questions about charts without the Document AI layer; use Pixtral or Mistral Small 4 for those tasks. For low-volume or ad hoc document conversion, open-source alternatives like Tesseract or Surya are free and sufficient.
