Name: GPT-4o mini: Pricing, Specs & 2026 Status Explained
Brand: OpenAI
Price: 0.15 USD
Availability: InStock

Question 1

What is GPT-4o mini and who built it?

Accepted Answer

GPT-4o mini is a small, cost-efficient multimodal model built by OpenAI and announced on July 18, 2024. It is part of the GPT-4o family of dense transformer models, positioned as the small-tier successor to GPT-3.5 Turbo in both the API and ChatGPT. OpenAI has not disclosed an exact parameter count, but the model trades raw capability for low latency and low cost. On benchmarks it scores 82.0% on MMLU and 87.2% on HumanEval, figures that were competitive at launch but have since been overtaken by newer small models such as GPT-4.1 mini and Gemini 2.0 Flash. It was designed to make vision, function calling, and a 128K context window affordable for high-volume applications. As of 2026, OpenAI directs new projects toward GPT-5.1 mini, with GPT-4o mini remaining as a legacy option. The headline price is $0.15 input / $0.60 output per 1M tokens with a 128K context window.

Question 2

How much does GPT-4o mini cost per 1M tokens?

Accepted Answer

GPT-4o mini costs $0.15 per 1M input tokens and $0.60 per 1M output tokens, a price that has not changed since its July 2024 launch. Cached input tokens cost $0.075 per 1M (50% off). The Batch API offers a further 50% discount at $0.075 input / $0.30 output per 1M tokens, with results returned within 24 hours. Fine-tuning costs $0.30 per 1M training tokens, after which inference on the fine-tuned model rises to $0.30 input / $1.20 output per 1M tokens. As worked examples, summarizing a 100K-token document costs about $0.015, a daily coding-assistant workload of 1M input / 200K output tokens costs roughly $0.27, and a support bot handling 1,000 turns of 2K in / 500 out per day costs about $0.60. By comparison, GPT-5.1 mini costs more per token but delivers materially higher reasoning scores, so GPT-4o mini remains attractive mainly for very high-volume, low-complexity workloads. The model cannot be self-hosted, so there is no infrastructure cost alternative.

Question 3

What is GPT-4o mini's context window and max output?

Accepted Answer

GPT-4o mini has a 128,000 token context window and a maximum output of 16,384 tokens per request, matching the input ceiling of the broader GPT-4o family. OpenAI has not published a dedicated long-context recall evaluation for the mini variant specifically, but independent needle-in-haystack tests of the GPT-4o family generally show reliable retrieval across the 128K window with some degradation for information placed in the middle of very long prompts, a pattern common to 2024-era models. There is no separate extended-context tier above 128K for this model. For document-heavy workloads, the 128K window comfortably fits documents in the tens of thousands of words, but teams working with multi-hundred-page documents should chunk inputs or use retrieval rather than relying on a single 128K call. Compared to 2026 frontier models offering 200K-1M token windows, GPT-4o mini's context is now mid-pack rather than leading.

Question 4

How does GPT-4o mini compare on benchmarks vs GPT-5.1 mini and Gemini 2.0 Flash?

Accepted Answer

GPT-4o mini scores 82.0% on MMLU, 87.2% on HumanEval, roughly 7.8% on SWE-bench Verified, and around 1274 Elo on LMArena. GPT-5.1 mini and Gemini 2.0 Flash, both released after GPT-4o mini, post materially higher scores on agentic coding and reasoning benchmarks, with 2026 frontier models clearing 70%+ on SWE-bench Verified compared to GPT-4o mini's single-digit score. On general knowledge (MMLU) the gap is smaller, since MMLU has become a near-saturated benchmark for models at this scale. In practice, a roughly 60-point SWE-bench gap means GPT-4o mini will frequently produce broken or incomplete multi-file code edits where newer small models succeed. GPT-4o mini does not publish GPQA Diamond or AIME 2025 scores at all, while newer small models increasingly report both, signalling that reasoning was simply not a design priority for this model. For pure cost-per-token on simple classification tasks, GPT-4o mini remains competitive, but for anything agentic, GPT-5.1 mini is the clear winner.

Question 5

Is GPT-4o mini open source or proprietary?

Accepted Answer

GPT-4o mini is fully proprietary and API-only; OpenAI has not released its weights and has no open-weights or open-source variant of this model. Access is available through the direct OpenAI API and through Azure OpenAI Service under the same model name, gpt-4o-mini. There is no AWS Bedrock or Google Vertex AI listing for this model, since those platforms primarily host first-party and select third-party open-weight models rather than OpenAI's proprietary lineup. Commercial use is permitted under OpenAI's standard API terms and usage policies, with no separate license fee beyond per-token API charges. Fine-tuned versions of the model remain OpenAI's proprietary weights; customers cannot export or self-host a fine-tuned GPT-4o mini. Anyone needing an open-weights alternative with similar capability should look at models like Llama 3.1 8B or Qwen2.5 7B instead.

Question 6

What modalities does GPT-4o mini support?

Accepted Answer

GPT-4o mini accepts text and image inputs and produces text output, plus structured tool-call outputs via function calling. Vision input covers document understanding, screenshot interpretation, and basic visual classification, and can be combined with function calling so the model reasons over an image before invoking a tool. Audio and video inputs were promised at the original 2024 launch as 'coming in the future,' but in practice OpenAI shipped those as separate specialized models, gpt-4o-mini-transcribe, gpt-4o-mini-tts, and gpt-4o-mini-audio-preview, rather than as native capabilities of the base chat model. There is no computer-use or web-browsing capability built into GPT-4o mini; those exist only in separate OpenAI agent products. Structured outputs (JSON mode and JSON schema) are fully supported, making the model reliable for extraction and classification pipelines that need machine-readable responses.

Question 7

Does GPT-4o mini train on user data?

Accepted Answer

By default, OpenAI does not train its models on data submitted through the API, including GPT-4o mini, and retains API inputs and outputs for up to 30 days for abuse monitoring before deletion. Enterprise customers can apply for zero-data-retention agreements that remove even this 30-day window. Image inputs are subject to an opt-out fingerprinting system that can exclude specific images from any future training data across the GPT-4o model series. OpenAI's API platform holds SOC 2 Type II certification, offers a Business Associate Agreement for HIPAA-eligible enterprise workloads, and provides GDPR-aligned data processing terms with US and EU data residency options. On Azure OpenAI Service, data handling follows Microsoft's enterprise data processing agreements, which similarly exclude customer data from model training by default. Consumer ChatGPT usage (where still applicable to GPT-4o family models) is governed by separate, more permissive default settings that users can disable in their account data controls.

Question 8

Who is GPT-4o mini best for and who should avoid it?

Accepted Answer

GPT-4o mini is best for teams running high-volume, low-complexity classification, extraction, or routing pipelines where its $0.15/$0.60 per 1M token pricing makes per-call cost negligible. It also suits document and screenshot processing workloads that need cheap vision input, and existing production integrations already built on GPT-4o mini that don't want to absorb a migration. Teams should avoid it for agentic coding, where its roughly 7.8% SWE-bench Verified score means frequent broken multi-file edits, GPT-5.1 mini or a frontier model is the better choice. It is also a poor fit for any task requiring knowledge after October 2023 without retrieval augmentation, and for graduate-level reasoning or competition math, where it has no published GPQA Diamond or AIME 2025 scores at all. New projects in 2026 should generally start with GPT-5.1 mini, which costs more per token but clears agentic and reasoning benchmarks by a wide margin, reserving GPT-4o mini for legacy continuity or extremely cost-sensitive simple tasks.

GPT-4o mini

GPT-4o mini: Pricing, Specs & 2026 Status Explained

About GPT-4o mini

Pricing

Key Features

Pros

Cons

Benchmarks

Frequently Asked Questions