Name: GPT-4o Review 2026: 128K Context, $2.50/M Pricing, Deprecated
Brand: OpenAI
Price: 2.50 USD
Availability: InStock

Question 1

What is GPT-4o and who built it?

Accepted Answer

GPT-4o is a natively multimodal large language model built by OpenAI, announced and released on May 13, 2024 during the company's Spring Updates livestream. The 'o' stands for omni, reflecting that it was trained end-to-end to handle text, image, and audio in one network rather than bolting vision or speech onto a text-first model. OpenAI has not disclosed its parameter count or whether it uses a dense or mixture-of-experts architecture, but independent estimates place it in the hundreds of billions of parameters, in line with the broader GPT-4 family. On release, it scored 88.7% on MMLU and 90.2% on HumanEval, both improvements over GPT-4 Turbo. It was designed to bring GPT-4 Turbo-level reasoning to real-time, low-latency multimodal interaction at roughly half the cost. It sits at the top of the GPT-4o line, with a cheaper GPT-4o mini variant released two months later. As of 2026, it competed directly with Gemini 1.5 Pro and Claude 3.5 Sonnet at launch, and was specifically designed to beat them on combined vision, audio, and multilingual benchmarks. Its headline price was $2.50 input and $10.00 output per 1M tokens, with a 128K context window.

Question 2

How much does GPT-4o cost per 1M tokens?

Accepted Answer

GPT-4o costs $2.50 per 1M input tokens and $10.00 per 1M output tokens, a price that has held since OpenAI's 50% price cut in October 2024 from the original May 2024 launch price of $5.00/$15.00. Cached input tokens, used for repeated prompt prefixes, cost $1.25 per 1M, half the standard input rate. The Batch API gives a further 50% discount on both input and output, bringing batch pricing to $1.25/$5.00 per 1M tokens for asynchronous jobs returned within 24 hours. As worked examples: summarizing a 100K-token document costs about $0.27, running a coding agent that processes 1M input and 200K output tokens per day costs about $4.50, and a support bot handling 1,000 daily chat turns at 2K input and 500 output tokens each costs roughly $10.00 per day. By comparison, GPT-4.1 offers similar or better performance at a lower per-token cost, and GPT-4o cannot be self-hosted since OpenAI has not released its weights.

Question 3

What is GPT-4o's context window and max output?

Accepted Answer

GPT-4o has a 128,000 token context window, unchanged since its May 2024 release. The maximum output token limit started at 4,096 tokens at launch but was raised to 16,384 tokens in the gpt-4o-2024-11-20 update, a four-fold increase that benefits long-form generation, large code outputs, and structured JSON responses. There is no separate extended-context tier or sliding-window mode documented for GPT-4o. By comparison, Gemini 2.5 Pro offers up to 2 million tokens of context and Claude's largest tier reaches 1 million tokens, making GPT-4o's 128K mid-pack by 2026 standards. For multi-document or large-codebase tasks, GPT-4o's effective working context is smaller than these newer models, so very large inputs need to be chunked or summarized before being passed to GPT-4o. Document handling for PDFs and images counts toward the same 128K token budget as text.

Question 4

How does GPT-4o compare on benchmarks vs GPT-4.1 and GPT-5?

Accepted Answer

GPT-4o scored 88.7% on MMLU and 90.2% on HumanEval at release in May 2024. OpenAI did not publish a direct SWE-bench Verified score for GPT-4o, but when introducing GPT-4.1 in 2025, OpenAI stated GPT-4.1 improved on SWE-bench Verified by 21.4 points over GPT-4o, implying GPT-4o scored roughly 33% on that benchmark, well behind GPT-4.1 and Claude Opus 4's 72.5%. On the LMArena Chatbot Arena, the August 2024 GPT-4o snapshot reached an Elo near 1314, but by mid-2026 the leaderboard is led by Claude Opus 4.6 (1418), Gemini 3.1 Pro (1406), and GPT-5.2 (1402), with GPT-4o no longer ranked among current models. In practice, the SWE-bench gap means GPT-4o is meaningfully less reliable at multi-step coding and agentic tasks than GPT-4.1 or GPT-5.x. GPT-4o's strongest published numbers (MMLU and HumanEval) are both legacy, saturated benchmarks by 2026 standards, and OpenAI has not released GPQA Diamond or AIME 2025 scores for it.

Question 5

Is GPT-4o open source or proprietary?

Accepted Answer

GPT-4o is fully proprietary and API-only. OpenAI has not released its weights, and there is no open-weights or open-source variant of GPT-4o itself. Access is exclusively through the OpenAI API (api.openai.com) and Azure OpenAI Service, both of which require an API key or Azure AD credential. OpenAI's separate gpt-oss-20b and gpt-oss-120b models, released under the Apache 2.0 license, are open-weight models but are architecturally distinct from GPT-4o and were released later as a separate initiative. There are no Hugging Face weights, VRAM requirements, or quantization options for GPT-4o because self-hosting is not possible. Commercial use of GPT-4o is governed entirely by OpenAI's usage policies and API terms of service, with no separate community license to consider.

Question 6

What modalities does GPT-4o support?

Accepted Answer

GPT-4o accepts text, image, audio, and video as input, and can generate text, audio, and image as output, all from a single natively trained model. Its real-time voice capability, accessed through OpenAI's Realtime API, responds to spoken input in as little as 232 milliseconds (average 320ms), close to human conversational latency; this is distinct from sending audio through the standard Chat Completions endpoint, which is not optimized for real-time use. Function calling is fully supported, and as of the November 2024 update, function calling works alongside vision input, letting the model decide which tools to call based on what it sees in an image. Structured outputs with strict JSON schema enforcement are generally available on the gpt-4o-2024-08-06 snapshot and later. GPT-4o does not support computer-use style screen control or web browsing natively. Compared to Gemini's native video understanding or Claude's computer-use tooling, GPT-4o's video input support is less emphasized in OpenAI's documentation.

Question 7

Does GPT-4o train on user data?

Accepted Answer

By default, OpenAI does not use API inputs and outputs to train its models, including GPT-4o. Data sent through the API may be retained for up to 30 days for abuse and misuse monitoring, after which it is deleted, unless an organization has been approved for zero data retention, in which case inputs and outputs are not stored at all. OpenAI's API platform holds SOC 2 Type II certification, and OpenAI offers HIPAA business associate agreements and EU data residency options for eligible enterprise customers. Usage through Azure OpenAI Service follows Microsoft's separate Azure data handling and compliance commitments, which can differ from the direct OpenAI API in terms of regional data storage. ChatGPT consumer usage (as opposed to the API) has separate data controls, including a setting to opt out of having conversations used to improve OpenAI's models. There is no GPT-4o-specific data policy beyond OpenAI's standard API-wide terms.

Question 8

Who is GPT-4o best for and who should avoid it in 2026?

Accepted Answer

GPT-4o is best for teams already running production integrations on pinned snapshots like gpt-4o-2024-08-06 or gpt-4o-2024-11-20 who want stable, known pricing at $2.50/$10.00 per 1M tokens without an immediate migration. It also suits real-time voice assistant prototypes that benefit from its 232-320ms Realtime API latency, and cost-sensitive multimodal prototypes that need native vision, audio, and text without paying GPT-5-class prices. Teams should avoid GPT-4o for new agentic coding projects, since GPT-4.1 scores 21.4 points higher on SWE-bench Verified at a lower cost, and GPT-5.1/5.2 lead further still. It's also a poor fit for anything needing context windows beyond 128K, where Gemini 2.5 Pro (2M) or Claude (1M) are better suited. Finally, since GPT-4o was retired from ChatGPT on February 13, 2026 and is officially deprecated, any new product built today should default to GPT-4.1 or GPT-5.1/5.2 rather than building fresh dependencies on a model OpenAI is actively phasing out.

GPT-4o

GPT-4o Review 2026: 128K Context, $2.50/M Pricing, Deprecated

About GPT-4o

Pricing

Key Features

Pros

Cons

Benchmarks

Frequently Asked Questions