Claude Opus 4.7: Benchmarks, Pricing & Context | hokai.io
Claude Opus 4.7 by Anthropic (April 2026): 87.6% SWE-bench, 1M token context, 3.75MP vision. $5/$25 per 1M tokens. Top GA model for agentic coding tasks.
Claude Opus 4.7, released April 16, 2026 by Anthropic, is the top generally available model for agentic coding with 87.6% on SWE-bench Verified. It supports a 1M token context window and 3.75MP vision. Pricing is $5 per million input tokens and $25 per million output tokens. Adaptive thinking replaces extended thinking budgets. Output speed is 42.2 tokens per second, making it unsuitable for real-time applications.
Claude Opus 4.7 is Anthropic's most capable generally available model, released April 16, 2026. It scores 87.6% on SWE-bench Verified and 94.2% on GPQA Diamond. The context window is 1 million tokens with 128K max output. Pricing is $5 per million input tokens and $25 per million output tokens. It leads all GA models on tool use (MCP-Atlas 77.3%) and is the first Claude with 3.75MP high-resolution vision.
Provider: Anthropic · Family: Claude 4
Context window: 1,000,000 tokens · Max output: 128,000
Input modalities: text, image, pdf, tool-calls · Output: text, tool-calls
About Claude Opus 4.7
Claude Opus 4.7 is Anthropic's most capable generally available model, released on April 16, 2026. Built on a dense Transformer architecture (not MoE) with an undisclosed parameter count, it is the fourth major Opus iteration in the Claude 4 generation, following Opus 4.1, 4.5, and 4.6. Anthropic positioned it specifically to close the gap between the generally available lineup and frontier research previews, with a deliberate focus on agentic coding, long-horizon autonomous tasks, and high-fidelity vision understanding. It replaces Claude Opus 4.6 as Anthropic's recommended starting point for any complex workload.

On benchmarks, Claude Opus 4.7 posts 87.6% on SWE-bench Verified, a 6.8-percentage-point gain over Opus 4.6's 80.8% and ahead of GPT-5.4 on the same leaderboard. SWE-bench Pro, a harder multi-file variant, jumped from 53.4% on Opus 4.6 to 64.3% on Opus 4.7, leapfrogging GPT-5.4 (57.7%) and Gemini 3.1 Pro (54.2%). On GPQA Diamond, which tests PhD-level scientific reasoning across physics, chemistry, and biology, Opus 4.7 scores 94.2%, effectively tied with GPT-5.4 at 94.4% and Gemini 3.1 Pro at 94.3%; that benchmark is approaching saturation. For tool use, MCP-Atlas comes in at 77.3%, ahead of Opus 4.6 at 75.8%, GPT-5.4 at 68.1%, and Gemini 3.1 Pro at 73.9%. Computer use on OSWorld-Verified reached 78.0%, ahead of GPT-5.4 at 75.0%. Humanity's Last Exam sits at 46.9% without tools and 54.7% with tools.

The context window is 1,000,000 tokens, with the full 1M provided at standard per-token rates and no long-context premium. Maximum output via the synchronous Messages API is 128,000 tokens. On the Batch API with the output-300k-2026-03-24 beta header, Opus 4.7 can produce up to 300,000 output tokens per request. A single request can include up to 600 images or PDF pages. Note: Opus 4.7 uses a new tokenizer that processes text with up to 35% more tokens than Opus 4.6 for equivalent input, so effective costs on existing prompts can increase even though the rate card is unchanged.

Modalities supported as inputs are text and images (PDF documents are parsed as images); output is text only, and no native audio input or output is available. Opus 4.7 is the first Claude model with high-resolution image support, raising maximum image resolution from 1568px / 1.15MP to 2576px / 3.75MP. This is more than a 3x increase in pixel area, which meaningfully improves performance on dense UIs, technical diagrams, chemical structure reading, and computer use screenshots. Image coordinates now map 1:1 to actual pixels, removing the need for scale-factor math in agentic loops.

Tool use and function calling are fully supported, with native parallel tool calls, client-side tool schemas, and server-side tools including web search, web fetch, and code execution. Computer use is available via the beta API. Task budgets (beta) allow the model to self-regulate token spend across a full agentic loop. The new xhigh effort level sits between high and max, giving finer control over reasoning depth versus speed.

Pricing is $5.00 per 1M input tokens and $25.00 per 1M output tokens, unchanged from Opus 4.6. Prompt caching cuts cached input reads to $0.50 per 1M (10% of base rate); a 5-minute cache write costs $6.25 per 1M and a 1-hour cache write costs $10.00 per 1M. Batch API processing cuts both input and output by 50%, to $2.50 and $12.50 per 1M respectively. US-only inference via the inference_geo parameter adds a 1.1x multiplier.
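To ground the interface details above, here is a minimal call sketch using Anthropic's Python SDK. The model ID and the 128K synchronous output ceiling come from this page; the effort field is an assumption about how the xhigh level is selected, since the exact parameter name is not reproduced here.

```python
# Minimal sketch: synchronous Messages API call to Opus 4.7 via the Python SDK.
# The "effort" field below is a hypothetical parameter name; adaptive thinking
# itself needs no configuration per the notes above.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=128_000,  # synchronous Messages API ceiling quoted above
    messages=[{"role": "user", "content": "Plan a refactor of our auth module."}],
    extra_body={"effort": "xhigh"},  # assumption: how the xhigh level is exposed
)
print(response.content[0].text)
```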
For a 100,000-token document summarization task (100K input, 1K output), the uncached cost is $0.525. A daily coding agent running 1M input tokens and 200K output tokens costs $10.00. Customer support at 2,000 tokens in and 500 tokens out across 1,000 turns costs $22.50 before caching discounts.

Claude Opus 4.7 is available via the Anthropic API (model ID: claude-opus-4-7), AWS Bedrock (ID: anthropic.claude-opus-4-7-v1:0), Google Cloud Vertex AI (ID: claude-opus-4-7@20260416), and Microsoft Foundry (via the Azure AI Foundry catalog). Authentication is via API key on Anthropic direct, AWS IAM on Bedrock, GCP IAM on Vertex, and Azure subscription billing on Foundry. Microsoft Foundry regions are restricted to East US2 and Sweden Central. AWS Bedrock supports both global endpoints (dynamic routing) and regional endpoints (geo-guaranteed routing). Cloud providers add a margin on top of Anthropic's base rates for enterprise SLAs and region pinning.

Safety evaluation for Opus 4.7 was published in the system card released April 16, 2026. The model uses Constitutional AI and RLHF alignment. In multi-turn safety evaluations, Opus 4.7 successfully identified escalating harm patterns even when individual prompts appeared benign. In agentic settings, it is better than Opus 4.6 at refusing malicious instructions and resisting prompt injection in Claude Code and computer use contexts. Cyber capabilities are roughly similar to Opus 4.6; an external evaluation by the UK AI Security Institute found it could not complete their full cyber range, unlike the more capable Mythos Preview. Evaluation awareness was detected in about 9% of transcripts, triggered by inconsistencies in mocked tool results. Cybersecurity refusals are new real-time safeguards; teams doing legitimate security work can apply to the Cyber Verification Program.

Claude Opus 4.7 is the right model for teams running autonomous coding agents over long multi-file sessions, where its 87.6% SWE-bench Verified score and 77.3% MCP-Atlas tool-use rating provide a measurable edge. It is also strong for enterprise knowledge work requiring long-document analysis: the 1M context at standard pricing means no special endpoint is needed to ingest a full codebase or document corpus. The model excels at vision-heavy agentic tasks like UI automation and chart interpretation, given the 3.3x resolution upgrade and 1:1 pixel coordinate mapping. Teams should not choose Opus 4.7 for latency-sensitive voice or chat applications: time to first token is approximately 19 seconds and output speed is 42.2 tokens per second, notably below the frontier median of 59.8 t/s. For cost-sensitive high-volume tasks, Claude Sonnet 4.6 at $3/$15 per 1M is a strong alternative that maintains near-Opus coding performance at lower latency.

Training data cutoff is January 2026, meaning the model has reliable knowledge of events through that date. Anthropic does not train on API inputs by default, and enterprise zero-retention plans are available. The model is SOC 2 Type II compliant, HIPAA-eligible, ISO 27001 certified, and GDPR compliant. EU AI Act classification is general-purpose AI with systemic risk obligations. API inputs and outputs are retained for 30 days for abuse monitoring and deleted unless flagged, unless the enterprise zero-retention option is enabled.

Versus Opus 4.6, the headline gains are: SWE-bench Verified up 6.8pp to 87.6%, SWE-bench Pro up 10.9pp to 64.3%, OSWorld-Verified computer use up from 72.7% to 78.0%, and image resolution up 3.3x to 3.75MP.
Breaking changes shipped with Opus 4.7 include removal of extended thinking budgets (adaptive thinking is now the only mode), removal of sampling parameter overrides (temperature, top_p, top_k now return 400 errors), and a new tokenizer that increases token counts by up to 35%. Some developers have reported regressions on specific workflows, particularly those relying on precise temperature control or verbose chain-of-thought output.
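For teams updating Opus 4.6 call sites, a small shim illustrates the shape of the migration. This helper is hypothetical (Anthropic does not ship it); the parameter list mirrors the removals described above.

```python
# Hypothetical migration shim for Opus 4.6 call sites: Opus 4.7 returns 400
# errors for sampling overrides, and explicit thinking budgets are removed.
REMOVED = {"temperature", "top_p", "top_k", "thinking"}

def migrate_kwargs(kwargs: dict) -> dict:
    """Strip parameters that Opus 4.7 no longer accepts, logging what was dropped."""
    dropped = sorted(REMOVED & kwargs.keys())
    if dropped:
        print(f"dropping params unsupported by claude-opus-4-7: {dropped}")
    return {k: v for k, v in kwargs.items() if k not in REMOVED}

legacy = {"model": "claude-opus-4-6", "max_tokens": 2048,
          "temperature": 0.2, "top_p": 0.9}
print(migrate_kwargs({**legacy, "model": "claude-opus-4-7"}))
```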
Pricing
$5.00 per 1M input tokens, $25.00 per 1M output tokens. Prompt caching: 5-min cache write at $6.25/MTok, 1-hour cache write at $10.00/MTok, cache reads at $0.50/MTok (90% savings). Batch API: 50% discount at $2.50 input / $12.50 output per MTok. US-only inference adds 1.1x multiplier. New tokenizer may increase effective token counts by up to 35% vs Opus 4.6.
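Because the rate card is flat but the tokenizer counts heavier, effective cost is best estimated per request. A back-of-envelope helper, treating the 35% inflation and the 1.1x US-only multiplier as worst-case figures from the notes above:

```python
# Back-of-envelope effective cost, folding in the worst-case 35% tokenizer
# inflation and the optional 1.1x US-only inference multiplier noted above.
IN_RATE, OUT_RATE = 5.00, 25.00  # $ per 1M tokens

def effective_cost(in_tokens_old, out_tokens_old, inflation=1.35, geo=1.0):
    """Estimate cost for token counts measured under the Opus 4.6 tokenizer."""
    in_tok = in_tokens_old * inflation
    out_tok = out_tokens_old * inflation
    return geo * (in_tok / 1e6 * IN_RATE + out_tok / 1e6 * OUT_RATE)

# A prompt that was 100K in / 1K out under the old tokenizer, US-pinned:
print(f"${effective_cost(100_000, 1_000, geo=1.1):.3f}")  # ~$0.780 vs $0.525 before
```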
Key Features
- High-Resolution Vision (3.75MP): First Claude model to support 2576px / 3.75MP images, a 3.3x increase over previous models. Pixel coordinates map 1:1 to actual image pixels, removing scale-factor math from computer use agents.
- 1M Token Context at Standard Pricing: Full 1 million token context window with no long-context premium. Up to 600 images or PDF pages per request. Reliable knowledge through January 2026.
- Adaptive Thinking with xhigh Effort: New xhigh effort level between high and max for finer reasoning/latency control. Adaptive thinking (replacing extended thinking budgets) is the only reasoning mode and outperforms explicit budget setting in internal evals.
- Task Budgets for Agentic Loops: Beta feature: give Claude an advisory token target across a full agentic session. Model self-regulates via a running countdown, finishing tasks gracefully without runaway token spend.
- Best-in-Class Tool Use: MCP-Atlas 77.3%, highest among generally available models. Supports native parallel tool calls, client-side schemas, and server-side tools including web search ($10/1K searches), web fetch (free), and code execution.
- Prompt Caching (90% Savings): Cache system prompts, tool definitions, and document context at $0.50/MTok (10% of base input rate). Stack with Batch API for up to 95% savings vs uncached standard requests.
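As a sketch of the caching feature in the last item, the snippet below marks a large system prompt as cacheable using Anthropic's prompt-caching content-block syntax. The 1-hour TTL variant appears only in a comment, since it may require a beta flag.

```python
# Sketch: cache a large, stable system prompt so repeat reads bill at $0.50/MTok.
import anthropic

client = anthropic.Anthropic()
repo_context = open("repo_context.md").read()  # large, stable context worth caching

response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    system=[{
        "type": "text",
        "text": repo_context,
        "cache_control": {"type": "ephemeral"},  # 5-min cache write: $6.25/MTok
        # 1-hour variant per the rate card above (may need a beta flag):
        # "cache_control": {"type": "ephemeral", "ttl": "1h"},
    }],
    messages=[{"role": "user", "content": "Where is the retry logic configured?"}],
)
print(response.usage)  # cache_creation_input_tokens vs cache_read_input_tokens
```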
Pros
- 87.6% SWE-bench Verified, the top score among generally available models as of April 2026.
- 1M context window at $5/MTok with no surcharge, enabling full codebase or document corpus ingestion in a single call.
- MCP-Atlas 77.3% tool use, the highest generally available score, critical for reliable multi-tool agent orchestration.
Cons
- Time to first token of ~19 seconds and output speed of 42.2 t/s, well below the 59.8 t/s frontier median, ruling it out for latency-sensitive applications.
- New tokenizer silently increases token counts by up to 35%, meaning existing prompts can cost more despite unchanged per-token rates.
- Breaking API changes at launch: temperature/top_p/top_k removed, extended thinking budget syntax removed — requires migration work for all Opus 4.6 integrations.
Benchmarks
- MMLU: 91.5
- GPQA Diamond: 94.2
- SWE-bench Verified: 87.6
- Humanity's Last Exam: 46.9
Frequently Asked Questions
What is Claude Opus 4.7 and who built it?
Claude Opus 4.7 is Anthropic's most capable generally available AI model, released on April 16, 2026. It belongs to the Claude 4 family and is built on a dense Transformer architecture with an undisclosed parameter count; Anthropic does not publish parameter sizes for Opus-class models. The model scores 87.6% on SWE-bench Verified, 94.2% on GPQA Diamond, and 77.3% on MCP-Atlas tool use, leading all generally available models on SWE-bench Verified and MCP-Atlas and sitting effectively tied with GPT-5.4 and Gemini 3.1 Pro on GPQA Diamond. It was designed to close the capability gap between Anthropic's GA lineup and research previews, with particular focus on agentic coding, long-horizon task execution, and high-resolution vision. It sits above Claude Sonnet 4.6 and Claude Haiku 4.5 in Anthropic's current lineup, and replaces Claude Opus 4.6 as the recommended starting point for complex workloads. Pricing is $5 per million input tokens and $25 per million output tokens.
How much does Claude Opus 4.7 cost per 1M tokens?
Claude Opus 4.7 costs $5.00 per million input tokens and $25.00 per million output tokens, unchanged from Opus 4.6. Prompt caching cuts cached input reads to $0.50 per million tokens (a 90% reduction), with a 5-minute cache write costing $6.25/MTok and a 1-hour cache write costing $10.00/MTok. The Batch API provides a 50% discount on both input and output, reducing rates to $2.50 input and $12.50 output per million tokens for async workloads. A daily coding agent consuming 1M input tokens and 200K output tokens costs $10.00 uncached, or roughly $5.50 with aggressive prompt caching on the system context; caching discounts input reads only, so the 200K of output still bills at the full rate. A 100K-token document summarization task costs $0.525. US-only inference adds a 1.1x multiplier via the inference_geo parameter. Importantly, Opus 4.7 uses a new tokenizer that processes text with up to 35% more tokens than Opus 4.6, meaning real bills can increase even though the per-token rate is unchanged. Compared to Claude Sonnet 4.6 at $3/$15 per MTok, Opus 4.7 is 67% more expensive on both input and output.
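A quick way to sanity-check these figures, using the rates from the card above; the fully cached scenario assumes every input token is served as a cache read:

```python
# Verifying the worked examples above at the $5/$25 rate card.
IN_RATE, OUT_RATE, CACHED_READ = 5.00, 25.00, 0.50  # $ per 1M tokens

def cost(in_tok, out_tok, cached_in_tok=0):
    """Total request cost; cached input tokens bill at the cache-read rate."""
    uncached = in_tok - cached_in_tok
    return (uncached * IN_RATE + cached_in_tok * CACHED_READ + out_tok * OUT_RATE) / 1e6

print(cost(100_000, 1_000))                               # summarization: $0.525
print(cost(1_000_000, 200_000))                           # coding agent, uncached: $10.00
print(cost(1_000_000, 200_000, cached_in_tok=1_000_000))  # all input cached: $5.50
```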
What is Claude Opus 4.7's context window and max output?
Claude Opus 4.7 supports a 1,000,000 token context window — approximately 555,000 words or 2.5 million Unicode characters using the new tokenizer. The max output via the synchronous Messages API is 128,000 tokens. On the Message Batches API with the output-300k-2026-03-24 beta header, max output extends to 300,000 tokens. A single request can include up to 600 images or PDF pages, compared to 100 for models with a 200K context window. The full 1M context is available at standard per-token pricing with no long-context premium or special endpoint required. Opus 4.7 uses a new tokenizer that counts the same text as up to 35% more tokens than Opus 4.6, meaning existing prompts may approach context limits faster. Long-context recall is rated high; Anthropic reports reliable performance across the full 1M window. Compared to Claude Sonnet 4.6, which shares the same 1M context but tops out at 64K output, Opus 4.7 doubles the max synchronous output.
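Because the new tokenizer counts the same text up to 35% heavier, it is worth measuring a prompt against the window before sending. A sketch using the Messages token-counting endpoint in the Python SDK:

```python
# Sketch: measure a prompt under the new tokenizer before committing to a call.
import anthropic

client = anthropic.Anthropic()
CONTEXT_WINDOW, MAX_OUTPUT = 1_000_000, 128_000

corpus = open("corpus.txt").read()  # e.g. a full codebase or document dump
count = client.messages.count_tokens(
    model="claude-opus-4-7",
    messages=[{"role": "user", "content": corpus}],
)
headroom = CONTEXT_WINDOW - count.input_tokens - MAX_OUTPUT
print(f"{count.input_tokens} input tokens, {headroom} tokens of headroom")
```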
How does Claude Opus 4.7 compare on benchmarks vs GPT-5.4 and Gemini 3.1 Pro?
Claude Opus 4.7 leads GPT-5.4 on SWE-bench Verified (87.6%; GPT-5.4's exact score is lower but not listed here) and on MCP-Atlas tool use (77.3% vs 68.1%), making it the stronger choice for agentic coding and multi-tool workflows. On SWE-bench Pro, Opus 4.7 scores 64.3%, ahead of GPT-5.4 at 57.7% and Gemini 3.1 Pro at 54.2%. On GPQA Diamond, all three frontier models are essentially tied: Opus 4.7 at 94.2%, GPT-5.4 at 94.4%, and Gemini 3.1 Pro at 94.3%; that benchmark is approaching saturation. GPT-5.4 Pro leads on BrowseComp web research (89.3% vs 79.3%), making it the better choice for open web retrieval tasks. On computer use (OSWorld-Verified), Opus 4.7 scores 78.0% vs GPT-5.4 at 75.0%, giving it an edge for UI automation. Claude Mythos Preview outperforms Opus 4.7 on most benchmarks but is invitation-only and not generally available. Opus 4.7 is the best generally available model for coding and tool orchestration as of April 2026, with GPT-5.4 as the closest competitor.
Is Claude Opus 4.7 open source or proprietary?
Claude Opus 4.7 is fully proprietary. The model weights are closed and cannot be downloaded, self-hosted, fine-tuned, or deployed in air-gapped environments. Access is API-only through four platforms: the Anthropic API directly (model ID: claude-opus-4-7), AWS Bedrock (model ID: anthropic.claude-opus-4-7-v1:0), Google Cloud Vertex AI (model ID: claude-opus-4-7@20260416), and Microsoft Foundry via the Azure AI Foundry catalog. Microsoft Foundry access requires a project in East US2 or Sweden Central. There is no Hugging Face download and no community license. Anthropic has never published model sizes or weights for any Opus-class model. For teams that require open weights for on-device inference or fine-tuning, alternatives include Meta's Llama 4 family or Mistral's open-weights models.
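Since access is API-only, the cloud routes matter in practice. A sketch of reaching the model through AWS Bedrock's Converse API with the Bedrock model ID quoted above; the region choice is illustrative:

```python
# Sketch: invoking Opus 4.7 on AWS Bedrock via the Converse API.
import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")  # illustrative region

result = bedrock.converse(
    modelId="anthropic.claude-opus-4-7-v1:0",  # Bedrock model ID from above
    messages=[{"role": "user", "content": [{"text": "Summarize this design doc: ..."}]}],
    inferenceConfig={"maxTokens": 4096},
)
print(result["output"]["message"]["content"][0]["text"])
```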
What modalities does Claude Opus 4.7 support?
Claude Opus 4.7 accepts text and images as inputs, and produces text as output. It is the first Claude model with high-resolution image support: maximum resolution is 2576px / 3.75MP (up from 1568px / 1.15MP on prior models), a 3.3x increase in pixel area. A single request can include up to 600 images or PDF pages. The model supports native function calling and tool use with parallel tool calls, client-side tool schemas, and server-side tools including web search and web fetch. Computer use (beta) allows the model to control a desktop via screenshot analysis and coordinate-based input — with Opus 4.7's pixel coordinates now mapping 1:1 to actual image pixels. No native audio input or audio output is supported; voice applications must use a separate speech recognition and synthesis pipeline. Video input is not supported. Adaptive thinking is supported (the only reasoning mode on Opus 4.7); extended thinking budgets from prior models are removed.
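A sketch of the vision input path, using the standard base64 image content block; the coordinate question is a natural fit here because Opus 4.7's coordinates map 1:1 to image pixels, per the answer above:

```python
# Sketch: sending a high-resolution screenshot (up to 2576px / 3.75MP here)
# as a base64 image block, the standard Anthropic vision input format.
import anthropic, base64

client = anthropic.Anthropic()
with open("dashboard.png", "rb") as f:
    image_b64 = base64.standard_b64encode(f.read()).decode("utf-8")

response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[{
        "role": "user",
        "content": [
            {"type": "image",
             "source": {"type": "base64", "media_type": "image/png", "data": image_b64}},
            {"type": "text",
             "text": "Give the pixel coordinates of the Export button."},
        ],
    }],
)
print(response.content[0].text)
```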
Does Claude Opus 4.7 train on user data?
Claude Opus 4.7 does not train on API inputs by default. Anthropic retains API inputs and outputs for 30 days for abuse monitoring, then deletes them unless flagged. Enterprise customers can enable zero-retention mode, which prevents any logging of inputs or outputs. The model is SOC 2 Type II certified, ISO 27001 certified, HIPAA-eligible, and GDPR compliant. The EU AI Act classifies it as a general-purpose AI model with systemic risk obligations. On AWS Bedrock, data handling follows AWS's shared responsibility model; on Google Vertex AI, Google's data processing addendum applies. The trust center and compliance documentation are available at anthropic.com/transparency. Anthropic's responsible scaling policy (RSP) governs how training decisions are made as model capabilities increase.
Who is Claude Opus 4.7 best for and who should avoid it?
Claude Opus 4.7 is the right choice for engineering teams running autonomous coding agents, where its 87.6% SWE-bench Verified and 77.3% MCP-Atlas tool use scores directly translate to fewer failed tasks and more reliable multi-tool loops. It is also strong for enterprise document analysis requiring 100K to 1M token inputs, long-context recall, and knowledge current through January 2026. Computer use and UI automation teams benefit from the 3.3x vision resolution upgrade and 1:1 pixel coordinate mapping. Teams should not choose Opus 4.7 for real-time applications: time to first token is ~19 seconds and output speed is 42.2 t/s, well below the frontier median of 59.8 t/s — Claude Haiku 4.5 or Sonnet 4.6 are the correct alternatives for latency-sensitive products. High-volume cost-sensitive teams should evaluate Sonnet 4.6 at $3/$15 per MTok before committing to Opus 4.7's $5/$25 rates, especially given the new tokenizer's 35% token inflation. On-device or air-gapped deployments are impossible with Opus 4.7; Llama 4 or Mistral open-weights models are the alternatives there.
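The selection guidance above reduces to a simple decision rule. The sketch below is illustrative only: the thresholds are arbitrary and the non-Opus model IDs are assumptions, not Anthropic guidance.

```python
# Illustrative routing across the Claude lineup, using the tradeoffs described
# above; thresholds and the Haiku/Sonnet model IDs are assumptions.
def pick_model(latency_sensitive: bool, needs_top_coding: bool,
               budget_per_mtok_in: float) -> str:
    if latency_sensitive:
        return "claude-haiku-4-5"   # ~19s TTFT rules Opus out of real-time paths
    if needs_top_coding and budget_per_mtok_in >= 5.00:
        return "claude-opus-4-7"    # 87.6% SWE-bench Verified at $5/$25 per MTok
    return "claude-sonnet-4-6"      # near-Opus coding at $3/$15, lower latency

print(pick_model(latency_sensitive=False, needs_top_coding=True,
                 budget_per_mtok_in=5.0))
```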