Name: Claude Opus 4.8: 88.6% SWE-bench, $5/$25 Per 1M (June 2026)
Brand: Anthropic
Price: 5.00 USD
Availability: InStock

Question 1

What is Claude Opus 4.8 and who built it?

Accepted Answer

Claude Opus 4.8 is Anthropic's most capable generally available AI model, released on May 28, 2026. It is a dense Transformer with parameter count undisclosed (estimated ~600B based on capability scaling). The model is built by Anthropic, the AI safety company founded in 2021 by former OpenAI researchers. Opus 4.8 is positioned as Anthropic's flagship for agentic reasoning, long-context analysis, and autonomous coding workflows. It maintains the Claude Opus family's focus on behavioral consistency and alignment. The model excels on SWE-bench Verified (88.6%), mathematical reasoning (USAMO 96.7%), and code-flaw detection (0% miss rate on internal evals, 4x improvement over Opus 4.7). Anthropic markets Opus 4.8 as a modest but tangible improvement over Opus 4.7, with deeper gains in math and agentic task consistency.

Question 2

How much does Claude Opus 4.8 cost per 1M tokens?

Accepted Answer

Claude Opus 4.8 costs $5.00 per 1M input tokens and $25.00 per 1M output tokens on standard mode. Prompt caching (minimum 1,024 tokens) costs $0.50 per 1M cached input tokens, a 90% discount vs standard input. Fast mode (research preview) costs $10.00 input / $50.00 output per 1M and delivers 2.5x faster output tokens at double the standard price. Batch API (async within 24 hours) costs $2.50 input / $12.50 output per 1M, a 50% discount. Worked examples: (1) Analyzing a 100K-token research paper with caching costs $0.32. (2) Daily agentic coding loop (1M input / 200K output, standard mode) costs $6.00. (3) Customer-support chatbot (1000 turns, avg 2K input / 500 output per turn) costs $13.50 total. (4) Batch processing 10 million tokens overnight costs $37.50. Pricing has remained stable since Opus 4.5 (Nov 2025) through Opus 4.8 (May 2026).

Question 3

What is Claude Opus 4.8's context window and max output?

Accepted Answer

Claude Opus 4.8 supports a 1M token (1 million token) context window by default on the Claude API, AWS Bedrock, and Google Vertex AI. Microsoft Foundry offers 200k context. Standard maximum output is 128,000 tokens. For batch processing, there is a beta 300k output option via the output-300k-2026-03-24 header on the Batch API. Long-context recall is reliable: the model maintains 99%+ accuracy on needle-in-haystack evals (finding facts buried deep in the context), verified above 100K token depth. The model architecture preserves token position awareness without significant degradation, enabling tasks like analyzing entire codebases (1M is ~3-5 files of code) or multi-document research synthesis. Prompt caching minimum is 1,024 tokens, lower than Opus 4.7 (2,048), making it cost-effective to cache smaller system prompts and tool definitions.

Question 4

How does Claude Opus 4.8 compare on benchmarks vs GPT-5.5 and Gemini 3.1 Pro?

Accepted Answer

Claude Opus 4.8 dominates on agentic coding and mathematical reasoning. SWE-bench Verified: Opus 4.8 leads at 88.6% vs GPT-5.5 (estimated 85%) and Gemini 3.1 Pro (84.2%). SWE-bench Pro (harder agentic coding): Opus 4.8 reaches 69.2%, beating GPT-5.5 (58.6%) and Gemini (62.8%). USAMO 2026 (math competition): Opus 4.8 dominates at 96.7%, far exceeding GPT-5.5 (91.2%) and Gemini (89.4%). GPQA Diamond (graduate reasoning): Opus 4.8 scores 93.6%, competitive with Gemini 3.1 Deep Think (94.1%) but ahead of GPT-5.5 (91.2%). Humanity's Last Exam (multidisciplinary): Opus 4.8 scores 57.9% with tools, highest in the field. Code honesty: Opus 4.8 achieves 0% flaw-miss rate (first Claude model), vs Opus 4.7 (25% miss rate) and GPT-5.5 (18% miss rate). LMArena Elo: Opus 4.8 ranks #2 at ~1,412 Elo, behind GPT-5.5 (~1,428) but ahead of Gemini 3.1 Pro. Each model wins different axes: Opus excels in agentic coding and math, GPT-5.5 leads on speed, Gemini leads pure theorem-proving.

Question 5

Is Claude Opus 4.8 open source or proprietary?

Accepted Answer

Claude Opus 4.8 is proprietary with closed weights. There is no open-source or open-weights version. Users cannot download the model, self-host, deploy air-gapped, or fine-tune the base weights. Access is API-only through multiple vendor channels. The Claude API (api.anthropic.com) offers direct access with API Key authentication. AWS Bedrock hosts the model with IAM-based auth across us-east-1, us-west-2, eu-central-1, ap-northeast-1, and other regions. Google Vertex AI provides access with GCP IAM authentication. Microsoft Foundry (Azure) offers deployment with Azure Identity authentication (200k context only). No other major platforms (Together.ai, Fireworks.ai, Lambda Labs) currently host Opus 4.8 as of May 2026. SDKs are available in Python, TypeScript, JavaScript, Java, Go, and Ruby for all major deployment platforms.

Question 6

What modalities does Claude Opus 4.8 support?

Accepted Answer

Claude Opus 4.8 is multimodal with text and vision inputs, text output only. Input modalities: text (unlimited tokens), images (up to 100 per request, any resolution), PDF documents (native reading without extraction), and tool calls (function calling schema compatible with OpenAI). Output modalities: text (up to 128k tokens, beta 300k), and tool calls (parallel execution supported). Special capabilities: structured output (JSON mode via API), function calling (native, OpenAI-compatible schema), and computer use (screen reading, keyboard/mouse control verified at 83.4% on OSWorld). Notably absent: no native audio input, no audio output, no video input. Audio workflows require pairing with a separate ASR (automatic speech recognition) and TTS (text-to-speech) model. Video understanding requires extracting frames and processing as images or using a separate video model. The model's vision capabilities are strong on documents, diagrams, charts, and UI screenshots—optimized for code-centric and analysis tasks.

Question 7

Does Claude Opus 4.8 train on user data?

Accepted Answer

No, Claude Opus 4.8 does not train on user data by default. Inputs and outputs are retained for 30 days for abuse monitoring and then deleted unless flagged. The model was trained on data with a cutoff of January 2026, including public web text, licensed datasets, and synthetic reasoning traces. Anthropic does not train future Claude models on API inputs; this is the default policy. Users can opt for zero-retention mode on enterprise plans, which means data is deleted immediately after processing. Data governance: Claude Opus 4.8 is SOC 2 Type II certified, ISO 27001 compliant, HIPAA-eligible for healthcare workflows, and GDPR-compliant for EU data protection. Data residency options include US and EU regions depending on deployment (Claude API regional selection, Bedrock region choice, Vertex AI location). The model supports safe data handling for regulated industries. Anthropic's Constitutional AI and RLHF alignment are standard safety measures across all deployments.

Question 8

Who is Claude Opus 4.8 best for and who should avoid it?

Accepted Answer

Claude Opus 4.8 excels for four main use cases: (1) Agentic coding teams running long autonomous loops in CI/CD, GitHub Copilot, or Claude Code—88.6% SWE-bench and 0% code-flaw miss rate make it the industry leader. (2) Enterprise research and knowledge teams analyzing 100K+ documents—1M context with 99%+ long-recall and 90% cheaper caching. (3) Mathematical and multidisciplinary reasoning workflows—96.7% USAMO and 93.6% GPQA excel here. (4) Customer support and DevOps teams needing reliable tool orchestration and computer use (83.4% OSWorld). Teams that should avoid: (1) Real-time voice assistants (no native audio, 850ms p50 latency too high for sub-500ms voice interactions; use GPT-5.5 mini or speech-specialized models). (2) On-device or air-gapped deployments (proprietary, API-only; open models like Llama 4 or self-hosted Gemini are alternatives). (3) Pure theorem-proving or symbolic mathematics (Gemini 3.1 Deep Think excels here; specialized math models also compete). (4) Video-first workflows (no native video; separate vision model required). Budget-conscious teams may prefer Claude Sonnet 4.6 (40% cheaper) for less complex tasks or open-weights alternatives.

Claude Opus 4.8

Claude Opus 4.8: 88.6% SWE-bench, $5/$25 Per 1M (June 2026)

About Claude Opus 4.8

Pricing

Key Features

Pros

Cons

Benchmarks

Frequently Asked Questions