Claude Sonnet 4.6: Benchmarks, Pricing & API Guide | hokai.io
Claude Sonnet 4.6 by Anthropic (Feb 2026): 79.6% SWE-bench Verified, 1M-token context, $3/$15 per 1M tokens. Best for agentic coding and computer use.
Claude Sonnet 4.6 by Anthropic launched February 17, 2026, scoring 79.6% on SWE-bench Verified and 72.5% on OSWorld (computer use). It carries a 1-million-token context window at no premium, priced at $3 per million input tokens and $15 per million output tokens. The model is the default on claude.ai for Free and Pro users and is available on AWS Bedrock, Vertex AI, and Microsoft Foundry.
Provider: Anthropic · Family: Claude 4
Context window: 1,000,000 tokens · Max output: 64,000
Input modalities: text, image, pdf, tool-calls · Output: text, tool-calls
About Claude Sonnet 4.6
Claude Sonnet 4.6 is Anthropic's mid-tier flagship model, released on February 17, 2026. It sits in the Claude 4 family between the budget-oriented Haiku 4.5 and the research-grade Opus 4.6, and it replaced Sonnet 4.5 as the default model for Free and Pro users on claude.ai. The model uses a dense transformer architecture; Anthropic has not disclosed the parameter count, though third-party estimates place it in the 50-100 billion range. Sonnet 4.6 was designed to collapse the performance gap between Anthropic's mid-tier and flagship tiers, delivering near-Opus results on the tasks most development teams care about: agentic coding, long-document analysis, and computer-use automation.

On benchmark evaluations, Claude Sonnet 4.6 scores 79.6% on SWE-bench Verified, within 1.2 points of Claude Opus 4.6 (80.8%) and 2.4 points ahead of Claude Sonnet 4.5 (77.2%). On GPQA Diamond, the model scores 74.1%, compared to 91.3% for Opus 4.6 and 73.8% for GPT-5.2; that gap in graduate-level scientific reasoning tips decisions toward Opus when deep domain expertise matters. On OSWorld (computer use), Sonnet 4.6 scores 72.5%, nearly matching Opus 4.6 at 72.7% and far ahead of GPT-5.2 at 38.2%. On MATH, the model achieves 89% accuracy, up from 62% on Sonnet 4.5, and on ARC-AGI-2 it scores 60.4%.

Claude Sonnet 4.6 has a 1-million-token context window, generally available since March 13, 2026: no beta header is required and no long-context premium applies, so a 900,000-token request bills at the same per-token rate as a 9,000-token request. The synchronous API supports up to 64,000 output tokens per call; for larger generation tasks, the Message Batches API supports up to 300,000 output tokens per call via the output-300k-2026-03-24 beta header. The model's reliable knowledge cutoff is August 2025, with training data extending to January 2026.
Anthropic has not published a needle-in-a-haystack recall figure for Sonnet 4.6, but the 1M context window is architecturally shared with Opus 4.6, which showed strong recall in internal evaluations. The model also features adaptive context compaction for extended agentic sessions.

Claude Sonnet 4.6 accepts text, image, and PDF inputs and produces text and tool-call outputs. Vision is live and supports up to 600 images or PDFs per request; individual images can be up to 8,000x8,000 pixels. There is no native audio input or output. Function calling uses Anthropic's standard tool_use schema with support for parallel tool calls and structured JSON output, and the model supports computer use, including GUI navigation, web form completion, and spreadsheet manipulation. Adaptive thinking (the successor to extended thinking) lets the model decide how much internal reasoning to apply per query based on an effort parameter; the three standard levels are low, medium, and high, plus a max level for intensive tasks. Sonnet 4.6 also inherits Opus-level prompt injection resistance, a notable upgrade over Sonnet 4.5.

Pricing for Claude Sonnet 4.6 on the Anthropic API is $3.00 per million input tokens and $15.00 per million output tokens. Prompt caching reduces costs substantially: a 5-minute cache write costs $3.75 per million tokens, a 1-hour cache write costs $6.00 per million tokens, and cache reads cost $0.30 per million tokens (10% of the base input rate). The Batch API offers a 50% discount, bringing prices to $1.50 input and $7.50 output per million tokens, with results returned asynchronously and a 300K output-token ceiling via beta header.
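The effort parameter can be sketched as a request fragment. The effort levels (low/medium/high/max) are the ones described above, but the exact payload shape shown here, a "thinking" block carrying an "effort" field, is an assumption for illustration only, not a confirmed API schema; check Anthropic's API reference for the real parameter names.

```python
# Hypothetical adaptive-thinking request fragment. The effort levels come from
# the text above; the "thinking" field name and structure are assumptions.

EFFORT_LEVELS = ("low", "medium", "high", "max")

def thinking_request(prompt: str, effort: str = "medium") -> dict:
    """Build a request dict with an assumed adaptive-thinking effort setting."""
    if effort not in EFFORT_LEVELS:
        raise ValueError(f"effort must be one of {EFFORT_LEVELS}")
    return {
        "model": "claude-sonnet-4-6",
        "max_tokens": 8_192,
        "thinking": {"effort": effort},  # model scales internal reasoning to this level
        "messages": [{"role": "user", "content": prompt}],
    }

request = thinking_request("Prove that this retry loop terminates.", effort="high")
```

In practice this means simple queries can run at low effort to cut latency and token spend, while hard agentic steps opt into high or max.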
To put the pricing in context: summarizing a 100,000-token research paper costs roughly $0.30; running a daily coding agent generating 1 million input tokens and 200,000 output tokens costs approximately $6.00; and processing 1,000 customer support turns at an average of 2,000 input and 500 output tokens each costs approximately $13.50.

Claude Sonnet 4.6 is accessible via the Anthropic API (api.anthropic.com), AWS Bedrock (model ID: anthropic.claude-sonnet-4-6), Google Vertex AI (model ID: claude-sonnet-4-6), and Microsoft Foundry. On Bedrock, the model supports global cross-region, geo cross-region (US, EU, AU, JP), and select in-region endpoints; on Vertex, global, multi-region, and regional endpoints are available. Regional and multi-region endpoints on Bedrock and Vertex carry a 10% premium over global routing. Authentication uses API keys for the Anthropic API and IAM credentials for Bedrock and Vertex. SDKs are available in Python, TypeScript, Java, Go, and Ruby; Go and Ruby do not support Microsoft Foundry. The model is closed-weights and cannot be self-hosted.

Anthropic deployed Claude Sonnet 4.6 under AI Safety Level 3 (ASL-3), the same standard as Opus 4.6. The system card notes that automated safety evaluations placed Sonnet 4.6 at or below the capability level of Claude Opus 4.6, meaning it does not push the capability frontier beyond what was already managed under ASL-3 safeguards, and the model did not cross the threshold for ASL-4 classification on biological-domain uplift tasks. On cyber capability evaluations across more than 1,500 CyberGym tasks, Sonnet 4.6 found security flaws 65% of the time (versus 67% for Opus 4.6 and 83% for the experimental Claude Mythos Preview). The alignment method combines Constitutional AI with RLHF, and safety evaluations included agentic and computer-use scenarios, prompt injection resistance, and alignment under unusual and extreme conditions.
On some alignment measures, Sonnet 4.6 showed the strongest results Anthropic has recorded for any Claude model.

Claude Sonnet 4.6 is the right choice for teams that need near-flagship coding and computer-use performance at a lower cost per token. The 79.6% SWE-bench score and 72.5% OSWorld score place it well ahead of older Opus-class models, and teams running high-volume agentic coding loops that previously required Opus 4.5 can often downgrade to Sonnet 4.6 without a measurable quality drop. The model is also the right tool for long-document analysis, given the 1M-token context window at no premium. It is a poor fit for tasks requiring deep graduate-level scientific reasoning (GPQA 74.1% versus Opus 4.6's 91.3%), for real-time voice applications (no audio I/O), and for teams that need on-device deployment or air-gapped inference (closed weights, API-only). Gemini 3.1 Pro (80.6% SWE-bench) and GPT-5.2 (73.8% GPQA) are the primary alternatives to evaluate on those specific benchmarks.

Training data for Claude Sonnet 4.6 includes a curated mix of public web text, licensed datasets, and synthetic reasoning traces. Anthropic does not train on API inputs by default; inputs are retained for up to 30 days for safety and abuse monitoring and then deleted unless flagged. Enterprise customers can request a zero-retention arrangement. The model is deployed under Anthropic's Responsible Scaling Policy, and SOC 2 Type II, HIPAA-eligible, and GDPR-compliant configurations are available through the Anthropic API and via AWS Bedrock's data governance controls. Anthropic's trust center is at anthropic.com/transparency.

Claude Sonnet 4.6 replaced Claude Sonnet 4.5 as the default model on claude.ai for Free and Pro users on its launch date. The older Claude Sonnet 4 (claude-sonnet-4-20250514) and Claude Opus 4 (claude-opus-4-20250514) models are deprecated and will be retired on June 15, 2026; Anthropic's official migration target for Sonnet 4 users is Sonnet 4.6.
Claude Sonnet 3.7 and Haiku 3.5 have already been retired. The next Sonnet-class model is expected to be Sonnet 4.8, with no announced release date. Sonnet 4.6 is the current production standard for mid-tier Claude deployments as of May 2026.
Pricing
$3.00 per 1M input tokens, $15.00 per 1M output tokens. Prompt caching: 5-minute cache write $3.75/MTok, 1-hour cache write $6.00/MTok, cache reads $0.30/MTok. Batch API: $1.50 input / $7.50 output per 1M tokens with 300K output ceiling via beta header. Regional endpoints add 10% premium.
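The rates above translate into a simple back-of-the-envelope estimator. This is plain arithmetic over the published prices, not an official SDK utility; the function name is illustrative:

```python
# Back-of-the-envelope cost estimator using the published Sonnet 4.6 rates.
# All figures in USD; per-token rate = price per 1M tokens / 1,000,000.

INPUT_RATE = 3.00 / 1_000_000    # $3.00 per 1M input tokens
OUTPUT_RATE = 15.00 / 1_000_000  # $15.00 per 1M output tokens

def estimate_cost(input_tokens: int, output_tokens: int, batch: bool = False) -> float:
    """Estimate the cost of one workload; the Batch API halves both rates."""
    multiplier = 0.5 if batch else 1.0
    return multiplier * (input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE)

# The daily coding agent example from the text: 1M input + 200K output tokens.
print(round(estimate_cost(1_000_000, 200_000), 2))              # 6.0
print(round(estimate_cost(1_000_000, 200_000, batch=True), 2))  # 3.0
```

Cache reads and regional premiums are not modeled here; a fuller estimator would price cached prefix tokens at $0.30/MTok and apply the 10% regional multiplier where applicable.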
Key Features
- Adaptive Thinking: Replaces fixed extended thinking with a dynamic system that allocates reasoning tokens based on task complexity and the configured effort level (low/medium/high/max). The model averages 246 thinking tokens per question on standard tasks, reducing overhead on simple queries while scaling up for hard ones.
- 1M Token Context Window: Generally available as of March 13, 2026, at standard per-token pricing. No beta header or long-context premium required. Accepts up to 600 images or PDFs per request alongside text.
- Computer Use: Supports GUI navigation, spreadsheet manipulation, and multi-step web form completion. Scores 72.5% on OSWorld-Verified, matching Opus 4.6 and more than doubling GPT-5.2's 38.2% score.
- Prompt Caching: Caches system prompts, documents, and conversation history. Cache reads cost $0.30 per million tokens, a 90% reduction from the $3.00 base input rate. The 1-hour TTL supports long-running agent sessions without re-processing context on every turn.
- 300K Batch Output: The Message Batches API supports up to 300,000 output tokens per call via the output-300k-2026-03-24 beta header, paired with a 50% price reduction. Suited for large code generation, documentation, and report synthesis tasks.
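The prompt-caching flow can be sketched as a request body. The cache_control marker follows the Messages API convention of flagging a stable prefix as ephemeral; the system prompt here is a placeholder, and the overall payload shape assumes current Anthropic API conventions carry over to this model:

```python
# Sketch of a prompt-caching request body. The stable system prompt is marked
# with cache_control so later requests reusing it bill cache reads at
# $0.30/MTok instead of the $3.00/MTok base input rate.

LARGE_SYSTEM_PROMPT = "You are a code-review assistant. " * 200  # stable prefix (placeholder)

def build_cached_request(user_turn: str) -> dict:
    return {
        "model": "claude-sonnet-4-6",
        "max_tokens": 1024,
        "system": [
            {
                "type": "text",
                "text": LARGE_SYSTEM_PROMPT,
                # Mark the stable prefix as cacheable.
                "cache_control": {"type": "ephemeral"},
            }
        ],
        # Per-turn user content stays outside the cached prefix.
        "messages": [{"role": "user", "content": user_turn}],
    }

request = build_cached_request("Review this diff for race conditions.")
```

With the official anthropic Python SDK, a dict like this would map onto `client.messages.create(**request)`; the first call pays the cache-write rate on the prefix, and subsequent calls within the TTL pay the read rate.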
Pros
- 79.6% SWE-bench Verified, within 1.2 points of Opus 4.6 at 60% of the price, making it the default choice for high-volume coding agents.
- 72.5% OSWorld computer use score with Opus-level prompt injection resistance; this is the first time a Sonnet-class model has cleared that safety bar.
- 1M token context window at no surcharge, allowing full-codebase or multi-document analysis without chunking or vector retrieval.
Cons
- GPQA Diamond 74.1% trails Opus 4.6 by 17.2 points; scientific reasoning tasks in chemistry, biology, or advanced physics still require the flagship model.
- No native audio I/O; voice products must add a separate transcription and synthesis layer.
- Closed weights; teams requiring on-device, air-gapped, or fine-tuned deployment cannot use this model.
Benchmarks
- MATH: 89%
- ARC-AGI-2: 60.4%
- GPQA Diamond: 74.1%
- SWE-bench Verified: 79.6%
Frequently Asked Questions
What is Claude Sonnet 4.6 and who built it?
Claude Sonnet 4.6 is a mid-tier large language model built by Anthropic and released on February 17, 2026. It sits in the Claude 4 model family between the budget-oriented Claude Haiku 4.5 and the research-grade Claude Opus 4.6, and it replaced Claude Sonnet 4.5 as the default model on claude.ai for Free and Pro users on launch day. The model uses a dense transformer architecture with an undisclosed parameter count. On the most important coding benchmark, SWE-bench Verified, it scores 79.6%, within 1.2 points of Claude Opus 4.6 (80.8%) at 60% of the price. On computer use (OSWorld), it scores 72.5%, nearly matching Opus 4.6 (72.7%) and far ahead of GPT-5.2 (38.2%). The model was specifically designed to close the performance gap between Anthropic's mid-tier and flagship tiers on the workloads most development teams rely on: agentic coding, long-document analysis, and GUI automation. Pricing starts at $3.00 per million input tokens and $15.00 per million output tokens, with a 1-million-token context window included at no surcharge.
How much does Claude Sonnet 4.6 cost per 1M tokens?
Claude Sonnet 4.6 costs $3.00 per million input tokens and $15.00 per million output tokens on the Anthropic API, confirmed on Anthropic's official pricing page as of May 2026. Prompt caching reduces input costs significantly: writing to a 5-minute cache costs $3.75 per million tokens, writing to a 1-hour cache costs $6.00 per million tokens, and cache reads cost $0.30 per million tokens, a 90% reduction from the base input rate. The Batch API offers a 50% discount, bringing prices to $1.50 input and $7.50 output per million tokens, with asynchronous result delivery. To illustrate real workload costs: summarizing a 100,000-token research paper costs roughly $0.30; running a daily coding agent at 1 million input tokens and 200,000 output tokens costs approximately $6.00; processing 1,000 customer support turns at an average of 2,000 input and 500 output tokens each costs roughly $13.50. Regional endpoints on AWS Bedrock and Google Vertex AI add a 10% premium over global routing. The pricing is identical to Claude Sonnet 4.5, maintaining Anthropic's $3/$15 per million tokens across four Sonnet generations.
What is Claude Sonnet 4.6's context window and max output?
Claude Sonnet 4.6 has a 1-million-token context window that became generally available on March 13, 2026, with no beta header required and no long-context price premium. A 900,000-token request bills at the same per-token rate as a 9,000-token request. The synchronous Messages API supports up to 64,000 output tokens per call, making it suitable for generating long documents, code files, and reports. For larger generation tasks, the Message Batches API supports up to 300,000 output tokens per call via the output-300k-2026-03-24 beta header, paired with the standard 50% Batch API discount. The model accepts up to 600 images or PDFs in a single request alongside text. Anthropic has not published a public needle-in-a-haystack recall benchmark for Sonnet 4.6, though the 1M context window is architecturally shared with Opus 4.6. The model also features adaptive context compaction, which summarizes older conversation turns as context fills, enabling sustained long-horizon agentic sessions without manual truncation. Claude Haiku 4.5, by contrast, has a 200K-token context window, and Claude Opus 4.6 has a 1M window with up to 128K output tokens.
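A batch entry opting into the larger output ceiling can be sketched as below. The beta header name is the one given above; the custom_id-plus-params shape follows the Message Batches API convention, and the header-passing mechanics are an assumption rather than confirmed documentation for this model:

```python
# Sketch of one Message Batches entry requesting the 300K-output beta.
# custom_id ties the asynchronous result back to this request.

batch_request = {
    "custom_id": "report-section-01",
    "params": {
        "model": "claude-sonnet-4-6",
        "max_tokens": 300_000,  # only valid with the beta header set
        "messages": [
            {"role": "user", "content": "Draft the full API reference for the payments module."}
        ],
    },
}

# The beta opt-in is assumed to travel as an anthropic-beta header on the
# batch submission itself:
headers = {"anthropic-beta": "output-300k-2026-03-24"}
```

Batch results are delivered asynchronously, so this pattern fits large one-shot generation jobs (documentation, reports, bulk code) rather than interactive use.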
How does Claude Sonnet 4.6 compare to Claude Opus 4.6 and GPT-5.2 on benchmarks?
Claude Sonnet 4.6 and Claude Opus 4.6 are nearly tied on coding and computer use: SWE-bench Verified is 79.6% for Sonnet versus 80.8% for Opus (a gap of 1.2 points), and OSWorld is 72.5% versus 72.7% (essentially identical). The meaningful gap between the two models appears in scientific reasoning: GPQA Diamond is 74.1% for Sonnet versus 91.3% for Opus, a 17.2-point difference that favors Opus for tasks in chemistry, biology, advanced physics, and graduate-level problem solving. Against GPT-5.2, Sonnet 4.6 leads clearly on computer use (72.5% versus 38.2% OSWorld) and is comparable on coding (79.6% versus approximately 78% SWE-bench). On GPQA Diamond, GPT-5.2 (73.8%) and Sonnet 4.6 (74.1%) are nearly identical. Gemini 3.1 Pro sits at 80.6% on SWE-bench, slightly ahead of Sonnet 4.6. On MATH, Sonnet 4.6 scores 89%, up sharply from 62% on Sonnet 4.5. The benchmark numbers reported above for Sonnet 4.6 and Opus 4.6 come from independent and vendor sources published in February-March 2026; GPT-5.2 scores are third-party estimates and should be treated with some caution.
Is Claude Sonnet 4.6 open source or proprietary?
Claude Sonnet 4.6 is fully proprietary. Anthropic has not released the model weights, and there is no self-hosted deployment option. Access is API-only through four platforms: the Anthropic API directly (api.anthropic.com), AWS Bedrock (model ID: anthropic.claude-sonnet-4-6), Google Vertex AI (model ID: claude-sonnet-4-6), and Microsoft Foundry. On Bedrock, the model supports global cross-region routing and geo cross-region routing across US, EU, AU, and JP geographies. On Vertex, global, multi-region, and regional endpoints are available. Authentication uses API keys for the Anthropic API and IAM credentials for Bedrock and Vertex. Regional and multi-region endpoints carry a 10% price premium. SDKs are available in Python, TypeScript, Java, Go, and Ruby; Go and Ruby do not support Microsoft Foundry. Commercial use is governed by Anthropic's Commercial Terms of Service. Teams that require on-device deployment, air-gapped inference, or the ability to fine-tune the base weights should instead evaluate open-weights models such as Meta's Llama 4 or Mistral's open-licensed models.
What modalities does Claude Sonnet 4.6 support?
Claude Sonnet 4.6 accepts text, images, PDFs, and tool-calls as input, and produces text and tool-calls as output. Vision is fully live: the model processes images and PDFs natively in the API without external OCR preprocessing. Up to 600 images or PDFs can be included in a single request, with individual images up to 8,000x8,000 pixels. There is no native audio input or output; teams building voice products must add a separate ASR layer for transcription and a TTS layer for speech synthesis. Video input is not supported. Function calling uses Anthropic's standard tool_use schema with support for parallel tool calls, structured JSON output, and the tool_choice parameter for enforcing specific tool invocation. Computer use is supported via a dedicated tool, enabling GUI navigation, spreadsheet manipulation, and multi-step web form completion, with a 72.5% OSWorld score. Adaptive thinking provides chain-of-thought reasoning with configurable effort levels. Compared to Google Gemini 3.1 Pro, which supports native audio and video input, Sonnet 4.6's modality coverage is narrower, but its computer use capability is significantly stronger.
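The tool_use schema described above can be illustrated with a minimal tool definition. The name/description/input_schema structure and the tool_choice parameter follow Anthropic's documented conventions; the tool itself (get_ticket_status) is hypothetical, invented for this example:

```python
# Minimal tool definition in Anthropic's tool_use schema. The tool name and
# fields are illustrative; input_schema is standard JSON Schema.

get_ticket_status_tool = {
    "name": "get_ticket_status",
    "description": "Look up the current status of a support ticket by ID.",
    "input_schema": {
        "type": "object",
        "properties": {
            "ticket_id": {
                "type": "string",
                "description": "Ticket identifier, e.g. 'TCK-1042'.",
            }
        },
        "required": ["ticket_id"],
    },
}

# Forcing the model to invoke this specific tool uses the tool_choice parameter:
request_fragment = {
    "tools": [get_ticket_status_tool],
    "tool_choice": {"type": "tool", "name": "get_ticket_status"},
}
```

When the model decides to call the tool, the response carries a tool_use content block whose input conforms to this JSON Schema; the caller executes the tool and returns a tool_result block on the next turn.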
Does Claude Sonnet 4.6 train on user data?
Anthropic does not train Claude Sonnet 4.6 on API inputs by default. API inputs and outputs are retained for up to 30 days for safety and abuse monitoring, and then deleted unless a specific retention flag is applied. Enterprise customers can request a zero-retention arrangement, in which inputs are not stored after processing. This zero-retention option is available through both the direct Anthropic API and AWS Bedrock. Anthropic holds SOC 2 Type II certification, is HIPAA-eligible for qualifying use cases, and is GDPR-compliant. The EU AI Act classifies Sonnet 4.6 as a general-purpose AI with systemic risk obligations. Anthropic's full compliance documentation is available at anthropic.com/transparency. On AWS Bedrock, data residency and governance controls are managed separately at the infrastructure level. On Google Vertex AI, data handling follows Google's enterprise data processing addendum. For US-only inference routing via the inference_geo parameter on the direct API, a 1.1x pricing multiplier applies on all token categories.
Who is Claude Sonnet 4.6 best for and who should avoid it?
Claude Sonnet 4.6 is the best choice for teams running agentic coding loops at scale: its 79.6% SWE-bench Verified score is near-Opus quality at 60% of Opus pricing, and the 1M-token context window enables full-codebase operations without chunking. It is also the leading model for computer use and GUI automation pipelines, with a 72.5% OSWorld score that far exceeds GPT-5.2's 38.2%. Teams doing enterprise document comprehension or long-document RAG benefit from the 1M context at no premium and from Sonnet 4.6's near-Opus score on OfficeQA. Teams that should avoid Sonnet 4.6 include those requiring deep scientific reasoning: the 17-point GPQA gap versus Opus 4.6 (74.1% vs 91.3%) means Opus is the correct choice for chemistry, biology, and graduate-level problem solving. Real-time voice applications are ruled out by the absence of native audio I/O; teams should look at models with native audio support such as GPT-4o Audio. Teams with air-gapped or on-device deployment requirements should evaluate open-weights models like Llama 4 or Mistral, since Sonnet 4.6 is API-only with no self-hosting path.