Name: DeepSeek V4 Pro: 80.6% SWE-bench, 1M Context, MIT 2026
Brand: DeepSeek
Price: 1.74 USD
Availability: InStock

Question 1

What is DeepSeek-V4-Pro and who built it?

Accepted Answer

DeepSeek-V4-Pro is the flagship model from DeepSeek, a Chinese AI research company founded in 2023 by High-Flyer hedge fund CEO Liang Wenfeng. Released April 24, 2026 as a preview under the MIT license, it is a Mixture-of-Experts transformer with 1.6 trillion total parameters and 49 billion activated per token, pre-trained on 33 trillion tokens. The hybrid attention architecture combines Compressed Sparse Attention and Heavily Compressed Attention, requiring only 27% of the inference FLOPs of its predecessor DeepSeek-V3.2 at 1M-token context. In Think Max mode it scores 80.6% on SWE-bench Verified (within 0.2 points of Claude Opus 4.6), 90.1% on GPQA Diamond, and holds a Codeforces rating of 3,206, the highest of any publicly ranked model as of May 2026. The model was trained entirely on Huawei Ascend 950PR chips, the first frontier-scale model to complete training on Huawei silicon. V4-Pro sits at the top of DeepSeek's lineup above V4-Flash (284B total, 13B active), the faster companion for throughput-sensitive workloads. Standard API pricing is $1.74 per 1M input tokens and $3.48 per 1M output tokens with a 1M-token context window and 384K max output.

Question 2

How much does DeepSeek-V4-Pro cost per 1M tokens?

Accepted Answer

Standard list pricing on the DeepSeek API is $1.74 per 1M cache-miss input tokens and $3.48 per 1M output tokens. A 75% promotional discount is active until May 31, 2026, reducing rates to $0.435 per 1M input and $0.87 per 1M output. Cache-hit input pricing is $0.145 per 1M at standard rates ($0.003625/M on promo), after a 1/10 reduction applied April 26, 2026. A daily agentic coding loop processing 1M input and 200K output costs $1.74 plus $0.70 = $2.44/day at standard rates. A customer-support deployment at 1,000 turns of 2K input and 500 output per turn costs $3.48 plus $1.74 = $5.22/day. Compared to Claude Opus 4.7 ($15/M output) and GPT-5.5 ($25/M output), V4-Pro output tokens are 4-7x cheaper at list price on comparable coding benchmarks. Self-hosted deployments on open weights are free beyond infrastructure; a 4-node H200 141GB GPU cluster is the minimum practical hardware. The DeepSeek API grants 5 million free tokens to every new account.

Question 3

What is DeepSeek-V4-Pro's context window and max output?

Accepted Answer

DeepSeek-V4-Pro supports a context window of 1,048,576 tokens (one million tokens) with a maximum output of 384,000 tokens per request. The Think Max reasoning mode requires at least 384K tokens of context budget to operate at maximum reasoning depth. At 1M-token context, the hybrid CSA+HCA attention uses only 27% of single-token inference FLOPs versus DeepSeek-V3.2 and 10% of V3.2's KV cache, via HCA's 128x sequence compression before dense attention. No independent third-party needle-in-haystack evaluation at 1M depth has been published as of May 2026; efficiency claims are from DeepSeek's April 27, 2026 model card. Among current frontier models, Gemini 3.1 Pro also provides 1M-token context; GPT-5.4 supports 128K by default with a 1M-token preview tier; Claude Opus 4.6 maxes out at 200K with 64K max output. The 384K max output per request exceeds Claude's 64K limit, making V4-Pro well suited for generating large code files or full document drafts in a single API call. PDF and multi-file inputs are handled via standard text tokenization; there is no native document parser distinct from the text context window.

Question 4

How does DeepSeek-V4-Pro compare on benchmarks vs Claude Opus 4.6?

Accepted Answer

On SWE-bench Verified (real-world GitHub issue resolution), DeepSeek-V4-Pro in Think Max mode scores 80.6% versus Claude Opus 4.6 at approximately 80.8%, a gap under 0.3 percentage points. On GPQA Diamond (graduate-level scientific reasoning), DeepSeek-V4-Pro scores 90.1%, while Gemini 3.1 Pro leads at 94.3%; Claude Opus 4.6 falls between these two. On Humanity's Last Exam, DeepSeek-V4-Pro reaches 37.7% versus Claude Opus 4.6's 40.0%, a 2.3-point gap on the hardest evaluation published as of May 2026. On MMLU-Pro, DeepSeek-V4-Pro scores 87.5% against Gemini 3.1 Pro's 91.0%, with Claude in the same range. For competitive coding, V4-Pro leads with a Codeforces rating of 3,206 versus GPT-5.4 at 3,168 and Gemini 3.1 Pro at 3,052, and achieves 93.5 on LiveCodeBench. V4-Pro leads on competitive coding and math olympiad benchmarks; Claude and Gemini lead on the hardest scientific reasoning tasks (GPQA Diamond, HLE); the gap on most benchmarks stays within 5 percentage points, with output cost being the clearest differentiator at $3.48/M versus $15/M. All V4 benchmark scores are vendor-reported at this stage; independent third-party reproductions are still being published.

Question 5

Is DeepSeek-V4-Pro open source or proprietary?

Accepted Answer

DeepSeek-V4-Pro is fully open-source under the MIT license, with weights available at huggingface.co/deepseek-ai/DeepSeek-V4-Pro since April 24, 2026. The MIT license imposes zero restrictions on commercial use, fine-tuning, redistribution, or modification of the weights. Four model variants are available: DeepSeek-V4-Pro-Base (raw pretrained), DeepSeek-V4-Pro-Instruct (post-trained with SFT and GRPO), and equivalent Flash variants. Official weights use FP4 plus FP8 mixed precision with the full V4-Pro model occupying approximately 865GB. Community GGUF quantizations (Q4_K_M, Q2_K) are available via Unsloth; Q2_K is approximately 400GB in compressed format, so self-hosting V4-Pro remains a multi-GPU proposition. Self-hosting requires a multi-GPU cluster of 4x H200 141GB at minimum; for V4-Flash, a single H200 141GB is feasible. The open weights mean any party can remove or alter DeepSeek's safety training without restriction, a categorically different risk profile from API-only closed models. V4-Flash is the practical self-hosted alternative for teams without full cluster access.

Question 6

What modalities does DeepSeek-V4-Pro support?

Accepted Answer

At launch in April 2026, DeepSeek-V4-Pro supports text input and text output only; there is no native image, audio, or video processing in the preview release. Confirmed output modalities are text and tool-calls via function calling. The model supports tool use and function calling through both an OpenAI ChatCompletions-compatible interface and an Anthropic-compatible API, with JSON mode confirmed across all eight current hosting providers. Parallel tool calls are supported, enabling multi-step agentic loops where the model issues multiple function calls in a single response. The three reasoning modes (Non-think, Think High, Think Max) are toggled per request through API parameters, with Think Max requiring a 384K-token context budget. DeepSeek has indicated that multimodal support (image input) is in development, potentially as a V4 Vision extension; no release date has been announced as of May 2026. Computer use and web browsing are not available in the April 2026 preview; agentic computer interaction requires pairing V4-Pro with a separate browser or shell execution layer.

Question 7

Does DeepSeek-V4-Pro train on user data?

Accepted Answer

The direct DeepSeek API does not publish a data retention or training-on-inputs policy equivalent to what Anthropic or OpenAI disclose for their enterprise tiers. DeepSeek's April 27, 2026 model card states that sensitive personal information, credit card numbers, and identification data are excluded from training data sources, but does not specify API input retention periods or opt-out mechanisms. The direct DeepSeek API is not certified under SOC 2 Type II, ISO 27001, or HIPAA; organizations with compliance requirements should access V4-Pro through AWS Bedrock, Google Vertex AI, or Azure AI Foundry where platform-level certifications apply. On Bedrock and Vertex, API inputs are governed by AWS and Google's DPAs, not DeepSeek's, providing EU data residency and contractual data handling guarantees. Because V4-Pro weights are MIT-licensed, self-hosting provides maximum data privacy: inference stays entirely on the team's own infrastructure with zero external data transmission. The EU AI Act classification for V4-Pro has not been formally assessed; at 1.6T parameters it likely meets the GPAI systemic risk threshold. Teams handling HIPAA-regulated data must confirm their chosen cloud provider has a BAA covering DeepSeek V4 Pro before deploying.

Question 8

Who is DeepSeek-V4-Pro best for and who should avoid it?

Accepted Answer

DeepSeek-V4-Pro is the strongest open-weight choice for agentic coding teams, with 80.6% SWE-bench Verified (Think Max) and Codeforces rating 3,206, the highest published for any model as of May 2026. Cost-sensitive teams needing frontier-grade reasoning benefit most from $1.74/$3.48 per 1M token pricing, roughly 4-7x cheaper on output than Claude Opus 4.7 or GPT-5.5. Open-source teams requiring fine-tuning, private inference, or air-gapped deployment should prioritize V4-Pro over any closed model given the MIT license and downloadable weights. Teams building vision-first or multimodal applications should not use V4-Pro at launch (text-only in the April 2026 preview); use GPT-5.4, Gemini 3.1 Pro, or Claude Opus 4.6 for image understanding. Voice-first products and real-time speech applications are ruled out by the absence of native audio I/O. Teams requiring HIPAA compliance or SOC 2 Type II on the API endpoint should use AWS Bedrock or Azure AI Foundry rather than the direct DeepSeek API. For the hardest scientific reasoning tasks (GPQA Diamond 94.3%), Gemini 3.1 Pro currently leads V4-Pro by 4.2 points; teams where that gap matters should prefer Gemini or Claude. Organizations with EU data sovereignty requirements should verify their chosen provider's data residency options before deploying.

DeepSeek V4 Pro: 80.6% SWE-bench, 1M Context, MIT 2026

About DeepSeek-V4-Pro

Pricing

Key Features

Pros

Cons

Benchmarks

Frequently Asked Questions