
DeepSeek V4 Pro on VM0. Cost-optimised reasoning

DeepSeek's flagship V4 reasoning model. Within 0.2 points of Claude Opus 4.6 on SWE-bench Verified at one-seventh the vendor cost. Claude-compatible API.

1M tokens · Text / Code · Prompt cache

DeepSeek V4 Pro is the flagship of DeepSeek's V4 generation — an open-weight 1.6T-parameter MoE under the MIT license. The headline is the price-to-quality ratio: vendor-reported SWE-bench Verified is 80.6%, within a fraction of a point of Claude Opus 4.6, at roughly one-seventh of Anthropic's vendor cost. That makes reasoning-heavy agents — bulk PR review, batch document analysis, scheduled summarisation — affordable at high volume.

Vendor list price is $1.74 / $3.48 per 1M tokens, with cache reads at $0.14 / 1M and free cache writes (unique in the lineup). 1M-token context, Anthropic-compatible API. Reach for Sonnet 4.6 when production tool-routing reliability is the deciding factor, and for V4 Flash when single-shot bulk work justifies a 12× cheaper model.

What is DeepSeek V4 Pro?

April 24, 2026 · Reasoning variant of the DeepSeek V4 family. Paired with V4 Flash for cost.

DeepSeek V4 Pro is the flagship of DeepSeek's V4 generation, released April 24, 2026 under the MIT License. It's an open-weight Mixture-of-Experts model with 1.6T total parameters and 49B active per token, paired with V4 Flash (284B / 13B active) for cost-sensitive work.

Both V4 models share an identical feature set: 1M-token context window, 384K maximum output, three reasoning effort modes (standard, think, think-max), JSON output, tool calls, and FIM completion in non-think mode. The Pro model adds a hybrid attention architecture (Compressed Sparse Attention + Heavily Compressed Attention) for dramatically improved long-context efficiency: 27% of the single-token inference FLOPs and 10% of the KV cache of DeepSeek V3.2 at 1M context.
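As a concrete sketch of how an effort mode might be selected per call, here is a minimal request builder. The field names (`reasoning_effort`, `max_tokens`) and the model identifier `deepseek-v4-pro` are illustrative assumptions, not confirmed parameters of DeepSeek's API; only the three mode names and the 384K output ceiling come from this page.

```python
# Hypothetical request payload for DeepSeek V4's three effort modes.
# Field names and the model id are assumptions for illustration only.

V4_EFFORT_MODES = {"standard", "think", "think-max"}

def build_request(prompt: str, effort: str = "standard") -> dict:
    """Build a chat-style request dict for one of the three V4 effort modes."""
    if effort not in V4_EFFORT_MODES:
        raise ValueError(f"unknown effort mode: {effort!r}")
    return {
        "model": "deepseek-v4-pro",   # assumed model identifier
        "max_tokens": 384_000,        # V4's documented output ceiling
        "reasoning_effort": effort,   # assumed parameter name
        "messages": [{"role": "user", "content": prompt}],
    }

request = build_request("Summarise this diff.", effort="think")
```

Escalation between modes then becomes a one-line change at the call site rather than a different model deployment.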

DeepSeek made waves through 2025 by delivering Anthropic-grade reasoning at a fraction of the price. V4 Pro continues that pattern: vendor-reported SWE-bench Verified 80.6% sits within 0.2 points of Claude Opus 4.6, at roughly one-seventh the vendor cost. On VM0 it's exposed via the DeepSeek API-key provider and on VM0 Managed at ×0.3, the same multiplier as Claude Haiku 4.5 but with substantially stronger reasoning behaviour.

What's notable about DeepSeek V4 Pro

Headline architecture and capability features.

V4 Pro is a Mixture-of-Experts model with 1.6T total parameters and 49B active per token, fronted by a hybrid attention stack (Compressed Sparse Attention plus Heavily Compressed Attention) that keeps long-context inference cheap. It supports a 1M-token context window with 384K of maximum output, three reasoning effort modes (standard, think, and think-max), and uses Manifold-Constrained Hyper-Connections for stable signal propagation. The model was trained on 32T+ tokens with the Muon optimizer and is released under the MIT License with open weights.

Specs at a glance

Family: DeepSeek V4 series
Parameters: 1.6T total / 49B active (MoE)
Modalities: Text, code
Languages: Multilingual
Context window: 1M tokens
Max output: 384K tokens
License: MIT (open weights)
Available on VM0: April 24, 2026

DeepSeek V4 Pro benchmarks

Vendor-reported scores from DeepSeek's V4 Pro release. Independent reviews (Geeky Gadgets, Code Arena) place V4 Pro third on Code Arena behind GLM-5.1 and Kimi K2.6. The strongest benchmark claims come from DeepSeek's own materials; treat them directionally rather than as absolute truth.

SWE-bench Verified: 80.6% (vendor-reported; within 0.2 pts of Opus 4.6)
Terminal-Bench 2.0: 67.9% (vendor-reported; leads Opus 4.6)
LiveCodeBench: 93.5% (vendor-reported)
Codeforces rating: 3206 (vendor-reported)
MMLU-Pro: matches GPT-5.4 (vendor-reported)
Artificial Analysis Intelligence Index: 52 (max effort)
Speed: ~36 tokens/sec (Artificial Analysis)

DeepSeek V4 Pro pricing

Provider list price, per 1M tokens.

Input: $1.74
Output: $3.48
Cache read: $0.14
Cache write: Not billed
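The list prices above translate into per-call dollar cost with straightforward arithmetic. A minimal sketch, assuming the prices exactly as listed (cache writes billed at zero):

```python
# Per-request cost for DeepSeek V4 Pro at vendor list price:
# $1.74 in / $3.48 out per 1M tokens, $0.14 per 1M cache reads,
# cache writes not billed.

PRICE_PER_M = {
    "input": 1.74,
    "output": 3.48,
    "cache_read": 0.14,
    "cache_write": 0.0,   # DeepSeek does not bill cache writes
}

def request_cost(input_tokens: int, output_tokens: int,
                 cache_read_tokens: int = 0,
                 cache_write_tokens: int = 0) -> float:
    """Dollar cost of one call; cached tokens bill at the read rate only."""
    return (
        input_tokens * PRICE_PER_M["input"]
        + output_tokens * PRICE_PER_M["output"]
        + cache_read_tokens * PRICE_PER_M["cache_read"]
        + cache_write_tokens * PRICE_PER_M["cache_write"]
    ) / 1_000_000

# 100K fresh input + 4K output comes to roughly $0.188.
cost = request_cost(input_tokens=100_000, output_tokens=4_000)
```

The same helper makes the cache effect visible: moving a prefix from the input column to the cache-read column cuts its rate from $1.74 to $0.14 per million tokens.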

How DeepSeek V4 Pro behaves in practice

Observed behaviour from production agent runs.

Reasoning

Strongest sub-Sonnet reasoning in our lineup. Holds up on multi-step work where cheaper models start to drift. Vendor-reported MMLU-Pro matches GPT-5.4.

Coding benchmarks

Vendor-reported SWE-bench Verified 80.6% (within 0.2 of Opus 4.6), Terminal-Bench 2.0 67.9% (leads Opus 4.6), LiveCodeBench 93.5%.

Cost efficiency

The standout property. ×0.3 credit cost with reasoning that competes well with Sonnet 4.6 makes V4 Pro the cost-optimisation default. ~7× cheaper than Claude Opus 4.7.

Cache economics

Cache writes are free, which is unique among VM0's Built-in models. Stable system prompts and large pasted reference docs cost nothing extra to cache; only the read side bills.

Speed

Around 36 tokens/sec at max effort per Artificial Analysis. Slower than Haiku, slightly slower than Opus 4.6.

Best agent tasks for DeepSeek V4 Pro

The PR-review agent that runs on every commit

Sonnet-tier accuracy at roughly one-third of Sonnet's vendor cost is what makes "review every commit, not just the big PRs" actually viable. V4 Pro reads the diff, the related files, and the linked issue, then writes a structured comment — and the per-call price is low enough that running it as a CI step on every push doesn't show up as a noticeable line item.

The scheduled summariser that runs every night

Pulls yesterday's customer conversations, support tickets, or sales calls and writes a digest. The system prompt and tool schema don't change between runs, and DeepSeek doesn't bill cache writes — so the long fixed prefix is paid for once and cached reads cost a fraction of normal input. This is where V4 Pro's pricing model genuinely changes what's affordable.
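To make "genuinely changes what's affordable" concrete, here is a back-of-envelope sketch using the list prices from the pricing section ($1.74 / 1M fresh input, $0.14 / 1M cache reads, writes free). The 200K-token prefix size is a hypothetical example, not a figure from this page:

```python
# Monthly cost of a fixed prompt prefix for a nightly summariser,
# with and without prompt caching, at DeepSeek V4 Pro list prices.

PREFIX_TOKENS = 200_000   # hypothetical fixed system prompt + tool schema
RUNS_PER_MONTH = 30

FRESH_INPUT_PER_M = 1.74  # $ per 1M uncached input tokens
CACHE_READ_PER_M = 0.14   # $ per 1M cached tokens; writes are not billed

fresh = PREFIX_TOKENS * RUNS_PER_MONTH * FRESH_INPUT_PER_M / 1_000_000
cached = PREFIX_TOKENS * RUNS_PER_MONTH * CACHE_READ_PER_M / 1_000_000

# fresh is about $10.44/month; cached is about $0.84/month,
# a roughly 92% reduction on the prefix portion of the bill.
```

Because the write side is free, the break-even point is the first cache hit; on vendors that bill cache writes, the prefix has to be reused more times before caching pays off.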

The whole-repo code agent that costs less than Opus

1M-token context with hybrid attention (Compressed Sparse Attention plus Heavily Compressed Attention) means a mid-sized codebase fits in one prompt and inference cost stays manageable as the window fills up. For cross-file refactors and architecture-level reviews, this is where you get the Opus-style "see everything at once" workflow without the Opus-style invoice.

When to skip DeepSeek V4 Pro

Skip V4 Pro on the hardest tool-routing edge cases where Sonnet 4.6 still leads, and on bulk single-shot work where reasoning isn't required and V4 Flash is roughly 12× cheaper.

DeepSeek V4 Pro vs other models

DeepSeek V4 Pro vs DeepSeek V4 Flash

Same vendor, different positioning. V4 Pro (×0.3) gives you reasoning; V4 Flash (×0.02) gives you the cheapest possible single-shot model. Vendor-reported SWE-bench Verified shows Flash within 1.6 points of Pro (79.0 vs 80.6), but Pro pulls ahead on multi-step tool use: Terminal-Bench 67.9 vs 56.9.

DeepSeek V4 Pro vs Claude Sonnet 4.6

Sonnet 4.6 (×1) wins on tool-routing edge cases and English-language reasoning. V4 Pro (×0.3) wins on cost and is competitive on coding benchmarks (vendor-reported). Worth A/B-testing on a real agent before committing.

DeepSeek V4 Pro vs Kimi K2.6

Same multiplier (×0.3). Kimi has stronger long-context recall and a higher Intelligence Index (54 vs 52); V4 Pro has the better cache economics (free writes) and a 1M context window vs Kimi's 256K. Pick by which property matters more.

Bottom line: should you use DeepSeek V4 Pro?

Pre-filter with V4 Flash, escalate to V4 Pro for reasoning, escalate to Sonnet 4.6 only when V4 Pro stalls on tool-routing edge cases.

Frequently asked questions

When was DeepSeek V4 Pro released?

DeepSeek released V4 Pro and V4 Flash together on April 24, 2026 under the MIT License with open weights.

Why are cache writes free?

DeepSeek doesn't bill the cache-write portion. Only cache reads bill, at $0.14 per 1M tokens. Stable system prompts and large reference contexts cost nothing extra to cache.

What's V4 Pro's context window?

1 million tokens with up to 384K tokens of output. The hybrid attention architecture makes the full window usable at much lower inference cost than V3.2.

How does V4 Pro compare to Claude Opus 4.6?

Vendor-reported SWE-bench Verified is within 0.2 points (80.6 vs 80.8). Terminal-Bench 2.0 favours V4 Pro (67.9 vs 65.4). Opus 4.6 leads on HLE (40.0 vs 37.7) and HMMT 2026 math (96.2 vs 95.2). At ~7× lower vendor cost, V4 Pro is the right call when reasoning quality is the bar but cost matters.

Is V4 Pro open-source?

Yes. Weights are published under the MIT License. The hosted DeepSeek API is the production path for VM0.


Using DeepSeek V4 Pro on VM0

Two ways to access DeepSeek V4 Pro on VM0

VM0 supports DeepSeek V4 Pro as a Built-in model billed in VM0 credits, and through bring-your-own with a DeepSeek API key. The Built-in path uses VM0 Managed routing and the credit multiplier explained below; the bring-your-own path bills you directly with the upstream vendor and skips the VM0 credit conversion entirely.

VM0's recommendation

VM0 positions DeepSeek V4 Pro as a cost-saving option rather than a core agent model. Use it to optimise unit cost on non-core work, such as bulk classification, pre-filters, latency-critical short replies, or pinned legacy agents, while keeping Claude Opus 4.7, Claude Opus 4.6, or Claude Sonnet 4.6 on the steps that decide the run.

Credits and the ×0.3 multiplier

Every Built-in model on VM0 is priced as a multiple of Claude Sonnet 4.6, which sits at the ×1 credit baseline. DeepSeek V4 Pro bills at ×0.3 credits. The multiplier is what shows up on your VM0 invoice; the vendor list price in the pricing table above is what the upstream provider charges before VM0 converts it into credits.

At ×0.3, a step on V4 Pro costs 0.3× the credits of an equivalent step on Sonnet 4.6 (the ×1 baseline). That puts it well below the credit baseline and makes it the natural pick for high-volume background work where cost-per-step matters more than peak reasoning quality.
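The conversion is simple multiplication. A minimal sketch using only the multipliers stated on this page; the per-step credit figure itself is illustrative:

```python
# Credit cost of a step relative to the Sonnet 4.6 ×1 baseline.
# Multipliers are the ones quoted on this page; the baseline credit
# amount per step is a made-up example value.

MULTIPLIERS = {
    "claude-sonnet-4.6": 1.0,    # the ×1 baseline
    "deepseek-v4-pro": 0.3,
    "deepseek-v4-flash": 0.02,
}

def step_credits(model: str, baseline_credits: float) -> float:
    """Credits billed for a step that would cost `baseline_credits` on Sonnet 4.6."""
    return baseline_credits * MULTIPLIERS[model]

# A step worth 10 credits on Sonnet 4.6 costs about 3 on V4 Pro.
```

Note the multiplier applies to the VM0 credit side only; the vendor list prices in the pricing table are what the upstream provider charges before that conversion.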

Available on VM0 since April 24, 2026.