DeepSeek V4 Flash on VM0. The cheapest model

The model with the lowest multiplier on VM0. At ×0.02 credits, you get a 1M-token context for one fiftieth of the Sonnet cost.

1M tokens · Text / Code · Prompt cache

Using DeepSeek V4 Flash on VM0

DeepSeek V4 Flash is the stripped-down variant of V4 Pro, released on April 24, 2026. It shares the 1M-token context window and the open-source license with V4 Pro, but offers reduced reasoning depth at an extremely low price.

List price is $0.14/$0.28 per 1M tokens (input/output), with cached input at $0.028 per 1M. Cache writes are free. On VM0, the ×0.02 multiplier is the lowest in the entire catalogue: steps cost one fiftieth of Sonnet 4.6.

What is DeepSeek V4 Flash?

April 24, 2026 · Lightweight member of the DeepSeek V4 family. Optimized for minimal cost.

DeepSeek V4 Flash is the cost leader in DeepSeek's V4 generation, released April 24, 2026 alongside V4 Pro. Where V4 Pro is positioned for reasoning, Flash is positioned for the absolute lowest unit cost: a model you can run at very high volumes without thinking about budget.

Flash is a 284B-parameter MoE with 13B active per token (vs Pro's 1.6T / 49B). Both share the V4 family's identical feature set: 1M-token context, 384K maximum output, three reasoning effort modes, JSON output, and tool calls.

On VM0 it carries a ×0.02 credit multiplier, the lowest in the entire Built-in catalogue. That makes it the default for bulk classification, tagging, extraction, and pre-filter workloads where the prompt does most of the work and the model just needs to follow instructions reliably. It shares the V4 family's free cache-write economics: only cache reads bill.

Specs at a glance

Family: DeepSeek V4 family

Parameters: 284B total / 13B active (MoE)

Modalities: Text, code

Languages: Multilingual

Context window: 1M tokens

Max output: 8K tokens

License: Open source (MIT License)

Available on VM0: April 24, 2026

DeepSeek V4 Flash Benchmarks

Vendor-reported scores from DeepSeek's V4 release. Flash matches Pro on simpler benchmarks but loses ground on multi-step tool use (Terminal-Bench) and factual recall (SimpleQA), which is what you'd expect from the smaller MoE.

SWE-bench Verified: 79.0% (vendor-reported; within 1.6 points of V4 Pro)

Terminal-Bench 2.0: 56.9% (vendor-reported; trails V4 Pro by 11 points)

SimpleQA-Verified: 34.1% (vendor-reported; trails V4 Pro)

DeepSeek V4 Flash pricing

Vendor list price, per 1M tokens.

Input: $0.14

Output: $0.28

Cache read: $0.028

Cache write: Not billed
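
The list prices above translate into a per-request estimate with simple arithmetic. The following is a minimal sketch, not VM0's or DeepSeek's billing code: the function name `list_price_usd` and the assumption that cached input tokens bill at the cache-read rate instead of the input rate are illustrative.

```python
import math

# Vendor list prices quoted above, in USD per 1M tokens.
INPUT_PER_M = 0.14
OUTPUT_PER_M = 0.28
CACHE_READ_PER_M = 0.028  # cache writes are not billed


def list_price_usd(input_tokens: int, output_tokens: int, cached_tokens: int = 0) -> float:
    """Estimate the vendor list price of one request.

    Assumption: `cached_tokens` is the part of the input served from the
    prompt cache, and it bills at the cache-read rate instead of the input rate.
    """
    if min(input_tokens, output_tokens, cached_tokens) < 0:
        raise ValueError("token counts must be non-negative")
    if cached_tokens > input_tokens:
        raise ValueError("cached_tokens cannot exceed input_tokens")
    fresh_tokens = input_tokens - cached_tokens
    total = (
        fresh_tokens * INPUT_PER_M
        + cached_tokens * CACHE_READ_PER_M
        + output_tokens * OUTPUT_PER_M
    )
    return total / 1_000_000


if __name__ == "__main__":
    # A short classification call: 2,000 input tokens, 100 output tokens.
    print(f"uncached: ${list_price_usd(2_000, 100):.6f}")
    # The same call with 1,500 of the input tokens read from cache.
    print(f"cached:   ${list_price_usd(2_000, 100, cached_tokens=1_500):.6f}")
```

These figures are vendor list prices only. On the Built-in path, VM0 converts usage into credits, so the number on a VM0 invoice will differ.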

How DeepSeek V4 Flash behaves in practice

Observed behavior from production agent runs.

Cost

By far the lowest cost in the Built-in lineup. The right pick whenever unit cost dominates the decision.

Single-shot accuracy

Good when the prompt is explicit and the task fits in one or two turns. Drops noticeably when asked to plan, branch, and remember across many steps.

Multi-step tool use

Vendor-reported Terminal-Bench 2.0 is 56.9% (vs V4 Pro's 67.9%), which puts Flash meaningfully behind on complex multi-step tool flows. Don't put V4 Flash in a planner role.

Context window

1M tokens. Same as V4 Pro and far larger than Anthropic Haiku (200K).

Best agent tasks for DeepSeek V4 Flash

The classifier that runs on every record without flinching

Tag a million tickets by category, route inbound forms to the right team, score every review on the dimensions that matter. Per-record cost on Flash is fractions of a cent, which is what makes "classify everything as it arrives" workflows actually sustainable instead of getting throttled to a sample.

The pre-filter in front of a stronger model

Run V4 Flash on every record first, then route the top 5% (or the cases Flash isn't confident about) up to V4 Pro or Sonnet 4.6. Two-stage pipelines beat single-model pipelines on total cost almost every time — Flash handles the easy 95%, the stronger model only sees the hard 5%, and your bill scales with reasoning need rather than total volume.
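
The cost argument for the two-stage pattern can be checked with the credit multipliers quoted on this page. The sketch below is illustrative only: it assumes one Flash step and one escalated step per record are each comparable to a single Sonnet 4.6 step, and the confidence threshold in `should_escalate` is a hypothetical value, not a VM0 setting.

```python
import math

# Credit multipliers quoted on this page (Sonnet 4.6 is the x1 baseline).
FLASH = 0.02
V4_PRO = 0.3
SONNET_46 = 1.0


def blended_multiplier(escalation_rate: float, strong: float, cheap: float = FLASH) -> float:
    """Average per-record multiplier when every record goes through the cheap
    model and a fraction `escalation_rate` is re-run on the strong model."""
    if not 0.0 <= escalation_rate <= 1.0:
        raise ValueError("escalation_rate must be between 0 and 1")
    return cheap + escalation_rate * strong


def should_escalate(confidence: float, threshold: float = 0.8) -> bool:
    """Hypothetical routing rule: escalate when the first-stage result is
    below a confidence threshold chosen by the pipeline owner."""
    return confidence < threshold


if __name__ == "__main__":
    for name, strong in (("V4 Pro", V4_PRO), ("Sonnet 4.6", SONNET_46)):
        two_stage = blended_multiplier(0.05, strong)
        print(f"{name}: single-model x{strong:.3f}, two-stage x{two_stage:.3f}")
```

Under these assumptions, escalating 5% of records to Sonnet 4.6 averages ×0.07 per record instead of ×1. The saving shrinks as the escalation rate grows, so the threshold is worth tuning against real traffic.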

The bulk-extraction job that pulls structured data from anywhere

Email backlogs, PDFs, meeting transcripts, scanned invoices: anywhere there's a fixed system prompt asking for the same JSON shape. Flash bills cache reads but not cache writes, so the long fixed prefix that defines the output schema is billed at the full input rate once and at the cache-read rate ($0.028 per 1M tokens, one fifth of the input rate) for the rest of the batch. Only the document itself and the output are billed at full price on each call.
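
How much the cached prefix saves depends on the ratio of prefix length to document length. The following sketch estimates batch cost at list price under stated assumptions: the first request pays the full input rate for the prefix, every later request gets a cache hit on the whole prefix, and cache entries never expire during the batch. The function name and the example token counts are hypothetical.

```python
import math

# Vendor list prices quoted on this page, in USD per 1M tokens.
INPUT_PER_M = 0.14
OUTPUT_PER_M = 0.28
CACHE_READ_PER_M = 0.028  # cache writes are not billed


def batch_cost_usd(n_docs: int, prefix_tokens: int, doc_tokens: int,
                   output_tokens: int, cached: bool = True) -> float:
    """Estimate list-price cost of extracting from `n_docs` documents that
    share one fixed prompt prefix.

    Assumption: with caching, only the first request pays the input rate for
    the prefix; all later requests read it from cache.
    """
    if n_docs <= 0:
        return 0.0
    # Document text and output are never cached, so they bill on every call.
    per_doc = doc_tokens * INPUT_PER_M + output_tokens * OUTPUT_PER_M
    first_prefix = prefix_tokens * INPUT_PER_M
    later_rate = CACHE_READ_PER_M if cached else INPUT_PER_M
    later_prefix = (n_docs - 1) * prefix_tokens * later_rate
    return (n_docs * per_doc + first_prefix + later_prefix) / 1_000_000


if __name__ == "__main__":
    # Hypothetical batch: 10,000 documents, 2,000-token schema prefix,
    # 1,000-token documents, 200-token JSON outputs.
    with_cache = batch_cost_usd(10_000, 2_000, 1_000, 200, cached=True)
    without_cache = batch_cost_usd(10_000, 2_000, 1_000, 200, cached=False)
    print(f"with cache:    ${with_cache:.2f}")
    print(f"without cache: ${without_cache:.2f}")
```

The longer the shared prefix is relative to each document, the larger the share of the bill that drops to the cache-read rate.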

The long-document one-shot Q&A

Drop a whole book, a 200-page contract, or a codebase into the 1M-token context window and ask a single targeted question. Flash answers in one shot, and even a prompt that fills the entire window costs about $0.14 in input at list price. That is cheap enough to answer "does this document mention X?" across long documents at scale, which is one of the workflows agentic loops genuinely don't help with.

When to skip DeepSeek V4 Flash

Skip V4 Flash on multi-step agent loops where it drifts on long tool chains, and on hard reasoning, code edits, or planner roles where V4 Pro or Sonnet 4.6 is the right call.

DeepSeek V4 Flash vs other models

DeepSeek V4 Flash vs DeepSeek V4 Pro

Same vendor; V4 Pro (×0.3) does the reasoning, V4 Flash (×0.02) does the volume. The classic split: Flash as the pre-filter, Pro as the escalator. Vendor-reported SWE-bench Verified is within 1.6 points (79.0 vs 80.6); Terminal-Bench 2.0 favours Pro by 11 points (67.9 vs 56.9).

DeepSeek V4 Flash vs Claude Haiku 4.5

Haiku 4.5 (×0.3) is more reliable on multi-tool routing and faster on interactive flows. V4 Flash (×0.02) wins on raw cost and context size. Pick Flash for batch jobs; pick Haiku for interactive Slack-style replies.

DeepSeek V4 Flash vs MiniMax M2.7

M2.7 (×0.1) is stronger on multilingual reasoning and has a 50-minute timeout for long thinking. V4 Flash (×0.02) is faster and far cheaper for single-shot work.

Verdict: should you use DeepSeek V4 Flash?

DeepSeek V4 Flash is the model for high-volume background work where cost per thousand steps is the deciding metric. At ×0.02 credits, you can use it for bulk operations without blowing the budget.

Frequently asked questions

When was DeepSeek V4 Flash released?

DeepSeek released V4 Flash and V4 Pro together on April 24, 2026 under the MIT License with open weights.

Should I run my entire agent on V4 Flash?

Probably not. Flash is great at one-shot tasks but drifts on long multi-step loops (vendor-reported Terminal-Bench 2.0 is 11 points behind V4 Pro). The standard pattern is to use it as a pre-filter and escalate the hard cases to V4 Pro or Sonnet 4.6.

Are cache writes really free?

Yes. DeepSeek doesn't bill the cache-write portion. Only cache reads bill, at $0.028 per 1M tokens.

Is V4 Flash open-source?

Yes. Weights are published under the MIT License (284B total / 13B active MoE). The hosted DeepSeek API is the production path for VM0.

What's V4 Flash's context window?

1 million tokens. Identical to V4 Pro. Useful for long-document one-shot Q&A even at the cheapest tier.

Alternatives

DeepSeek V4 Pro

Deeper reasoning for important steps (×0.3 credits).

Claude Haiku 4.5

Better Anthropic integration at a higher cost (×0.3 credits).

MiniMax M2.7

Using DeepSeek V4 Flash on VM0

Two ways to use DeepSeek V4 Flash on VM0

VM0 supports DeepSeek V4 Flash as a Built-in model billed in VM0 credits, and through Bring-your-own with a DeepSeek API key. The Built-in path uses VM0 Managed Routing and the credit multiplier explained below; the Bring-your-own path bills directly with the upstream provider and skips the VM0 credit conversion.

VM0's recommendation

VM0 positions DeepSeek V4 Flash as a cost-saving option rather than a core agent model. Use it to optimize unit cost on non-core work such as bulk classification, pre-filtering, latency-critical short replies, or pinned legacy agents, while Claude Opus 4.7, Claude Opus 4.6, or Claude Sonnet 4.6 handle the decisive steps.

Credits and the ×0.02 multiplier

Every Built-in model on VM0 is priced as a multiple of Claude Sonnet 4.6, which forms the ×1 credit baseline. DeepSeek V4 Flash bills at ×0.02 credits. The multiplier appears on your VM0 invoice; the vendor list price in the pricing table above is what the upstream provider charges before VM0 converts it into credits.

At ×0.02, a step on Flash costs 0.02 times the credits of an equivalent step on Sonnet 4.6. That makes it the cheapest tier in the Built-in catalogue and the obvious choice when unit cost is decisive and the workload consists mostly of single-shot tasks.
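
The multiplier arithmetic can be sketched in a few lines. This is an illustration under assumptions, not VM0's billing formula: the model keys are hypothetical identifiers, the multipliers are the ones quoted on this page, and `baseline_credits_per_step` is a placeholder for whatever an equivalent Sonnet 4.6 step costs in your workload.

```python
import math

# Credit multipliers quoted on this page; the keys are hypothetical identifiers.
MULTIPLIERS = {
    "claude-sonnet-4.6": 1.0,   # x1 baseline
    "deepseek-v4-pro": 0.3,
    "claude-haiku-4.5": 0.3,
    "minimax-m2.7": 0.1,
    "deepseek-v4-flash": 0.02,
}


def credits_for_steps(steps: int, model: str, baseline_credits_per_step: float = 1.0) -> float:
    """Credits for `steps` steps on `model`, assuming each step is equivalent
    to a Sonnet 4.6 step that costs `baseline_credits_per_step` credits."""
    if steps < 0:
        raise ValueError("steps must be non-negative")
    return steps * baseline_credits_per_step * MULTIPLIERS[model]


def cheapest_model() -> str:
    """Return the model with the lowest multiplier in the table above."""
    return min(MULTIPLIERS, key=MULTIPLIERS.get)


if __name__ == "__main__":
    for model in sorted(MULTIPLIERS, key=MULTIPLIERS.get):
        print(f"{model:20s} {credits_for_steps(1_000, model):8.1f} credits per 1,000 steps")
```

With a baseline of one credit per Sonnet 4.6 step, 1,000 equivalent steps on Flash come to 20 credits.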

Available on VM0 since April 24, 2026.
