Claude Haiku 4.5 on VM0. Fast, cheap routing

Anthropic's lightweight, high-speed model. Best suited to latency-sensitive agent steps, bulk classification, and cost-sensitive workloads.

200K tokens · Text / Vision / Code · Prompt cache

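Claude Haiku 4.5 is the lightest and fastest model in the Claude 4 family. It is designed for tasks where speed and cost matter more than reasoning depth: bulk classification, pre-filtering, short summaries, and latency-sensitive replies.

At $1 input / $5 output per 1M tokens and ×0.3 credits on VM0, it is the cheapest Anthropic model. The 200K context window is enough for most single-shot tasks.

What is Claude Haiku 4.5?

Available on VM0 since launch · The lightweight member of the Claude 4 family, optimized for speed and cost efficiency.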

Claude Haiku 4.5 is the small, fast member of the Claude 4 family. It is built for latency-sensitive and high-volume work where Sonnet would be overkill: single-tool calls, fast classifications, short summarisations, and simple Slack replies.

Haiku 4.5 is remarkably capable for its tier. Anthropic's vendor-reported SWE-bench Verified score is 73.3%, only ~4 points behind Sonnet 4.5 at one-third the cost. In Augment's agentic coding evaluation it reportedly hits 90% of Sonnet 4.5's performance, which puts it in genuine sub-agent territory.

Despite being the small Claude, Haiku 4.5 is multimodal (vision-capable), supports prompt caching, and runs at ~97 tokens/sec, comfortably the fastest model in our Built-in lineup.

Claude Haiku 4.5 highlights

Key architectural and capability features.

Haiku 4.5 ships with a 200K-token context window, multimodal input across text, vision, and code, and prompt caching that bills cached input at one-tenth the input rate. Output runs at roughly 97 tokens per second, four to five times faster than Sonnet 4.5.

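Spec overview

Family: Claude 4 generation

Modalities: text, image, code

Languages: English-centric, with multilingual support

Prompt caching: supported (Anthropic)

Context window: 200K tokens

Max output: 16K tokens

Best for: high-volume, latency-sensitive tasks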

Claude Haiku 4.5 benchmarks

Vendor-reported numbers from Anthropic's Haiku 4.5 launch materials. Note that OpenAI flagged training-data contamination on SWE-bench Verified across all frontier models. Treat absolute numbers cautiously, but the relative ordering is robust.

SWE-bench Verified (vendor-reported, 50-trial average): 73.3%

SWE-bench Pro (third-party, Scale AI): 39.5%

OSWorld, computer use (vendor-reported): 50.7%

Speed (vendor-reported): ~97 tokens/sec

Claude Haiku 4.5 pricing

Provider list price, per 1M tokens.

Input: $1.00

Output: $5.00

Cache read: $0.10

Cache write: $1.25
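To make the list prices concrete, here is a minimal sketch of a per-request cost estimate. The prices come from the table above. The function name, the token counts, and the classifier scenario are illustrative assumptions, and the one-time cache write is left out of the cached case.

```python
# List prices from the pricing table above, in USD per 1M tokens.
PRICE_PER_MTOK = {
    "input": 1.00,
    "output": 5.00,
    "cache_read": 0.10,
    "cache_write": 1.25,
}


def request_cost_usd(input_tokens=0, output_tokens=0,
                     cache_read_tokens=0, cache_write_tokens=0):
    """Return the list-price cost of a single request in USD."""
    counts = {
        "input": input_tokens,
        "output": output_tokens,
        "cache_read": cache_read_tokens,
        "cache_write": cache_write_tokens,
    }
    return sum(counts[k] * PRICE_PER_MTOK[k] for k in counts) / 1_000_000


# Hypothetical classifier call: a 2,000-token system prompt,
# a 300-token message, and a 20-token label as output.
uncached = request_cost_usd(input_tokens=2_300, output_tokens=20)
cached = request_cost_usd(input_tokens=300, output_tokens=20,
                          cache_read_tokens=2_000)

print(f"uncached: ${uncached:.4f} per request")
print(f"cached:   ${cached:.4f} per request")
```

Under these assumed token counts the request costs $0.0024 uncached and $0.0006 when the system prompt is served from cache, so a stable prompt cuts the per-request cost to a quarter.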

Claude Haiku 4.5 in practice

Behavior observed in production agent runs.

Speed

Fastest model in the Built-in lineup at ~97 tokens/sec. Reply latency is short enough for interactive Slack agents.

Routing accuracy

Good enough for single-tool flows; multi-tool routing is meaningfully behind Sonnet 4.6 on edge cases. Keep tool schemas tight.

Reasoning

Holds up on short tasks; loses track on long multi-step loops. Use it as a worker, not a planner.

Cost

Lowest cost in the Claude family on VM0. Prompt caching makes it the cheapest practical Anthropic option for repeated prompts.

Best agent tasks for Claude Haiku 4.5

The Slack triage agent that feels instant

Reads every incoming message, classifies it ("bug report", "sales lead", "meeting request"), routes it to the right channel, and posts an acknowledgment in under two seconds. At Sonnet's speed the same flow would feel laggy; at Haiku's ~97 tokens per second it feels like the bot is actually paying attention in real time.
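As a rough check on the two-second target, the sketch below turns the vendor-reported throughput into a latency budget. The 0.5-second time to first token is a placeholder assumption, not a measured figure, and network and Slack API overhead are ignored.

```python
TOKENS_PER_SECOND = 97.0  # vendor-reported throughput quoted above


def generation_seconds(output_tokens, time_to_first_token_s=0.0):
    """Estimate wall-clock time to stream a reply of the given length."""
    return time_to_first_token_s + output_tokens / TOKENS_PER_SECOND


def max_tokens_within(budget_s, time_to_first_token_s=0.0):
    """Largest reply, in tokens, that fits inside a latency budget."""
    remaining = budget_s - time_to_first_token_s
    return max(0, int(remaining * TOKENS_PER_SECOND))


# Assumed 0.5 s to first token; the real value depends on prompt size and load.
reply_budget = max_tokens_within(2.0, time_to_first_token_s=0.5)
print(f"A 2 s budget leaves room for about {reply_budget} output tokens")
```

With those assumptions a two-second budget allows roughly 145 output tokens, which is ample for a classification label plus a one-line acknowledgment.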

The sub-agent under a Sonnet or Opus planner

Sonnet (or Opus) picks the strategy and breaks the request into ten narrow steps; Haiku executes each one. "Pull this CRM field, summarise this email, format this list" — none of those steps need flagship reasoning, and routing them to Haiku instead of running the whole loop on Sonnet drops the per-conversation cost dramatically without changing the output quality.
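The planner and worker split can be sketched in a few lines. This is an illustrative pattern only: `run_plan`, `Step`, and the two stub functions are hypothetical names, and the stubs stand in for real model calls so the example runs without any API access.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    """One narrow instruction produced by the planner."""
    instruction: str


def run_plan(request: str,
             plan: Callable[[str], List[Step]],
             execute: Callable[[str], str]) -> List[str]:
    """Call the planner once, then hand every step to the cheaper worker."""
    steps = plan(request)
    return [execute(step.instruction) for step in steps]


def stub_planner(request: str) -> List[Step]:
    # Stand-in for a Sonnet or Opus call that decomposes the request.
    return [
        Step(f"Pull the CRM field for {request}"),
        Step(f"Summarise the latest email about {request}"),
        Step(f"Format the findings on {request} as a list"),
    ]


def stub_worker(instruction: str) -> str:
    # Stand-in for a Haiku call that executes a single narrow step.
    return f"[worker] {instruction}"


results = run_plan("the Acme renewal", stub_planner, stub_worker)
for line in results:
    print(line)
```

The point of the shape is that the expensive model is invoked once per request while the cheap model is invoked once per step, so the cost saving grows with the number of steps in the plan.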

The bulk classifier that runs on every record

Tag a million tickets, extract the structured fields out of last quarter's email backlog, route a stream of inbound forms. Haiku's low per-token cost plus prompt caching on the (stable) system prompt means the unit cost per record is essentially noise on the budget, which is what makes "classify everything" workflows actually viable.

The vision micro-task that needs to be fast

OCR a screenshot, identify what type of chart it is, pull a number out of a receipt image. Haiku 4.5 is multimodal and very fast, which means a UI agent that takes a screenshot every few seconds and asks "what just changed?" stays responsive instead of stuttering.

When to avoid Claude Haiku 4.5

Avoid Haiku 4.5 for complex multi-step agent work. If the task requires planning or tool orchestration, start with Sonnet 4.6.
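One way to encode this guidance is a small selection helper. The sketch below is a heuristic, not a VM0 feature: `pick_model` is a hypothetical function, the returned strings are display names rather than API identifiers, and the three-tool threshold is an assumption to tune against your own agents.

```python
def pick_model(tool_count: int, needs_planning: bool,
               max_haiku_tools: int = 3) -> str:
    """Choose a model tier from the shape of the task.

    max_haiku_tools is an assumed cut-off; adjust it after
    observing routing accuracy on real traffic.
    """
    if needs_planning or tool_count > max_haiku_tools:
        return "Claude Sonnet 4.6"
    return "Claude Haiku 4.5"


print(pick_model(tool_count=1, needs_planning=False))
print(pick_model(tool_count=6, needs_planning=False))
print(pick_model(tool_count=1, needs_planning=True))
```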

Claude Haiku 4.5 vs other models

Claude Haiku 4.5 vs Claude Sonnet 4.6

Sonnet (×1) is the default for full agents. Haiku (×0.3) is the right pick when speed and cost matter more than long-loop coherence, typically as a worker under a Sonnet/Opus planner. Vendor benchmarks put Haiku within ~4 points of Sonnet 4.5 on SWE-bench Verified.

Claude Haiku 4.5 vs DeepSeek V4 Flash

DeepSeek V4 Flash (×0.02) is much cheaper but has weaker tool use and is less reliable on multi-step loops. Use Flash for one-shot bulk work; use Haiku for short interactive Slack-style replies.

Claude Haiku 4.5 vs MiniMax M2.7

MiniMax M2.7 (×0.1) is cheaper and stronger on multilingual tasks. Haiku 4.5 leads on English-language tool-routing reliability and is multimodal.

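Verdict: should you use Claude Haiku 4.5?

Haiku 4.5 is the right choice for simple, latency-sensitive, cost-sensitive tasks. It is not meant for demanding agent work, but it is the best Anthropic model for high-volume workloads where cost and speed come first.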

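Frequently asked questions

Is Haiku 4.5 multimodal?

Yes. Haiku 4.5 accepts text, images, and code as input, the same modalities as Opus and Sonnet.

How fast is Haiku 4.5?

It is the fastest model in the Claude 4 family, with the lowest time to first token and the highest throughput.

Should I choose Haiku or Sonnet?

Choose Haiku when cost or latency matters more than reasoning depth: bulk classification, pre-filtering, simple Q&A, and short summaries.

Can Haiku run multi-tool agents?

Yes, but its tool selection is less reliable than Sonnet's or Opus's. For agents with more than two or three tools, Sonnet 4.6 is the safer choice.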

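What is Haiku 4.5's SWE-bench score?

73.3% on SWE-bench Verified (vendor-reported, 50-trial average). That is strong for a lightweight model and about 4 points behind Sonnet 4.5, though it still trails the larger Sonnet and Opus models.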

Alternative models

Claude Sonnet 4.6

Better reasoning quality for everyday agent work.

DeepSeek V4 Flash

Even cheaper (×0.02 credits), with a 1M context window.

MiniMax M2.7

Ultra-low price for bulk classification (×0.1 credits).

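Using Claude Haiku 4.5 on VM0

Two ways to access Claude Haiku 4.5 on VM0

VM0 supports Claude Haiku 4.5 in two ways: as a Built-in model billed in VM0 credits, and as Bring-your-own using an Anthropic API key. The Built-in path uses VM0 Managed routing and the credit multiplier described below. The Bring-your-own path is billed directly by the upstream provider, with no conversion to VM0 credits.

VM0's recommendation

VM0 positions Claude Haiku 4.5 as a cost-reduction option rather than a core agent model. Use it to optimize unit cost on non-core work such as bulk classification, pre-filtering, short latency-critical replies, and pinned legacy agents, and keep Claude Opus 4.7, Claude Opus 4.6, or Claude Sonnet 4.6 on the steps that determine the outcome of a run.

Credits and the ×0.3 multiplier

Every Built-in model on VM0 is priced as a multiple of Claude Sonnet 4.6, which is the ×1 credit baseline. Claude Haiku 4.5 is billed at ×0.3 credits. The multiplier is what appears on your VM0 invoice; the vendor list prices in the pricing table above are what the upstream provider charges before VM0 converts them to credits.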

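Claude Haiku 4.5 is billed at ×0.3, so one step costs just 0.3 times the equivalent step on Sonnet 4.6 (the ×1 baseline). That is well below the credit baseline, which makes it a natural choice for high-volume background work where cost per step matters more than peak reasoning quality.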
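The multiplier arithmetic can be sketched as follows. The multipliers are the ones quoted on this page. The helper name and the 1,000-credit workload are illustrative, and the sketch assumes credits scale linearly with the multiplier for an otherwise identical step.

```python
# Credit multipliers quoted on this page, relative to Sonnet 4.6 (x1).
CREDIT_MULTIPLIER = {
    "Claude Sonnet 4.6": 1.0,
    "Claude Haiku 4.5": 0.3,
    "MiniMax M2.7": 0.1,
    "DeepSeek V4 Flash": 0.02,
}


def credits_for(model: str, sonnet_equivalent_credits: float) -> float:
    """Scale a Sonnet-baseline credit cost by the model's multiplier."""
    return sonnet_equivalent_credits * CREDIT_MULTIPLIER[model]


# Hypothetical workload that would cost 1,000 credits on Sonnet 4.6.
baseline = 1_000
for model in CREDIT_MULTIPLIER:
    print(f"{model}: {credits_for(model, baseline):.0f} credits")
```

Under that assumption, a workload costing 1,000 credits on Sonnet 4.6 comes to about 300 credits on Haiku 4.5.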

Available on VM0 since launch.