Claude Haiku 4.5

Claude Haiku 4.5 is Anthropic's latest small AI model designed for high-speed, cost-efficient performance while maintaining near-frontier coding and task execution capabilities. It achieves coding performance comparable to Claude Sonnet 4, a previous state-of-the-art model, while operating at twice the speed and one-third the cost. The model is optimized for real-time applications and complex workflows requiring low latency and high responsiveness.
The core value lies in bridging the gap between premium model capabilities and operational efficiency, enabling developers and enterprises to deploy AI at scale without compromising speed or budget. It delivers 90% of Claude Sonnet 4.5's coding quality while significantly reducing inference costs and latency. This makes it ideal for applications where rapid iteration and cost management are critical.

Claude Haiku 4.5 operates at 2x the speed of Claude Sonnet 4 while maintaining comparable coding performance, as demonstrated by achieving 73.3% accuracy on the SWE-bench Verified dataset with tool-assisted problem-solving. It supports 128K-token thinking budgets for complex reasoning tasks and integrates natively with tools like bash and file editing via string replacements.
The model excels at computer interaction tasks, outperforming Claude Sonnet 4 in OSWorld evaluations with a 100-step action limit and 128K-token thinking budgets. It achieves 65% accuracy in slide text generation benchmarks, surpassing competing models by 21 percentage points in instruction-following precision.
Safety enhancements include a 50% reduction in misaligned behaviors compared to Haiku 3.5 and superior alignment metrics relative to Sonnet 4.5 and Opus 4.1. It operates under AI Safety Level 2 (ASL-2) certification, with limited CBRN risks validated through rigorous testing protocols.

The model addresses the trade-off between AI performance and operational costs in latency-sensitive applications like real-time coding assistance and customer service automation. It eliminates the need to choose between premium model capabilities and budget constraints through optimized inference efficiency.
Primary users include developers building AI-powered coding tools (e.g., GitHub Copilot alternatives), enterprises deploying high-volume customer service agents, and teams requiring parallel task execution in agentic workflows. Technical leads implementing multi-model architectures for complex problem decomposition will benefit significantly.
Typical scenarios include rapid code prototyping with Claude Code, real-time document generation in tools like Gamma, and multi-agent systems where Sonnet 4.5 handles planning while orchestrating Haiku 4.5 instances for parallel subtask execution. It also powers Chrome extensions requiring instant response times for AI-assisted browsing.

Unlike comparable small models, Haiku 4.5 matches the coding performance of Anthropic's previous frontier model (Sonnet 4) while reducing inference costs by 66% and doubling processing speed. This performance-cost ratio exceeds industry benchmarks for models in its class.
The model introduces novel multi-agent orchestration capabilities, enabling Sonnet 4.5 to delegate parallelizable tasks to Haiku 4.5 instances through optimized API integrations. This architecture allows complex workflows to maintain Sonnet-level quality while achieving Haiku-level efficiency at scale.
Competitive advantages include 4-5x faster inference speeds than Sonnet 4.5 in coding applications, combined with superior safety metrics that make it Anthropic's safest model to date. The $1/$5 per million token pricing undercuts comparable models while delivering frontier-adjacent performance.

How does Claude Haiku 4.5 compare to Claude Sonnet 4.5 in safety metrics? Claude Haiku 4.5 demonstrates 22% fewer misaligned behaviors than Sonnet 4.5 in automated alignment assessments, despite operating under less restrictive ASL-2 safety protocols. Its CBRN risk profile meets enterprise security standards for general deployment.
Can Haiku 4.5 integrate with existing cloud platforms? Yes, the model is available as a drop-in replacement on Amazon Bedrock and Google Cloud's Vertex AI, maintaining API compatibility with previous Haiku and Sonnet versions. Developers can switch models without codebase modifications.
What benchmarks validate the coding performance claims? The model achieves 73.3% accuracy on SWE-bench Verified using tool-assisted problem-solving across 500 trials, matching Sonnet 4's performance. In Terminal-Bench evaluations, it scores 40.21% without thinking budgets and 41.75% with 32K-token reasoning cycles.
How does pricing compare to previous models? At $1 per million input tokens and $5 per million output tokens, Haiku 4.5 offers 33% cost reduction compared to Sonnet 4's pricing structure while delivering comparable technical capabilities.
What agentic coding frameworks does it support? The model works natively with Terminus 2 for terminal operations and OSWorld-Verified frameworks, supporting 100-step action sequences with 128K-token thinking budgets per step. Developers can implement custom agents using Anthropic's updated XML parser specifications.

The fastest, most affordable coding model