Claude Model Tiers: Haiku vs Sonnet vs Opus vs Capybara
Anthropic organizes its Claude models into four tiers, each designed for different performance needs and budgets. From Haiku at the bottom to the newly revealed Capybara at the top, every tier makes a specific trade-off between speed, intelligence, and cost. Our what Capybara means guide explores this in depth.

This guide covers all four tiers with current pricing, real benchmark numbers, and clear recommendations for when to use each one. Whether you are building a chatbot, coding assistant, or enterprise security tool, there is a Claude tier optimized for your use case.
The Four Tiers at a Glance
Here is how the complete Claude lineup compares across the dimensions that matter most for choosing the right model:
| Haiku 4.5 | Sonnet 4.6 | Opus 4.6 | Capybara (Mythos) | |
|---|---|---|---|---|
| Speed | Fastest | Fast | Moderate | Slowest |
| Reasoning | Basic | Strong | Excellent | Breakthrough |
| Coding | Good | Strong | Excellent | Dramatically better |
| Cybersecurity | Limited | Moderate | Good | Far ahead of all AI |
| Context Window | 200K tokens | 200K tokens | 200K tokens | Unknown |
| Input $/M tokens | $0.80 | $3.00 | $15.00 | TBD (~$30-75) |
| Output $/M tokens | $4.00 | $15.00 | $75.00 | TBD (~$150-375) |
| Status | Available | Available | Available | Early testing |
The pricing column shows a consistent ~5x multiplier between tiers: Haiku to Sonnet (~4x), Sonnet to Opus (5x). Capybara is expected to follow the same pattern at 2-5x Opus pricing.
Haiku 4.5: The Speed Tier
Claude Haiku 4.5 is designed for one thing: getting answers fast and cheap. It sacrifices depth of reasoning for response speed and low per-token cost.
What Haiku Does Well
Haiku handles text classification, content tagging, sentiment analysis, and simple summarization with sub-second response times. When your application needs to process thousands of requests per minute without deep analysis, Haiku delivers.
It is also effective for real-time chatbot responses where conversational fluency matters more than expert-level reasoning. Customer service bots, FAQ responders, and interactive UIs benefit from Haiku’s speed profile.
Where Haiku Falls Short
Complex multi-step reasoning defeats Haiku. It struggles with nuanced code debugging, architectural decisions, and problems requiring synthesis across multiple domains. If your task needs the model to “think hard,” Haiku is the wrong choice.
Haiku Pricing
At $0.80 per million input tokens and $4.00 per million output tokens, Haiku is by far the cheapest Claude tier. With prompt caching, input costs drop to $0.10/M for cached tokens — making it nearly free for repeated prompt patterns.
Sonnet 4.6: The Balanced Tier
Claude Sonnet 4.6 is the tier most developers should default to. It offers the best ratio of capability to cost in the Claude lineup.
What Sonnet Does Well
Sonnet handles everyday programming assistance with strong accuracy — writing functions, debugging code, explaining complex logic, and generating documentation. For content creation tasks like drafting articles, editing copy, and analyzing data into reports, Sonnet performs close to Opus quality at one-fifth the price.
It also handles medium-complexity reasoning tasks reliably: analyzing datasets, comparing options with trade-offs, and synthesizing information from multiple sources. For most professional use cases, Sonnet’s output quality is indistinguishable from Opus.
Where Sonnet Falls Short
Sonnet shows its limits on genuinely hard problems: proving mathematical theorems, designing novel algorithms, and reasoning through deeply nested logical chains. On the ARC-AGI-2 benchmark (which tests novel reasoning), the gap between Sonnet and Opus is significant.
Sonnet Pricing
At $3.00 per million input tokens and $15.00 per million output tokens, Sonnet costs roughly 4x more than Haiku but 5x less than Opus. Extended thinking mode costs $12/M for thinking tokens, giving access to deeper reasoning when needed.
Opus 4.6: The Flagship Tier
Claude Opus 4.6 was Anthropic’s most capable model until the Capybara tier was revealed. It remains the best publicly available option for complex professional work.
What Opus Does Well
Opus excels at problems that require sustained, deep reasoning. Complex code architecture decisions, multi-file refactoring across large codebases, and debugging subtle concurrency issues — these are tasks where Opus outperforms Sonnet consistently.
For research tasks, Opus handles long-document analysis across its 200K+ token context window, synthesizing information from multiple papers or reports into coherent analysis. Its GPQA Diamond score of 91.31% (graduate-level science reasoning) leads all publicly available models by over 3 points.
Benchmark Performance
| Benchmark | Opus 4.6 Score | What It Measures |
|---|---|---|
| SWE-Bench Verified | 80.8% | Real-world coding (GitHub issues) |
| Terminal-Bench 2.0 | 65.4% | Terminal/CLI operations |
| GPQA Diamond | 91.31% | Graduate-level science reasoning |
| ARC-AGI-2 | 68.8% | Novel reasoning and abstraction |
| OSWorld | 72.7% | Computer use and UI interaction |
Where Opus Falls Short
Opus is expensive for simple tasks. Running text classification through Opus wastes money that Haiku handles equally well at 1/19th the cost. Opus is also slower than Sonnet and Haiku, with response latency that matters for real-time applications.
Opus Pricing
At $15.00 per million input tokens and $75.00 per million output tokens, Opus is a premium tier. Batch API processing provides a 50% discount for non-real-time workloads, bringing effective costs closer to Sonnet levels for background tasks.
Capybara (Mythos): The Breakthrough Tier
The Capybara tier was revealed through an accidental data leak on March 26, 2026. Claude Mythos is its first model, and Anthropic calls it “the most capable we’ve built to date.”
What Makes Capybara Different
Capybara is not a bigger Opus. The leaked documentation describes it as a “step change” — a qualitative leap rather than incremental improvement. Where each previous tier improved on the last by degree, Capybara introduces capabilities that didn’t exist before.
The six capabilities identified in leaked documents are: automated vulnerability discovery, ultra-difficult multi-step reasoning, large codebase refactoring, enterprise security audits, advanced agent workflows, and zero-day identification.
Cybersecurity: The Defining Capability
Anthropic states Capybara is “currently far ahead of any other AI model in cyber capabilities” — a claim strong enough to crash cybersecurity stocks (CrowdStrike dropped ~7%, Palo Alto Networks ~6%). This capability is why the Capybara tier exists separately from Opus and why its release strategy prioritizes cyber defenders over general availability.
When You’ll Be Able to Use It
Capybara is in restricted early testing with cybersecurity defense organizations. No public access exists yet. The most likely general availability window is late 2026, potentially October — aligned with Anthropic’s expected IPO.
Expected Pricing
No official pricing, but the tier structure suggests $30-75 per million input tokens and $150-375 per million output tokens. The leaked docs confirmed it is “expensive to run.”
How to Choose the Right Tier
The decision framework is straightforward: use the cheapest tier that handles your task reliably.
Use Haiku when your task has a simple, factual answer and speed or cost is the priority. Text classification, content moderation, chatbot responses, data extraction from structured documents.
Use Sonnet when you need good reasoning at reasonable cost. Programming assistance, content creation, data analysis, most business tasks. This should be your default unless you have a specific reason to go higher or lower.
Use Opus when the quality of reasoning directly impacts the value of output. Architecture decisions, deep research, mathematical proofs, problems with subtle edge cases. If a wrong answer is expensive and the problem is genuinely hard, Opus is worth the premium.
Use Capybara when you need capabilities that don’t exist in other tiers. Automated security auditing, frontier reasoning problems, large-scale agent workflows requiring high autonomy. Most users will not need this tier for everyday work.
API: Switching Between Tiers
All four tiers use the same Claude API interface. Switching between tiers requires changing only the model parameter:
# Haiku - fastest, cheapest
model = "claude-haiku-4-5-20251001"
# Sonnet - balanced (recommended default)
model = "claude-sonnet-4-6-20250514"
# Opus - complex tasks
model = "claude-opus-4-6-20250205"
# Capybara - when available
model = "claude-capybara-mythos" # placeholder
No code changes, no SDK updates, no migration. Build your application with Sonnet today and upgrade specific use cases to Opus or Capybara as needed.
Questions About Claude Model Tiers
Which Claude tier is best for coding?
Sonnet 4.6 handles most coding tasks well at a reasonable price. For complex architecture decisions and large refactoring jobs, Opus 4.6 is worth the premium. Capybara is expected to dramatically exceed both when it becomes available.
What is the cheapest Claude model?
Haiku 4.5 at $0.80 per million input tokens and $4.00 per million output tokens. With prompt caching, cached input tokens drop to $0.10 per million.
Is Opus 4.6 worth the price over Sonnet?
For complex reasoning tasks, yes. Opus scores 68.8% on ARC-AGI-2 versus Sonnet’s lower score, and leads on graduate-level science reasoning. For everyday coding and writing, Sonnet provides comparable quality at one-fifth the cost.
How many Claude tiers exist?
Four tiers as of March 2026: Haiku (speed), Sonnet (balanced), Opus (flagship), and Capybara (breakthrough). Capybara was revealed through a data leak and is not yet publicly available.
Can I mix Claude tiers in one application?
Yes. Many applications use tiered routing — sending simple requests to Haiku, standard work to Sonnet, and complex tasks to Opus. The unified API makes this straightforward to implement.
When will Capybara pricing be announced?
No official date. It will likely coincide with general availability, expected late 2026. Community estimates range from 2-3x to 4-5x Opus pricing.
