Skip to content

Claude Opus 4.8

Claude Opus 4.8 targets long-horizon agentic execution, complex multi-step coding including mid-task refactors, and knowledge-work writing with clearer, less hedgy prose. It is available on AI Gateway via anthropic, bedrock, vertexAnthropic with adaptive thinking and configurable effort.

Tool UseReasoningVision (Image)File InputExplicit CachingWeb Search
index.ts
import { streamText } from 'ai'
const result = streamText({
model: 'anthropic/claude-opus-4.8',
prompt: 'Why is the sky blue?'
})

Playground

Try out Claude Opus 4.8 by Anthropic. Usage is billed to your team at API rates. Free users (those who haven't made a payment) get $5 of credits every 30 days.

Providers

Route requests across multiple providers. Copy a provider slug to set your preference. Visit the docs for more info. Using a provider means you agree to their terms, listed under Legal.

Provider
Context
Latency
Throughput
Input
Output
Cache
Web Search
Per Query
Capabilities
ZDR
No Training
Release Date
Anthropic
Legal:Terms
Privacy
1M
3.3s
86tps
$5.00/M$25.00/M
Read:$0.5/M
Write:
$6.25/M
$10/K
+ input costs
+4
05/28/2026
Amazon Bedrock
Legal:Terms
Privacy
1M
1.3s
96tps
$5.00/M$25.00/M
Read:$0.5/M
Write:
$6.25/M
$10/K
+ input costs
+4
05/28/2026
Google Vertex AI
Legal:Terms
Privacy
1M
4.5s
87tps
$5.00/M$25.00/M
Read:$0.5/M
Write:
$6.25/M
$10/K
+ input costs
+4
05/28/2026
Throughput

P50 throughput on live AI Gateway traffic, in tokens per second (TPS). Visit the docs for more info.

Latency

P50 time to first token (TTFT) on live AI Gateway traffic, in milliseconds. View the docs for more info.

Uptime

Direct request success rate on AI Gateway and per-provider. Visit the docs for more info.

More models by Anthropic

Model
Context
Latency
Throughput
Input
Output
Cache
Web Search
Per Query
Capabilities
Providers
ZDR
No Training
Release Date
1M
1.2s
94tps
$5.00/M$25.00/M
Read:$0.5/M
Write:
$6.25/M
$10/K
+ input costs
+4
anthropic logo
bedrock logo
vertexAnthropic logo
04/16/2026
1M
0.7s
51tps
$3.00/M$15.00/M
Read:$0.3/M
Write:
$3.75/M
$10/K
+ input costs
+4
anthropic logo
bedrock logo
vertexAnthropic logo
02/17/2026
1M
0.8s
60tps
$5.00/M$25.00/M
Read:$0.5/M
Write:
$6.25/M
$10/K
+ input costs
+4
anthropic logo
bedrock logo
vertexAnthropic logo
02/05/2026
200K
0.5s
109tps
$1.00/M$5.00/M
Read:$0.1/M
Write:
$1.25/M
$10.00/K
+ input costs
+3
anthropic logo
bedrock logo
vertexAnthropic logo
10/15/2025
1M
0.6s
58tps
$3.00/M
$15.00/M
Read:
$0.3/M
Write:
$3.75/M
$10.00/K
+ input costs
+3
anthropic logo
bedrock logo
vertexAnthropic logo
09/29/2025
200K
0.7s
51tps
$5.00/M$25.00/M
Read:$0.5/M
Write:
$6.25/M
$10.00/K
+ input costs
+3
anthropic logo
bedrock logo
vertexAnthropic logo
11/24/2024

About Claude Opus 4.8

Claude Opus 4.8 became available on AI Gateway on May 28, 2026. Anthropic positions it for long-horizon agentic execution: complex, multi-step coding workflows where the model maintains coherence across many turns, including refactors that previously required human correction mid-task.

For knowledge-work writing, Claude Opus 4.8 produces clearer, less hedgy prose. Anthropic highlights drafting, data analysis, and building presentations as workflows that benefit from this style. Pair it with structured tool calls or memory files in your agent pipeline to keep extended sessions on track.

The sample integration uses adaptive thinking and configurable effort. Set thinking to { type: 'adaptive' } and effort to 'high' (or another level) under providerOptions.anthropic to control how deeply Claude Opus 4.8 reasons on each request.

Through AI Gateway, Claude Opus 4.8 is available with the standard unified API, observability, retries, and failover across anthropic, bedrock, vertexAnthropic. Set the model to anthropic/claude-opus-4.8 in the AI SDK, Chat Completions API, Responses API, Messages API, or other API formats, from TypeScript or Python. AI Gateway mirrors provider pricing with no markup and adds no platform fee on inference, including for Bring Your Own Key requests.

What To Consider When Choosing a Provider

  • Configuration: Adaptive thinking lets Claude Opus 4.8 scale reasoning per request. Pair it with the effort parameter when running mixed workloads so the model spends compute where it pays off and stays efficient on simpler turns.
  • Zero Data Retention: AI Gateway supports Zero Data Retention for this model via direct gateway requests (BYOK is not included). To configure this, check the documentation.
  • Authentication: AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.

When to Use Claude Opus 4.8

Best For

  • Long-horizon agentic execution: Multi-step pipelines where the model maintains coherence across many turns and tool calls
  • Complex multi-step coding: Refactors and feature work that previously required human correction mid-task
  • Knowledge-work writing: Drafting, data analysis, and presentation building where clearer, less hedgy prose matters
  • Adaptive thinking workloads: Mixed-complexity requests where letting Claude Opus 4.8 calibrate reasoning depth per turn is more efficient than fixed budgets
  • Production agent infrastructure: Workflows that benefit from AI Gateway routing across anthropic, bedrock, vertexAnthropic with retries and failover

Consider Alternatives When

  • Cost-sensitive high-volume traffic: Claude Sonnet 4.6 offers Opus-approaching intelligence at Sonnet pricing
  • Latency-critical interactive use: Claude Haiku 4.5 is faster and cheaper on well-bounded high-throughput requests
  • Fixed thinking budget required: Claude Opus 4.6 accepts a fixed thinking budget when you need deterministic compute per request

Conclusion

Claude Opus 4.8 extends the Opus line toward long-horizon execution and clearer knowledge-work output. Use it for agent pipelines that run for many turns, refactors that previously required human handholding, and writing tasks where less hedging produces more useful drafts.

Frequently Asked Questions

  • What is Claude Opus 4.8 built for?

    Anthropic positions Claude Opus 4.8 for long-horizon agentic execution: complex, multi-step coding work and knowledge-work writing where the model needs to maintain coherence across many turns and produce clearer prose.

  • How do I configure adaptive thinking and effort for Claude Opus 4.8 in the AI SDK?

    Set the model to anthropic/claude-opus-4.8. Under providerOptions.anthropic, set thinking to { type: 'adaptive' } and pass an effort level (for example, 'high').

  • Which providers back Claude Opus 4.8 on AI Gateway?

    AI Gateway routes Claude Opus 4.8 through anthropic, bedrock, vertexAnthropic, with retries and automatic failover.

  • How is Claude Opus 4.8 useful for coding agents?

    Anthropic highlights complex multi-step coding work, including refactors that previously required human correction mid-task. Claude Opus 4.8 is suited for agent pipelines that span many turns and tool calls.

  • What knowledge-work tasks does Claude Opus 4.8 target?

    Drafting, data analysis, and building presentations. Anthropic describes the prose output as clearer and less hedgy, which makes drafts more useful as a starting point.

  • Does AI Gateway support Zero Data Retention for Claude Opus 4.8?

    Yes, Zero Data Retention is available for this model. Zero Data Retention is offered on a per-provider basis. See https://vercel.com/docs/ai-gateway/capabilities/zdr for details.

  • What is the pricing for Claude Opus 4.8?

    AI Gateway mirrors provider pricing with no markup and adds no platform fee on inference. Rates are listed on this page and shift when providers update their pricing.

  • How do I call Claude Opus 4.8 through AI Gateway?

    Set the model to anthropic/claude-opus-4.8 in the AI SDK, Chat Completions API, Responses API, Messages API, or other API formats, from TypeScript or Python. AI Gateway handles authentication, retries, and failover across anthropic, bedrock, vertexAnthropic.