Qwen 3.7 Max
Qwen 3.7 Max is Alibaba's flagship agent-tuned model in the Qwen 3.7 line, with a context window of 991K tokens and an emphasis on long-horizon tool use, multi-file coding, and office workflow automation.
import { streamText } from 'ai'
const result = streamText({ model: 'alibaba/qwen3.7-max', prompt: 'Why is the sky blue?' })
Playground
Try out Qwen 3.7 Max by Alibaba. Usage is billed to your team at API rates. Free users (those who haven't made a payment) get $5 of credits every 30 days.
Providers
Route requests across multiple providers. Copy a provider slug to set your preference. Visit the docs for more info. Using a provider means you agree to their terms, listed under Legal.
For each provider, this page reports the following metrics. Visit the docs for more info.
- Throughput: P50 on live AI Gateway traffic, in tokens per second (TPS).
- Time to first token (TTFT): P50 on live AI Gateway traffic, in milliseconds.
- Success rate: direct request success rate on AI Gateway and per provider.
About Qwen 3.7 Max
Qwen 3.7 Max is the Max-tier release in the Qwen 3.7 generation, succeeding Qwen3.6-Max-Preview in Alibaba's closed-weight API line. The model is served through the `alibaba` provider with a context window of 991K tokens, which suits full-repository ingestion, long agent traces, and multi-document analysis without segmentation.
Alibaba describes Qwen 3.7 Max as an agent foundation. The model targets coding agents that plan and act across many turns, office and productivity tasks that route work through multi-agent orchestration, and long-horizon autonomous execution where the model must maintain coherent reasoning across hundreds of sequential tool calls. Reported improvements over Qwen3.6-Max-Preview are concentrated in frontend prototyping and complex multi-file engineering work.
Like other Max-tier entries, Qwen 3.7 Max supports tool calling and structured outputs, with extended-thinking mode available for high-difficulty reasoning, scientific computation, and expert-level queries. The thinking budget can be tuned per request to balance depth of reasoning against latency and token spend. Qwen 3.7 Max is text-only; for vision input, the sibling Qwen3.7-Plus is the multimodal entry in the 3.7 lineup.
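The tool-calling flow described above can be sketched as a dispatch step: the model returns tool calls with a name and JSON-encoded arguments, and the application runs each one and returns the result as a tool message. This is a minimal sketch that assumes the OpenAI-style Chat Completions shape for tool calls (`tool_calls`, `tool_call_id`); the `readFile` tool and the `handlers` registry are hypothetical names used only for illustration.

```typescript
// Shape of a tool call in an OpenAI-style Chat Completions response (assumed format).
interface ToolCall {
  id: string;
  type: 'function';
  function: { name: string; arguments: string }; // arguments is a JSON string
}

interface ToolMessage {
  role: 'tool';
  tool_call_id: string;
  content: string;
}

type ToolHandler = (args: Record<string, unknown>) => Promise<unknown> | unknown;

// Hypothetical tool registry; a real agent would read from disk, call APIs, and so on.
const handlers: Record<string, ToolHandler> = {
  readFile: (args) => ({ path: args.path, text: `// contents of ${String(args.path)}` }),
};

// Runs every requested tool and returns one tool message per call.
// Errors are reported back to the model instead of thrown, so the agent can recover.
async function runToolCalls(calls: ToolCall[]): Promise<ToolMessage[]> {
  const results: ToolMessage[] = [];
  for (const call of calls) {
    let content: string;
    try {
      const handler = handlers[call.function.name];
      if (!handler) throw new Error(`Unknown tool: ${call.function.name}`);
      const args = JSON.parse(call.function.arguments) as Record<string, unknown>;
      content = JSON.stringify(await handler(args));
    } catch (err) {
      content = JSON.stringify({ error: err instanceof Error ? err.message : String(err) });
    }
    results.push({ role: 'tool', tool_call_id: call.id, content });
  }
  return results;
}
```

Returning failures as tool messages rather than throwing keeps a long session alive: the model sees the error text and can retry or choose a different tool on the next turn.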
You can integrate Qwen 3.7 Max through AI SDK, Chat Completions API, Responses API, Messages API, or other API formats, from TypeScript or Python.
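As an illustration of the Chat Completions route, here is a minimal sketch using plain `fetch`. It assumes the gateway exposes an OpenAI-compatible endpoint at `https://ai-gateway.vercel.sh/v1/chat/completions` and accepts the API key as a bearer token; neither detail is stated on this page, so check the AI Gateway docs before relying on them. `buildChatRequest` and `askQwen` are hypothetical helper names.

```typescript
// Sketch only: the endpoint URL is an assumption, not taken from this page.
const GATEWAY_URL = 'https://ai-gateway.vercel.sh/v1/chat/completions';

interface ChatMessage {
  role: 'system' | 'user' | 'assistant';
  content: string;
}

interface ChatRequest {
  url: string;
  init: { method: 'POST'; headers: Record<string, string>; body: string };
}

// Builds the HTTP request without sending it, so it can be inspected or tested.
function buildChatRequest(apiKey: string, messages: ChatMessage[]): ChatRequest {
  return {
    url: GATEWAY_URL,
    init: {
      method: 'POST',
      headers: {
        Authorization: `Bearer ${apiKey}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({ model: 'alibaba/qwen3.7-max', messages }),
    },
  };
}

// Sends a single-turn prompt and returns the text of the first choice.
async function askQwen(apiKey: string, prompt: string): Promise<string> {
  const { url, init } = buildChatRequest(apiKey, [{ role: 'user', content: prompt }]);
  const res = await fetch(url, init);
  if (!res.ok) throw new Error(`Gateway returned HTTP ${res.status}`);
  const data = (await res.json()) as { choices: { message: { content: string } }[] };
  return data.choices[0].message.content;
}
```

A call such as `await askQwen(apiKey, 'Why is the sky blue?')` would mirror the AI SDK snippet at the top of this page, with the key read from wherever your deployment stores it.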
What To Consider When Choosing a Provider
- Configuration: Agent workflows that chain hundreds of tool calls produce high output-token volume. Use the AI Gateway cost dashboard to monitor per-session spend and tune the thinking budget before running production traffic at scale.
- Zero Data Retention: AI Gateway does not currently support Zero Data Retention for this model. See the documentation for models that support ZDR.
- Authentication: AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.
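Alongside the cost dashboard mentioned under Configuration, a session-level tally can stop an agent loop before it overspends. The sketch below assumes each response carries OpenAI-style usage fields (`prompt_tokens`, `completion_tokens`); the prices are placeholder inputs supplied by the caller, not published rates for this model, and `SessionSpend` is a hypothetical helper.

```typescript
// Token usage as reported per response in OpenAI-style APIs (assumed field names).
interface Usage {
  prompt_tokens: number;
  completion_tokens: number;
}

// Prices in USD per million tokens. Placeholder inputs, not published rates.
interface Pricing {
  inputPerMillion: number;
  outputPerMillion: number;
}

class SessionSpend {
  private inputTokens = 0;
  private outputTokens = 0;
  private readonly pricing: Pricing;
  private readonly budgetUsd: number;

  constructor(pricing: Pricing, budgetUsd: number) {
    this.pricing = pricing;
    this.budgetUsd = budgetUsd;
  }

  // Call once per model response in the session.
  record(usage: Usage): void {
    this.inputTokens += usage.prompt_tokens;
    this.outputTokens += usage.completion_tokens;
  }

  // Estimated spend so far, in USD.
  get costUsd(): number {
    const weighted =
      this.inputTokens * this.pricing.inputPerMillion +
      this.outputTokens * this.pricing.outputPerMillion;
    return weighted / 1_000_000;
  }

  // True once the session has spent at least its budget.
  get overBudget(): boolean {
    return this.costUsd >= this.budgetUsd;
  }
}
```

An agent loop would call `record` after every step and stop issuing tool calls once `overBudget` becomes true.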
When to Use Qwen 3.7 Max
Best For
- Long-Horizon Coding Agents: Sustained tool-calling sessions across many turns with planning, retries, and dead-end recovery
- Multi-File Software Engineering: Refactoring, diff editing, and frontend prototyping across a repository
- Office Workflow Automation: Routing productivity tasks through multi-agent orchestration
- Expert Reasoning Tasks: Scientific computation, mathematics, and structured analysis with extended-thinking mode
- Repository Ingestion: Long-context workloads that use the 991K-token context window for full codebases and tool traces
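For the repository ingestion case, a rough pre-flight check can show whether a set of files is likely to fit before anything is sent. This sketch reads 991K as 991,000 tokens and uses about four characters per token, which is a crude heuristic rather than the model's actual tokenizer; `selectFilesThatFit` and `reserveTokens` are hypothetical names.

```typescript
// Assumptions: 991K is read as 991,000 tokens, and ~4 characters per token is a
// rough heuristic, not the model's real tokenizer.
const CONTEXT_WINDOW_TOKENS = 991_000;
const CHARS_PER_TOKEN = 4;

interface SourceFile {
  path: string;
  text: string;
}

function estimateTokens(text: string): number {
  return Math.ceil(text.length / CHARS_PER_TOKEN);
}

// Walks the files in the given order and keeps each one that still fits the
// remaining estimated budget. reserveTokens leaves room for the prompt,
// tool traces, and the model's output.
function selectFilesThatFit(files: SourceFile[], reserveTokens: number) {
  const budget = CONTEXT_WINDOW_TOKENS - reserveTokens;
  const included: string[] = [];
  const skipped: string[] = [];
  let used = 0;
  for (const file of files) {
    const cost = estimateTokens(file.text);
    if (used + cost <= budget) {
      included.push(file.path);
      used += cost;
    } else {
      skipped.push(file.path);
    }
  }
  return { included, skipped, estimatedTokens: used };
}
```

Because the estimate is approximate, it is safer to keep a generous reserve and to treat any skipped files as a signal to summarize or drop content rather than to retry at the limit.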
Consider Alternatives When
- Vision Or Multimodal Input: Qwen3.7-Plus is the multimodal entry in the 3.7 line when image inputs are needed
- Latency-Sensitive Pipelines: A Plus or Flash-tier model serves users better when extended-thinking traces add unnecessary overhead
- Strict Token Budgets: A smaller model is a closer fit when per-session spend on a flagship Max model isn't justified
- Built-In Autonomous Search: Qwen3-Max-Thinking is a stronger fit when integrated search and code interpreter tools are required
Conclusion
Qwen 3.7 Max extends the Qwen Max tier with an agent-first design that targets long-horizon tool use, multi-file coding, and office workflow automation. Routing through AI Gateway gives you a single integration surface, provider failover, and consolidated billing while you build against the latest generation in the Max line.
Frequently Asked Questions
What is the relationship between Qwen 3.7 Max and Qwen3.6-Max-Preview?
Qwen 3.7 Max is the Max-tier flagship in the Qwen 3.7 generation, succeeding Qwen3.6-Max-Preview. Alibaba positions Qwen 3.7 Max as an agent foundation with improvements in long-horizon tool use, multi-file coding, and office workflow automation.
What is the context window for Qwen 3.7 Max?
The context window is 991K tokens. This supports full-repository ingestion, long agent traces with hundreds of sequential tool calls, and multi-document analysis without segmentation.
Can Qwen 3.7 Max accept image or file inputs?
Qwen 3.7 Max is text-only. For vision input within the 3.7 line, use Qwen3.7-Plus, which is the multimodal entry in the generation.
Does Qwen 3.7 Max support tool calling?
Yes. Qwen 3.7 Max supports structured tool calling and is tuned for agent workflows that chain many sequential tool invocations across long-horizon sessions.
How does extended-thinking mode work on Qwen 3.7 Max?
Extended-thinking mode generates an internal reasoning trace before producing the final answer, which improves accuracy on high-difficulty logical reasoning, scientific computation, and expert-level queries. The thinking budget is tunable per request to balance depth against latency and token spend.
How do I access Qwen 3.7 Max through AI Gateway?
Authenticate with an AI Gateway API key or OIDC token and reference `alibaba/qwen3.7-max` as the model. You can call Qwen 3.7 Max through AI SDK, Chat Completions API, Responses API, Messages API, or other API formats, from TypeScript or Python.
Does Qwen 3.7 Max support zero data retention?
Zero Data Retention is not currently available for this model. Zero Data Retention is offered on a per-provider basis. See https://vercel.com/docs/ai-gateway/capabilities/zdr for details.
Where can I see live latency and cost data for Qwen 3.7 Max?
This page shows live throughput, time-to-first-token, and pricing metrics for Qwen 3.7 Max measured across real AI Gateway traffic.