SambaCloud Now Supports the Anthropic Messages API

By SambaNova

July 1, 2026

Your Claude Code just got a new place to run. SambaCloud now supports Anthropic's Messages API, so you can point the Anthropic SDK at SambaCloud's fast inference and keep the format you already know. Because the Messages API is built for the tool-calling loop that powers agents, agentic workflows such as coding assistants, autonomous tools, and anything that acts rather than just answers can now run on SambaCloud too.

TL;DR

SambaCloud now supports the Anthropic Messages API, allowing developers to use the Anthropic SDK with SambaCloud's fast inference and no rewrite required.
MiniMax-M2.7 is available via the Messages API on SambaCloud today.
The Messages API can be used within Claude Code by setting three environment variables: base URL, API key, and model name.
Limitations to note: Server-side tools, PDF document blocks, and image URLs are not supported.
To get started, grab an API key from SambaNova, install the Anthropic SDK, and point the base URL at SambaCloud.

Now in Claude Code

You can use MiniMax-M2.7 within Claude Code through the Messages API, running your agentic coding workflows on SambaCloud with the terminal and commands you already know. And if you want to go further, the SambaNova Claude Code plugin adds complementary functionality on top.

The plugin adds skills for managing your SambaNova models and handing off work. Run /code to delegate a task, such as build-and-test, code review, or a second opinion, to Continue or OpenCode on the model you pick. Each run stays isolated, so your existing setup is left untouched.

Set the base URL, API key, and model name before launching Claude Code:

export ANTHROPIC_BASE_URL="https://api.sambanova.ai"
export ANTHROPIC_API_KEY="your-sambanova-api-key"
export ANTHROPIC_MODEL="MiniMax-M2.7"

Built for Agents, Not Just Answers

Compared to the Chat Completions style most people start with, the Messages API rethinks the response as a list of typed blocks rather than a single string. That one change is what makes tool calls, reasoning, and the agent loop first-class instead of bolted on. The table below breaks down how the two compare.

Messages API vs. Chat Completions (at a Glance)

Aspect	Anthropic Messages API	OpenAI Chat Completions
Response shape	content[] — list of typed blocks	choices[0].message.content — string
System prompt	top-level system param	role: "system" message
max_tokens	required	optional
Tool definition	flat: name, description, input_schema	nested under function: name, description, parameters
Tool request	tool_use block in content[]	tool_calls field on the message
Tool arguments	input — parsed object	arguments — JSON string
Tool result	tool_result block in a user message	role: "tool" message
Result references call by	tool_use_id	tool_call_id
Stop signal	stop_reason	finish_reason
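
To make the tool rows of the table concrete, here is a minimal sketch that converts a Chat Completions-style tool definition and tool call into the corresponding Messages API shapes. The helper names (to_messages_tool, to_tool_use_block) and the get_weather tool are hypothetical and exist only for illustration; the field names follow the table above.

```python
import json


def to_messages_tool(chat_tool: dict) -> dict:
    """Flatten a Chat Completions tool definition into the Messages API shape."""
    fn = chat_tool["function"]
    return {
        "name": fn["name"],
        "description": fn.get("description", ""),
        "input_schema": fn["parameters"],
    }


def to_tool_use_block(tool_call: dict) -> dict:
    """Turn a Chat Completions tool call into a tool_use content block."""
    return {
        "type": "tool_use",
        "id": tool_call["id"],
        "name": tool_call["function"]["name"],
        # Chat Completions carries arguments as a JSON string;
        # the Messages API carries them as an already-parsed object.
        "input": json.loads(tool_call["function"]["arguments"]),
    }


# Hypothetical tool, written in the Chat Completions style.
chat_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}
chat_tool_call = {
    "id": "call_1",
    "type": "function",
    "function": {"name": "get_weather", "arguments": '{"city": "Paris"}'},
}

messages_tool = to_messages_tool(chat_tool)
tool_use = to_tool_use_block(chat_tool_call)
print(sorted(messages_tool))
print(tool_use["input"]["city"])
```

The conversion is mostly a matter of lifting fields up one level and parsing the argument string once, which is why existing tool schemas tend to carry over with little change.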

The Agent Loop, Built into the Format

The agent loop is the repeating cycle of model output, tool execution, and result return that powers autonomous AI workflows.

These differences all trace back to one choice: the Messages API models a response as a list of typed blocks, not a single string. That turns a chat API into a substrate for agents. The model emits a tool_use block, your code runs the tool and hands back a tool_result block, and the model picks up where it left off. Each step is just another block in the same message history, paired by ID. You append, resend, repeat.

That's the engine behind most agents you've used. A coding assistant that edits files, runs tests, and fixes what broke isn't doing anything new; it's this loop, running over and over until the work is done. Because the loop is baked into the format rather than bolted on, building an agent on the Messages API feels less like hooking up plumbing and more like the API was waiting for you to do it.
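
The loop described above can be sketched in a few dozen lines. This is an illustrative outline, not SambaCloud's or Anthropic's implementation: fake_model is a stand-in for the API call so the example runs offline, and the add tool, the TOOLS registry, and run_agent are hypothetical names. Responses are modeled as plain dicts with the same content and stop_reason fields the Messages API uses.

```python
from typing import Callable

# Hypothetical client-side tools: a name-to-function registry.
TOOLS: dict[str, Callable[..., str]] = {
    "add": lambda a, b: str(a + b),
}


def fake_model(messages: list[dict]) -> dict:
    """Stand-in for the API call so the sketch runs offline.

    Requests the `add` tool once, then answers using the tool result.
    """
    last = messages[-1]
    if last["role"] == "user" and isinstance(last["content"], list):
        # The last turn carries a tool_result block: finish with text.
        result = last["content"][0]["content"]
        return {
            "stop_reason": "end_turn",
            "content": [{"type": "text", "text": f"The sum is {result}."}],
        }
    return {
        "stop_reason": "tool_use",
        "content": [
            {"type": "tool_use", "id": "toolu_1", "name": "add",
             "input": {"a": 2, "b": 3}},
        ],
    }


def run_agent(prompt: str, send: Callable[[list[dict]], dict],
              max_turns: int = 10) -> str:
    """Append, resend, repeat until the model stops asking for tools."""
    messages = [{"role": "user", "content": prompt}]
    for _ in range(max_turns):
        response = send(messages)
        # Append the model's blocks unchanged so tool_use IDs stay available.
        messages.append({"role": "assistant", "content": response["content"]})
        if response["stop_reason"] != "tool_use":
            return "".join(
                b["text"] for b in response["content"] if b["type"] == "text"
            )
        # Run every requested tool and return the results in one user message,
        # each paired with its request through tool_use_id.
        results = [
            {
                "type": "tool_result",
                "tool_use_id": block["id"],
                "content": TOOLS[block["name"]](**block["input"]),
            }
            for block in response["content"]
            if block["type"] == "tool_use"
        ]
        messages.append({"role": "user", "content": results})
    raise RuntimeError("agent did not finish within max_turns")


answer = run_agent("What is 2 + 3?", fake_model)
print(answer)
```

In a real integration, send would wrap the SDK call and convert its response into this dict shape. The max_turns cap is a design choice of this sketch to keep a misbehaving loop from running forever.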

Add SambaCloud in 3 Lines

The example uses MiniMax-M2.7, an open-weights model that pairs well with the Messages API's agentic strengths: It runs fast on SambaCloud's hardware, keeps costs low, and performs strongly on the multi-step, tool-calling work the loop above is built for. Pointing existing code at it requires no rewrite and no new SDK — only the base URL, API key, and model name change. For a deeper look at what the model can do, see the guide to self-evolving agents.

import anthropic

client = anthropic.Anthropic(
    # The SDK appends /v1/messages itself, so the base URL carries no /v1 suffix.
    base_url="https://api.sambanova.ai",
    api_key="your-sambanova-api-key"
)

message = client.messages.create(
    model="MiniMax-M2.7",
    max_tokens=4096,
    messages=[{"role": "user", "content": "Hello from the Messages API"}]
)

That's it. Your existing Anthropic-based apps can run on SambaCloud with minimal changes.

Key Advantages of Using the Messages API

Typed content blocks — Responses come back as text, tool_use, and thinking blocks, just like you'd expect.
Thinking, automatically — Reasoning-capable models surface their thinking block with no extra parameters.
Tool-calling — Define your functions, get a tool_use block, return a tool_result. The familiar loop.
Structured streaming — Typed SSE events from message_start to message_stop.
Token counting — Estimate cost before you send with the count_tokens endpoint.
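
As a small illustration of the typed content blocks in the list above, the sketch below groups a response's blocks by type. The response is written as a plain dict, the sample content is made up, and split_blocks is a hypothetical helper. The per-block field names (thinking, text, name, input) follow Anthropic's published block shapes and are assumed to carry over unchanged.

```python
def split_blocks(content: list[dict]) -> dict:
    """Group a response's content blocks by their `type` field."""
    out = {"thinking": [], "text": [], "tool_use": []}
    for block in content:
        if block["type"] == "thinking":
            out["thinking"].append(block["thinking"])
        elif block["type"] == "text":
            out["text"].append(block["text"])
        elif block["type"] == "tool_use":
            out["tool_use"].append((block["name"], block["input"]))
        # Unknown block types are skipped rather than treated as errors.
    return out


# Made-up response content for illustration.
content = [
    {"type": "thinking", "thinking": "The user wants the weather in Paris."},
    {"type": "text", "text": "Let me check that for you."},
    {"type": "tool_use", "id": "toolu_1", "name": "get_weather",
     "input": {"city": "Paris"}},
]

grouped = split_blocks(content)
print(grouped["text"])
print(grouped["tool_use"])
```

Because each block announces its own type, the caller can show text to the user, log the reasoning, and dispatch tool calls without parsing any free-form string.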

Full request / response details, plus examples for streaming, thinking, tool-calling, and multi-turn conversations, are in the Messages API docs.
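
For a feel of what structured streaming means in practice, here is a rough sketch that rebuilds a message from typed stream events. It operates on already-parsed event dicts rather than a live connection, the sample events are made up, and assemble is a hypothetical helper. The event and delta names (content_block_start, content_block_delta, text_delta, input_json_delta, message_delta) follow Anthropic's published streaming format and are assumed to apply as-is.

```python
import json


def assemble(events: list[dict]) -> dict:
    """Rebuild content blocks and the stop reason from typed stream events."""
    blocks: dict[int, dict] = {}
    partial_json: dict[int, str] = {}
    stop_reason = None
    for event in events:
        kind = event["type"]
        if kind == "content_block_start":
            blocks[event["index"]] = dict(event["content_block"])
        elif kind == "content_block_delta":
            idx, delta = event["index"], event["delta"]
            if delta["type"] == "text_delta":
                blocks[idx]["text"] += delta["text"]
            elif delta["type"] == "input_json_delta":
                # Tool input arrives as JSON fragments; buffer until the block ends.
                partial_json[idx] = partial_json.get(idx, "") + delta["partial_json"]
        elif kind == "content_block_stop":
            idx = event["index"]
            if idx in partial_json:
                raw = partial_json.pop(idx)
                blocks[idx]["input"] = json.loads(raw) if raw else {}
        elif kind == "message_delta":
            stop_reason = event["delta"].get("stop_reason")
    return {
        "content": [blocks[i] for i in sorted(blocks)],
        "stop_reason": stop_reason,
    }


# Made-up event sequence: one text block followed by one tool_use block.
events = [
    {"type": "message_start", "message": {"role": "assistant", "content": []}},
    {"type": "content_block_start", "index": 0,
     "content_block": {"type": "text", "text": ""}},
    {"type": "content_block_delta", "index": 0,
     "delta": {"type": "text_delta", "text": "Checking "}},
    {"type": "content_block_delta", "index": 0,
     "delta": {"type": "text_delta", "text": "the weather."}},
    {"type": "content_block_stop", "index": 0},
    {"type": "content_block_start", "index": 1,
     "content_block": {"type": "tool_use", "id": "toolu_1",
                       "name": "get_weather", "input": {}}},
    {"type": "content_block_delta", "index": 1,
     "delta": {"type": "input_json_delta", "partial_json": '{"city": '}},
    {"type": "content_block_delta", "index": 1,
     "delta": {"type": "input_json_delta", "partial_json": '"Paris"}'}},
    {"type": "content_block_stop", "index": 1},
    {"type": "message_delta", "delta": {"stop_reason": "tool_use"}},
    {"type": "message_stop"},
]

message = assemble(events)
print(message["content"][0]["text"])
print(message["content"][1]["input"])
```

The official SDKs ship streaming helpers that do this accumulation for you; the sketch only shows why typed events make it straightforward.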

Anthropic Messages API Limitations

Before you migrate, note the boundaries: Server-side tools (web search, code execution) aren't supported, PDF document blocks aren't either, and images must be base64, not URLs. Bring your own client-executed tools and you're set.
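
Since images must be sent as base64 rather than URLs, a minimal sketch of building such a block might look like the following. The image_block helper and the placeholder bytes are hypothetical; the block layout (type, source, media_type, data) follows Anthropic's published image block format and is assumed to be what SambaCloud accepts.

```python
import base64


def image_block(data: bytes, media_type: str = "image/png") -> dict:
    """Wrap raw image bytes in a base64 image content block."""
    return {
        "type": "image",
        "source": {
            "type": "base64",
            "media_type": media_type,
            "data": base64.b64encode(data).decode("ascii"),
        },
    }


# Placeholder bytes (the PNG file signature only); in practice, read the
# bytes from a real file, for example with pathlib.Path(...).read_bytes().
png_bytes = b"\x89PNG\r\n\x1a\n"

user_message = {
    "role": "user",
    "content": [
        image_block(png_bytes),
        {"type": "text", "text": "Describe this image."},
    ],
}
print(user_message["content"][0]["source"]["media_type"])
```

If your application currently passes image URLs, downloading the image client-side and wrapping the bytes this way is one possible workaround.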

Get started: Grab a key from the SambaNova API keys page, pip install anthropic, and follow the Anthropic compatibility guide to point the SDK at SambaCloud.

FAQs

What is the Anthropic Messages API?

The Anthropic Messages API is a developer interface that structures model responses as a list of typed content blocks rather than a single string, making tool calls, reasoning, and the agent loop native to the format rather than bolted on top.

How does the Messages API differ from Chat Completions?

The key differences are in response shape, system prompt handling, tool definition, and stop signals. The Messages API returns a content array of typed blocks; Chat Completions returns a string. The system prompt is a top-level parameter in the Messages API and a system-role message in Chat Completions. Tool definitions are flat in the Messages API versus nested under a function object in Chat Completions. Tool arguments come back as a parsed object in the Messages API and as a JSON string in Chat Completions. The stop signal is stop_reason in the Messages API and finish_reason in Chat Completions.

Does SambaCloud support the Anthropic Messages API?

Yes. SambaCloud now supports the Anthropic Messages API, meaning you can point the Anthropic SDK at SambaCloud's inference infrastructure and use the same format you already know.

Can I use Claude Code with SambaCloud?

Yes. You can run MiniMax-M2.7 within Claude Code through the Messages API on SambaCloud. You set the base URL, API key, and model name as environment variables before launching Claude Code, with no rewrite or new SDK required.

Which models are available through the Messages API on SambaCloud?

MiniMax-M2.7 is available. It is an open-weights model that runs fast on SambaCloud's hardware, keeps costs low, and performs well on multi-step, tool-calling workflows.

What do responses look like?

Responses come back as text, tool_use, and thinking blocks. Reasoning-capable models surface their thinking block automatically, with no extra parameters needed.

Are there any limitations?

Yes. Server-side tools such as web search and code execution are not supported. PDF document blocks are not supported either. Images must be provided as base64, not as URLs.

How do I get started?

Grab an API key from the SambaNova API keys page, install the Anthropic SDK, and follow the Anthropic compatibility guide to point the SDK at SambaCloud.
