DeepSeek-V3.1 Is Live on SambaCloud!

by SambaNova
August 25, 2025

Vibe code with lightning-fast DeepSeek-V3.1 on SambaCloud!

Just last week, DeepSeek dropped their latest model: DeepSeek-V3.1. Launching today, DeepSeek-V3.1 is now available for developers on SambaCloud — running over 200 tokens/second!

DeepSeek-V3.1 shares the same architecture as the original DeepSeek-R1, which made waves earlier this year and has been available on SambaCloud since February. This update introduces a hybrid thinking mode, which lets developers switch between reasoning and non-reasoning modes on a per-request basis (see the code snippet below). According to Artificial Analysis, both modes show significant gains in intelligence, with especially large improvements in coding capabilities!

[Figure: Artificial Analysis DeepSeek-V3.1 benchmark comparison]

 

Why DeepSeek-V3.1?

DeepSeek-V3.1 shows significant improvements across a range of benchmarks compared to prior iterations of the R1 and V3 models for thinking and non-thinking modes, respectively. On the Aider polyglot benchmark, which measures a model's ability to write and edit code, V3.1 in thinking mode scored 76.3 versus 72 for Claude Opus 4. In short, DeepSeek-V3.1 delivers better performance at a fraction of the price.

Benchmark                              DeepSeek-V3.1    DeepSeek-V3    DeepSeek-V3.1    DeepSeek-R1
                                       (Non-Thinking)   0324           (Thinking)       0528
LiveCodeBench (2408-2505)              56.4             43.0           74.8             73.3
Aider-Polyglot (Acc.)                  68.4             55.1           76.3             71.6
SWE Verified (Agent mode)              66.0             45.4           *                44.6
SWE-bench Multilingual (Agent mode)    54.5             29.3           *                30.5
Terminal-bench (Terminus 1 framework)  31.3             13.3           *                5.7

* Of DeepSeek-V3.1's two modes, only Non-Thinking supports function calling, which is why there are no Thinking-mode scores for these agent-mode benchmarks. Even so, DeepSeek-V3.1 Non-Thinking outscores DeepSeek-R1-0528 on all three.

These benchmarks highlight the model's major gains in coding and suggest it is well suited for coding agents such as Blackbox, or for integration into coding assistants like Cline. Moreover, DeepSeek has improved the model's function-calling capabilities in non-thinking mode, making it an even better fit for agentic frameworks like CrewAI.
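As a rough sketch of what tool use looks like in practice, the request below defines a function in the standard OpenAI-compatible tools format and leaves thinking mode off, since function calling is supported in non-thinking mode. The `get_weather` tool name and schema are purely illustrative, not taken from SambaNova's documentation.

```python
def weather_tool_schema() -> dict:
    """An OpenAI-style tool definition the model may choose to call.
    The name and parameters here are hypothetical examples."""
    return {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Look up the current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }

# Request body for POST /v1/chat/completions with tools attached.
request_body = {
    "model": "DeepSeek-V3.1",
    # Function calling is supported in non-thinking mode, so keep
    # enable_thinking off for tool-use requests.
    "chat_template_kwargs": {"enable_thinking": False},
    "messages": [{"role": "user", "content": "What's the weather in Tokyo?"}],
    "tools": [weather_tool_schema()],
}
```

When the model decides to call the tool, the response contains a `tool_calls` entry with the function name and JSON arguments, which the application executes and feeds back as a `tool` message.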

Best of all, because DeepSeek-V3.1 is open source, developers and enterprises have the flexibility to deploy it privately and securely on the hardware and in the locations that best suit their applications. SambaCloud servers are based in the U.S. and Japan, so no data is sent to China when using the model on our cloud. With SambaNova, developers can start building on our cloud today and scale to meet their customers' requirements with hardware deployed on premises or even at the edge. Developers also save by running open-source models at a fraction of the price of closed-source alternatives.

Model                          Input ($/M tokens)    Output ($/M tokens)
DeepSeek-V3.1 on SambaCloud    $3*                   $4.50*
Claude Opus 4.1                $15                   $75
Grok 4                         $3                    $15
OpenAI o3                      $3                    $15

* Custom pricing options are available for Enterprise tier offerings.

Start building with relentless intelligence in minutes on SambaCloud

SambaCloud is a powerful platform that enables developers to easily integrate the best open-source models with the fastest inference speeds. Powered by our state-of-the-art AI chip, the SN40L, SambaCloud provides a seamless and efficient way to build AI applications with fast inference. Get started today and experience the benefits of fast inference speeds, maximum accuracy, and an enhanced developer experience, in just three easy steps!


 

  1. Head over to SambaCloud and create your own account.
  2. Get an API key.
  3. Make your first API call with our OpenAI-compatible API.

curl -X POST https://api.sambanova.ai/v1/chat/completions \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "stream": true,
    "chat_template_kwargs": {"enable_thinking": true},
    "model": "DeepSeek-V3.1",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant"},
      {"role": "user", "content": "Hello"}
    ]
  }'
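Because the endpoint is OpenAI-compatible, the same call can be made from Python with the `openai` SDK, passing `chat_template_kwargs` through `extra_body` to toggle thinking mode. This is a minimal sketch: the `SAMBANOVA_API_KEY` environment variable name is an assumption, and the network call is guarded so the snippet runs offline.

```python
import os

def build_request(user_prompt: str, enable_thinking: bool) -> dict:
    """Assemble the JSON body for POST /v1/chat/completions."""
    return {
        "model": "DeepSeek-V3.1",
        "stream": False,
        "chat_template_kwargs": {"enable_thinking": enable_thinking},
        "messages": [
            {"role": "system", "content": "You are a helpful assistant"},
            {"role": "user", "content": user_prompt},
        ],
    }

# Only reach out to the API when a key is configured (env var name assumed).
api_key = os.environ.get("SAMBANOVA_API_KEY")
if api_key:
    from openai import OpenAI  # reuse the OpenAI SDK against the compatible endpoint
    client = OpenAI(base_url="https://api.sambanova.ai/v1", api_key=api_key)
    body = build_request("Hello", enable_thinking=True)
    resp = client.chat.completions.create(
        model=body["model"],
        messages=body["messages"],
        # extra_body passes the non-standard field through to the endpoint
        extra_body={"chat_template_kwargs": body["chat_template_kwargs"]},
    )
    print(resp.choices[0].message.content)
```

Setting `enable_thinking` to `False` in the same request body switches the model to non-thinking mode, which is the mode to use for function calling.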

 

To learn more about the model and how to use it with various frameworks, see our documentation portal, which includes guides on function calling and integrations with coding assistants like Cline.